Since formally declaring itself a caliphate in 2014, the Islamic State of Iraq and the Levant (ISIL) has become the deadliest terror organization in the world, particularly affecting the Middle East. Over 300 terrorist attacks have been attributed to ISIL in 2017 alone, even as the group continues to lose territory in Syria and Iraq, most recently with Mosul being freed from its control. However, it is not clear whether this loss of territory has decreased ISIL's ability to carry out devastating attacks, or whether ISIL has become more deadly as it has gained experience.
To model this trend, I use a model similar to the usual Bayesian marked point processes, but endowed with slightly different prior distributions that encourage dependency between adjacent intervals and discourage time intervals in which few attacks happened.
Data from the National Consortium for the Study of Terrorism and Responses to Terrorism (START) was examined to see if the rates or intensities of ISIL attacks changed over time. Since this data only covers terror attacks up until 2016, data from January 1, 2017 to October 22, 2017 was obtained from a second source compiled through crowdsourcing and Story Maps. The model picks out several interesting intervals of significant change in ISIL's effectiveness that match up with major news stories or events.
Note: If you are interested in the model, prior distributions and MCMC sampling scheme, skip to the end of this post. For everyone else, here are the results of the application.
The model itself is described in detail after the application. In broad terms, it estimates how many intervals of time since 2014 have different rates of attacks or fatality rates, and how these rates vary over time. The number of intervals and the boundaries of the intervals are treated as random variables, along with the mean rate and mean number of fatalities on each interval.
This type of model allows us to determine whether there was some period of time from 2014 to the present where ISIL was more or less effective, and to look at the events around those time periods. This could give some evidence that different strategies are more or less effective.
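As a toy illustration of what such a model assumes, the Python sketch below simulates a marked point process whose attack rate and mean fatality count both jump at a single change point. All numbers here are invented for illustration and have nothing to do with the fitted model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy change point: all values below are made up for illustration.
split = 500.0            # single split point, in days
horizon = 1000.0         # end of the observation window
rates = [0.5, 2.0]       # attacks per day on each interval
means = [3.0, 10.0]      # mean fatalities per attack on each interval

times, marks = [], []
for (lo, hi), rate, mu in zip([(0.0, split), (split, horizon)], rates, means):
    # Homogeneous Poisson process on (lo, hi]: exponential inter-arrival times
    t = lo + rng.exponential(1.0 / rate)
    while t < hi:
        times.append(t)
        marks.append(rng.poisson(mu))  # fatality count attached to the attack
        t += rng.exponential(1.0 / rate)

n1 = sum(t < split for t in times)   # attacks before the change point
n2 = len(times) - n1                 # attacks after it
print(n1, n2)
```

The fitted model works in the opposite direction: given only the event times and marks, it infers how many such change points there are and where they sit.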
I used data from the National Consortium for the Study of Terrorism and Responses to Terrorism (START), keeping only terror attacks after January 1, 2014 attributed to ISIL or its known partners. Since this data only covers attacks up until 2016, data from January 1, 2017 to October 22, 2017 was obtained from a second source compiled through crowdsourcing and Story Maps. Only confirmed or suspected ISIL attacks were used; for example, the Las Vegas shooting was not included even though ISIL claimed responsibility. We removed any data points where the number of fatalities was not available, leaving 4,893 attacks since 2014. The average number of fatalities over the entire dataset was 7.8.
We applied the PieceExpIntensity package (developed for this application) to the data with a maximum of 20 allowed split points and 100,000 iterations, burning in the first half of the posterior samples. In the posterior, no samples had more than 10 different intervals of varying activity, with 5 split points occurring most often (posterior probability .46). Four and six split points occurred with probabilities .30 and .22, the second and third most visited values of $J$ in the posterior distribution.
This model had mean split point locations of 54.3, 100.2, 159.7, 162.5 and 777.6 days after January 1, 2014. This corresponded to four split points occurring after the dates February 24, March 11, June 8 and June 12, 2014, with a later mean split point location after February 19, 2016. The interval between June 8 and June 12, 2014 is so short because ISIL took Tikrit on June 12, 2014 during the Northern Iraq offensive, killing a large number of people on this date. There were 23 separate attacks conducted by ISIL during this period, with 8 of these attacks killing 2,252 people and the other 15 not resulting in any casualties. The estimated posterior mean number of fatalities per attack on this interval was a staggering 93, far higher than the next highest mean fatality rate. Thus the method was able to accurately identify a known period of intense terrorist activity.
We examine the posterior samples of $\lambda$ and $\mu$ to draw inference on ISIL's effectiveness. Plots of the posterior means and credible intervals for $\lambda_j$ and $\mu_j$ are shown below.
We consider parameters whose credible intervals do not overlap to be meaningfully different. The credible interval for $\lambda_6$ does not overlap with any of the other credible intervals, indicating that the rate of terror attacks carried out by ISIL decreased drastically from February 20, 2016 until today compared to previous time periods. This was two months after Ramadi was retaken and four months before Fallujah was retaken from ISIL, which suggests that military intervention has decreased the risk of an attack. None of the other $\lambda_j$ have credible intervals that fail to overlap one another. Looking at the credible intervals for $\mu_j$, we see that intervals 5 and 6 (covering 6/13/2014 until today) have significantly higher fatality rates than the first three time intervals. Recall here that the posterior mean and credible interval for $\mu_4$ are not displayed in Figure 1 but are 93 and (83.9, 100.2).
These results suggest that ISIL's ability to carry out attacks has decreased dramatically since February 20, 2016, but that the deadliness of these attacks has increased compared to the early years after ISIL's founding. Interestingly, the United States launched a successful airstrike on Libyan ISIL members on February 19, including Noureddine Chouchane, a senior leader who directed several attacks. It appears that the increased military pushback, which resulted in a significant loss of ISIL territory, has hindered ISIL's ability to carry out attacks, but that their experience in conducting so many attacks since 2014 makes each attack much more deadly. It could also be that ISIL is feeling the pressure of losing territory and is attempting to dissuade further invasion with deadlier attacks when it has the ability to carry them out.
If you enjoyed this post, check out some of my other blog posts!
Model, Priors and Sampling Information
We require a model that can identify intervals of time where the rate and/or the intensity of events change. To accomplish this, we use the piecewise exponential distribution, as it can flexibly estimate the number of intervals needed to accurately characterize the distribution and the hazard of an event within each disjoint interval. Let $T_i$ denote the $i$th event time, with $X_i$ denoting the intensity of the event; in our case, $X_i$ is the number of fatalities caused by attack $i$, which reflects the intensity of each attack. Define $s = (s_0, s_1, \dots, s_{J+1})$ as a partition of the time scale from $0$ to the maximum observed value of $T$ ($s_0 = 0$, $s_{J+1} = \max_i T_i$) on which rates and intensities of events differ. For all $t \in (s_{j-1}, s_j]$, we assume that the hazard of an event is $\exp(\lambda_j)$ and that the conditional distribution of the intensity is Poisson with mean $\exp(\mu_j)$. We assume that, given a time scale partition, all pairs $(T_i, X_i)$ for which $T_i$ lies in a given interval are independent, and that observations in different intervals are also independent. Denote $\lambda = (\lambda_1, \dots, \lambda_{J+1})$ and $\mu = (\mu_1, \dots, \mu_{J+1})$. These assumptions lead to the following likelihood
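A plausible reconstruction of this likelihood, under the piecewise-constant log-hazards and the (assumed) Poisson marks above, with $D_j$ the attacks in interval $j$ and $R_j$ the risk set, is:

```latex
L(\lambda, \mu \mid s, J)
 \propto \prod_{j=1}^{J+1}
 \exp\!\Big( n_j \lambda_j
 - e^{\lambda_j} \sum_{i \in R_j} \big[ \min(T_i, s_j) - s_{j-1} \big] \Big)
 \prod_{i \in D_j} \frac{\exp\!\big( X_i \mu_j - e^{\mu_j} \big)}{X_i!}
```

The first factor is the usual piecewise-exponential contribution of the event times; the second is the Poisson likelihood of the fatality counts falling in interval $j$.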
where $n_j$ is the number of events in the interval $(s_{j-1}, s_j]$, $R_j$ is the set of events in the data set that have not yet occurred by time $s_{j-1}$, and $D_j$ is the set of attacks that happened between $s_{j-1}$ and $s_j$. We adopt a Bayesian approach by endowing the parameters with priors. We assume that the split point locations are a priori uniform, but spaced out in a way that discourages intervals with few events. This is done by drawing $2J+1$ uniform random variables on $(0, s_{J+1})$ and taking the even-numbered order statistics as $s_1, \dots, s_J$. Formally, this prior is:
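The even-numbered-order-statistics construction (due to Green, 1995) yields the density

```latex
\pi(s_1, \dots, s_J \mid J)
 = \frac{(2J+1)!}{s_{J+1}^{\,2J+1}} \prod_{j=0}^{J} (s_{j+1} - s_j)
```

which is proportional to the product of the interval lengths, so partitions containing very short intervals (and hence few events) receive low prior mass.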
Since we do not know how many times the rate and severity of these attacks changed, we do not fix $J$; instead we assign it a prior with a maximum allowed number of split points $J_{\max}$ set to no more than 20, the value used in this application. We impose spatial dependency in the vectors $\lambda$ and $\mu$ with the following priors:
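The exact prior specification is not preserved here, but one choice that induces the stated dependency between adjacent intervals is a first-order random walk (the zero-mean initialization is an assumption):

```latex
\lambda_1 \sim N(0, \sigma_\lambda^2), \qquad
\lambda_j \mid \lambda_{j-1} \sim N(\lambda_{j-1}, \sigma_\lambda^2),
\quad j = 2, \dots, J+1,
```

```latex
\mu_1 \sim N(0, \sigma_\mu^2), \qquad
\mu_j \mid \mu_{j-1} \sim N(\mu_{j-1}, \sigma_\mu^2),
\quad j = 2, \dots, J+1.
```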
This allows us to establish some dependency between adjacent intervals of $s$, which will help us model how the effectiveness and rate of attacks have changed over time. Finally, we place conjugate inverse-gamma priors on $\sigma_\lambda^2$ and $\sigma_\mu^2$ to encourage more dependency.
We perform a reversible jump Markov chain Monte Carlo (MCMC) sampling scheme on the parameters $(J, s, \lambda, \mu, \sigma_\lambda^2, \sigma_\mu^2)$, recommending at least 10,000 iterations to explore the sample space and obtain some convergence on $J$.
For the sampling scheme, we do the following moves:
- Metropolis-Hastings move on $s$. If $J \geq 1$, we shuffle the entries of $s$ by choosing some $j \in \{1, \dots, J\}$ and proposing $s_j^\ast \sim \text{Uniform}(s_{j-1}, s_{j+1})$, leaving $\lambda$ and $\mu$ unchanged. This move is accepted based on the likelihood ratio and the prior ratio for $s$.
- Metropolis-Hastings moves on $\lambda$ and $\mu$. Let $\xi$ denote either of these vectors for the time being. For each $j$, we propose a new value of $\xi_j$ from a normal distribution centered at the current value, as seen in Haneuse et al., and accept based on the likelihood ratio and the prior ratio for $\xi$.
- Gibbs step on $\sigma_\lambda^2$ and $\sigma_\mu^2$, drawing each from its conjugate inverse-gamma full conditional.
- Metropolis-Hastings-Green move on $J$, which is attempted every iteration via add and delete moves (if applicable). For an add move, we propose $J^\ast = J + 1$ and add an extra split point by drawing it from $\text{Uniform}(0, s_{J+1})$. Let $s^\ast$ denote the location of this additional split point; given that $s^\ast \in (s_{j-1}, s_j)$, we define the new values of $\lambda$ and $\mu$ for the two resulting intervals via the multiplicative perturbation suggested by Green and a weighted average, which yields the new values.
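Writing $\xi$ for either vector and $\xi_j$ for the value on the interval being split, a Green-style split consistent with this description perturbs via a draw $U$:

```latex
\xi_j^\ast = \xi_j - \frac{s_j - s^\ast}{s_j - s_{j-1}} \log\frac{1-U}{U},
\qquad
\xi_{j+1}^\ast = \xi_j + \frac{s^\ast - s_{j-1}}{s_j - s_{j-1}} \log\frac{1-U}{U},
```

so that $e^{\xi_{j+1}^\ast}/e^{\xi_j^\ast} = (1-U)/U$ (the multiplicative perturbation on the original scale) while the length-weighted average of the two new values equals $\xi_j$.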
where $U \sim \text{Uniform}(0,1)$ is drawn separately for $\lambda$ and $\mu$. The proposal is accepted over the previous values with probability equal to the product of the likelihood ratio, the prior ratio and the Jacobian of the transformation, to maintain detailed balance between the parameter spaces.
- Delete: For a delete move, we randomly choose some $j \in \{1, \dots, J\}$ and propose removing $s_j$ from $s$, setting the new values of $\lambda$ and $\mu$ on the merged interval equal to weighted averages of $(\lambda_j, \lambda_{j+1})$ and $(\mu_j, \mu_{j+1})$; the remaining entries are unchanged. This implicitly assumes a multiplicative perturbation of the same form as in the add move. We accept with probability equal to the product of the likelihood ratio, the prior ratio and the corresponding Jacobian.
Again, $U$ is drawn separately for $\lambda$ and $\mu$.
We attempt a delete move if $J \geq 1$ and an add move if $J < J_{\max}$ at every iteration. Potentially, this could result in shuffling $s$, with adjusted values for $\lambda$ and $\mu$, if both an add and a delete move are executed within the same iteration.
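To make the simplest of these moves concrete, here is a hedged Python sketch (not the package's code) of the shuffle step: propose a new location for one split point uniformly between its neighbors and accept via a Metropolis ratio. The log-likelihood below covers only the event-time part of the model, and the prior ratio uses the product-of-interval-lengths form of the order-statistics prior.

```python
import numpy as np

rng = np.random.default_rng(1)

def loglik(times, splits, log_rates, horizon):
    """Piecewise-constant Poisson process log-likelihood.
    splits: interior split points; log_rates has len(splits)+1 entries."""
    edges = np.concatenate(([0.0], splits, [horizon]))
    ll = 0.0
    for j in range(len(log_rates)):
        lo, hi = edges[j], edges[j + 1]
        n_j = np.sum((times > lo) & (times <= hi))  # events in interval j
        ll += n_j * log_rates[j] - np.exp(log_rates[j]) * (hi - lo)
    return ll

def shuffle_move(times, splits, log_rates, horizon):
    """Propose s_j* ~ Uniform(s_{j-1}, s_{j+1}) for a random j; rates unchanged."""
    j = rng.integers(len(splits))
    edges = np.concatenate(([0.0], splits, [horizon]))
    prop = splits.copy()
    prop[j] = rng.uniform(edges[j], edges[j + 2])

    def log_prior(s):
        # order-statistics prior on split points: product of interval lengths
        e = np.concatenate(([0.0], s, [horizon]))
        return np.sum(np.log(np.diff(e)))

    log_ratio = (loglik(times, prop, log_rates, horizon) + log_prior(prop)
                 - loglik(times, splits, log_rates, horizon) - log_prior(splits))
    return prop if np.log(rng.uniform()) < log_ratio else splits

# Toy run: events cluster before day 30, so the split should drift that way.
times = np.concatenate([rng.uniform(0, 30, 90), rng.uniform(30, 100, 20)])
splits = np.array([50.0])
log_rates = np.array([np.log(3.0), np.log(0.3)])
for _ in range(500):
    splits = shuffle_move(times, splits, log_rates, 100.0)
print(splits)
```

The proposal is symmetric (the neighboring split points do not move), so no proposal-density correction is needed in the acceptance ratio.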
After the MCMC has been completed, we burn in the first half of the samples and draw inference on how the rate and intensity of events have changed over time by looking at posterior samples whose number of split points equals the posterior mode of $J$. Generally, one mode is visited at a significantly higher proportion than the second most visited value of $J$, but if two values of $J$ are visited with frequencies that differ by less than some small threshold, then posterior inference should be conducted on both values. In general, the two values will be adjacent, say $j$ and $j+1$, and the extra split point seen in one set of samples may not lead to very different inferential conclusions.
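This post-processing step can be sketched as follows, using hypothetical posterior draws of $J$ in place of real MCMC output (the probabilities mirror the .46/.30/.22 split reported above but are otherwise invented):

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(2)

# Hypothetical posterior draws of J, the number of split points.
J_draws = rng.choice([4, 5, 6], size=1000, p=[0.30, 0.46, 0.24])

# Burn in the first half, then find the modal number of split points.
kept = J_draws[len(J_draws) // 2:]
mode_J = Counter(kept).most_common(1)[0][0]

# Inference is drawn only from samples with J equal to the mode; e.g. the
# split locations in those samples would then be averaged column-wise.
in_mode = kept == mode_J
print(mode_J, in_mode.mean())
```

If the second most visited value of $J$ were nearly as frequent as the mode, the same filtering would simply be repeated for that value and both sets of results reported.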