September 23, 2005

Iraq: coalition casualties by province

In his attempted refutation of the Lancet study of Iraq excess fatalities, Seixon made an interesting argument. The Lancet study's claim, you'll recall, was that there had been an estimated 100,000 excess deaths in Iraq between March 2003 and September 2004, the period of war and early U.S. occupation.

The survey method involved, for a couple of reasons, the pairing off of two-thirds of Iraq's 18 provinces, and the transfer of survey clusters from one province in each pair, randomly chosen... allowing the survey to get something like a national representation while only working in 12 provinces. (Ultimately, this became 10, as Muthanna province had insufficient population to warrant a survey cluster, and the one cluster in Anbar province, in Fallujah, was dismissed as an outlier).

I won't get into the probability theory, but it should be intuitive that this only works if the provinces in each pair have similar levels of violence... otherwise the pairing process could unintentionally exclude six provinces with a greater-than-average violence level and skew the casualty estimate for the whole country downward, or vice versa.

Seixon is focussed on the probability stuff, where he's mostly out of his depth, I'm afraid. But he did strike on one useful idea: running through the publicly available database of Coalition fatalities to see whether it justified his suspicion that non-violent provinces had been sampled out of the Lancet survey. If his numbers held up, and the violent provinces had been incorrectly paired with non-violent provinces, and the non-violent ones then excluded by random chance, it would provide a statistical challenge of sorts to the Lancet paper.

So I decided to check his numbers.

The database Seixon is using is generally considered authoritative. Its downside is that it has no specific data field for province, only location. Also, because that location is the place of death, it does not track deaths that occurred in hospitals outside Iraq very well.

From the start, I couldn't replicate Seixon's numbers. Up to Sept. 30, 2004, there were exactly 1,200 Iraq war fatalities listed, combat and non-combat. Of those 1,200, 905 are listed as "hostile" and 295 as "non-hostile." I don't see any point in including non-combat fatalities, which would follow a different distribution less related to violence levels.

Of the 905 hostile deaths in that period, 864 are located by the database in Iraq. Of those 864, a number were not at locations that could be placed provincially, including 20 "not reporteds", a number in the "southern part," etc.

Of those that did have a place name, I spent some time with a gazetteer, and came up with the following (alternate spellings in parentheses):

Baghdad Prov. (incl Log Base Seitz, Camp Cuervo, Baghdad Airport): 207
Anbar Prov. (incl Al Asad, Al Asad AB, (Al) Habbaniyah, Al Qaim, Fallujah, Hadithah, Hamamiyat, Hit, Husaybah, Khaldiyah, Khutaylah, Mahahma, Qusayba, Ramadi, Haditha Dam): 150
Salah Ed Din Prov. (incl Ad Duluiyah, Ad Dwar, Al Ouja, Albu Shukur, Balad, Bayji, Camp Cooke, FOB Summerall, Samarra(h), Taji, Tarmiya, Tikrit): 79
Dhi Qar Prov. (An Nasiriyah): 50
Basra Prov. (incl. Umm Qasr, Khawr Al Amaya, Al Zubayr, Al Madinah): 36
Diyalah Prov. (incl. Al Ghalibiyah, As Suaydat, Baqubah, Balad Ruz/Belaruz, Buhri(t)z, Jalyula, Khalis, Khan Bani Saad, Muqdadiyah, Sadiyah): 33
Babil Prov. (incl. Al Haswah, Al Hillah, (Al) Iskandariyah, Al Mahmudiyah, Al Mussayib, Latifiya, Madlul): 29
Ninawa Prov. (incl Ash Sharqat, Mahmudiyah, Mosul, Qarrayah/Qayyarah, Shumayt, Tall Afar): 29
Najaf Prov. (incl. Ayyub, Kufa): 21
Karbala Prov.: 12
Wasit Prov. (incl As Suwayrah, Ali Aziziyal): 11
Tamim Prov. (incl. Hawijah, Kirkuk, Kirkuk AB, Taza): 11
Maysan Prov. (incl. Al Amarah, Ali As Sharqi, Majar-al-Kabir): 9
Qadisiyah Prov. (incl. Ad Diwaniyah, Scania): 4
Muthanna Prov. (incl. Ar Rumaythah, As Samawah): 3
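
As a sanity check on the tally above, the province counts can be summed and compared with the 864 hostile deaths the database places in Iraq. This is only a minimal Python sketch; the dictionary and variable names are illustrative labels of mine, not fields in the database.

```python
# Hostile Coalition deaths by province through Sept. 30, 2004,
# copied from the list above. Names are illustrative, not database fields.
hostile_by_province = {
    "Baghdad": 207,
    "Anbar": 150,
    "Salah Ed Din": 79,
    "Dhi Qar": 50,
    "Basra": 36,
    "Diyala": 33,
    "Babil": 29,
    "Ninawa": 29,
    "Najaf": 21,
    "Karbala": 12,
    "Wasit": 11,
    "Tamim": 11,
    "Maysan": 9,
    "Qadisiyah": 4,
    "Muthanna": 3,
}

HOSTILE_IN_IRAQ = 864  # hostile deaths the database locates inside Iraq

# Deaths that could be assigned to a province
traceable = sum(hostile_by_province.values())

# Deaths in Iraq with no usable place name ("not reported", "southern part", etc.)
unplaced = HOSTILE_IN_IRAQ - traceable

print(f"Placed in a province: {traceable}")
print(f"In Iraq but not placeable: {unplaced}")
```

The second figure covers the "not reported" entries and the vague locations mentioned earlier.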

I could find no records of casualties in Arbil, Dahuk or Sulaymaniyah provinces in the period prior to the Lancet survey.

What would these changes do to Seixon's numbers, which he gives in the form of Coalition deaths per million Iraqis? Well, here is each province, with my figure first and Seixon's matching number in brackets after it. Provinces where the Lancet team conducted field work and the results contributed to the 100K excess fatalities estimate are asterisked:

Anbar: 119 (417)
Salah Ed Din: 72 (148)*
Baghdad: 40 (69)*
Dhi Qar: 33 (35)*
Basra: 27 (36)
Diyala: 23 (38)*
Najaf: 21 (23)
Babil: 16 (48)*
Tamim: 16 (24)
Wasit: 13 (40)*
Maysan: 13 (17)*
Ninawa: 12 (52)*
Karbala: 11 (26)*
Muthanna: 6 (8)
Qadisiyah: 5 (14)
Arbil: 0 (1)
Dahuk: 0 (0)
Sulaymaniyah: 0 (0)*

None of my per-province figures exceeds Seixon's, which is what one would have expected given that I excluded the non-combat deaths. The remaining difference seems to be mostly due to Seixon using deaths from outside the Lancet survey period. Provinces where fighting flared up after the Lancet study was completed and published in October of 2004 have much larger numbers in Seixon's figures. For provinces which have been relatively quiet in the last year (such as Najaf) his numbers and mine are comparable, whereas provinces that continued to be "hot" after September 2004 show a wider discrepancy.
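
One rough way to see that pattern is to take the ratio of Seixon's figure to mine for each province. The Python sketch below is illustrative only: the ratio is my own choice of yardstick, the names are made up for the example, and provinces where my figure is zero are skipped because the ratio is undefined there.

```python
# (my figure, Seixon's figure) in Coalition deaths per million Iraqis,
# copied from the table above.
per_million = {
    "Anbar": (119, 417),
    "Salah Ed Din": (72, 148),
    "Baghdad": (40, 69),
    "Dhi Qar": (33, 35),
    "Basra": (27, 36),
    "Diyala": (23, 38),
    "Najaf": (21, 23),
    "Babil": (16, 48),
    "Tamim": (16, 24),
    "Wasit": (13, 40),
    "Maysan": (13, 17),
    "Ninawa": (12, 52),
    "Karbala": (11, 26),
    "Muthanna": (6, 8),
    "Qadisiyah": (5, 14),
    "Arbil": (0, 1),
    "Dahuk": (0, 0),
    "Sulaymaniyah": (0, 0),
}

# Ratio of Seixon's figure to mine; skip provinces where mine is zero.
ratios = {
    name: seixon / mine
    for name, (mine, seixon) in per_million.items()
    if mine > 0
}

# Largest discrepancies first
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<13} {ratio:4.1f}x")
```

A ratio near 1 means the two counts roughly agree; a ratio of 3 or more suggests most of Seixon's deaths for that province fall outside the period I counted, or are non-combat.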

I think it should be obvious that using U.S. coalition fatalities as a baseline to check Iraqi population fatalities, if you're going to use significantly different chronological periods, could never hope to be statistically sound. If there's any value to this stat at all, it's got to be based on the anti-Coalition violence levels in the same period the Lancet actually surveyed, not including deaths later on in the insurgency.

Bottom line: overall in Iraq during the period the Lancet surveyed, with an estimated population of 24.4 million (the Lancet study's figure), there were 684 Coalition combat fatalities with a traceable location: 28 Coalition deaths per million occupied people in the first year-and-a-half of occupation, in other words.* In the 10 provinces the Lancet actually did usable fieldwork in (excluding the Fallujah/Anbar cluster), with 17.0 million people, there were 459 traceable combat fatalities in the same period... 27 deaths per million. Paradoxically, the provinces the Lancet used for its survey were just slightly safer for Coalition troops on average than the nation as a whole. So I'm afraid Seixon's inference based on this evidence that the Lancet's pairing process was incorrectly done falls apart.
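
The arithmetic behind those two rates can be reproduced in a few lines. This is a minimal Python sketch using only the counts and population figures given in this post; the variable names are illustrative, and the set of surveyed provinces is simply the asterisked entries from the table.

```python
# Traceable hostile Coalition deaths by province through Sept. 30, 2004
hostile_by_province = {
    "Baghdad": 207, "Anbar": 150, "Salah Ed Din": 79, "Dhi Qar": 50,
    "Basra": 36, "Diyala": 33, "Babil": 29, "Ninawa": 29, "Najaf": 21,
    "Karbala": 12, "Wasit": 11, "Tamim": 11, "Maysan": 9,
    "Qadisiyah": 4, "Muthanna": 3,
    "Arbil": 0, "Dahuk": 0, "Sulaymaniyah": 0,
}

# The 10 provinces with usable Lancet fieldwork (asterisked in the table)
surveyed = {
    "Salah Ed Din", "Baghdad", "Dhi Qar", "Diyala", "Babil",
    "Wasit", "Maysan", "Ninawa", "Karbala", "Sulaymaniyah",
}

POP_IRAQ_M = 24.4      # millions, the Lancet study's figure
POP_SURVEYED_M = 17.0  # millions, the 10 surveyed provinces

national_deaths = sum(hostile_by_province.values())
surveyed_deaths = sum(hostile_by_province[p] for p in surveyed)

# Coalition combat deaths per million Iraqis
national_rate = national_deaths / POP_IRAQ_M
surveyed_rate = surveyed_deaths / POP_SURVEYED_M

print(f"All Iraq:           {national_deaths} deaths, {national_rate:.0f} per million")
print(f"Surveyed provinces: {surveyed_deaths} deaths, {surveyed_rate:.0f} per million")
```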

There is a long thread on this at Tim Lambert's place, to which I've contributed extensively; I'm posting my data here for reference in that discussion.

*Caveat: This figure of 28 is for Coalition deaths in all Iraqi provinces for which a point of death can be ascertained. Add in the deaths which occurred in unknown locations, hospitals outside Iraq, etc. and the total number of all occupation combat fatalities is actually 37 per million, or 2.47 combat deaths per year per 100,000 Iraqis occupied. One could probably compare that to other military peacemaking/occupation efforts (or civilian police work) to get some useful data, if one had the time.
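
For anyone wanting to check the caveat's conversion, here is a minimal Python sketch. It assumes the March 2003 to September 2004 period is treated as 1.5 years, which is how I read "year-and-a-half" above; the variable names are illustrative.

```python
HOSTILE_DEATHS_TOTAL = 905  # all hostile deaths through Sept. 30, 2004
POP_IRAQ_M = 24.4           # millions, the Lancet study's figure
YEARS = 1.5                 # assumption: the period counted as 18 months

# Deaths per million Iraqis over the whole period
per_million = HOSTILE_DEATHS_TOTAL / POP_IRAQ_M

# One million is ten lots of 100,000, so divide by 10, then annualise
per_100k_per_year = per_million / 10 / YEARS

print(f"{per_million:.0f} per million over the period")
print(f"{per_100k_per_year:.2f} per 100,000 per year")
```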

