Wednesday, 27 July 2016

Public Zika virus data can be volatile Zika virus data...

So it turns out I hadn't had a stroke or started losing my mind. 

....10 hours earlier....

I received an answer to my questions tweeted at the Colombian National Institute of Health asking why Colombia's Zika virus (ZIKV) data had been revised downward. Was it simply data cleaning? How did it happen? Why now? As you can see from the drop in weekly figures (that last red bar), it was quite a cleanup if so - a drop of 5,000 cases!
Graph No. 1. The cumulative curve of suspected ZVD cases
(pink circles, left-hand axis) and the change in suspected ZVD case
numbers when compared to the preceding week's total
(red bars, right-hand axis) including the original
epidemiological week No. 28 data. Data from [1].

Graph No. 2. The cumulative curve of confirmed ZIKV infections
(lilac circles, left-hand axis) and the change in confirmed ZIKV infection
numbers when compared to the preceding week's total
(purple bars, right-hand axis). Now added: the reported number of microcephaly cases
confirmed as ZIKV infected (yellow bars, right-hand axis), including the original
epidemiological week No. 28 data. To account for adjustments
that take cases away when there is no weekly case growth (giving a negative
value), the y-axes now allow for negative values. Data from [1].

Dr Fernando Ruiz, Colombia's Deputy Minister of Public Health and Service Delivery, kindly engaged me on Twitter, telling me that each week their data get adjusted to account for lags in the current and former weeks. I sent him Graph 1 above to try to reiterate that this past epidemiological week had been a bit different from any other this year. When he bounced some numbers at me, something seemed weird - these were different from what I'd recorded.

Sure enough, my spreadsheet no longer matched up with the numbers I'd harvested on Sunday morning (my time, AEST) from the Week No.28 Colombian Epidemiological Bulletin.

Weird. My usual first reaction - it's all my fault. Had I been daydreaming when I copied the numbers across? Had my Excel formulae betrayed me (never!)? Had the kids edited my blog? Had the cat sneakily deleted and typed a few figures? Had I had a small cerebral incident? Am I having one now?

Am I doomed to never know the answer?

Thankfully, @FluTrackers had a post from @thelonevirologi including charts and numbers from the Colombian data and sure enough a key figure was there that was common to both our datasets - but no longer anywhere to be found on the Colombian bulletin - 7 166 confirmed laboratory cases. And, coming to the rescue of my sanity, @thelonevirologi still had the original PDF - the data had indeed been released wrongly and then corrected and re-released by the National Institute of Health. Phew.

Public data are volatile

This really is a stark reminder that public data are volatile and can change. 

Sometimes that change may not be identified by the publisher - no version numbering and no note to say what changed and why. Simple stuff to add, but sometimes completely absent. 
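
If the publisher doesn't version its files, one way to catch a silent re-release is to keep your own fingerprint of whatever you downloaded. What follows is only a minimal sketch of that idea, not anything the National Institute of Health or I actually use: the function names, the log structure and the "week-28" label are all made up for illustration, and the bytes in the usage note are placeholders rather than real bulletin content.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of a downloaded file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def record_snapshot(log: list, label: str, data: bytes) -> bool:
    """Append a timestamped fingerprint for `label` to `log`.

    Returns True when the most recent earlier snapshot with the same
    label has a different fingerprint, i.e. the file changed since we
    last looked. `log` is a plain list of dicts (a hypothetical
    structure, chosen only to keep the sketch short).
    """
    digest = fingerprint(data)
    earlier = [entry for entry in log if entry["label"] == label]
    changed = bool(earlier) and earlier[-1]["sha256"] != digest
    log.append(
        {
            "label": label,
            "sha256": digest,
            "recorded_at": datetime.now(timezone.utc).isoformat(),
        }
    )
    return changed
```

In use, you would read the bulletin PDF's bytes each time you harvest numbers and call something like `record_snapshot(log, "week-28", pdf_bytes)`; a return value of True says the file under that label is no longer the one you copied figures from. It can't tell you what changed or why, which is why keeping the original PDF itself matters too.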

We bloggers, who live in the 'grey literature' world (and rarely attract citations from the scientific literature), may be better at understanding the need to own our changes and mistakes. We often try to correct them in a way that is obvious to those who use or even rely on our information. This is just good practice.

And what about Colombia's ZIKV numbers this week?

As to the updated ZIKV figures from Colombia, the revised versions show that clinically suspect ZIKV disease cases do in fact continue to rise (+933) and that there were 22 more confirmed cases among pregnant women added this past week. No general ZIKV disease confirmations were reported after the 176 from last week and no new cases of ZIKV-associated microcephaly were added this week after 4 consecutive weeks of growth. Perhaps this is one of those laboratory 'off weeks'.

Colombia notes that it expects ZIKV-related microcephaly cases to increase in September and October 2016 as more pregnant women come to term.[2] A nearly 8% increase in (known) miscarriages has already been reported in Colombia but no rise in the use of abortion clinics which might otherwise "hide" the congenital impact of ZIKV infection not registered as microcephaly.[3] 

Given these ZIKV infections are still being suspected and detected, it seems very strange that Colombia picked now to declare its epidemic over.[2] For certain, numbers have been slowing each week for at least 6 weeks but they are still being reported (perhaps just lagging older results?). 

A quick summary: sexual events play a role in ZIKV transmission, persistence of virus is real at several sites, we have not yet examined all possible transmission avenues (oral and respiratory epithelium, eyes, ingestion) and we still don't know whether the 80% of cases that are asymptomatic play any role in human-to-mosquito or human-to-human transmission nor whether that 80% figure still holds today. 

Perhaps the Colombians simply mean that the ZIKV numbers per week have fallen below some arbitrary internal epidemic threshold value now. Maybe cases are still being identified, just not at epidemic levels or rates. I'd have thought a threshold would take more than a year and a bit to determine for a new disease with so much still unknown, but perhaps not.

Graph No. 3. The corrected cumulative curve of suspected ZVD cases
(pink circles, left-hand axis) and the change in suspected ZVD case
numbers when compared to the preceding week's total
(red bars, right-hand axis) including the updated
epidemiological week No. 28 data. Data from [1].



 
Graph No. 4. The cumulative curve of confirmed ZIKV infections
(lilac circles, left-hand axis) and the change in confirmed ZIKV infection
numbers when compared to the preceding week's total
(purple bars, right-hand axis). Now added: the reported number of microcephaly cases
confirmed as ZIKV infected (yellow bars, right-hand axis), including the updated
epidemiological week No. 28 data. To account for adjustments
that take cases away when there is no weekly case growth (giving a negative
value), the y-axes now allow for negative values. Data from [1].

References...
  1. http://www.ins.gov.co/boletin-epidemiologico/Boletn%20Epidemiolgico/Forms/public.aspx
  2. http://www.nytimes.com/2016/07/26/world/americas/colombia-zika-epidemic-end.html?partner=rss&emc=rss&smid=tw-nytimes&smtyp=cur&_r=0
  3. https://www.washingtonpost.com/world/the_americas/colombia-offers-the-possibility-that-the-zika-epidemic-may-not-be-as-bad-as-feared/2016/07/12/d8c91e60-3d78-11e6-9e16-4cf01a41decb_story.html?postshare=8051469159730881&tid=ss_tw