Tuesday, March 25, 2014

How numbers and statistics can lie: a response to Elizabeth Angsioco

Payatas dump site (2007).  Photo by Kounosu in Wikipedia.
In her Manila Standard Today article entitled "Numbers Don't Lie," Elizabeth Angsioco enumerated statistics that her group gathered in Payatas, Quezon City.  Payatas is the poster barangay of poverty, with people living in garbage dumps.  The primary purpose of these statistics is to convince the Supreme Court to uphold the RH Law as a way to address worsening maternal mortality and teenage pregnancy.

There are some statistics that appear alarming at first sight.  I'll highlight only the following:
  1. 85 percent of respondents have gotten pregnant 1,933 times. 
  2. Respondents know of 422 cases of these complications and 298 incidences of maternal mortality. 
  3. 123 respondents (23 percent) experienced 148 cases of miscarriages.
  4. 27 percent had health problems during pregnancy with 42 percent suffering from serious problems. 
  5. 14 percent had childbirth complications and almost half of them said their lives were put at risk.
  6. The women know of 1,556 cases of teen-age pregnancies in their areas. 64 percent happened before the girls turned 15 while 16 of the girls got pregnant before they reached 13 years old.
  7. The first pregnancy of more than 50 percent (who have gotten pregnant) happened before they were 22 years old. The first childbirth of 36 percent of those who have given birth was when they were 16-20 with ten births when respondents were 11-15 years old. 
These statistics provide a good example of how statistics can be used or misused:

1. Do not substitute the whole for the part

Statement 1, "85 percent of respondents have gotten pregnant 1,933 times," is deceiving.  It makes it appear that each respondent got pregnant 1,933 times.  Of course, this number is ridiculous, because a woman releases only about 400 mature eggs between puberty and menopause.  To interpret the number 1,933 properly, we must first take 85% of the 621 respondents, which is about 528 women.  If our hunch is correct that 1,933 is the total number of pregnancies across these 528 respondents, then the average is 1,933/528 = 3.66 pregnancies per woman, which is not really big, considering that the desired number of children of the respondents in the survey is about 3 children per woman.

A similar case is statement 3, "123 respondents (23 percent) experienced 148 cases of miscarriages."  If our hunch is correct, the statement should have read: "The 123 respondents experienced an average of 148/123 = 1.2 miscarriages each."
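The arithmetic above can be checked with a short Python sketch. The figures are the ones quoted from the survey; the variable and function names are my own, and the sketch assumes, as the hunch above does, that 1,933 and 148 are totals summed over the women concerned.

```python
# Figures quoted from the survey; names below are illustrative only.
RESPONDENTS = 621
SHARE_EVER_PREGNANT = 0.85
TOTAL_PREGNANCIES = 1933
WOMEN_WITH_MISCARRIAGE = 123
TOTAL_MISCARRIAGES = 148


def per_woman_average(total_events, women):
    """Turn a group total into an average per woman."""
    return total_events / women


# 85% of 621 is 527.85, which we round to a whole number of women.
ever_pregnant = round(RESPONDENTS * SHARE_EVER_PREGNANT)

# Assumption: 1,933 and 148 are sums over the whole group, not per-woman counts.
avg_pregnancies = per_woman_average(TOTAL_PREGNANCIES, ever_pregnant)
avg_miscarriages = per_woman_average(TOTAL_MISCARRIAGES, WOMEN_WITH_MISCARRIAGE)

print(f"Women ever pregnant: {ever_pregnant}")
print(f"Average pregnancies per woman: {avg_pregnancies:.2f}")
print(f"Average miscarriages per affected woman: {avg_miscarriages:.1f}")
```

Reporting the total next to the average, instead of the total alone, lets the reader see the part and the whole at the same time.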

2. Make sure that you do not count the same thing twice

In statement 2, we read: "Respondents know of 422 cases of these complications and 298 incidences of maternal mortality."

From statement 5, we know that only 14% of the 621 respondents experienced childbirth complications.  This amounts to 621(0.14) = 87 women, which is far fewer than the 422 cases.  What could have happened is that each woman's complication was remembered by 422/87 = 4.85, or about 5, respondents.  This number is essentially the number of neighboring families who knew about the pregnancy complication of their neighbor.  Thus, the same incident is recorded about 5 times.  This is quintuplication of data.

We can apply the same analysis to the 298 incidences of maternal mortality remembered by the respondents: they correspond to just 298/4.85 = 61 deaths.

Another similar case is statement 6: "The women know of 1,556 cases of teen-age pregnancies in their areas."  If our neighborhood hypothesis of quintuplication is correct, the actual number of teen-age pregnancies is about 1,556/4.85 = 321, which is roughly half of the 621 respondents.  This may roughly coincide with the 64 percent of pregnancies that happened before the girls turned 15.
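The double-counting correction can be sketched in Python as follows. The figures are those quoted from the survey; the names are my own. The sketch assumes that the duplication factor estimated from complications also applies to maternal deaths and teen-age pregnancies, which is the neighborhood hypothesis itself and not something the survey confirms.

```python
# Figures quoted from the survey; names below are illustrative only.
RESPONDENTS = 621
SHARE_WITH_COMPLICATIONS = 0.14
KNOWN_COMPLICATIONS = 422
KNOWN_MATERNAL_DEATHS = 298
KNOWN_TEEN_PREGNANCIES = 1556

# 14% of 621 is 86.94, rounded to a whole number of women.
women_with_complications = round(RESPONDENTS * SHARE_WITH_COMPLICATIONS)

# How many respondents, on average, reported the same complication.
duplication_factor = KNOWN_COMPLICATIONS / women_with_complications


def deduplicate(reported_cases, factor):
    """Estimate distinct cases, assuming each one was reported `factor` times."""
    return reported_cases / factor


# Assumption: the same factor holds for deaths and teen-age pregnancies.
estimated_deaths = deduplicate(KNOWN_MATERNAL_DEATHS, duplication_factor)
estimated_teen_pregnancies = deduplicate(KNOWN_TEEN_PREGNANCIES, duplication_factor)

print(f"Duplication factor: {duplication_factor:.2f}")
print(f"Estimated maternal deaths: {estimated_deaths:.0f}")
print(f"Estimated teen-age pregnancies: {estimated_teen_pregnancies:.0f}")
print(f"Share of respondents: {estimated_teen_pregnancies / RESPONDENTS:.0%}")
```

A survey that asks what respondents "know of" would need to record who each case refers to before any such total could be trusted; the factor here is only a rough estimate.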

3. Define your terms and variables

In statement 4, we read: "27 percent had health problems during pregnancy with 42 percent suffering from serious problems."  What are these health problems?  Are they problems normally associated with pregnancy, such as headache?  What makes a problem "serious"?  We need to define these terms precisely.

In statement 7, we read: "The first pregnancy of more than 50 percent (who have gotten pregnant) happened before they were 22 years old. The first childbirth of 36 percent of those who have given birth was when they were 16-20 with ten births when respondents were 11-15 years old."  I thought Angsioco was trying to measure teenage pregnancy.  At what age does a pregnancy count as teenage pregnancy?  Is it when the woman got pregnant before the ideal age of 22?  Or is it when she got pregnant at 13-19 years old?

Conclusions

The statistics published by Angsioco are plagued with defects. Numbers can lie, especially in the hands of two kinds of authors:
  1. Those who do not know how to collect and interpret data 
  2. Those who know how to collect and interpret data but intentionally wish to deceive in order to push a particular agenda.