Sampling: Being Transparent

Recently, I listened to the audiobook “Invisible Women: Data Bias in a World Designed for Men,” a must-read (or must-listen) if you like data and are interested in UX design. The author, Caroline Criado Perez, writes about a data gap that disadvantages women in many facets of their lives. One point that particularly grabbed me was the following (transcribed passage):

“A 2015 report by the Insurance Institute for Highway Safety is excitingly headlined, ‘Improved vehicle designs bring down death rate,’ which sounds great.  Perhaps this is the result of the new legislation?  Unlikely.  Buried in the report is the following tell-tale line: ‘The rates include only driver deaths because the presence of passengers is unknown.’  This is a huge gender data gap.  When men and women are in a car together, the man is most likely to be driving.  So not collecting data on passengers more or less translates to not collecting data on women.  The infuriating irony of all this is the gendered passenger/driver norm is so prevalent that, as we have seen, the passenger seat is the only seat that is commonly tested with a female crash test dummy anyway, with the male crash test dummy still being the standard dummy for the driver’s seat.  In conclusion, a more accurate headline for the report would be ‘Improved vehicle design brings down death rates in the seat most likely to be occupied by men, but who knows about death rates in the seat most likely to be occupied by women, even though we already know women are 17 percent more likely to die in a car crash.’  Admittedly, this is less snappy.”

The author, who narrates the audiobook, was audibly perturbed. I was right there with her, nodding along in frustration, because transparency about sampling is a huge issue in research and evaluation (see my earlier blog post on sample representation), and one that permeates all aspects of our data-driven world.

For instance, at a micro-level, a teacher at my child’s school organized a Cultural Heritage Day. I first became aware of it when we received a flyer announcing the event and listing, in quite definitive terms, the heritages said to represent the school. My first thought was: “How do they know this represents the heritages of the school?” I found out later that the teacher had surveyed parents in the student pickup line. My daughter takes the bus, so I was not included in the survey, nor were many others. Huge sampling issue. In my opinion, it is a sampling issue with social justice implications. Who was not asked? Whose cultures are not represented and celebrated?

Our colleagues at Slover Linett Audience Research were mulling over sampling issues simultaneously at a macro-level.  In his recent post, Peter Linett highlights sampling and transparency woes in political polling that turned him into a vehicular menace.  As Peter notes, the implications are big, as there are “lots of media stories about methodological social-science questions that are usually way too wonky for public consumption.”

That is why transparency about sampling should not concern only researchers and evaluators. It has important repercussions for those who are trying to use the data (for safety, social justice, understanding museum goers to enhance visitor experiences, etc.). Therefore, I consider it our charge as researchers and evaluators to clearly describe and thoroughly present our research methods and to help teach others to become more data literate. Moreover, the onus is on us to avoid, or at least pause to question, “snappy” soundbite reporting that may be misleading and result in faulty decision making.



I originally wrote this post before the Thanksgiving holiday. Peter Linett’s post was published while I was editing it, which was quite serendipitous, as it adds credence to my main point that transparent sampling and reporting are important!
