Making Sense (Not Just Cents) from Big Data
Also published on The Huffington Post
They are watching you. Every time you swipe your card at your local gas station they are watching you. Every time you buy a book from Amazon.com they are watching you. Every time you send a message from your computer keyboard, they are watching you. Every time you walk down a busy street, your image is captured. They are watching you.
There is a digital you that is being created and recreated every single day to capture your likes and dislikes, your habits, your friends, your emotions and your proclivities. Enter the era of Big Data (to be said in a deep and important voice with great gravitas). We are amassing data at a rate unknown to humans — even to our favorite geeks.
Every 15 minutes more information is generated than Shakespeare could have known in his entire lifetime. Folks around the globe are uploading so much digital photography that 35 percent of all of the world’s pictures are now available on Facebook. Google is used at the rate of 320 million hits a day. With information doubling every 2.5 years, we have entered the knowledge age where sifting through information — and making meaning from it — will be even more important and more powerful than collecting it.
There is no doubt that assembling all of this data has been a boon for advertisers who can now pinpoint precisely the kinds of clothes the Digital You buys from your favorite online merchant. Big Data surely makes cents. But does it make sense?
Scientists are still confused about what falls under the umbrella of Big Data. Discussions of this question are often more focused on how impressive it is to measure a terabyte than on how to use a terabyte of information in ways that connect the dots or find relations to help us explain who we are and why we do what we do. Just amassing data also puts us in the risky position of fishing for relationships without any a priori hypotheses. These are dangerous waters for any scientific endeavor, because when enough variables are compared, some relationships emerge purely by chance.
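For readers who like to see the danger rather than take it on faith, here is a minimal sketch of how relationships can emerge by chance. It is our own illustration, not an analysis from any study mentioned here. It assumes 40 made-up variables measured on 30 imaginary people, all filled with random noise, and it treats a correlation stronger than 0.361 (roughly the conventional 5 percent cutoff for 30 observations) as "significant." The function names and every number are hypothetical.

```python
import itertools
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_chance_relationships(n_variables=40, n_people=30,
                               threshold=0.361, seed=0):
    """Correlate every pair of pure-noise variables and count the 'hits'.

    All defaults are illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)
    # Each column is random noise: no variable is truly related to any other.
    columns = [[rng.gauss(0, 1) for _ in range(n_people)]
               for _ in range(n_variables)]
    pairs = list(itertools.combinations(range(n_variables), 2))
    hits = sum(
        1 for i, j in pairs
        if abs(pearson_r(columns[i], columns[j])) > threshold
    )
    return hits, len(pairs)


if __name__ == "__main__":
    hits, total = count_chance_relationships()
    print(f"{hits} of {total} unrelated pairs look 'significant'")
```

With 40 variables there are 780 pairs to compare, so at a 5 percent cutoff we should expect a few dozen of them to pass the test even though nothing real connects them. A question asked in advance gives us a reason to trust one of those pairs over the rest.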
So what can we do to use Big Data in a scientific way that is useful to society? Several years ago one of us was part of a smaller scale experiment in big data (notice the lower case letters b and d on big and data) within the field of human development. The charge was to investigate the effects of childcare on outcomes across time. As over 50 percent of mothers had entered the workforce, childcare was becoming a natural experiment and there was little information on its effects.
With 1,364 families enrolled across 10 states, dozens of scientists around the country collected data on a host of variables, including family demographics, children’s environment and stimulation at home, childcare type, and interactions between parents and caregivers, along with child outcomes in health and in social and cognitive development. The study boasted multiple measures collected at multiple times from a range of people who interacted with the target children.
This investigation moved our field from individual labs with a laser-beam focus on one area to a kind of department store science, where we could create connections between areas of development. After all, children grow up in a context that intersects psychology and biology.
The big data we collected was generated by a set of questions we needed answers to about children’s adjustment to childcare. We learned many important things – including that parents are the key factor in their children’s lives, whether their children attend childcare or not.
Big Data is presently like the Wild West. How do we corral the Big Data piling up to make it usable? First, scientists need to start with a constrained set of questions. If we study the Digital You and others like you, perhaps we can begin to understand how we choose our spouses or why we share what we do on Facebook. Second, scientists need to collaborate across their traditional disciplines to ask complicated questions together. When geneticists join with brain scientists, statisticians, and child psychologists, they might uncover the causes of autism. And these experts may invent the new language we need to truly communicate among fields and answer society’s perplexing questions.
Big Data is impressive. As we try to harness all this data, we should step back and ask what we really want to know. The interdisciplinary combining of information can yield answers to some of the world’s deepest secrets. Computers will always be better than we are at collecting information. They are the master hunters and gatherers of our time. Our job as humans is to be visionary and to intuit meaning from the patterns of data we inspect. That is what we do best, and if we neither drown in nor stand in awe of the terabytes we generate, we might actually create a multidisciplinary field that uses Big Data in a way that makes sense.