The lead poisoning crisis in Flint, Michigan exposed a toxic problem in the city’s water system and compelled other cities to investigate water quality. But what if city officials in Flint and elsewhere could have predicted where lead exposure might be lurking long before people drank contaminated water and fell ill?
New machine learning technologies and algorithms, which build predictive models from historical data, are making such forecasting easier, according to data scientists speaking at ICML, a leading international conference on machine learning. Equipped with new software, data scientists analyze audio, video and unstructured text sources to deliver increasingly accurate predictions about human behavior, social problems and how the damage from natural or industrial disasters might unfold.
“We have to help people take the data in any form and turn it into how policy makers make fine-grained decisions,” says Rayid Ghani, director of the Center for Data Science and Public Policy at the University of Chicago. Ghani was one of several experts who presented research at a workshop on machine learning in social good applications held on June 24 at ICML.
Data-for-Good, as this movement is known, is being applied to a growing number of issues in government. Algorithms can be used to determine which officers on a police force are under stress and likely to make an error in judgment. They can help the Environmental Protection Agency devise more efficient inspection systems, or spot opportunities to reduce maternal mortality rates in rural Mexico.
Ghani’s work with lead poisoning in Chicago is at the forefront of this growing movement. Today, most cities check for lead hazards only after a child gets sick. “That turns kids into lead poisoning sensors, which is too late for the kid because the effects are irreversible,” Ghani says.
His team is working on a pilot program in Chicago that takes a more proactive approach. It analyzes two decades of blood lead test results, home inspections, property value assessments and census data to identify the children at highest risk, so action can be taken before they are poisoned.
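The broad shape of such a pipeline, joining historical records, deriving features, then ranking homes or children by predicted risk, can be sketched in a few lines. The field names and weights below are purely illustrative assumptions, not the Chicago project's actual model, which is trained on two decades of linked records:

```python
# Illustrative sketch of a lead-exposure risk ranking.
# All field names and weights are hypothetical.

def risk_score(home):
    """Combine a few simple features into a rough 0-1 risk score."""
    score = 0.0
    if home["built_before_1978"]:  # lead paint was banned in the US in 1978
        score += 0.4
    # Cap and normalize count-style features so each contributes at most 0.3.
    score += 0.3 * min(home["past_violations"], 3) / 3
    score += 0.3 * min(home["nearby_elevated_tests"], 5) / 5
    return round(score, 2)

def rank_homes(homes):
    """Return homes ordered from highest to lowest predicted risk."""
    return sorted(homes, key=risk_score, reverse=True)

homes = [
    {"id": "A", "built_before_1978": True,  "past_violations": 2, "nearby_elevated_tests": 4},
    {"id": "B", "built_before_1978": False, "past_violations": 0, "nearby_elevated_tests": 1},
]
ranked = rank_homes(homes)
print([h["id"] for h in ranked])  # older home with more violations ranks first
```

In practice a model like Ghani's would learn these weights from historical blood-test outcomes rather than hard-coding them, but the output is the same kind of ranked list that lets inspectors visit the riskiest homes first.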
Curbing police misconduct is another area of interest. Ghani’s team collected a decade’s worth of data on every police officer in Charlotte, North Carolina, including citations, complaints, investigations and behavior, to help identify officers whose data profiles matched those of officers who had committed misconduct in the past.
Data indicated that officers who responded to domestic abuse and suicide calls were most at risk. A new system is being tested that lets dispatchers know who these officers are, to help avoid sending them to crime scenes.
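The dispatcher-side check described above amounts to a simple lookup: given officers the model has flagged as high risk for certain call types, filter the available roster before assigning a call. The officer IDs and call types here are made-up placeholders, not details of the Charlotte system:

```python
# Hypothetical sketch of the dispatch filter. The model's output is
# represented as a set of (officer, call_type) pairs flagged as high risk.

HIGH_RISK = {
    ("officer_17", "domestic_abuse"),
    ("officer_17", "suicide"),
    ("officer_23", "suicide"),
}

def eligible_officers(available, call_type):
    """Officers on the roster not flagged as high risk for this call type."""
    return [o for o in available if (o, call_type) not in HIGH_RISK]

roster = ["officer_17", "officer_23", "officer_40"]
print(eligible_officers(roster, "domestic_abuse"))  # officer_17 is held back
```

The point of the design is that the flagged officer is not disciplined, only steered away from the specific call types where the data suggests elevated risk.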
Ghani, who was the Chief Scientist for President Obama’s 2012 re-election campaign, acknowledges data-for-good still has a lot to prove before it enters the mainstream of policy making in government and social organizations.
“The potential is great, but we’re just at the very beginning,” says Ghani. He notes that many organizations have “heard the hype” about data-for-good, but don’t know how it works or how to apply it to their particular issue.
Peter Bull, cofounder of DrivenData, a Boston-based social enterprise that holds data science competitions to help organizations tackle problems, says gathering the right data is critical. “Many nonprofits don’t have data literacy or access to experts to show them how data could be effective,” says Bull.
For an aging-in-place project in Boston, for instance, DrivenData equipped volunteers with accelerometers, motion detectors and video cameras to record detailed information about their activities.
Gideon Mann, Head of Data Science at Bloomberg, says technologists must also push the government to allow academics and researchers to share data. “We have to increase the conversation about data-for-good and create new models of engagement,” he says.
Down the road, Ghani says data scientists must go beyond developing predictive models to testing and fine-tuning interventions that fix the problem. Otherwise, he says, “we watch things happen and pat ourselves on the back for getting the prediction right, but there’s no social impact.”