Data science and analytics are growing at an exponential rate, as the ability to collect, track and interpret data continues to accelerate. Data scientists are trying to deploy this massive data haul to formulate more effective public policy, according to participants at Bloomberg’s annual Data for Good Exchange (D4GX) conference, which took place on Sunday, September 25, 2016 at Bloomberg HQ in New York City.
The more than 450 attendees – including data scientists and representatives from industry, government, non-profits and academia – discussed the wide range of data-driven programs aimed at improving both governance and people’s lives. Attendees also addressed key issues about the ethics, responsibilities and standards of use in data science, amid growing concerns about data privacy, accuracy and algorithmic bias.
“Algorithms have a mathematical veneer that protects them from scrutiny, and it’s not okay,” said keynote speaker Cathy O’Neil, a data scientist and author of the new book “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.”
Speakers described programs that, for example, leverage telecommunications data to determine high-risk areas for malaria, or utilize data from cell phone towers to help urban planners better understand commuter traffic patterns. Presentations also covered data initiatives to help improve access to taxis in New York City for unbanked individuals and how a small town in Saskatchewan, Canada, used sentiment-data analysis to assess the emotional state of post-settlement refugees.
Increasingly, highly targeted data is being used to address the nation’s gun violence epidemic. After the University of Chicago Crime Lab traced weapons used in crimes in the city to three gun shops in Lyons, Illinois, the mayor of Lyons initiated a dialogue between gun shop owners and police and county officials.
The owners agreed to stricter controls on gun sales and committed to training employees to recognize possible “straw” gun purchases (when licensed gun buyers purchase weapons for people without a license). “Data was instrumental to achieving this goal and getting buy-in from gun shop owners,” said Mayor Christopher Getty. He noted that while it is difficult to implement change related to guns, in this particular case, “data gave us cover” by identifying the source of the firearms used in criminal activity.
In a similar way, Holly Howat of the Lafayette Parish Criminal Justice Coordinating Committee described how trace data – which tracked the movement of guns recovered by law enforcement – validated anecdotal evidence that many of the guns used in criminal activity in Lafayette Parish, Louisiana had actually been stolen from unlocked cars.
This discovery led to the development of the ‘Love It/Lock It’ initiative. The marketing campaign used both traditional and social media to encourage local residents to lock their cars. Overall, the project was a success, as the data identified a very specific factor that led to gun violence, while also taking into account local sensitivities about gun ownership.
“We weren’t taking people’s guns away, but making gun ownership more responsible by keeping guns out of the hands of criminals,” Howat said.
Another example of how cities are learning to use data for better governance and more efficient delivery of services is Flint, Michigan, in the aftermath of the city’s water-poisoning crisis. It was also an information crisis, because residents received little or no information about the dangerously high levels of lead in their water. It took an outside researcher using data to call out the issues.
Today, expanded data mining in Flint – including information gathered from digitizing waterlogged service manuals – is being deployed to identify homes that might have toxic levels of lead in their water, and to direct utility workers to the broken water pipes that are contributing to the contamination.
In a panel on how data can be used to help achieve the United Nations’ Sustainable Development Goals (SDGs), attendees mentioned programs for tracking the Zika virus, examining the durability of roofing materials and using maritime tracking techniques to rescue refugees. Yet, despite the growing interest in data-driven approaches to tackling social problems, “the vast potential of data is not being tapped,” according to Robert Kirkpatrick, the director of UN Global Pulse, an initiative that uses data science to promote sustainable development and humanitarian action.
Marcia Odell, senior director of Gender Equality at Plan International USA, sounded a cautionary note about data gathering in developing countries, warning that the way data is collected often overlooks gender inequality issues. For example, data mined from cell phones in the developing world is an unreliable basis for addressing disparities that affect women and girls, because the vast majority of phones are owned by men and boys. “There are gender issues at that level of data, and that is core to how we make decisions,” she said.
The unprecedented global growth of data and data science has also fostered a surge in cybercrime, despite concerted efforts to stem cyber threats. “We are in a heavily infected and unhealthy ecosystem,” said Yurie Ito of CyberGreen, during the “Eliminating Cyber Threat Whack-a-Mole: Developing a Risk-Based Approach” panel. For his part, Manhattan District Attorney Cyrus Vance Jr. described a “tsunami of cybercrime” that is now the second biggest threat to New York City, after terrorism.
Panelists discussed the challenges of protecting data from rampant cyber threats, including the necessary participation of multiple levels of people and organizations. The best defense, according to Scott Carpenter, managing director of Jigsaw at Google, would be “to make protection so simple that people don’t even know it’s happening.”
Of equal importance, data scientists, governments, law enforcement, private industry and cyber-threat prevention organizations must “work together in an unselfish way to solve a worldwide crisis,” added Vance. “We have to move from prosecution mode to collaborative prevention mode.”
Much like fighting cybercrime, rooting out corruption is an ongoing struggle, according to European Union lawmakers, who spoke on a panel about using advanced data intelligence to combat corruption. Jirka Taylor of the RAND Corporation noted that the cost of corruption in the EU could be “north of $1 trillion” or between five and six percent of GDP. Describing corruption as “inherently transnational,” he said, “if we can raise red flags through open data the U.S. can follow, that would be for the greater good.”
Another highlight of D4GX 2016 was a series of presentations by NYC Media Lab around the Bloomberg-sponsored Immersion Day program, which placed university-based data science researchers within public-sector and nonprofit organizations to assess real-world data sets. Big cities figured prominently in these data experiments. For example, a project in Miami, Florida created a predictive model for algae blooms to improve water quality in Biscayne Bay, while another initiative in Rio de Janeiro devised a map to help startup businesses in fast-growing neighborhoods.
One recurring theme throughout the conference was that for data science to be most effective, it must align closely with communities and reflect their needs in an authentic way. This is a foundational concept at The Red Hook Initiative, a youth-empowerment nonprofit in the Red Hook section of Brooklyn, which asks community members to help gather information about their neighborhood.
“We honor the community’s expertise,” said Anthony Schloss, the organization’s director of technology. “Then we look at the data and see what we’ve learned and where we need to go and what actions to take that will make the community better.”