Data scientists are always on the lookout for two things: an important real-world problem and interesting data sets to leverage against it. That’s according to Catherine Cramer, Manager of Industry Engagement at Columbia University’s Data Science Institute. This idea is at the heart of the workshop she is co-leading at the forthcoming Data for Good Exchange 2018, to be held on Sunday, September 16, 2018, at Bloomberg’s Global Headquarters in New York, which will offer participating data scientists a generous helping of both.
“This is a fantastic opportunity for people who are just champing at the bit to use data to solve real problems,” says Cramer.
This is the first year that workshops are being featured at the Data for Good Exchange conference, and Cramer’s is one of four being held. Titled “Addressing Community Challenges with Data-Driven Solutions,” it will be co-led by the New York Hall of Science’s Chief Scientist, Stephen Miles Uzzo, and Vice President of STEM Learning in Communities, Andrés Henriquez.
In this workshop, the real-world problems to be addressed are centered on the community of Corona, in Queens, New York. Many first-generation immigrants reside in Corona – almost two-thirds of its residents were born outside the United States, mostly in Central and South America. Many speak English as a second language, and more than 21 percent of Corona households have incomes below the federal poverty guidelines.
Elmcor Youth & Adult Activities, Inc., a non-profit that will also be represented at the workshop, is among the largest social services agencies serving the population of Corona. Elmcor serves youth, young adults, seniors and others in the community through after-school programs, summer camps, food pantries, senior services, career guidance and preparation, and substance-abuse prevention. And while Elmcor has collected data about its programs for years, it has neither staff with the technical ability nor the time to apply advanced analytical techniques to these data sets.
Of particular interest is data related to work supported by a recent New York State grant for opioid-abuse prevention programs. If Elmcor could develop a more persuasive way to use this data to show the effectiveness of its programs, it would be better positioned to secure increased funding for them. “Anecdotally, they know their programs have an impact, but they want to be able to demonstrate that through data,” says Cramer.
That’s the beauty of focusing on helping the community of Corona at the Data for Good Exchange. “The Data for Good Exchange is just an exemplary confluence of these two communities with mutual interests: people with challenges that need solving on one side and people who possess data science and data visualization skills on the other,” says Cramer. “They’re all interested in coming together for the common good, and they’re all in one place. It doesn’t get any better than that.”
During the workshop, representatives from Elmcor will present some background information about the community, the challenges they face, and the data they have available. Participants will then start to work out practical ways that data science can help address these challenges. The hope is that some of the participants will become interested in a longer, ongoing effort with Elmcor and other community organizations in Queens. The National Science Foundation currently funds related work as part of a larger initiative on developing a framework for data literacy, says Cramer.
“These problems are at the intersection of health, environment, economics, and culture,” says Uzzo. “It’s not like just asking a street sweeper to be sent more often.”
He points out that Corona, while unique in some ways, shares many of the same kinds of problems as other communities around the country. “Whether it’s the Bronx, Detroit or San Francisco, these kinds of problems have high-level patterns that resemble each other,” he says. “If we can address this in Corona, maybe we can then go back to Queensbridge or Compton and understand the ways things intersect in their communities, and what kind of brain trust is needed to address these problems.”
Uzzo says there’s a lot of discussion of what should be done in these communities, but less about what’s actually possible in the near term. The communities themselves, he says, are the best source of information about what needs to happen.
“We are looking to empower people with data literacy and, ultimately, tools to benefit from the revolution in data science, and we see this kind of workshop as an exchange of data science and analytical ideas with the kinds of problems and data that these organizations derive from the community. We see this as a stepping stone to bridge communities of need with data science communities, and are very interested in how participatory design can engage with both more directly.”
Thanks to advances in data science, he adds, “We have a lot more tools than we used to. We need to bring the tools directly to the communities.”
He’s relying on the participants at Data for Good Exchange 2018 to begin making that happen.