Cathy O’Neil believes there is a dark side to numbers. A mathematician by training, she earned her doctorate at Harvard and went on to become a tenure-track professor at Barnard College. In 2007, O’Neil left academia to apply her skills as a quant at hedge fund giant D.E. Shaw. However, the global financial crisis that began in 2008 led O’Neil to question math and its role in fueling many of the world’s problems. The housing bust, financial collapse and rise in unemployment, she explains, “all had been aided and abetted by mathematicians wielding magic formulas.” In early 2011, she quit her hedge fund job.
O’Neil now works as a data scientist, and, as she states on her blog, MathBabe, she hopes to someday have a better answer to the question, “What can a non-academic mathematician do that makes the world a better place?” Through the blog, O’Neil vents about the role that fallible humans play in developing math-powered software that can be used to punish poor people and produce inequities in criminal justice, education and hiring, among other areas. Those ideas are captured in her book, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.
O’Neil will be speaking at Bloomberg Data for Good Exchange, an ongoing event series by Bloomberg LP that showcases how data science can be used to solve problems at the core of society. This conversation has been edited for length.
What will you be speaking about at Data for Good Exchange?
I’ll be speaking about the book I wrote called Weapons of Math Destruction. It talks about how algorithms can be used against people, and how that is having a destructive effect on equality and democracy. It’s exacerbating inequality. We’re using algorithms to reward rich people and prey on poor people.
Is there a way out of this situation we’re in?
Sure. I don’t think it’s inherent for algorithms to be like that. In fact, I don’t think it’s entirely like that. But I think there’s a danger that we’re blindly trusting algorithms because they’re mathematical. I’m calling for awareness that algorithms can have a negative effect and that they’re not inherently unbiased. We shouldn’t be intimidated by them. We should be aware that they can be problematic and start interrogating them.
Are there ways that data can be used for good? Is it all bad?
At the beginning of the book, I talk about which kinds of algorithms I’m worried about. For an algorithm to be something I’m worried about, it has to have three characteristics. Number one: it has to be very widespread and high impact. That’s when it’s used to make important decisions in people’s lives. The second thing is that it’s opaque or secret – we don’t understand what’s going on. The third thing is that it’s destructive.
There are plenty of algorithms out there – I write algorithms myself – that are not like that, so I don’t want to suggest that every algorithm is like that. Moreover, I would argue that many of the algorithms that worry me now could be recast in a more positive way. So I have hope. The main thing I want to do is raise awareness of the potential downsides.
Are there any industries that are particularly bad with how they use or misuse data?
There are three industries that are particularly bad. First, there’s criminal justice. People are only just starting to understand how biased the data of criminal justice is. The second one I would mention is education. There are a bunch of algorithms called teacher growth models or teacher value-added models, and their goal is to identify and get rid of bad teachers. But the problem is that statistically they are very weak – they’re very noisy models. The third example is anything to do with hiring. Large companies will use algorithms to help them hire people. That could be a personality test – an online test that applicants have to go through. There’s no understanding of what those algorithms are doing.
You’ve worked with numbers your whole career. Did writing this book make you disillusioned about data?
I would say the opposite happened. When I quit my job in 2012 to write this book, I was really frustrated. Over the last four years, the field of interrogating algorithms has started to take shape. We’re seeing people like Julia Angwin at ProPublica, who is going to be part of Bloomberg Data for Good Exchange. People are becoming a lot more aware of these issues, so if anything, I’m much happier than I was a few years ago. I’m hopeful, actually. But I still think it’s urgent; I’m not saying our job is done. I just think that people are becoming aware that it needs to get done.
What would you say to people getting into data science?
I would like them to think not only about how to be technically proficient data scientists but also about how to put in place data-oriented monitors to understand how their models are affecting the world.