Data Science is a field of tremendous growth and promise. The ability to combine Big Data, machine learning, artificial intelligence, old-school statistical modeling, and computer programming is giving all industries an opportunity to tackle problems previously considered impossible. From marketing to health care, government to finance, organizations are embracing ever more complex processes with an increasing eye towards automation and algorithmic support. We have seen the human genome mapped down to the base-pair level, batteries capable of supporting whole cities, and self-driving cars. But all of this technological advancement has risks.
As we move ever faster towards a data-driven future, we are seeing an increasing frequency of scandals resulting from non-transparent data gathering, breaches in data privacy, improperly trained algorithms, or unintended consequences when models are used in real-world applications. Laws across the globe struggle to keep up with the types of issues being brought to courts for resolution. As we wrestle with new realities in ownership, privacy, discrimination, and more, there have been calls to specifically regulate what data scientists and other analytics professionals can and cannot do, and with what data. None of these have passed into legislation, yet the question remains: what should data scientists do to ensure that they do not unintentionally damage those they are seeking to help?
That is what Data Science Ethics is all about: determining, partly from the mistakes of those who have gone before, where the boundaries lie in this field of gray. On this site and in the Data Science Ethics Podcast, we will explore many instances where some moral boundary was crossed, usually inadvertently, and what could or should have been done differently.