Following the progress in computing and machine learning algorithms as well as the emergence of big data, artificial intelligence (AI) has become a reality impacting every fabric of our algorithmic society. Despite the explosive growth of machine learning, the common misconception that machines operate on zeros and ones, therefore they should be objective, still holds. But then, why does Google Translate convert these Turkish sentences with gender-neutral pronouns, “O bir doktor. O bir hemşire”, to these English sentences, “He is a doctor. She is a nurse”? As data-driven machine learning brings forth a plethora of challenges, I analyze what could go wrong when algorithms make decisions on behalf of individuals and society if they acquire statistical knowledge of language from historical human data.
In this talk, I show how we can repurpose machine learning as a scientific tool to discover facts about artificial and natural intelligence, and assess social constructs. I prove that machines trained on societal linguistic data inevitably inherit the biases of society. To do so, I derive a method that investigates the construct of language models trained on billions of sentences collected from the World Wide Web. I conclude the talk with future directions and open research questions in the field of ethics of machine learning.