Abeba Birhane | |
---|---|
Born | |
Alma mater | Bahir Dar University (BSc, BA) University College Dublin (MSc, PhD) |
Known for | Algorithmic bias Critical race theory Computer vision |
Scientific career | |
Fields | Cognitive Science Computer Science |
Institutions | University College Dublin Deepmind |
Abeba Birhane is an Ethiopian-born cognitive scientist who works at the intersection of complex adaptive systems, machine learning, algorithmic bias, and critical race studies. Birhane's work with Vinay Prabhu uncovered that large-scale image datasets commonly used to develop AI systems, including ImageNet and 80 Million Tiny Images, carried racist and misogynistic labels and offensive images. [1] [2] [3] She has been recognized by VentureBeat as a top innovator in computer vision [4] and named as one of the 100 most influential persons in AI 2023 by TIME magazine. [5]
Birhane was born in Ethiopia. [6] She received her Bachelors of Science in Psychology and a Bachelors of Arts in Philosophy from The Open University. [7] In 2015, she completed her Master of Science in Cognitive Science and, in 2021, her Ph.D. at the Complex Software Lab in the School of Computer Science at University College Dublin. [8] [9]
Birhane studied the impacts of emerging AI technologies and how they shape individuals and local communities. She found that AI algorithms tend to disproportionately impact vulnerable groups such as older workers, trans people, immigrants, and children. Her research on relational ethics won the best paper award at NeurIPS’s Black in AI workshop in 2019. [10] She has also studied and written about algorithmic colonization driven by corporate agendas. [11] [12] Her work in decolonizing computational sciences addressed the inherited oppressions in current systems especially towards women of color.
In 2020, Birhane and Vinay Prabhu, principal machine learning scientist at UnifyID, published a paper examining the problematic data collection, labelling, classification, and consequences of large image datasets. [13] These datasets, including ImageNet and MIT's 80 Million Tiny Images, have been used to develop thousands of AI algorithms and systems. Birhane and Prabhu found that they contained many racist and misogynistic labels and slurs as well as offensive images. [14] This resulted in MIT voluntarily and formally taking down the 80 Million Tiny Images dataset. [15] [16] [17]
More recently, Birhane has worked with Rediet Abebe, George Obaido, and Sekou Remy on researching the barriers to data sharing in Africa. They found that power imbalances are significant in the data sharing process, even when the data comes from Africa. Their research was published at the ACM Conference on Fairness, Accountability, and Transparency. [18]
In 2024, Birhane established the AI Accountability Lab research group at Trinity College Dublin. [19]
Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without explicit instructions. Advances in the field of deep learning have allowed neural networks to surpass many previous approaches in performance.
Deep learning is a subset of machine learning that focuses on utilizing neural networks to perform tasks such as classification, regression, and representation learning. The field takes inspiration from biological neuroscience and is centered around stacking artificial neurons into layers and "training" them to process data. The adjective "deep" refers to the use of multiple layers in the network. Methods used can be either supervised, semi-supervised or unsupervised.
DeepMind Technologies Limited, also known by its trade name Google DeepMind, is a British-American artificial intelligence research laboratory which serves as a subsidiary of Google. Founded in the UK in 2010, it was acquired by Google in 2014 and merged with Google AI's Google Brain division to become Google DeepMind in April 2023. The company is based in London, with research centres in Canada, France, Germany, and the United States.
Eclipse Deeplearning4j is a programming library written in Java for the Java virtual machine (JVM). It is a framework with wide support for deep learning algorithms. Deeplearning4j includes implementations of the restricted Boltzmann machine, deep belief net, deep autoencoder, stacked denoising autoencoder and recursive neural tensor network, word2vec, doc2vec, and GloVe. These algorithms all include distributed parallel versions that integrate with Apache Hadoop and Spark.
Adversarial machine learning is the study of the attacks on machine learning algorithms, and of the defenses against such attacks. A survey from May 2020 exposes the fact that practitioners report a dire need for better protecting machine learning systems in industrial applications.
Artificial intelligence in healthcare is the application of artificial intelligence (AI) to analyze and understand complex medical and healthcare data. In some cases, it can exceed or augment human capabilities by providing better or faster ways to diagnose, treat, or prevent disease.
The CIFAR-10 dataset is a collection of images that are commonly used to train machine learning and computer vision algorithms. It is one of the most widely used datasets for machine learning research. The CIFAR-10 dataset contains 60,000 32x32 color images in 10 different classes. The 10 different classes represent airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks. There are 6,000 images of each class.
Joy Adowaa Buolamwini is a Canadian-American computer scientist and digital activist formerly based at the MIT Media Lab. She founded the Algorithmic Justice League (AJL), an organization that works to challenge bias in decision-making software, using art, advocacy, and research to highlight the social implications and harms of artificial intelligence (AI).
Artificial intelligence art is visual artwork created or enhanced through the use of artificial intelligence (AI) programs.
Amazon Rekognition is a cloud-based software as a service (SaaS) computer vision platform that was launched in 2016. It has been sold to, and used by, a number of United States government agencies, including U.S. Immigration and Customs Enforcement (ICE) and Orlando, Florida police, as well as private entities.
Computer Vision Annotation Tool (CVAT) is a free, open source, web-based image and video annotation tool used for labeling data for computer vision algorithms. Originally developed by Intel, CVAT is designed for use by a professional data annotation team, with a user interface optimized for computer vision annotation tasks.
Rediet Abebe is an Ethiopian computer scientist working in algorithms and artificial intelligence. She is an assistant professor of computer science at the University of California, Berkeley. Previously, she was a Junior Fellow at the Harvard Society of Fellows.
80 Million Tiny Images is a dataset intended for training machine learning systems constructed by Antonio Torralba, Rob Fergus, and William T. Freeman in a collaboration between MIT and New York University. It was published in 2008.
Toloka, based in Amsterdam, is a crowdsourcing and generative AI services provider.
Inioluwa Deborah Raji is a Nigerian-Canadian computer scientist and activist who works on algorithmic bias, AI accountability, and algorithmic auditing. Raji has previously worked with Joy Buolamwini, Timnit Gebru, and the Algorithmic Justice League on researching gender and racial bias in facial recognition technology. She has also worked with Google’s Ethical AI team and been a research fellow at the Partnership on AI and AI Now Institute at New York University working on how to operationalize ethical considerations in machine learning engineering practice. A current Mozilla fellow, she has been recognized by MIT Technology Review and Forbes as one of the world's top young innovators.
Karen Hao is an American journalist and data scientist. Currently a contributing writer for The Atlantic and previously a foreign correspondent based in Hong Kong for The Wall Street Journal and senior artificial intelligence editor at the MIT Technology Review, she is best known for her coverage on AI research, technology ethics and the social impact of AI. Hao also co-produces the podcast In Machines We Trust and writes the newsletter The Algorithm.
The Fashion MNIST dataset is a large freely available database of fashion images that is commonly used for training and testing various machine learning systems. Fashion-MNIST was intended to serve as a replacement for the original MNIST database for benchmarking machine learning algorithms, as it shares the same image size, data format and the structure of training and testing splits.
Meta AI is a company owned by Meta that develops artificial intelligence and augmented and artificial reality technologies. Meta AI deems itself an academic research laboratory, focused on generating knowledge for the AI community, and should not be confused with Meta's Applied Machine Learning (AML) team, which focuses on the practical applications of its products.
LAION is a German non-profit which makes open-sourced artificial intelligence models and datasets. It is best known for releasing a number of large datasets of images and captions scraped from the web which have been used to train a number of high-profile text-to-image models, including Stable Diffusion and Imagen.