Algorithmic accountability


Algorithmic accountability refers to the question of how responsibility should be apportioned for the real-world consequences of actions taken on the basis of decisions reached by algorithms. [1]


In principle, an algorithm should be designed so that no bias enters the decisions it makes during execution. That is, the algorithm should evaluate only the essential characteristics of the inputs it is given, without drawing distinctions based on characteristics that should normally play no role in a social context, such as the ethnicity of an individual being judged in a court of law. However, this principle is not always respected, and on occasion individuals may even be deliberately harmed by the resulting outcomes. It is at this point that the debate arises about who should be held responsible for the losses caused by a decision made by a machine: the system itself, or the individuals who designed it with those parameters, since a decision that harms others through lack of impartiality or flawed data analysis happens because the algorithm was built to behave that way. [2]
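As a minimal illustration of this principle (a sketch with hypothetical field names, not an implementation described in the sources), a scoring routine can be written so that sensitive attributes are explicitly excluded from the decision, although excluding them is not by itself a guarantee of impartiality:

# Minimal sketch: score applicants only on characteristics deemed essential,
# deliberately ignoring sensitive attributes such as ethnicity.
# All field names and weights are hypothetical.

SENSITIVE_ATTRIBUTES = {"ethnicity", "gender"}

def score_applicant(applicant: dict) -> float:
    """Compute a score from non-sensitive fields only."""
    # Explicitly drop sensitive attributes before scoring.
    features = {k: v for k, v in applicant.items() if k not in SENSITIVE_ATTRIBUTES}
    # Toy linear rule; a real system would use a trained, audited model.
    return 0.6 * features["payment_history"] + 0.4 * features["income_stability"]

applicant = {
    "payment_history": 0.9,
    "income_stability": 0.7,
    "ethnicity": "recorded but unused",
}
print(score_applicant(applicant))

# Caveat: removing the sensitive field is not sufficient on its own, because
# other inputs (for example, a postcode) can act as proxies for it.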

Algorithm usage

Algorithms today are spread across the most diverse sectors of society that rely on computational techniques in their control systems, in applications of every size and purpose, present in, but not limited to, medical, transportation and payment services. [3] In these sectors, the algorithms embedded in the applications carry out a wide range of decision-making activities. [4]

The way these algorithms are implemented, however, can be quite opaque. In effect, algorithms generally behave like black boxes: in most cases, the process an input goes through during the execution of a particular routine is unknown, and only the output associated with that input can be observed. [5] There is usually no knowledge of the parameters that make up the algorithm, or of how biased toward certain aspects they may be, which can raise suspicions about the way an algorithm treats a given set of inputs. Such suspicions depend on the outputs generated after execution, particularly when an individual feels harmed by the result presented, especially if another individual, under similar conditions, ends up getting a different answer. According to Nicholas Diakopoulos:

But these algorithms can make mistakes. They have biases. Yet they sit in opaque black boxes, their inner workings, their inner “thoughts” hidden behind layers of complexity. We need to get inside that black box, to understand how they may be exerting power on us, and to understand where they might be making unjust mistakes.
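One way to probe such a black box from the outside, sketched below with an invented stand-in system and made-up inputs, is to submit two inputs that differ only in a single attribute and compare the resulting outputs:

# Illustrative sketch of an external "black box" audit (an assumed approach,
# not one described in the sources): query the system with two inputs that
# are identical except for one attribute and compare the outputs.

def black_box_decision(applicant: dict) -> str:
    """Stand-in for an opaque system whose internals the auditor cannot see."""
    threshold = 0.75 if applicant.get("neighborhood") == "A" else 0.65
    return "approved" if applicant["score"] >= threshold else "denied"

base = {"score": 0.7, "neighborhood": "A"}
variant = {**base, "neighborhood": "B"}

for case in (base, variant):
    print(case["neighborhood"], "->", black_box_decision(case))

# Different outcomes for otherwise identical inputs are a signal worth
# investigating, even without access to the algorithm's internals.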

Wisconsin Supreme Court case

As mentioned before, algorithms are widespread across the most diverse fields of knowledge and make decisions that affect the lives of the entire population, while their structure and parameters are often unknown to those affected by them. A case that illustrates this well is a ruling by the Wisconsin Supreme Court regarding so-called "risk assessment" scores for crime. [3] The court ruled that such a score, computed by an algorithm from various parameters about an individual, cannot be used as the determining factor in whether an accused person is arrested. In addition, and more importantly, the court ruled that all reports submitted to judges in such cases must contain information about the accuracy of the algorithm used to calculate the scores.

This ruling has been considered a major victory for how a data-driven society should deal with software that makes decisions, and for how to make such software reliable, since the use of these algorithms in highly complex settings such as courts requires a very high degree of impartiality in handling the data provided as input. However, advocates of big data argue that much remains to be done regarding the accuracy of algorithmic results, since there is still no concrete way to understand what happens during data processing, leaving room for doubt about the suitability of the algorithm or of those who designed it. [citation needed]
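A minimal sketch of the kind of disclosure the ruling points toward (the field names and numbers below are invented): the risk score handed to a judge is bundled with the reported accuracy of the algorithm that produced it.

# Hypothetical sketch: a risk-score report that carries an accuracy disclosure.
# The values and field names are invented for illustration only.

def build_report(defendant_id: str, risk_score: float,
                 validation_accuracy: float) -> dict:
    """Bundle the score with the accuracy of the algorithm that produced it."""
    return {
        "defendant_id": defendant_id,
        "risk_score": risk_score,
        "model_validation_accuracy": validation_accuracy,
        "note": "Score is advisory and must not be the determining factor.",
    }

print(build_report("case-001", risk_score=0.62, validation_accuracy=0.71))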

Controversies

Another case involving the possibility of biased execution by an algorithm was the subject of an article in The Washington Post [6] about the ride-hailing service Uber. An analysis of the collected data showed that the estimated waiting time for users of the service was higher in some neighborhoods than in others, with the neighborhood's majority ethnicity and average income being the main factors associated with the increase.

In that analysis, neighborhoods with a majority white population and higher purchasing power had lower waiting times, while neighborhoods with a majority of other ethnicities and lower average income had higher waiting times. It is important to make clear, however, that this conclusion was based on the data collected and does not necessarily represent a cause-and-effect relationship, only a possible correlation, and no value judgment is made here about the behavior of the Uber app in these situations.
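The analysis described above can be illustrated with a small sketch over invented records, grouping estimated waiting times by a neighborhood attribute; it shows how such a correlation is derived from collected data, and makes no claim about Uber's actual behavior:

# Sketch of the kind of analysis described above, on invented data: average
# estimated wait time grouped by a neighborhood attribute.

from collections import defaultdict
from statistics import mean

# (neighborhood_majority, estimated_wait_minutes) -- hypothetical records
records = [
    ("white_majority", 4.1), ("white_majority", 3.8), ("white_majority", 4.5),
    ("non_white_majority", 6.2), ("non_white_majority", 5.9), ("non_white_majority", 6.8),
]

by_group = defaultdict(list)
for group, wait in records:
    by_group[group].append(wait)

for group, waits in by_group.items():
    print(f"{group}: mean wait {mean(waits):.1f} min over {len(waits)} rides")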

In an article published in the "Direito Digit@l" column on the Migalhas website, [7] Coriolano Almeida Camargo and Marcelo Crespo discuss the use of algorithms to make decisions in contexts previously occupied by human beings, and the difficulties of validating whether a decision made by a machine was fair.

The issue goes beyond, and will continue to go beyond, the concern with how data is collected from consumers, to the question of how this data is used by algorithms. Despite the existence of some consumer protection regulations, there is no effective mechanism available to consumers that tells them, for example, whether they have been automatically discriminated against by being denied a loan or a job.

The great technological evolution we are experiencing has brought a wide range of innovations to society, among them the concept of autonomous vehicles controlled by systems, that is, by algorithms embedded in the devices. These algorithms control the entire process of navigating streets and roads, and they face situations in which they must collect data, evaluate the environment and the context in which they operate, and decide what actions to take at each moment, simulating the actions of a human driver behind the wheel.

In the same article, Camargo and Crespo discuss the problems that can arise from the use of embedded algorithms in autonomous cars, especially with regard to decisions made at critical moments while the vehicles are in use.

The technological landscape is changing rapidly with the advent of very powerful computers and algorithms that are moving toward the impressive development of artificial intelligence. We have no doubt that artificial intelligence will revolutionize the provision of services and industry as well. The problem is that ethical issues urgently need to be thought through and discussed. Are we simply going to allow machines to judge us in court cases? Are we going to let them decide who should live or die in accident situations in which some technological equipment, such as an autonomous car, could intervene?

On the TechCrunch website, Hemant Taneja wrote: [8]

Concern about “black box” algorithms that govern our lives has been spreading. New York University’s Information Law Institute hosted a conference on algorithmic accountability, noting: “Scholars, stakeholders, and policymakers question the adequacy of existing mechanisms governing algorithmic decision-making and grapple with new challenges presented by the rise of algorithmic power in terms of transparency, fairness, and equal treatment.” Yale Law School’s Information Society Project is studying this, too. “Algorithmic modeling may be biased or limited, and the uses of algorithms are still opaque in many critical sectors,” the group concluded.

Possible solutions

Experts have already held discussions on the subject in an attempt to find viable ways of understanding what goes on inside the black boxes that "guard" the algorithms. It has been argued, above all, that the companies that develop the code and are responsible for running the data analysis algorithms should themselves ensure the reliability of their systems, for example by disclosing what goes on "behind the scenes" in their algorithms.
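What such disclosure could look like in practice is sketched below, under the assumption (not drawn from the sources) that every automated decision is logged together with the inputs, parameters and threshold that produced it:

# Minimal sketch of behind-the-scenes disclosure: each automated decision is
# recorded with the factors that produced it, so it can be reviewed later.
# The weights, inputs and threshold are hypothetical.

import json
from datetime import datetime, timezone

def decide_and_log(inputs: dict, weights: dict, threshold: float) -> bool:
    score = sum(weights[k] * inputs[k] for k in weights)
    decision = score >= threshold
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "inputs": inputs,
        "weights": weights,  # the parameters actually used
        "score": score,
        "threshold": threshold,
        "decision": decision,
    }
    print(json.dumps(record, indent=2))  # in practice: an auditable store
    return decision

decide_and_log({"payment_history": 0.9, "debt_ratio": 0.3},
               {"payment_history": 0.7, "debt_ratio": -0.4}, threshold=0.4)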

In the same TechCrunch article, Taneja wrote: [8]

...these new utilities (the Googles, Amazons and Ubers of the world) must proactively build algorithmic accountability into their systems, faithfully and transparently act as their own watchdogs or risk eventual onerous regulation.

The excerpt above suggests one possible path: introducing regulation for the computing sectors that run these algorithms, so that their activities are effectively supervised during execution. However, such regulation could end up burdening software companies and developers, and it would possibly be more advantageous for them to willingly open up and disclose what is being executed and which parameters are used for decision making, which could even end up benefiting the companies themselves with regard to how the solutions they develop and apply work.

Another possibility under discussion is self-regulation by the developer companies themselves, implemented through software. [8]

Taneja wrote: [8]

There’s another benefit — perhaps a huge one — to software-defined regulation. It will also show us a path to a more efficient government. The world’s legal logic and regulations can be coded into software and smart sensors can offer real-time monitoring of everything from air and water quality, traffic flows and queues at the DMV. Regulators define the rules, technologists create the software to implement them and then AI and ML help refine iterations of policies going forward. This should lead to much more efficient, effective governments at the local, national and global levels.
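A toy example of what "coding regulations into software" might mean in practice, with an invented pollutant limit and sensor format, is a rule check applied to streamed sensor readings:

# Sketch of "regulation as code" in the spirit of the quoted passage: a legal
# limit is encoded as a machine-checkable rule and applied to sensor readings.
# The limit value and reading format are hypothetical.

PM25_LIMIT_UG_M3 = 25.0  # assumed regulatory threshold, for illustration only

def check_air_quality(readings: list[float]) -> list[str]:
    """Flag every reading that exceeds the coded limit."""
    return [
        f"reading {i}: {value} µg/m³ exceeds limit {PM25_LIMIT_UG_M3}"
        for i, value in enumerate(readings)
        if value > PM25_LIMIT_UG_M3
    ]

for violation in check_air_quality([12.3, 31.0, 24.9, 40.2]):
    print(violation)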



References

  1. Shah, H. (2018). "Algorithmic accountability". Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences. 376 (2128): 20170362. Bibcode:2018RSPTA.37670362S. doi:10.1098/rsta.2017.0362. PMID 30082307. S2CID 51926550.
  2. Kobie, Nicole. "Who do you blame when an algorithm gets you fired?". Wired. Retrieved March 2, 2023.
  3. Angwin, Julia (August 2016). "Make Algorithms Accountable". The New York Times. Retrieved March 2, 2023.
  4. Kroll; Huey; Barocas; Felten; Reidenberg; Robinson; Yu (2016). Accountable Algorithms. University of Pennsylvania. SSRN 2765268.
  5. "Algorithmic Accountability & Transparency". Nick Diakopoulos. Archived from the original on January 21, 2016. Retrieved March 3, 2023.
  6. Stark, Jennifer; Diakopoulos, Nicholas (March 10, 2016). "Uber seems to offer better service in areas with more white people. That raises some tough questions". The Washington Post. Retrieved March 2, 2023.
  7. Santos, Coriolano Aurélio de Almeida Camargo; Chevtchuk, Leila (October 28, 2016). "Por quê precisamos de uma agenda para discutir algoritmos?" [Why do we need an agenda to discuss algorithms?]. Migalhas (in Portuguese). Retrieved March 4, 2023.
  8. Taneja, Hemant (September 8, 2016). "The need for algorithmic accountability". TechCrunch. Retrieved March 4, 2023.
