COMPAS (software)

Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) [1] is a case management and decision support tool developed and owned by Northpointe (now Equivant), used by U.S. courts to assess the likelihood of a defendant becoming a recidivist. [2] [3]

COMPAS has been used in the U.S. states of New York, Wisconsin, and California, in Florida's Broward County, and in other jurisdictions. [4]

Risk assessment

The COMPAS software uses an algorithm to assess potential recidivism risk. Northpointe created risk scales for general and violent recidivism, and for pretrial misconduct. According to the COMPAS Practitioner's Guide, the scales were designed using behavioral and psychological constructs "of very high relevance to recidivism and criminal careers." [5]

Pretrial release risk scale
Pretrial risk is a measure of the potential for an individual to fail to appear and/or to commit new felonies while on release. According to the research that informed the creation of the scale, "current charges, pending charges, prior arrest history, previous pretrial failure, residential stability, employment status, community ties, and substance abuse" are the most significant indicators affecting pretrial risk scores. [5]
General recidivism scale
The general recidivism scale is designed to predict new offenses committed after release and after the COMPAS assessment is given. The scale uses an individual's criminal history and associates, drug involvement, and indications of juvenile delinquency. [6]
Violent recidivism scale
The violent recidivism score is meant to predict violent offenses following release. The scale uses data or indicators that include a person's "history of violence, history of non-compliance, vocational/educational problems, the person's age-at-intake and the person's age-at-first-arrest." [7]

The violent recidivism risk scale is calculated as follows:

$s = a(-w) + a_{\text{first}}(-w) + h_{\text{violence}}\,w + v_{\text{edu}}\,w + h_{\text{nc}}\,w$

where $s$ is the violent recidivism risk score, $w$ is a weight multiplier, $a$ is the current age, $a_{\text{first}}$ is the age at first arrest, $h_{\text{violence}}$ is the history of violence, $v_{\text{edu}}$ is the vocational education scale, and $h_{\text{nc}}$ is the history of noncompliance. The weight, $w$, is "determined by the strength of the item's relationship to person offense recidivism that we observed in our study data." [8]
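A minimal sketch of a weighted linear scale of this form is shown below. The weight values and the function name are hypothetical placeholders for illustration; the actual COMPAS weights, scaling, and score normalization are proprietary.

# Illustrative sketch of a weighted linear risk scale of the form described
# above. The weights below are invented placeholders; the actual COMPAS
# weights, scaling, and normalization are proprietary.
from typing import Dict, Optional


def violent_recidivism_raw_score(
    current_age: float,
    age_at_first_arrest: float,
    history_of_violence: float,
    vocational_education: float,
    history_of_noncompliance: float,
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Return a raw weighted sum; higher values indicate higher assessed risk."""
    w = weights or {
        "age": 2.0,            # hypothetical: higher current age lowers the score
        "age_first": 1.5,      # hypothetical: later first arrest lowers the score
        "violence": 3.0,       # hypothetical weights for the remaining items
        "voc_edu": 1.0,
        "noncompliance": 2.5,
    }
    return (
        current_age * (-w["age"])
        + age_at_first_arrest * (-w["age_first"])
        + history_of_violence * w["violence"]
        + vocational_education * w["voc_edu"]
        + history_of_noncompliance * w["noncompliance"]
    )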

Interventions by AI and algorithms in the courtroom are usually motivated by cognitive biases in human decision-making, such as the hungry judge effect. [9]

In July 2016, the Wisconsin Supreme Court ruled that COMPAS risk scores can be considered by judges during sentencing, but the scores must be accompanied by warnings that convey the tool's "limitations and cautions." [4]

A general critique of the use of proprietary software such as COMPAS is that, because the algorithms it uses are trade secrets, they cannot be examined by the public or by affected parties, which may violate due process. Additionally, simple, transparent, and more interpretable algorithms (such as linear regression) have been shown to predict recidivism approximately as well as the COMPAS algorithm. [10] [11] [12]
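For illustration, the sketch below fits a logistic regression, one of the simple, transparent model classes these studies point to. The data and feature names are synthetic and invented for the example; this is not the COMPAS model or its training data, only an example of an interpretable baseline.

# Sketch of a simple, interpretable baseline (logistic regression) of the kind
# studies have compared against COMPAS. All data here are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([
    rng.integers(18, 70, n),   # current age (synthetic)
    rng.integers(0, 15, n),    # number of prior offenses (synthetic)
    rng.integers(0, 2, n),     # pending charge indicator (synthetic)
])
# Synthetic outcome loosely tied to the features, just to make the example run.
logit = 0.25 * X[:, 1] - 0.04 * X[:, 0] + 0.5 * X[:, 2]
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Unlike a proprietary scoring model, the coefficients are directly inspectable.
print("coefficients:", model.coef_[0])
print("held-out accuracy:", model.score(X_test, y_test))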

Another general criticism of machine-learning-based algorithms is that, because they are data-dependent, biased data will likely produce biased results. [13]

Specifically, COMPAS risk assessments have been argued to violate 14th Amendment Equal Protection rights on the basis of race, since the algorithms are argued to be racially discriminatory, to result in disparate treatment, and to not be narrowly tailored. [14]

Accuracy

In 2016, Julia Angwin was co-author of a ProPublica investigation of the algorithm. [15] The team found that "blacks are almost twice as likely as whites to be labeled a higher risk but not actually re-offend," whereas COMPAS "makes the opposite mistake among whites: They are much more likely than blacks to be labeled lower-risk but go on to commit other crimes." [15] [10] [16] They also found that only 20 percent of people predicted to commit violent crimes actually went on to do so. [15]
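The disparity ProPublica described amounts to a gap in false positive and false negative rates between groups. The sketch below shows how those error rates are computed per group; the arrays are toy values, not the actual Broward County records ProPublica analyzed.

# False positive rate: labeled higher risk but did not reoffend.
# False negative rate: labeled lower risk but did reoffend.
# Toy data for illustration only.
import numpy as np

high_risk = np.array([1, 1, 0, 0, 1, 0, 1, 0, 1, 0])    # 1 = labeled higher risk
reoffended = np.array([0, 1, 0, 1, 0, 0, 1, 0, 1, 1])   # 1 = actually reoffended
group = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

for g in np.unique(group):
    m = group == g
    fpr = high_risk[m][reoffended[m] == 0].mean()        # among non-reoffenders
    fnr = (1 - high_risk[m][reoffended[m] == 1]).mean()  # among reoffenders
    print(f"group {g}: false positive rate={fpr:.2f}, false negative rate={fnr:.2f}")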

In a letter, Northpointe criticized ProPublica's methodology and stated that: "[The company] does not agree that the results of your analysis, or the claims being made based upon that analysis, are correct or that they accurately reflect the outcomes from the application of the model." [15]

Another team, at Community Resources for Justice, a criminal justice think tank, published a rebuttal of the investigation's findings. [17] Among several objections, the CRJ rebuttal concluded that ProPublica's results "contradict several comprehensive existing studies concluding that actuarial risk can be predicted free of racial and/or gender bias." [17]

A subsequent study has shown that COMPAS software is somewhat more accurate than individuals with little or no criminal justice expertise, yet less accurate than groups of such individuals. [18] They found that "On average, they got the right answer 63 percent of their time, and the group's accuracy rose to 67 percent if their answers were pooled. COMPAS, by contrast, has an accuracy of 65 percent." [10] Researchers from the University of Houston found that COMPAS does not conform to group fairness criteria and produces various kinds of unfair outcomes across sex- and race-based demographic groups. [19]
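The pooling step in that comparison is a majority vote over individual predictions. The sketch below simulates it with independent respondents; that independence is a simplifying assumption (real respondents' errors are correlated), so the simulated pooled gain is larger than the 63-to-67 percent improvement reported in the study.

# Majority-vote pooling of individual judgments, as in the crowd comparison.
# Simulated toy data; not the Dressel & Farid study data.
import numpy as np

rng = np.random.default_rng(1)
n_cases, n_people = 1000, 20
truth = rng.integers(0, 2, n_cases)                    # 1 = reoffended
# Each simulated respondent is right about 63% of the time, independently.
correct = rng.random((n_people, n_cases)) < 0.63
predictions = np.where(correct, truth, 1 - truth)

individual_acc = (predictions == truth).mean()
pooled = (predictions.mean(axis=0) > 0.5).astype(int)  # majority vote
pooled_acc = (pooled == truth).mean()

print(f"mean individual accuracy: {individual_acc:.2f}")
print(f"pooled (majority-vote) accuracy: {pooled_acc:.2f}")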


References

  1. "DOC COMPAS" . Retrieved 2023-04-04.
  2. Sam Corbett-Davies, Emma Pierson, Avi Feller and Sharad Goel (October 17, 2016). "A computer program used for bail and sentencing decisions was labeled biased against blacks. It's actually not that clear". The Washington Post . Retrieved January 1, 2018.{{cite news}}: CS1 maint: multiple names: authors list (link)
  3. Aaron M. Bornstein (December 21, 2017). "Are Algorithms Building the New Infrastructure of Racism?". Nautilus . No. 55. Retrieved January 2, 2018.
  4. 1 2 Kirkpatrick, Keith (2017-01-23). "It's not the algorithm, it's the data". Communications of the ACM. 60 (2): 21–23. doi:10.1145/3022181. S2CID   33993859.
  5. 1 2 Northpointe 2015, p. 27.
  6. Northpointe 2015, p. 26.
  7. Northpointe 2015, p. 28.
  8. Northpointe 2015, p. 29.
  9. Chatziathanasiou, Konstantin (May 2022). "Beware the Lure of Narratives: "Hungry Judges" Should Not Motivate the Use of "Artificial Intelligence" in Law". German Law Journal. 23 (4): 452–464. doi: 10.1017/glj.2022.32 . ISSN   2071-8322. S2CID   249047713.
  10. 1 2 3 Yong, Ed (2018-01-17). "A Popular Algorithm Is No Better at Predicting Crimes Than Random People" . Retrieved 2019-11-21.
  11. Angelino, Elaine; Larus-Stone, Nicholas; Alabi, Daniel; Seltzer, Margo; Rudin, Cynthia (June 2018). "Learning Certifiably Optimal Rule Lists for Categorical Data". Journal of Machine Learning Research. 18 (234): 1–78. arXiv: 1704.01701 . Retrieved 2023-07-20.
  12. Robin A. Smith. Opening the lid on criminal sentencing software. Duke Today, 19 July 2017
  13. O'Neil, Cathy (2016). Weapons of Math Destruction. Crown. p. 87. ISBN   978-0553418811.
  14. Thomas, C.; Nunez, A. (2022). "Automating Judicial Discretion: How Algorithmic Risk Assessments in Pretrial Adjudications Violate Equal Protection Rights on the Basis of Race". Law & Inequality . 40 (2): 371–407. doi: 10.24926/25730037.649 .
  15. 1 2 3 4 Angwin, Julia; Larson, Jeff (2016-05-23). "Machine Bias". ProPublica. Retrieved 2019-11-21.
  16. Israni, Ellora (2017-10-26). "When an Algorithm Helps Send You to Prison (Opinion)". The New York Times. Retrieved 2019-11-21.
  17. 1 2 Flores, Anthony; Lowenkamp, Christopher; Bechtel, Kristin. "False Positives, False Negatives, and False Analyses" (PDF). Community Resources for Justice. Retrieved 2019-11-21.
  18. Dressel, Julia; Farid, Hany (2018-01-17). "The accuracy, fairness, and limits of predicting recidivism". Science Advances. 4 (1): eaao5580. Bibcode:2018SciA....4.5580D. doi: 10.1126/sciadv.aao5580 . PMC   5777393 . PMID   29376122.
  19. Gursoy, Furkan; Kakadiaris, Ioannis A. (2022-11-28). "Equal Confusion Fairness: Measuring Group-Based Disparities in Automated Decision Systems". 2022 IEEE International Conference on Data Mining Workshops (ICDMW). IEEE. pp. 137–146. arXiv: 2307.00472 . doi:10.1109/ICDMW58026.2022.00027. ISBN   979-8-3503-4609-1. S2CID   256669476.