Archetypal analysis

Archetypal analysis is an unsupervised learning method in statistics, similar to cluster analysis, that was introduced by Adele Cutler and Leo Breiman in 1994. Rather than "typical" observations (cluster centers), it seeks extremal points in the multidimensional data, the "archetypes". The archetypes are convex combinations of observations, chosen so that each observation can in turn be approximated by a convex combination of the archetypes.
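The two-level convex structure described above can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not necessarily the algorithm of the original paper: it assumes a squared-error objective ‖X − ABX‖², where the rows of A and B are constrained to the probability simplex, and it uses a simple alternating projected-gradient scheme chosen here for brevity. The function names `project_rows_to_simplex` and `archetypal_analysis` and their parameters are hypothetical.

```python
import numpy as np


def project_rows_to_simplex(M):
    """Euclidean projection of each row of M onto the probability simplex."""
    n, d = M.shape
    U = np.sort(M, axis=1)[:, ::-1]              # each row sorted in descending order
    css = np.cumsum(U, axis=1) - 1.0
    cond = U - css / np.arange(1, d + 1) > 0
    rho = d - np.argmax(cond[:, ::-1], axis=1)   # 1-based index of the last True per row
    theta = css[np.arange(n), rho - 1] / rho
    return np.maximum(M - theta[:, None], 0.0)


def archetypal_analysis(X, k, n_iter=500, seed=0):
    """Approximate X (n x d) by A @ B @ X, with rows of A and B on the simplex.

    Returns A (n x k), B (k x n) and the archetypes Z = B @ X (k x d).
    Assumption: alternating projected gradient with step 1/L per block.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = project_rows_to_simplex(rng.random((n, k)))  # observation -> archetype weights
    B = project_rows_to_simplex(rng.random((k, n)))  # archetype -> observation weights
    sx2 = np.linalg.norm(X, 2) ** 2                  # squared largest singular value of X
    for _ in range(n_iter):
        Z = B @ X                                    # current archetypes
        # Step on A: gradient of 0.5 * ||A Z - X||^2 is (A Z - X) Z^T.
        L_A = np.linalg.norm(Z, 2) ** 2 + 1e-12
        A = project_rows_to_simplex(A - ((A @ Z - X) @ Z.T) / L_A)
        # Step on B: gradient of 0.5 * ||A B X - X||^2 is A^T (A B X - X) X^T.
        L_B = np.linalg.norm(A, 2) ** 2 * sx2 + 1e-12
        B = project_rows_to_simplex(B - (A.T @ (A @ B @ X - X) @ X.T) / L_B)
    return A, B, B @ X


# Toy usage: points scattered inside a triangle, three archetypes requested.
rng = np.random.default_rng(0)
corners = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
weights = rng.dirichlet(np.ones(3), size=100)
data = weights @ corners
A_hat, B_hat, Z_hat = archetypal_analysis(data, k=3)
print(Z_hat)  # estimated archetypes, each a convex combination of the data points
```

Because each archetype is a convex combination of observations, the rows of `Z_hat` always lie inside the convex hull of the data; how close they come to the extremes depends on the optimizer, the number of iterations and the initialization.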

Related Research Articles

<span class="mw-page-title-main">Convex hull</span> Smallest convex set containing a given set

In geometry, the convex hull or convex envelope or convex closure of a shape is the smallest convex set that contains it. The convex hull may be defined either as the intersection of all convex sets containing a given subset of a Euclidean space, or equivalently as the set of all convex combinations of points in the subset. For a bounded subset of the plane, the convex hull may be visualized as the shape enclosed by a rubber band stretched around the subset.

<span class="mw-page-title-main">Boosting (machine learning)</span> Method in machine learning

In machine learning, boosting is an ensemble meta-algorithm used primarily to reduce bias, and also variance, in supervised learning, and a family of machine learning algorithms that convert weak learners to strong ones. Boosting is based on the question posed by Kearns and Valiant: "Can a set of weak learners create a single strong learner?" A weak learner is defined to be a classifier that is only slightly correlated with the true classification. In contrast, a strong learner is a classifier that is arbitrarily well-correlated with the true classification.

<span class="mw-page-title-main">Intermediate-mass black hole</span> Class of black holes with a mass range of 100 to 100000 solar masses

An intermediate-mass black hole (IMBH) is a class of black hole with mass in the range 10²–10⁵ solar masses: significantly more than stellar black holes but less than the 10⁵–10⁹ solar mass supermassive black holes. Several IMBH candidate objects have been discovered in the Milky Way galaxy and others nearby, based on indirect gas cloud velocity and accretion disk spectra observations of various evidentiary strength.

<span class="mw-page-title-main">Abell 1835 IR1916</span> Distant galaxy in the constellation Virgo

Abell 1835 IR1916 was a candidate for being the most distant galaxy ever observed, although that claim has not been verified by additional observations. It was claimed to lie behind the galaxy cluster Abell 1835, in the Virgo constellation.

<span class="mw-page-title-main">Messier 73</span> Asterism of four stars in the constellation Aquarius

Messier 73 is an asterism of four stars in the constellation Aquarius. It lies several arcminutes east of globular cluster M72. According to Gaia EDR3, the stars are 1030±9, 1249±10, 2170±22, and 2290±24 light-years from the Sun, with the second being a binary star.

<span class="mw-page-title-main">Dwarf galaxy</span> Small galaxy composed of up to several billion stars

A dwarf galaxy is a small galaxy composed of about 1000 up to several billion stars, as compared to the Milky Way's 200–400 billion stars. The Large Magellanic Cloud, which closely orbits the Milky Way and contains over 30 billion stars, is sometimes classified as a dwarf galaxy; others consider it a full-fledged galaxy. Dwarf galaxies' formation and activity are thought to be heavily influenced by interactions with larger galaxies. Astronomers identify numerous types of dwarf galaxies, based on their shape and composition.

<span class="mw-page-title-main">Messier 89</span> Elliptical galaxy in the constellation Virgo

Messier 89 is an elliptical galaxy in the constellation Virgo. It was discovered by Charles Messier on March 18, 1781. M89 is a member of the Virgo Cluster of galaxies.

<span class="mw-page-title-main">NGC 3603</span> Open cluster in the constellation Carina

NGC 3603 is a nebula situated in the Carina–Sagittarius Arm of the Milky Way around 20,000 light-years away from the Solar System. It is a massive H II region containing a very compact open cluster HD 97950.

<span class="mw-page-title-main">Random forest</span> Decision tree based ensemble machine learning method


Random forests, or random decision forests, are an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of the random forest is the class selected by most trees. For regression tasks, the mean prediction of the individual trees is returned. Random decision forests correct for decision trees' habit of overfitting to their training set. Random forests generally outperform single decision trees, but their accuracy is generally lower than that of gradient-boosted trees. However, data characteristics can affect their performance.

<span class="mw-page-title-main">Non-negative matrix factorization</span> Algorithms for matrix decomposition

Non-negative matrix factorization, also called non-negative matrix approximation, is a group of algorithms in multivariate analysis and linear algebra where a matrix V is factorized into (usually) two matrices W and H, with the property that all three matrices have no negative elements. This non-negativity makes the resulting matrices easier to inspect. Also, in applications such as processing of audio spectrograms or muscular activity, non-negativity is inherent to the data being considered. Since the problem is not exactly solvable in general, it is commonly approximated numerically.

<span class="mw-page-title-main">Extreme mass ratio inspiral</span>

In astrophysics, an extreme mass ratio inspiral (EMRI) is the orbit of a relatively light object around a much heavier object, that gradually spirals in due to the emission of gravitational waves. Such systems are likely to be found in the centers of galaxies, where stellar mass compact objects, such as stellar black holes and neutron stars, may be found orbiting a supermassive black hole. In the case of a black hole in orbit around another black hole this is an extreme mass ratio binary black hole. The term EMRI is sometimes used as a shorthand to denote the emitted gravitational waveform as well as the orbit itself.

<span class="mw-page-title-main">Ultra diffuse galaxy</span> Extremely low luminosity galaxy

An ultra diffuse galaxy (UDG) is an extremely low luminosity galaxy, the first example of which was discovered in the nearby Virgo Cluster by Allan Sandage and Bruno Binggeli in 1984. These galaxies had been studied for many years prior to their renaming in 2015. Their lack of luminosity is due to the lack of star-forming gas, which results in these galaxies being reservoirs of very old stellar populations.

<span class="mw-page-title-main">Planet Nine</span> Hypothetical Solar System planet

Planet Nine is a hypothetical ninth planet in the outer region of the Solar System. Its gravitational effects could explain the peculiar clustering of orbits for a group of extreme trans-Neptunian objects (ETNOs), bodies beyond Neptune that orbit the Sun at distances averaging more than 250 times that of the Earth. These ETNOs tend to make their closest approaches to the Sun in one sector, and their orbits are similarly tilted. These alignments suggest that an undiscovered planet may be shepherding the orbits of the most distant known Solar System objects. Nonetheless, some astronomers question this conclusion and instead assert that the clustering of the ETNOs' orbits is due to observational biases, resulting from the difficulty of discovering and tracking these objects during much of the year.

<span class="mw-page-title-main">Dragonfly 44</span> Galaxy in constellation Coma Berenices

Dragonfly 44 is an ultra diffuse galaxy in the Coma Cluster. This galaxy is well-known because observations of the velocity dispersion in 2016 suggested a mass of about one trillion solar masses, about the same as the Milky Way. This mass was consistent with a count of about 90 and 70 globular clusters observed around Dragonfly 44 in two different studies.

<span class="mw-page-title-main">NGC 3860</span> Spiral galaxy in the constellation Leo

NGC 3860 is a spiral galaxy located about 340 million light-years away in the constellation Leo. NGC 3860 was discovered by astronomer William Herschel on April 27, 1785. The galaxy is a member of the Leo Cluster and is a low-luminosity AGN (LLAGN). Gavazzi et al. however classified NGC 3860 as a strong AGN which may have been triggered by a supermassive black hole in the center of the galaxy.

<span class="mw-page-title-main">NGC 708</span> Galaxy in the constellation Andromeda

NGC 708 is an elliptical galaxy located 240 million light-years away in the constellation Andromeda and was discovered by astronomer William Herschel on September 21, 1786. It is classified as a cD galaxy and is the brightest member of Abell 262. NGC 708 is a weak FR I radio galaxy and is also classified as a type 2 Seyfert galaxy.

<span class="mw-page-title-main">NGC 4065 Group</span> Group of galaxies in the constellation of Coma Berenices

The NGC 4065 Group is a group of galaxies located about 330 Mly (100 Mpc) away in the constellation Coma Berenices. The group's brightest member is NGC 4065, and the group lies in the Coma Supercluster.

<span class="mw-page-title-main">NGC 545</span> Galaxy in the constellation Cetus

NGC 545 is a lenticular galaxy located in the constellation Cetus. It is located at a distance of circa 250 million light years from Earth, which, given its apparent dimensions, means that NGC 545 is about 180,000 light years across. It was discovered by William Herschel on October 1, 1785. It is a member of the Abell 194 galaxy cluster and is included along with NGC 547 in the Atlas of Peculiar Galaxies.

<span class="mw-page-title-main">NGC 4298</span> Galaxy in the constellation Coma Berenices

NGC 4298 is a flocculent spiral galaxy located about 53 million light-years away in the constellation Coma Berenices. The galaxy was discovered by astronomer William Herschel on April 8, 1784 and is a member of the Virgo Cluster.

Adele Cutler is a statistician known as one of the developers of archetypal analysis and of the random forest technique for ensemble learning. She is a professor of mathematics and statistics at Utah State University.