Ranking

A ranking is a relationship between a set of items such that, for any two items, the first is either "ranked higher than", "ranked lower than" or "ranked equal to" the second. [1] In mathematics, this is known as a weak order or total preorder of objects. It is not necessarily a total order of objects because two different objects can have the same ranking. The rankings themselves are totally ordered. For example, materials are totally preordered by hardness, while degrees of hardness are totally ordered. If two items are the same in rank it is considered a tie.

By reducing detailed measures to a sequence of ordinal numbers, rankings make it possible to evaluate complex information according to certain criteria. [2] Thus, for example, an Internet search engine may rank the pages it finds according to an estimation of their relevance, making it possible for the user quickly to select the pages they are likely to want to see.

Analysis of data obtained by ranking commonly requires non-parametric statistics.

Strategies for assigning rankings

It is not always possible to assign rankings uniquely. For example, in a race or competition two (or more) entrants might tie for a place in the ranking. [3] When computing an ordinal measurement, two (or more) of the quantities being ranked might measure equal. In these cases, one of the strategies shown below for assigning the rankings may be adopted. A common shorthand way to distinguish these ranking strategies is by the ranking numbers that would be produced for four items, with the first item ranked ahead of the second and third (which compare equal) which are both ranked ahead of the fourth. These names are also shown below.

Standard competition ranking ("1224" ranking)

In competition ranking, items that compare equal receive the same ranking number, and then a gap is left in the ranking numbers. The number of ranking numbers that are left out in this gap is one less than the number of items that compared equal. Equivalently, each item's ranking number is 1 plus the number of items ranked above it. This ranking strategy is frequently adopted for competitions, as it means that if two (or more) competitors tie for a position in the ranking, the position of all those ranked below them is unaffected (i.e., a competitor only comes second if exactly one person scores better than them, third if exactly two people score better than them, fourth if exactly three people score better than them, etc.).

Thus if A ranks ahead of B and C (which compare equal) which are both ranked ahead of D, then A gets ranking number 1 ("first"), B gets ranking number 2 ("joint second"), C also gets ranking number 2 ("joint second") and D gets ranking number 4 ("fourth").
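The rule "1 plus the number of items ranked above it" can be sketched directly in Python. This is a minimal illustration rather than a reference implementation; the function name `competition_rank` and the scores for A to D are made up for the example, and a higher score is assumed to rank higher.

```python
def competition_rank(scores):
    """Standard competition ("1224") ranks; a higher score ranks higher."""
    # Each item's rank is 1 plus the number of items that strictly beat it.
    return [1 + sum(other > s for other in scores) for s in scores]


# Hypothetical scores for A, B, C, D, where B and C tie.
scores = [90, 80, 80, 70]
ranks = competition_rank(scores)
print(ranks)  # [1, 2, 2, 4]
```

The quadratic counting loop mirrors the definition; for large inputs one would normally sort once instead.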

Modified competition ranking ("1334" ranking)

Sometimes, competition ranking is done by leaving the gaps in the ranking numbers before the sets of equal-ranking items (rather than after them as in standard competition ranking). The number of ranking numbers that are left out in this gap remains one less than the number of items that compared equal. Equivalently, each item's ranking number is equal to the number of items ranked equal to it or above it. This ranking ensures that a competitor only comes second if they score higher than all but one of their opponents, third if they score higher than all but two of their opponents, etc.

Thus if A ranks ahead of B and C (which compare equal) which are both ranked ahead of D, then A gets ranking number 1 ("first"), B gets ranking number 3 ("joint third"), C also gets ranking number 3 ("joint third") and D gets ranking number 4 ("fourth"). In this case, nobody would get ranking number 2 ("second") and that would be left as a gap.
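As a rough sketch, the equivalent formulation (the number of items ranked equal to or above the item, counting the item itself) translates to a single comparison. The function name and scores are assumptions made for illustration, with a higher score ranking higher.

```python
def modified_competition_rank(scores):
    """Modified competition ("1334") ranks; a higher score ranks higher."""
    # Count every item that is at least as good, including the item itself.
    return [sum(other >= s for other in scores) for s in scores]


# Hypothetical scores for A, B, C, D, where B and C tie.
scores = [90, 80, 80, 70]
ranks = modified_competition_rank(scores)
print(ranks)  # [1, 3, 3, 4]
```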

Dense ranking ("1223" ranking)

In dense ranking, items that compare equal receive the same ranking number, and the next items receive the immediately following ranking number. Equivalently, each item's ranking number is 1 plus the number of items ranked above it that are distinct with respect to the ranking order.

Thus if A ranks ahead of B and C (which compare equal) which are both ranked ahead of D, then A gets ranking number 1 ("first"), B gets ranking number 2 ("joint second"), C also gets ranking number 2 ("joint second") and D gets ranking number 3 ("third").
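One possible way to compute dense ranks is to number the distinct values once and then look each item up. The sketch below assumes higher scores rank higher; `dense_rank` and the sample scores are illustrative names and data, not taken from any particular library.

```python
def dense_rank(scores):
    """Dense ("1223") ranks; a higher score ranks higher."""
    # Number the distinct values from best to worst, leaving no gaps.
    distinct = sorted(set(scores), reverse=True)
    position = {value: i + 1 for i, value in enumerate(distinct)}
    return [position[s] for s in scores]


# Hypothetical scores for A, B, C, D, where B and C tie.
scores = [90, 80, 80, 70]
ranks = dense_rank(scores)
print(ranks)  # [1, 2, 2, 3]
```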

Ordinal ranking ("1234" ranking)

In ordinal ranking, all items receive distinct ordinal numbers, including items that compare equal. The assignment of distinct ordinal numbers to items that compare equal can be done at random, or arbitrarily, but it is generally preferable to use a system that is arbitrary but consistent, as this gives stable results if the ranking is done multiple times. An example of an arbitrary but consistent system would be to incorporate other attributes into the ranking order (such as alphabetical ordering of the competitor's name) to ensure that no two items exactly match.

With this strategy, if A ranks ahead of B and C (which compare equal) which are both ranked ahead of D, then A gets ranking number 1 ("first") and D gets ranking number 4 ("fourth"), and either B gets ranking number 2 ("second") and C gets ranking number 3 ("third") or C gets ranking number 2 ("second") and B gets ranking number 3 ("third").

In computer data processing, ordinal ranking is also referred to as "row numbering".
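An "arbitrary but consistent" tie-break, such as the alphabetical ordering mentioned above, might be sketched as a sort on a compound key. The entry format of (name, score) pairs and the function name `ordinal_rank` are assumptions for this example; a higher score is taken to rank higher.

```python
def ordinal_rank(entries):
    """Ordinal ("1234") ranks for (name, score) pairs.

    A higher score ranks higher; ties are broken alphabetically by name,
    so repeated runs always give the same result.
    """
    ordered = sorted(entries, key=lambda e: (-e[1], e[0]))
    return {name: i + 1 for i, (name, _score) in enumerate(ordered)}


# Hypothetical entrants, listed in arbitrary order; B and C tie on score.
entries = [("D", 70), ("C", 80), ("A", 90), ("B", 80)]
print(ordinal_rank(entries))  # {'A': 1, 'B': 2, 'C': 3, 'D': 4}
```

Because the name is part of the sort key, the input order has no influence on which tied item receives the lower number.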

Fractional ranking ("1 2.5 2.5 4" ranking)

Items that compare equal receive the same ranking number, which is the mean of what they would have under ordinal rankings. Equivalently, each item's ranking number is 1 plus the number of items ranked above it plus half the number of other items equal to it. This strategy has the property that the sum of the ranking numbers is the same as under ordinal ranking. For this reason, it is used in computing Borda counts and in statistical tests (see below).

Thus if A ranks ahead of B and C (which compare equal) which are both ranked ahead of D, then A gets ranking number 1 ("first"), B and C each get ranking number 2.5 (average of "joint second/third") and D gets ranking number 4 ("fourth").

Here is an example: Suppose you have the data set 1.0, 1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 5.0, 5.0.

The ordinal ranks are 1, 2, 3, 4, 5, 6, 7, 8, 9.

For v = 1.0, the fractional rank is the average of the ordinal ranks: (1 + 2) / 2 = 1.5. In a similar manner, for v = 5.0, the fractional rank is (7 + 8 + 9) / 3 = 8.0.

Thus the fractional ranks are: 1.5, 1.5, 3.0, 4.5, 4.5, 6.0, 8.0, 8.0, 8.0
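The averaging step in this example can be reproduced with a short Python sketch. It is only one way to do it, assuming ascending ranks (the smallest value gets the lowest rank) as in the data set above; the function name `fractional_rank` is illustrative.

```python
def fractional_rank(values):
    """Fractional ranks in ascending order (smallest value ranks lowest)."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Advance j to the last position holding the same value.
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[i]:
            j += 1
        # The tied group occupies ordinal ranks i+1 .. j+1; take their mean.
        ranks[ordered[i]] = (i + 1 + j + 1) / 2
        i = j + 1
    return [ranks[v] for v in values]


data = [1.0, 1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 5.0, 5.0]
result = fractional_rank(data)
print(result)  # [1.5, 1.5, 3.0, 4.5, 4.5, 6.0, 8.0, 8.0, 8.0]
```

The ranks sum to 45, the same as the ordinal ranks 1 through 9, which illustrates the sum-preserving property noted earlier.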

Ranking in statistics

In statistics, ranking is the data transformation in which numerical or ordinal values are replaced by their rank when the data are sorted. For example, if the numerical data 3.4, 5.1, 2.6, 7.3 are observed, the ranks of these data items would be 2, 3, 1 and 4 respectively. Similarly, the ordinal data hot, cold, warm would be replaced by 3, 1, 2. In these examples, the ranks are assigned to values in ascending order. (In some other cases, descending ranks are used.) Ranks are related to the indexed list of order statistics, which consists of the original dataset rearranged into ascending order.
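The two examples above can be checked with a small sketch of the rank transformation. It assumes ascending ranks and no tied values; the helper name `rank_transform` and the `temperature_order` mapping are hypothetical, introduced only to give the ordinal labels a sortable order.

```python
def rank_transform(values, key=None):
    """Replace each value by its ascending rank (assumes no ties)."""
    sort_key = (lambda i: values[i]) if key is None else (lambda i: key(values[i]))
    # Indices of the values in ascending order, i.e. the order statistics.
    order = sorted(range(len(values)), key=sort_key)
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


print(rank_transform([3.4, 5.1, 2.6, 7.3]))  # [2, 3, 1, 4]

# Ordinal labels need an explicit ordering; this mapping is an assumption.
temperature_order = {"cold": 1, "warm": 2, "hot": 3}
print(rank_transform(["hot", "cold", "warm"], key=temperature_order.get))  # [3, 1, 2]
```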

Some kinds of statistical tests employ calculations based on ranks. Examples include the Mann–Whitney U test and the Wilcoxon signed-rank test.

The distribution of values in decreasing order of rank is often of interest when values vary widely in scale; this is the rank-size distribution (or rank-frequency distribution), for example for city sizes or word frequencies. These often follow a power law.
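For word frequencies, a rank-frequency table might be built as in the following sketch. The sample text is invented and far too small to show a power law; the point is only the pairing of each rank with its frequency. The function name `rank_frequency` is an assumption.

```python
from collections import Counter


def rank_frequency(words):
    """Return (rank, word, count) triples in decreasing order of count."""
    counts = Counter(words)
    # most_common() lists items from most to least frequent; tied words
    # simply receive consecutive ordinal ranks here.
    return [(rank, word, count)
            for rank, (word, count) in enumerate(counts.most_common(), start=1)]


text = "the cat and the dog and the bird"  # made-up sample text
table = rank_frequency(text.split())
for rank, word, count in table:
    print(rank, word, count)
```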

Some ranks can have non-integer values for tied data values. For example, when there is an even number of copies of the same data value, the fractional rank of the tied data (described above) ends in ½.

Rank function in Excel

Microsoft Excel provides two ranking functions: RANK.EQ, which assigns competition ranks ("1224"), and RANK.AVG, which assigns fractional ranks ("1 2.5 2.5 4") as described above. Both functions take an order argument, [4] which defaults to descending, i.e. the largest number receives rank 1. This differs from the usual convention in statistics, where ranks are assigned in ascending order and the smallest number receives rank 1.
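The effect of the order argument can be imitated in Python. The sketch below is not Excel's implementation; it only mimics the behavior described above, using the convention that an order of 0 (the default) means descending and any other value means ascending. The Python names `rank_eq` and `rank_avg` are chosen for this example, and the number being ranked is assumed to occur in the list.

```python
def rank_eq(number, ref, order=0):
    """Competition rank of number within ref (mimics the described RANK.EQ)."""
    if order == 0:
        # Descending: the largest value receives rank 1.
        return 1 + sum(v > number for v in ref)
    # Ascending: the smallest value receives rank 1.
    return 1 + sum(v < number for v in ref)


def rank_avg(number, ref, order=0):
    """Fractional rank of number within ref (mimics the described RANK.AVG)."""
    # Assumes number occurs in ref, so ties is at least 1.
    ties = sum(v == number for v in ref)
    return rank_eq(number, ref, order) + (ties - 1) / 2


values = [10, 20, 20, 30]
print(rank_eq(30, values))           # 1 (descending by default)
print(rank_eq(30, values, order=1))  # 4 (ascending)
print(rank_avg(20, values))          # 2.5
```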

Comparison of rankings

A rank correlation can be used to compare two rankings for the same set of objects. For example, Spearman's rank correlation coefficient is useful to measure the statistical dependence between the rankings of athletes in two tournaments. The Kendall rank correlation coefficient is another approach. Alternatively, intersection/overlap-based approaches offer additional flexibility. One example is the "Rank–rank hypergeometric overlap" approach, [5] which is designed to compare the rankings of the genes that are at the "top" of two ordered lists of differentially expressed genes. A similar approach is taken by the "Rank Biased Overlap (RBO)", [6] which also implements an adjustable probability, p, to customize the weight assigned at a desired depth of ranking. These approaches have the advantages of addressing disjoint sets, sets of different sizes, and top-weightedness (taking into account the absolute ranking position, which may be ignored in standard non-weighted rank correlation approaches).
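For two tie-free rankings of the same items, both coefficients can be computed in a few lines. The sketch below uses the textbook formulas (Spearman's 1 − 6Σd²/(n(n² − 1)) and Kendall's tau-a) and is meant as an illustration only; the tournament rankings are invented, and tied ranks would need the corrected variants of both statistics.

```python
from itertools import combinations


def spearman_rho(rank_x, rank_y):
    """Spearman's coefficient for two tie-free rankings of the same items."""
    n = len(rank_x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def kendall_tau(rank_x, rank_y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(rank_x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        # A pair is concordant when both rankings order it the same way.
        sign = (rank_x[i] - rank_x[j]) * (rank_y[i] - rank_y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical rankings of five athletes in two tournaments.
tournament_1 = [1, 2, 3, 4, 5]
tournament_2 = [2, 1, 3, 5, 4]
print(spearman_rho(tournament_1, tournament_2))  # approximately 0.8
print(kendall_tau(tournament_1, tournament_2))   # approximately 0.6
```

Here two of the ten athlete pairs swap order between tournaments, which is why both coefficients stay well above zero but below 1.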

Ranking and socio-economic evaluation

Ranking countries by specific indices is one of the most common methods used by policy makers and international organizations to assess the socio-economic context of countries. Some notable examples are: Human Development Index (United Nations), Doing Business Index (World Bank), Corruption Perceptions Index (Transparency International) and Index of Economic Freedom (the Heritage Foundation). For instance, the Doing Business Indicator of the World Bank measures business regulations and their enforcement in 190 countries. Countries are ranked according to 10 indicators that are synthesized to produce the final rank. Each indicator is composed of sub-indicators; for instance, the Registering Property Indicator is composed of 4 sub-indicators measuring time, procedures, costs and quality of the land registration system. These kinds of ranks are based on subjective criteria for assigning the score. Sometimes, the adopted parameters may produce discrepancies with empirical observations, so potential biases and paradoxes may emerge from the application of these criteria. [7]

Ranking as a social game

Being competitive is in the very nature of human beings. The desire to achieve a higher social rank can be perceived as a driving force for human beings. In simple terms, we want to know who is the richest, the cleverest, the most handsome or the prettiest. We are also sometimes ranked by others, such as our supervisors and our neighbors, and we compare our status in society with that of others. An inevitable question is how objective or subjective these rankings are. Many ranked lists are based on subjective categorization. We can even pose the question: do we always want to be seen objectively, or do we not mind having a better image than we deserve? There are certainly specific difficulties in measuring society. In order to find our place in real and virtual communities, we need to understand the issues that emerge when navigating between objectivity and subjectivity by combining human and artificial intelligence. The subjects relevant to this topic include comparison, ranking, rating, choices, laws, ranking games, the struggle for reputation, etc. (see Péter Érdi). [8]


References

  1. http://www.merriam-webster.com/dictionary/ranking
  2. Malara, Zbigniew; Miśko, Rafał; Sulich, Adam. "Wroclaw University of Technology graduates' career paths".
  3. Sulich, Adam. "The young people's labour market and crisis of integration in European Union". Archived from the original on 2017-03-04. Retrieved 2017-03-04.
  4. "Excel RANK.AVG Help". Office Support. Microsoft. Retrieved 21 January 2021.
  5. Plaisier, Seema B.; Taschereau, Richard; Wong, Justin A.; Graeber, Thomas G. (September 2010). "Rank–rank hypergeometric overlap: identification of statistically significant overlap between gene-expression signatures". Nucleic Acids Research. 38 (17): e169. doi:10.1093/nar/gkq636. PMC 2943622. PMID 20660011.
  6. Webber, William; Moffat, Alistair; Zobel, Justin (November 2010). "A Similarity Measure for Indefinite Rankings". ACM Transactions on Information Systems. 28 (4). doi:10.1145/1852102.1852106.
  7. RIEDS, Italian Review of Economics Demography and Statistics. "World Bank Doing Business Project and the statistical methods based on ranks: the paradox of the time indicator".
  8. Érdi, Péter (2020). Ranking: The Unwritten Rules of the Social Game We All Play. New York, NY, United States: Oxford University Press. ISBN 978-0-19-093546-7. OCLC 1102469441.