Relative species abundance is a component of biodiversity and is a measure of how common or rare a species is relative to other species in a defined location or community. [1] Relative abundance is the percentage of organisms of a particular kind relative to the total number of organisms in the area. [citation needed] Relative species abundances tend to conform to specific patterns that are among the best-known and most-studied patterns in macroecology. Different populations in a community exist in relative proportions; this idea is known as relative abundance.
Relative species abundance and species richness describe key elements of biodiversity. [1] Relative species abundance refers to how common or rare a species is relative to other species in a given location or community. [1] [4]
Usually relative species abundances are described for a single trophic level. Because such species occupy the same trophic level they will potentially or actually compete for similar resources. [1] For example, relative species abundances might describe all terrestrial birds in a forest community or all planktonic copepods in a particular marine environment.
Relative species abundances follow very similar patterns over a wide range of ecological communities. When plotted as a histogram of the number of species represented by 1, 2, 3, ..., n individuals, the data usually fit a hollow curve: most species are rare (represented by a single individual in a community sample) and relatively few species are abundant (represented by a large number of individuals in a community sample) (Figure 1). [4] This pattern has long been recognized and can be broadly summarized with the statement that "most species are rare". [5] For example, Charles Darwin noted in 1859 in The Origin of Species that "... rarity is the attribute of vast numbers of species in all classes...." [6]
Species abundance patterns can be best visualized in the form of relative abundance distribution plots. The consistency of relative species abundance patterns suggests that some common macroecological "rule" or process determines the distribution of individuals among species within a trophic level.
Relative species abundance distributions are usually graphed as frequency histograms ("Preston plots"; Figure 2) [7] or rank-abundance diagrams ("Whittaker Plots"; Figure 3). [8]
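Frequency histogram (Preston plot): the x-axis shows logarithmic abundance bins (historically base 2); the y-axis shows the number of species in each bin.
Rank-abundance diagram (Whittaker plot): the x-axis lists species ranked from most to least abundant; the y-axis shows the logarithm of each species' relative abundance.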
When plotted in these ways, relative species abundances from wildly different data sets show similar patterns: frequency histograms tend to be right-skewed (e.g. Figure 2) and rank-abundance diagrams tend to conform to the curves illustrated in Figure 4.
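The two plot types can be sketched from a plain list of per-species counts. The following is a minimal illustration, not a standard implementation: the function names and the sample counts are hypothetical, and the binning rule (bin b holds abundances from 2^b up to but not including 2^(b+1)) is a simplification of Preston's original convention, which split species that fall on bin boundaries between neighbouring bins.

```python
import math
from collections import Counter


def preston_octaves(abundances):
    """Number of species in each log2 abundance bin.

    Bin b holds species with 2**b <= abundance < 2**(b + 1).
    Abundances must be positive integers (zeros are ignored).
    """
    # For a positive integer n, n.bit_length() - 1 equals floor(log2(n)) exactly.
    counts = Counter(n.bit_length() - 1 for n in abundances if n > 0)
    return [counts.get(b, 0) for b in range(max(counts) + 1)]


def rank_abundance(abundances):
    """(rank, log10 relative abundance) pairs, from most to least abundant."""
    total = sum(abundances)
    ranked = sorted((n for n in abundances if n > 0), reverse=True)
    return [(rank, math.log10(n / total)) for rank, n in enumerate(ranked, start=1)]


# Hypothetical sample: many rare species, one dominant species.
sample = [1, 1, 1, 1, 2, 2, 3, 5, 8, 40]
octave_counts = preston_octaves(sample)
whittaker_points = rank_abundance(sample)
```

Plotting `octave_counts` as bars gives the frequency histogram; plotting `whittaker_points` as a line gives the rank-abundance diagram.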
Researchers attempting to understand relative species abundance patterns usually approach them in a descriptive or mechanistic way. Using a descriptive approach, biologists attempt to fit a mathematical model to real data sets and infer the underlying biological principles at work from the model parameters. By contrast, mechanistic approaches create a mathematical model based on biological principles and then test how well these models fit real data sets. [9]
I. Motomura developed the geometric series model based on benthic community data in a lake. [12] Within the geometric series, each species' abundance is a constant proportion (k) of the individuals not yet accounted for by more abundant species. Thus if k is 0.5, the most common species would represent half of the individuals in the community (50%), the second most common species would represent half of the remaining half (25%), the third, half of the remaining quarter (12.5%), and so forth.
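The halving example can be reproduced in a few lines. This is a sketch only: the function name and the optional rescaling to a fixed community size are assumptions added for illustration, not part of Motomura's description.

```python
def geometric_series(k, n_species, total=None):
    """Abundances under a geometric series with proportion k.

    The i-th ranked species (i = 0, 1, 2, ...) takes a share k of what the
    more abundant species left over, i.e. k * (1 - k) ** i.
    If `total` is given, the shares are rescaled so they sum to `total`
    (an illustrative normalisation for a finite number of species).
    """
    shares = [k * (1 - k) ** i for i in range(n_species)]
    if total is None:
        return shares
    scale = total / sum(shares)
    return [share * scale for share in shares]


# First three shares for k = 0.5: 50%, 25%, 12.5%.
first_three = geometric_series(0.5, 3)

# The same model rescaled to a hypothetical community of 1000 individuals.
scaled = geometric_series(0.5, 10, total=1000)
```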
Although Motomura originally developed the model as a statistical (descriptive) means to plot observed abundances, the "discovery" of his paper by Western researchers in 1965 led to the model being used as a niche apportionment model – the "niche-preemption model". [8] In a mechanistic model k represents the proportion of the resource base acquired by a given species.
On a rank-abundance diagram with a logarithmic abundance axis, the geometric series is a straight line whose slope, log(1 − k), becomes steeper as k increases, reflecting a rapid decrease in species abundances by rank (Figure 4). [12] The geometric series does not explicitly assume that species colonize an area sequentially; however, the model fits the concept of niche preemption, in which species sequentially colonize a region and the first species to arrive receives the majority of resources. [13] The geometric series model fits observed species abundances in highly uneven communities with low diversity. [13] This is expected to occur in terrestrial plant communities (as these assemblages often show strong dominance) as well as in communities at early successional stages and those in harsh or isolated environments (Figure 5). [8]
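Under this model the abundance of the i-th ranked species is:

$$n_i = N \, C_k \, k \, (1 - k)^{i-1}$$

where:
k is the proportion of the remaining niche space taken by each successive species;
n_i is the number of individuals of the i-th ranked species;
N is the total number of individuals in the community;
C_k = [1 − (1 − k)^S]^{−1} is a constant that makes the n_i sum to N over the S species in the community.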
The logseries was developed by Ronald Fisher to fit two different abundance data sets: British moth species (collected by Carrington Williams) and Malayan butterflies (collected by Alexander Steven Corbet). [14] The logic behind the derivation of the logseries is varied; [15] however, Fisher proposed that sampled species abundances would follow a negative binomial from which the zero abundance class (species too rare to be sampled) was eliminated. [1] He also assumed that the total number of species in a community was infinite. Together, this produced the logseries distribution (Figure 4). The logseries predicts the number of species at different levels of abundance (n individuals) with the formula:
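$$S_n = \frac{\alpha x^n}{n}$$

where:
S_n is the number of species represented by n individuals in the sample;
α is a constant derived from the sample data set;
x is a constant derived from the sample data set, with 0 < x < 1.

The numbers of species with 1, 2, 3, ..., n individuals are therefore:

$$\alpha x,\ \frac{\alpha x^2}{2},\ \frac{\alpha x^3}{3},\ \dots,\ \frac{\alpha x^n}{n}$$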
The constants α and x can be estimated through iteration from a given species data set using the values S and N. [2] Fisher's dimensionless α is often used as a measure of biodiversity, and indeed has recently been found to represent the fundamental biodiversity parameter θ from neutral theory (see below).
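The iterative estimate can be sketched as follows. The sketch assumes the standard logseries form, in which the expected number of species with n individuals is αxⁿ/n; summing over n then gives S = −α ln(1 − x) and N = αx/(1 − x). Dividing the two leaves a single equation in x, solved here by bisection. The function names and the input values are illustrative, not taken from the cited sources.

```python
import math


def fit_logseries(S, N, tol=1e-12):
    """Estimate Fisher's alpha and x from species count S and individual count N.

    Solves S / N = ((1 - x) / x) * (-ln(1 - x)) for x in (0, 1) by bisection,
    then recovers alpha from N = alpha * x / (1 - x).
    """
    if not 0 < S < N:
        raise ValueError("need 0 < S < N")
    target = S / N

    def ratio(x):
        # Decreases monotonically from 1 (x -> 0) to 0 (x -> 1).
        return (1 - x) / x * -math.log1p(-x)

    lo, hi = 1e-12, 1 - 1e-12
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if ratio(mid) > target:
            lo = mid  # ratio still too large, so x must be larger
        else:
            hi = mid
    x = (lo + hi) / 2
    alpha = N * (1 - x) / x
    return alpha, x


def expected_species(alpha, x, n_max):
    """Expected number of species with 1, 2, ..., n_max individuals."""
    return [alpha * x ** n / n for n in range(1, n_max + 1)]


# Illustrative sample: 240 species among 15609 individuals.
alpha, x = fit_logseries(240, 15609)
singletons, doubletons = expected_species(alpha, x, 2)
```

Bisection is used because the left-hand side is monotonic in x, so the root is unique; a Newton iteration would converge faster but needs a derivative and a good starting value.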
Using several data sets (including breeding bird surveys from New York and Pennsylvania and moth collections from Maine, Alberta and Saskatchewan), Frank W. Preston (1948) argued that species abundances (when binned logarithmically in a Preston plot) follow a normal (Gaussian) distribution, partly as a result of the central limit theorem (Figure 4). [7] This means that the abundance distribution is lognormal. According to his argument, the right-skew observed in species abundance frequency histograms (including those described by Fisher et al. (1943) [14]) was, in fact, a sampling artifact. Because species toward the left side of the x-axis are increasingly rare, they may be missed in a random species sample. As the sample size increases, however, the likelihood of collecting rare species in a way that accurately represents their abundance also increases, and more of the normal distribution becomes visible. [7] The point at which rare species cease to be sampled has been termed Preston's veil line. As the sample size increases, Preston's veil is pushed farther to the left and more of the normal curve becomes visible [2] [10] (Figure 6). Williams' moth data, originally used by Fisher to develop the logseries distribution, became increasingly lognormal as more years of sampling were completed. [1] [3]
Preston's theory has an application: if a community is truly lognormal yet under-sampled, the lognormal distribution can be used to estimate the true species richness of a community. Assuming the shape of the total distribution can be confidently predicted from the collected data, the normal curve can be fit via statistical software or by completing the Gaussian formula: [7]
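$$n = n_0 \, e^{-(aR)^2}$$

where:
n_0 is the number of species in the modal bin (the peak of the curve);
n is the number of species in a bin R bins away from the modal bin;
a is a constant derived from the data.

It is then possible to predict how many species are in the community by calculating the total area under the curve (N):

$$N = \frac{n_0 \sqrt{\pi}}{a}$$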
The number of species missing from the data set (the missing area to the left of the veil line) is simply N minus the number of species sampled. [2] Preston did this for two lepidopteran data sets, predicting that, even after 22 years of collection, only 72% and 88% of the species present had been sampled. [7]
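The richness estimate can be sketched numerically. The sketch assumes the usual form of Preston's curve, n = n₀·exp(−(aR)²), whose total area is n₀·√π/a. The function names and the fitted values are hypothetical, chosen only to show the arithmetic.

```python
import math


def species_in_bin(n0, a, R):
    """Species expected in a bin R bins away from the modal bin."""
    return n0 * math.exp(-((a * R) ** 2))


def total_richness(n0, a):
    """Area under the curve: the estimated species total for the community."""
    return n0 * math.sqrt(math.pi) / a


def unseen_species(n0, a, observed):
    """Estimated species hidden behind the veil line (never negative)."""
    return max(0.0, total_richness(n0, a) - observed)


# Hypothetical fit: 40 species in the modal bin, a = 0.2, 300 species observed.
estimated_total = total_richness(40, 0.2)
missing = unseen_species(40, 0.2, observed=300)
fraction_sampled = 300 / estimated_total
```

The estimate is only as good as the fit: if the modal bin itself lies behind the veil line, n₀ and a are poorly constrained and the extrapolated total should be treated with caution.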
The Yule model is based on the much earlier Galton–Watson model, which was used to describe the distribution of species among genera. [16] The Yule model assumes random branching of species trees, with each species (branch tip) having the same probability of giving rise to new species or becoming extinct. Because the number of species within a genus, within a clade, has a similar distribution to the number of individuals within a species, within a community (i.e. the "hollow curve"), Sean Nee (2003) used the model to describe relative species abundances. [4] [17] In many ways this model is similar to niche apportionment models; however, Nee intentionally did not propose a biological mechanism for the model behavior, arguing that any distribution can be produced by a variety of mechanisms. [17]
Note: This section provides a general summary of niche apportionment theory; more information can be found under niche apportionment models.
Most mechanistic approaches to species abundance distributions use niche-space, i.e. available resources, as the mechanism driving abundances. If species in the same trophic level consume the same resources (such as nutrients or sunlight in plant communities, prey in carnivore communities, nesting locations or food in bird communities) and these resources are limited, how the resource "pie" is divided among species determines how many individuals of each species can exist in the community. Species with access to abundant resources will have higher carrying capacities than those with little access. Mutsunori Tokeshi [18] later elaborated niche apportionment theory to include niche filling in unexploited resource space. [9] Thus, a species may survive in the community by carving out a portion of another species' niche (slicing up the pie into smaller pieces) or by moving into a vacant niche (essentially making the pie larger, for example, by being the first to arrive in a newly available location or through the development of a novel trait that allows access to previously unavailable resources). Numerous niche apportionment models have been developed. Each makes different assumptions about how species carve up niche-space.
The Unified Neutral Theory of Biodiversity and Biogeography (UNTB) is a special form of mechanistic model that takes an entirely different approach to community composition than the niche apportionment models. [1] Instead of species populations reaching equilibrium within a community, the UNTB model is dynamic, allowing for continuing changes in relative species abundances through drift.
A community in the UNTB model can be best visualized as a grid with a certain number of spaces, each occupied by an individual of some species. The model is zero-sum because there are a limited number of spaces that can be occupied: an increase in the number of individuals of one species in the grid must result in a corresponding decrease in the number of individuals of other species in the grid. The model then uses birth, death, immigration, extinction and speciation to modify community composition over time.
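The zero-sum rule can be illustrated with a toy simulation. This is a deliberately simplified sketch, not Hubbell's full model: it tracks a single grid with births, deaths and point speciation, omits immigration, and uses hypothetical parameter names (J for the number of spaces, nu for the per-replacement speciation probability).

```python
import random
from collections import Counter


def simulate_zero_sum(J=200, steps=5000, nu=0.01, seed=0):
    """Neutral drift on a grid of J spaces; returns {species_id: abundance}.

    Each step, one randomly chosen individual dies and its space is refilled
    at once, so the number of individuals never changes (the zero-sum rule).
    """
    rng = random.Random(seed)
    community = [0] * J            # start with a single species, id 0
    next_species = 1
    for _ in range(steps):
        dead = rng.randrange(J)    # death: every individual is equally likely
        if rng.random() < nu:
            # Speciation: the vacated space is filled by a brand-new species.
            community[dead] = next_species
            next_species += 1
        else:
            # Birth: offspring of a randomly chosen individual. The parent is
            # drawn from the whole grid (possibly the one just removed),
            # which is a simplification.
            community[dead] = community[rng.randrange(J)]
    return Counter(community)


abundances = simulate_zero_sum()
ranked = sorted(abundances.values(), reverse=True)
```

Because every replacement ignores species identity, any change in `ranked` between runs with different seeds is due to drift alone; species that reach zero individuals drop out of the counter, which is how extinction appears in this sketch.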
The UNTB model produces a dimensionless "fundamental biodiversity" number, θ, which is derived using the formula:
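$$\theta = 2 J_m \nu$$

where:
J_m is the size of the metacommunity (the outside source of immigrants to the local community);
ν is the speciation rate in the model.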
Relative species abundances in the UNTB model follow a zero-sum multinomial distribution. [19] The shape of this distribution is a function of the immigration rate, the size of the sampled community (grid), and θ. [19] When the value of θ is small, the relative species abundance distribution is similar to the geometric series (high dominance). As θ gets larger, the distribution becomes increasingly s-shaped (log-normal) and, as it approaches infinity, the curve becomes flat (the community has infinite diversity and species abundances of one). Finally, when θ = 0 the community described consists of only one species (extreme dominance). [1]
An unexpected result of the UNTB is that at very large sample sizes, predicted relative species abundance curves describe the metacommunity and become identical to Fisher's logseries. At this point θ also becomes identical to Fisher's α for the equivalent distribution, and Fisher's constant x is equal to the ratio of birth rate to death rate. Thus, the UNTB unintentionally offers a mechanistic explanation of the logseries 50 years after Fisher first developed his descriptive model. [1]