Index of dissimilarity

Last updated January 05, 2025

The index of dissimilarity is a demographic measure of the evenness with which two groups are distributed across the component geographic areas that make up a larger area. A group is evenly distributed when each geographic unit has the same percentage of group members as the total population. The index score can also be interpreted as the percentage of one of the two groups included in the calculation that would have to move to different geographic areas in order to produce a distribution that matches that of the larger area. The index of dissimilarity can be used as a measure of segregation. A score of zero (0%) reflects a fully integrated environment; a score of 1 (100%) reflects full segregation. In terms of black–white segregation, a score of 0.60 means that 60 percent of blacks would have to exchange places with whites in other units to achieve an even geographic distribution.^[1]^[2]^[3]^[4] The index of dissimilarity is invariant to the relative sizes of the two groups.

Basic formula

The basic formula for the index of dissimilarity is:

$D={\frac {1}{2}}\sum _{i=1}^{N}\left|{\frac {a_{i}}{A}}-{\frac {b_{i}}{B}}\right|$

where (comparing a black and white population, for example):

$a_{i}$ = the population of group A in the $i$-th area, e.g. a census tract

$A$ = the total population of group A in the large geographic entity for which the index of dissimilarity is being calculated

$b_{i}$ = the population of group B in the $i$-th area

$B$ = the total population of group B in the large geographic entity for which the index of dissimilarity is being calculated

The index of dissimilarity is applicable to any categorical variable (whether demographic or not) and because of its simple properties is useful for input into multidimensional scaling and clustering programs. It has been used extensively in the study of social mobility to compare distributions of origin (or destination) occupational categories.
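The basic formula translates almost directly into code. The following is a minimal sketch in Python, assuming each group is given as a list of per-area counts in the same area order; the function name `dissimilarity_index` and the sample counts are illustrative and do not come from any particular library or dataset.

```python
from typing import Sequence


def dissimilarity_index(a: Sequence[float], b: Sequence[float]) -> float:
    """Return D = 1/2 * sum_i |a_i/A - b_i/B| for two groups' per-area counts."""
    if len(a) != len(b):
        raise ValueError("both groups need one count per area")
    total_a = sum(a)  # A: total population of group A
    total_b = sum(b)  # B: total population of group B
    if total_a == 0 or total_b == 0:
        raise ValueError("each group needs a positive total population")
    return 0.5 * sum(abs(x / total_a - y / total_b) for x, y in zip(a, b))


# Two areas, each group entirely in its own area: full segregation.
print(dissimilarity_index([10, 0], [0, 30]))    # 1.0
# Both groups split across the areas in the same proportions: full evenness.
print(dissimilarity_index([10, 30], [20, 60]))  # 0.0
```

The two calls reproduce the endpoints of the scale described above: a score of 1 for full segregation and 0 for a fully even distribution.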

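Numerical example

Consider the following distribution of white and black population across neighborhoods. Here $D_{i}={\frac {1}{2}}\left|{\frac {w_{i}}{W}}-{\frac {b_{i}}{B}}\right|$ is the contribution of neighborhood $i$ to the index, where $w_{i}$ and $b_{i}$ are the white and black populations of the neighborhood and $W$ and $B$ are the corresponding totals. Values are rounded to three decimal places.


Neighborhood	White	Black	$D_{i}$
A	100	5	0.067
B	100	10	0.033
C	100	10	0.033
Total	300	25	0.133

The index of dissimilarity is the total of the last column, $D\approx 0.133$.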
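The per-neighborhood contributions can be recomputed from the counts in the table. This is a small sketch that assumes the contribution of each neighborhood is half the absolute difference between the two groups' shares; the helper name `contributions` is hypothetical.

```python
# Counts from the table above, keyed by neighborhood.
white = {"A": 100, "B": 100, "C": 100}
black = {"A": 5, "B": 10, "C": 10}


def contributions(group_a, group_b):
    """Map each area to its term 1/2 * |a_i/A - b_i/B| of the index."""
    total_a = sum(group_a.values())
    total_b = sum(group_b.values())
    return {
        area: 0.5 * abs(group_a[area] / total_a - group_b[area] / total_b)
        for area in group_a
    }


d_i = contributions(white, black)
d_total = sum(d_i.values())  # the index D is the sum of the contributions

# Prints A: 0.067, B: 0.033, C: 0.033 and D: 0.133, one per line.
for area, value in d_i.items():
    print(f"{area}: {value:.3f}")
print(f"D: {d_total:.3f}")
```

Carried without rounding, the total is exactly 2/15, so roughly 13 percent of either group would have to change neighborhoods to even out the distribution.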

Linear algebra perspective

The formula for the index of dissimilarity can be written more compactly by considering it from the perspective of linear algebra. Suppose we are studying the distribution of rich and poor people in a city (e.g. London). Suppose our city contains $N$ blocks:

$\{{\text{block 1}},{\text{block 2}},\ldots ,{\text{block }}N\}$

Let's create a vector $\mathbf {r}$ which shows the number of rich people in each block of our city:

$\mathbf {r} =[r_{1},r_{2},\cdots ,r_{N}]$

Similarly, let's create a vector $\mathbf {p}$ which shows the number of poor people in each block of our city:

$\mathbf {p} =[p_{1},p_{2},\cdots ,p_{N}]$

Now, the $L^{1}$ -norm of a vector is simply the sum of the magnitudes of its entries.^[5] That is, for a vector $\mathbf {v} =[v_{1},v_{2},\cdots ,v_{N}]$ , we have the $L^{1}$ -norm:

$|\mathbf {v} |_{1}=\sum _{i=1}^{N}|v_{i}|$

If we denote by $R$ the total number of rich people in our city, then, because every count $r_{i}$ is non-negative, a compact way to calculate $R$ is to use the $L^{1}$ -norm:

$R=|\mathbf {r} |_{1}=\sum _{i=1}^{N}|r_{i}|$

Similarly, if we denote $P$ as the total number of poor people in our city, then:

$P=|\mathbf {p} |_{1}=\sum _{i=1}^{N}|p_{i}|$

When we divide a vector $\mathbf {v}$ by its norm, we get what is called the normalized vector or unit vector ${\hat {\mathbf {v} }}$ , whose entries here are each block's share of the total:

${\hat {\mathbf {v} }}={\frac {\mathbf {v} }{|\mathbf {v} |_{1}}}$

Let us normalize the rich vector $\mathbf {r}$ and the poor vector $\mathbf {p}$ :

${\hat {\mathbf {r} }}={\frac {\mathbf {r} }{|\mathbf {r} |_{1}}}={\frac {\mathbf {r} }{R}}$

${\hat {\mathbf {p} }}={\frac {\mathbf {p} }{|\mathbf {p} |_{1}}}={\frac {\mathbf {p} }{P}}$

We finally return to the formula for the Index of Dissimilarity ( $D$ ); it is simply equal to one-half the $L^{1}$ -norm of the difference between the vectors ${\hat {\mathbf {r} }}$ and ${\hat {\mathbf {p} }}$ :

Index of Dissimilarity
(in Linear Algebraic notation)

$D={\frac {1}{2}}|{\hat {\mathbf {r} }}-{\hat {\mathbf {p} }}|_{1}$
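The linear algebraic form can be sketched with three small helpers that mirror the steps above: take the $L^{1}$ -norm, normalize, then halve the norm of the difference. Plain Python lists stand in for vectors here; the function names and the two-block counts are illustrative assumptions.

```python
def l1_norm(v):
    """Sum of the absolute values of the entries of v."""
    return sum(abs(x) for x in v)


def normalize(v):
    """Divide v by its L1 norm, so the entries become shares that sum to 1."""
    norm = l1_norm(v)
    if norm == 0:
        raise ValueError("cannot normalize a vector whose entries are all zero")
    return [x / norm for x in v]


def dissimilarity(r, p):
    """Half the L1 norm of the difference between the normalized vectors."""
    if len(r) != len(p):
        raise ValueError("both vectors need one entry per block")
    r_hat = normalize(r)
    p_hat = normalize(p)
    return 0.5 * l1_norm([x - y for x, y in zip(r_hat, p_hat)])


# Illustrative two-block city (made-up counts).
print(normalize([3, 1]))              # [0.75, 0.25]
print(normalize([2, 2]))              # [0.5, 0.5]
print(dissimilarity([3, 1], [2, 2]))  # 0.25
```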

Numerical example

Consider a city consisting of four blocks of 2 people each. One block consists of 2 rich people. One block consists of 2 poor people. Two blocks consist of 1 rich and 1 poor person. What is the index of dissimilarity for this city?

Firstly, let's find the rich vector $\mathbf {r}$ and poor vector $\mathbf {p}$ :

$\mathbf {r} =[2,0,1,1]$

$\mathbf {p} =[0,2,1,1]$

Next, let's calculate the total number of rich people and poor people in our city:

$R=2+0+1+1=4$

$P=0+2+1+1=4$

Next, let's normalize the rich and poor vectors:

${\hat {\mathbf {r} }}={\frac {\mathbf {r} }{R}}={\frac {1}{4}}[2,0,1,1]=[0.5,0,0.25,0.25]$

${\hat {\mathbf {p} }}={\frac {\mathbf {p} }{P}}={\frac {1}{4}}[0,2,1,1]=[0,0.5,0.25,0.25]$

We can now calculate the difference ${\hat {\mathbf {r} }}-{\hat {\mathbf {p} }}$ :

${\hat {\mathbf {r} }}-{\hat {\mathbf {p} }}=[0.5,0,0.25,0.25]-[0,0.5,0.25,0.25]=[0.5,-0.5,0,0]$

Finally, let's find the index of dissimilarity ( $D$ ):

$D={\frac {1}{2}}|{\hat {\mathbf {r} }}-{\hat {\mathbf {p} }}|_{1}={\frac {1}{2}}(|0.5|+|-0.5|)=0.5$
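The result can be tied back to the interpretation of the index as the share of one group that would have to move. The sketch below recomputes $D$ for the four-block city and then relocates $D\cdot R=2$ rich people; the particular relocation (block 1 to block 2) is one choice made for illustration, and the function name is hypothetical.

```python
r = [2, 0, 1, 1]  # rich people per block
p = [0, 2, 1, 1]  # poor people per block


def index_of_dissimilarity(r, p):
    """D = 1/2 * sum_i |r_i/R - p_i/P|."""
    R, P = sum(r), sum(p)
    return 0.5 * sum(abs(ri / R - pi / P) for ri, pi in zip(r, p))


d = index_of_dissimilarity(r, p)
print(d)  # 0.5

# D * R rich people would have to move: here 0.5 * 4 = 2.
movers = round(d * sum(r))
print(movers)  # 2

# Move those 2 rich people from block 1 to block 2 (one possible relocation).
r_after = [r[0] - movers, r[1] + movers, r[2], r[3]]
print(r_after)                             # [0, 2, 1, 1]
print(index_of_dissimilarity(r_after, p))  # 0.0
```

After the move the rich vector coincides with the poor vector, so the two normalized vectors are equal and the index drops to zero.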

Equivalence between formulae

We can prove that the Linear Algebraic formula for $D$ is identical to the basic formula for $D$ . Let's start with the Linear Algebraic formula:

$D={\frac {1}{2}}|{\hat {\mathbf {r} }}-{\hat {\mathbf {p} }}|_{1}$

Let's replace the normalized vectors ${\hat {\mathbf {r} }}$ and ${\hat {\mathbf {p} }}$ with their definitions:

$D={\frac {1}{2}}\left|{\frac {\mathbf {r} }{R}}-{\frac {\mathbf {p} }{P}}\right|_{1}$

Finally, the $i$-th entry of the vector inside the norm is ${\frac {r_{i}}{R}}-{\frac {p_{i}}{P}}$ , so from the definition of the $L^{1}$ -norm we can replace the norm with the summation:

$D={\frac {1}{2}}\sum _{i=1}^{N}\left|{\frac {r_{i}}{R}}-{\frac {p_{i}}{P}}\right|$

This shows that the linear algebra formula for the index of dissimilarity is equivalent to the basic formula for it:

$D={\frac {1}{2}}|{\hat {\mathbf {r} }}-{\hat {\mathbf {p} }}|_{1}={\frac {1}{2}}\sum _{i=1}^{N}\left|{\frac {r_{i}}{R}}-{\frac {p_{i}}{P}}\right|$

Zero segregation

When the Index of Dissimilarity is zero, this means that the community we are studying has zero segregation. For example, if we are studying the segregation of rich and poor people in a city, then if $D=0$ , it means that:

There are no blocks in the city which are "rich blocks", and there are no blocks in the city which are "poor blocks"
There is a homogeneous distribution of rich and poor people throughout the city

If we set $D=0$ in the linear algebraic formula, we get the necessary and sufficient condition for having zero segregation, because the $L^{1}$ -norm of a vector is zero only when every entry is zero:

$\mathbf {\hat {r}} =\mathbf {\hat {p}}$

For example, suppose you have a city with 2 blocks. Each block has 4 rich people and 100 poor people:

$\mathbf {r} =[4,4]$

$\mathbf {p} =[100,100]$

Then, the total number of rich people is $R=4+4=8$ , and the total number of poor people is $P=100+100=200$ . Thus:

$\mathbf {\hat {r}} =[4/8,4/8]=[0.5,0.5]$

$\mathbf {\hat {p}} =[100/200,100/200]=[0.5,0.5]$

Because $\mathbf {\hat {r}} =\mathbf {\hat {p}}$ , this city has zero segregation.

As another example, suppose you have a city with 3 blocks:

$\mathbf {r} =[1,2,3]$

$\mathbf {p} =[100,200,300]$

Then, we have $R=1+2+3=6$ rich people in our city, and $P=100+200+300=600$ poor people. Thus:

$\mathbf {\hat {r}} =[1/6,2/6,3/6]$

$\mathbf {\hat {p}} =[100/600,200/600,300/600]=[1/6,2/6,3/6]$

Again, because $\mathbf {\hat {r}} =\mathbf {\hat {p}}$ , this city also has zero segregation.
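Both zero-segregation examples can be checked exactly. The sketch below uses Python's `fractions.Fraction` so that the comparison with zero involves no floating-point rounding; the function name `dissimilarity_exact` is hypothetical, and integer counts are assumed. The last two calls also illustrate that multiplying one group's counts by a constant leaves the index unchanged.

```python
from fractions import Fraction


def dissimilarity_exact(r, p):
    """Index of dissimilarity computed with exact rational arithmetic."""
    R, P = sum(r), sum(p)
    return sum(abs(Fraction(ri, R) - Fraction(pi, P)) for ri, pi in zip(r, p)) / 2


# The two zero-segregation cities from the text.
print(dissimilarity_exact([4, 4], [100, 100]))          # 0
print(dissimilarity_exact([1, 2, 3], [100, 200, 300]))  # 0

# Scaling the poor counts by 50 changes P but not the shares, so D is unchanged.
print(dissimilarity_exact([2, 0, 1, 1], [0, 2, 1, 1]))      # 1/2
print(dissimilarity_exact([2, 0, 1, 1], [0, 100, 50, 50]))  # 1/2
```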

References

[1] "Housing Patterns: Appendix B: Measures of Residential Segregation". Census.gov. United States Census Bureau. Retrieved 2022-04-28.
[2] Massey, Douglas S.; Rothwell, Jonathan; Domina, Thurston (2009-10-26). "The Changing Bases of Segregation in the United States". The Annals of the American Academy of Political and Social Science. 626 (1): 74–90. doi:10.1177/0002716209343558. ISSN 0002-7162. PMC 3844132. PMID 24298193.
[3] White, Michael J. (1986). "Segregation and Diversity Measures in Population Distribution". Population Index. 52 (2): 198–221. doi:10.2307/3644339. ISSN 0032-4701.
[4] Massey, Douglas S.; Denton, Nancy A. (December 1988). "The Dimensions of Residential Segregation". Social Forces. 67 (2): 281. doi:10.2307/2579183. ISSN 0037-7732.
[5] Wolfram MathWorld: L1 Norm

External links

http://enceladus.isr.umich.edu/race/calculate.html Archived 2011-05-17 at the Wayback Machine

