
Diversity index

A diversity index is a quantitative measure that reflects how many different types (e.g. species) there are in a dataset (e.g. a community). More sophisticated indices also account for the phylogenetic relatedness among the types.[1] Diversity indices are statistical representations of different aspects of biodiversity (e.g. richness, evenness, and dominance) that serve as useful simplifications for comparing different communities or sites.

Effective number of species or Hill numbers

When diversity indices are used in ecology, the types of interest are usually species, but they can also be other categories, such as genera, families, functional types, or haplotypes. The entities of interest are usually individual organisms (e.g. plants or animals), and the measure of abundance can be, for example, number of individuals, biomass or coverage. In demography, the entities of interest can be people, and the types of interest various demographic groups. In information science, the entities can be characters and the types the different letters of the alphabet. The most commonly used diversity indices are simple transformations of the effective number of types (also known as 'true diversity'), but each diversity index can also be interpreted in its own right as a measure corresponding to some real phenomenon (a different one for each index).[2][3][4][5]

Many indices only account for categorical diversity between subjects or entities. Such indices, however, do not account for the total variation (diversity) between subjects or entities, which is captured only when both categorical and qualitative diversity are calculated.

True diversity, or the effective number of types, refers to the number of equally abundant types needed for the average proportional abundance of the types to equal that observed in the dataset of interest (where all types may not be equally abundant). The true diversity in a dataset is calculated by first taking the weighted generalized mean $M_{q-1}$ of the proportional abundances of the types in the dataset, and then taking the reciprocal of this. The equation is:[4][5]

$${}^qD = \frac{1}{M_{q-1}} = \frac{1}{\sqrt[q-1]{\sum_{i=1}^{R} p_i p_i^{q-1}}} = \left(\sum_{i=1}^{R} p_i^q\right)^{1/(1-q)}$$

The denominator $M_{q-1}$ equals the average proportional abundance of the types in the dataset as calculated with the weighted generalized mean with exponent q − 1. In the equation, R is richness (the total number of types in the dataset), and the proportional abundance of the ith type is $p_i$. The proportional abundances themselves are used as the nominal weights. The numbers ${}^qD$ are called Hill numbers of order q or effective numbers of species.[6]

When q = 1, the above equation is undefined. However, the mathematical limit as q approaches 1 is well defined, and the corresponding diversity is calculated with the following equation:

$${}^1D = \frac{1}{\prod_{i=1}^{R} p_i^{p_i}} = \exp\left(-\sum_{i=1}^{R} p_i \ln p_i\right)$$

which is the exponential of the Shannon entropy calculated with natural logarithms (see the Shannon index section below). In other domains, this statistic is also known as the perplexity.

The general equation of diversity is often written in the form[2][3]

$${}^qD = \left(\sum_{i=1}^{R} p_i^q\right)^{1/(1-q)}$$

and the term inside the parentheses is called the basic sum. Some popular diversity indices correspond to the basic sum as calculated with different values of q.[3]
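The general equation and its q = 1 limit can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `hill_number` and the abundance values are assumptions made for the example, and absent types (zero abundance) are simply dropped.

```python
import math

def hill_number(abundances, q):
    """Effective number of types (Hill number) of order q.

    `abundances` may be raw counts or proportions; they are normalised here.
    """
    total = sum(abundances)
    # Drop absent types: they add nothing to the basic sum and would break log(0).
    p = [a / total for a in abundances if a > 0]
    if math.isclose(q, 1.0):
        # Limit q -> 1: exponential of the Shannon entropy (natural logarithm).
        return math.exp(-sum(x * math.log(x) for x in p))
    basic_sum = sum(x ** q for x in p)
    return basic_sum ** (1.0 / (1.0 - q))

counts = [50, 30, 15, 5]  # made-up abundances for four types
for q in (0, 1, 2):
    print(f"q = {q}: D = {hill_number(counts, q):.4f}")
```

With q = 0 every present type contributes 1 to the basic sum, so the result is the richness (here 4). For a perfectly even dataset the function returns R for every q.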

Sensitivity of the diversity value to rare vs. abundant species

The value of q is often referred to as the order of the diversity. It defines the sensitivity of the true diversity to rare vs. abundant species by modifying how the weighted mean of the species' proportional abundances is calculated. With some values of the parameter q, the value of the generalized mean $M_{q-1}$ assumes familiar kinds of weighted means as special cases. In particular,

  • q = 0 corresponds to the weighted harmonic mean,
  • q = 1 to the weighted geometric mean, and
  • q = 2 to the weighted arithmetic mean.
  • As q approaches infinity, the weighted generalized mean with exponent q − 1 approaches the maximum $p_i$ value, which is the proportional abundance of the most abundant species in the dataset.

Generally, increasing the value of q increases the effective weight given to the most abundant species. This leads to a larger $M_{q-1}$ value and a smaller true diversity (${}^qD$) value with increasing q.

When q = 1, the weighted geometric mean of the $p_i$ values is used, and each species is exactly weighted by its proportional abundance (in the weighted geometric mean, the weights are the exponents). When q > 1, the weight given to abundant species is exaggerated, and when q < 1, the weight given to rare species is exaggerated. At q = 0, the species weights exactly cancel out the species proportional abundances, such that the weighted mean of the $p_i$ values equals 1/R even when all species are not equally abundant. At q = 0, the effective number of species, ${}^0D$, hence equals the actual number of species R. In the context of diversity, q is generally limited to non-negative values. This is because negative values of q would give rare species so much more weight than abundant ones that ${}^qD$ would exceed R.[4][5]
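The effect of the order q can be illustrated with a small diversity profile. The sketch below is an assumption-laden example rather than an established routine: the helper `hill_number` and the strongly uneven proportions are invented for illustration.

```python
import math

def hill_number(p, q):
    """True diversity of order q for a list of proportions summing to 1."""
    if math.isclose(q, 1.0):
        return math.exp(-sum(x * math.log(x) for x in p if x > 0))
    return sum(x ** q for x in p if x > 0) ** (1.0 / (1.0 - q))

p = [0.90, 0.05, 0.03, 0.02]  # hypothetical, strongly uneven community
orders = [0, 0.5, 1, 2, 5, 50]
profile = [hill_number(p, q) for q in orders]

for q, d in zip(orders, profile):
    print(f"q = {q:>4}: D = {d:.3f}")
```

The printed values start at the richness R = 4 for q = 0 and shrink as q grows, moving toward 1 / max($p_i$) ≈ 1.11, which reflects the increasing weight given to the dominant type.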

Richness

Richness R simply quantifies how many different types the dataset of interest contains. For example, species richness (usually noted S) is simply the number of species, e.g. at a particular site. Richness is a simple measure, so it has been a popular diversity index in ecology, where abundance data are often not available.[7] If true diversity is calculated with q = 0, the effective number of types (${}^0D$) equals the actual number of types, which is identical to richness (R).[3][5]

Shannon index

The Shannon index has been a popular diversity index in the ecological literature, where it is also known as Shannon's diversity index, Shannon–Wiener index, and (erroneously) Shannon–Weaver index.[8] The measure was originally proposed by Claude Shannon in 1948 to quantify the entropy (hence Shannon entropy, related to Shannon information content) in strings of text.[9] The idea is that the more letters there are, and the closer their proportional abundances in the string of interest, the more difficult it is to correctly predict which letter will be the next one in the string. The Shannon entropy quantifies the uncertainty (entropy or degree of surprise) associated with this prediction. It is most often calculated as follows:

$$H' = -\sum_{i=1}^{R} p_i \ln p_i$$

where $p_i$ is the proportion of characters belonging to the ith type of letter in the string of interest. In ecology, $p_i$ is often the proportion of individuals belonging to the ith species in the dataset of interest. Then the Shannon entropy quantifies the uncertainty in predicting the species identity of an individual that is taken at random from the dataset.

Although the equation is here written with natural logarithms, the base of the logarithm used when calculating the Shannon entropy can be chosen freely. Shannon himself discussed logarithm bases 2, 10 and e, and these have since become the most popular bases in applications that use the Shannon entropy. Each log base corresponds to a different measurement unit, which has been called binary digits (bits), decimal digits (decits), and natural digits (nats) for the bases 2, 10 and e, respectively. Comparing Shannon entropy values that were originally calculated with different log bases requires converting them to the same log base: change from the base a to base b is obtained with multiplication by $\log_b a$.[9]
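The base conversion can be checked numerically. The following is a minimal sketch; the function name `shannon_entropy` and the example proportions are assumptions chosen so that the result in bits is easy to verify by hand.

```python
import math

def shannon_entropy(p, base=math.e):
    """Shannon entropy of proportions `p` using logarithms of the given base."""
    return -sum(x * math.log(x, base) for x in p if x > 0)

p = [0.5, 0.25, 0.25]  # illustrative proportions

h_nats = shannon_entropy(p)          # base e
h_bits = shannon_entropy(p, base=2)  # base 2

# Changing from base a = e to base b = 2 multiplies by log_b(a) = log2(e).
converted = h_nats * math.log(math.e, 2)

print(f"{h_nats:.4f} nats, {h_bits:.4f} bits, converted: {converted:.4f} bits")
```

For these proportions the entropy is 0.5·1 + 0.25·2 + 0.25·2 = 1.5 bits, and the converted value agrees with the direct base-2 calculation up to floating-point error.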

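The Shannon index (H') is related to the weighted geometric mean of the proportional abundances of the types. Specifically, it equals the logarithm of true diversity as calculated with q = 1:[4]

$$H' = -\sum_{i=1}^{R} p_i \ln p_i = -\sum_{i=1}^{R} \ln p_i^{p_i}$$

This can also be written

$$H' = -\left(\ln p_1^{p_1} + \ln p_2^{p_2} + \ln p_3^{p_3} + \cdots + \ln p_R^{p_R}\right)$$

which equals

$$H' = -\ln\left(p_1^{p_1} p_2^{p_2} p_3^{p_3} \cdots p_R^{p_R}\right) = \ln\left(\frac{1}{p_1^{p_1} p_2^{p_2} p_3^{p_3} \cdots p_R^{p_R}}\right) = \ln\left(\frac{1}{\prod_{i=1}^{R} p_i^{p_i}}\right)$$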

Since the sum of the $p_i$ values equals 1 by definition, the denominator equals the weighted geometric mean of the $p_i$ values, with the $p_i$ values themselves being used as the weights (exponents in the equation). The term within the parentheses hence equals true diversity ${}^1D$, and H' equals $\ln({}^1D)$.[2][4][5]

When all types in the dataset of interest are equally common, all $p_i$ values equal 1/R, and the Shannon index hence takes the value ln(R). The more unequal the abundances of the types, the larger the weighted geometric mean of the $p_i$ values, and the smaller the corresponding Shannon entropy. If practically all abundance is concentrated in one type, and the other types are very rare (even if there are many of them), Shannon entropy approaches zero. When there is only one type in the dataset, Shannon entropy exactly equals zero (there is no uncertainty in predicting the type of the next randomly chosen entity).
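These limiting cases are easy to reproduce. The sketch below assumes a hypothetical helper `shannon_index` and three invented datasets (even, strongly skewed, and single-type); it prints H' together with exp(H'), which is the true diversity of order 1.

```python
import math

def shannon_index(p):
    """Shannon index H' in nats; zero proportions are skipped (0 * ln 0 is taken as 0)."""
    return sum(-x * math.log(x) for x in p if x > 0)

R = 5
even = [1 / R] * R                        # all types equally common
skewed = [0.96, 0.01, 0.01, 0.01, 0.01]   # one dominant type, four rare ones
single = [1.0]                            # only one type present

for name, p in [("even", even), ("skewed", skewed), ("single", single)]:
    h = shannon_index(p)
    print(f"{name:>6}: H' = {h:.4f}, exp(H') = {math.exp(h):.4f}")
```

The even dataset gives H' = ln(5) and an effective number of types equal to 5, the skewed dataset gives a much smaller value despite having the same richness, and the single-type dataset gives exactly zero.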

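In machine learning, the Shannon entropy is used as an impurity measure when growing decision trees, and the reduction in entropy achieved by a split is called information gain.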

Rényi entropy

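The Rényi entropy is a generalization of the Shannon entropy to other values of q than 1. It can be expressed:

$${}^qH = \frac{1}{1-q} \ln\left(\sum_{i=1}^{R} p_i^q\right)$$

which equals

$${}^qH = \ln\left(\frac{1}{\sqrt[q-1]{\sum_{i=1}^{R} p_i p_i^{q-1}}}\right) = \ln\left({}^qD\right)$$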

This means that taking the logarithm of true diversity based on any value of q gives the Rényi entropy corresponding to the same value of q.
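The identity between the Rényi entropy and the logarithm of true diversity can be verified numerically. In this sketch the two quantities are computed independently from their own formulas; the function names and the proportions are illustrative assumptions, and q = 1 is excluded because both formulas are undefined there.

```python
import math

def renyi_entropy(p, q):
    """Rényi entropy of order q (natural logarithm); assumes q != 1."""
    return math.log(sum(x ** q for x in p if x > 0)) / (1.0 - q)

def hill_number(p, q):
    """True diversity of order q computed from the basic sum; assumes q != 1."""
    return sum(x ** q for x in p if x > 0) ** (1.0 / (1.0 - q))

p = [0.5, 0.3, 0.2]  # illustrative proportions
for q in (0, 0.5, 2, 3):
    h = renyi_entropy(p, q)
    log_d = math.log(hill_number(p, q))
    print(f"q = {q}: Renyi entropy = {h:.6f}, ln(D) = {log_d:.6f}")
```

The two columns agree up to floating-point error for every order tested.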

Simpson index

The Simpson index was introduced in 1949 by Edward H. Simpson to measure the degree of concentration when individuals are classified into types.[10] The same index was rediscovered by Orris C. Herfindahl in 1950.[11] The square root of the index had already been introduced in 1945 by the economist Albert O. Hirschman.[12] As a result, the same measure is usually known as the Simpson index in ecology, and as the Herfindahl index or the Herfindahl–Hirschman index (HHI) in economics.

The measure equals the probability that two entities taken at random from the dataset of interest represent the same type.[10] It equals:

$$\lambda = \sum_{i=1}^{R} p_i^2$$

where R is richness (the total number of types in the dataset). This equation is also equal to the weighted arithmetic mean of the proportional abundances $p_i$ of the types of interest, with the proportional abundances themselves being used as the weights.[2] Proportional abundances are by definition constrained to values between zero and one, so λ ≤ 1; and because λ is a weighted arithmetic mean of the $p_i$ values, λ ≥ 1/R, with equality reached when all types are equally abundant.

By comparing the equation used to calculate λ with the equations used to calculate true diversity, it can be seen that 1/λ equals ${}^2D$, i.e., true diversity as calculated with q = 2. The original Simpson's index hence equals the corresponding basic sum.[3]

The interpretation of λ as the probability that two entities taken at random from the dataset of interest represent the same type assumes that the first entity is returned to the dataset before the second entity is taken. If the dataset is very large, sampling without replacement gives approximately the same result, but in small datasets, the difference can be substantial. If the dataset is small, and sampling without replacement is assumed, the probability of obtaining the same type with both random draws is:

$$\ell = \frac{\sum_{i=1}^{R} n_i (n_i - 1)}{N (N - 1)}$$

where $n_i$ is the number of entities belonging to the ith type and N is the total number of entities in the dataset.[10] This form of the Simpson index is also known as the Hunter–Gaston index in microbiology.[13]
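The difference between the two sampling assumptions can be seen by evaluating both formulas on a small and a large dataset with the same proportions. This is a sketch only; the function names and the counts are made up for the example.

```python
def simpson_with_replacement(counts):
    """lambda: sum of squared proportional abundances."""
    n = sum(counts)
    return sum((c / n) ** 2 for c in counts)

def simpson_without_replacement(counts):
    """l: probability that two draws without replacement share a type."""
    n = sum(counts)
    return sum(c * (c - 1) for c in counts) / (n * (n - 1))

small = [3, 2, 1]           # N = 6
large = [3000, 2000, 1000]  # same proportions, N = 6000

for name, counts in [("small", small), ("large", large)]:
    lam = simpson_with_replacement(counts)
    ell = simpson_without_replacement(counts)
    print(f"{name}: with replacement = {lam:.4f}, without = {ell:.4f}")
```

For the small dataset the values are 14/36 and 8/30, a clear gap, whereas for the large dataset the two results differ only in the fourth decimal place.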

Since the mean proportional abundance of the types increases with decreasing number of types and increasing abundance of the most abundant type, λ obtains small values in datasets of high diversity and large values in datasets of low diversity. This is counterintuitive behavior for a diversity index, so often, such transformations of λ that increase with increasing diversity have been used instead. The most popular of such indices have been the inverse Simpson index (1/λ) and the Gini–Simpson index (1 − λ).[2][3] Both of these have also been called the Simpson index in the ecological literature, so care is needed to avoid accidentally comparing the different indices as if they were the same.

Inverse Simpson index

The inverse Simpson index equals:

$$\frac{1}{\lambda} = \frac{1}{\sum_{i=1}^{R} p_i^2} = {}^2D$$

This simply equals true diversity of order 2, i.e. the effective number of types that is obtained when the weighted arithmetic mean is used to quantify average proportional abundance of types in the dataset of interest.

The index is also used as a measure of the effective number of parties.

Gini–Simpson index

The Gini–Simpson index is also called Gini impurity, or Gini's diversity index,[14] in the field of machine learning. The original Simpson index λ equals the probability that two entities taken at random from the dataset of interest (with replacement) represent the same type. Its transformation 1 − λ, therefore, equals the probability that the two entities represent different types. This measure is also known in ecology as the probability of interspecific encounter (PIE)[15] and the Gini–Simpson index.[3] It can be expressed as a transformation of the true diversity of order 2:

$$1 - \lambda = 1 - \sum_{i=1}^{R} p_i^2 = 1 - \frac{1}{{}^2D}$$

The Gibbs–Martin index of sociology, psychology, and management studies,[16] which is also known as the Blau index, is the same measure as the Gini–Simpson index.

The quantity is also known as the expected heterozygosity in population genetics.
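The relationship between the Simpson index and its two common transformations can be sketched as follows. The variable names and proportions are assumptions for illustration only.

```python
def simpson(p):
    """Simpson index lambda: probability that two draws (with replacement) share a type."""
    return sum(x * x for x in p)

p = [0.5, 0.3, 0.2]  # illustrative proportions

lam = simpson(p)
inverse_simpson = 1.0 / lam   # true diversity of order 2
gini_simpson = 1.0 - lam      # probability that two draws differ in type

print(f"lambda = {lam:.4f}")
print(f"inverse Simpson = {inverse_simpson:.4f}")
print(f"Gini-Simpson = {gini_simpson:.4f}")
```

Here λ = 0.25 + 0.09 + 0.04 = 0.38, so the Gini–Simpson index is 0.62, and it can equally be recovered as 1 − 1/(inverse Simpson index).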

Berger–Parker index

The Berger–Parker[17] index equals the maximum $p_i$ value in the dataset, i.e., the proportional abundance of the most abundant type. This corresponds to the weighted generalized mean of the $p_i$ values when q approaches infinity, and hence equals the inverse of the true diversity of order infinity ($1/{}^{\infty}D$).
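The limiting behaviour can be observed numerically by evaluating the inverse of the true diversity for increasing orders. The sketch below uses hypothetical helper names and invented proportions; very large q would eventually underflow in floating point, so only moderate orders are used.

```python
def berger_parker(p):
    """Proportional abundance of the most abundant type."""
    return max(p)

def hill_number(p, q):
    """True diversity of order q from the basic sum; assumes q != 1."""
    return sum(x ** q for x in p if x > 0) ** (1.0 / (1.0 - q))

p = [0.5, 0.3, 0.2]  # illustrative proportions
print(f"Berger-Parker index: {berger_parker(p):.4f}")
for q in (2, 10, 100):
    print(f"q = {q:>3}: 1 / D = {1.0 / hill_number(p, q):.4f}")
```

As q increases, 1/D moves toward the Berger–Parker value of 0.5.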

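See also

  • Alpha diversity
  • Beta diversity
  • Cultural diversity
  • Effective number of parties, a diversity index applied to political parties
  • Gamma diversity
  • Generalized entropy index
  • Gini coefficient
  • Isolation index
  • Measurement of biodiversity
  • Qualitative variation
  • Relative abundance
  • Species diversity
  • Species richness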

References

  1. Tucker, Caroline M.; Cadotte, Marc W.; Carvalho, Silvia B.; Davies, T. Jonathan; Ferrier, Simon; Fritz, Susanne A.; Grenyer, Rich; Helmus, Matthew R.; Jin, Lanna S. (May 2017). "A guide to phylogenetic metrics for conservation, community ecology and macroecology: A guide to phylogenetic metrics for ecology". Biological Reviews. 92 (2): 698–715. doi:10.1111/brv.12252. PMC 5096690. PMID 26785932.
  2. Hill, M. O. (1973). "Diversity and evenness: a unifying notation and its consequences". Ecology. 54 (2): 427–432. Bibcode:1973Ecol...54..427H. doi:10.2307/1934352. JSTOR 1934352.
  3. Jost, L (2006). "Entropy and diversity". Oikos. 113 (2): 363–375. Bibcode:2006Oikos.113..363J. doi:10.1111/j.2006.0030-1299.14714.x.
  4. Tuomisto, H (2010). "A diversity of beta diversities: straightening up a concept gone awry. Part 1. Defining beta diversity as a function of alpha and gamma diversity". Ecography. 33 (1): 2–22. Bibcode:2010Ecogr..33....2T. doi:10.1111/j.1600-0587.2009.05880.x.
  5. Tuomisto, H (2010). "A consistent terminology for quantifying species diversity? Yes, it does exist". Oecologia. 164 (4): 853–860. Bibcode:2010Oecol.164..853T. doi:10.1007/s00442-010-1812-0. PMID 20978798. S2CID 19902787.
  6. Chao, Anne; Chiu, Chun-Huo; Jost, Lou (2016), "Phylogenetic Diversity Measures and Their Decomposition: A Framework Based on Hill Numbers", Biodiversity Conservation and Phylogenetic Systematics, Topics in Biodiversity and Conservation, Springer International Publishing, vol. 14, pp. 141–172, doi:10.1007/978-3-319-22461-9_8, ISBN 9783319224602
  7. Morris, E. Kathryn; Caruso, Tancredi; Buscot, François; Fischer, Markus; Hancock, Christine; Maier, Tanja S.; Meiners, Torsten; Müller, Caroline; Obermaier, Elisabeth; Prati, Daniel; Socher, Stephanie A.; Sonnemann, Ilja; Wäschke, Nicole; Wubet, Tesfaye; Wurst, Susanne (September 2014). "Choosing and using diversity indices: insights for ecological applications from the German Biodiversity Exploratories". Ecology and Evolution. 4 (18): 3514–3524. Bibcode:2014EcoEv...4.3514M. doi:10.1002/ece3.1155. ISSN 2045-7758. PMC 4224527. PMID 25478144.
  8. Spellerberg, Ian F.; Fedor, Peter J. (2003). "A tribute to Claude Shannon (1916–2001) and a plea for more rigorous use of species richness, species diversity and the 'Shannon–Wiener' Index". Global Ecology and Biogeography. 12 (3): 177–179.
  9. Shannon, C. E. (1948). "A mathematical theory of communication". The Bell System Technical Journal. 27: 379–423 and 623–656.
  10. Simpson, E. H. (1949). "Measurement of diversity". Nature. 163 (4148): 688. Bibcode:1949Natur.163..688S. doi:10.1038/163688a0.
  11. Herfindahl, O. C. (1950). Concentration in the U.S. Steel Industry. Unpublished doctoral dissertation, Columbia University.
  12. Hirschman, A. O. (1945). National power and the structure of foreign trade. Berkeley.
  13. Hunter, PR; Gaston, MA (1988). "Numerical index of the discriminatory ability of typing systems: an application of Simpson's index of diversity". J Clin Microbiol. 26 (11): 2465–2466. doi:10.1128/JCM.26.11.2465-2466.1988. PMC 266921. PMID 3069867.
  14. "Growing Decision Trees". MathWorks.
  15. Hurlbert, S.H. (1971). "The nonconcept of species diversity: A critique and alternative parameters". Ecology. 52 (4): 577–586. Bibcode:1971Ecol...52..577H. doi:10.2307/1934145. JSTOR 1934145. PMID 28973811. S2CID 25837001.
  16. Gibbs, Jack P.; Martin, William T. (1962). "Urbanization, technology and the division of labor". American Sociological Review. 27 (5): 667–677. doi:10.2307/2089624. JSTOR 2089624.
  17. Berger, Wolfgang H.; Parker, Frances L. (June 1970). "Diversity of Planktonic Foraminifera in Deep-Sea Sediments". Science. 168 (3937): 1345–1347. Bibcode:1970Sci...168.1345B. doi:10.1126/science.168.3937.1345. PMID 17731043. S2CID 29553922.

Further reading

  • Colinvaux, Paul A. (1973). Introduction to Ecology. Wiley. ISBN 0-471-16498-4.
  • Cover, Thomas M.; Thomas, Joy A. (1991). Elements of Information Theory. Wiley. ISBN 0-471-06259-6. See chapter 5 for an elaboration of coding procedures described informally above.
  • Chao, A.; Shen, T-J. (2003). "Nonparametric estimation of Shannon's index of diversity when there are unseen species in sample" (PDF). Environmental and Ecological Statistics. 10 (4): 429–443. doi:10.1023/A:1026096204727. S2CID 20389926.

External links

  • Simpson's Diversity index
  • Diversity indices (archived 2005-12-19 at the Wayback Machine) gives some examples of estimates of Simpson's index for real ecosystems.

diversity, index, this, article, multiple, issues, please, help, improve, discuss, these, issues, talk, page, learn, when, remove, these, template, messages, this, article, need, rewritten, comply, with, wikipedia, quality, standards, help, talk, page, contain. This article has multiple issues Please help improve it or discuss these issues on the talk page Learn how and when to remove these template messages This article may need to be rewritten to comply with Wikipedia s quality standards You can help The talk page may contain suggestions April 2020 This article may be too technical for most readers to understand Please help improve it to make it understandable to non experts without removing the technical details April 2020 Learn how and when to remove this template message Learn how and when to remove this template message A diversity index is a quantitative measure that reflects how many different types e g species there are in a dataset e g a community More sophisticated indices accounting for the phylogenetic relatedness among the types 1 Diversity indices are statistical representations of different aspects of biodiversity e g richness evenness and dominance that are useful simplifications to compare different communities or sites Contents 1 Effective number of species or Hill numbers 2 Sensitivity of the diversity value to rare vs abundant species 3 Richness 4 Shannon index 4 1 Renyi entropy 5 Simpson index 5 1 Inverse Simpson index 5 2 Gini Simpson index 6 Berger Parker index 7 See also 8 References 9 Further reading 10 External linksEffective number of species or Hill numbers editWhen diversity indices are used in ecology the types of interest are usually species but they can also be other categories such as genera families functional types or haplotypes The entities of interest are usually individual organisms e g plants or animals and the measure of abundance can be for example number of individuals biomass or coverage In demography the entities of 
interest can be people and the types of interest various demographic groups In information science the entities can be characters and the types of the different letters of the alphabet The most commonly used diversity indices are simple transformations of the effective number of types also known as true diversity but each diversity index can also be interpreted in its own right as a measure corresponding to some real phenomenon but a different one for each diversity index 2 3 4 5 Many indices only account for categorical diversity between subjects or entities Such indices however do not account for the total variation diversity that can be held between subjects or entities which occurs only when both categorical and qualitative diversity are calculated True diversity or the effective number of types refers to the number of equally abundant types needed for the average proportional abundance of the types to equal that observed in the dataset of interest where all types may not be equally abundant The true diversity in a dataset is calculated by first taking the weighted generalized mean Mq 1 of the proportional abundances of the types in the dataset and then taking the reciprocal of this The equation is 4 5 q D 1 M q 1 1 i 1 R p i p i q 1 q 1 i 1 R p i q 1 1 q displaystyle q D 1 over M q 1 1 over sqrt q 1 sum i 1 R p i p i q 1 left sum i 1 R p i q right 1 1 q nbsp The denominator Mq 1 equals the average proportional abundance of the types in the dataset as calculated with the weighted generalized mean with exponent q 1 In the equation R is richness the total number of types in the dataset and the proportional abundance of the i th type is pi The proportional abundances themselves are used as the nominal weights The numbers q D displaystyle q D nbsp are called Hill numbers of order q or effective number of species 6 When q 1 the above equation is undefined However the mathematical limit as q approaches 1 is well defined and the corresponding diversity is calculated 
with the following equation 1 D 1 i 1 R p i p i exp i 1 R p i ln p i displaystyle 1 D 1 over prod i 1 R p i p i exp left sum i 1 R p i ln p i right nbsp which is the exponential of the Shannon entropy calculated with natural logarithms see above In other domains this statistic is also known as the perplexity The general equation of diversity is often written in the form 2 3 q D i 1 R p i q 1 1 q displaystyle q D left sum i 1 R p i q right 1 1 q nbsp and the term inside the parentheses is called the basic sum Some popular diversity indices correspond to the basic sum as calculated with different values of q 3 Sensitivity of the diversity value to rare vs abundant species editThe value of q is often referred to as the order of the diversity It defines the sensitivity of the true diversity to rare vs abundant species by modifying how the weighted mean of the species proportional abundances is calculated With some values of the parameter q the value of the generalized mean Mq 1 assumes familiar kinds of weighted means as special cases In particular q 0 corresponds to the weighted harmonic mean q 1 to the weighted geometric mean and q 2 to the weighted arithmetic mean As q approaches infinity the weighted generalized mean with exponent q 1 approaches the maximum pi value which is the proportional abundance of the most abundant species in the dataset Generally increasing the value of q increases the effective weight given to the most abundant species This leads to obtaining a larger Mq 1 value and a smaller true diversity qD value with increasing q When q 1 the weighted geometric mean of the pi values is used and each species is exactly weighted by its proportional abundance in the weighted geometric mean the weights are the exponents When q gt 1 the weight given to abundant species is exaggerated and when q lt 1 the weight given to rare species is At q 0 the species weights exactly cancel out the species proportional abundances such that the weighted mean of the pi 
values equals 1 R even when all species are not equally abundant At q 0 the effective number of species 0D hence equals the actual number of species R In the context of diversity q is generally limited to non negative values This is because negative values of q would give rare species so much more weight than abundant ones that qD would exceed R 4 5 Richness editMain article Species richness Richness R simply quantifies how many different types the dataset of interest contains For example species richness usually noted S is simply the number of species e g at a particular site Richness is a simple measure so it has been a popular diversity index in ecology where abundance data are often not available 7 If true diversity is calculated with q 0 the effective number of types 0D equals the actual number of types which is identical to Richness R 3 5 Shannon index editThe Shannon index has been a popular diversity index in the ecological literature where it is also known as Shannon s diversity index Shannon Wiener index and erroneously Shannon Weaver index 8 The measure was originally proposed by Claude Shannon in 1948 to quantify the entropy hence Shannon entropy related to Shannon information content in strings of text 9 The idea is that the more letters there are and the closer their proportional abundances in the string of interest the more difficult it is to correctly predict which letter will be the next one in the string The Shannon entropy quantifies the uncertainty entropy or degree of surprise associated with this prediction It is most often calculated as follows H i 1 R p i ln p i displaystyle H sum i 1 R p i ln p i nbsp where pi is the proportion of characters belonging to the i th type of letter in the string of interest In ecology pi is often the proportion of individuals belonging to the i th species in the dataset of interest Then the Shannon entropy quantifies the uncertainty in predicting the species identity of an individual that is taken at random 
from the dataset Although the equation is here written with natural logarithms the base of the logarithm used when calculating the Shannon entropy can be chosen freely Shannon himself discussed logarithm bases 2 10 and e and these have since become the most popular bases in applications that use the Shannon entropy Each log base corresponds to a different measurement unit which has been called binary digits bits decimal digits decits and natural digits nats for the bases 2 10 and e respectively Comparing Shannon entropy values that were originally calculated with different log bases requires converting them to the same log base change from the base a to base b is obtained with multiplication by logba 9 The Shannon index H is related to the weighted geometric mean of the proportional abundances of the types Specifically it equals the logarithm of true diversity as calculated with q 1 4 H i 1 R p i ln p i i 1 R ln p i p i displaystyle H sum i 1 R p i ln p i sum i 1 R ln p i p i nbsp This can also be written H ln p 1 p 1 ln p 2 p 2 ln p 3 p 3 ln p R p R displaystyle H ln p 1 p 1 ln p 2 p 2 ln p 3 p 3 cdots ln p R p R nbsp which equals H ln p 1 p 1 p 2 p 2 p 3 p 3 p R p R ln 1 p 1 p 1 p 2 p 2 p 3 p 3 p R p R ln 1 i 1 R p i p i displaystyle H ln p 1 p 1 p 2 p 2 p 3 p 3 cdots p R p R ln left 1 over p 1 p 1 p 2 p 2 p 3 p 3 cdots p R p R right ln left 1 over prod i 1 R p i p i right nbsp Since the sum of the pi values equals 1 by definition the denominator equals the weighted geometric mean of the pi values with the pi values themselves being used as the weights exponents in the equation The term within the parentheses hence equals true diversity 1D and H equals ln 1D 2 4 5 When all types in the dataset of interest are equally common all pi values equal 1 R and the Shannon index hence takes the value ln R The more unequal the abundances of the types the larger the weighted geometric mean of the pi values and the smaller the corresponding Shannon entropy If practically all 
abundance is concentrated to one type and the other types are very rare even if there are many of them Shannon entropy approaches zero When there is only one type in the dataset Shannon entropy exactly equals zero there is no uncertainty in predicting the type of the next randomly chosen entity In machine learning the Shannon index is also called as Information gain Renyi entropy edit The Renyi entropy is a generalization of the Shannon entropy to other values of q than 1 It can be expressed q H 1 1 q ln i 1 R p i q displaystyle q H frac 1 1 q ln left sum i 1 R p i q right nbsp which equals q H ln 1 i 1 R p i p i q 1 q 1 ln q D displaystyle q H ln left 1 over sqrt q 1 sum i 1 R p i p i q 1 right ln q D nbsp This means that taking the logarithm of true diversity based on any value of q gives the Renyi entropy corresponding to the same value of q Simpson index editThe Simpson index was introduced in 1949 by Edward H Simpson to measure the degree of concentration when individuals are classified into types 10 The same index was rediscovered by Orris C Herfindahl in 1950 11 The square root of the index had already been introduced in 1945 by the economist Albert O Hirschman 12 As a result the same measure is usually known as the Simpson index in ecology and as the Herfindahl index or the Herfindahl Hirschman index HHI in economics The measure equals the probability that two entities taken at random from the dataset of interest represent the same type 10 It equals l i 1 R p i 2 displaystyle lambda sum i 1 R p i 2 nbsp where R is richness the total number of types in the dataset This equation is also equal to the weighted arithmetic mean of the proportional abundances pi of the types of interest with the proportional abundances themselves being used as the weights 2 Proportional abundances are by definition constrained to values between zero and one but it is a weighted arithmetic mean hence l 1 R which is reached when all types are equally abundant By comparing the 
equation used to calculate λ with the equations used to calculate true diversity, it can be seen that 1/λ equals ${}^{2}D$, i.e. true diversity as calculated with q = 2. The original Simpson's index hence equals the corresponding basic sum.[3]

The interpretation of λ as the probability that two entities taken at random from the dataset of interest represent the same type assumes that the first entity is returned to the dataset before the second entity is taken. If the dataset is very large, sampling without replacement gives approximately the same result, but in small datasets the difference can be substantial. If the dataset is small and sampling without replacement is assumed, the probability of obtaining the same type with both random draws is

$$\ell = \frac{\sum_{i=1}^{R} n_i (n_i - 1)}{N (N - 1)}$$

where n_i is the number of entities belonging to the i-th type and N is the total number of entities in the dataset.[10] In microbiology, the complement of this without-replacement form, 1 − ℓ, is known as the Hunter-Gaston index.[13]

Since the mean proportional abundance of the types increases with decreasing number of types and with increasing abundance of the most abundant type, λ takes small values in datasets of high diversity and large values in datasets of low diversity. This is counterintuitive behavior for a diversity index, so transformations of λ that increase with increasing diversity are often used instead. The most popular of these are the inverse Simpson index (1/λ) and the Gini-Simpson index (1 − λ).[2][3] Both have also been called the Simpson index in the ecological literature, so care is needed to avoid accidentally comparing the different indices as if they were the same.

Inverse Simpson index

The inverse Simpson index equals

$$\frac{1}{\lambda} = \frac{1}{\sum_{i=1}^{R} p_i^{2}} = {}^{2}D$$

This simply equals true diversity of order 2, i.e. the effective number of types that is obtained when the weighted arithmetic mean is used to quantify the average proportional abundance of types in the dataset of interest.

The index is also used as a measure of the effective number of parties.

Gini-Simpson index

In machine learning, the Gini-Simpson index is also called Gini impurity or Gini's diversity index.[14] The original Simpson index λ equals the probability that two entities taken at random from the dataset of interest (with replacement) represent the same type. Its transformation 1 − λ therefore equals the probability that the two entities represent different types. This measure is also known in ecology as the probability of interspecific encounter (PIE)[15] and the Gini-Simpson index.[3] It can be expressed as a transformation of the true diversity of order 2:

$$1 - \lambda = 1 - \sum_{i=1}^{R} p_i^{2} = 1 - \frac{1}{{}^{2}D}$$

The Gibbs-Martin index of sociology, psychology, and management studies,[16] which is also known as the Blau index, is the same measure as the Gini-Simpson index.

The quantity is also known as the expected heterozygosity in population genetics.

Berger-Parker index

The Berger-Parker[17] index equals the maximum p_i value in the dataset, i.e. the proportional abundance of the most abundant type. This corresponds to the weighted generalized mean of the p_i values when q approaches infinity, and hence equals the inverse of the true diversity of order infinity ($1/{}^{\infty}D$).

See also

Alpha diversity
Beta diversity
Cultural diversity
Effective number of parties: a diversity index applied to political parties
Gamma diversity
Generalized entropy index
Gini coefficient
Isolation index
Measurement of biodiversity
Qualitative variation
Relative abundance
Species diversity
Species richness

References

1. Tucker, Caroline M.; Cadotte, Marc W.; Carvalho, Silvia B.; Davies, T. Jonathan; Ferrier, Simon; Fritz, Susanne A.; Grenyer, Rich; Helmus, Matthew R.; Jin, Lanna S. (May 2017). "A guide to phylogenetic metrics for conservation, community ecology and macroecology: A guide to phylogenetic metrics for ecology". Biological Reviews. 92 (2): 698-715. doi:10.1111/brv.12252. PMC 5096690. PMID 26785932.
2. Hill, M. O. (1973). "Diversity and evenness: a unifying notation and its consequences". Ecology. 54 (2): 427-432. Bibcode:1973Ecol...54..427H. doi:10.2307/1934352. JSTOR 1934352.
3. Jost, L. (2006). "Entropy and diversity". Oikos. 113 (2): 363-375. Bibcode:2006Oikos.113..363J. doi:10.1111/j.2006.0030-1299.14714.x.
4. Tuomisto, H. (2010). "A diversity of beta diversities: straightening up a concept gone awry. Part 1. Defining beta diversity as a function of alpha and gamma diversity". Ecography. 33 (1): 2-22. Bibcode:2010Ecogr..33....2T. doi:10.1111/j.1600-0587.2009.05880.x.
5. Tuomisto, H. (2010). "A consistent terminology for quantifying species diversity? Yes, it does exist". Oecologia. 164 (4): 853-860. Bibcode:2010Oecol.164..853T. doi:10.1007/s00442-010-1812-0. PMID 20978798. S2CID 19902787.
6. Chao, Anne; Chiu, Chun-Huo; Jost, Lou (2016). "Phylogenetic Diversity Measures and Their Decomposition: A Framework Based on Hill Numbers". Biodiversity Conservation and Phylogenetic Systematics. Topics in Biodiversity and Conservation. Springer International Publishing. Vol. 14, pp. 141-172. doi:10.1007/978-3-319-22461-9_8. ISBN 9783319224602.
7. Morris, E. Kathryn; Caruso, Tancredi; Buscot, Francois; Fischer, Markus; Hancock, Christine; Maier, Tanja S.; Meiners, Torsten; Muller, Caroline; Obermaier, Elisabeth; Prati, Daniel; Socher, Stephanie A.; Sonnemann, Ilja; Waschke, Nicole; Wubet, Tesfaye; Wurst, Susanne (September 2014). "Choosing and using diversity indices: insights for ecological applications from the German Biodiversity Exploratories". Ecology and Evolution. 4 (18): 3514-3524. Bibcode:2014EcoEv...4.3514M. doi:10.1002/ece3.1155. ISSN 2045-7758. PMC 4224527. PMID 25478144.
8. Spellerberg, Ian F.; Fedor, Peter J. (2003). "A tribute to Claude Shannon (1916-2001) and a plea for more rigorous use of species richness, species diversity and the 'Shannon-Wiener' Index". Global Ecology and Biogeography. 12 (3): 177-179.
9. Shannon, C. E. (1948). "A mathematical theory of communication". The Bell System Technical Journal. 27: 379-423 and 623-656.
10. Simpson, E. H. (1949). "Measurement of diversity". Nature. 163 (4148): 688. Bibcode:1949Natur.163..688S. doi:10.1038/163688a0.
11. Herfindahl, O. C. (1950). Concentration in the U.S. Steel Industry. Unpublished doctoral dissertation, Columbia University.
12. Hirschman, A. O. (1945). National power and the structure of foreign trade. Berkeley.
13. Hunter, P. R.; Gaston, M. A. (1988). "Numerical index of the discriminatory ability of typing systems: an application of Simpson's index of diversity". J Clin Microbiol. 26 (11): 2465-2466. doi:10.1128/JCM.26.11.2465-2466.1988. PMC 266921. PMID 3069867.
14. "Growing Decision Trees". MathWorks.
15. Hurlbert, S. H. (1971). "The nonconcept of species diversity: A critique and alternative parameters". Ecology. 52 (4): 577-586. Bibcode:1971Ecol...52..577H. doi:10.2307/1934145. JSTOR 1934145. PMID 28973811. S2CID 25837001.
16. Gibbs, Jack P.; Martin, William T. (1962). "Urbanization, technology and the division of labor". American Sociological Review. 27 (5): 667-677. doi:10.2307/2089624. JSTOR 2089624.
17. Berger, Wolfgang H.; Parker, Frances L. (June 1970). "Diversity of Planktonic Foraminifera in Deep-Sea Sediments". Science. 168 (3937): 1345-1347. Bibcode:1970Sci...168.1345B. doi:10.1126/science.168.3937.1345. PMID 17731043. S2CID 29553922.

Further reading

Colinvaux, Paul A. (1973). Introduction to Ecology. Wiley. ISBN 0-471-16498-4.
Cover, Thomas M.; Thomas, Joy A. (1991). Elements of Information Theory. Wiley. ISBN 0-471-06259-6. See chapter 5 for an elaboration of coding procedures described informally above.
Chao, A.; Shen, T-J. (2003). "Nonparametric estimation of Shannon's index of diversity when there are unseen species in sample". Environmental and Ecological Statistics. 10 (4): 429-443. doi:10.1023/A:1026096204727. S2CID 20389926.

External links

Simpson's Diversity index
Diversity indices (archived 2005-12-19 at the Wayback Machine): gives some examples of estimates of Simpson's index for real ecosystems

Retrieved from https://en.wikipedia.org/w/index.php?title=Diversity_index&oldid=1203242852
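Worked example

As an illustration of the Simpson-family indices and the Berger-Parker index defined in this article, the following minimal Python sketch computes them from a list of type labels. The function name simpson_family_indices, the dictionary keys and the sample data are hypothetical, chosen only for this example; the sketch is not taken from any of the cited sources.

```python
from collections import Counter


def simpson_family_indices(labels):
    """Compute Simpson-type indices from an iterable of type labels.

    Hypothetical helper for illustration only.
    """
    counts = Counter(labels)              # n_i for each type i
    total = sum(counts.values())          # N, total number of entities
    if total < 2:
        raise ValueError("need at least two entities")

    proportions = [n / total for n in counts.values()]  # p_i = n_i / N

    # Simpson index (sampling with replacement): lambda = sum of p_i^2
    lam = sum(p * p for p in proportions)

    # Same-type probability when sampling without replacement:
    # ell = sum n_i (n_i - 1) / (N (N - 1))
    ell = sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))

    return {
        "simpson": lam,
        "simpson_without_replacement": ell,
        "inverse_simpson": 1 / lam,         # true diversity of order 2
        "gini_simpson": 1 - lam,            # probability of different types
        "berger_parker": max(proportions),  # share of the most abundant type
    }


# Made-up sample: 5 entities of type "a", 3 of "b", 2 of "c"
sample = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
for name, value in simpson_family_indices(sample).items():
    print(f"{name}: {value:.4f}")
```

For this sample of ten entities, λ is 0.38 while ℓ is about 0.31, which shows how noticeably the with-replacement and without-replacement forms can differ in a small dataset.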
