
Krippendorff's alpha

Krippendorff's alpha coefficient,[1] named after academic Klaus Krippendorff, is a statistical measure of the agreement achieved when coding a set of units of analysis. Since the 1970s, alpha has been used in content analysis where textual units are categorized by trained readers, in counseling and survey research where experts code open-ended interview data into analyzable terms, in psychological testing where alternative tests of the same phenomena need to be compared, or in observational studies where unstructured happenings are recorded for subsequent analysis.

Krippendorff's alpha generalizes several known statistics, often called measures of inter-coder agreement, inter-rater reliability, or reliability of coding given sets of units (as distinct from unitizing), but it also distinguishes itself from statistics that are called reliability coefficients yet are unsuitable to the particulars of coding data generated for subsequent analysis.

Krippendorff's alpha is applicable to any number of coders, each assigning one value to one unit of analysis, to incomplete (missing) data, to any number of values available for coding a variable, to binary, nominal, ordinal, interval, ratio, polar, and circular metrics (note that this is not a metric in the mathematical sense, but often the square of a mathematical metric, see levels of measurement), and it adjusts itself to small sample sizes of the reliability data. The virtue of a single coefficient with these variations is that computed reliabilities are comparable across any numbers of coders, values, different metrics, and unequal sample sizes.

Software for calculating Krippendorff's alpha is available.[2][3][4][5][6][7][8][9][10]

Reliability data

Reliability data are generated in a situation in which m ≥ 2 jointly instructed (e.g., by a code book) but independently working coders assign any one of a set of values 1,...,V to a common set of N units of analysis. In their canonical form, reliability data are tabulated in an m-by-N matrix containing the values $v_{ij}$ that coder $c_i$ has assigned to unit $u_j$. Define $m_j$ as the number of values assigned to unit $j$ across all coders $c$. When data are incomplete, $m_j$ may be less than $m$. Reliability data require that values be pairable, i.e., $m_j \geq 2$. The total number of pairable values is $\sum_{j=1}^{N} m_j = n \leq mN$.

To help clarify, here is what the canonical form looks like, in the abstract:

     u1   u2   u3   ...  uN
c1   v11  v12  v13  ...  v1N
c2   v21  v22  v23  ...  v2N
c3   v31  v32  v33  ...  v3N
...
cm   vm1  vm2  vm3  ...  vmN
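
To make the canonical form concrete, here is a minimal sketch (in Python, with hypothetical values; None marks a missing entry) of how such a coder-by-unit matrix and its pairable values might be represented:

```python
# Canonical reliability data: one list per coder, one entry per unit of analysis.
# None marks a value a coder did not assign (incomplete data). Values are hypothetical.
reliability_data = [
    [1,    None, 2, 1,    3],  # coder c1 on units u1..u5
    [1,    2,    2, None, 3],  # coder c2
    [None, 2,    2, 1,    3],  # coder c3
]

# m_j: number of values actually assigned to unit j; only units with m_j >= 2 are pairable.
m_j = [sum(v is not None for v in unit) for unit in zip(*reliability_data)]
n = sum(m for m in m_j if m >= 2)   # total number of pairable values, n <= mN
print(m_j, n)                       # [2, 2, 3, 2, 3] and 12
```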

General form of alpha

We denote by $R$ the set of all possible responses an observer can give. The responses of all observers for an example are called a unit (it forms a multiset). The collection of these units forms a multiset, the items, which we denote by $U$.

Alpha is given by:

$$\alpha = 1 - \frac{D_o}{D_e}$$

where $D_o$ is the disagreement observed and $D_e$ is the disagreement expected by chance.

$$D_o = \frac{1}{n}\sum_{c \in R}\sum_{k \in R} \delta(c,k) \sum_{u \in U} m_u \frac{n_{cku}}{P(m_u,2)}$$

where $\delta(c,k)$ is a metric function (note that this is not a metric in the mathematical sense, but often the square of a mathematical metric, see below), $n$ is the total number of pairable elements, $m_u$ is the number of items in unit $u$, $n_{cku}$ is the number of $(c,k)$ pairs in unit $u$, and $P$ is the permutation function, $P(m_u,2) = m_u(m_u - 1)$. Rearranging terms, the sum can be interpreted in a conceptual way as the weighted average of the disagreements of the individual units, weighted by the number of coders assigned to unit $j$:

$$D_o = \frac{1}{n}\sum_{j=1}^{N} m_j\, \mathbb{E}[\delta_j]$$

where $\mathbb{E}[\delta_j]$ is the mean of the $\binom{m_j}{2}$ numbers $\delta(v_{ij}, v_{i'j})$ (here $i > i'$, and $v_{ij}$, $v_{i'j}$ are pairable elements). Note that in the case $m_j = m$ for all $j$, $D_o$ is just the average of all the numbers $\delta(v_{ij}, v_{i'j})$ with $i > i'$. There is also an interpretation of $D_o$ as the (weighted) average observed distance from the diagonal.

$$D_e = \frac{1}{P(n,2)}\sum_{c \in R}\sum_{k \in R} \delta(c,k)\, P_{ck}$$

where $P_{ck}$ is the number of ways the pair $(c,k)$ can be made. This can be seen to be the average distance from the diagonal of all possible pairs of responses that could be derived from the multiset of all observations.

$$P_{ck} = \begin{cases} n_c\, n_k & \text{if } c \neq k \\ n_c (n_c - 1) & \text{if } c = k \end{cases}$$

The above is equivalent to the usual form of $\alpha$ once it has been simplified algebraically.[11]

One interpretation of Krippendorff's alpha is:

$$\alpha = 1 - \frac{D_{\text{within units in error}}}{D_{\text{within and between units in total}}}$$

  $\alpha = 1$ indicates perfect reliability.
  $\alpha = 0$ indicates the complete absence of reliability. Units and the values assigned to them are statistically unrelated.
  $\alpha < 0$ when disagreements are systematic and exceed what can be expected by chance.

In this general form, disagreements Do and De may be conceptually transparent but are computationally inefficient. They can be simplified algebraically, especially when expressed in terms of the visually more instructive coincidence matrix representation of the reliability data.
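
For illustration, here is a brute-force sketch of this general form (a hedged example, not the article's reference implementation): it computes $D_o$ directly from the ordered pairs within each unit and $D_e$ from all ordered pairs in the pooled multiset of observations, exactly as written above, at the cost of quadratic work in n.

```python
from itertools import permutations

def disagreements(units, delta):
    """Brute-force D_o and D_e in the general (pair-listing) form.
    units: list of multisets (lists) of the values assigned to each unit.
    delta: difference function delta(c, k)."""
    units = [u for u in units if len(u) >= 2]          # only pairable units
    pooled = [v for u in units for v in u]             # all n pairable values
    n = len(pooled)
    # D_o: mean within-unit disagreement over ordered pairs, weighted by m_u
    D_o = sum(
        len(u) * sum(delta(c, k) for c, k in permutations(u, 2)) / (len(u) * (len(u) - 1))
        for u in units
    ) / n
    # D_e: mean disagreement over all ordered pairs drawn from the pooled values
    D_e = sum(delta(c, k) for c, k in permutations(pooled, 2)) / (n * (n - 1))
    return D_o, D_e

nominal = lambda c, k: 0.0 if c == k else 1.0
units = [[1, 1], [2, 2], [3, 3, 4]]                    # hypothetical units
D_o, D_e = disagreements(units, nominal)
print(round(1 - D_o / D_e, 3))                         # alpha = 0.667 for this toy data
```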

Coincidence matrices

A coincidence matrix cross tabulates the n pairable values from the canonical form of the reliability data into a V-by-V square matrix, where V is the number of values available in a variable. Unlike contingency matrices, familiar in association and correlation statistics, which tabulate pairs of values (cross tabulation), a coincidence matrix tabulates all pairable values. A coincidence matrix omits references to coders and is symmetrical around its diagonal, which contains all perfect matches, $v_{iu} = v_{i'u}$ for two coders $i$ and $i'$, across all units $u$. The matrix of observed coincidences contains frequencies:

$$o_{vv'} = \sum_{u=1}^{N} \frac{\sum_{i \neq i'}^{m} I(v_{iu} = v)\, I(v_{i'u} = v')}{m_u - 1} = o_{v'v}, \qquad n_v = \sum_{\ell=1}^{V} o_{v\ell} = \sum_{ij}^{mN} I(v_{ij} = v), \qquad n = \sum_{\ell=1}^{V}\sum_{p=1}^{V} o_{\ell p},$$

omitting unpaired values, where $I(\circ) = 1$ if $\circ$ is true, and 0 otherwise.

Because a coincidence matrix tabulates all pairable values and its contents sum to the total $n$, when four or more coders are involved, the coincidences $o_{vv'}$ may be fractions.

The matrix of expected coincidences contains frequencies:

$$e_{vv'} = \frac{1}{n-1}\cdot\begin{cases} n_v (n_v - 1) & \text{if } v = v' \\ n_v\, n_{v'} & \text{if } v \neq v' \end{cases} = e_{v'v},$$

which sum to the same $n_v$, $n_{v'}$, and $n$ as does $o_{vv'}$. In terms of these coincidences, Krippendorff's alpha becomes:

$$\alpha = 1 - \frac{D_o}{D_e} = 1 - \frac{\sum_{v=1}^{V}\sum_{v'=1}^{V} o_{vv'}\,\delta(v,v')}{\sum_{v=1}^{V}\sum_{v'=1}^{V} e_{vv'}\,\delta(v,v')}$$
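
A sketch of this coincidence-matrix route in Python (an illustration under the definitions above, not the authors' reference code; the function name krippendorff_alpha and the use of None for missing values are this example's own conventions):

```python
import numpy as np

def krippendorff_alpha(reliability_data, delta):
    """reliability_data: coder-by-unit matrix (list of lists), None = missing value.
    delta: difference function delta(v, v')."""
    # pairable values of each unit (the columns of the canonical form)
    units = [[v for v in col if v is not None] for col in zip(*reliability_data)]
    units = [u for u in units if len(u) >= 2]
    values = sorted({v for u in units for v in u})
    index = {v: i for i, v in enumerate(values)}
    V = len(values)

    # observed coincidences: each ordered pair within a unit contributes 1/(m_u - 1)
    o = np.zeros((V, V))
    for u in units:
        m_u = len(u)
        for i in range(m_u):
            for j in range(m_u):
                if i != j:
                    o[index[u[i]], index[u[j]]] += 1.0 / (m_u - 1)

    n_v = o.sum(axis=1)          # value frequencies
    n = n_v.sum()                # total number of pairable values
    D_o = sum(o[i, j] * delta(values[i], values[j]) for i in range(V) for j in range(V))
    D_e = sum(n_v[i] * n_v[j] * delta(values[i], values[j])
              for i in range(V) for j in range(V)) / (n - 1)
    return 1.0 - D_o / D_e

nominal = lambda v, w: 0.0 if v == w else 1.0
data = [[1, 2, 3, 3, None],      # hypothetical coders-by-units data
        [1, 2, 3, 3, 4],
        [None, 3, 3, 3, 4]]
print(round(krippendorff_alpha(data, nominal), 3))   # 0.778 for this toy data
```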

Difference functions

Difference functions $\delta(v,v')$[12] between values $v$ and $v'$ reflect the metric properties (levels of measurement) of their variable.

In general:

$$\delta(v,v') \geq 0, \qquad \delta(v,v) = 0, \qquad \delta(v,v') = \delta(v',v)$$

In particular:

For nominal data $\delta_{\text{nominal}}(v,v') = \begin{cases} 0 & \text{if } v = v' \\ 1 & \text{if } v \neq v' \end{cases}$, where $v$ and $v'$ serve as names.
For ordinal data $\delta_{\text{ordinal}}(v,v') = \left(\sum_{g=v}^{g=v'} n_g - \frac{n_v + n_{v'}}{2}\right)^2$, where $v$ and $v'$ are ranks.
For interval data $\delta_{\text{interval}}(v,v') = (v - v')^2$, where $v$ and $v'$ are interval scale values.
For ratio data $\delta_{\text{ratio}}(v,v') = \left(\frac{v - v'}{v + v'}\right)^2$, where $v$ and $v'$ are absolute values.
For polar data $\delta_{\text{polar}}(v,v') = \frac{(v - v')^2}{(v + v' - 2v_{\min})(2v_{\max} - v - v')}$, where $v_{\min}$ and $v_{\max}$ define the end points of the polar scale.
For circular data $\delta_{\text{circular}}(v,v') = \left(\sin\left(180\,\frac{v - v'}{U}\right)\right)^2$, where the sine function is expressed in degrees and $U$ is the circumference or the range of values in a circle or loop before they repeat. For equal-interval circular metrics, the smallest and largest integer values of this metric are adjacent to each other and $U = v_{\text{largest}} - v_{\text{smallest}} + 1$.
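
These difference functions translate directly into code. The following sketch (hedged; the function and parameter names are this example's own) implements them; the ordinal function needs the value frequencies $n_g$ of the reliability data, and the polar function needs the end points of the scale:

```python
import math

def delta_nominal(v, w):
    return 0.0 if v == w else 1.0

def delta_interval(v, w):
    return (v - w) ** 2

def delta_ratio(v, w):
    return ((v - w) / (v + w)) ** 2

def delta_ordinal(v, w, freq):
    # freq[g] is the frequency n_g of rank g in the reliability data
    lo, hi = min(v, w), max(v, w)
    total = sum(n_g for g, n_g in freq.items() if lo <= g <= hi)
    return (total - (freq[v] + freq[w]) / 2) ** 2

def delta_polar(v, w, v_min, v_max):
    if v == w:
        return 0.0
    return (v - w) ** 2 / ((v + w - 2 * v_min) * (2 * v_max - v - w))

def delta_circular(v, w, U):
    # sine taken in degrees; U is the range of values before they repeat
    return math.sin(math.radians(180.0 * (v - w) / U)) ** 2
```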

Significance

Inasmuch as mathematical statements of the statistical distribution of alpha are always only approximations, it is preferable to obtain alpha’s distribution by bootstrapping.[13][14] Alpha's distribution gives rise to two indices:

  • The confidence intervals of a computed alpha at various levels of statistical significance
  • The probability that alpha fails to achieve a chosen minimum, required for data to be considered sufficiently reliable (one-tailed test). This index acknowledges that the null-hypothesis (of chance agreement) is so far removed from the range of relevant alpha coefficients that its rejection would mean little regarding how reliable given data are. To be judged reliable, data must not significantly deviate from perfect agreement.

The minimum acceptable alpha coefficient should be chosen according to the importance of the conclusions to be drawn from imperfect data. When the costs of mistaken conclusions are high, the minimum alpha needs to be set high as well. In the absence of knowledge of the risks of drawing false conclusions from unreliable data, social scientists commonly rely on data with reliabilities α ≥ 0.800, consider data with 0.800 > α ≥ 0.667 only to draw tentative conclusions, and discard data whose agreement measures α < 0.667.[15]
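
A minimal bootstrap sketch along these lines (an illustration, not the macro cited above; the function names and the 0.667 default are this example's assumptions) resamples coded units with replacement, recomputes alpha each time, and reports a 95% confidence interval together with the probability of falling below the chosen minimum:

```python
import random

def bootstrap_alpha(units, alpha_fn, trials=10_000, alpha_min=0.667, seed=0):
    """units: list of per-unit value multisets; alpha_fn: maps such a list to alpha
    (e.g. a coincidence-matrix implementation with a fixed difference function)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        resample = [rng.choice(units) for _ in units]   # resample units with replacement
        try:
            samples.append(alpha_fn(resample))
        except ZeroDivisionError:   # degenerate resample with no expected disagreement
            continue
    samples.sort()
    ci_95 = (samples[int(0.025 * len(samples))], samples[int(0.975 * len(samples))])
    p_below_min = sum(a < alpha_min for a in samples) / len(samples)
    return ci_95, p_below_min
```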

A computational example

Let the canonical form of reliability data be a 3-coder-by-15 unit matrix with 45 cells:

Units u: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Coder A * * * * * 3 4 1 2 1 1 3 3 * 3
Coder B 1 * 2 1 3 3 4 3 * * * * * * *
Coder C * * 2 1 3 4 4 * 2 1 1 3 3 * 4

Suppose “*” indicates a default category like “cannot code,” “no answer,” or “lacking an observation.” Then, * provides no information about the reliability of data in the four values that matter. Note that units 2 and 14 contain no information and unit 1 contains only one value, which is not pairable within that unit. Thus, these reliability data consist not of mN = 45 but of n = 26 pairable values, located not in N = 15 but in 12 multiply coded units.

The coincidence matrix for these data would be constructed as follows:

o11 = {in u=4}: 2/(2-1) + {in u=10}: 2/(2-1) + {in u=11}: 2/(2-1) = 6
o13 = {in u=8}: 1/(2-1) = 1 = o31
o22 = {in u=3}: 2/(2-1) + {in u=9}: 2/(2-1) = 4
o33 = {in u=5}: 2/(2-1) + {in u=6}: 2/(3-1) + {in u=12}: 2/(2-1) + {in u=13}: 2/(2-1) = 7
o34 = {in u=6}: 2/(3-1) + {in u=15}: 1/(2-1) = 2 = o43
o44 = {in u=7}: 6/(3-1) = 3
Values v or v′:    1    2    3    4   n_v
Value 1            6    0    1    0    7
Value 2            0    4    0    0    4
Value 3            1    0    7    2   10
Value 4            0    0    2    3    5
Frequency n_v′     7    4   10    5   26

In terms of the entries in this coincidence matrix, Krippendorff's alpha may be calculated from:

$$\alpha_{\text{metric}} = 1 - \frac{D_o}{D_e} = 1 - \frac{\sum_{v=1}^{V}\sum_{v'=1}^{V} o_{vv'}\,\delta_{\text{metric}}(v,v')}{\frac{1}{n-1}\sum_{v=1}^{V}\sum_{v'=1}^{V} n_v\, n_{v'}\,\delta_{\text{metric}}(v,v')}$$

For convenience, because products with $\delta(v,v) = 0$ vanish and $\delta(v,v') = \delta(v',v)$, only the entries in one of the off-diagonal triangles of the coincidence matrix are listed in the following:

$$\alpha_{\text{metric}} = 1 - \frac{1\,\delta_{\text{metric}}(1,3) + 2\,\delta_{\text{metric}}(3,4)}{\frac{1}{26-1}\left[4\cdot 7\,\delta_{\text{metric}}(1,2) + 10\cdot 7\,\delta_{\text{metric}}(1,3) + 5\cdot 7\,\delta_{\text{metric}}(1,4) + 10\cdot 4\,\delta_{\text{metric}}(2,3) + 5\cdot 4\,\delta_{\text{metric}}(2,4) + 5\cdot 10\,\delta_{\text{metric}}(3,4)\right]}$$

Considering that all $\delta_{\text{nominal}}(v,v') = 1$ when $v \neq v'$, for nominal data the above expression yields:

$$\alpha_{\text{nominal}} = 1 - \frac{1 + 2}{\frac{1}{26-1}\left(4\cdot 7 + 10\cdot 7 + 5\cdot 7 + 10\cdot 4 + 5\cdot 4 + 5\cdot 10\right)} = 0.691$$

With $\delta_{\text{interval}}(1,2) = \delta_{\text{interval}}(2,3) = \delta_{\text{interval}}(3,4) = 1^2$, $\delta_{\text{interval}}(1,3) = \delta_{\text{interval}}(2,4) = 2^2$, and $\delta_{\text{interval}}(1,4) = 3^2$ for interval data the above expression yields:

$$\alpha_{\text{interval}} = 1 - \frac{1\cdot 2^2 + 2\cdot 1^2}{\frac{1}{26-1}\left(4\cdot 7\cdot 1^2 + 10\cdot 7\cdot 2^2 + 5\cdot 7\cdot 3^2 + 10\cdot 4\cdot 1^2 + 5\cdot 4\cdot 2^2 + 5\cdot 10\cdot 1^2\right)} = 0.811$$

Here, $\alpha_{\text{interval}} > \alpha_{\text{nominal}}$ because disagreements happen to occur largely among neighboring values, visualized by occurring closer to the diagonal of the coincidence matrix, a condition that $\alpha_{\text{interval}}$ takes into account but $\alpha_{\text{nominal}}$ does not. When the observed frequencies $o_{v \neq v'}$ are on the average proportional to the expected frequencies $e_{v \neq v'}$, $\alpha_{\text{interval}} = \alpha_{\text{nominal}}$.

Comparing alpha coefficients across different metrics can provide clues to how coders conceptualize the metric of a variable.
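
As a cross-check, the two results above can be reproduced in a few lines (a hedged sketch that uses only the coincidence-matrix entries tabulated earlier in this example):

```python
# Off-diagonal observed coincidences (upper triangle) and value frequencies from the table above.
n_v = {1: 7, 2: 4, 3: 10, 4: 5}
n = 26
off_diag = {(1, 3): 1, (3, 4): 2}

def alpha(delta):
    D_o = sum(o * delta(v, w) for (v, w), o in off_diag.items())
    D_e = sum(n_v[v] * n_v[w] * delta(v, w) for v in n_v for w in n_v if v < w) / (n - 1)
    return 1 - D_o / D_e

print(round(alpha(lambda v, w: 1.0), 3))            # nominal: 0.691
print(round(alpha(lambda v, w: (v - w) ** 2), 3))   # interval: 0.811
```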

Alpha's embrace of other statistics

Krippendorff's alpha brings several known statistics under a common umbrella; each of them has its own limitations but no additional virtues.

  • Scott's pi[16] is an agreement coefficient for nominal data and two coders.
    $$\pi = \frac{P_o - P_e}{1 - P_e}, \quad \text{where } P_o = \sum_c \frac{o_{cc}}{n} \text{ and } P_e = \sum_c \frac{n_c^2}{n^2}$$
    When data are nominal, alpha reduces to a form resembling Scott's pi:
    $$\alpha_{\text{nominal}} = 1 - \frac{D_o}{D_e} = \frac{\sum_c o_{cc} - \sum_c e_{cc}}{n - \sum_c e_{cc}} = \frac{\sum_c \frac{o_{cc}}{n} - \sum_c \frac{n_c(n_c - 1)}{n(n - 1)}}{1 - \sum_c \frac{n_c(n_c - 1)}{n(n - 1)}}$$
    Scott's observed proportion of agreement $P_o$ appears in alpha's numerator, exactly. Scott's expected proportion of agreement, $P_e = \sum_c \frac{n_c^2}{n^2}$, is asymptotically approximated by $\sum_c \frac{n_c(n_c - 1)}{n(n - 1)}$ when the sample size $n$ is large, and equal when infinite. It follows that Scott's pi is that special case of alpha in which two coders generate a very large sample of nominal data. For finite sample sizes: $\alpha_{\text{nominal}} = 1 - \frac{n-1}{n}(1 - \pi) \geq \pi$. Evidently, $\lim_{n \to \infty} \alpha_{\text{nominal}} = \pi$ (see the numeric check following this list).
  • Fleiss' kappa[17] is an agreement coefficient for nominal data with very large sample sizes where a set of coders have assigned exactly m labels to all of N units without exception (but note, there may be more than m coders, and only some subset label each instance). Fleiss claimed to have extended Cohen's kappa[18] to three or more raters or coders, but generalized Scott's pi instead. This confusion is reflected in Fleiss' choice of its name, which has been recognized by renaming it K:[19]
    $$K = \frac{\bar P - \bar P_e}{1 - \bar P_e}, \quad \text{where } \bar P = \frac{1}{N}\sum_{u=1}^{N}\sum_c \frac{n_{cu}(n_{cu} - 1)}{m(m - 1)} = \sum_c \frac{o_{cc}}{mN} \text{ and } \bar P_e = \sum_c \frac{n_c^2}{(mN)^2}$$
    When sample sizes are finite, K can be seen to perpetrate the inconsistency of obtaining the proportion of observed agreements $\bar P$ by counting matches within the m(m − 1) possible pairs of values within u, properly excluding values paired with themselves, while the proportion $\bar P_e$ is obtained by counting matches within all (mN)² = n² possible pairs of values, effectively including values paired with themselves. It is the latter that introduces a bias into the coefficient. However, just as for pi, when sample sizes become very large this bias disappears and the proportion $\sum_c \frac{n_c(n_c - 1)}{n(n - 1)}$ in $\alpha_{\text{nominal}}$ above asymptotically approximates $\bar P_e$ in K. Nevertheless, Fleiss' kappa, or rather K, intersects with alpha in that special situation in which a fixed number of m coders code all of N units (no data are missing), using nominal categories, and the sample size n = mN is very large, theoretically infinite.
  • Spearman's rank correlation coefficient rho[20] measures the agreement between two coders' ranking of the same set of N objects. In its original form:
    $$\rho = 1 - \frac{6\sum D^2}{N(N^2 - 1)}$$
    where $\sum D^2$ is the sum over the N objects of the squared differences between one coder's rank $c_u$ and the other coder's rank $k_u$ of the same object u. Whereas alpha accounts for tied ranks in terms of their frequencies for all coders, rho averages them in each individual coder's instance. In the absence of ties, $\rho$'s numerator $\sum D^2 = N D_o$ and $\rho$'s denominator $\frac{N(N^2 - 1)}{6} = \frac{n}{n-1} N D_e$, where n = 2N, which becomes $N D_e$ when sample sizes become large. So, Spearman's rho is that special case of alpha in which two coders rank a very large set of units. Again, $\alpha_{\text{ordinal}} \geq \rho$ and $\lim_{n \to \infty} \alpha_{\text{ordinal}} = \rho$.
  • Pearson's intraclass correlation coefficient $r_{ii}$ is an agreement coefficient for interval data, two coders, and very large sample sizes. To obtain it, Pearson's original suggestion was to enter the observed pairs of values twice into a table, once as (c, k) and once as (k, c), to which the traditional Pearson product-moment correlation coefficient is then applied.[21] By entering pairs of values twice, the resulting table becomes a coincidence matrix without reference to the two coders, contains n = 2N values, and is symmetrical around the diagonal, i.e., the joint linear regression line is forced into a 45° line, and references to coders are eliminated. Hence, Pearson's intraclass correlation coefficient is that special case of interval alpha for two coders and large sample sizes, $\alpha_{\text{interval}} \geq r_{ii}$ and $\lim_{n \to \infty} \alpha_{\text{interval}} = r_{ii}$.
  • Finally, the disagreements in the interval alpha, $D_u$, $D_o$ and $D_e$, are proper sample variances.[22] It follows that the reliability the interval alpha assesses is consistent with all variance-based analytical techniques, such as the analysis of variance. Moreover, by incorporating difference functions not just for interval data but also for nominal, ordinal, ratio, polar, and circular data, alpha extends the notion of variance to metrics that classical analytical techniques rarely address.
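
The finite-sample relation between nominal alpha and Scott's pi stated above can be checked numerically. The following sketch (hypothetical two-coder data; computations follow the definitions given in this section) computes pi with its with-replacement expected agreement, alpha with the without-replacement expected agreement, and the closed form $1 - \frac{n-1}{n}(1 - \pi)$:

```python
from collections import Counter

coder_1 = [1, 2, 2, 3, 1, 3, 2, 1, 3, 2]   # hypothetical nominal codings
coder_2 = [1, 2, 3, 3, 1, 3, 2, 2, 3, 2]

pairs = list(zip(coder_1, coder_2))
N = len(pairs)
n = 2 * N                                           # total pairable values
counts = Counter(coder_1) + Counter(coder_2)        # pooled value frequencies n_c

P_o = sum(a == b for a, b in pairs) / N                                  # observed agreement
P_e_pi = sum((c / n) ** 2 for c in counts.values())                      # pi: with replacement
P_e_alpha = sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))    # alpha: without replacement

pi = (P_o - P_e_pi) / (1 - P_e_pi)
alpha = (P_o - P_e_alpha) / (1 - P_e_alpha)
print(round(pi, 4), round(alpha, 4), round(1 - (n - 1) / n * (1 - pi), 4))
# 0.6947 0.7099 0.7099 -- alpha exceeds pi, and matches the closed form
```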

Krippendorff's alpha is more general than any of these special purpose coefficients. It adjusts to varying sample sizes and affords comparisons across a wide variety of reliability data, mostly ignored by the familiar measures.

Coefficients incompatible with alpha and the reliability of coding

Semantically, reliability is the ability to rely on something, here on coded data for subsequent analysis. When a sufficiently large number of coders agree perfectly on what they have read or observed, relying on their descriptions is a safe bet. Judgments of this kind hinge on the number of coders duplicating the process and how representative the coded units are of the population of interest. Problems of interpretation arise when agreement is less than perfect, especially when reliability is absent.

  • Correlation and association coefficients. Pearson's product-moment correlation coefficient $r_{ij}$, for example, measures deviations from any linear regression line between the coordinates of i and j. Unless that regression line happens to be exactly 45° or centered, $r_{ij}$ does not measure agreement. Similarly, while perfect agreement between coders also means perfect association, association statistics register any above-chance pattern of relationships between variables. They do not distinguish agreement from other associations and are, hence, unsuitable as reliability measures.
  • Coefficients measuring the degree to which coders are statistically dependent on each other. When the reliability of coded data is at issue, the individuality of coders can have no place in it. Coders need to be treated as interchangeable. Alpha, Scott's pi, and Pearson's original intraclass correlation accomplish this by being definable as a function of coincidences, not only of contingencies. Unlike the more familiar contingency matrices, which tabulate N pairs of values and maintain reference to the two coders, coincidence matrices tabulate the n pairable values used in coding, regardless of who contributed them, in effect treating coders as interchangeable. Cohen's kappa,[23] by contrast, defines expected agreement in terms of contingencies, as the agreement that would be expected if coders were statistically independent of each other.[24] Cohen's conception of chance fails to include disagreements between coders' individual predilections for particular categories, punishes coders who agree on their use of categories, and rewards those who do not agree with higher kappa-values. This is the cause of other noted oddities of kappa.[25] The statistical independence of coders is only marginally related to the statistical independence of the units coded and the values assigned to them. Cohen's kappa, by ignoring crucial disagreements, can become deceptively large when the reliability of coding data is to be assessed. (A numeric sketch contrasting kappa's and alpha's baselines follows this list.)
  • Coefficients measuring the consistency of coder judgments. In the psychometric literature,[26] reliability tends to be defined as the consistency with which several tests perform when applied to a common set of individual characteristics. Cronbach's alpha,[27] for example, is designed to assess the degree to which multiple tests produce correlated results. Perfect agreement is the ideal, of course, but Cronbach's alpha is high also when test results vary systematically. Consistency of coders’ judgments does not provide the needed assurances of data reliability. Any deviation from identical judgments – systematic or random – needs to count as disagreement and reduce the measured reliability. Cronbach's alpha is not designed to respond to absolute differences.
  • Coefficients with baselines (conditions under which they measure 0) that cannot be interpreted in terms of reliability, i.e. have no dedicated value to indicate when the units and the values assigned to them are statistically unrelated. Simple %-agreement ranges from 0 = extreme disagreement to 100 = perfect agreement, with chance having no definite value. As already noted, Cohen's kappa falls into this category by defining the absence of reliability as the statistical independence between two individual coders. The baseline of Bennett, Alpert, and Goldstein's S[28] is defined in terms of the number of values available for coding, which has little to do with how values are actually used. Goodman and Kruskal's lambda_r[29] is defined to vary between –1 and +1, leaving 0 without a particular reliability interpretation. Lin's reproducibility or concordance coefficient $r_c$[30] takes Pearson's product-moment correlation $r_{ij}$ as a measure of precision and adds to it a measure $C_b$ of accuracy, ostensibly to correct for $r_{ij}$'s above-mentioned inadequacy. It varies between –1 and +1 and the reliability interpretation of 0 is uncertain. There are more so-called reliability measures whose reliability interpretations become questionable as soon as they deviate from perfect agreement.
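
The difference between kappa's contingency baseline and alpha's coincidence baseline can be made concrete with a small sketch (hypothetical two-coder nominal data in which the coders favor different categories):

```python
from collections import Counter

coder_1 = [1, 1, 1, 1, 1, 1, 1, 2, 2, 2]   # prefers category 1
coder_2 = [1, 1, 1, 2, 2, 2, 2, 2, 2, 2]   # prefers category 2
N = len(coder_1)
P_o = sum(a == b for a, b in zip(coder_1, coder_2)) / N     # observed agreement: 0.6

# Cohen's kappa: expected agreement from each coder's own marginal distribution
c1, c2 = Counter(coder_1), Counter(coder_2)
P_e_kappa = sum((c1[v] / N) * (c2[v] / N) for v in set(c1) | set(c2))

# alpha / Scott's pi style: expected agreement from the pooled value frequencies
pooled, n = c1 + c2, 2 * N
P_e_alpha = sum(c * (c - 1) for c in pooled.values()) / (n * (n - 1))

kappa = (P_o - P_e_kappa) / (1 - P_e_kappa)
alpha = (P_o - P_e_alpha) / (1 - P_e_alpha)
print(round(kappa, 3), round(alpha, 3))   # 0.310 vs 0.240: kappa is larger here
```

Because the two coders' marginal distributions differ, kappa's baseline expected agreement is lower, which yields the larger kappa despite the same observed agreement.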

Naming a statistic as one of agreement, reproducibility, or reliability does not make it a valid index of whether one can rely on coded data in subsequent decisions. Its mathematical structure must fit the process of coding units into a system of analyzable terms.

Notes

  1. ^ Krippendorff, K. (2013) pp. 221–250 describes the mathematics of alpha and its use in content analysis since 1969.
  2. ^ Hayes, A. F. & Krippendorff, K. (2007) describe and provide SPSS and SAS macros for computing alpha, its confidence limits and the probability of failing to reach a chosen minimum.
  3. ^ Reference manual of the irr package containing the kripp.alpha() function for the platform-independent statistics package R
  4. ^ The Alpha resources page.
  5. ^ Matlab code to compute Krippendorff's alpha.
  6. ^ Python code to compute Krippendorff's alpha.
  7. ^ Python code for Krippendorff's alpha fast computation.
  8. ^ Several user-written additions to the commercial software Stata are available.
  9. ^ Open Source Python implementation supporting Dataframes
  10. ^ Marzi, Giacomo; Balzano, Marco; Marchiori, Davide (2024). "K-Alpha Calculator–Krippendorff's Alpha Calculator: A user-friendly tool for computing Krippendorff's Alpha inter-rater reliability coefficient". MethodsX. 12: 102545. doi:10.1016/j.mex.2023.102545. hdl:10278/5046412. ISSN 2215-0161.
  11. ^ Honour, David. "Understanding Krippendorff's Alpha" (PDF).
  12. ^ Krippendorff, K. "Computing Krippendorff's Alpha Reliability." http://repository.upenn.edu/asc_papers/43/
  13. ^ Krippendorff, K. (2004) pp. 237–238
  14. ^ Hayes, A. F. & Krippendorff, K. (2007) Answering the Call for a Standard Reliability Measure for Coding Data [1]
  15. ^ Krippendorff, K. (2004) pp. 241–243
  16. ^ Scott, W. A. (1955)
  17. ^ Fleiss, J. L. (1971)
  18. ^ Cohen, J. (1960)
  19. ^ Siegel, S. & Castellan, N. J. (1988), pp. 284–291.
  20. ^ Spearman, C. E. (1904)
  21. ^ Pearson, K. (1901), Tildesley, M. L. (1921)
  22. ^ Krippendorff, K. (1970)
  23. ^ Cohen, J. (1960)
  24. ^ Krippendorff, K. (1978) raised this issue with Joseph Fleiss
  25. ^ Zwick, R. (1988), Brennan, R. L. & Prediger, D. J. (1981), Krippendorff (1978, 2004).
  26. ^ Nunnally, J. C. & Bernstein, I. H. (1994)
  27. ^ Cronbach, L. J. (1951)
  28. ^ Bennett, E. M., Alpert, R. & Goldstein, A. C. (1954)
  29. ^ Goodman, L. A. & Kruskal, W. H. (1954) p. 758
  30. ^ Lin, L. I. (1989)
  • Krippendorff, K. (2013). Content Analysis: An Introduction to Its Methodology, 3rd ed. Thousand Oaks, CA: Sage, pp. 221–250.

References

  • Bennett, Edward M., Alpert, R. & Goldstein, A. C. (1954). Communications through limited response questioning. Public Opinion Quarterly, 18, 303–308.
  • Brennan, Robert L. & Prediger, Dale J. (1981). Coefficient kappa: Some uses, misuses, and alternatives. Educational and Psychological Measurement, 41, 687–699.
  • Cohen, Jacob (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20 (1), 37–46.
  • Cronbach, Lee, J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16 (3), 297–334.
  • Fleiss, Joseph L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76, 378–382.
  • Goodman, Leo A. & Kruskal, William H. (1954). Measures of association for cross classifications. Journal of the American Statistical Association, 49, 732–764.
  • Hayes, Andrew F. & Krippendorff, Klaus (2007). Answering the call for a standard reliability measure for coding data. Communication Methods and Measures, 1, 77–89.
  • Krippendorff, Klaus (2013). Content analysis: An introduction to its methodology, 3rd edition. Thousand Oaks, CA: Sage.
  • Krippendorff, Klaus (1978). Reliability of binary attribute data. Biometrics, 34 (1), 142–144.
  • Krippendorff, Klaus (1970). Estimating the reliability, systematic error, and random error of interval data. Educational and Psychological Measurement, 30 (1), 61–70.
  • Lin, Lawrence I. (1989). A concordance correlation coefficient to evaluate reproducibility. Biometrics, 45, 255–268.
  • Marzi, G., Balzano, M., & Marchiori, D. (2024). K-Alpha Calculator–Krippendorff's Alpha Calculator: A user-friendly tool for computing Krippendorff's Alpha inter-rater reliability coefficient. MethodsX, 12, 102545. https://doi.org/10.1016/j.mex.2023.102545
  • Nunnally, Jum C. & Bernstein, Ira H. (1994). Psychometric Theory, 3rd ed. New York: McGraw-Hill.
  • Pearson, Karl, et al. (1901). Mathematical contributions to the theory of evolution. IX: On the principle of homotyposis and its relation to heredity, to variability of the individual, and to that of race. Part I: Homotyposis in the vegetable kingdom. Philosophical Transactions of the Royal Society (London), Series A, 197, 285–379.
  • Scott, William A. (1955). Reliability of content analysis: The case of nominal scale coding. Public Opinion Quarterly, 19, 321–325.
  • Siegel, Sidney & Castellan, N. John (1988). Nonparametric Statistics for the Behavioral Sciences, 2nd ed. Boston: McGraw-Hill.
  • Tildesley, M. L. (1921). A first study of the Burmese skull. Biometrika, 13, 176–267.
  • Spearman, Charles E. (1904). The proof and measurement of association between two things. American Journal of Psychology, 15, 72–101.
  • Zwick, Rebecca (1988). Another look at interrater agreement. Psychological Bulletin, 103 (3), 347–387.

External links

  • Krippendorff's Alpha online calculator with bootstrap and confidence intervals functions.
  • YouTube video about Krippendorff's alpha using SPSS and a macro.
  • Reliability Calculator calculates Krippendorff's alpha.
  • Krippendorff Alpha Javascript implementation and library
  • Python implementation
  • Krippendorff Alpha Ruby Gem implementation and library.
  • Simpledorff Python implementation that works with Dataframes
