Lexical similarity

In linguistics, lexical similarity is a measure of the degree to which the word sets of two given languages are similar. A lexical similarity of 1 (or 100%) would mean a total overlap between vocabularies, whereas 0 means there are no common words.

There are different ways to define lexical similarity, and the results vary accordingly. For example, Ethnologue's method compares a regionally standardized wordlist (comparable to the Swadesh list) and counts those forms that show similarity in both form and meaning. Using this method, English was evaluated to have a lexical similarity of 60% with German and 27% with French.
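A wordlist comparison of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Ethnologue's actual procedure: the judgment of "similar in form" is replaced here by a normalized edit distance with an arbitrary threshold of 0.5, and the function names and the four-word toy wordlists are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (symmetric in a and b)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def forms_similar(a: str, b: str, threshold: float = 0.5) -> bool:
    """Assumed stand-in for a linguist's judgment of similarity in form."""
    longest = max(len(a), len(b))
    if longest == 0:
        return True
    return 1 - edit_distance(a, b) / longest >= threshold


def lexical_similarity(wordlist_a: dict, wordlist_b: dict,
                       threshold: float = 0.5) -> float:
    """Share of meanings whose forms count as similar in both wordlists.

    Each wordlist maps a meaning (the shared gloss) to that language's form,
    so "similar in meaning" is guaranteed by comparing forms per meaning.
    """
    meanings = wordlist_a.keys() & wordlist_b.keys()
    if not meanings:
        raise ValueError("the wordlists share no meanings")
    matches = sum(forms_similar(wordlist_a[m], wordlist_b[m], threshold)
                  for m in meanings)
    return matches / len(meanings)


# Toy wordlists for illustration only; these are not Ethnologue data.
english = {"water": "water", "hand": "hand", "dog": "dog", "night": "night"}
german = {"water": "wasser", "hand": "hand", "dog": "hund", "night": "nacht"}

if __name__ == "__main__":
    print(lexical_similarity(english, german))
```

Because the edit distance does not depend on the order of its arguments, the resulting score is the same whichever language is listed first. The choice of wordlist and threshold changes the result, which is one reason published figures differ between methods.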

Lexical similarity can be used to evaluate the degree of genetic relationship between two languages. Percentages higher than 85% usually indicate that the two languages being compared are likely to be related dialects.[1]

Lexical similarity is only one indication of the mutual intelligibility of two languages, since the latter also depends on the degree of phonetic, morphological, and syntactic similarity. The result also depends on which wordlist is used. For example, lexical similarity between French and English is considerable in lexical fields relating to culture, whereas it is smaller for basic (function) words. Unlike mutual intelligibility, which can be stronger in one direction than the other, lexical similarity is always symmetric.

Indo-European languages

The table below shows some lexical similarity values for pairs of selected Romance, Germanic, and Slavic languages, as collected and published by Ethnologue.[2]

Lexical similarity coefficients (rows: Language 1; columns: Language 2)

Code  Language    ita   spa   por   fra   ron   cat   roh   srd   eng   deu   rus
ita   Italian     1     0.82  0.80  0.89  0.77  0.87  0.78  0.85  -     -     -
spa   Spanish     0.82  1     0.89  0.75  0.71  0.85  0.74  0.76  -     -     -
por   Portuguese  0.80  0.89  1     0.75  0.72  0.85  0.74  0.76  -     -     -
fra   French      0.89  0.75  0.75  1     0.75  0.85  0.78  0.80  0.27  0.29  -
ron   Romanian    0.77  0.71  0.72  0.75  1     0.73  0.72  0.74  -     -     -
cat   Catalan     0.87  0.85  0.85  0.85  0.73  1     0.76  0.75  -     -     -
roh   Romansh     0.78  0.74  0.74  0.78  0.72  0.76  1     0.74  -     -     -
srd   Sardinian   0.85  0.76  0.76  0.80  0.74  0.75  0.74  1     -     -     -
eng   English     -     -     -     0.27  -     -     -     -     1     0.60  0.24
deu   German      -     -     -     0.29  -     -     -     -     0.60  1     -
rus   Russian     -     -     -     -     -     -     -     -     0.24  -     1
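For readers who want to query these figures, the following Python sketch stores each language pair once and looks it up in either order, which is enough because the matrix is symmetric. The names SIMILARITY, lookup and pairs_above are illustrative assumptions; the values are copied from the table above, and pairs without data are simply absent.

```python
# Each unordered pair appears once; values are taken from the table above.
_ROWS = {
    "ita": {"spa": 0.82, "por": 0.80, "fra": 0.89, "ron": 0.77,
            "cat": 0.87, "roh": 0.78, "srd": 0.85},
    "spa": {"por": 0.89, "fra": 0.75, "ron": 0.71, "cat": 0.85,
            "roh": 0.74, "srd": 0.76},
    "por": {"fra": 0.75, "ron": 0.72, "cat": 0.85, "roh": 0.74, "srd": 0.76},
    "fra": {"ron": 0.75, "cat": 0.85, "roh": 0.78, "srd": 0.80,
            "eng": 0.27, "deu": 0.29},
    "ron": {"cat": 0.73, "roh": 0.72, "srd": 0.74},
    "cat": {"roh": 0.76, "srd": 0.75},
    "roh": {"srd": 0.74},
    "eng": {"deu": 0.60, "rus": 0.24},
}

# Keys are frozensets so that (a, b) and (b, a) map to the same entry.
SIMILARITY = {frozenset((a, b)): value
              for a, row in _ROWS.items()
              for b, value in row.items()}


def lookup(code_a: str, code_b: str):
    """Return the coefficient for two ISO 639-3 codes, or None if no data."""
    if code_a == code_b:
        return 1.0
    return SIMILARITY.get(frozenset((code_a, code_b)))


def pairs_above(threshold: float):
    """List (pair, value) entries strictly above the threshold, highest first."""
    hits = [(tuple(sorted(pair)), value)
            for pair, value in SIMILARITY.items() if value > threshold]
    return sorted(hits, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    print(lookup("deu", "eng"))
    for pair, value in pairs_above(0.85):
        print(pair, value)
```

Calling pairs_above(0.85) lists the pairs that exceed the 85% level mentioned earlier as typical of related dialects; that level is a rule of thumb, so the output should not be read as a classification.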

Notes:

  • Language codes follow the ISO 639-3 standard.
  • Roberto Bolognesi and Wilbert Heeringa found the average divergence between Sardinian and Italian to be around 48.7%, ranging from 46.6% for the least divergent dialect to 51.1% for the most divergent.[3] That would make the various dialects of Sardinian slightly more divergent from Italian than Spanish is (average divergence from Italian around 46.0%).[3]
  • "-" denotes that comparison data are not available.
  • For English-French lexical similarity, at least two other studies[4][5] estimate the share of English words derived directly from French at 28.3% and 41% respectively, with a further 28.24% and 15% of English words derived from Latin. Adding the two shares would put English-French lexical similarity at around 0.56, with correspondingly lower English-German lexical similarity. Another study estimates the share of English words of Italic origin at 51%, consistent with the two previous analyses.[6]
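The figure of about 0.56 in the last note appears to come from adding the French-derived and Latin-derived shares reported by each study. The short Python sketch below reproduces that arithmetic; the names ESTIMATES and combined_share are hypothetical, and treating an etymological share as a lexical similarity coefficient is the note's assumption rather than Ethnologue's wordlist method.

```python
# (French-derived %, Latin-derived %) as quoted in the note for studies [4] and [5].
ESTIMATES = {
    "[4]": (28.3, 28.24),
    "[5]": (41.0, 15.0),
}


def combined_share(french_pct: float, latin_pct: float) -> float:
    """Sum the two percentages and express the result as a 0-1 coefficient."""
    return (french_pct + latin_pct) / 100


if __name__ == "__main__":
    for study, (french_pct, latin_pct) in ESTIMATES.items():
        print(study, round(combined_share(french_pct, latin_pct), 4))
```

The two studies give 56.54% and 56% respectively, which is where a value of roughly 0.56 comes from.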

See also

  • Lexis (linguistics)
  • Language family
  • Dialect
  • Linguistic distance

References

  • Ethnologue.com (lexical similarity values available at some of the individual language entries)
  • Definition of lexical similarity at Ethnologue.com
  • Rensch, Calvin R. 1992. "Calculating lexical similarity." In Eugene H. Casad (ed.), Windows on bilingualism, 13-15. (Summer Institute of Linguistics and the University of Texas at Arlington Publications in Linguistics, 110). Dallas: Summer Institute of Linguistics and the University of Texas at Arlington.

Notes

  1. "About the Ethnologue". Ethnologue. 2012-09-25. Retrieved 2019-02-24.
  2. See, for instance, lexical similarity data for French, German, English.
  3. Bolognesi, Roberto; Heeringa, Wilbert (2005). Sardegna fra tante lingue (PDF). Condaghes. pp. 123. Archived from the original (PDF) on 2014-02-11. Retrieved 2017-04-14.
  4. Finkenstaedt, Thomas; Dieter Wolff (1973). Ordered profusion; studies in dictionaries and the English lexicon. C. Winter. ISBN 3-533-02253-6.
  5. "Joseph M. Williams, Origins of the English Language". Amazon.com. Retrieved 2010-04-21.
  6. Nation, I.S.P. (2001). Learning Vocabulary in Another Language. Cambridge University Press. p. 477. ISBN 0-521-80498-1.

External links

  • Most similar languages
  • A Similarity Database of Modern Lexicons: Lexical similarity of 331 languages
