Pangloss Collection

The Pangloss Collection is a digital library whose objective is to store and facilitate access to audio recordings in endangered languages of the world. Developed by the LACITO centre of CNRS in Paris, the collection provides free online access to documents of connected, spontaneous speech, in otherwise little-documented languages of all continents.[1]

Principles

A sound archive with synchronized transcriptions

For the science of linguistics, language is first and foremost spoken language. The medium of spoken language is sound. The Pangloss Collection gives access to original recordings simultaneously with transcriptions and translations, as a resource for further research. After being recorded in their cultural context, the texts were transcribed in collaboration with native speakers.

A structured, open architecture

The archived data is structured in accordance with the latest data-processing standards, as open architecture, in an open format, and may be downloaded under a Creative Commons license. The software used to prepare and disseminate it is open-source. The Pangloss Collection is a member of the OLAC network of archival repositories and of the Digital Endangered Languages and Music Archive Network (DELAMAN).

History

The collection was initially called the LACITO Archive.[2][3] The project originated in 1996 from the collaboration of Boyd Michailovsky, linguist at LACITO, with John B. Lowe, engineer;[4]: 15 they were later joined by Michel Jacobson, engineer, who developed some tools for the project and brought it online.[1]: 124 [4]

The purpose of the archive was “to conserve, and to make available for research, recorded and transcribed oral traditions and other linguistic materials in (mainly) unwritten languages, giving simultaneous access to sound recordings and text annotation.”[4] The earliest archived corpora in the collection covered languages from Nepal, New Caledonia, eastern Africa and French Guiana.[5]

The archive has grown steadily since the early 2000s,[6] incorporating corpora from various linguists, whether members of LACITO or not. In 2009, the archive had 200 recordings in 45 languages.[7] In 2014, the (newly renamed) Pangloss Collection had 1,400 recordings in 70 languages.[1]: 121 

As of April 2021, the Pangloss archive contains 5,038 recordings[8] in 196 languages,[9] totalling 780 hours of audio and video recordings.[6]

The main languages represented in the Pangloss Collection are Mwotlap (Austronesian; Vanuatu),[10] Japhug (Sino-Tibetan; Southwest China),[11] Ersu (Sino-Tibetan; Southwest China),[12] Naxi (or Yongning Na; Sino-Tibetan; Southwest China),[13] and Cèmuhî (Austronesian; New Caledonia).[14] The main contributors to Pangloss (by number of resources archived) are the linguists Alexandre François,[15] Katia Chirkova,[16] Guillaume Jacques,[17] and Michel Ferlus.[18]

References

  1. ^ a b c Michailovsky, Boyd, Martine Mazaudon, Alexis Michaud, Séverine Guillaume, Alexandre François & Evangelia Adamou. 2014. Documenting and researching endangered languages: the Pangloss Collection. Language Documentation & Conservation 8, pp. 119-135.
  2. ^ Jacobson, Michel; Michailovsky, Boyd (2002). The LACITO Archive : its purpose and implementation. Int'l Workshop on Resources and Tools in Field Linguistics. Las Palmas, Canary Is., Spain.
  3. ^ Screen capture of LACITO's archive homepage, 27 February 2001.
  4. ^ a b c Jacobson, Michel; Michailovsky, Boyd; Lowe, John B. (2001). "Linguistic documents synchronizing sound and text". Speech Communication. Special issue: “Speech Annotation and Corpus Tools”. 33 (1–2): 79–96. doi:10.1016/S0167-6393(00)00070-4.
  5. ^ Screen capture of LACITO's archive contents, 22 April 2002.
  6. ^ a b “About us” section of the Pangloss Collection (retrieved 24 April 2021)
  7. ^ Screen capture of LACITO's archive contents, 26 November 2009.
  8. ^ Source: list of all Pangloss resources on the Cocoon homepage (retrieved 10 January 2022).
  9. ^ Source: number of language entries in its list of corpora (retrieved 24 April 2021).
  10. ^ Mwotlap corpus: 564 resources.
  11. ^ Japhug corpus: 551 resources.
  12. ^ Ersu corpus: 363 resources.
  13. ^ Yongning Na corpus: 301 resources.
  14. ^ Cèmuhî corpus: 230 resources.
  15. ^ François contributed 1075 resources.
  16. ^ Chirkova contributed 601 resources.
  17. ^ Jacques contributed 554 resources.
  18. ^ Ferlus contributed 530 resources.

External links

  • Homepage of the Pangloss Collection
  • Sample text from the collection: “The Ogre Kanayongba”, a story in the Limbu language of Nepal, presented in bilingual format.
  • Access to the Pangloss Collection through its language map
  • Access to the Pangloss Collection through the CoCoON search interface.
  • Access to the Pangloss Collection through the OLAC search interface. Archived 2021-04-24 at the Wayback Machine.
