Wikipedia

CJK characters

In internationalization, CJK characters is a collective term for the characters of the Chinese, Japanese, and Korean languages, all of whose writing systems include Chinese characters and their derivatives, sometimes paired with other scripts. Collectively, CJK characters often include Hànzì in Chinese, Kanji and Kana in Japanese, and Hanja and Hangul in Korean. Vietnamese can be included, making the abbreviation CJKV, because Vietnamese was historically written with Chinese characters, known in Vietnamese as chữ Hán and chữ Nôm (together, Hán-Nôm).

Character repertoire

Standard Mandarin Chinese and Standard Cantonese are written almost exclusively in Chinese characters. Over 3,000 characters are required for general literacy, with up to 40,000 characters for reasonably complete coverage. Japanese uses fewer characters: general literacy in Japanese can be expected with 2,136 characters. The use of Chinese characters in Korea is increasingly rare, although their idiosyncratic use in proper names requires knowledge (and therefore availability) of many more characters than everyday text does. Even today, however, South Korean students are taught 1,800 characters.

Other scripts used for these languages, such as bopomofo and the Latin-based pinyin for Chinese, hiragana and katakana for Japanese, and hangul for Korean, are not strictly "CJK characters", although CJK character sets almost invariably include them as necessary for full coverage of the target languages.

The sinologist Carl Leban (1971) produced an early survey of CJK encoding systems.

Until the early 20th century, Classical Chinese was the written language of government and scholarship in Vietnam. Popular literature in Vietnamese was written in the chữ Nôm script, which consists of Chinese characters together with many locally created characters. From the 1920s onwards, literature has been recorded in the Latin-based chữ Quốc ngữ script.[1][2]

Encoding

The characters required for complete coverage of all these languages cannot fit in the 256-character code space of an 8-bit character encoding; at least a 16-bit fixed-width encoding or a multi-byte variable-length encoding is required. The 16-bit fixed-width encodings, such as those from Unicode up to and including version 2.0, are now deprecated, both because more characters must be encoded than a 16-bit encoding can accommodate (Unicode 5.0 has some 70,000 Han characters) and because the Chinese government requires that software in China support the GB 18030 character set.
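The size limits described above can be illustrated with a minimal Python sketch. The two sample characters and the helper name `encoded_length` are chosen here for illustration only; the codec names are those of Python's standard library.

```python
# Compare how many bytes one Han character needs in several encodings.
# Sample characters are illustrative choices, not taken from the article.
bmp_char = "\u6f22"      # U+6F22, fits in the 16-bit Basic Multilingual Plane
ext_char = "\U00020000"  # U+20000, first code point of CJK Extension B

def encoded_length(text, encoding):
    """Return the number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))

# A code point above 0xFFFF cannot be held in a single 16-bit unit.
needs_more_than_16_bits = ord(ext_char) > 0xFFFF

for codec in ("utf-8", "utf-16-be", "gb18030"):
    print(codec, encoded_length(bmp_char, codec), encoded_length(ext_char, codec))

# An 8-bit encoding such as Latin-1 has no room for either character.
try:
    bmp_char.encode("latin-1")
    fits_in_8_bit = True
except UnicodeEncodeError:
    fits_in_8_bit = False
```

In UTF-16 the second character occupies two 16-bit units (a surrogate pair), which is why a fixed 16-bit width is no longer sufficient.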

Although CJK encodings have common character sets, the encodings often used to represent them have been developed separately by different East Asian governments and software companies, and are mutually incompatible. Unicode has attempted, with some controversy, to unify the character sets in a process known as Han unification.
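As a rough illustration of this incompatibility, the sketch below encodes one sample string with two Japanese legacy encodings and then tries to read one as the other. The sample string and the helper `decode_or_none` are assumptions made for the example; the codec names come from Python's standard library.

```python
# The same text produces different bytes in different legacy encodings,
# and bytes written in one encoding are not readable as another.
text = "日本語"  # sample string chosen for illustration

shift_jis_bytes = text.encode("shift_jis")
euc_jp_bytes = text.encode("euc_jp")

def decode_or_none(data, encoding):
    """Decode `data`, returning None if the bytes are invalid in `encoding`."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# Reading Shift-JIS bytes as EUC-JP fails or yields different text.
misread = decode_or_none(shift_jis_bytes, "euc_jp")

# Each encoding still round-trips correctly on its own.
round_trip_ok = (
    shift_jis_bytes.decode("shift_jis") == text
    and euc_jp_bytes.decode("euc_jp") == text
)

print(shift_jis_bytes.hex(), euc_jp_bytes.hex(), misread, round_trip_ok)
```

A converter between such encodings therefore has to know the source encoding in advance or guess it, which is one practical motivation for a single unified character set.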

CJK character encodings should consist minimally of Han characters plus language-specific phonetic scripts such as pinyin, bopomofo, hiragana, katakana and hangul.
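One way to see these script groups in a Unicode-based system is to classify characters by their Unicode character names. The following is only a sketch: the prefix table and the function name `script_of` are invented for this example, and matching on name prefixes is a simplification rather than an official script-detection method.

```python
import unicodedata

# Hypothetical mapping from Unicode name prefixes to the scripts named above.
SCRIPT_PREFIXES = {
    "CJK UNIFIED IDEOGRAPH": "Han",
    "HIRAGANA": "hiragana",
    "KATAKANA": "katakana",
    "HANGUL": "hangul",
    "BOPOMOFO": "bopomofo",
}

def script_of(char):
    """Return a coarse script label for one character, or 'other'."""
    name = unicodedata.name(char, "")  # empty string if the character is unnamed
    for prefix, script in SCRIPT_PREFIXES.items():
        if name.startswith(prefix):
            return script
    return "other"

for sample in "漢あア한ㄅā":
    print(sample, script_of(sample))
```

Pinyin falls under "other" here because it is written with accented Latin letters, which is consistent with the point that a CJK character set needs more than Han characters alone.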

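CJK character encodings include:

  • Big5 (the most prevalent encoding before Unicode was implemented)
  • CCCII
  • CNS 11643 (official standard of Republic of China)
  • EUC-JP
  • EUC-KR
  • GB 2312 (subset and predecessor of GB 18030)
  • GB 18030 (mandated standard in the People's Republic of China)
  • Giga Character Set (GCS)
  • ISO 2022-JP
  • KS C 5861
  • Shift-JIS
  • TRON
  • Unicode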

The CJK character sets take up the bulk of the assigned Unicode code space. There is much controversy among Japanese experts on Chinese characters about the desirability and technical merit of the Han unification process, which maps multiple Chinese and Japanese character sets into a single set of unified characters.[citation needed]

All three languages can be written both left-to-right and top-to-bottom (right-to-left and top-to-bottom in ancient documents), but are usually considered left-to-right scripts when discussing encoding issues.
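This left-to-right treatment is visible in the Unicode character database, where sample CJK characters carry the bidirectional class "L". The sketch below checks this with Python's `unicodedata` module; the sample characters and variable names are illustrative choices, and a Hebrew letter is included only as a right-to-left contrast.

```python
import unicodedata

# Sample characters chosen for illustration.
samples = {"Han": "漢", "hiragana": "あ", "hangul": "한"}

# Bidirectional class of each sample; "L" means strong left-to-right.
bidi_classes = {
    label: unicodedata.bidirectional(char) for label, char in samples.items()
}

# Contrast: a Hebrew letter has the strong right-to-left class "R".
hebrew_class = unicodedata.bidirectional("א")

print(bidi_classes, hebrew_class)
```

Vertical layout is handled by rendering software rather than by the encoding, so it does not appear in this property.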

Legal status

Libraries cooperated on encoding standards for JACKPHY characters in the early 1980s. According to Ken Lunde, the abbreviation "CJK" was a registered trademark of Research Libraries Group[3] (which merged with OCLC in 2006). The trademark owned by OCLC between 1987 and 2009 has now expired.[4]

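See also

  • Chinese character description languages
  • Chinese character encoding
  • Chinese input methods for computers
  • CJK Compatibility Ideographs
  • CJK strokes
  • CJK Unified Ideographs
  • Complex Text Layout languages (CTL)
  • Input method editor
  • Japanese language and computers
  • Korean language and computers
  • List of CJK fonts
  • Sinoxenic
  • Variable-width encoding
  • Vietnamese language and computers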

References

  1. ^ Coulmas (1991), pp. 113–115.
  2. ^ DeFrancis (1977).
  3. ^ Ken Lunde, 1996
  4. ^ Justia listing

Works cited

  • Coulmas, Florian (1991). The writing systems of the world. Blackwell. ISBN 978-0-631-18028-9.
  • DeFrancis, John (1977). Colonialism and language policy in Viet Nam. The Hague: Mouton. ISBN 978-90-279-7643-7.

Sources

  • DeFrancis, John. The Chinese Language: Fact and Fantasy. Honolulu: University of Hawaii Press, 1990. ISBN 0-8248-1068-6.
  • Hannas, William C. Asia's Orthographic Dilemma. Honolulu: University of Hawaii Press, 1997. ISBN 0-8248-1892-X (paperback); ISBN 0-8248-1842-3 (hardcover).
  • Lemberg, Werner. The CJK package for LaTeX2ε: Multilingual support beyond babel. TUGboat, Volume 18 (1997), No. 3, Proceedings of the 1997 Annual Meeting.
  • Leban, Carl. Automated Orthographic Systems for East Asian Languages (Chinese, Japanese, Korean), State-of-the-art Report, Prepared for the Board of Directors, Association for Asian Studies. 1971.
  • Lunde, Ken. CJKV Information Processing. Sebastopol, Calif.: O'Reilly & Associates, 1998. ISBN 1-56592-224-7.

External links

  • CJKV: A Brief Introduction
  • Lemberg CJK article from above, TUGboat18-3
  • On "CJK Unified Ideograph", from Wenlin.com
