
Windows-1258

Windows-1258 is a code page used in Microsoft Windows to represent Vietnamese text. It makes use of combining diacritical marks.

Windows-1258
MIME / IANA: windows-1258
Alias(es): cp1258 (Code page 1258)
Language(s): Vietnamese, English, French, German, Spanish, Danish, Norwegian, Swedish, Finnish, Irish, Albanian, Luxembourgish, Tswana. With combining diacritics: Estonian, Italian, Portuguese, Yoruba, Guarani, Igbo, Nauruan, Devanagari transliteration.
Created by: Microsoft
Standard: WHATWG Encoding Standard
Classification: extended ASCII, Windows-125x
Based on: Windows-1252

Windows-1258 is compatible with neither the Vietnamese standard (TCVN 5712 / VSCII) nor the other encodings in practical use (VISCII, VNI, VPS). Rather, it is very similar to Windows-1252, with the following differences: s-caron and z-caron (which were added to Windows-1252 later) are missing; five of the letters with diacritics have been replaced by combining diacritics for Vietnamese tone marks; one has been replaced by the đồng sign; and eight others (four per case) have been replaced by four otherwise-unsupported Vietnamese letters.

Use of combining diacritics means that Windows-1258 can cover the large number of combinations of letters and tone marks in Vietnamese without compromising coverage of control codes or symbols. However, it also means that software must handle conversions between precomposed characters and combining sequences correctly when converting to or from other encodings, and it makes determining the user-visible length of a string more difficult.
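As a rough illustration of the conversion problem, the sketch below encodes Unicode text to Windows-1258 with Python's built-in `cp1258` codec. The helper names `to_cp1258` and `visible_length` and the `TONE_MARKS` set are hypothetical, introduced only for this example. The strategy shown (use the precomposed byte when one exists, otherwise keep the base letter precomposed and emit only the tone mark as a combining character) is one plausible approach under these assumptions, not a documented Microsoft algorithm.

```python
import unicodedata

# The five combining tone marks that Windows-1258 encodes as single bytes:
# grave, acute, tilde, hook above, dot below.
TONE_MARKS = {"\u0300", "\u0301", "\u0303", "\u0309", "\u0323"}


def to_cp1258(text: str) -> bytes:
    """Encode text as Windows-1258, splitting off tone marks where needed.

    Hypothetical helper, not a library API. Raises UnicodeEncodeError for
    characters the code page cannot represent even after splitting.
    """
    out = bytearray()
    for ch in unicodedata.normalize("NFC", text):
        try:
            out += ch.encode("cp1258")  # a precomposed byte exists
            continue
        except UnicodeEncodeError:
            pass
        # Fully decompose, then recompose everything except the tone mark.
        parts = unicodedata.normalize("NFD", ch)
        tones = "".join(c for c in parts if c in TONE_MARKS)
        rest = "".join(c for c in parts if c not in TONE_MARKS)
        base = unicodedata.normalize("NFC", rest)
        out += (base + tones).encode("cp1258")
    return bytes(out)


def visible_length(text: str) -> int:
    """Approximate user-visible length by skipping combining marks."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


data = to_cp1258("Vi\u1ec7t")           # "Việt" with a precomposed U+1EC7
decoded = data.decode("cp1258")
print(data)                              # b'Vi\xea\xf2t'
print(len(decoded), visible_length(decoded))  # 5 4
```

The decoded string has five code points but four user-visible letters, which is the length mismatch described above. Counting non-combining characters is only an approximation; full grapheme segmentation is more involved.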

IBM uses code page 1258 (CCSID 1258 and euro sign extended CCSID 5354) for Windows-1258.[1][2][3]

UTF-8 is the preferred encoding for Vietnamese in modern applications. Windows-1258 may not always round-trip Unicode-encoded Vietnamese text, because Unicode normalization can change the character sequence.[4] Combining diacritics are encoded after the letter in both Windows-1258 and Unicode[4] (like VNI, unlike ANSEL).
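The round-trip issue can be sketched with Python's built-in `cp1258` codec and the `unicodedata` module. The `encodes` helper is a hypothetical name used only here. The example assumes a strict converter with no best-fit or automatic decomposition, which is how Python's codec behaves; other converters may differ.

```python
import unicodedata

raw = b"Vi\xea\xf2t"             # "Việt" as Windows-1258 bytes
decoded = raw.decode("cp1258")   # ê (U+00EA) followed by U+0323 (dot below)

# Normalization changes the sequence in both directions.
nfc = unicodedata.normalize("NFC", decoded)  # one precomposed U+1EC7
nfd = unicodedata.normalize("NFD", decoded)  # e + U+0323 + U+0302


def encodes(text: str) -> bool:
    """Return True if a strict cp1258 encode succeeds (hypothetical helper)."""
    try:
        text.encode("cp1258")
        return True
    except UnicodeEncodeError:
        return False


print(encodes(decoded))  # True: the original mixed form maps back to bytes
print(encodes(nfc))      # False: U+1EC7 has no Windows-1258 byte
print(encodes(nfd))      # False: U+0302 (combining circumflex) has no byte
```

Neither standard normalization form matches what the code page stores, so text that has passed through NFC or NFD needs a custom re-decomposition step before it can be written back as Windows-1258.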

Character set

The following table shows Windows-1258. Each character is shown with its Unicode equivalent.

Windows-1258[5][6][7][8][9][10]
0 1 2 3 4 5 6 7 8 9 A B C D E F
0x NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI
1x DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US
2x  SP  ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~ DEL
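8x € undef ‚ ƒ „ … † ‡ ˆ ‰ undef ‹ Œ undef undef undef
9x undef ‘ ’ “ ” • U+2013 U+2014 ˜ ™ undef › œ undef undef undef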
Ax NBSP ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ SHY ® ¯
Bx ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿
Cx À Á Â Ă Ä Å Æ Ç È É Ê Ë ◌̀ Í Î Ï
Dx Đ Ñ ◌̉ Ó Ô Ơ Ö × Ø Ù Ú Û Ü Ư ◌̃ ß
Ex à á â ă ä å æ ç è é ê ë ◌́ í î ï
Fx đ ñ ◌̣ ó ô ơ ö ÷ ø ù ú û ü ư ₫ ÿ
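  Differences from Windows-1252: 0x8A, 0x8E, 0x9A and 0x9E are unassigned; 0xC3, 0xCC, 0xD0, 0xD2, 0xD5, 0xDD, 0xDE, 0xE3, 0xEC, 0xF0, 0xF2, 0xF5, 0xFD and 0xFE are reassigned.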
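The differences from Windows-1252 can be checked mechanically. The following sketch compares Python's built-in `cp1252` and `cp1258` codecs byte by byte; `decode_byte` is a hypothetical helper, and the result reflects Python's mapping tables, which are assumed here to match the tables above.

```python
def decode_byte(value: int, codec: str):
    """Decode a single byte, returning None where the codec leaves it unassigned."""
    try:
        return bytes([value]).decode(codec)
    except UnicodeDecodeError:
        return None


# Map each differing byte value to its (Windows-1252, Windows-1258) meaning.
differences = {
    value: (decode_byte(value, "cp1252"), decode_byte(value, "cp1258"))
    for value in range(256)
    if decode_byte(value, "cp1252") != decode_byte(value, "cp1258")
}

for value, (in_1252, in_1258) in differences.items():
    name_1252 = "unassigned" if in_1252 is None else f"U+{ord(in_1252):04X}"
    name_1258 = "unassigned" if in_1258 is None else f"U+{ord(in_1258):04X}"
    print(f"0x{value:02X}: {name_1252} -> {name_1258}")
```

Printing code points rather than the characters themselves keeps the combining marks readable, since a bare combining character would otherwise attach to the preceding text in the output.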

Code page 1129

IBM's code page 1129 (CCSID 1129 and euro sign extended CCSID 1163)[11][12][13] is similar to code page 1258, but with the following differences:

Code page 1129 (differences from code page 1258)[14][15][16][17][18][19]
0 1 2 3 4 5 6 7 8 9 A B C D E F
8x
9x
Ax NBSP ¡ ¢ £ ¤ ¥ ¦ § œ © ª « ¬ SHY ® ¯
Bx ° ± ² ³ Ÿ µ ¶ · Œ ¹ º » ¼ ½ ¾ ¿
  Differences from Windows-1258: 0xA8 (œ), 0xB4 (Ÿ) and 0xB8 (Œ) in the rows shown; rows 8x and 9x also differ.

See also

  • VSCII
  • VISCII
  • VNI Character Set
  • VPS character encoding

References

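  1. ^ "Code page 1258 information document". Archived from the original on 2016-03-03.
  2. ^ "CCSID 1258 information document". Archived from the original on 2014-11-29.
  3. ^ "CCSID 5354 information document". Archived from the original on 2014-11-29.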
  4. ^ a b Kaplan, Michael S. (2005-04-19). "A few of the gotchas of MultiByteToWideChar". Sorting it all out.
  5. ^ Steele, Shawn (1998-04-15). "cp1258 to Unicode table". Microsoft.
  6. ^ Unicode mappings of windows 1258 with "best fit"
  7. ^ Code Page CPGID 01258 (pdf) (PDF), IBM
  8. ^ Code Page CPGID 01258 (txt), IBM
  9. ^ International Components for Unicode (ICU), ibm-1258_P100-1997.ucm, 2002-12-03
  10. ^ International Components for Unicode (ICU), ibm-5354_P100-1998.ucm, 2002-12-03
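  11. ^ "Code page 1129 information document". Archived from the original on 2010-09-21.
  12. ^ "CCSID 1129 information document". Archived from the original on 2016-03-27.
  13. ^ "CCSID 1163 information document". Archived from the original on 2014-11-29.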
  14. ^ Lunde, Ken (13 January 2009). "Appendix L: Vietnamese Character Sets" (PDF). CJKV Information Processing (2nd ed.). ISBN 978-0-596-51447-1.
  15. ^ Code Page CPGID 01129 (pdf) (PDF), IBM
  16. ^ Code Page CPGID 01129 (txt), IBM
  17. ^ International Components for Unicode (ICU), ibm-1129_P100-1997.ucm, 2002-12-03
  18. ^ Code Page CPGID 01163 (pdf) (PDF), IBM
  19. ^ Code Page CPGID 01163 (txt), IBM

External links

  • IANA Charset Name Registration of windows-1258
  • Michael Kaplan's blog describing the Windows 1258 encoding behavior
