ISO/IEC 8859-2

ISO/IEC 8859-2:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 2: Latin alphabet No. 2, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings; its first edition was published in 1987. It is informally referred to as "Latin-2". It is generally intended for Central[1] or "Eastern European" languages that are written in the Latin script. Note that ISO/IEC 8859-2 is very different from code page 852 (MS-DOS Latin 2, PC Latin 2), which is also referred to as "Latin-2" in Czech and Slovak regions.[2] Code page 912 is an extension of ISO/IEC 8859-2. Almost half of the encoding's use is for Polish, for which it is the main legacy encoding, although on the web virtually all of its use has been replaced by UTF-8.

ISO/IEC 8859-2
MIME / IANA: ISO-8859-2
Alias(es): iso-ir-101, csISOLatin2, latin2, l2, IBM1111
Language(s): (see below)
Standard: ECMA-94:1986, ISO/IEC 8859
Classification: Extended ASCII, ISO/IEC 8859
Extends: US-ASCII
Based on: ISO-8859-1
Other related encoding(s): Windows-1250, MacCroatian

ISO-8859-2 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. Less than 0.04% of all web pages use ISO-8859-2 as of October 2022.[3][4] Microsoft has assigned code page 28592 a.k.a. Windows-28592 to ISO-8859-2 in Windows. IBM assigned Code page 1111 to ISO 8859-2.
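As a rough illustration of how these charset names are used in practice, the sketch below asks Python's codec registry to resolve several of the names listed in the infobox. The helper name `canonical_names` is hypothetical, and IBM1111 is left out on the assumption that Python's standard library does not register it.

```python
import codecs

# Names taken from the infobox; IBM1111 is omitted (assumed not to be
# registered as an alias in Python's standard library).
ALIASES = ["ISO-8859-2", "iso-ir-101", "csISOLatin2", "latin2", "l2"]

def canonical_names(aliases):
    """Map each alias to the codec name Python resolves it to."""
    return {alias: codecs.lookup(alias).name for alias in aliases}

for alias, name in canonical_names(ALIASES).items():
    print(f"{alias:12} -> {name}")
```

If the aliases are registered as expected, every line prints the same codec name, showing that they all select one and the same encoding.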

Windows-1250 is similar to ISO-8859-2 and contains all of its printable characters plus additional ones. However, some of them are at different code positions (unlike Windows-1252, which keeps all printable characters of ISO-8859-1 in the same place).
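The rearrangement can be observed directly with Python's built-in `iso8859_2` and `cp1250` codecs. The following is a minimal sketch; the function name `moved_characters` is hypothetical.

```python
def moved_characters(src="iso8859_2", dst="cp1250"):
    """Return {char: (src_byte, dst_byte)} for the characters in the
    0xA0-0xFF range of src that sit at a different byte value in dst."""
    moved = {}
    for byte in range(0xA0, 0x100):
        char = bytes([byte]).decode(src)
        # encode() would raise UnicodeEncodeError if dst lacked the character
        dst_byte = char.encode(dst)[0]
        if dst_byte != byte:
            moved[char] = (byte, dst_byte)
    return moved

for char, (old, new) in moved_characters().items():
    print(f"{char}  ISO-8859-2 0x{old:02X} -> Windows-1250 0x{new:02X}")
```

Because the loop completes without an encoding error, it also confirms that every printable ISO-8859-2 character has a Windows-1250 equivalent. Bytes written in one encoding and read as the other will therefore show wrong letters only for the characters this function reports.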

Language coverage

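These code values can be used for the following languages:

  • Albanian
  • Bosnian
  • Croatian
  • Czech
  • German (missing uppercase ẞ)[a]
  • Hungarian
  • Polish
  • Rotokas
  • Serbian Latin
  • Slovak
  • Slovene
  • Upper Sorbian
  • Lower Sorbian
  • Turkmen

  a. Support for German was complete until 2017, when the Council for German Orthography officially adopted a capital ⟨ẞ⟩. The encoding is fully compatible with ISO/IEC 8859-1 for German texts.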

It can also be used for Romanian, but it is not well suited to that language because it lacks the letters s and t with comma below (ș, ț), although it provides the similar-looking s and t with cedilla (ş, ţ). These letters were unified in the first versions of the Unicode standard, meaning that the appearance with cedilla or with comma was treated as a glyph choice rather than as separate characters; fonts intended for use with Romanian should therefore, in theory, have characters with a comma below at those code points.
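A short sketch of what this means for software: the cedilla letters can be encoded in ISO-8859-2, while the comma-below letters cannot. The helper `encodable` is a hypothetical name; the code points are written as escapes so that the two visually similar variants cannot be confused.

```python
def encodable(text, encoding="iso8859_2"):
    """Return True if every character of text exists in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

cedilla_word = "\u0163ar\u0103"   # t with cedilla (U+0163) + "ar" + a with breve
comma_word = "\u021Bar\u0103"     # t with comma below (U+021B) + "ar" + a with breve

print(encodable(cedilla_word))    # cedilla form: present in ISO-8859-2
print(encodable(comma_word))      # comma form: absent from ISO-8859-2
```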

In practice, Microsoft did not provide such fonts for computers sold in Romania. Nevertheless, ISO 8859-2 and Windows-1250 (which has the same problem) have been heavily used for Romanian. Unicode subsequently disunified the comma variants from the cedilla variants and has since become the dominant encoding for web pages, which nevertheless often still use s and t with cedilla. As of 2014, Unicode notes[citation needed] that disunifying the letters with comma below was a mistake that corrupted Romanian data: pre-existing data and input methods still contain the older cedilla code points, which complicates text searching.
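One common way to work around the searching problem is to fold both variants to a single form before comparing. The sketch below is an illustration, not a method described by Unicode or by the standard: the names `fold_romanian` and `contains` are hypothetical, and it assumes the text is known to be Romanian (Turkish, for example, legitimately uses s with cedilla).

```python
# Map the legacy cedilla code points to their comma-below counterparts.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015E": "\u0218",  # S with cedilla -> S with comma below
    "\u015F": "\u0219",  # s with cedilla -> s with comma below
    "\u0162": "\u021A",  # T with cedilla -> T with comma below
    "\u0163": "\u021B",  # t with cedilla -> t with comma below
})

def fold_romanian(text):
    """Return text with cedilla s/t replaced by the comma-below forms.
    Assumption: the text is Romanian, so every cedilla s/t is a legacy form."""
    return text.translate(CEDILLA_TO_COMMA)

def contains(haystack, needle):
    """Substring search that ignores the cedilla/comma distinction."""
    return fold_romanian(needle) in fold_romanian(haystack)

legacy = "Bucure\u015Fti"    # as decoded from ISO-8859-2 data
modern = "Bucure\u0219ti"    # as typed with a current Romanian keyboard
print(legacy == modern)          # False: the code points differ
print(contains(legacy, modern))  # True once both sides are folded
```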

Code page layout

Differences from ISO-8859-1 have the Unicode code point number underneath.

ISO/IEC 8859-2 (Latin-2)
    0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
0x
1x
2x  SP   !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
3x  0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
4x  @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
5x  P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
6x  `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
7x  p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~
8x
9x
Ax  NBSP Ą    ˘    Ł    ¤    Ľ    Ś    §    ¨    Š    Ş    Ť    Ź    SHY  Ž    Ż
         0104 02D8 0141      013D 015A           0160 015E 0164 0179      017D 017B
Bx  °    ą    ˛    ł    ´    ľ    ś    ˇ    ¸    š    ş    ť    ź    ˝    ž    ż
         0105 02DB 0142      013E 015B 02C7      0161 015F 0165 017A 02DD 017E 017C
Cx  Ŕ    Á    Â    Ă    Ä    Ĺ    Ć    Ç    Č    É    Ę    Ë    Ě    Í    Î    Ď
    0154           0102      0139 0106      010C      0118      011A           010E
Dx  Đ    Ń    Ň    Ó    Ô    Ő    Ö    ×    Ř    Ů    Ú    Ű    Ü    Ý    Ţ    ß
    0110 0143 0147           0150           0158 016E      0170           0162
Ex  ŕ    á    â    ă    ä    ĺ    ć    ç    č    é    ę    ë    ě    í    î    ď
    0155           0103      013A 0107      010D      0119      011B           010F
Fx  đ    ń    ň    ó    ô    ő    ö    ÷    ř    ů    ú    ű    ü    ý    ţ    ˙
    0111 0144 0148           0151           0159 016F      0171           0163 02D9
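The positions that differ from ISO-8859-1 can also be derived programmatically. The following sketch uses Python's built-in `iso8859_2` and `latin_1` codecs; the function name `differences_from_latin1` is hypothetical.

```python
def differences_from_latin1():
    """Return {byte: char} for bytes 0xA0-0xFF that decode to a different
    character in ISO-8859-2 than in ISO-8859-1."""
    diffs = {}
    for byte in range(0xA0, 0x100):
        raw = bytes([byte])
        char = raw.decode("iso8859_2")
        if char != raw.decode("latin_1"):
            diffs[byte] = char
    return diffs

for byte, char in differences_from_latin1().items():
    print(f"0x{byte:02X}  {char}  U+{ord(char):04X}")
```

Each printed line corresponds to one cell of the layout that carries a Unicode code point number.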

See also

  • Character encoding
  • Polish code pages

References

  1. "Microsoft Outlook Message Encodings".
  2. "The Czech and Slovak Character Encoding Mess Explained". luki.sdf-eu.org. Retrieved 2022-02-27.
  3. "Usage Statistics and Market Share of ISO-8859-2 for Websites, October 2022". w3techs.com. Retrieved 2022-10-23.
  4. "Historical trends in the usage statistics of character encodings for websites, February 2022".

External links

  • ISO/IEC 8859-2:1999
  • Standard ECMA-94: 8-Bit Single Byte Coded Graphic Character Sets - Latin Alphabets No. 1 to No. 4 2nd edition (June 1986)
  • ISO-IR 101 Right-Hand Part of Latin Alphabet No.2 (February 1, 1986)
