
Tai Tham (Unicode block)

Tai Tham is a Unicode block containing characters of the Lanna script used for writing the Northern Thai (Kam Mu'ang), Tai Lü, and Khün languages.

Tai Tham
Range: U+1A20..U+1AAF (144 code points)
Plane: BMP
Scripts: Tai Tham
Major alphabets: Tai Tham
Assigned: 127 code points
Unused: 17 reserved code points
Unicode version history: 5.2 (2009), 127 (+127)
Unicode documentation: Code chart ∣ Web page
Note: [1][2]
Tai Tham[1][2]
Official Unicode Consortium code chart (PDF)
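       0 1 2 3 4 5 6 7 8 9 A B C D E F
U+1A2x ᨠ ᨡ ᨢ ᨣ ᨤ ᨥ ᨦ ᨧ ᨨ ᨩ ᨪ ᨫ ᨬ ᨭ ᨮ ᨯ
U+1A3x ᨰ ᨱ ᨲ ᨳ ᨴ ᨵ ᨶ ᨷ ᨸ ᨹ ᨺ ᨻ ᨼ ᨽ ᨾ ᨿ
U+1A4x ᩀ ᩁ ᩂ ᩃ ᩄ ᩅ ᩆ ᩇ ᩈ ᩉ ᩊ ᩋ ᩌ ᩍ ᩎ ᩏ
U+1A5x ᩐ ᩑ ᩒ ᩓ ᩔ ᩕ ᩖ ᩗ ᩘ ᩙ ᩚ ᩛ ᩜ ᩝ ᩞ -
U+1A6x ᩠ ᩡ ᩢ ᩣ ᩤ ᩥ ᩦ ᩧ ᩨ ᩩ ᩪ ᩫ ᩬ ᩭ ᩮ ᩯ
U+1A7x ᩰ ᩱ ᩲ ᩳ ᩴ ᩵ ᩶ ᩷ ᩸ ᩹ ᩺ ᩻ ᩼ - - ᩿
U+1A8x ᪀ ᪁ ᪂ ᪃ ᪄ ᪅ ᪆ ᪇ ᪈ ᪉ - - - - - -
U+1A9x ᪐ ᪑ ᪒ ᪓ ᪔ ᪕ ᪖ ᪗ ᪘ ᪙ - - - - - -
U+1AAx ᪠ ᪡ ᪢ ᪣ ᪤ ᪥ ᪦ ᪧ ᪨ ᪩ ᪪ ᪫ ᪬ ᪭ - -
Notes
1.^ As of Unicode version 15.0
2.^ A hyphen (-) indicates a non-assigned code point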

History

123 of the 127 code points initially encoded were proposed in L2/07-007R,[3] two more (U+1A5C and U+1A7C) in L2/08-037R2[4] and a final pair (U+1A5D and U+1A5E) in L2/08-073.[5] The last of these three documents modified the definitions of U+1A37 and U+1A38 given in the first of the three.

The following Unicode-related documents record the purpose and process of defining specific characters in the Tai Tham block:

Version Final code points[a] Count L2 ID WG2 ID Document
5.2[b] U+1A20..1A5E, 1A60..1A7C, 1A7F..1A89, 1A90..1A99, 1AA0..1AAD 127 L2/99-245 N2042 Everson, Michael; McGowan, Rick (1999-07-20), Unicode Technical Report #3: Early Aramaic, Balti, Kirat (Limbu), Manipuri (Meitei) and Tai Lü scripts
X3L2/94-088 N1013 The Motion on the Coding of the Old Xishuang Banna Dai Writing, Entering into BMP of ISO/IEC 10646, 1994-04-18
N1099 (pdf, doc) The motion on coding of the Old Xishuang Banna Dai Writing Entering into BMP of ISO/IEC 10646, 1994-10-10
L2/04-351 Hosken, Martin (2004-06-28), Lanna Unicode: A Draft Proposal
L2/05-095R Hosken, Martin (2005-04-25), Lanna Unicode: A Proposal
L2/05-166 Kourilsky, G.; Berment, V. (2005-07-15), Towards a Computerization of the Lao Tham System of Writing
L2/05-188 Hosken, Martin (2005-08-02), Lao Tham in Terms of Lanna: a response to L2/05-166 from L2/05-095
L2/06-258R N3121R Everson, Michael; Hosken, Martin (2006-09-09), Proposal for encoding the Lanna script in the BMP of the UCS
L2/06-311 N3159 Tun, Ngwe (2006-09-20), Response to N3121R: Proposal for encoding the Lanna script in the BMP of the UCS
L2/06-319 N3161 Opinions on N3121-Lanna script, 2006-09-22
L2/06-320 N3169R Chen, Zhuang; Everson, Michael; Hosken, Martin; Wei, Lin-Mei (2006-09-26), Lanna ad-hoc report
N3153 (pdf, doc) Umamaheswaran, V. S. (2007-02-16), "M49.17", Unconfirmed minutes of WG 2 meeting 49 AIST, Akihabara, Tokyo, Japan; 2006-09-25/29
L2/07-015 Moore, Lisa (2007-02-08), "Lanna (C.17)", UTC #110 Minutes
L2/07-007R N3207 Everson, Michael; Hosken, Martin; Constable, Peter (2007-03-21), Revised proposal for encoding the Lanna script in the BMP of the UCS
L2/07-101 N3238 Proposing on Encoding Old Tai Lue, 2007-04-03
L2/07-098 N3239 Response to Chinese contribution N3238, "Proposing on Encoding Old Tai Lue", 2007-04-11
N3353 (pdf, doc) Umamaheswaran, V. S. (2007-10-10), "M51.2", Unconfirmed minutes of WG 2 meeting 51 Hanzhou, China; 2007-04-24/27
L2/07-118R2 Moore, Lisa (2007-05-23), "111-C17", UTC #111 Minutes
L2/07-268 N3253 (pdf, doc) Umamaheswaran, V. S. (2007-07-26), "M50.10", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27
L2/07-307 N3313 Comments on Lanna encoding in FPDAM4, 2007-09-06
L2/07-316 N3342 Hosken, Martin (2007-09-10), Response to N3313
L2/07-319 N3346 Ad hoc report on Lanna, 2007-09-19
L2/07-322 N3349R Everson, Michael (2007-09-28), "Tai Tham", Summary of repertoire for FPDAM 5 of ISO/IEC 10646:2003 and future amendments
L2/07-345 Moore, Lisa (2007-10-25), "Consensus 113-C10", UTC #113 Minutes
L2/07-353 Whistler, Ken (2007-10-10), "A. Lanna (FDAM 4 and FPDAM 5)", WG2 Consent Docket
L2/08-037R2 N3379R2 Constable, Peter (2008-04-18), Tai Tham Ad-hoc Meeting Report
L2/08-073 N3384 Hosken, Martin (2008-01-28), Tai Tham Subjoined Variants
L2/08-003 Moore, Lisa (2008-02-14), "Tai Tham", UTC #114 Minutes
L2/08-318 N3453 (pdf, doc) Umamaheswaran, V. S. (2008-08-13), "M52.2a", Unconfirmed minutes of WG 2 meeting 52
L2/14-126 + appendices Pournader, Roozbeh (2014-05-02), Improvements requested for Unicode Indic properties (two text file appendices HERE)
[affected U+1A55, 1A60, 1A80-1A89, 1A90-1A99]
L2/14-177 Moore, Lisa (2014-08-21), "B.14.5", UTC #140 Minutes
[affected U+1A56-1A5E, 1A75-1A7C, 1A7F]
L2/17-120 Wordingham, Richard (2017-05-01), Corrections to the Indic Syllabic Category for the Tai Tham Script
[affected U+1A57, 1A5A-1A5E, 1A74, 1A7A]
L2/17-169 Pournader, Roozbeh (2017-05-12), Proposed Indic Syllabic Category changes for Tai Tham for Unicode 10
[affected U+1A57, 1A5A-1A5E, 1A74, 1A7A]
L2/17-103 Moore, Lisa (2017-05-18), "B.14.9", UTC #151 Minutes
[affected U+1A57, 1A5A-1A5E, 1A74, 1A7A]
L2/18-053 Pournader, Roozbeh (2018-01-24), New Indic Syllabic Category Consonant_Initial_Postfixed
[affected U+1A5A]
L2/18-007 Moore, Lisa (2018-03-19), "B.14.7", UTC #154 Minutes
[affected U+1A5A]
L2/18-171 Wordingham, Richard (2018-04-29), Positioning of Tai Tham Vowels Below
[documented U+1A69 & U+1A6A]
L2/18-241 Anderson, Deborah; et al. (2018-07-25), "15. Tai Tham", Recommendations to UTC # 156 July 2018 on Script Proposals
[documented U+1A69 & U+1A6A]
L2/18-183 Moore, Lisa (2018-11-20), "D.12 Positioning of Tai Tham vowels below", UTC #156 Minutes
[documented U+1A69 & U+1A6A]
  a. ^ Proposed code points and character names may differ from final code points and names
  b. ^ Changes to characters may have first taken effect in a later version of Unicode
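
The counts quoted for the block can be cross-checked with a little arithmetic. The following is a minimal sketch in Python; the helper names `parse_ranges` and `count_code_points` are hypothetical and exist only for this illustration. It sums the final code point ranges from the first row of the table and compares the total with the block range U+1A20..U+1AAF.

```python
# Final code point ranges quoted in the first row of the table above.
RANGES = "U+1A20..1A5E, 1A60..1A7C, 1A7F..1A89, 1A90..1A99, 1AA0..1AAD"
BLOCK_FIRST, BLOCK_LAST = 0x1A20, 0x1AAF  # block range U+1A20..U+1AAF


def parse_ranges(spec):
    """Turn 'U+1A20..1A5E, 1A60..1A7C' into a list of (first, last) integer pairs."""
    pairs = []
    for part in spec.split(","):
        first, last = part.strip().replace("U+", "").split("..")
        pairs.append((int(first, 16), int(last, 16)))
    return pairs


def count_code_points(pairs):
    """Count the code points covered by inclusive (first, last) pairs."""
    return sum(last - first + 1 for first, last in pairs)


assigned = count_code_points(parse_ranges(RANGES))
block_size = BLOCK_LAST - BLOCK_FIRST + 1
print(assigned, block_size, block_size - assigned)  # 127 144 17
```

The three numbers agree with the 127 assigned, 144 total and 17 reserved code points stated for the block.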

Encoding of Subscript Consonants

Base and subscript consonants have different encodings because words such as ᨲᩥ᩠ᨠ and ᨲᩥᨠ are different in both appearance and sound. Subscript consonants are encoded as a sequence of 2 characters. The second is the base character and the first is the special character U+1A60 TAI THAM SIGN SAKOT.[3]: Section 2 
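
As a rough illustration of the two-character scheme, the sketch below builds a subscript consonant by placing SAKOT before the base letter. The helper name `subscript` is hypothetical, and the spellings of the two example words (<HIGH TA, SIGN I, SAKOT, HIGH KA> and <HIGH TA, SIGN I, HIGH KA>) are an assumption read off the glyphs above rather than taken from the proposal.

```python
SAKOT = "\u1a60"  # TAI THAM SIGN SAKOT


def subscript(consonant):
    """Encode the subscript form of a consonant: SAKOT first, then the base letter."""
    return SAKOT + consonant


HIGH_TA, SIGN_I, HIGH_KA = "\u1a32", "\u1a65", "\u1a20"

# Assumed spellings of the two contrasting example words.
with_subscript = HIGH_TA + SIGN_I + subscript(HIGH_KA)  # final KA written below
with_base = HIGH_TA + SIGN_I + HIGH_KA                  # final KA written on the line

for word in (with_subscript, with_base):
    print(" ".join(f"U+{ord(ch):04X}" for ch in word))
```

The two strings differ only by the SAKOT, which is what lets a renderer choose between the base and the subscript glyph.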

If a consonant has two subscript forms and the choice affects the meaning, the form typically used for syllable-final consonants is encoded with SAKOT, and the other form has its own code point. There are 7 consonants which have different subscript forms in this way, namely ᩁ RA, ᩃ LA, ᨷ BA, ᩈ HIGH SA, ᨾ MA, ᨮ HIGH RATHA, and ᨻ LOW PA.

ᨣᩕᩪ (Northern Thai pronunciation: [kʰuː]) is encoded as <U+1A23 LOW KA, U+1A55 MEDIAL RA, U+1A6A SIGN UU> but ᨠᩣ᩠ᩁ (IPA: [kaːn]) is encoded as <U+1A20 HIGH KA, U+1A63 SIGN AA, U+1A60 SAKOT, U+1A41 RA>[3]: Section 4 

ᩆᩦ᩠ᩃ (IPA: [siːn]) is encoded as <U+1A46 HIGH SHA, U+1A66 SIGN II, U+1A60 SAKOT, U+1A43 LA>,[3]: Section 14.5 but ᨸᩖᩦ (IPA: [piː]) is encoded as <U+1A38 HIGH PA, U+1A56 MEDIAL LA, U+1A66 SIGN II>.[3]: Section 4 (For the use of LA as a syllable-final letter, compare ᩁᨭᩛᨷᩣ᩠ᩃ[3]: Section 4 (Northern Thai pronunciation: [lat tha baːn]).)
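
Sequences like these are easier to compare when each string is dumped as a list of code points and character names. The following sketch uses Python's standard `unicodedata` module; `describe` is a hypothetical helper, and the names it prints are the full Unicode character names (for example TAI THAM SIGN SAKOT), not the short labels used in this article.

```python
import unicodedata


def describe(text):
    """List each character as 'U+XXXX NAME' in encoding order."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


khuu = "\u1a23\u1a55\u1a6a"        # <LOW KA, MEDIAL RA, SIGN UU>
kaan = "\u1a20\u1a63\u1a60\u1a41"  # <HIGH KA, SIGN AA, SAKOT, RA>

for line in describe(khuu) + describe(kaan):
    print(line)
```

The medial RA appears as a single code point in the first word, while the syllable-final RA in the second is the pair SAKOT plus RA.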

U+1A57 SIGN LA TANG LAI looks like <U+1A60 SAKOT, U+1A43 LA> but is in origin a ligature of it with <U+1A60 SAKOT, U+1A26 NGA>. Tai Lue uses it to write the word ᨴᩢ᩵ᩗᩣ (IPA: [taŋ laːi]).[6]

ᨣᩝᩴ (IPA: [kɔː bɔː]) is encoded as <U+1A23 LOW KA, U+1A5D SIGN BA, U+1A74 MAI KANG>, but ᨠᩢ᩠ᨷ (IPA: [kap]) is encoded as <U+1A20 HIGH KA, U+1A62 MAI SAT, U+1A60 SAKOT, U+1A37 BA> and ᨠᩢᨷ᩠ᨷ᩺ (IPA: [kap]) is encoded as <U+1A20 HIGH KA, U+1A62 MAI SAT, U+1A37 BA, U+1A60 SAKOT, U+1A37 BA, U+1A7A RA HAAM>.

In the final proposal,[3]: 1 which the Unicode Consortium accepted, what is now SIGN BA (as in ᨣᩝᩴ) was to be encoded as <SAKOT, BA> and what is now <SAKOT, BA> (as in ᨠᩢ᩠ᨷ) was to be encoded as <SAKOT, HIGH PA>. During the ISO process, however, the meaning of <SAKOT, BA> changed[5] and SIGN BA was added. The original meaning of <SAKOT, HIGH PA> nevertheless remains for words from Thai that have ป as a syllable-final consonant. (This proposal mistakenly calls <SAKOT, HIGH PA> <SAKOT, HIGH PHA>.)

Pali uses HIGH PA instead of BA in Laos and northeast Thailand. One should therefore be prepared to find <SAKOT, BA> encoded as <U+1A60 SAKOT, U+1A38 HIGH PA> in Pali.
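
One way to be "prepared" for both spellings when searching or comparing Pali text is to fold one onto the other before matching. The sketch below is only an assumption about how such a tool might work; `fold_subscript_pa` and `same_when_folded` are hypothetical names, and the folding would over-match in Thai loanwords, where <SAKOT, HIGH PA> keeps its own meaning.

```python
SAKOT, BA, HIGH_PA = "\u1a60", "\u1a37", "\u1a38"


def fold_subscript_pa(text):
    """Map <SAKOT, HIGH PA> to <SAKOT, BA> so that both spellings compare equal."""
    return text.replace(SAKOT + HIGH_PA, SAKOT + BA)


def same_when_folded(a, b):
    """Compare two strings after folding the subscript HIGH PA spelling."""
    return fold_subscript_pa(a) == fold_subscript_pa(b)


# <HIGH KA, MAI SAT, SAKOT, BA> against the same word spelled with HIGH PA.
with_ba = "\u1a20\u1a62" + SAKOT + BA
with_high_pa = "\u1a20\u1a62" + SAKOT + HIGH_PA
print(same_when_folded(with_ba, with_high_pa))  # True
```

The fold is intended for matching only; stored text should keep whichever spelling the author used.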

Tai Khuen has two ways of writing subscript HIGH SA, and they are not interchangeable. In Tai Khuen, ᩃᩮᩞ is correct and ᩃᩮ᩠ᩈ is wrong,[5] whereas ᩈᨶ᩠ᨶᩥᩅᩤ᩠ᩈ is correct and ᩈᨶ᩠ᨶᩥᩅᩤᩞ is wrong. ᩃᩮᩞ is encoded as <U+1A43 LA, U+1A6E SIGN E, U+1A5E SIGN SA> while the incorrect ᩃᩮ᩠ᩈ is encoded as <U+1A43 LA, U+1A6E SIGN E, U+1A60 SAKOT, U+1A48 HIGH SA>.

Tai Khuen has an additional way of writing subscript MA, and there is a special code point for it.[4]: Item 9 The word which Northern Thai writes as ᨵᨾ᩠ᨾ᩺ is written in Tai Khuen both as ᨵᨾ᩠ᨾ᩼, encoded as <U+1A35 LOW THA, U+1A3E MA, U+1A60 SAKOT, U+1A3E MA, U+1A7C KARAN>, and as ᨵᨾᩜ᩼, encoded as <U+1A35 LOW THA, U+1A3E MA, U+1A5C SIGN MA, U+1A7C KARAN>.

There are two ways of writing the subscript for both HIGH RATHA and LOW PA. ᨶᩥᨣᨱᩛ[7]: 368 is encoded as <U+1A36 NA, U+1A65 SIGN I, U+1A23 LOW KA, U+1A31 RANA, U+1A5B SIGN HIGH RATHA OR LOW PA>, whereas ᩁᩣᨩᨽᩢ᩠ᨮ[3]: 3 is encoded as <U+1A41 RA, U+1A63 SIGN AA, U+1A29 LOW CA, U+1A3D LOW PHA, U+1A62 MAI SAT, U+1A60 SAKOT, U+1A2E HIGH RATHA>. Likewise, ᨶᩥᨻᩛᩣᨶ is encoded as <U+1A36 NA, U+1A65 SIGN I, U+1A3B LOW PA, U+1A5B SIGN HIGH RATHA OR LOW PA, U+1A63 SIGN AA, U+1A36 NA>, whereas ᨴᩮ᩠ᨻ is encoded as <U+1A34 LOW TA, U+1A6E SIGN E, U+1A60 SAKOT, U+1A3B LOW PA>. The latter word is also written as ᨴᩮ᩠ᨷ. The Lao-style consonant conjunct ᨲ᩠ᨳ (encoded as <U+1A32 HIGH TA, U+1A60 SAKOT, U+1A33 HIGH THA>) looks as though it were ᨲᩛ, encoded as <U+1A32 HIGH TA, U+1A5B SIGN HIGH RATHA OR LOW PA>. The shape of U+1A5B depends upon the consonant to which it is subscript.

The dependent vowel of words like ᨯᩬᨠ 'flower' is encoded by the special vowel <U+1A6C SIGN OA BELOW>; one should not use the sequence <U+1A60 SAKOT, U+1A4B LETTER A>. There is also an encoded dependent vowel, U+1A6D SIGN OY, for Tai Khuen, Tai Lue and Lao words such as ᨶ᩶ᩭ. This vowel is not encoded as <U+1A6C SIGN OA BELOW, U+1A60 SAKOT, U+1A3F LOW YA> (which is what Northern Thai uses for the corresponding words), nor is it the sequence <U+1A60 SAKOT, U+1A40 HIGH YA>.[3]: Section 5
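
A text checker could flag the discouraged <SAKOT, LETTER A> sequence and offer SIGN OA BELOW instead. The following is a minimal sketch under that assumption; `find_sakot_letter_a` and `suggest_fix` are hypothetical helpers, and a real tool would want a human to confirm each replacement.

```python
SAKOT = "\u1a60"
LETTER_A = "\u1a4b"
SIGN_OA_BELOW = "\u1a6c"


def find_sakot_letter_a(text):
    """Return the index of every <SAKOT, LETTER A> pair in text."""
    pair = SAKOT + LETTER_A
    hits, start = [], text.find(pair)
    while start != -1:
        hits.append(start)
        start = text.find(pair, start + 1)
    return hits


def suggest_fix(text):
    """Replace each <SAKOT, LETTER A> pair with SIGN OA BELOW (assumes the vowel was meant)."""
    return text.replace(SAKOT + LETTER_A, SIGN_OA_BELOW)


# 'flower' misspelled as <DA, SAKOT, LETTER A, HIGH KA>.
misspelled = "\u1a2f" + SAKOT + LETTER_A + "\u1a20"
print(find_sakot_letter_a(misspelled))                # [1]
print(suggest_fix(misspelled) == "\u1a2f\u1a6c\u1a20")  # True
```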

Superscript Consonants

Superscript consonants are encoded independently of the base consonants. Some characters serve both as superscript consonants and in other roles, and are therefore discussed further in this section.

Niggahita is encoded as U+1A74 MAI KANG. Superscript WA is not encoded separately; it too is encoded as MAI KANG. For example, Tai Khuen ᨯ᩠ᨿᩴ (IPA: [deu]) is encoded as <U+1A2F DA, U+1A60 SAKOT, U+1A3F LOW YA, U+1A74 MAI KANG>. For the purposes of character sequencing, MAI KANG is generally treated as a vowel.

Superscript cluster-initial NGA is encoded as U+1A58 MAI KANG LAI. Note that Lao generally uses the same glyph for MAI KANG LAI and U+1A59 SIGN FINAL NGA.

U+1A62 MAI SAT serves three roles: it is a vowel, a final consonant, and a vowel shortener.

Choosing the encoding of the superscript form of RA and the vowel killers was difficult. In the 1940s the Tai Khuen wrote the consonant and the vowel killer the same way. The proposers of the encoding made enquiries and were told that the glyphs were still the same and therefore encoded them both as U+1A7A RA HAAM. It was then learnt that the Tai Khuen had changed the glyphs of the vowel killer, and a new character U+1A7C KARAN was added for the Tai Khuen style of the vowel killer. Some Northern Thai writers prefer to use U+1A7C as the vowel killer, and indeed the use of its glyph is not unknown in Northern Thai handwriting.

Special Consonants

The special forms ᩓ and ᩕ are encoded by the code points U+1A53 and U+1A55 respectively.

If the glyphs of U+1A36 NA and U+1A63 SIGN AA would be side by side, they are written as the ligature ᨶᩣ rather than as two separate glyphs ᨶ‌ᩣ. They are written as a ligature even if the NA has a subscript consonant or a non-following mark attached. Examples: ᨾᨶ᩠ᨲᩣ (IPA: [man taː], encoding <U+1A3E MA, U+1A36 NA, U+1A60 SAKOT, U+1A32 HIGH TA, U+1A63 SIGN AA>) and ᨶᩮᩢᩣ (IPA: [nau], encoding <U+1A36 NA, U+1A6E SIGN E, U+1A62 MAI SAT, U+1A63 SIGN AA>). Subscript NA and SIGN AA do not similarly ligate, e.g. ᩉ᩠ᨶᩣ (IPA: [naː], encoded <U+1A49 HIGH HA, U+1A60 SAKOT, U+1A36 NA, U+1A63 SIGN AA>).
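
The ligature rule can be approximated in code. The sketch below is not a rendering engine and rests on two assumptions that do not come from the article: that "non-following" marks are the non-spacing marks (general category Mn) together with the vowels written before the consonant, taken here to be U+1A6E..U+1A72, and that a NA directly after SAKOT is always a subscript. The function name `na_aa_ligature_positions` is hypothetical.

```python
import unicodedata

SAKOT, NA, SIGN_AA = "\u1a60", "\u1a36", "\u1a63"
# Assumption: the vowels written before the consonant are U+1A6E..U+1A72.
LEADING_VOWELS = {chr(cp) for cp in range(0x1A6E, 0x1A73)}


def na_aa_ligature_positions(text):
    """Return indexes of base NA letters that this sketch expects to ligate with SIGN AA."""
    hits = []
    for i, ch in enumerate(text):
        # Only a base NA qualifies; a NA right after SAKOT is a subscript.
        if ch != NA or (i > 0 and text[i - 1] == SAKOT):
            continue
        j = i + 1
        while j < len(text):
            if text[j] == SAKOT and j + 1 < len(text):
                j += 2  # skip a subscript consonant (SAKOT + letter)
            elif text[j] in LEADING_VOWELS or unicodedata.category(text[j]) == "Mn":
                j += 1  # skip marks that are not drawn after the consonant
            else:
                break
        if j < len(text) and text[j] == SIGN_AA:
            hits.append(i)
    return hits


print(na_aa_ligature_positions("\u1a3e\u1a36\u1a60\u1a32\u1a63"))  # [1]
print(na_aa_ligature_positions("\u1a36\u1a6e\u1a62\u1a63"))        # [0]
print(na_aa_ligature_positions("\u1a49\u1a60\u1a36\u1a63"))        # []
```

The three calls correspond to the three example words in the paragraph above: the first two contain a base NA that ligates, the third a subscript NA that does not.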

The geminate consonant ᩔ is encoded separately because the word ᩅᩥᩈᩮ᩠ᩈ (Northern Thai pronunciation: [wiseːt], encoding <U+1A45 WA, U+1A65 SIGN I, U+1A48 HIGH SA, U+1A6E SIGN E, U+1A60 SAKOT, U+1A48 HIGH SA>) has an appearance very different from ᩅᩥᩔᩮ, but one may have occasion to fold the final syllable to <HIGH SA, SAKOT, HIGH SA, SIGN E>. Indeed, in 2019 to 2020 there was a campaign to establish the latter as its standard spelling.

By contrast, the geminate consonant ᨬ᩠ᨬ is encoded as the conjunct <U+1A2C NYA, U+1A60 SAKOT, U+1A2C NYA>, even though some of its glyphs may resemble the hypothetical conjunct ᨱ᩠ᨬ <U+1A31 RANA, U+1A60 SAKOT, U+1A2C NYA>.

Independent Vowels

The independent vowel ᩋ and the consonant ᩋ are the same character, U+1A4B.

The independent vowel ᩋᩣ and the sequence of the consonant ᩋ and dependent vowel ᩣ have the same appearance ᩋᩣ and are therefore both encoded <U+1A4B LETTER A, U+1A63 SIGN AA>.

Northern Thai uses 5 independent vowels with their own code points, namely ᩍ, ᩎ, ᩏ, ᩐ and ᩑ.[3]: Section 3

In Northern Thai the 8th independent vowel is no different from the sequence of the consonant ᩋ and dependent vowel ᩰ, i.e. ᩋᩰ, and they are therefore both encoded <U+1A4B LETTER A, U+1A70 SIGN OO>. Other languages use a distinct character ᩒ (U+1A52 LETTER OO) for the independent vowel.

Character Order within Text

The encoding proposal[3] defined the ordering of Unicode characters.

As in the encodings of Burmese, Khmer, and Indian scripts, Unicode characters are ordered according to the order of the sounds, except in special cases[9] or where 2 sounds have combined into a single sound, in which case the old order is used. This order is usually the same as in Siamese. If the sounds do not have an order, then the visual order or a special alternative order is used.

There are special rules for:

(a) The ordering of vowels
(b) The writing of mai kia in all its variants
(c) The writing of mai kua in all its variants
(d) The writing of mai kam
(e) The writing of tone marks

The ordering of Unicode characters for consonants and vowels is: onset letters, true vowel marks, coda consonants, onset letters, true vowel marks, coda consonants.[3]: Section 14 For convenience, vowel killers are reckoned as vowels.

The 'onset letters' are consonants, independent vowels or special symbols. The consonants in a group are ordered according to the order in which they are sounded or used to be sounded.

Example: ᨻᩩᨴ᩠ᨵ (Northern Thai pronunciation: [put thaʔ])

onset letter: ᨻ (LOW PA)
pure vowel: ᩩ (SIGN U)
final 'consonant': ᨴ (LOW TA)
onset letter: ᩠ᨵ (SAKOT, LOW THA)
pure vowel: no symbol
final consonant: none

The encoding is <U+1A3B LOW PA, U+1A69 SIGN U, U+1A34 LOW TA, U+1A60 SAKOT, U+1A35 LOW THA>

Example: ᨻᩕ has a single consonant sound (Northern Thai pronunciation: [pʰ]), but formerly had 2 sounds, namely those of ᨻ and then ᩁ, as in central Thai. This word is encoded as <LOW PA, MEDIAL RA>.

Apart from MEDIAL RA, the order of the consonant glyphs is the same as the order of the sounds. In most cases MEDIAL RA is the last consonant but the WA of /ua/ and the LOW YA of /ia/ follow MEDIAL RA.

Examples:

ᩆᩣᩈ᩠ᨲᩕ᩺ is encoded <U+1A46 HIGH SHA, U+1A63 SIGN AA, U+1A48 HIGH SA, U+1A60 SAKOT, U+1A32 HIGH TA, U+1A55 MEDIAL RA, U+1A7A RA HAAM>.
ᨠᩕᩈᩢ᩠ᨲ is encoded <U+1A20 HIGH KA, U+1A55 MEDIAL RA, U+1A48 HIGH SA, U+1A62 MAI SAT, U+1A60 SAKOT, U+1A32 HIGH TA>.
ᩈᩕ᩠ᩅᨾ is encoded <U+1A48 HIGH SA, U+1A55 MEDIAL RA, U+1A60 SAKOT, U+1A45 WA, U+1A3E MA>.
But ᨲᩕ᩠ᨶᩬᨾ (Northern Thai pronunciation: [tʰa nɔːm])[7]: 269  is encoded <U+1A32 HIGH TA, U+1A55 MEDIAL RA, U+1A60 SAKOT, U+1A36 NA, U+1A6C SIGN OA BELOW, U+1A3E MA>

For words like ᨧᩮᩢ᩶ᩣ there is the rule that symbols for vowels and tones have the order:[3]: Section 5 first part, 5.3 and 13 

(1) leading vowels
(2) vowels below (top to bottom)
(3) vowels above (bottom to top)
(4) tone marks (left to right)
(5) trailing vowels (left to right)

In the application of these rules, MAI KANG is reckoned as a vowel even when it functions as niggahita or as a consonant. Likewise, the Unicode character MAI SAT is reckoned as a vowel even when it functions as a consonant (i.e. as mai kak, a final consonant) or as a vowel shortener, as in ᨸᩮᩢ᩠ᨯ.

The relative ordering of the marks above and below should follow Thai and Lao as in เจ้า เกี่ว ชุํ and ບິ່.

Examples:

ᨧᩮᩢ᩶ᩣ is encoded as <U+1A27 HIGH CA, U+1A6E SIGN E, U+1A62 MAI SAT, U+1A76 TONE-2, U+1A63 SIGN AA>[3]: Section 5 no. 29 
ᨾᩢᩣ (IPA: [maːk]) is encoded as <U+1A3E MA, U+1A62 MAI SAT, U+1A63 SIGN AA>
ᩃᩪᩢ (IPA: [luːk]) is encoded as <U+1A43 LA, U+1A6A SIGN UU, U+1A62 MAI SAT>
ᨶᩮᩢᩣ is encoded as <U+1A36 NA, U+1A6E SIGN E, U+1A62 MAI SAT, U+1A63 SIGN AA>
ᩋᩫᨶ᩠ᨲᩕᩣ᩠ᨿ (Northern Thai pronunciation: [on thaʔ laːi]) is encoded as <U+1A4B LETTER A, U+1A6B SIGN O, U+1A36 NA, U+1A60 SAKOT, U+1A32 HIGH TA, U+1A55 MEDIAL RA, U+1A63 SIGN AA, U+1A60 SAKOT, U+1A3F LOW YA>
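
The five-step rule for vowels and tone marks can be turned into a simple order check. In the sketch below the assignment of each code point to a step is an assumption made for illustration, based on where the mark is normally drawn, and is not taken from the proposal; `RANK` and `marks_in_order` are hypothetical names. The check is meant for the marks of a single consonant cluster, and special combinations such as /am/, written <SIGN AA, MAI KANG>, fall outside it.

```python
# Assumed step for each mark: 1 leading, 2 below, 3 above, 4 tone, 5 trailing.
RANK = {}
for cp in range(0x1A6E, 0x1A73):   # SIGN E .. SIGN THAM AI, written before the consonant
    RANK[chr(cp)] = 1
for cp in (0x1A69, 0x1A6A, 0x1A6C):  # SIGN U, SIGN UU, SIGN OA BELOW
    RANK[chr(cp)] = 2
for cp in (0x1A62, 0x1A65, 0x1A66, 0x1A67, 0x1A68, 0x1A6B, 0x1A73, 0x1A74):
    RANK[chr(cp)] = 3              # marks above, including MAI SAT and MAI KANG
for cp in range(0x1A75, 0x1A7A):   # TONE-1 .. KHUEN TONE-5
    RANK[chr(cp)] = 4
for cp in (0x1A61, 0x1A63, 0x1A64, 0x1A6D):  # SIGN A, SIGN AA, SIGN TALL AA, SIGN OY
    RANK[chr(cp)] = 5


def marks_in_order(cluster):
    """True if the ranked marks in one cluster never step backwards; other characters are ignored."""
    ranks = [RANK[ch] for ch in cluster if ch in RANK]
    return ranks == sorted(ranks)


print(marks_in_order("\u1a27\u1a6e\u1a62\u1a76\u1a63"))  # True
print(marks_in_order("\u1a43\u1a6a\u1a62"))              # True
print(marks_in_order("\u1a27\u1a63\u1a76"))              # False
```

The first two calls use the encodings given above for <HIGH CA, SIGN E, MAI SAT, TONE-2, SIGN AA> and <LA, SIGN UU, MAI SAT>; the third puts a trailing vowel before a tone mark and is rejected.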

For /ia/ and /ua/ in all their forms, subscript LOW YA and WA are reckoned as onset consonants.[3]: Section 14.3 

Examples:

ᩈ᩠ᨿᩮ is actually encoded <U+1A48 HIGH SA, U+1A60 SAKOT, U+1A3F LOW YA, U+1A6E SIGN E>[3]: Section 5 No. 33 
ᨸ᩠ᩃ᩠ᨿ᩵ᩁ is actually encoded <U+1A38 HIGH PA, U+1A60 SAKOT, U+1A43 LA, U+1A60 SAKOT, U+1A3F LOW YA, U+1A75 TONE-1, U+1A41 RA>[3]: Section 14.9 
ᨲ᩠ᩅᩫ is actually encoded <U+1A32 HIGH TA, U+1A60 SAKOT, U+1A45 WA, U+1A6B SIGN O>[3]: Section 14.3 
ᩈ᩠ᩅ᩵ᩁ is actually encoded <U+1A48 HIGH SA, U+1A60 SAKOT, U+1A45 WA, U+1A75 TONE-1, U+1A41 RA>
ᨠᩖ᩠ᩅ᩠᩶ᨿ is actually encoded as <U+1A20 HIGH KA, U+1A56 MEDIAL LA, U+1A60 SAKOT, U+1A45 WA, U+1A60 SAKOT, U+1A76 TONE-2, U+1A3F LOW YA>
(<U+1A60, U+1A76> is canonically equivalent to <U+1A76, U+1A60>)
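
The canonical equivalence noted above can be observed with Python's standard `unicodedata` module. This is a small sketch: it prints the canonical combining classes of the two marks and then checks that both orders normalize to the same string.

```python
import unicodedata

SAKOT, TONE_2 = "\u1a60", "\u1a76"

# Canonical combining classes decide the normalized order of adjacent marks.
print(unicodedata.combining(SAKOT), unicodedata.combining(TONE_2))

sakot_first = SAKOT + TONE_2
tone_first = TONE_2 + SAKOT

# Both orders normalize to the same sequence, so they are canonically equivalent.
same = unicodedata.normalize("NFC", sakot_first) == unicodedata.normalize("NFC", tone_first)
print(same)  # True
```

Comparing text after normalization, rather than code point by code point, avoids treating the two orders as different spellings.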

Outside Northern Thailand, the MAI KANG in the symbol for /am/ is written on the SIGN AA component. In Northern Thailand, it is positioned variously – on the consonant, on the SIGN AA and between them. The Unicode Consortium refused a special character for the combination. The word ᨷᩴ᩠᩵ᨾᩣ (Northern Thai pronunciation: [bɔːmaː]) should not appear to have the same vowel as ᨲ᩵ᩣᩴ (IPA: [tam]). The combination for /am/ is therefore encoded as <U+1A63 SIGN AA, U+1A74 MAI KANG>. The word ᨷᩴ᩠᩵ᨾᩣ is encoded as <U+1A37 BA, U+1A74 MAI KANG, U+1A75 TONE-1, U+1A60 SAKOT, U+1A3E MA, U+1A63 SIGN AA>. The word ᨲ᩵ᩣᩴ is encoded as <U+1A32 HIGH TA, U+1A75 TONE-1, U+1A63 SIGN AA, U+1A74 MAI KANG>. The combination for /am/ with SIGN TALL AA is encoded as <U+1A64 SIGN TALL AA, U+1A74 MAI KANG>.

U+1A5A SIGN LOW PA is a special case; the Tai Lue word ᨣᨽᩚ (IPA: [kap phaʔ]) is encoded as <U+1A23 LOW KA, U+1A3D LOW PHA, U+1A5A SIGN LOW PA>.[3]: Section 4 

Examples showing mai kang lai and la tang lai:

Pali word ᩈᩘᨥᩮᩣ (saṅgho) is encoded <U+1A48 HIGH SA, U+1A58 MAI KANG LAI, U+1A25 LOW KHA, U+1A6E SIGN E, U+1A63 SIGN AA>.
Northern Thai word ᨴᩘ᩠ᩃᩣ᩠ᨿ (Northern Thai pronunciation: [taŋ laːi]) is encoded <U+1A34 LOW TA, U+1A58 MAI KANG LAI, U+1A60 SAKOT, U+1A43 LA, U+1A63 SIGN AA, U+1A60 SAKOT, U+1A3F LOW YA>.
Tai Lue word ᨴᩢᩗᩣ (Tai Lue pronunciation: [taŋ laːi]) is encoded <U+1A34 LOW TA, U+1A62 MAI SAT, U+1A57 LA TANG LAI, U+1A63 SIGN AA>.

External links

  • Chew, P., Saengboon, P., & Wordingham, R. (2015). "Tai Tham: A Hybrid Script that Challenges Current Encoding Models". Presented at the Internationalization and Unicode Conference (IUC 39).

References

  1. ^ "Unicode character database". The Unicode Standard. Retrieved 2023-07-26.
  2. ^ "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2023-07-26.
  3. ^ a b c d e f g h i j k l m n o p q r s t Everson, Michael; Hosken, Martin; Constable, Peter (21 March 2007). "Revised proposal for encoding the Lanna script in the BMP of the UCS" (PDF). Unicode.
  4. ^ a b "Tai Tham Ad-hoc Meeting Report (WG2 N3379)" (PDF). Unicode. 22 January 2008.
  5. ^ a b c Hosken, Martin (28 January 2008). "Tai Tham Subjoined Variants" (PDF). Unicode.
  6. ^ Khotsimeuang, Veomany. "Tai Lue: Complex Orthographic Rules: Graphic Blends(I)". SEAsite. Retrieved 10 June 2018.
  7. ^ a b Rungruengsi, Udom (January 2004). Lanna-Thai Dictionary: Maefahluang Edition พจนานุกรมล้านนา ~ ไทย: ฉบับแม่ฟ้าหลวง (in Thai). Chiang Mai: Chiang Mai University. ISBN 974-685-175-6.
  8. ^ Read as COENG i.e. U+17D2 KHMER SIGN COENG
  9. ^ "The encoding model for Lanna is similar to that for Myanmar and Khmer, using a CEONG[8]-like character plus some combining medial-consonant characters."[3]: Section 14 

tham, unicode, block, this, article, contains, special, characters, without, proper, rendering, support, question, marks, boxes, other, symbols, tham, unicode, block, containing, characters, lanna, script, used, writing, northern, thai, khün, languages, thamra. This article contains special characters Without proper rendering support you may see question marks boxes or other symbols Tai Tham is a Unicode block containing characters of the Lanna script used for writing the Northern Thai Kam Mu ang Tai Lu and Khun languages Tai ThamRangeU 1A20 U 1AAF 144 code points PlaneBMPScriptsTai ThamMajor alphabetsTai ThamAssigned127 code pointsUnused17 reserved code pointsUnicode version history5 2 2009 127 127 Unicode documentationCode chart Web pageNote 1 2 Contents 1 History 2 Encoding of Subscript Consonants 3 Superscript Consonants 4 Special Consonants 5 Independent Vowels 6 Character Order within Text 7 External links 8 References Tai Tham 1 2 Official Unicode Consortium code chart PDF 0 1 2 3 4 5 6 7 8 9 A B C D E FU 1A2x ᨠ ᨡ ᨢ ᨣ ᨤ ᨥ ᨦ ᨧ ᨨ ᨩ ᨪ ᨫ ᨬ ᨭ ᨮ ᨯU 1A3x ᨰ ᨱ ᨲ ᨳ ᨴ ᨵ ᨶ ᨷ ᨸ ᨹ ᨺ ᨻ ᨼ ᨽ ᨾ ᨿU 1A4x ᩀ ᩁ ᩂ ᩃ ᩄ ᩅ ᩆ ᩇ ᩈ ᩉ ᩊ ᩋ ᩌ ᩍ ᩎ ᩏU 1A5x ᩐ ᩑ ᩒ ᩓ ᩔ U 1A6x U 1A7x U 1A8x ᪀ ᪁ ᪂ ᪃ ᪄ ᪅ ᪆ ᪇ ᪈ ᪉U 1A9x ᪐ ᪑ ᪒ ᪓ ᪔ ᪕ ᪖ ᪗ ᪘ ᪙U 1AAx ᪧ Notes 1 As of Unicode version 15 0 2 Grey areas indicate non assigned code pointsHistory Edit123 of the 127 code points initially encoded were proposed in L2 07 007R 3 two more U 1A5C and U 1A7C in L2 08 037R2 4 and a final pair U 1A5D and U 1A5E in L2 08 073 5 The last of these three documents modified the definitions of U 1A37 and U 1A38 given in the first of the three The following Unicode related documents record the purpose and process of defining specific characters in the Tai Tham block Version Final code points a Count L2 ID WG2 ID Document5 2 b U 1A20 1A5E 1A60 1A7C 1A7F 1A89 1A90 1A99 1AA0 1AAD 127 L2 99 245 N2042 Everson Michael McGowan Rick 1999 07 20 Unicode Technical Report 3 Early Aramaic Balti Kirat Limbu Manipuri Meitei and Tai Lu 
scriptsX3L2 94 088 N1013 The Motion on the Coding of the Old Xishuang Banna Dai Writing Entering into BMP of ISO IEC 10646 1994 04 18N1099 pdf doc The motion on coding of the Old Xishuang Banna Dai Writing Entering into BMP of ISO IEC 10646 1994 10 10L2 04 351 Hosken Martin 2004 06 28 Lanna Unicode A Draft ProposalL2 05 095R Hosken Martin 2005 04 25 Lanna Unicode A ProposalL2 05 166 Kourilsky G Berment V 2005 07 15 Towards a Computerization of the Lao Tham System of WritingL2 05 188 Hosken Martin 2005 08 02 Lao Tham in Terms of Lanna a response to L2 05 166 from L2 05 095L2 06 258R N3121R Everson Michael Hosken Martin 2006 09 09 Proposal for encoding the Lanna script in the BMP of the UCSL2 06 311 N3159 Tun Ngwe 2006 09 20 Response to N3121R Proposal for encoding the Lanna script in the BMP of the UCSL2 06 319 N3161 Opinions on N3121 Lanna script 2006 09 22L2 06 320 N3169R Chen Zhuang Everson Michael Hosken Martin Wei Lin Mei 2006 09 26 Lanna ad hoc reportN3153 pdf doc Umamaheswaran V S 2007 02 16 M49 17 Unconfirmed minutes of WG 2 meeting 49 AIST Akihabara Tokyo Japan 2006 09 25 29L2 07 015 Moore Lisa 2007 02 08 Lanna C 17 UTC 110 MinutesL2 07 007R N3207 Everson Michael Hosken Martin Constable Peter 2007 03 21 Revised proposal for encoding the Lanna script in the BMP of the UCSL2 07 101 N3238 Proposing on Encoding Old Tai Lue 2007 04 03L2 07 098 N3239 Response to Chinese contribution N3238 Proposing on Encoding Old Tai Lue 2007 04 11N3353 pdf doc Umamaheswaran V S 2007 10 10 M51 2 Unconfirmed minutes of WG 2 meeting 51 Hanzhou China 2007 04 24 27L2 07 118R2 Moore Lisa 2007 05 23 111 C17 UTC 111 MinutesL2 07 268 N3253 pdf doc Umamaheswaran V S 2007 07 26 M50 10 Unconfirmed minutes of WG 2 meeting 50 Frankfurt am Main Germany 2007 04 24 27L2 07 307 N3313 Comments on Lanna encoding in FPDAM4 2007 09 06L2 07 316 N3342 Hosken Martin 2007 09 10 Response to N3313L2 07 319 N3346 Ad hoc report on Lanna 2007 09 19L2 07 322 N3349R Everson Michael 2007 09 28 Tai Tham Summary 
of repertoire for FPDAM 5 of ISO IEC 10646 2003 and future amendmentsL2 07 345 Moore Lisa 2007 10 25 Consensus 113 C10 UTC 113 MinutesL2 07 353 Whistler Ken 2007 10 10 A Lanna FDAM 4 and FPDAM 5 WG2 Consent DocketL2 08 037R2 N3379R2 Constable Peter 2008 04 18 Tai Tham Ad hoc Meeting ReportL2 08 073 N3384 Hosken Martin 2008 01 28 Tai Tham Subjoined VariantsL2 08 003 Moore Lisa 2008 02 14 Tai Tham UTC 114 MinutesL2 08 318 N3453 pdf doc Umamaheswaran V S 2008 08 13 M52 2a Unconfirmed minutes of WG 2 meeting 52L2 14 126 appendices Pournader Roozbeh 2014 05 02 Improvements requested for Unicode Indic properties two text file appendices HERE affected U 1A55 1A60 1A80 1A89 1A90 1A99 L2 14 177 Moore Lisa 2014 08 21 B 14 5 UTC 140 Minutes affected U 1A56 1A5E 1A75 1A7C 1A7F L2 17 120 Wordingham Richard 2017 05 01 Corrections to the Indic Syllabic Category for the Tai Tham Script affected U 1A57 1A5A 1A5E 1A74 1A7A L2 17 169 Pournader Roozbeh 2017 05 12 Proposed Indic Syllabic Category changes for Tai Tham for Unicode 10 affected U 1A57 1A5A 1A5E 1A74 1A7A L2 17 103 Moore Lisa 2017 05 18 B 14 9 UTC 151 Minutes affected U 1A57 1A5A 1A5E 1A74 1A7A L2 18 053 Pournader Roozbeh 2018 01 24 New Indic Syllabic Category Consonant Initial Postfixed affected U 1A5A L2 18 007 Moore Lisa 2018 03 19 B 14 7 UTC 154 Minutes affected U 1A5A L2 18 171 Wordingham Richard 2018 04 29 Positioning of Tai Tham Vowels Below documented U 1A69 amp U 1A6A L2 18 241 Anderson Deborah et al 2018 07 25 15 Tai Tham Recommendations to UTC 156 July 2018 on Script Proposals documented U 1A69 amp U 1A6A L2 18 183 Moore Lisa 2018 11 20 D 12 Positioning of Tai Tham vowels below UTC 156 Minutes documented U 1A69 amp U 1A6A Proposed code points and characters names may differ from final code points and names Changes to characters may have first taken effect in a later version of UnicodeEncoding of Subscript Consonants EditBase and subscript consonants have different encodings because words such as ᨲ ᨠ and ᨲ ᨠ are 
different in both appearance and sound Subscript consonants are encoded as a sequence of 2 characters The second is the base character and the first is the special character U 1A60 TAI THAM SIGN SAKOT 3 Section 2 If a consonant has two subscript forms and the choice affects the meaning the form typically used for syllable final consonants will be encoded with SAKOT and the other form will have its own code point There are 7 consonants which have different subscript forms in this way namely ᩁ RA ᩃ LA ᨷ BA ᩈ HIGH SA ᨾ MA ᨳ HIGH RATA and ᨻ LOW PA ᨣ Northern Thai pronunciation kʰuː is encoded as lt U 1A23 LOW KA U 1A55 MEDIAL RA U 1A6A SIGN UU gt but ᨠ ᩁ IPA kaːn is encoded as lt U 1A20 HIGH KA U 1A63 SIGN AA U 1A60 SAKOT U 1A41 RA gt 3 Section 4 ᩆ ᩃ IPA siːn is encoded as lt U 1A46 HIGH SHA U 1A66 SIGN II U 1A60 SAKOT U 1A43 LA gt 3 Section 14 5 but ᨸ IPA piː is encoded as lt U 1A38 HIGH PA U 1A56 MEDIAL LA U 1A66 SIGN II gt 3 Section 4 For the use of LA as a syllable final letter compare ᩁᨭ ᨷ ᩃ 3 Section 4 Northern Thai pronunciation lat tha baːn U 1A57 SIGN LA TANG LAI looks like lt U 1A60 SAKOT U 1A43 LA gt but is in origin a ligature of it with lt U 1A60 SAKOT U 1A26 NGA gt Tai Lue uses it to write the word ᨴ IPA taŋ laːi 6 ᨣ IPA kɔː bɔː is encoded as lt U 1A23 LOW KA U 1A5D SIGN BA U 1A74 MAI KANG gt but ᨠ ᨷ IPA kap is encoded as lt U 1A20 HIGH KA U 1A62 MAI SAT U 1A60 SAKOT U 1A37 BA gt and ᨠ ᨷ ᨷ IPA kap is encoded as lt U 1A20 HIGH KA U 1A62 MAI SAT U 1A37 BA U 1A60 SAKOT U 1A37 BA U 1A7A RA HAAM gt In the final proposal 3 1 which the Unicode Consortium accepted that what is now SIGN BA as in ᨣ would be encoded as lt SAKOT BA gt and what is now lt SAKOT BA gt as in ᨠ ᨷ should be encoded as lt SAKOT HIGH PA gt but during the ISO process the meaning of lt SAKOT BA gt changed 5 and SIGN BA was added However the original meaning of lt SAKOT HIGH PA gt remains for words from Thai that have p as a syllable final consonant This proposal mistakenly calls lt SAKOT HIGH 
PA gt lt SAKOT HIGH PHA gt Pali uses HIGH PA instead of BA in Laos and northeast Thailand One should therefore be prepared to find lt SAKOT BA gt encoded as lt U 1A60 SAKOT U 1A38 HIGH PA gt in Pali Tai Khuen has two ways of writing subscript HIGH SA They are not interchangeable In Tai Khuen to write ᩃ is correct and to write ᩃ ᩈ is wrong 5 but to write ᩈᨶ ᨶ ᩅ ᩈ is correct while to write ᩈᨶ ᨶ ᩅ is wrong ᩃ is encoded as lt U 1A43 LA U 1A6E SIGN E U 1A5E SIGN SA gt while the incorrect ᩃ ᩈ is encoded as lt U 1A43 LA U 1A6E SIGN E U 1A60 SAKOT U 1A48 HIGH SA gt Tai Khuen has an additional way of writing subscript MA There is a special codepoint for this additional method 4 Item 9 The word which Northern Thai writes as ᨵᨾ ᨾ is written in Tai Khuen both as ᨵᨾ ᨾ encoded as lt U 1A35 LOW THA U 1A3E MA U 1A60 SAKOT U 1A3E MA U 1A7C KARAN gt and as ᨵᨾ encoded as lt U 1A35 LOW THA U 1A3E MA U 1A5C SIGN MA U 1A7C KARAN gt There are two ways of writing the subscript for both HIGH RATHA and LOW PA ᨶ ᨣᨱ 7 368 is encoded as lt U 1A36 NA U 1A65 SIGN I U 1A23 LOW KA U 1A31 RANA U 1A5B SIGN HIGH RATHA OR LOW PA gt ᩁ ᨩᨽ ᨮ 3 3 is encoded lt U 1A41 RA U 1A63 SIGN AA U 1A29 LOW CA U 1A3D LOW PHA U 1A62 MAI SAT U 1A60 SAKOT U 1A2E HIGH RATHA gt ᨶ ᨻ ᨶ is encoded as lt U 1A36 NA U 1A65 SIGN I U 1A3B LOW PA U 1A5B SIGN HIGH RATHA OR LOW PA U 1A63 SIGN AA U 1A36 NA gt ᨴ ᨻ is encoded as lt U 1A34 LOW TA U 1A6E SIGN E U 1A60 SAKOT U 1A3B LOW PA gt The latter word is also written as ᨴ ᨷ The Lao style consonant conjunct ᨲ ᨳ encoded as lt U 1A32 HIGH TA U 1A60 SAKOT U 1A33 HIGH THA gt looks as though it is ᨲ encoded as lt U 1A32 HIGH TA U 1A5B SIGN HIGH RATHA OR LOW PA gt The shape of U 1A5B depends upon the consonant it is subscript to The dependent vowel of words like ᨯ ᨠ flower is encoded by the special vowel lt U 1A6C SIGN OA BELOW gt one should not use the sequence lt U 1A60 SAKOT U 1A4B LETTER A gt There is also an encoded dependent vowel for words like Tai Khuen Tai Lue and Lao words such 
as ᨶ namely U 1A6D SIGN OY This vowel is not encoded as lt U 1A6C SIGN OA BELOW U 1A60 SAKOT U 1A3F LOW YA gt which is what Northern Thai uses for the corresponding words nor is it the sequence lt U 1A60 SAKOT U 1A40 HIGH YA gt 3 Section 5 Superscript Consonants EditSuperscript consonants are encoded independently of the base consonants Some characters serve both as superscript consonants and in other roles and are therefore discussed further in this section Niggahita and is encoded as U 1A74 MAI KANG Superscript WA is not encoded separately It is encoded as MAI KANG For example Tai Khuen ᨯ ᨿ IPA deu is encoded as lt U 1A2 DA U 1A60 SAKOT U 1A3F LOW YA U 1A74 MAI KANG gt For the purposes of character sequencing it is generally treated as a vowel Superscript cluster initial NGA is encoded as U 1A58 MAI KANG LAI Note that Lao generally uses the same glyph for MAI KANG LAI and U 1A59 SIGN FINAL NGA U 1A62 MAI SAT serves three roles it is a vowel a final consonant and a vowel shortener Choosing the encoding of the superscript form of RA and the vowel killers was difficult In the 1940s the Tai Khuen wrote the consonant and the vowel killer the same way The proposers of the encoding made enquiries and were told that the glyphs were still the same and therefore encoded them both as U 1A7A RA HAAM It was then learnt that the Tai Khuen had changed the glyphs of the vowel killer and a new character U 1A7C KARAN was added for the Tai Khuen style of the vowel killer Some Northern Thai writers prefer to use U 1A7C as the vowel killer and indeed the use of its glyph is not unknown in Northern Thai handwriting Special Consonants EditThe special forms ᩓ and are encoded by the code points U 1A53 and U 1A55 respectively If the glyphs of U 1A36 NA and U 1A63 SIGN AA would be side by side they are written as the ligature ᨶ rather than as two separate glyphs ᨶ They are written as a ligature even if the NA has a subscript consonant or a non following mark attached Examples ᨾᨶ ᨲ IPA man 
taː, encoded as <U+1A3E MA, U+1A36 NA, U+1A60 SAKOT, U+1A32 HIGH TA, U+1A63 SIGN AA>, and ᨶ (IPA: nau), encoded as <U+1A36 NA, U+1A6E SIGN E, U+1A62 MAI SAT, U+1A63 SIGN AA>. Subscript NA and SIGN AA do not similarly ligate, e.g. ᩉ ᨶ (IPA: naː), encoded as <U+1A49 HIGH HA, U+1A60 SAKOT, U+1A36 NA, U+1A63 SIGN AA>.

The geminate consonant ᩔ is encoded separately because the word ᩅ ᩈ ᩈ (Northern Thai pronunciation: wiseːt; encoded as <U+1A45 WA, U+1A65 SIGN I, U+1A48 HIGH SA, U+1A6E SIGN E, U+1A60 SAKOT, U+1A48 HIGH SA>) has an appearance very different from ᩅ ᩔ, but one may have occasion to fold the final syllable to <HIGH SA, SAKOT, HIGH SA, SIGN E>. Indeed, in 2019 to 2020 there was a campaign to establish the latter as its standard spelling.

By contrast, the geminate consonant ᨬ ᨬ is encoded as the conjunct <U+1A2C NYA, U+1A60 SAKOT, U+1A2C NYA>, even though some of its glyphs may resemble the hypothetical conjunct ᨱ ᨬ, <U+1A31 RANA, U+1A60 SAKOT, U+1A2C NYA>.

Independent Vowels

The independent vowel ᩋ and the consonant ᩋ are the same character, U+1A4B. The independent vowel ᩋ and the sequence of the consonant ᩋ and the dependent vowel have the same appearance, ᩋ, and are therefore both encoded as <U+1A4B LETTER A, U+1A63 SIGN AA>.

Northern Thai uses 5 independent vowels with their own code points, namely ᩍ, ᩎ, ᩏ, ᩐ and ᩑ.[3] (Section 3) In Northern Thai, the 8th independent vowel is no different from the sequence of the consonant ᩋ and the dependent vowel, i.e. ᩋ, and they are therefore both encoded as <U+1A4B LETTER A, U+1A70 SIGN OO>. Other languages use a distinct character, ᩒ (U+1A52 LETTER OO), for the independent vowel.

Character Order within Text

The encoding proposal[3] defined the ordering of Unicode characters. As in the writing of Burmese, Khmer and Indian languages, Unicode characters are ordered according to the order of the sounds, except in special cases[9] or if 2 sounds combine into a single sound, in which case one uses the old order. This order is usually as in Siamese. If the sound does not have an order, then
one uses the visual order or a special alternative order. There are special rules for:
(a) the ordering of vowels;
(b) the writing of mai kia in all its variants;
(c) the writing of mai kua in all its variants;
(d) the writing of mai kam;
(e) the writing of tone marks.

The ordering of Unicode characters for consonants and vowels is: onset letters, true vowel marks, coda consonants, onset letters, true vowel marks, coda consonants, and so on.[3] (Section 14) For convenience, one reckons that symbols killing vowels are vowels. The onset letters are consonants, independent vowels or special symbols. The consonants in a group are ordered according to the order in which they are sounded, or used to be sounded.

Example: ᨻ ᨴ ᨵ (Northern Thai pronunciation: put thaʔ)
onset letter: ᨻ
pure vowel: SIGN U
final consonant: ᨴ
onset letter: ᨵ
pure vowel: no symbol
final consonant: none
The encoding is <U+1A3B LOW PA, U+1A69 SIGN U, U+1A34 LOW TA, U+1A60 SAKOT, U+1A35 LOW THA>.

Example: ᨻ has a single consonant sound (Northern Thai pronunciation: pʰ), but formerly had 2 sounds, namely those of ᨻ and then ᩁ, as in central Thai. This word is encoded as <LOW PA, MEDIAL RA>.

Apart from MEDIAL RA, the order of the consonant glyphs is the same as the order of the sounds. In most cases MEDIAL RA is the last consonant, but the WA of ua and the LOW YA of ia follow MEDIAL RA. Examples:
ᩆ ᩈ ᨲ is encoded as <U+1A46 HIGH SHA, U+1A63 SIGN AA, U+1A48 HIGH SA, U+1A60 SAKOT, U+1A32 HIGH TA, U+1A55 MEDIAL RA, U+1A7A RA HAAM>.
ᨠ ᩈ ᨲ is encoded as <U+1A20 HIGH KA, U+1A55 MEDIAL RA, U+1A48 HIGH SA, U+1A62 MAI SAT, U+1A60 SAKOT, U+1A32 HIGH TA>.
ᩈ ᩅᨾ is encoded as <U+1A48 HIGH SA, U+1A55 MEDIAL RA, U+1A60 SAKOT, U+1A45 WA, U+1A3E MA>.
But ᨲ ᨶ ᨾ (Northern Thai pronunciation: tʰa nɔːm)[7] (p. 269) is encoded as <U+1A32 HIGH TA, U+1A55 MEDIAL RA, U+1A60 SAKOT, U+1A36 NA, U+1A6C SIGN OA BELOW, U+1A3E MA>.

For words like ᨧ, there is the rule that symbols for vowels and tones have the following order[3] (Section 5 first part, 5.3 and 13): (1) leading vowels; (2) vowels below, top to bottom; (3) vowels above, bottom to top; (4) tone
marks, left to right; (5) trailing vowels, left to right.

In the application of these rules, MAI KANG is reckoned as a vowel even though it functions as niggahita or as a consonant. The Unicode character MAI SAT is reckoned as a vowel even though it functions as a consonant, i.e. as mai kak (a final consonant), or as a vowel shortener, as in ᨸ ᨯ. The relative ordering of the marks above and below should follow Thai and Lao.

Examples:
ᨧ is encoded as <U+1A27 HIGH CA, U+1A6E SIGN E, U+1A62 MAI SAT, U+1A76 TONE-2, U+1A63 SIGN AA>.[3] (Section 5, no. 29)
ᨾ (IPA: maːk) is encoded as <U+1A3E MA, U+1A62 MAI SAT, U+1A63 SIGN AA>.
ᩃ (IPA: luːk) is encoded as <U+1A43 LA, U+1A6A SIGN UU, U+1A62 MAI SAT>.
ᨶ is encoded as <U+1A36 NA, U+1A6E SIGN E, U+1A62 MAI SAT, U+1A63 SIGN AA>.
ᩋ ᨶ ᨲ ᨿ (Northern Thai pronunciation: on thaʔ laːi) is encoded as <U+1A4B LETTER A, U+1A6B SIGN O, U+1A36 NA, U+1A60 SAKOT, U+1A32 HIGH TA, U+1A55 MEDIAL RA, U+1A63 SIGN AA, U+1A60 SAKOT, U+1A3F LOW YA>.

For ia and ua in all their forms, subscript LOW YA and WA are reckoned as onset consonants.[3] (Section 14.3) Examples:
ᩈ ᨿ is actually encoded as <U+1A48 HIGH SA, U+1A60 SAKOT, U+1A3F LOW YA, U+1A6E SIGN E>.[3] (Section 5, no. 33)
ᨸ ᩃ ᨿ ᩁ is actually encoded as <U+1A38 HIGH PA, U+1A60 SAKOT, U+1A43 LA, U+1A60 SAKOT, U+1A3F LOW YA, U+1A75 TONE-1, U+1A41 RA>.[3] (Section 14.9)
ᨲ ᩅ is actually encoded as <U+1A32 HIGH TA, U+1A60 SAKOT, U+1A45 WA, U+1A6B SIGN O>.[3] (Section 14.3)
ᩈ ᩅ ᩁ is actually encoded as <U+1A48 HIGH SA, U+1A60 SAKOT, U+1A45 WA, U+1A75 TONE-1, U+1A41 RA>.
ᨠ ᩅ ᨿ is actually encoded as <U+1A20 HIGH KA, U+1A56 MEDIAL LA, U+1A60 SAKOT, U+1A45 WA, U+1A60 SAKOT, U+1A76 TONE-2, U+1A3F LOW YA>. (<U+1A60, U+1A76> is canonically equivalent to <U+1A76, U+1A60>.)

Outside Northern Thailand, the MAI KANG in the symbol for am is written on the SIGN AA component. In Northern Thailand it is positioned variously on the consonant, on the SIGN AA, and between them. The Unicode Consortium refused a special character for the combination. The
word ᨷ ᨾ (Northern Thai pronunciation: bɔːmaː) should not appear to have the same vowel as ᨲ (IPA: tam). The combination for am is therefore encoded as <U+1A63 SIGN AA, U+1A74 MAI KANG>.
The word ᨷ ᨾ is encoded as <U+1A37 BA, U+1A74 MAI KANG, U+1A75 TONE-1, U+1A60 SAKOT, U+1A3E MA, U+1A63 SIGN AA>.
The word ᨲ is encoded as <U+1A32 HIGH TA, U+1A75 TONE-1, U+1A63 SIGN AA, U+1A74 MAI KANG>.
The combination for am with SIGN TALL AA is encoded as <U+1A64 SIGN TALL AA, U+1A74 MAI KANG>.

U+1A5A SIGN LOW PA is a special case: the Tai Lue word ᨣᨽ (IPA: kap phaʔ) is encoded as <U+1A23 LOW KA, U+1A3D LOW PHA, U+1A5A SIGN LOW PA>.[3] (Section 4)

Examples showing mai kang lai and la tang lai:
Pali word ᩈ ᨥ (saṅgho) is encoded as <U+1A48 HIGH SA, U+1A58 MAI KANG LAI, U+1A25 LOW KHA, U+1A6E SIGN E, U+1A63 SIGN AA>.
Northern Thai word ᨴ ᩃ ᨿ (Northern Thai pronunciation: taŋ laːi) is encoded as <U+1A34 LOW TA, U+1A58 MAI KANG LAI, U+1A60 SAKOT, U+1A43 LA, U+1A63 SIGN AA, U+1A60 SAKOT, U+1A3F LOW YA>.
Tai Lue word ᨴ (Tai Lue pronunciation: taŋ laːi) is encoded as <U+1A34 LOW TA, U+1A62 MAI SAT, U+1A57 LA TANG LAI, U+1A63 SIGN AA>.

External links

Chew, P.; Saengboon, P.; Wordingham, R. (2015). Tai Tham: A Hybrid Script that Challenges Current Encoding Models. Presented at the Internationalization and Unicode Conference (IUC 39).

References

1. Unicode character database. The Unicode Standard. Retrieved 2023-07-26.
2. Enumerated Versions of The Unicode Standard. The Unicode Standard. Retrieved 2023-07-26.
3. Everson, Michael; Hosken, Martin; Constable, Peter (21 March 2007). Revised proposal for encoding the Lanna script in the BMP of the UCS (PDF). Unicode.
4. Tai Tham Ad hoc Meeting Report, WG2 N3379 (PDF). Unicode. 22 January 2008.
5. Hosken, Martin (28 January 2008). Tai Tham Subjoined Variants (PDF). Unicode.
6. Khotsimeuang, Veomany. Tai Lue Complex Orthographic Rules Graphic Blends I. SEAsite. Retrieved 10 June 2018.
7. Rungruengsi, Udom (January 2004). Lanna Thai Dictionary, Maefahluang Edition, phcnanukrmlanna ithy
chbbaemfahlwng (in Thai). Chiang Mai: Chiang Mai University. ISBN 974-685-175-6.
8. Read as COENG, i.e. U+17D2 KHMER SIGN COENG.
9. The encoding model for Lanna is similar to that for Myanmar and Khmer, using a CEONG[8]-like character plus some combining medial consonant characters.[3] (Section 14)

Retrieved from https://en.wikipedia.org/w/index.php?title=Tai_Tham_(Unicode_block)&oldid=1167345861
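
The remark under Character Order within Text that <U+1A60, U+1A76> is canonically equivalent to <U+1A76, U+1A60> can be checked with a short script. The following is a minimal sketch in Python, using only the standard `unicodedata` module; the helper names `describe` and `canonically_equivalent` are illustrative and do not come from the encoding proposal. It assumes the interpreter's Unicode database covers the Tai Tham block (Unicode 5.2 or later).

```python
import unicodedata

SAKOT = "\u1a60"   # U+1A60 SAKOT
TONE_2 = "\u1a76"  # U+1A76 TONE-2


def describe(text):
    """List each code point with its character name and canonical combining class."""
    return [
        "U+%04X %s (ccc=%d)"
        % (ord(ch), unicodedata.name(ch, "<unnamed>"), unicodedata.combining(ch))
        for ch in text
    ]


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    for line in describe(SAKOT + TONE_2):
        print(line)
    # Normalization reorders adjacent combining marks by combining class,
    # so both orders of SAKOT and TONE-2 normalize to the same string.
    print(canonically_equivalent(SAKOT + TONE_2, TONE_2 + SAKOT))
```

Because normalization may silently reorder these two marks, software comparing or searching Tai Tham text would typically normalize both sides first rather than compare raw code point sequences.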
