author | Chinese Language Team <chinese> | 1999-04-16 12:59:08 +0000 |
---|---|---|
committer | Chinese Language Team <chinese> | 1999-04-16 12:59:08 +0000 |
commit | 941e322085f2976b90a89e518a79961b06c64ff2 | |
tree | a7ca7508873d83c79dfeeef35e07ac2d38b32386 /chinese/Chinese.README | |
parent | cc96b9875a5a3d28f4653ab650d4b67a4f921d67 |
* Added the Big5 and GB mapping tables from ftp.unicode.org for reference.
* Added some ramblings in Chinese.README (foka)
CVS version numbers
chinese/BIG5.TXT: INITIAL -> 1.1
chinese/Chinese.README: 1.1 -> 1.2
chinese/GB2312.TXT: INITIAL -> 1.1
chinese/chinese.wml: 1.1 -> 1.2
Diffstat (limited to 'chinese/Chinese.README')
-rw-r--r-- | chinese/Chinese.README | 53 |
1 files changed, 52 insertions, 1 deletions
diff --git a/chinese/Chinese.README b/chinese/Chinese.README
index 225b0750463..3829fd2a726 100644
--- a/chinese/Chinese.README
+++ b/chinese/Chinese.README
@@ -1,4 +1,55 @@
-Note: This document is in Big5 code.
+Some notes and ramblings on Chinese translations (and the fun of
+maintaining both Big5 and GB pages and hope that all the characters
+show up properly. :-)
 
+Note: This document may contain Big5 code.
+
+Content Negotiation:
+~~~~~~~~~~~~~~~~~~~
+                lang    charset
+        ------  ------  -------
         zh-CN   .zh-cn  Big5
         zh-TW   .zh-tw  GB2312
+
+
+Big5<->GB... Arrgghh!
+
+Big5 is *bad*!! Its relationship to Unicode is _not_ one-to-one,
+and is giving me a lot of headaches.
+
+The following is from
+    ftp://ftp.unicode.org/Public/MAPPINGS/EASTASIA/OTHER/BIG5.TXT
+
+# WARNING! It is currently impossible to provide round-trip compatibility
+# between BIG5 and Unicode.
+#
+# A number of characters are not currently mapped because
+# of conflicts with other mappings. They are as follows:
+#
+# BIG5      Description                     Comments
+#
+# 0xA15A    SPACING UNDERSCORE              duplicates A1C4
+# 0xA1C3    SPACING HEAVY OVERSCORE         not in Unicode
+# 0xA1C5    SPACING HEAVY UNDERSCORE        not in Unicode
+# 0xA1FE    LT DIAG UP RIGHT TO LOW LEFT    duplicates A2AC
+# 0xA240    LT DIAG UP LEFT TO LOW RIGHT    duplicates A2AD
+# 0xA2CC    HANGZHOU NUMERAL TEN            conflicts with A451 mapping
+# 0xA2CE    HANGZHOU NUMERAL THIRTY         conflicts with A4CA mapping
+#
+# We currently map all of these characters to U+FFFD REPLACEMENT CHARACTER
+
+Another reference is the Big5+ standard tables. At least it won't
+leave any Big5+ codes dangling. :-) It does include a Big5+ to GBK
+table, but then, we want GB, not GBK. Hmm...
+
+
+Converter
+~~~~~~~~~
+ * Don't bother with tcs. Due to the traditional/simplified character
+   issue, tcs simply doesn't work well at all.
+
+ * utf-converter works but need more tweaking to get everything translated
+   properly.
+
+
+ -- Anthony Fok <foka@debian.org>, Fri, 16 Apr 1999 05:11:03 -0600
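
The round-trip warning quoted from BIG5.TXT in the diff above can be checked mechanically against a mapping table. The following is a minimal sketch, not part of this commit. It assumes each data row holds a Big5 code and a Unicode code point as whitespace-separated hex columns, with `#` starting a comment. The function names `parse_mapping` and `round_trip_problems` are hypothetical, and the sample rows are illustrative rather than copied from the real file; only the mapping of 0xA15A and 0xA1C3 to U+FFFD comes from the quoted warning.

```python
from collections import defaultdict

REPLACEMENT = 0xFFFD  # U+FFFD REPLACEMENT CHARACTER


def parse_mapping(lines):
    """Yield (big5_code, unicode_code) pairs from mapping-table lines.

    Assumed layout: two hex columns separated by whitespace, with '#'
    starting a comment that runs to the end of the line.
    """
    for line in lines:
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            yield int(fields[0], 16), int(fields[1], 16)


def round_trip_problems(pairs):
    """Return (unmapped, collisions) for (big5, unicode) pairs.

    unmapped:   Big5 codes that map to U+FFFD.
    collisions: Unicode code points reached from more than one Big5 code.
    """
    sources = defaultdict(list)
    for big5, uni in pairs:
        sources[uni].append(big5)
    unmapped = sorted(sources.pop(REPLACEMENT, []))
    collisions = {u: sorted(b) for u, b in sources.items() if len(b) > 1}
    return unmapped, collisions


# Illustrative sample rows, not the contents of the real BIG5.TXT.
SAMPLE = """\
# comment-only line, skipped by the parser
0xA140  0x3000  # IDEOGRAPHIC SPACE
0xA15A  0xFFFD  # unmapped, per the quoted warning
0xA1C3  0xFFFD  # unmapped, per the quoted warning
0xA4A4  0x4E2D  # an ordinarily mapped ideograph
""".splitlines()

if __name__ == "__main__":
    unmapped, collisions = round_trip_problems(parse_mapping(SAMPLE))
    print("unmapped:", [hex(code) for code in unmapped])
    print("collisions:", collisions)
```

Running the same two functions over the lines of the checked-in chinese/BIG5.TXT would list every Big5 code that cannot survive a Big5 -> Unicode -> Big5 round trip, provided the file really follows the assumed column layout.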