blob: 57320677ac09f59d41860918d8036fe814cb9a77 (
plain) (
blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
|
Some notes and ramblings on Chinese translations (and the fun of
maintaining both Big5 and GB pages and hope that all the characters
show up properly. :-)
Note: This document may contain GB2312 code.
Content Negotiation:
~~~~~~~~~~~~~~~~~~~
lang charset
------ ------ -------
zh-CN .zh-cn Big5
zh-TW .zh-tw GB2312
Big5<->GB... Arrgghh!
Big5 is *bad*!! Its relationship to Unicode is _not_ one-to-one,
and is giving me a lot of headaches.
The following is from
ftp://ftp.unicode.org/Public/MAPPINGS/EASTASIA/OTHER/BIG5.TXT
# WARNING! It is currently impossible to provide round-trip compatibility
# between BIG5 and Unicode.
#
# A number of characters are not currently mapped because
# of conflicts with other mappings. They are as follows:
#
# BIG5 Description Comments
#
# 0xA15A SPACING UNDERSCORE duplicates A1C4
# 0xA1C3 SPACING HEAVY OVERSCORE not in Unicode
# 0xA1C5 SPACING HEAVY UNDERSCORE not in Unicode
# 0xA1FE LT DIAG UP RIGHT TO LOW LEFT duplicates A2AC
# 0xA240 LT DIAG UP LEFT TO LOW RIGHT duplicates A2AD
# 0xA2CC HANGZHOU NUMERAL TEN conflicts with A451 mapping
# 0xA2CE HANGZHOU NUMERAL THIRTY conflicts with A4CA mapping
#
# We currently map all of these characters to U+FFFD REPLACEMENT CHARACTER
Another reference is the Big5+ standard tables. At least it won't
leave any Big5+ codes dangling. :-) It does include a Big5+ to GBK
table, but then, we want GB, not GBK. Hmm...
Converter
~~~~~~~~~
* Don't bother with tcs. Due to the traditional/simplified character
issue, tcs simply doesn't work well at all.
* utf-converter works but need more tweaking to get everything translated
properly.
-- Anthony Fok <foka@debian.org>, Fri, 16 Apr 1999 05:11:03 -0600
|