This article discusses the various issues related to Chinese domain names and character encoding faced by users in China. It highlights problems such as the inability to send certain Chinese characters using Internet Explorer, the necessity to prefix "http://" for browsing Chinese domains, and the limitations of different character sets like GB2312, GBK, and GB18030. Additionally, it explores common concerns regarding simplified and traditional characters in domain names. The insights aim to clarify the complexity of encoding issues for users unfamiliar with technicalities.
Some Chinese Domain Name Issues
Zhang Wenhui
China Internet Network Information Center (CNNIC)
zwh@cnnic.net.cn
Problems on Windows and Browsers
• Some common Chinese characters cannot be sent by IE, ping, telnet, or NetTerm on Windows
• Users must type “http://” when using IE to browse a Chinese domain name
• Some versions of IE do not support UTF-8 correctly
Character Set
• GB 2312: the current standard, but not widely used; more than 6,000 characters; some characters that people use daily cannot be displayed
• GBK: the recommended standard; more than 20,954 characters; widely used at present
• GB18030: will become the mandatory standard at the end of this year; 27,487 characters
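Character Set (cont.)
A "high byte" here is a byte with a value of 0x80 or above; a "low byte" falls in the ASCII range below 0x80.
• GB 2312: two bytes
  • both bytes are high bytes
• GBK: two bytes
  • the first byte is a high byte; the second byte may be a low byte
• GB18030: two or four bytes
  • odd-numbered bytes are high bytes; even-numbered bytes may be low bytes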
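The byte layouts above can be checked with a short Python sketch. The helper name `byte_pattern` and the three sample characters are chosen here for illustration and are not from the slides; the codecs are the ones shipped with Python's standard library.

```python
def byte_pattern(text, codec):
    """Encode text and mark each byte: 'H' if >= 0x80 (high), 'L' if < 0x80 (low)."""
    return "".join("H" if b >= 0x80 else "L" for b in text.encode(codec))

# Sample characters picked for illustration only:
#   朱 is in GB 2312, 丂 is a GBK character, 㐀 needs four bytes in GB18030.
samples = [("朱", "gb2312"), ("丂", "gbk"), ("㐀", "gb18030")]

for ch, codec in samples:
    encoded = ch.encode(codec)
    print(codec, ch, encoded.hex(), byte_pattern(ch, codec))
```

A low second byte is indistinguishable from an ASCII character when a tool inspects the data one byte at a time, which is one plausible reason byte-oriented software mishandles GBK and GB18030 text.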
Character Set (cont.)
• Example: 镕 (as in 朱镕基, the name of China's Premier): GB 2312 does not include the character “镕”, but GBK and GB18030 do
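The 镕 example can be reproduced with Python's built-in codecs. This is a minimal sketch; the helper `can_encode` is a name introduced here for illustration.

```python
def can_encode(ch, codec):
    """Return True if the character is representable in the given codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# Check the character from the slide against each character set.
for codec in ("gb2312", "gbk", "gb18030"):
    print(codec, can_encode("镕", codec))
```

The other two characters of the name, 朱 and 基, can be checked with the same helper to see which part of 朱镕基 a GB 2312-only system loses.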
Some Special Issues (1)
Simplified: 着 (as in 着火) and 著 (as in 著作)
Traditional: 著 (as in both 著火 and 著作)
Users do not know about character sets or encodings, but they strongly want the typed string “著火” to be the same domain name as “着火”.
Some Special Issues (2)
Simplified: 发
Traditional: 發 and 髮
In some cases users strongly want “发*” to be the same domain name as “發*”; in other cases they strongly want “发*” to be the same domain name as “髮*”.
Some Special Issues (3)
• Both the simplified and the traditional forms are in GBK: 华, 華, 学, 學
• Are the following the same domain name?
  清华大学
  清华大學
  清華大学
  清華大學
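One way to picture the three issues above is a per-character variant table that folds each label to a single comparison key. The sketch below is only an illustration: `VARIANT_TABLE` and `fold` are hypothetical names, the table holds just the characters from these slides, and this is not presented as CNNIC's actual method.

```python
# Hypothetical variant table: traditional form -> simplified form.
VARIANT_TABLE = {
    "華": "华",
    "學": "学",
    "發": "发",
    "髮": "发",  # the second traditional form that also maps to 发
}

def fold(label):
    """Replace every character that has a table entry with its simplified form."""
    return "".join(VARIANT_TABLE.get(ch, ch) for ch in label)

names = ["清华大学", "清华大學", "清華大学", "清華大學"]
print({fold(n) for n in names})   # all four collapse to one key
print(fold("發") == fold("髮"))    # the two traditional forms now collide
```

Folding answers the question in issue (3) with "yes", but it also makes “發*” and “髮*” identical, which users may not always want. The 着/著 pair is left out of the table on purpose: 著 is valid in both scripts, so a plain character-for-character table cannot tell 著火 from 著作.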
Thanks www.cnnic.net.cn zwh@cnnic.net.cn