This article is reposted from the web.
Original title: UTF-8 and Unicode FAQ
Author: Markus Kuhn
Translator: xLoneStar, China Linux Forum translation team
/* This function tests whether the ISO 10646/Unicode character code
 * ucs belongs to the East Asian Wide (W) or East Asian FullWidth
 * (F) category as defined in Unicode Technical Report #11. In this
 * case, the terminal emulator should represent the character using
 * a glyph from a double-wide font that covers two normal (Latin)
 * character cells. */
int iswide(int ucs)
{
    if (ucs < 0x1100)
        return 0;
    return
        (ucs >= 0x1100 && ucs <= 0x115f) || /* Hangul Jamo */
        (ucs >= 0x2e80 && ucs <= 0xa4cf && (ucs & ~0x0011) != 0x300a &&
         ucs != 0x303f) ||                  /* CJK ... Yi */
        (ucs >= 0xac00 && ucs <= 0xd7a3) || /* Hangul Syllables */
        (ucs >= 0xf900 && ucs <= 0xfaff) || /* CJK Compatibility Ideographs */
        (ucs >= 0xfe30 && ucs <= 0xfe6f) || /* CJK Compatibility Forms */
        (ucs >= 0xff00 && ucs <= 0xff5f) || /* Fullwidth Forms */
        (ucs >= 0xffe0 && ucs <= 0xffe6);
}
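As a rough illustration of how a terminal emulator might use iswide(), the sketch below counts the character cells needed by a run of UCS code points. The helper name count_cells() is hypothetical and not part of the original FAQ, and the sketch assumes every character occupies at least one cell, so combining and control characters are not handled. iswide() is repeated so that the fragment compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Copy of the iswide() test from above, repeated so this sketch
 * compiles on its own. */
int iswide(int ucs)
{
    if (ucs < 0x1100)
        return 0;
    return
        (ucs >= 0x1100 && ucs <= 0x115f) || /* Hangul Jamo */
        (ucs >= 0x2e80 && ucs <= 0xa4cf && (ucs & ~0x0011) != 0x300a &&
         ucs != 0x303f) ||                  /* CJK ... Yi */
        (ucs >= 0xac00 && ucs <= 0xd7a3) || /* Hangul Syllables */
        (ucs >= 0xf900 && ucs <= 0xfaff) || /* CJK Compatibility Ideographs */
        (ucs >= 0xfe30 && ucs <= 0xfe6f) || /* CJK Compatibility Forms */
        (ucs >= 0xff00 && ucs <= 0xff5f) || /* Fullwidth Forms */
        (ucs >= 0xffe0 && ucs <= 0xffe6);
}

/* Hypothetical helper (not from the FAQ): number of terminal cells
 * needed to display n code points. Assumption: each code point takes
 * one cell, or two if iswide() reports it as wide; combining and
 * control characters are deliberately ignored here. */
size_t count_cells(const int *text, size_t n)
{
    size_t cells = 0;
    size_t i;

    assert(text != NULL || n == 0);
    for (i = 0; i < n; i++)
        cells += iswide(text[i]) ? 2 : 1;
    return cells;
}
```

For example, the three code points U+0041, U+4E2D, U+6587 need one cell for the Latin letter and two for each ideograph.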
#include <wchar.h>
int wcwidth(wchar_t wc);
int wcswidth(const wchar_t *pwcs, size_t n);
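A minimal sketch of how wcswidth() might be used to align text in fixed-width columns follows. The helper padding_for() is a hypothetical name introduced for illustration only. The results depend on the current LC_CTYPE locale, so a real program would normally call setlocale(LC_ALL, "") first, and the _XOPEN_SOURCE definition is an assumption about what the system headers need in order to declare these functions.

```c
#define _XOPEN_SOURCE 700 /* assumed necessary to get wcwidth()/wcswidth() declared */
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: number of spaces to append to s so that it
 * fills field_width character cells. Returns 0 if s is already at
 * least that wide, and -1 if wcswidth() reports a non-printable
 * character in s. */
int padding_for(const wchar_t *s, int field_width)
{
    int w;

    assert(s != NULL);
    w = wcswidth(s, wcslen(s));
    if (w < 0)
        return -1;
    return w < field_width ? field_width - w : 0;
}
```

Padding by cell count rather than by character count is what keeps columns aligned once double-width characters appear in the text.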
Bruno Haible’s Unicode HOWTO.
The Unicode Standard, Version 2.0
Unicode Technical Reports
Mark Davis’ Unicode FAQ
ISO/IEC 10646-1:1993
Frank Tang’s Iñtërnâtiônàlizætiøn Secrets
Unicode Support in the Solaris 7 Operating Environment
The USENIX paper by Rob Pike and Ken Thompson on the introduction of UTF-8 under Plan9 describes the first operating system to migrate completely to UTF-8, as early as 1992 (at the time the encoding was still called UTF-2).
Li18nux is a project initiated by several Linux distributors to enhance Unicode support for Linux.
The Online Single Unix Specification contains definitions of all the ISO C Amendment 1 functions, plus extensions such as wcwidth().
The Open Group’s summary of ISO C Amendment 1.
GNU libc
The Linux Console Tools
The Unicode Consortium character database and character set conversion tables are an essential resource for anyone developing Unicode-related tools.
Other conversion tables are available from Microsoft and Keld Simonsen’s WG15 archive.
Michael Everson’s ISO10646-1 archive contains online versions of many of the more recent ISO 10646-1 amendments, plus many other goodies. See also his Roadmaps to the Universal Character Set.
An introduction to The Universal Character Set (UCS).
Otfried Cheong’s essay on Han Unification in Unicode
The AMS STIX project is working on revising and extending the mathematical characters for Unicode 4.0 and ISO 10646-2.
Jukka Korpela’s Soft hyphen (SHY) - a hard problem? is an excellent discussion of the controversy surrounding U+00AD.
James Briggs’ Perl, Unicode and I18N FAQ.