PHP Cross Reference of Unnamed Project |
package Encode::TW;
BEGIN {
    if ( ord("A") == 193 ) {
        die "Encode::TW not supported on EBCDIC\n";
    }
}
use strict;
use warnings;
use Encode;
our $VERSION = do { my @r = ( q$Revision: 2.2 $ =~ /\d+/g ); sprintf "%d." . "%02d" x $#r, @r };
use XSLoader;
XSLoader::load( __PACKAGE__, $VERSION );

1;
__END__

=head1 NAME

Encode::TW - Taiwan-based Chinese Encodings

=head1 SYNOPSIS

    use Encode qw/encode decode/;
    $big5 = encode("big5", $utf8); # loads Encode::TW implicitly
    $utf8 = decode("big5", $big5); # ditto

=head1 DESCRIPTION

This module implements traditional Chinese charset encodings as used
in Taiwan and Hong Kong.
Encodings supported are as follows.

  Canonical       Alias                  Description
  --------------------------------------------------------------------
  big5-eten       /\bbig-?5$/i           Big5 encoding (with ETen extensions)
                  /\bbig5-?et(en)?$/i
                  /\btca-?big5$/i
  big5-hkscs      /\bbig5-?hk(scs)?$/i   Big5 + Cantonese characters in Hong Kong
                  /\bhk(scs)?-?big5$/i
  MacChineseTrad                         Big5 + Apple Vendor Mappings
  cp950                                  Code Page 950
                                         = Big5 + Microsoft vendor mappings
  --------------------------------------------------------------------

To find out how to use this module in detail, see L<Encode>.

=head1 NOTES

Due to size concerns, C<EUC-TW> (Extended Unix Character), C<CCCII>
(Chinese Character Code for Information Interchange), C<BIG5PLUS>
(CMEX's Big5+) and C<BIG5EXT> (CMEX's Big5e) are distributed separately
on CPAN, under the name L<Encode::HanExtra>. That module also contains
extra China-based encodings.

=head1 BUGS

Since the original C<big5> encoding (1984) is not supported anywhere
(glibc and DOS-based systems use C<big5> to mean C<big5-eten>; Microsoft
uses C<big5> to mean C<cp950>), a conscious decision was made to alias
C<big5> to C<big5-eten>, which is the de facto superset of the original
big5.

The C<CNS11643> encoding files are not complete. For common C<CNS11643>
manipulation, please use C<EUC-TW> in L<Encode::HanExtra>, which contains
planes 1-7.

The ASCII region (0x00-0x7f) is preserved for all encodings, even
though this conflicts with mappings by the Unicode Consortium. See

L<http://www.debian.or.jp/~kubota/unicode-symbols.html.en>

to find out why it is implemented that way.

=head1 SEE ALSO

L<Encode>

=cut
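
The alias behaviour described under BUGS can be checked from a script. The following is a minimal sketch, not part of the module itself: it assumes a standard Encode installation that includes Encode::TW, and the variable names and sample string are chosen only for illustration. It resolves the C<big5> alias through C<find_encoding> and round-trips a two-character string.

```perl
use strict;
use warnings;
use Encode qw(encode decode find_encoding);

# Resolve the "big5" alias to the canonical encoding it points at.
my $canonical = find_encoding("big5")->name;
print "big5 resolves to: $canonical\n";

# Sample text (an assumption for illustration): U+4E2D U+6587.
my $text = "\x{4E2D}\x{6587}";

# Encode to Big5 bytes, then decode back to Perl characters.
my $bytes = encode("big5", $text);
my $back  = decode("big5", $bytes);

# Show the encoded bytes in hex and confirm the round trip.
printf "encoded bytes: %s\n",
    join " ", map { sprintf "%02X", ord } split //, $bytes;
print "round trip ok\n" if $back eq $text;
```

To target the Microsoft variant instead, pass C<cp950> explicitly rather than relying on the C<big5> alias.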
Generated: Tue Mar 17 22:47:18 2015 | Cross-referenced by PHPXref 0.7.1 |