VSCII

From Wikipedia, the free encyclopedia
National standard character encoding for the Vietnamese alphabet
Not to be confused with VISCII, an unofficial encoding for Vietnamese.
VSCII
Alias(es): x-viet-tcvn5712[1]
Languages: Vietnamese, English
Created by: TCVN/TC1
Standard: TCVN 5712:1993
Classification: 8-bit SBCS; extended ASCII (VSCII-2/-3)

VSCII (Vietnamese Standard Code for Information Interchange), also known as TCVN 5712,[2] ISO-IR-180,[3] .VN,[4] ABC[4] or simply the TCVN encodings,[4][5] is a set of three closely related Vietnamese national standard character encodings for using the Vietnamese language with computers. It was developed by the TCVN Technical Committee on Information Technology (TCVN/TC1) and first adopted in 1993 as TCVN 5712:1993.[2]

It should not be confused with the similarly named unofficial VISCII encoding, which was sometimes used by overseas Vietnamese speakers.[4] VISCII was also intended to stand for Vietnamese Standard Code for Information Interchange, but it is not related to VSCII.[6]

VSCII (TCVN) was used extensively in the north of Vietnam, while VNI was popular in the south.[4] Unicode and the Windows-1258 code page are now used for virtually all Vietnamese computer data,[citation needed] but legacy files or archived messages may need conversion.

Encodings


All three forms of VSCII keep the 95 printable characters of ASCII unmodified.

VSCII-3, also known as TCVN 5712-3, VN3 or simply TCVN3,[7] includes the fewest assignments. It is an extended ASCII, because it keeps all 128 codes of ASCII unmodified. It does not reassign any of the C0 and C1 control codes. Compared to ASCII, it adds 75 characters:

  • 67 lowercase characters, allowing full lowercase support.
  • 7 uppercase characters, allowing uppercase support for the 29 base letters without tone marks.
  • The non-breaking space.

In TCVN3, tone marks on uppercase vowels are obtained by switching to an all-capital font.[8]

VSCII-2, also known as TCVN 5712-2 and VN2, is a superset of VSCII-3. It is an extended ASCII, because it keeps all 128 codes of ASCII unmodified. It does not reassign any of the C0 and C1 control codes, making it conformant with ISO 2022 as a 96-set.[2][3] Compared to VSCII-3, it adds the following, for a total of 96 non-ASCII characters:

  • 16 more uppercase characters with pre-composed tone marks (for a total of 23 non-ASCII uppercase characters)
  • 5 combining diacritics for tone marks, allowing other combinations of uppercase letters and tone marks to be represented. Combining marks follow the base letter,[2] as in VNI (rather than preceding it, as in ANSEL).
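The base-letter-then-mark order can be illustrated with a short Python sketch. It assumes a small, hand-picked subset of the VSCII-2 code table (the name `VSCII2_SAMPLE` and the function `decode_sample` are hypothetical, introduced only for this illustration) and relies on Unicode NFC normalization to combine the base letter with the tone mark that follows it.

```python
import unicodedata

# Partial VSCII-2 mapping (byte -> Unicode), limited to the entries needed
# for this demonstration. This is an illustrative subset, not a full codec.
VSCII2_SAMPLE = {
    0xA1: "\u0102",  # Ă
    0xA2: "\u00C2",  # Â
    0xB0: "\u0300",  # combining grave accent
    0xB1: "\u0309",  # combining hook above
    0xB2: "\u0303",  # combining tilde
    0xB3: "\u0301",  # combining acute accent
    0xB4: "\u0323",  # combining dot below
}

def decode_sample(data: bytes) -> str:
    """Decode ASCII plus the sample entries, then compose with NFC."""
    chars = []
    for b in data:
        if b < 0x80:
            chars.append(chr(b))             # ASCII is unchanged
        else:
            chars.append(VSCII2_SAMPLE[b])   # KeyError for unmapped bytes
    # The tone mark already follows its base letter, so NFC can compose
    # the pair into a single precomposed Unicode character where one exists.
    return unicodedata.normalize("NFC", "".join(chars))
```

For example, the two bytes `U` and 0xB3 decode to the single character Ú (U+00DA), and 0xA1 followed by 0xB3 decodes to Ắ (U+1EAE).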

VSCII-1, also known as TCVN 5712-1 and VN1, is an extension of VSCII-2. It is a modified ASCII, since it replaces 12 of the 33 control characters with precomposed characters. Compared to VSCII-2, it makes the following changes, for a total of 140 non-ASCII characters:

  • Adds 44 more pre-composed uppercase letters, bringing them to the same count as the lowercase
  • Does this by replacing 12 ASCII control characters and allocating 32 graphical characters to the C1 control area, breaking ISO 2022 compatibility
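As a rough illustration of how the three repertoires nest, the following Python sketch reports the smallest variant that covers a given byte string. The position sets are an assumption read off the VSCII-1 character table in this article, and the names `VSCII1_ONLY`, `VSCII2_ONLY` and `minimal_variant` are hypothetical. The check is only meaningful for data already known to be VSCII text, because a byte such as 0x01 is an ordinary control character in VSCII-2 and VSCII-3.

```python
# Positions that only VSCII-1 assigns to letters (assumed from the table):
# 12 former C0 control positions plus the whole C1 range 0x80-0x9F.
VSCII1_ONLY = frozenset(
    [0x01, 0x02, 0x04, 0x05, 0x06]
    + list(range(0x11, 0x18))
    + list(range(0x80, 0xA0))
)

# Positions VSCII-2 adds on top of VSCII-3 (assumed from the table):
# five combining tone marks and 16 precomposed uppercase letters.
VSCII2_ONLY = frozenset(
    list(range(0xB0, 0xB5))
    + [0xAF, 0xBA, 0xBF, 0xC0, 0xC1, 0xC2, 0xC3, 0xC4,
       0xC5, 0xCD, 0xD9, 0xDA, 0xDB, 0xE0, 0xF0, 0xFF]
)

def minimal_variant(data: bytes) -> int:
    """Return 1, 2 or 3: the smallest-repertoire variant that covers data.

    Assumes data is VSCII-1 text, so bytes in VSCII1_ONLY are letters.
    """
    if any(b in VSCII1_ONLY for b in data):
        return 1
    if any(b in VSCII2_ONLY for b in data):
        return 2
    return 3
```

The set sizes match the counts given above: 44 positions are specific to VSCII-1 and 21 are added by VSCII-2.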

Conversion from VSCII-3 to VSCII-2 or VSCII-1, and from VSCII-2 to VSCII-1, is not necessary, because each of the larger sets extends the smaller one, but it can result in smaller files.

Conversion from VSCII-1 to VSCII-2 or VSCII-3 and conversion from VSCII-2 to VSCII-3 require expansion of some pre-composed characters.
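The expansion step can be sketched in Python by going through Unicode: decompose the precomposed letter, pull out the tone mark, and emit the base letter followed by the mark. This is a minimal sketch, not a complete converter. The names `split_tone`, `to_vscii2` and the partial table `VSCII2_SAMPLE` are hypothetical, and a real converter would first check whether the target set already has the precomposed letter.

```python
import unicodedata

# The five Vietnamese tone marks as Unicode combining characters.
TONE_MARKS = {"\u0300", "\u0309", "\u0303", "\u0301", "\u0323"}

# Partial Unicode -> VSCII-2 byte table: only the entries used below.
VSCII2_SAMPLE = {
    "\u0102": 0xA1,  # Ă
    "\u01AF": 0xA6,  # Ư
    "\u0300": 0xB0, "\u0309": 0xB1, "\u0303": 0xB2,
    "\u0301": 0xB3, "\u0323": 0xB4,
}

def split_tone(ch: str) -> str:
    """Return the base letter followed by its tone mark (if any)."""
    decomposed = unicodedata.normalize("NFD", ch)
    tones = [c for c in decomposed if c in TONE_MARKS]
    rest = "".join(c for c in decomposed if c not in TONE_MARKS)
    # Recompose what is left so that Ă, Â, Ê, Ô, Ơ, Ư stay single letters.
    base = unicodedata.normalize("NFC", rest)
    return base + "".join(tones)

def to_vscii2(ch: str) -> bytes:
    """Encode one letter as base + combining mark using the sample table."""
    out = bytearray()
    for c in split_tone(ch):
        out.append(ord(c) if ord(c) < 0x80 else VSCII2_SAMPLE[c])
    return bytes(out)
```

Under these assumptions, Ứ (a single byte, 0x11, in VSCII-1) becomes the two bytes 0xA6 0xB3 in VSCII-2, and Ú (0x01 in VSCII-1) becomes `U` followed by 0xB3.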

Character set

VSCII-1[2] (each non-ASCII cell shows the character followed by its Unicode code point)
    0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x  NUL  Ú 00DA  Ụ 1EE4  ETX  Ừ 1EEA  Ử 1EEC  Ữ 1EEE  BEL  BS  HT  LF  VT  FF  CR  SO  SI
1x  DLE  Ứ 1EE8  Ự 1EF0  Ỳ 1EF2  Ỷ 1EF6  Ỹ 1EF8  Ý 00DD  Ỵ 1EF4  CAN  EM  SUB  ESC  FS  GS  RS  US
2x  SP  !  "  #  $  %  &  '  (  )  *  +  ,  -  .  /
3x  0  1  2  3  4  5  6  7  8  9  :  ;  <  =  >  ?
4x  @  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O
5x  P  Q  R  S  T  U  V  W  X  Y  Z  [  \  ]  ^  _
6x  `  a  b  c  d  e  f  g  h  i  j  k  l  m  n  o
7x  p  q  r  s  t  u  v  w  x  y  z  {  |  }  ~  DEL
8x  À 00C0  Ả 1EA2  Ã 00C3  Á 00C1  Ạ 1EA0  Ặ 1EB6  Ậ 1EAC  È 00C8  Ẻ 1EBA  Ẽ 1EBC  É 00C9  Ẹ 1EB8  Ệ 1EC6  Ì 00CC  Ỉ 1EC8  Ĩ 0128
9x  Í 00CD  Ị 1ECA  Ò 00D2  Ỏ 1ECE  Õ 00D5  Ó 00D3  Ọ 1ECC  Ộ 1ED8  Ờ 1EDC  Ở 1EDE  Ỡ 1EE0  Ớ 1EDA  Ợ 1EE2  Ù 00D9  Ủ 1EE6  Ũ 0168
Ax  NBSP  Ă 0102  Â 00C2  Ê 00CA  Ô 00D4  Ơ 01A0  Ư 01AF  Đ 0110  ă 0103  â 00E2  ê 00EA  ô 00F4  ơ 01A1  ư 01B0  đ 0111  Ằ 1EB0
Bx  ◌̀ 0300  ◌̉ 0309  ◌̃ 0303  ◌́ 0301  ◌̣ 0323  à 00E0  ả 1EA3  ã 00E3  á 00E1  ạ 1EA1  Ẳ 1EB2  ằ 1EB1  ẳ 1EB3  ẵ 1EB5  ắ 1EAF  Ẵ 1EB4
Cx  Ắ 1EAE  Ầ 1EA6  Ẩ 1EA8  Ẫ 1EAA  Ấ 1EA4  Ề 1EC0  ặ 1EB7  ầ 1EA7  ẩ 1EA9  ẫ 1EAB  ấ 1EA5  ậ 1EAD  è 00E8  Ể 1EC2  ẻ 1EBB  ẽ 1EBD
Dx  é 00E9  ẹ 1EB9  ề 1EC1  ể 1EC3  ễ 1EC5  ế 1EBF  ệ 1EC7  ì 00EC  ỉ 1EC9  Ễ 1EC4  Ế 1EBE  Ồ 1ED2  ĩ 0129  í 00ED  ị 1ECB  ò 00F2
Ex  Ổ 1ED4  ỏ 1ECF  õ 00F5  ó 00F3  ọ 1ECD  ồ 1ED3  ổ 1ED5  ỗ 1ED7  ố 1ED1  ộ 1ED9  ờ 1EDD  ở 1EDF  ỡ 1EE1  ớ 1EDB  ợ 1EE3  ù 00F9
Fx  Ỗ 1ED6  ủ 1EE7  ũ 0169  ú 00FA  ụ 1EE5  ừ 1EEB  ử 1EED  ữ 1EEF  ứ 1EE9  ự 1EF1  ỳ 1EF3  ỷ 1EF7  ỹ 1EF9  ý 00FD  ỵ 1EF5  Ố 1ED0
Legend: entries in rows Ax to Fx belong to VSCII-3, except the five combining marks (B0 to B4) and the 16 tone-marked uppercase letters in those rows, which are additions for VSCII-2; the letters in rows 0x, 1x, 8x and 9x are additions for VSCII-1.[9]
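The upper half of the chart translates directly into a lookup table. The Python sketch below transcribes the code points of rows Ax to Fx from the table above and uses them to decode VSCII-2 bytes into Unicode. The names `HIGH_HALF_HEX`, `HIGH_HALF` and `decode_vscii2` are hypothetical, and the code is an illustration under the assumption that the transcription is accurate, not a reference implementation. Bytes below 0xA0 are passed through unchanged, since VSCII-2 leaves ASCII and the C0/C1 control codes alone.

```python
# Unicode code points for bytes 0xA0-0xFF, one chart row per line,
# transcribed from rows Ax to Fx of the table above.
HIGH_HALF_HEX = """
00A0 0102 00C2 00CA 00D4 01A0 01AF 0110 0103 00E2 00EA 00F4 01A1 01B0 0111 1EB0
0300 0309 0303 0301 0323 00E0 1EA3 00E3 00E1 1EA1 1EB2 1EB1 1EB3 1EB5 1EAF 1EB4
1EAE 1EA6 1EA8 1EAA 1EA4 1EC0 1EB7 1EA7 1EA9 1EAB 1EA5 1EAD 00E8 1EC2 1EBB 1EBD
00E9 1EB9 1EC1 1EC3 1EC5 1EBF 1EC7 00EC 1EC9 1EC4 1EBE 1ED2 0129 00ED 1ECB 00F2
1ED4 1ECF 00F5 00F3 1ECD 1ED3 1ED5 1ED7 1ED1 1ED9 1EDD 1EDF 1EE1 1EDB 1EE3 00F9
1ED6 1EE7 0169 00FA 1EE5 1EEB 1EED 1EEF 1EE9 1EF1 1EF3 1EF7 1EF9 00FD 1EF5 1ED0
"""

# One Unicode character per byte value from 0xA0 to 0xFF (96 entries).
HIGH_HALF = [chr(int(code, 16)) for code in HIGH_HALF_HEX.split()]

def decode_vscii2(data: bytes) -> str:
    """Decode VSCII-2 bytes by table lookup, without normalization.

    Bytes 0x00-0x9F map to the same Unicode code point (ASCII and the
    C0/C1 controls); bytes 0xA0-0xFF are looked up in HIGH_HALF.
    """
    return "".join(
        chr(b) if b < 0xA0 else HIGH_HALF[b - 0xA0] for b in data
    )
```

With this table, the bytes `Ti`, 0xD5, `ng Vi`, 0xD6, `t` decode to "Tiếng Việt". Combining marks are returned as separate code points after their base letter; a caller that wants precomposed output could apply Unicode NFC normalization to the result.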

References

  1. ^ Sivonen, Henri (26 September 2014). "Character encoding changes in m-c require c-c action". mozilla.dev.apps.thunderbird.
  2. ^ a b c d e "[news] TCVN 5712:1993 (VSCII) -- Vietnamese national standard". 2 June 1993. Archived from the original on 11 January 2017.
  3. ^ a b TCVN (1993). ISO-IR-180: Right-hand Part of the VSCII-2 Code Table (PDF). ITSCJ/IPSJ.
  4. ^ a b c d e Ngo, Hoc Dinh; Tran, TuBinh. "5. Why Having Vietnamese Charset (Character Set – Encoding) Conversion?". Some special functions of WinVNKey.
  5. ^ Nguyen, Minh T. "Vietnamese Conversions (Vietnet/VIQR, VNI, VPS, VISCII, VNU, TCVN, VietWare, unicode)".
  6. ^ Lunde, Ken (13 January 2009). "Chapter 1: CJKV Information Processing Overview (§ Are VISCII and VSCII identical? What about TCVN?)". CJKV Information Processing (2nd ed.). p. 17. ISBN 978-0-596-51447-1.
  7. ^ "Unicode & Vietnamese Legacy Character Encodings". Vietnamese Unicode FAQs.
  8. ^ "Unicode & Vietnamese Legacy Character Encodings". Vietnamese Unicode FAQs. TCVN3 is not double-byte, but due to the nature of its encoding, capital letters (vowels) are mapped to a separate, capital font that is similar to the normal, lowercase one.
  9. ^ Lunde, Ken (13 January 2009). "Appendix L: Vietnamese Character Sets" (PDF). CJKV Information Processing (2nd ed.). ISBN 978-0-596-51447-1.