2. Using the Tutorial Examples
3. Getting Started with Web Applications
5. JavaServer Pages Technology
7. JavaServer Pages Standard Tag Library
10. JavaServer Faces Technology
11. Using JavaServer Faces Technology in JSP Pages
12. Developing with JavaServer Faces Technology
13. Creating Custom UI Components
14. Configuring JavaServer Faces Applications
15. Internationalizing and Localizing Web Applications
16. Building Web Services with JAX-WS
17. Binding between XML Schema and Java Classes
19. SOAP with Attachments API for Java
21. Getting Started with Enterprise Beans
23. A Message-Driven Bean Example
24. Introduction to the Java Persistence API
25. Persistence in the Web Tier
26. Persistence in the EJB Tier
27. The Java Persistence Query Language
28. Introduction to Security in the Java EE Platform
29. Securing Java EE Applications
31. The Java Message Service API
32. Java EE Examples Using the JMS API
36. The Coffee Break Application
37. The Duke's Bank Application
Further Information about Character Encoding
This appendix describes the character-encoding schemes that are supported by the Java platform.
US-ASCII
US-ASCII is a 7-bit character set and encoding that covers the English-language alphabet. It is not large enough to cover the characters used in other languages, however, so it is not very useful for internationalization.
ISO-8859-1
ISO-8859-1 is the character set for Western European languages. It’s an 8-bit encoding scheme in which every encoded character takes exactly 8 bits. (With the remaining character sets, on the other hand, some codes are reserved to signal the start of a multibyte character.)
UTF-8
UTF-8 is an 8-bit encoding scheme. Characters from the English-language alphabet are all encoded using an 8-bit byte. Characters for other languages are encoded using 2, 3, or even 4 bytes. UTF-8 therefore produces compact documents for the English language, but for other languages, documents tend to be half again as large as they would be if they used UTF-16. If the majority of a document’s text is in a Western European language, then UTF-8 is generally a good choice because it allows for internationalization while still minimizing the space required for encoding.
UTF-16
UTF-16 is a 16-bit encoding scheme. It is large enough to encode all the characters from all the alphabets in the world. It uses 16 bits for most characters but includes 32-bit characters for ideogram-based languages such as Chinese. A Western European-language document that uses UTF-16 will be twice as large as the same document encoded using UTF-8. But documents written in far Eastern languages will be far smaller using UTF-16.
Note - UTF-16 depends on the system’s byte-ordering conventions. Although in most systems, high-order bytes follow low-order bytes in a 16-bit or 32-bit “word,” some systems use the reverse order. UTF-16 documents cannot be interchanged between such systems without a conversion.
Copyright © 2010, Oracle and/or its affiliates. All rights reserved. Legal Notices
Scripting on this page tracks web page traffic, but does not change the content in any way.