Character Data Type in Java
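In Java, characters are stored using the char data type. Unlike C/C++, where char is 1 byte, Java uses 2 bytes (16 bits) for char because it stores text as UTF-16 Unicode rather than ASCII. A single char can hold any character in the Basic Multilingual Plane (U+0000 to U+FFFF), which covers most of the world's scripts and many symbols. Characters outside that range, including most emojis, need two char values.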
char
char a;
- Size: 2 bytes = 16 bits
- Encoding: Unicode (UTF-16)
- Range (formula): 0 to (2^16 − 1)
- Range (actual): 0 to 65,535 (Unsigned only)
- Total Values: 65,536 possible values (some are surrogate code units rather than complete characters)
- Usage: To store a single character, symbol, digit, or Unicode character.
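The properties listed above can be checked against constants in java.lang.Character. The following is a minimal sketch; the class name CharLimits and its helper methods are illustrative only and do not come from the original text.

```java
public class CharLimits {
    // Number of bits in a char, as reported by the standard library.
    static int bits() {
        return Character.SIZE;
    }

    // Largest value a char can hold, widened to int so it prints as a number.
    static int maxValue() {
        return Character.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println("Bytes: " + Character.BYTES);          // Bytes: 2
        System.out.println("Bits: " + bits());                    // Bits: 16
        System.out.println("Min: " + (int) Character.MIN_VALUE);  // Min: 0
        System.out.println("Max: " + maxValue());                 // Max: 65535
    }
}
```

The casts to int matter here: without them, Character.MIN_VALUE would be printed as a character rather than as the number 0.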
Internal Representation
- char is internally stored as a 16-bit unsigned integer.
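- Each char holds one UTF-16 code unit. Unicode code points up to U+FFFF fit in a single char; higher code points need two char values (a surrogate pair).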
Example:
- 'A' → Unicode 0041 → Decimal 65
- 'a' → Unicode 0061 → Decimal 97
- '₹' → Unicode 20B9 → Decimal 8377
- '😊' → Unicode 1F60A (stored using surrogate pairs in UTF-16)
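The mappings above can be reproduced by casting a char to int and formatting it as hexadecimal. This is a small sketch under the assumption that a helper named hex (hypothetical, introduced only for this example) is acceptable; the emoji is built from its code point with Character.toChars so the source file does not depend on any particular encoding.

```java
public class CharCodeUnits {
    // Formats a char as the four-digit hex form used in the list above.
    static String hex(char c) {
        return String.format("%04X", (int) c);
    }

    public static void main(String[] args) {
        System.out.println(hex('A') + " = " + (int) 'A');            // 0041 = 65
        System.out.println(hex('a') + " = " + (int) 'a');            // 0061 = 97
        System.out.println(hex('\u20B9') + " = " + (int) '\u20B9');  // 20B9 = 8377

        // U+1F60A is above U+FFFF, so it occupies two char values.
        String smiley = new String(Character.toChars(0x1F60A));
        System.out.println(smiley.length());                         // 2
        System.out.println(hex(smiley.charAt(0)) + " " + hex(smiley.charAt(1))); // D83D DE0A
    }
}
```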
Examples in Java
public class CharExample {
    public static void main(String[] args) {
        char c1 = 'A';       // Normal character
        char c2 = 65;        // Decimal value of 'A'
        char c3 = '\u0041';  // Unicode escape sequence for 'A'
        char c4 = '₹';       // Indian Rupee symbol
        char c5 = '\u20B9';  // Same as above using Unicode

        System.out.println("c1: " + c1);
        System.out.println("c2: " + c2);
        System.out.println("c3: " + c3);
        System.out.println("c4: " + c4);
        System.out.println("c5: " + c5);
    }
}
Output:
c1: A
c2: A
c3: A
c4: ₹
c5: ₹
Explanation:
- In Java, char is a 16-bit Unicode character (range: 0 to 65535).
- It can be assigned in multiple ways:
- Direct character: c1 = 'A' stores 'A'.
- Integer value: c2 = 65 uses the Unicode decimal value for 'A'.
- Unicode escape sequence: c3 = '\u0041' also represents 'A'.
- Special symbols: c4 = '₹' directly stores the Indian Rupee symbol.
- Unicode for symbols: c5 = '\u20B9' is another way to store '₹'.
- All representations ('A', 65, '\u0041') point to the same Unicode character.
- Thus, Java char supports internationalization by handling not only English letters but also currency symbols and characters from many different languages. Emojis and other characters above \uFFFF do not fit in a single char.
Important Points
- char is unsigned in Java (cannot store negative values).
- Default value of char = '\u0000' (null character). This applies to fields and array elements; local variables must be initialized before use.
- You can assign a char using:
- Direct character ('A')
- Integer value (65)
- Unicode escape ('\u0041')
- Since char is numeric internally, it supports arithmetic:
char c = 'A';
System.out.println((int)c); // 65
System.out.println((char)(c + 1)); // 'B'
If you want to handle strings of multiple characters, use the String class, not char.
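The arithmetic point above can be taken a little further. The sketch below is illustrative only: the class CharArithmetic and its methods digitValue, next, and alphabet are hypothetical names chosen for this example.

```java
public class CharArithmetic {
    // Converts a digit character '0'..'9' to its numeric value.
    // char - char is promoted to int, so no cast is needed.
    static int digitValue(char digit) {
        return digit - '0';
    }

    // Returns the character one position after c.
    // c + 1 is an int, so it must be cast back to char.
    static char next(char c) {
        return (char) (c + 1);
    }

    // Builds "ABC...Z" by incrementing a char in a loop.
    // c++ compiles without a cast, unlike c = c + 1.
    static String alphabet() {
        StringBuilder sb = new StringBuilder();
        for (char c = 'A'; c <= 'Z'; c++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(digitValue('7'));  // 7
        System.out.println(next('A'));        // B
        System.out.println(alphabet());       // ABCDEFGHIJKLMNOPQRSTUVWXYZ
    }
}
```

Note that a statement such as c = c + 1 does not compile for a char variable, because the result of the addition is an int. The compound forms c++ and c += 1 include an implicit cast and are allowed.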
Special Note
A single char cannot hold a Unicode character above \uFFFF (such as most emojis). Java represents these characters using surrogate pairs (two char values), so for such cases it is better to use String or an int code point instead of char.
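One way to work with such characters is through the code point methods on String and Character. This is a minimal sketch, assuming Java 8 or later; the class name CodePointDemo and its two counting helpers are hypothetical and exist only for illustration.

```java
public class CodePointDemo {
    // Number of UTF-16 code units (char values) in the string.
    static int countChars(String s) {
        return s.length();
    }

    // Number of Unicode code points, counting a surrogate pair as one.
    static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        int codePoint = 0x1F60A;  // smiling face emoji, held in an int
        String text = "Hi" + new String(Character.toChars(codePoint));

        System.out.println(countChars(text));                          // 4
        System.out.println(countCodePoints(text));                     // 3
        System.out.println(Character.charCount(codePoint));            // 2
        System.out.println(Character.isHighSurrogate(text.charAt(2))); // true

        // Iterate by code point rather than by char: prints 48, 69, 1f60a.
        text.codePoints().forEach(cp -> System.out.println(Integer.toHexString(cp)));
    }
}
```

Iterating with codePoints() avoids splitting a surrogate pair in half, which is what happens when a loop over charAt treats each char as a complete character.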
Summary:
- char = 2 bytes, unsigned, range 0–65,535.
- Stores Unicode characters up to \uFFFF in a single value (covers most languages and symbols).
- Can be assigned via character, integer, or Unicode escape.
- Use String for multiple characters or emojis beyond \uFFFF.