As Peter Cordes pointed out in a comment, we do not even need to reverse the string in order to detect a palindrome. Instead we can examine the input string in place, comparing the first character to the last, the second to the next-to-last, etc., until we reach the middle. This may be more performant.
We need special handling for 2-char characters in this case as well; fortunately, the String class has methods for pulling code point values instead of pulling char values directly.
codePointAt(int index) behaves similarly to charAt(int index) in most cases, but if the char at the given index is the first half of a surrogate pair, it will return the full value of the pair. codePointBefore(int index) approaches the problem from the other end; if the char before the given index is the last half of a surrogate pair, it will return the full value of the pair.
    private static boolean isPalindrome(String myString) {
        int len = myString.length();
        for (int i = 0; i < len / 2; ++i) {
            int frontHalfCharacter = myString.codePointAt(i);
            int backHalfCharacter = myString.codePointBefore(len - i);
            if (frontHalfCharacter != backHalfCharacter) {
                return false;
            }
            if (Character.isSupplementaryCodePoint(frontHalfCharacter)) { // i.e. if this is a 2-char character
                i++;
            }
        }
        return true;
    }
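As a rough check of the method above, the following sketch repeats it inside a small class and runs it on a few inputs. The class name PalindromeDemo and the sample strings are my own choices for illustration; the emoji is written as the escape sequence "\uD83D\uDE02" so the source does not depend on file encoding.

```java
final class PalindromeDemo {
    // Same logic as the method above, repeated so this block compiles on its own.
    static boolean isPalindrome(String myString) {
        int len = myString.length();
        for (int i = 0; i < len / 2; ++i) {
            int frontHalfCharacter = myString.codePointAt(i);
            int backHalfCharacter = myString.codePointBefore(len - i);
            if (frontHalfCharacter != backHalfCharacter) {
                return false;
            }
            if (Character.isSupplementaryCodePoint(frontHalfCharacter)) {
                i++; // skip the second char of the surrogate pair
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE02"; // U+1F602, stored as two chars
        System.out.println(Integer.toHexString(emoji.charAt(0)));          // d83d
        System.out.println(Integer.toHexString(emoji.codePointAt(0)));     // 1f602
        System.out.println(Integer.toHexString(emoji.codePointBefore(2))); // 1f602

        System.out.println(isPalindrome("abba"));                  // true
        System.out.println(isPalindrome("abc"));                   // false
        System.out.println(isPalindrome(emoji + "a" + emoji));     // true
        System.out.println(isPalindrome("a" + emoji));             // false
    }
}
```

A plain charAt comparison would reject the third input, because it would compare the high surrogate at the front with the low surrogate at the back.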
Doi9t's answer is very good, but even with their improvements, there is still a problem: your code does not produce the correct answer in all cases.
Java strings use UTF-16 encoding. This means that a Java char is not large enough to store all Unicode characters. Instead, some characters (for example, 😂) are stored as a pair of chars, and reversing the pair (as your code does) will result in nonsense data. See this documentation for more details.
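To make the problem concrete, here is a minimal sketch of a char-by-char reversal. The helper naiveReverse and the class name are hypothetical stand-ins, not the code under review; they only illustrate what happens to a surrogate pair when each char is moved independently.

```java
final class NaiveReverseDemo {
    // Hypothetical helper: reverses char by char, ignoring surrogate pairs.
    static String naiveReverse(String s) {
        char[] out = new char[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = s.charAt(s.length() - 1 - i);
        }
        return new String(out);
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE02"; // 'a' followed by U+1F602
        System.out.println(s.length());                      // 3
        System.out.println(s.codePointCount(0, s.length())); // 2

        String r = naiveReverse(s);
        // The low surrogate now comes before the high surrogate,
        // so the two chars no longer form a valid pair.
        System.out.println(Character.isLowSurrogate(r.charAt(0)));  // true
        System.out.println(Character.isHighSurrogate(r.charAt(1))); // true
    }
}
```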
Fortunately, the way UTF-16 is defined, chars that are surrogates (half-characters) have a completely separate range of values from chars that are Unicode characters by themselves. This means it is possible to test each char individually to see if it is a surrogate, and then have special handling to preserve the pairs.
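    import java.lang.Character;
    import java.lang.StringBuilder;
    import java.util.Scanner;
    <...>
    private static String reverseString(String myString) {
        StringBuilder reversedString = new StringBuilder();
        // Walk the input from the last char to the first.
        for (int j = myString.length() - 1; j >= 0; j--) {
            char c = myString.charAt(j);
            // A low surrogate is the second half of a pair; emit the first half
            // before it so the pair keeps its order. The extra checks avoid
            // reading index -1 when the input has an unpaired low surrogate.
            if (Character.isLowSurrogate(c) && j > 0
                    && Character.isHighSurrogate(myString.charAt(j - 1))) {
                j--;
                reversedString.append(myString.charAt(j));
            }
            reversedString.append(c);
        }
        return reversedString.toString();
    }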
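The sketch below exercises this method on a string containing a surrogate pair. The method is repeated here, with a guard against an unpaired surrogate, so that the block compiles on its own; the class name SurrogateReverseDemo is hypothetical. For comparison it also calls StringBuilder.reverse(), which is documented to treat surrogate pairs as single characters.

```java
final class SurrogateReverseDemo {
    // Repeated from above, with a guard for an unpaired low surrogate.
    static String reverseString(String myString) {
        StringBuilder reversedString = new StringBuilder();
        for (int j = myString.length() - 1; j >= 0; j--) {
            char c = myString.charAt(j);
            if (Character.isLowSurrogate(c) && j > 0
                    && Character.isHighSurrogate(myString.charAt(j - 1))) {
                j--;
                reversedString.append(myString.charAt(j)); // high surrogate first
            }
            reversedString.append(c);
        }
        return reversedString.toString();
    }

    public static void main(String[] args) {
        String input = "a\uD83D\uDE02b"; // 'a', U+1F602, 'b'
        String expected = "b\uD83D\uDE02a";

        String manual = reverseString(input);
        String builtIn = new StringBuilder(input).reverse().toString();

        System.out.println(manual.equals(expected));  // true
        System.out.println(builtIn.equals(expected)); // true
    }
}
```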
If you really wanted to re-invent the wheel, I think Character.isLowSurrogate(c) could be replaced with c >= '\uDC00' && c <= '\uDFFF', though I have not personally tested this.
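One way to test that claim is to compare the two checks over every possible char value. This is only a sketch; the class and method names are made up for illustration.

```java
final class LowSurrogateRangeCheck {
    // The hand-written range test suggested above.
    static boolean isLowSurrogateManual(char c) {
        return c >= '\uDC00' && c <= '\uDFFF';
    }

    // Counts the char values on which the manual test and the library disagree.
    static int countMismatches() {
        int mismatches = 0;
        // Use an int counter: a char counter would wrap around at 0xFFFF
        // and the loop would never end.
        for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
            char c = (char) i;
            if (isLowSurrogateManual(c) != Character.isLowSurrogate(c)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println(countMismatches()); // 0
    }
}
```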
And while we're on the topic of Unicode, you should of course read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!), if you haven't already.