I'm writing a URL validator. Firstly, it checks for special characters in the input. Secondly, it adds 'http://' and checks for validity.
/* Returns true if url is valid */
private static boolean isValidURL(String url) {
boolean containsSpecialCharacters = specialCharactersExists(url);
if ( !containsSpecialCharacters ) {
/* Try creating a valid URL */
try {
new URL(String.format("%s%s", "http://", url)).toURI();
return true;
} catch (Exception e) {
/* Not a valid URL */
}
}
return false;
}
/* Returns true if url contains special characters */
private static boolean specialCharactersExists(String input) {
Pattern regex = Pattern.compile("[^A-Za-z0-9.-]");
Matcher matcher = regex.matcher(input);
return matcher.find();
}
This serves my purpose. I'm seeking advice on how to improve the code (especially the Regex part)
1 Answer 1
Welcome to Code Review!
Bad naming
The method doesn't actually check that a URL is valid, as it allows values that are not URLs. Thus it should not be named as "isValidURL". It should be isValidAddress
or similar, but not URL. URLs have a very well defined syntax.
Unnecessary variables
There is no need to store the return value of specialCharactersExists(url)
to a variable. It only adds an unnecessary large if-statement. Instead, check the value and exit early. Also, the characters you check are not special, they're "illegal" or "invalid" in your implementation so change name to reflect that:
if (illegalCharactersExist(url)) {
return false;
}
Missing final
Variables that are not supposed to change should be marked as final
. If you decide to keep the containsSpecialCharacters variable, it should be marked as final (and named as decribed above).
final boolean containsIllegalCharacters = illegalCharactersExist(url);
Reuse Pattern
The Pattern class is thread safe. You should compile the pattern once into a static variable and reuse it in the matcher. Also, regex is not a good name. It tells what the variable is, not what it's purpose is. Naming should always reflect purpose.
private static final Pattern ILLEGAL_CHAR_PATTERN = Pattern.compile("[^A-Za-z0-9.-]");