I have implemented a simple pattern matching for the below algorithm in Java with time complexity of O(n)(n is the size of the input array).
Any suggestions for optimization??
one of the most important aspects of accurately matching records to candidates is ensuring that the name in the record matches the name or a known alias of the candidate. In this exercise, you’ll build a method nameMatch() that receives an array of known names and a name from a record to match.
The method should pass the following tests:
1. Exact match
known_aliases = ['Alphonse Gabriel Capone', 'Al Capone']
name_match(known_aliases, 'Alphonse Gabriel Capone') => True
name_match(known_aliases, 'Al Capone') => True
name_match(known_aliases, 'Alphonse Francis Capone') => False
2.Middle name missing (on alias)
known_aliases = ['Alphonse Capone']
name_match(known_aliases, 'Alphonse Gabriel Capone') => True
name_match(known_aliases, 'Alphonse Francis Capone') => True
name_match(known_aliases, 'Alexander Capone') => False
3.Middle name missing (on record name)
known_aliases = ['Alphonse Gabriel Capone']
name_match(known_aliases, 'Alphonse Capone') => True
name_match(known_aliases, 'Alphonse Francis Capone') => False
name_match(known_aliases, 'Alexander Capone') => False
4.More middle name tests
known_aliases = ['Alphonse Gabriel Capone', 'Alphonse Francis Capone']
name_match(known_aliases, 'Alphonse Gabriel Capone') => True
name_match(known_aliases, 'Alphonse Francis Capone') => True
name_match(known_aliases, 'Alphonse Edward Capone') => False
5.Middle initial matches middle name
known_aliases = ['Alphonse Gabriel Capone', 'Alphonse F Capone']
name_match(known_aliases, 'Alphonse G Capone') => True
name_match(known_aliases, 'Alphonse Francis Capone') => True
name_match(known_aliases, 'Alphonse E Capone') => False
name_match(known_aliases, 'Alphonse Edward Capone') => False
name_match(known_aliases, 'Alphonse Gregory Capone') => False
6. Transposition of middle name and first name, All of the test cases implemented previously also apply to the transposed name.
'Gabriel Alphonse Capone' is a valid transposition of 'Alphonse Gabriel Capone'
known_aliases = ['Alphonse Gabriel Capone']
name_match(known_aliases, 'Gabriel Alphonse Capone') => True
name_match(known_aliases, 'Gabriel A Capone') => True
name_match(known_aliases, 'Gabriel Capone') => True
name_match(known_aliases, 'Gabriel Francis Capone') => False
7. Last name cannot be transposed
'Alphonse Capone Gabriel' is NOT a valid transposition of 'Alphonse Gabriel Capone'
'Capone Alphonse Gabriel' is NOT a valid transposition of 'Alphonse Gabriel Capone'
known_aliases = ['Alphonse Gabriel Capone']
name_match(known_aliases, 'Alphonse Capone Gabriel') => False
name_match(known_aliases, 'Capone Alphonse Gabriel') => False
name_match(known_aliases, 'Capone Gabriel') => False
public class PatternMatch {
public boolean nameMatch(List<String> knownAliases, String name){
for(String alias : knownAliases){
if(nameMatch(name,alias)) return true;
}
return false;
}
private boolean nameMatch(String name, String alias){
//case7: lastName check
String aliasLastName = extractLastName(alias);
String lastName = extractLastName(name);
if(aliasLastName.equalsIgnoreCase(lastName)){
String firstName = extractFirstName(name);
String middleName = extractMiddleName(name);
//case6: transposition of middle name and first name, by swapping alias firstName and middleName
if(firstNameAndMiddleNameMatch(firstName,middleName,extractFirstName(alias),extractMiddleName(alias))) return true;
if(firstNameAndMiddleNameMatch(firstName,middleName,extractMiddleName(alias),extractFirstName(alias))) return true;
}
return false;
}
private boolean firstNameAndMiddleNameMatch(String firstName, String middleName, String aliasFirstName, String aliasMiddleName){
if(aliasFirstName == null || aliasFirstName.length() == 0) return false;
boolean firstNameMatch = firstName.equalsIgnoreCase(aliasFirstName);
if(firstNameMatch){
if(middleName!=null && aliasMiddleName!=null){
//case1: exact match
boolean middleNameMatch = middleName.equalsIgnoreCase(aliasMiddleName);
//case5: Middle initial matches middle name
boolean middleInitialsMatch = middleName.charAt(0) == aliasMiddleName.charAt(0);
return middleNameMatch || middleInitialsMatch;
}
//case2 & case3: middle name is missing
return firstNameMatch;
}
return false;
}
private String extractFirstName(String name){
if(name == null || name.length() == 0) return null;
String[] split = name.split(" ");
return split[0];
}
private String extractMiddleName(String name){
if(name == null || name.length() == 0) return null;
String[] split = name.split(" ");
StringBuilder sb = new StringBuilder();
for(int i =1;i<split.length-1;i++){
sb.append(" ");
sb.append(split[i]);
}
if(sb.length() == 0) return null;
return sb.toString().trim();
}
private String extractLastName(String name){
if(name == null || name.length() == 0) return null;
String[] split = name.split(" ");
return split[split.length-1];
}
}
2 Answers 2
your code looks fine - some minor issues:
test driven developement
if you have clear requirements, write test to validate them.
"The method should pass the following tests"
i hope you have these tests and just forgot to add them on the code review.
exception handling
instead of returning null
you should throw an excpetion, namely IllegalArgumentAexception, if you don't you violate the Principle of Least Astonishment.
if(name == null || name.length() == 0) throw new IllegalArgumentExcpetion("name must not be null");
Note
Java code convention says: always use bracket on program flow control:
if(name == null || name.length() == 0){
throw new IllegalArgumentExcpetion("name must not be null");
}
Data Object
i would advise you to create a data object for your purpose. instead of using a List<String> knownAliases
i would create a class NameAlias
that gives you the proper access on your desired fields.
class NameAlias {
String getFirstName();
String getLastName();
List<String> getMiddleNames(); //should it be rather a Set than a List?
}
that would increase the readability of your test (of your code) and also increase performance since you need to split the input only once and can re-use the information then.
Note:
you split your input within every method extractFirstName
, extractMiddleName
and extractLastName
- a DataModel class would do it only once!
Note 2:
within your data object you could also do the work for UpperCasting. Again it would be done only once instead for each check:
class NameAlias {
...
String getFirstNameUpperCase();
//and so on...
}
Note 3
if you had a data object you could even reduce the error handling.
class NameAlias {
...
boolean hasMiddleName();
...
}
Deeper Data Modelling?
instead of using String
for a Name you could create a class Name
. Especially when you have interactions (methods) with your name
class Name{
boolean startsWith(String capital);
boolean isCapital(); //true if name.length==1, sorry, my english lacks here
}
more objects
the complexity of your code is high. You should try to reduce the complexity by separating your code into smaller pieces that belong together (objects).
i am missing a class FirstNameMatcher
that is responsible for... well, simply to check if your NameAlias.firstName
matches. and of course Matcher for the other cases as well.
class FirstNameMatcher{
boolean matches(NameAlias alias);
}
aside from reducing mere complexity you would follow the single responsibility principle .
null
Nulls can be useful, however they can also be more trouble than their worth.
Do you really need to use null, or could you get away with using ""
instead? Consider the impact this would have on your code for simplifying statements like this: if(aliasFirstName == null || aliasFirstName.length() == 0)
Multiple middle names
Reading the test cases, it seems like a person may have a middle name, or not have a middle name. It doesn't appear like a person can have multiple middle names. Your code, however allows multiple middle names. Initially this sounds reasonable, because it makes sense that a person can have multiple middle names, however for the purposes of this it seems like scope creep. One of the biggest problems with scope creep like this is the ambiguity that it can introduce. What's the impact of having multiple middle names on rule 6?:
Transposition of middle name and first name, All of the test cases implemented previously also apply to the transposed name
From above, for single middle names, these should be true:
- alias('Alphonse Gabriel Capone'), name('Gabriel Alphonse Capone')
- alias('Alphonse Gabriel Capone'), name('Gabriel A Capone')
Supporting middle names, for the alias('Alphonse Gabriel Francis Capone') what are the expected name matches?
- Alphonse Gabriel Francis Capone // This is the only one that works
- Gabriel Francis Alphonse Capone
- Gabriel Alphonse Francis Capone
- Francis Alphonse Gabriel Capone
- Francis Gabriel Alphonse Capone
That's before considering the impact of rule 5 (Middle initial matches middle name).
Rule 5 - Middle initial matches middle name
The way you've implemented this seems a bit lax to me. Rather than matching an initial for the middle name, you're matching the first letter of the middle name, without considering how long that name is. This means that for the alias "Alphonse Gabriel Capone"
it would match "Alphonse George Capone"
. This seems wrong. Because of the way you've allowed multiple middle names, it would also match "Alphonse George isn't really my name Capone"
.
Data cleansing
The inputs to your matching method might be validated by another class. But if not, then consider the impact a space in the wrong place could have on the matching process. Whilst a user would probably consider the following the same, the program treats them differently "Alphonse Gabriel Capone" and " Alphonse Gabriel Capone" because of the leading space.
-
\$\begingroup\$ returning
null
or""
seems both not satisfying - if it is not guaranteed that a value does return a valid value you should offer a check methodhasXxx()
. ifhasXxx
returns false and you callgetXxx()
logically an exception should be thrown (that what would be the least astounding behaviour) - alternativly you provide a wrapper object that supports this functionallity (namely:Optional<Xxx>
hasisPresent()
). - i am mentioning this with deepest respect on your answer! \$\endgroup\$Martin Frank– Martin Frank2021年11月12日 05:45:26 +00:00Commented Nov 12, 2021 at 5:45 -
\$\begingroup\$ i like your comment about code creep \$\endgroup\$Martin Frank– Martin Frank2021年11月12日 08:46:37 +00:00Commented Nov 12, 2021 at 8:46
-
1\$\begingroup\$ @MartinFrank I mostly agree with you. I prefer
Optional
overhas/get
because it's less prone to race conditions in mutable objects.Optional
allows you to represent a non-present object without worrying about null-pointers and flag to the caller this might happen. Some classes likeString
andList
have a built in way of saying that they're empty. In your example you haveList<String> getMiddleNames()
. Does this throw if there's no middle names, or return anemptyList()
? Sometimes I need to work the code to find the best fit but part of that is bottoming out the requirements. \$\endgroup\$forsvarir– forsvarir2021年11月12日 11:11:33 +00:00Commented Nov 12, 2021 at 11:11 -
1\$\begingroup\$ @MartinFrank as an aside... you really don't need the disclaimer, at least not with me, I've read enough of your stuff on here to respect your opinion and I've got a pretty thick skin, even against the dreaded down vote. 5 developers will have 5 ideas about the best solution... in my case at least sitting down at a problem 5 times will also often end up with 5 (often slightly) different solutions. It's all a journey. \$\endgroup\$forsvarir– forsvarir2021年11月12日 11:16:52 +00:00Commented Nov 12, 2021 at 11:16
-
\$\begingroup\$ thats sound like are a professional worker - i'm glad we can learn from each other :-) \$\endgroup\$Martin Frank– Martin Frank2021年11月12日 11:57:29 +00:00Commented Nov 12, 2021 at 11:57
Explore related questions
See similar questions with these tags.