This question is the first draft, my new revised code is here: Anagrams for a given input 2.0
Here is my code that I wrote, which basically lists the anagrams for a given input.
I did this by reading each line in my dictionary file and comparing it to the anagram. Only if it matched did I add it to the list
, instead of adding all the words to the list
and then sorting through them.
I compared the words by first checking the length of both words, and then if they matched I put put each word into a list and then sorted the list
, then I compared the two lists using the equals()
function. And iterated my counter.
The function of my getOutPut()
was just to neaten up the printout.
What I would like are some critiques on my work and what I should/could have done differently, especially pertaining to semantic efficiency, code correctness, common practices that I may be unaware of, and code efficiency.
This is only my second project, so I want to learn from it to become a better programmer. That's why I'm asking.
The methods are in the order they are called in:
(Note that I split the code up to make it easier to see, all the methods are inside the Anagramatic Class)
package anagramatic;
/**
* IDE : NETBEANS
* Additional Libraries : Guava
* @author KyleMHB
*/
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Scanner;
import javax.swing.JOptionPane;
public class Anagramatic {
static int k;
-
public static void main(String[] args) throws FileNotFoundException{
String anagram;
anagram=input();
ArrayList<String> words=readFile(anagram);
String output= getOutPut(words);
JOptionPane.showMessageDialog(null,
"The anagram "+anagram+" has "+k+" matches, they are:\n\n"+output);
}//PVSM
-
private static String input() {
String input;
input = JOptionPane.showInputDialog(null,
"Enter the Word or words you would like to be processed");
return input;
}//input
-
private static ArrayList readFile(String anag) throws FileNotFoundException{
k=0;
ArrayList<String> list;
ArrayList<Character> a;
ArrayList<Character> b;
try (Scanner s = new Scanner(new File("words.txt"))){
list = new ArrayList<>();
while (s.hasNext()){
String word= s.next();
if (word.length()==anag.length()){
a = new ArrayList<>();
b = new ArrayList<>();
for(int i=0;i!=word.length();i++){
a.add(anag.charAt(i));
b.add(word.charAt(i));
}//forloop to make two lists of the words
Collections.sort(a);
Collections.sort(b);
if(a.equals(b)){
list.add(word);
k++;
}//comparing the two lists
}//if length
}//while
}//try
return list;
}//readfile
-
private static String getOutPut(ArrayList<String> words) {
String wordz="[";
int x=0;
int y=0;
for(int i=0; i!=words.size()-1;i++){
if(x!=7){
wordz+=words.get(i)+", ";
x++;
}else{wordz+="\n";x=0;}
}//for
wordz+=words.get(words.size()-1)+"]";
return wordz;
}//getOutPut
}//Anagramatic
4 Answers 4
I'll focus on the heart of your solution first, which is your readFile()
method. Here's how I would write it:
public static List<String> anagramsInFile(String word, File f)
throws FileNotFoundException {
char[] canonical = canonicalize(original);
ArrayList<String> anagrams = new ArrayList<String>();
try (Scanner s = new Scanner(f)) {
while (s.hasNext()) {
String candidate = s.next();
if ( (candidate.length() == word.length()) &&
(Arrays.equals(canonical, canonicalize(candidate))) ) {
anagrams.add(candidate);
}
}
}
return anagrams;
}
private static char[] canonicalize(String original) {
char[] array = original.toCharArray();
Arrays.sort(array);
return array;
}
Points to note:
- Naming is important.
handleFile()
is vague;anagramsInFile()
is meaningful. Also, I findanag
to be confusing parameter name. - Hard-coding the filename inside the method is a bad idea.
- Unless the caller has a reason to require an
ArrayList
, your method signature should just commit to returning some kind ofList
. Better yet, consider returning aSet
. - You only have to canonicalize the original string once, not once per word in the file.
- Canonicalization deserves a helper method, since you have to do it to both the original string and to the contents of the file.
- Canonicalizing to a
char[]
avoids working character by character, since you can take advantage ofString.toCharArray()
. It's probably more efficient as well. - There's no point in maintaining
k
: it's just thesize()
of the returned list. Maintainingk
as a class variable is particularly egregious, as there's no reason for that count to be part of the state of the class. - If your blocks are so lengthy and deeply nested that you need comments on the closing braces, consider that a Bad Code Smell and address that issue instead. (By the way, your braces were misaligned, which causes confusion that even your close-brace comments can't compensate for.)
Next, let's look at getOutPut()
. Here's how I would write it:
private static String formatOutput(List<String> words) {
StringBuilder out = new StringBuilder('[');
int wordsPrinted = 0;
Iterator<String> w = words.iterator();
while (w.hasNext()) {
out.append(w.next());
if (w.hasNext()) {
out.append((++wordsPrinted % 8 == 0) ? "\n" : ", ");
}
}
return out.append(']').toString();
}
Points to note:
- When concatenating so many strings, you really need to use a
StringBuilder
. Repeatedstring + string
would create a temporary string each time. - Your output is buggy: you drop every eighth word, replacing it with a newline.
- Be careful with edge cases: your code crashes when
words
is empty. - Try to structure your code so it's obvious that certain properties hold true. For example, in my solution, you can see that every word gets printed (because if
.hasNext()
is true, the next word will get appended). You can also see that every word except the last word is followed by a delimiter (either newline or comma). - Avoid cryptically named variables like
x
andy
. In fact, you don't even usey
. - Naming your result
wordz
when there's another variable namedwords
makes the code a pain to read.
Finally:
public static void main(String[] args) throws FileNotFoundException{
String word = input("Enter the word or words you would like to be processed");
List<String> anagrams = anagramsInFile(word, new File("words.txt"));
display(String.format("The word %s has %d matches. They are:\n\n%s",
word, anagrams.size(), formatOutput(anagrams)));
}
private static String input(String prompt) {
return JOptionPane.showInputDialog(null, prompt);
}
private static void display(String message) {
JOptionPane.showMessageDialog(null, message);
}
Observations:
- I've decomposed the work differently. There's
input()
anddisplay()
, which are analogous to each other, and encapsulate the Swing UI. (If you ever want to convert the program to work in the text console, you just modify those functions.) - More importantly, you can now see at a glance what the whole program does just by looking at
main()
. Note that all string constants are there as well: the prompt, the filename, and the output. - Don't declare any more variables than you need to. If possible, when you declare variables, assign them in the same statement.
- Use
String.format()
.
-
\$\begingroup\$ Thanks a lot, Im going to go through my program again and see what I can change now. I just need to read up on some of the functions you used there. I will add the reformed code when Im done. And yeah I also picked up that I didn't need
k
, but only this morning. And about the declarations I actually generally do not predeclare them but NetBeans suggests to split the declaration and initialization! So I thought it was how it was supposed to be. Thanks of the detailed reply, its really appreciated. \$\endgroup\$Dinocv– Dinocv2013年10月04日 08:58:40 +00:00Commented Oct 4, 2013 at 8:58 -
\$\begingroup\$ out.append((++wordsPrinted % 8 == 0) ? "\n" : ", "); Is this a kind of If statement?
if(worrdsPrinte%8==0){ out.append("\n"); }else{out.append(", ");}
\$\endgroup\$Dinocv– Dinocv2013年10月04日 10:07:27 +00:00Commented Oct 4, 2013 at 10:07 -
\$\begingroup\$ The ternary operator is like an if-else, but for containing expressions rather than statements. Not only is
if (...) { ... } else { ... }
more verbose, it doesn't convey as effectively the sense that that statement is all about appending something toout
no matter what. \$\endgroup\$200_success– 200_success2013年10月04日 10:20:14 +00:00Commented Oct 4, 2013 at 10:20 -
\$\begingroup\$ Ok thanks, also I wanted to know aboou the if statement:
if (w.hasNext()) {
Is it needed? doesnt thewhile loop
already check ifw.hasNext
? Thank you for your patience, and excuse my ignorance. \$\endgroup\$Dinocv– Dinocv2013年10月04日 10:25:07 +00:00Commented Oct 4, 2013 at 10:25 -
\$\begingroup\$ Nevermind, I missed the
out.append(w.next());
would iterate the position. \$\endgroup\$Dinocv– Dinocv2013年10月04日 10:32:08 +00:00Commented Oct 4, 2013 at 10:32
Well, the most obvious improvement is:
You make list a
of the anagram and sort it for every word in the list. You could just do that once before the loop and use that sorted list as reference.
There might be cleverer algorithms for finding anagrams but your approach is fairly simple and easy to understand and read which I'd prefer over clever unless clever gives you a notable performance advantage.
Others:
Do not comment every closed bracket. This adds no value and just clutter. Worse, it makes refactoring code painful as you have to keep the comments updated (no comments are better than wrong comments)
getOutPut
should be spelled getOutput
getOutput
contains a bug because you compare i != words.size() - 1
. This comparison is done before the next iteration starts so you will omit the last element. In generally in for loops over arrays you do something like for (int i = 0; i < size; ++i)
I would consider using StringBuilder
in getOutput
:
private static String getOutPut(ArrayList<String> words) {
StringBuilder builder = new StringBuilder("[");
int last = words.size();
for (int i = 0; i < last; ++i) {
builder.append(words.get(i));
// only append "," and new lines if it is not the last word in the list
if (i < last - 1) {
builder.append(", ");
// every 7th i append a new line
if ((i + 1) % 7 == 0)
builder.append("\n");
}
}
builder.append("]");
return builder.toString();
}
When you concatenate strings like you did you will potentially create a whole lot of string objects (strings are immutable so wordz += "a"
will create a new string and not append a
to the existing string) which are not really useful as they are only temporary (like "word1", "word1, word2", "word1, word2, word3" etc.). The StringBuilder
avoids that but simply keeping a list of all appended strings internally and only building the final string in toString()
.
Disclaimer: I'm not doing much Java development so it might be possible to implement getOutput
a bit more elegant (it also means that my implementation might contain a bug as I haven't tested it in code). Using join
of the string utilities was mentioned but I'm not sure how you could use that to insert an additional delimiter every x-th element.
- Stop declaring your variables and then setting their values. You can just set the values in the same line that you declare them.
- Always use the appropriate type parameter for generics (i.e.,
ArrayList<String>
, not justArrayList
orArrayList<>
. - Don't create variables where you don't need to (i.e.,
static int k
,int x
,int y
). - Wherever possible, separate out different functions into different methods. For example, in your refactored code below, I split the anagram generation into a separate method from the actual file IO. This not only improves code readability but also prevents your program from holding a lock on the file for the duration of the run. (Think about it; there's no need for the file to be locked up while we're actually checking the list for the anagrams.)
- People generally use the interface rather than the underlying object whenever possible. (
List<String>
vs.ArrayList<String>
) Not really sure why, to be honest, but this is the accepted convention. - The
input()
function probably doesn't even need to exist, but if you're going to leave it, it can be a one-liner. - Use variable and method names that are actually meaningful.
- Don't convolute code where it's not necessary. For example, see the refactored
formatOutput()
method below.
.
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import javax.swing.JOptionPane;
public class Anagramatic {
public static void main(String[] args) throws FileNotFoundException {
String anagram = input();
List<String> anagrams = getAnagrams(anagram, readFile(anagram));
String output = formatOutput(anagrams);
JOptionPane.showMessageDialog(null,
"The anagram "+anagram+" has "+anagrams.size()+" matches, they are:\n\n"+output);
}
private static String input() {
return JOptionPane.showInputDialog(null, "Enter the word or words you would like to be processed");
}
private static List<String> readFile(String anagram) throws FileNotFoundException {
List<String> list = new ArrayList<String>();
try (Scanner s = new Scanner(new File("words.txt"))) {
while(s.hasNext()) {
list.add(s.next());
}
}
return list;
}
private static List<String> getAnagrams(String anagram, List<String> words) {
List<String> list = new ArrayList<String>();
List<Character> a, b;
for(String word : words) {
if(anagram.length() == word.length()) {
a = new ArrayList<Character>();
b = new ArrayList<Character>();
for(int i = 0; i < word.length(); i++) {
a.add(anagram.charAt(i));
b.add(word.charAt(i));
}
if(a.containsAll(b)){
list.add(word);
}
}
}
return list;
}
private static String formatOutput(List<String> words) {
String formattedWords = "[";
for(int i = 0; i < words.size(); i++) {
formattedWords += words.get(i);
if(i % 7 == 0) formattedWords += "\n";
else formattedWords += ", ";
}
formattedWords += "]";
return formattedWords;
}
}
The organisation of the code is not the best as some methods are probably doing more than they should.
You should write a method to check whether two words are anagrams or not.
There is not point in storing the number of anagrams in k as it corresponds to the length of the list anyway.
You could use join from StringUtils in getOutput
.
If you plan to look for anagrams in the same file multiple time, you could go for a different algorithm in order not to have to reprocess the whole file every time. The trick is to compute for each word its "sorted" version and to keep a mapping from "sorted" word to the corresponding list of real words. Then checking for the anagrams for a word is nothing but sorting it and looking for it in the mapping.
I have to go, I'll try to finish this.
-
\$\begingroup\$ Yeah I know the
readFile
method does more than it should. But I wanted it to only add the correct entries from the file to this list instead of loading the entire file and add another set of loops to process that list. \$\endgroup\$Dinocv– Dinocv2013年10月03日 23:22:56 +00:00Commented Oct 3, 2013 at 23:22