4
\$\begingroup\$

This program works, but it's pretty clunky and fragile. I want to make it more dynamic / solid.

For instance some of the functionality depends on the order I entered things into an ArrayList, that seems to me to be an example of a terribly odious smelly code.

//These structures hold the word list & corresponding flag indicator for each language
static HashMap<String, Boolean> de_map = new HashMap<String, Boolean>();
static HashMap<String, Boolean> fr_map = new HashMap<String, Boolean>();
static HashMap<String, Boolean> ru_map = new HashMap<String, Boolean>();
static HashMap<String, Boolean> eng_map = new HashMap<String, Boolean>();
public static void main( String[] args ) throws URISyntaxException, IOException
{
 //put them all in a list for later processing
 List<HashMap<String, Boolean>> lang_maps = new ArrayList<HashMap<String, Boolean>>();
 lang_maps.add( de_map );
 lang_maps.add( fr_map );
 lang_maps.add( ru_map );
 lang_maps.add( eng_map );
 //this holds all the dictionary files, i.e. word lists garners from language folders
 ArrayList<Path> dictionary_files = new ArrayList<Path>();
 //THIS SEGMENT IS FOR DYNAMICALLY LOCATING THE DIRECTORY, SO THE PROGRAM WORKS "OUT OF THE BOX"
/*******************************************************************************************************************************************/
 File currentDir = new File( "." ); // Read current file location
 //System.out.println(currentDir.getAbsolutePath());
 File targetDir = null;
 if (currentDir.isDirectory()) 
 {
 targetDir = new File( currentDir, "word_lists_1" ); // Construct the target directory file with the right parent directory
 }
 if ( targetDir != null && targetDir.exists() )
 {
 SearchDirectories.listDirectoryAndFiles( targetDir.toPath(), dictionary_files );
 }
/*******************************************************************************************************************************************/
 //this populates word presence data structs for each language
 for(Path dir : dictionary_files)
 {
 String word_holding_directory_path = dir.toString().toLowerCase();
 BufferedReader br = new BufferedReader( new FileReader( dir.toString() ) );
 String line = null;
 while ((line = br.readLine()) != null)
 {
 //System.out.println(line);
 if(word_holding_directory_path.toLowerCase().contains("/de/") )
 {
 de_map.put(line, false); 
 }
 if(word_holding_directory_path.toLowerCase().contains("/ru/") )
 {
 ru_map.put(line, false);
 }
 if(word_holding_directory_path.toLowerCase().contains("/fr/") )
 {
 fr_map.put(line, false);
 }
 if(word_holding_directory_path.toLowerCase().contains("/eng/") )
 {
 eng_map.put(line, false);
 }
 }
 }
 //print debugging
// for (Map.Entry entry : de_map.entrySet()) 
// {
// System.out.println(entry.getKey() + ", " + entry.getValue());
// }
/*******************************************************************************************************************************************/ 
 //GET THE USER INPUT
 ArrayList<String> input_text = new ArrayList<String>();
 Scanner in = new Scanner(System.in);
 System.out.println("Please enter a sentence: ");
 String [] tokens = in.nextLine().split("\\s");
 for (int i = 0; i < tokens.length; i++)
 {
 input_text.add( tokens[i].toString() );
 }
/*******************************************************************************************************************************************/
 //iterate over the hashmaps of all the languages we're considering
 //and flip the bool if there is a hit
 for( int i = 0; i < lang_maps.size(); i++ )
 {
 HashMap<String, Boolean> working_lang_map = lang_maps.get( i );
 Iterator it = working_lang_map.entrySet().iterator();
 while (it.hasNext()) 
 {
 Map.Entry pair = (Map.Entry)it.next();
 //System.out.println(pair.getKey() + ", " + pair.getValue());
 for(String word : input_text)
 {
 if(pair.getKey().toString().toLowerCase().trim().equals( word.toLowerCase().trim() ) )
 {
 working_lang_map.put(pair.getKey().toString(), true);
 }
 } 
 }
 }
 // !!!!!!!!!!!!!!
 // this is fragile. since it depends on order. need to make it 
 // more robust
 int total_de = 0;
 int total_fr = 0;
 int total_ru = 0;
 int total_eng = 0;
 //iterate over all the hashmaps and count the hit totals
 for( int i = 0; i < lang_maps.size(); i++ )
 {
 HashMap<String, Boolean> working_lang_map = lang_maps.get( i );
 //System.out.println( working_lang_map );
 for (Map.Entry entry : working_lang_map.entrySet()) 
 {
 if( ((Boolean) entry.getValue()) == true )
 {
 //this part is really stupid
 if( i == 0)
 {
 total_de++;
 }
 if( i == 1)
 {
 total_fr++;
 }
 if( i == 2)
 {
 total_ru++;
 }
 if( i == 3)
 {
 total_eng++;
 }
 }
 }
 }
 HashMap< String, Integer > most_hits_lang = new HashMap<String, Integer>();
 most_hits_lang.put( "German", total_de );
 most_hits_lang.put( "French", total_fr );
 most_hits_lang.put( "Russian", total_ru );
 most_hits_lang.put( "English", total_eng );
 Entry<String,Integer> maxEntry = null;
 for(Entry<String,Integer> entry : most_hits_lang.entrySet()) 
 {
 if (maxEntry == null || entry.getValue() > maxEntry.getValue()) 
 {
 maxEntry = entry;
 }
 }
 System.out.println( maxEntry.getKey() );
}

The full code and everything necessary to run it can be found on my GitHub page.

200_success
146k22 gold badges191 silver badges481 bronze badges
asked May 17, 2015 at 13:45
\$\endgroup\$

1 Answer 1

3
\$\begingroup\$

You're right about the clunky and fragile. A lot of improvements are possible. Most notably, many unnecessary elements can be rewritten simpler, shorter, cleaner. Let's go from the top.

Prefer interface types instead of implementation

Instead of this:

static HashMap<String, Boolean> de_map = new HashMap<String, Boolean>();

If you don't have a specific need for a HashMap, declare as a Map:

static Map<String, Boolean> de_map = new HashMap<String, Boolean>();

Do this everywhere.

Map<String, Boolean> or Set<String>

Maps with boolean as value type are always suspicious. Very often they can be rewritten as Set<> instead. We'll get to that too a bit later.

Use the diamond operator

Instead of:

static Map<String, Boolean> fr_map = new HashMap<String, Boolean>();

Use the modern diamond operator:

static Map<String, Boolean> fr_map = new HashMap<>();

This is supported as of Java 7, which is the lowest officially supported Java version. Consider migrating to it ASAP.

Another similar example, instead of this:

 List<HashMap<String, Boolean>> lang_maps = new ArrayList<HashMap<String, Boolean>>();

Should be:

 List<Map<String, Boolean>> lang_maps = new ArrayList<>();

Don't throw exceptions in vain

The main method is declared to throw URISyntaxException, but I don't see where that can come from. Perhaps from SearchDirectories.listDirectoryAndFiles ? I doubt you really need it.

Perhaps related to this (or perhaps not at all) is that SearchDirectories.listDirectoryAndFiles takes a Path as parameter. I'm wondering if that's necessary at all. Can't you rewrite it with a simple File ?

Unnecessary stuff

Instead of this:

 File currentDir = new File( "." ); // Read current file location
 File targetDir = null;
 if (currentDir.isDirectory()) 
 {
 targetDir = new File( currentDir, "word_lists_1" ); // Construct the target directory file with the right parent directory
 }
 if ( targetDir != null && targetDir.exists() )
 {
 // ...
 }

This is exactly the same:

 File currentDir = new File(".");
 targetDir = new File(currentDir, "word_lists_1"); 
 if (targetDir.exists()) {
 // ...
 }

Simpler isn't it? I also removed the redundant comments, and changed the formatting to the style used by major IDEs. Speaking of comments, putting comments at the end of lines is not a great idea, as it makes the lines longer, and forces the reader to scroll horizontally to the far right.

if or if-else ?

About this code:

 if (word_holding_directory_path.toLowerCase().contains("/de/")) {
 de_map.put(line, false);
 }
 if (word_holding_directory_path.toLowerCase().contains("/ru/")) {
 ru_map.put(line, false);
 }
 if (word_holding_directory_path.toLowerCase().contains("/fr/")) {
 fr_map.put(line, false);
 }
 if (word_holding_directory_path.toLowerCase().contains("/eng/")) {
 eng_map.put(line, false);
 }

Can you have paths like something/fr/.../eng/.../de/? I suspect that a path will only contain either "fr", "eng", "de", and so on. In which case these conditions should be chained with else-if.

Also, instead of evaluating word_holding_directory_path.toLowerCase() multiple times, it would be better to evaluate it only once.

Even worse, there's really no need to re-evaluate any of these conditions for each line of input. This would be better:

 Map<String, Boolean> map;
 if (word_holding_directory_path.toLowerCase().contains("/de/")) {
 map = de_map;
 } else if (word_holding_directory_path.toLowerCase().contains("/ru/")) {
 map = ru_map;
 } else if (word_holding_directory_path.toLowerCase().contains("/fr/")) {
 map = fr_map;
 } else if (word_holding_directory_path.toLowerCase().contains("/eng/")) {
 map = eng_map;
 } else {
 continue;
 }
 while ((line = br.readLine()) != null) {
 map.put(line, false);
 }

Unnecessary stuff 2

Instead of this:

 ArrayList<String> input_text = new ArrayList<String>();
 String [] tokens = in.nextLine().split("\\s");
 for (int i = 0; i < tokens.length; i++)
 {
 input_text.add( tokens[i].toString() );
 }

This is exactly the same:

 List<String> input_text = Arrays.asList(in.nextLine().split("\\s"));

But actually, you don't need a List<String> at all later in your code. So you could as well just replace the above with:

String[] input_text = in.nextLine().split("\\s");

And the rest of the code will work the same way.

Unnecessary stuff 3

The nested loop after the you get the user input can be simplified using enhanced for-each loops, dropping unnecessary toString calls, and written better as:

 for (Map<String, Boolean> working_lang_map : lang_maps) {
 for (Map.Entry<String, Boolean> entry : working_lang_map.entrySet()) {
 for (String word : input_text) {
 if (entry.getKey().toLowerCase().trim().equals(word.toLowerCase().trim())) {
 working_lang_map.put(entry.getKey(), true);
 }
 }
 }
 }

... but ... something's awfully wrong here. Iterating over entries of a map to check if it contains some elements is not normal. Maps are designed for fast access by key, and you're squandering that away.

If you rewrote the maps to contain trimmed and lowercased strings (it seems you can, the rest of the code will work just the same), then you could simplify the above nested loop and make it more efficient like this:

 for (Map<String, Boolean> working_lang_map : lang_maps) {
 for (String word : input_text) {
 String normalized = word.trim().toLowerCase();
 if (working_lang_map.containsKey(normalized)) {
 working_lang_map.put(normalized, true);
 }
 }
 }

Simplified implementation

I suggest to replace de_map, eng_map, and the others with Set<String> of words, so that you populate:

Set<String> enWords = new HashSet<String>();
Set<String> frWords = new HashSet<String>();
Set<String> ruWords = new HashSet<String>();
Set<String> deWords = new HashSet<String>();

And a map of languages:

Map<String, Set<String>> langMaps = new HashMap<>();
langMaps.put("English", enWords);
langMaps.put("French", frWords);
langMaps.put("Russian", ruWords);
langMaps.put("German", deWords);

And then, instead of setting flags and then counting them, you can count directly, replacing the final fragile part with this simpler and more robust approach:

 int maxCount = 0;
 String maxLang = null;
 for (Map.Entry<String, Set<String>> entry : langMaps.entrySet()) {
 int count = 0;
 Set<String> words = entry.getValue();
 for (String word : input_text) {
 String normalized = word.trim().toLowerCase();
 if (words.contains(normalized)) {
 ++count;
 }
 }
 if (count > maxCount) {
 maxLang = entry.getKey();
 maxCount = count;
 }
 }
 System.out.println(maxLang);
answered May 18, 2015 at 22:49
\$\endgroup\$
1
  • 1
    \$\begingroup\$ that was really awesome man, thank you. I wonder, if perhaps you might consider this unrelated issue which I've been having with that code. \$\endgroup\$ Commented May 19, 2015 at 5:34

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.