1
\$\begingroup\$

I have millions of files in a folder. File types will be with the name PES_timestamp and DOM_timestamp. I have to aggregate all PES and DOM filenames into one file which has the same filename.

The Aggregator program picks up only matching PES and DOM file pairs based on the timestamp of the file name from the incoming folder.

Example:

pes234343440
dom234343440

I thought of this idea to create 3 lists:

  1. Holds all files in the folder
  2. Holds only PES files
  3. Holds all DOM files

Finally, I will read all the file name from list and aggregate them from this program:

public void mergeFiles(File output,List<String> list) throws IOException {
 BufferedWriter writer=null;
 BufferedReader br = null;
 BufferedReader r = null;
 try {
 writer = new BufferedWriter(new FileWriter(output)); 
 for(int i=0;i<list.size();i++){
 br = new BufferedReader(new FileReader("INCOMINGFOLDER/"+list.get(0))); 
 AggregateNj obj=new AggregateNj(); 
 String s1 =null; 
 while ((s1 = br.readLine()) != null) { 
 writer.write(s1);
 writer.write("\n");
 }
 } 
 } catch (IOException e){
 e.printStackTrace();
 }finally{
 if(br != null)br.close();
 if(r != null)r.close();
 if(writer != null)writer.close();
 }
 }

Is this approach correct, or can you suggest a better approach?

Jamal
35.2k13 gold badges134 silver badges238 bronze badges
asked Jul 2, 2015 at 5:59
\$\endgroup\$
4
  • \$\begingroup\$ For which OS will this be running on? \$\endgroup\$ Commented Jul 2, 2015 at 8:13
  • \$\begingroup\$ @h.j.k. Ubuntu linux \$\endgroup\$ Commented Jul 2, 2015 at 11:11
  • \$\begingroup\$ You may want to consider writing a shell script, and asking on askubuntu or Unix&Linux.SE for help on that... \$\endgroup\$ Commented Jul 2, 2015 at 11:12
  • \$\begingroup\$ e.g. (assuming bash): cd /path/to/dir; for i in pes*; do cat $i ${i/pes/dom} > /path/to/dest/${i/pes/combined}; done \$\endgroup\$ Commented Jul 2, 2015 at 11:15

1 Answer 1

2
\$\begingroup\$

Spaces. You need spaces around things. For example, keywords should always have whitespace on either side. Brackets, too. Parentheses should have spaces on the outside -- that is, to the right of a closing, to the left of an opening one. Also, on either side of binary operators. In general, your use of whitespace is inconsistent and unconventional.

Aside from that:

You need to work on your variable names. Things like r, list, s1, and obj don't say anything about what they do or for what purpose they exist. In short, they're meaningless.

You always open list.get(0), so you're always opening the same file. I strongly suspect you mean list.get(i).

With that change, you can change the for from a normal for to an advanced one, which makes it much simpler semantically.

Still focusing on that line: Why are you assuming that the files to be merged are in INCOMINGFOLDER? That's a pretty risky assumption for a method that has no context other than being called mergeFiles, which suggests that it should merge any files, without modifying the paths.

Rather than hardcoding a line separator which doesn't work on all OSs -- which, incidentally, sorta breaks the biggest reason for using Java at all -- use BufferedWriter#newLine().

Instead of manually checking for null in a finally, just use a try with resources. It's much cleaner.

It's generally bad to swallow up errors without telling the use of the function. Rather than printing the stacktrace of possible IOExceptions, I'd recomment rethrowing it. I suggest removing the catch block -- that is, passing it up the chain.

You never use the instance of AggregateNj, so it's useless, so I got rid of it. You also never use BufferedReader r, so I removed that as well.

With all those put in, here's your (much nicer IMO) code:

public void mergeFiles(File outputFile, List<String> inputFiles) throws IOException {
 try (BufferedWriter output = new BufferedWriter(new FileWriter(outputFile))) {
 for(String fileName : inputFiles){
 try (BufferedReader in = new BufferedReader(new FileReader(fileName))) {
 String line;
 while ((line = in.readLine()) != null) {
 output.write(line);
 output.newLine();
 }
 }
 }
 }
}

You could also return a boolean depending on whether or not it completed successfully, and the method would look like this:

public boolean mergeFiles(File outputFile, List<String> inputFiles) throws IOException {
 try (BufferedWriter output = new BufferedWriter(new FileWriter(outputFile))) {
 for(String fileName : inputFiles){
 try (BufferedReader in = new BufferedReader(new FileReader(fileName))) {
 String line;
 while ((line = in.readLine()) != null) {
 output.write(line);
 output.newLine();
 }
 }
 }
 return true;
 } catch (IOException e) {
 return false;
 }
}

However, I'd recommend throwing an error instead of returning a boolean, as it gives a lot more information about what went wrong.

Note that I didn't review your design, since this is Code Review. If you want the design reviewed, please provide the code.

answered Jul 2, 2015 at 7:29
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.