Below is the code for generating user recommendations with Mahout.
DataModel dm = new FileDataModel(new File(inputFile));
UserSimilarity sim = new LogLikelihoodSimilarity(dm);
UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, sim, dm);
GenericUserBasedRecommender recommender = new GenericUserBasedRecommender(dm, neighborhood, sim);
After the recommendations are generated, I am trying to write them to a file like this:
FileWriter writer = new FileWriter(outputFile);
// Iterate over user IDs (not item IDs): recommend() expects a user ID.
for (LongPrimitiveIterator userIterator = dm.getUserIDs(); userIterator.hasNext();) {
    long user = (long) userIterator.next();
    List<RecommendedItem> recs = recommender.recommend(user, numOfRec);
    for (RecommendedItem item : recs) {
        writer.write(user + "," + item.getItemID() + ","
                + item.getValue() + "\n");
    }
}
writer.close();
This code for writing to the file is taking a lot of time. How can I speed up the write operations?
I tried using a BufferedWriter, but did not see any speed-up.
1 Answer
You can gain some efficiency by wrapping the writer in a BufferedWriter.
If you are using Java 7 or later, you should use try-with-resources to close the writer automatically; otherwise, use try-finally:
try(BufferedWriter writer = new BufferedWriter(new FileWriter(outputFile))){
//the for loops
}
with try-finally:
BufferedWriter writer = new BufferedWriter(new FileWriter(outputFile));
try{
//the for loops
}finally{
writer.close();
}
This ensures that the writer is closed should an exception occur.
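Second, concatenating Strings for each output line is not the most efficient approach. Instead, pass each piece to the writer separately. Note that Writer has no write overload for long or float, so the numeric values are converted with Long.toString and Float.toString:
writer.write(Long.toString(user));
writer.write(",");
writer.write(Long.toString(item.getItemID()));
writer.write(",");
writer.write(Float.toString(item.getValue()));
writer.newLine();//only available in BufferedWriter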
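Putting the two suggestions together, a minimal self-contained sketch might look like the following. It does not use Mahout: the `Rec` class is a hypothetical stand-in for `RecommendedItem`, and the `Map` stands in for whatever the recommender returns per user. It also writes `'\n'` rather than calling `newLine()` so that the output is identical on every platform; that is a choice of this sketch, not part of the advice above.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RecommendationCsvWriter {

    /** Hypothetical stand-in for Mahout's RecommendedItem: an item ID and a score. */
    static final class Rec {
        final long itemId;
        final float value;

        Rec(long itemId, float value) {
            this.itemId = itemId;
            this.value = value;
        }
    }

    /** Writes one "user,item,value" line per recommendation through a buffered writer. */
    static void writeAll(Map<Long, List<Rec>> recsByUser, Path out) throws IOException {
        // try-with-resources flushes and closes the writer even if an exception is thrown
        try (BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            for (Map.Entry<Long, List<Rec>> entry : recsByUser.entrySet()) {
                // Convert the user ID once per user instead of once per line
                String user = Long.toString(entry.getKey());
                for (Rec rec : entry.getValue()) {
                    writer.write(user);
                    writer.write(',');
                    writer.write(Long.toString(rec.itemId));
                    writer.write(',');
                    writer.write(Float.toString(rec.value));
                    writer.write('\n');
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Map<Long, List<Rec>> sample = new LinkedHashMap<>();
        sample.put(7L, Arrays.asList(new Rec(42L, 4.5f), new Rec(43L, 3.0f)));
        Path out = Files.createTempFile("recs", ".csv");
        writeAll(sample, out);
        System.out.print(new String(Files.readAllBytes(out), StandardCharsets.UTF_8));
    }
}
```

Whether this is measurably faster than concatenating the line first depends on the JVM and the data, so it is worth timing both variants on a realistic input.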
As a more general suggestion, there is a better approach than reading the entire file, doing some processing, and then writing the output. Instead, you can read only as much as you need to process one part of the data, write that result out, and repeat. Whether this is possible depends on what you are actually doing with the data: it only works if you have a one-pass transform.
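As an illustration of the one-pass idea, here is a generic sketch that reads one line, transforms it, and writes it out before reading the next, so only a single line is held in memory at a time. The class name, method name, and the `transform` parameter are assumptions made for this example; a user-based recommender that needs the whole data model up front would not fit this pattern as-is.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.function.UnaryOperator;

class StreamingTransform {

    /**
     * Reads the input line by line, applies the transform to each line,
     * and writes the result immediately. Returns the number of lines processed.
     * The caller remains responsible for closing the underlying reader and writer.
     */
    static int copyTransformed(Reader in, Writer out, UnaryOperator<String> transform)
            throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            writer.write(transform.apply(line));
            writer.write('\n');
            count++;
        }
        // Push anything still sitting in the buffer to the underlying writer
        writer.flush();
        return count;
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        int lines = copyTransformed(new StringReader("1,10,4.5\n2,20,3.0\n"), out,
                line -> line.replace(',', ';'));
        System.out.println(lines + " lines written");
        System.out.print(out);
    }
}
```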
Would BufferedWriter with strings passed separately outperform a StringBuilder with one call to writer.write? I feel the number of separate write calls would hinder performance, and if they could be reduced to one write call, it would be more efficient. – Clark Kent, Nov 9, 2015 at 16:04
@SaviourSelf But a BufferedWriter will flush its own buffer regularly as it gets full instead of allocating more memory and copying everything over. – ratchet freak, Nov 9, 2015 at 16:07
recommend that is taking that long?
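One rough way to check whether the time goes into computing the recommendations or into writing them is to accumulate `System.nanoTime()` differences around each phase. The sketch below is an assumption-laden illustration: `timePhases` and the `compute` function are hypothetical names, and `compute` stands in for the call to `recommender.recommend(...)`. Timings taken this way are coarse, so a profiler would give a more reliable picture.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.function.LongFunction;

class PhaseTiming {

    /**
     * Runs compute-then-write for every ID and returns
     * {nanoseconds spent computing, nanoseconds spent writing}.
     */
    static long[] timePhases(long[] ids, LongFunction<String> compute, Writer out)
            throws IOException {
        long computeNanos = 0;
        long writeNanos = 0;
        for (long id : ids) {
            long t0 = System.nanoTime();
            String line = compute.apply(id); // stand-in for the recommendation step
            long t1 = System.nanoTime();
            out.write(line);
            out.write('\n');
            long t2 = System.nanoTime();
            computeNanos += t1 - t0;
            writeNanos += t2 - t1;
        }
        return new long[] {computeNanos, writeNanos};
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        long[] nanos = timePhases(new long[] {1, 2, 3}, id -> id + "," + (id * 10), out);
        System.out.println("compute ns: " + nanos[0] + ", write ns: " + nanos[1]);
    }
}
```

If the compute total dominates, buffering the writer will not help much, which would be consistent with seeing no speed-up from BufferedWriter.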