Is there a faster, more efficient way to read a text file than this implementation, taking a phone's capabilities into account?
dictionary = new ArrayList<String>();
long start = System.currentTimeMillis();
int count = 0;
try{
    InputStream inputStream = context.getAssets().open("words.txt");
    InputStreamReader inputStreamReader = new InputStreamReader(inputStream);
    BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
    String word;
    while((word = bufferedReader.readLine()) != null){
        dictionary.add(word);
        count++;
    }
    inputStream.close();
    inputStreamReader.close();
    bufferedReader.close();
}catch(IOException e){
    e.printStackTrace();
}
long end = System.currentTimeMillis();
double t = ((end - start) / 1000.0);
System.out.println("Time to read database from file " + count + " items " + t + " seconds");
Output:
Time to read database from file 272403 items 1.112 seconds
Update: After taking @rolfl's advice into account and doing a little more digging, this is what I came up with. Any further advice or a tidy-up would be very welcome.
dictionary = new ArrayList<>(300000);
long start = System.currentTimeMillis();
InputStream inputStream = null;
try{
    inputStream = context.getAssets().open("words.txt");
}catch(IOException e){
    e.printStackTrace();
}
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
byte[] buff = new byte[1048576];
try{
    for(int i; (i = inputStream.read(buff)) != -1; ){
        byteArrayOutputStream.write(buff, 0, i);
    }
}catch(IOException ex){
    ex.printStackTrace();
}
String[] contents = byteArrayOutputStream.toString().split("\n");
for(int i = 0; i < contents.length; i++){
    dictionary.add(contents[i]);
}
long end = System.currentTimeMillis();
double t = ((end - start) / 1000.0);
System.out.println("Time to read database from file " + dictionary.size() + " items " + t + " seconds");
Output:
Time to read database from file 272403 items 0.708 seconds
1 Answer
- Don't do unnecessary work. You have the count variable, but you also have the dictionary, which has a size() method. There's no need for the count.
- Android supports Java 7 language features, so use them. In this case, try-with-resources would be your friend.
- Guess the size of the ArrayList that you may need. In this case, you should be a little generous and pre-size it at, say, 300,000 entries.
- Android now supports (since KitKat) the diamond operator, so there should be no need to declare the generic type of the ArrayList as <String>.
- I actually like the while loop you have. It is my preferred way of doing line-by-line IO too.
Here's a 'cleaned up' version of your code:
private static final int INITIALSIZE = 300000;
....
long start = System.currentTimeMillis();
dictionary = new ArrayList<>(INITIALSIZE);
try (BufferedReader bufferedReader = new BufferedReader(
        new InputStreamReader(context.getAssets().open("words.txt")));) {
    String word;
    while((word = bufferedReader.readLine()) != null){
        dictionary.add(word);
    }
}catch(IOException e){
    e.printStackTrace();
}
long end = System.currentTimeMillis();
double t = ((end - start) / 1000.0);
System.out.println("Time to read database from file " + dictionary.size()
        + " items " + t + " seconds");
So, that's a "simplified" version. How do you make it faster?
Well, there are a few things. First up, nothing is certain unless you test it, so run some experiments. Things I would try:
- Specify a buffer size on the BufferedReader, something large like 1024 * 1024 (a megabyte). This should increase the size of each IO operation.
- The pre-sized ArrayList will help.
- Consider reading the whole data file into a ByteArrayOutputStream, then converting that in one go into a large String, then splitting the String on line breaks.
In essence, the larger the IO sizes the better, and the larger the cache sizes are, the better.
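As a rough sketch of the two IO experiments suggested above, something like the following could be timed on a real device. It is only an illustration under some assumptions: the file is taken to be UTF-8, a plain InputStream parameter stands in for context.getAssets().open("words.txt"), and the names WordListLoader, readBuffered and readAllThenSplit are made up for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class; not part of the original post.
final class WordListLoader {
    // 1 MiB, the size suggested in the answer; tune it by measuring.
    private static final int BUFFER_SIZE = 1024 * 1024;

    /** Variant 1: line-by-line reading through a large BufferedReader buffer. */
    static List<String> readBuffered(InputStream in, int expectedSize) throws IOException {
        List<String> words = new ArrayList<>(expectedSize);
        // Closing the reader also closes the wrapped stream.
        // UTF-8 is an assumption about the file's encoding.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8), BUFFER_SIZE)) {
            String word;
            while ((word = reader.readLine()) != null) {
                words.add(word);
            }
        }
        return words;
    }

    /** Variant 2: read all bytes, decode once, then split on line breaks. */
    static List<String> readAllThenSplit(InputStream in, int expectedSize) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream(BUFFER_SIZE);
        try (InputStream source = in) {
            byte[] buffer = new byte[BUFFER_SIZE];
            int n;
            while ((n = source.read(buffer)) != -1) {
                bytes.write(buffer, 0, n);
            }
        }
        String text = bytes.toString(StandardCharsets.UTF_8.name());
        List<String> words = new ArrayList<>(expectedSize);
        if (!text.isEmpty()) {
            // "\r?\n" also strips Windows line endings; unlike readLine(),
            // split() drops trailing blank lines.
            Collections.addAll(words, text.split("\r?\n"));
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // Tiny in-memory sample standing in for the real asset file.
        byte[] sample = "alpha\nbeta\r\ngamma\n".getBytes(StandardCharsets.UTF_8);

        long t0 = System.nanoTime();
        List<String> a = readBuffered(new ByteArrayInputStream(sample), 300000);
        long t1 = System.nanoTime();
        List<String> b = readAllThenSplit(new ByteArrayInputStream(sample), 300000);
        long t2 = System.nanoTime();

        System.out.println("readBuffered: " + a.size() + " items, " + (t1 - t0) / 1e6 + " ms");
        System.out.println("readAllThenSplit: " + b.size() + " items, " + (t2 - t1) / 1e6 + " ms");
    }
}
```

On Android the stream would come from context.getAssets().open("words.txt") rather than a ByteArrayInputStream. Which variant wins is something to measure on the target phone: the second one holds the raw bytes, the decoded String and the split array in memory at the same time, which may matter on a low-memory device.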
Thanks very much, you're always very clear and very helpful with your answers :) – kfcobrien, Jun 5, 2015 at 2:02