2
\$\begingroup\$

This is part of this topic.
Problem:
I need to read a file with BigIntegers and make some analysis "on-the-fly" with each read number (get prime numbers count, get armstrong numbers count). Right now I have this code, but it works very slow:

package ee.raintree.test.numbers;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.math.BigInteger;
import ee.raintree.test.numbers.utils.MathUtils;
public class FileAnalyzer {
 private static final char SEPARATOR = ' ';
 private File file;
 private int armstrongNumbersCount;
 private int primeNumbersCount;
 public FileAnalyzer(File file) {
 this.file = file;
 }
 public void analyze() throws FileNotFoundException, IOException {
 countNumbers();
 }
 public int getArmstrongNumbersCount() {
 return armstrongNumbersCount;
 }
 public int getPrimeNumbersCount() {
 return primeNumbersCount;
 }
 private void countNumbers() throws FileNotFoundException, IOException {
 StringBuilder numberSb = new StringBuilder();
 try (FileInputStream fis = new FileInputStream(file)) {
 char currentChar;
 while (fis.available() > 0) {
 currentChar = (char) fis.read();
 if (currentChar == SEPARATOR) {
 analyzeNumber(new BigInteger(numberSb.toString()));
 numberSb = new StringBuilder();
 continue;
 }
 numberSb.append(currentChar);
 }
 if (numberSb.length() > 0) {
 analyzeNumber(new BigInteger(numberSb.toString()));
 }
 }
 }
 private void analyzeNumber(BigInteger number) {
 if(MathUtils.isArmstrongNumber(number)) {
 armstrongNumbersCount++;
 }
 if(MathUtils.isPrime(number)) {
 primeNumbersCount++;
 }
 }
}

How can I speed up this process? As I was informed - it is lack of buffering.

πάντα ῥεῖ
5,1524 gold badges23 silver badges32 bronze badges
asked Jan 8, 2020 at 7:12
\$\endgroup\$

3 Answers 3

3
\$\begingroup\$

Replace

try (FileInputStream fis = new FileInputStream(file)) {

with

try (InputStream fis = new BufferedInputStream(new FileInputStream(file))) {

See also Why is using BufferedInputStream to read a file byte by byte faster than using FileInputStream?

answered Jan 8, 2020 at 7:17
\$\endgroup\$
1
  • \$\begingroup\$ In the second try with resources shouldn't the FileInputStream also created before buffered stream, so both can be closed? \$\endgroup\$ Commented Jan 8, 2020 at 16:35
3
\$\begingroup\$

Apart from the performce issue, you should not call available at all, as it makes your code unnecessarily complicated. Instead, read into an int variable until the value gets -1, just like everyone else is doing this. You then need to add a (char) type cast to the append call.

It's not necessary to create a new StringBuilder each time, you can just call sbNumber.setLength(0).

answered Jan 8, 2020 at 7:35
\$\endgroup\$
0
\$\begingroup\$

Some more general things to consider. Your class is called FileAnalyzer, but it advertises public properties like getPrimeNumbersCount, which I'd consider to be part of the analysis result (rather than the analyser itself). Do they really belong there?

There's a similar issue around the way that you're storing the file to be analysed in a field of the analyser. If a field isn't needed by at least two public member functions, I'd consider if it really needs to be a field or not.

FileNotFoundException extends IOException, so you don't really need to declare that a method throws both of them.

Does the analyzer really need to be tied to an input file, or could it actually take in an InputStream as the source to analyse? This approach makes the analyser more flexible (it could be used to process numbers from an API call, or could simply process numbers from a String for testing purposes), and pushes the decision about what type of stream to construct up to the caller, where there may be more context about the information's source.

Given those points, I'd expect the analyze function to have a signature more like:

NumericAnalysis analyze(InputStream streamToAnalyse) throws IOException

As it stands, your public analyze method immediately delegates out to a private countNumbers. You don't perform any signature adaption or anything else as part of this, so there doesn't really seem to be a reason for creating this extra layer of complexity. Why not just put the code straight into the analyze method?

There's no validity checking when constructing a BigInteger when a SEPERATOR is encountered. This might be OK in your scenario because you're confident about the source of the data, however it's good to consider what you would expect the behaviour to be if for example there were two spaces between a pair of numbers / if a file started with a space before the first number.

answered Jan 14, 2020 at 14:18
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.