\$\begingroup\$
\$\endgroup\$
2
I have a parser for CNF formulas in Dimacs format, which is very slow. Any suggestion on how to improve its speed? I did some profiling and I might have to replace Scanner
. Is there anything faster?
A possible input for the parser is:
c A sample .cnf file. p cnf 3 2 1 -3 0 2 3 -1 0
The code:
/**
* Parses a stream for a CNF instance.
*
* @param source input stream
* @return read skeleton
* @throws ParseException if stream contains an invalid instance
*/
private static Skeleton parseStream(final InputStream source)
throws IOException, ParseException {
Scanner scanner = new Scanner(source);
// Skip comments
try {
String token = scanner.next();
while (token.equals("c")) {
scanner.nextLine();
token = scanner.next();
}
if (!token.equals("p")) {
throw new ParseException(
"Excepted 'p', but '" + token + "' was found");
}
} catch (NoSuchElementException e) {
throw new ParseException("Header not found");
}
// Reads header
int numVariables, numClauses;
try {
String cnf = scanner.next();
if (!cnf.equals("cnf")) {
throw new ParseException(
"Expected 'cnf', but '" + cnf + "' was found");
}
numVariables = scanner.nextInt();
numClauses = scanner.nextInt();
} catch (NoSuchElementException e) {
throw new ParseException("Incomplete header");
}
logger.debug("p cnf " + numVariables + " " + numClauses);
// Reads clauses
Skeleton skeleton = new Skeleton(numVariables);
try {
while (numClauses > 0) {
int literal = scanner.nextInt();
skeleton.cnf.add(literal);
if (literal == 0) {
numClauses--;
}
}
} catch (NoSuchElementException e) {
throw new ParseException(
"Incomplete problem: " + numClauses + " clauses are missing");
}
return skeleton;
}
Jamal
35.2k13 gold badges134 silver badges238 bronze badges
asked Feb 15, 2011 at 0:04
-
\$\begingroup\$ I suggest you to use ANTLR to generate parsers. It also provides you with some useful operations. You can find all you need. When you don't want to just learn how to write a parser from scratch then you should not reinvent the wheel. \$\endgroup\$Amirh– Amirh2011年04月08日 07:51:11 +00:00Commented Apr 8, 2011 at 7:51
-
\$\begingroup\$ Are you sure it's the parsing that is the bottleneck here? Simply reading data from disk is an expensive operation, so, it could be just that. \$\endgroup\$user3008– user30082011年04月08日 19:26:57 +00:00Commented Apr 8, 2011 at 19:26
1 Answer 1
\$\begingroup\$
\$\endgroup\$
1
Use a BufferedInputStream
to speed up the disk access. If that's not enough, you can read the file line-by-line and use split()
to break it into individual numbers.
answered Feb 15, 2011 at 7:29
-
\$\begingroup\$ Same exact suggestion I was going to give \$\endgroup\$Joe Phillips– Joe Phillips2011年02月16日 21:20:55 +00:00Commented Feb 16, 2011 at 21:20
lang-java