Parser for CNF formulas

Question 1

I have a parser for CNF formulas in Dimacs format, which is very slow. Any suggestion on how to improve its speed? I did some profiling and I might have to replace Scanner. Is there anything faster?

A possible input for the parser is:

c A sample .cnf file.
p cnf 3 2
1 -3 0
2 3 -1 0

The code:

/**
 * Parses a stream for a CNF instance. 
 *
 * @param source input stream 
 * @return read skeleton
 * @throws ParseException if stream contains an invalid instance 
 */
private static Skeleton parseStream(final InputStream source) 
 throws IOException, ParseException { 
 Scanner scanner = new Scanner(source); 
 // Skip comments
 try { 
 String token = scanner.next(); 
 while (token.equals("c")) { 
 scanner.nextLine();
 token = scanner.next(); 
 }
 if (!token.equals("p")) {
 throw new ParseException( 
 "Excepted 'p', but '" + token + "' was found"); 
 }
 } catch (NoSuchElementException e) {
 throw new ParseException("Header not found"); 
 }
 // Reads header
 int numVariables, numClauses; 
 try {
 String cnf = scanner.next(); 
 if (!cnf.equals("cnf")) { 
 throw new ParseException( 
 "Expected 'cnf', but '" + cnf + "' was found"); 
 }
 numVariables = scanner.nextInt(); 
 numClauses = scanner.nextInt();
 } catch (NoSuchElementException e) {
 throw new ParseException("Incomplete header"); 
 }
 logger.debug("p cnf " + numVariables + " " + numClauses); 
 // Reads clauses
 Skeleton skeleton = new Skeleton(numVariables);
 try {
 while (numClauses > 0) {
 int literal = scanner.nextInt();
 skeleton.cnf.add(literal);
 if (literal == 0) {
 numClauses--;
 }
 }
 } catch (NoSuchElementException e) {
 throw new ParseException(
 "Incomplete problem: " + numClauses + " clauses are missing");
 }
 return skeleton; 
}

Question 2

I suggest you to use ANTLR to generate parsers. It also provides you with some useful operations. You can find all you need. When you don't want to just learn how to write a parser from scratch then you should not reinvent the wheel.

Question 3

Are you sure it's the parsing that is the bottleneck here? Simply reading data from disk is an expensive operation, so, it could be just that.

Question 4

Use a BufferedInputStream to speed up the disk access. If that's not enough, you can read the file line-by-line and use split() to break it into individual numbers.

Question 5

Same exact suggestion I was going to give

David Harkness David Harkness 8,8931 gold badge27 silver badges44 bronze badges · Accepted Answer · 2011-02-15 07:29:39Z

7

\$\begingroup\$

Use a BufferedInputStream to speed up the disk access. If that's not enough, you can read the file line-by-line and use split() to break it into individual numbers.

Share

answered Feb 15, 2011 at 7:29

David Harkness's user avatar

David Harkness David Harkness

8,8931 gold badge27 silver badges44 bronze badges

\$\endgroup\$

1

\$\begingroup\$ Same exact suggestion I was going to give \$\endgroup\$

Joe Phillips
– Joe Phillips

2011年02月16日 21:20:55 +00:00
Commented Feb 16, 2011 at 21:20

Add a comment |

Stack Exchange Network

Parser for CNF formulas

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Parser for CNF formulas

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions