Newbie here, presenting the code for close public scrutiny.
Couple remarks regarding the context (taken from Princeton's COS 226 course):
- In this problem I model a percolation system using an N-by-N grid of sites. Each site could be either open or blocked (all of them are blocked initially). To open the site, client should call open(row, col) method.
- There is a notion of full site. "A full site is an open site that can be connected to an open site in the top row via a chain of neighboring (left, right, up, down) open sites [1]"
- The system percolates, when there is a full site in the bottom row.
Below you can find an image of 2 percolation systems:
- Now the goal is to perform efficient percolates() queries (boolean method that checks whether system percolates or not), using the UnionFind algorithm (Brute-force solution will take quadratic time, which is not scalable for huge systems).
Additional materials that might be useful:
Quick video for visualizing the entire process:
https://www.youtube.com/watch?v=xUWuZjadbbQSource code for WeightedQuickUnionUF: http://algs4.cs.princeton.edu/15uf/WeightedQuickUnionUF.java.html
[1] Reference: https://www.cs.princeton.edu/courses/archive/fall15/cos226/assignments/percolation.html
Below goes my implementation of Percolation:
Percolation.java
package percolation;
import edu.princeton.cs.algs4.WeightedQuickUnionUF;
/**
* Modeling a percolation system using an N-by-N grid of sites.
* Each site is either open or blocked. A full site is an open site
* that can be connected to an open site in the top row via a chain of
* neighboring (left, right, up, down) open sites.
* The system percolates if there is a full site in the bottom row.
* In other words, a system percolates if we fill all open sites
* connected to the top row and that process fills some open site on the bottom row.
* <p>
* Application in practice: <a href="http://www.geoffreylandis.com/percolation.htp">The Fermi Paradox: An Approach Based on Percolation Theory</a> </p>
* <p>
* Code for Percolation Visualization can be found at: <a href="http://www.cs.princeton.edu/courses/archive/fall16/cos226/checklist/percolation.html">Princeton COS 226</a>
* </p>
*/
public class Percolation {
private final boolean[][] open;
private final boolean[][] full;
private int n;
private final int N;
private final WeightedQuickUnionUF uf;
private final int top; // virtual top-site
private final int bottom; // virtual bottom-site
/* Rep Invariant
* open, full, uf != null
* open.length == full.length
* n >= 0
* N > 0
* top != bottom
* Abstraction Function
* AF(N) represents a percolation system with N^2 sites in it (N-by-N grid).
* Safety Exposure Argument
* All fields are private. All fields except n are final.
* All fields except open, full and uf are immutable;
* however they are not being shared with clients.
*/
/**
* Creates N-by-N grid, with all sites initially blocked.
* @param N dimension of the grid. Should be greater than 0.
* @throws IllegalArgumentException if N <= 0
*/
public Percolation(int N) {
if (N <= 0) throw new IllegalArgumentException("N must be more than zero");
this.open = new boolean[N][N];
this.full = new boolean[N][N];
this.n = 0;
this.N = N;
this.uf = new WeightedQuickUnionUF(N*N + 2); // to allow top & bottom virtual sites
this.top = N*N; // top virtual site
this.bottom = N*N + 1; // bottom virtual site
for (int i = 0 ; i < N; i++) {
if (i != N * (N - 1) + i) { // avoids bug in 1x1 case
uf.union(top, i);
uf.union(bottom, N * (N - 1) + i);
}
}
checkRep(); // checking rep invariant, assertions should be enabled!
}
/**
* Opens the site (row, col) on the grid. This method has no effect
* if the site is already open.
* @param row row of the site to be opened
* @param col column of the site to be opened
* @throws IllegalArgumentException if row or col is < 0 or >= N
*/
public void open(int row, int col) {
if (row < 0 || row >= N) throw new IllegalArgumentException("Illegal value for row: must be between 0 and N - 1");
if (col < 0 || col >= N) throw new IllegalArgumentException("Illegal value for col: must be between 0 and N - 1");
if (!isOpen(row, col)) {
open[row][col] = true;
n++;
}
else return; // to avoid unnecessary computations
if (row == 0) {
checkAdjacent(full, open, row, col);
full[row][col] = true;
}
// filling in
if (row + 1 < N && isFull(row + 1, col)) checkAdjacent(full, open, row, col);
if (col + 1 < N && isFull(row, col + 1)) checkAdjacent(full, open, row, col);
if (col - 1 >= 0 && isFull(row, col - 1)) checkAdjacent(full, open, row, col);
if (row - 1 >= 0 && isFull(row - 1, col)) checkAdjacent(full, open, row, col);
// merging
if (row + 1 < N && isOpen(row + 1, col)) uf.union(row * N + col, (row + 1) * N + col);
if (col + 1 < N && isOpen(row, col + 1)) uf.union(row * N + col, row * N + col + 1);
if (col - 1 >= 0 && isOpen(row, col - 1)) uf.union(row * N + col, row * N + col - 1);
if (row - 1 >= 0 && isOpen(row - 1, col)) uf.union(row * N + col, (row - 1) * N + col);
if (N == 1) { // for 1x1 case
uf.union(bottom, 0);
uf.union(top, 0);
}
checkRep(); // checking rep invariant, assertions should be enabled!
}
// Recursively checks adjacent sites to be filled
private void checkAdjacent(boolean[][] full, boolean[][] open, int row, int col) {
if (row < 0 || row >= N) return;
if (col < 0 || col >= N) return;
if (!isOpen(row, col)) return;
if (isFull(row, col)) return;
full[row][col] = true;
checkAdjacent(full, open, row + 1, col);
checkAdjacent(full, open, row, col + 1);
checkAdjacent(full, open, row, col - 1);
checkAdjacent(full, open, row - 1, col);
}
/**
* Checks whether the site (row, col) is open.
* @param row row of the site
* @param col column of the site
* @return true iff, the site is open
* @throws IllegalArgumentException if row or col is < 0 or >= N
*/
public boolean isOpen(int row, int col) {
if (row < 0 || row >= N) throw new IllegalArgumentException("Illegal value for row: must be between 0 and N - 1");
if (col < 0 || col >= N) throw new IllegalArgumentException("Illegal value for col: must be between 0 and N - 1");
return open[row][col];
}
/**
* Checks whether the site (row, col) is full.
* @param row row of the site
* @param col column of the site
* @return true iff, the site is full
* @throws IllegalArgumentException if row or col is < 0 or >= N
*/
public boolean isFull(int row, int col) {
if (row < 0 || row >= N) throw new IllegalArgumentException("Illegal value for row: must be between 0 and N - 1");
if (col < 0 || col >= N) throw new IllegalArgumentException("Illegal value for col: must be between 0 and N - 1");
return full[row][col];
}
/**
* @return the number of opened sites.
*/
public int numberOfOpenSites() {
return n;
}
/**
* Checks whether system percolates or not. This method takes constant time.
* @return true iff the system percolates
* (i.e possible to reach bottom row from top row,
* with the help of opened sites).
*/
public boolean percolates() {
return uf.connected(top, bottom);
}
// Note to self: checkRep ought to be called at every operation
// that creates or mutates rep
/**
* Checks whether the rep invariant is being preserved.
* Assertions should be enabled.
*/
private void checkRep() {
assert open != null;
assert full != null;
assert uf != null;
assert open.length == full.length;
assert n >= 0;
assert N > 0;
assert top != bottom;
}
}
And below goes the implementation of JUnit test case for Percolation
PercolationTest.java
package percolation;
import static org.junit.Assert.*;
import org.junit.Test;
public class PercolationTest {
// check if assertions are on
@Test(expected = AssertionError.class)
public void testAssertionsEnabled() {
assert false;
}
@Test
public void emptyPercolationTest() {
Percolation perc = new Percolation(5);
for (int i = 0; i < 5; i++) {
for (int j = 0; j < 5; j++) {
assertFalse(perc.isOpen(i, j));
assertFalse(perc.isFull(i, j));
}
}
assertFalse(perc.percolates());
assertEquals(0, perc.numberOfOpenSites());
}
@Test
public void tinyPercolationTest() {
Percolation perc = new Percolation(1);
assertFalse(perc.isOpen(0, 0));
assertFalse(perc.isFull(0, 0));
assertFalse(perc.percolates());
assertEquals(0, perc.numberOfOpenSites());
perc.open(0, 0);
assertTrue(perc.isOpen(0, 0));
assertTrue(perc.isFull(0, 0));
assertTrue(perc.percolates());
assertEquals(1, perc.numberOfOpenSites());
}
@Test
public void basicPercolationTest() {
Percolation perc = new Percolation(3);
perc.open(0, 0);
assertTrue(perc.isOpen(0, 0));
assertTrue(perc.isFull(0, 0));
assertFalse(perc.percolates());
perc.open(1, 0);
assertTrue(perc.isOpen(1, 0));
assertTrue(perc.isFull(1, 0));
assertFalse(perc.percolates());
perc.open(1, 1);
assertTrue(perc.isOpen(1, 1));
assertTrue(perc.isFull(1, 1));
assertFalse(perc.percolates());
perc.open(2, 2);
assertTrue(perc.isOpen(2, 2));
assertFalse(perc.isFull(2, 2));
assertFalse(perc.percolates());
perc.open(2, 0);
assertTrue(perc.isOpen(2, 0));
assertTrue(perc.isFull(2, 0));
assertTrue(perc.percolates());
assertEquals(5, perc.numberOfOpenSites());
}
// common problem
@Test
public void backwashPercolationTest() {
Percolation perc = new Percolation(4);
perc.open(0, 0);
perc.open(1, 0);
perc.open(2, 0);
// sites not intended to be full, added before model percolates
perc.open(2, 2);
perc.open(3, 2);
perc.open(3, 0);
assertTrue(perc.percolates());
assertFalse(perc.isFull(2, 2));
assertFalse(perc.isFull(3, 2));
// sites not intended to be full, added after model percolates
perc.open(1, 2);
perc.open(1, 3);
perc.open(3, 3);
assertFalse(perc.isFull(1, 2));
assertFalse(perc.isFull(1, 3));
assertFalse(perc.isFull(3, 3));
}
@Test
public void sameNumberOfOpenSitesTest() {
Percolation perc = new Percolation(3);
perc.open(0, 0);
perc.open(0, 0);
assertEquals(1, perc.numberOfOpenSites());
}
}
Any constructive feedback is very much welcome!
1 Answer 1
I won't go into the logic, just the format and basic Java & OOP issues.
First, good defensive practices, throwing exceptions when input is illegal. Good reflex, but don't overdo it.
Let's look at the class definition:
public class Percolation {
private final boolean[][] open;
private final boolean[][] full;
private int n;
private final int N;
- Having a
open
and afull
variable is:- confusing: It opens the possibility of having a cell be full, but NOT open. You don't have 4 possibilites, only 3, and your code should reflect that.
- error-prone: You might mistakenly access
open[i-1, j]
and thenfull[i, j-1]
and not see the error. - Wastefull: You will make two aray accesses every time
This is an ideal use-case for an enum:
public class Percolation {
private static enum Cell {CLOSED, EMPTY, FULL}
private final Cell[][] field;
- Next bit of class definition is this:
private int n;
private final int N;
The two "n" definitions are highly confusing. "n" is accepted naming for a size variable or a counter, but if you use two, you defeat the consensus. Make it clear the non-final one is a counter:
private int count;
private final int N;
- Finally, for the naming:
private final WeightedQuickUnionUF uf;
"uf" is a badly choosen name. It doesn't take much to make it unionFind
. Keystrokes don't count, readability matters!
Now let's look at the open
method.
It should at least return at least a success result! To make sense to the user, and allow the reader to understand the 'why' without even needing to comment on stuff:
public boolean open(int row, int col) {
Now this bit of code isn't very idiomatic:
if (!isOpen(row, col)) {
open[row][col] = true;
n++;
}
else return; // to avoid unnecessary computations
It should be (without needing to explain in comments):
if (isOpen(row, col)) {
return false;
}
open[row][col] = true;
n++;
Make life easy when throwing exceptions (and use braces) by repeating the input: the caller might not be conscious he/she sent the wrong value:
if (row < 0 || row >= N) {
throw new IllegalArgumentException("Illegal value for row: must be between 0 and " + (N - 1) + ", but was " + row);
}
The checkAdjacent
method doesn't need all this parameter passing, because it is a non-static method it already has the full
and open
fields accessible to it:
private void checkAdjacent(int row, int col) {
Also it is badly named for a method with side-effects. Consider naming it propagate
because it changes the full
status as it goes!
Too much checking going on
isOpen
and isFull
already check range. So don't check it anew before calling it:
if (row + 1 < N && isFull(row + 1, col)) checkAdjacent(full, open, row, col);
Becomes:
if (isFull(row, col + 1)) {
checkAdjacent(full, open, row, col);
}
BUT you'd need to let that method return false
when out of bounds isntead of raising the exception. Then make it private and make it be isEmptyOrOOB
. I strongly encourage you to do something similar.
Also the whole 4-way repetition is calling the same method passing the same parameters, so feels like duplicate (well, four-plicate):
if (isFull(row + 1, col)) checkAdjacent(full, open, row, col);
if (isFull(row, col + 1)) checkAdjacent(full, open, row, col);
if (isFull(row, col - 1)) checkAdjacent(full, open, row, col);
if (isFull(row - 1, col)) checkAdjacent(full, open, row, col);
Could probably be replaced by:
if (isFull(row + 1, col) || isFull(row, col + 1) || isFull(row, col - 1)) || isFull(row - 1, col)){
checkAdjacent(full, open, row, col);
}
I have no idea what uf.union
does (didn't read external link) but it might benefit frm the same treatment. At least the multi bound-checking before it must go.
The getter function numberOfOpenSites()
should be named getNumberOfOpenSites()
CheckRep
is very wrong:
private void checkRep() {
assert open != null;
assert full != null;
assert uf != null;
assert open.length == full.length;
assert n >= 0;
assert N > 0;
assert top != bottom;
}
Assertions must be put in place, not grouped in a method. When they will be disabled (no one bothers enabling it) they cost nothing, but the empty method call becomes useless.
All assertions are useless: your open
, full
, N
and uf
variables are final and can never be null
(N can never be negative). In general no need to check final variables non-nullity for invariance.
I don't understand what top
and bottom
stand for, but they are final as well, and can never become equal.
In general, invariance checking is not checking that stuff that are final ever change, it is that some construct emerging from changing variables, still holds. I don't immediately see any invariant worth checking.
I did not review the Unit tests.
-
\$\begingroup\$ Thanks a lot for your insights, I have revised my code. Few remarks:
isEmpty()
andisFull()
should remain public, otherwise it would break several clients that depend on them. Do you think representingfull[][]
andopen[][]
as 2d byte array instead of enums worth the effort (readability vs memory consumption)? Thanks! \$\endgroup\$user118482– user1184822016年10月27日 19:26:14 +00:00Commented Oct 27, 2016 at 19:26 -
1\$\begingroup\$ Sure, keep
isFull
andisEmpty
in the public API, but make sure to use a better version internally. For performance, you already chose Java for extendability and ease of use over C. The same applied, the perf will be largely unchanged, but the added expressiveness will let you leverage some kickass External APIs much more easily. Often, chosing slower but more precise implementations usually lets you implement large scale optimisations that make up for it. Plus memory is dirt cheap these days. \$\endgroup\$MrBrushy– MrBrushy2016年10月27日 21:54:23 +00:00Commented Oct 27, 2016 at 21:54 -
\$\begingroup\$ Finally IIRC enums are implemented as int so it might just be a win-win \$\endgroup\$MrBrushy– MrBrushy2016年10月27日 21:56:07 +00:00Commented Oct 27, 2016 at 21:56
connected()
and hencepercolates()
take time proportional to log2 N + log2 M where N and M are the number of connected nodes totop
andbottom
nodes \$\endgroup\$