So far I have Java code that will:
- take in two .csv files
- store each line as a string in a string array
- split it into a 2D string array where a comma would act as the separator
- sum the number found in a column with all the other numbers that share the same number in the column next to it, and count how many times it does this
- sort the array based on whichever has the highest average, and then by most number of occurrences
- saves it to a file
The code below works, but it is very long, and not very easy to change without breaking it/losing a lot of code. How can I make this code simpler and more modular?
/**
* The below program takes in two files, one containing movies and information on them, the other containing a list of
* user ratings for the movies It then sorts the movies by the average rating for each one, then by number of ratings It
* will then save it to a file
*
* movies [movie#][movieName][etc...][][][][]... ratings [user][movie#][stars]
*
* unsorted[sumOfStars][#OfRatings] average [averageRating][movie#] sorted [averageRating][#OfRatings][movieName]
*
* @version 1.0 21/11/14
*/
public class movieAverage {
public static void main(String args[]) {
FileIO reader = new FileIO();
String[] rawMovies = reader.load("C://Users/James/Desktop/Movies.csv");
String[] rawRatings = reader.load("C://Users/James/Desktop/Ratings.csv");
String delim = ",(?=([^\"]*\"[^\"]*\")*[^\"]*$)";
int moviesRows = rawMovies.length;
int ratingRows = rawRatings.length;
int moviesColumns = rawMovies[0].split(delim).length;
int ratingsColumns = rawRatings[0].split(delim).length;
//saving the file to a 2d array
String[][] movies = new String[moviesRows][moviesColumns];
String[][] ratings = new String[ratingRows][ratingsColumns];
for(int i = 0; i < moviesRows; i++) {
String[] tokens = rawMovies[i].split(delim);
for(int j = 0; j < moviesColumns; j++) {
movies[i][j] = tokens[j];
}
}
for(int i = 0; i < ratingRows; i++) {
String[] tokens = rawRatings[i].split(",");
for(int j = 0; j < ratingsColumns; j++) {
ratings[i][j] = tokens[j];
}
}
//averaging the ratings
int[][] unsorted = new int[moviesRows][2]; // sum, size(amount of ratings)
// for(int i = 0; i < moviesRows; i++) {
// for(int j = 0; j < unsorted[0].length; j++) {
// unsorted[i][j] = 0;
// }
// }
int x = 0;
String s = "";
for(int i = 0; i < ratings.length; i++) {
x = Integer.parseInt(ratings[i][1]) - 1;
s = ratings[i][2];
String[] tok = s.split("[\\r\\n]+");
unsorted[x][0] += Integer.parseInt(tok[0]);
unsorted[x][1]++;
}
double[][] average = new double[unsorted.length][2]; // average, film#
for(int i = 0; i < unsorted.length; i++) {
if(unsorted[i][1] == 0) average[i][0] = 0.0;
else average[i][0] = (double)(unsorted[i][0]) / (unsorted[i][1]);
average[i][1] = i;
}
// selection sort to sort by averages
double TEMP = 0.0;
for(int i = 0; i < average.length - 1; i++) {
int max = i;
for(int j = i + 1; j < average.length; j++) {
if(average[j][0] > average[max][0]) max = j;
}
TEMP = average[i][0];
average[i][0] = average[max][0];
average[max][0] = TEMP;
TEMP = average[i][1];
average[i][1] = average[max][1];
average[max][1] = TEMP;
}
// bubble sort to sort by number of ratings with averages still in order
for(int i = average.length - 1; i >= 0; i--) {
for(int j = 0; j <= i - 1; j++) {
if(average[j][0] == average[j + 1][0] && unsorted[(int)average[j][1]][1] < unsorted[(int)average[j + 1][1]][1]) {
TEMP = average[j][1];
average[j][1] = average[j + 1][1];
average[j + 1][1] = TEMP;
}
}
}
String[][] sorted = new String[moviesRows][3];
for(int i = 0; i < moviesRows; i++) {
sorted[i][0] = String.valueOf(average[i][0]); // average rating
sorted[i][1] = String.valueOf(unsorted[(int)average[i][1]][1]); // number of ratings
sorted[i][2] = movies[(int)average[i][1]][1]; // movie name
}
try {
reader.save("C://Users/James/Desktop/sortedMovies.csv", sorted);
System.out.println("SAVED");
} catch(Exception e) {
System.out.println(e.getClass());
}
}
}
And the fileIO class is here
import java.io.*;
public class FileIO {
public String[] load(String file) {
File aFile = new File(file);
StringBuffer contents = new StringBuffer();
BufferedReader input = null;
try {
input = new BufferedReader(new FileReader(aFile));
String line = null;
while ((line = input.readLine()) != null) {
contents.append(line);
contents.append(System.getProperty("line.separator"));
}
} catch (FileNotFoundException ex) {
System.out
.println("Can't find the file - are you sure the file is in this location: "
+ file);
ex.printStackTrace();
} catch (IOException ex) {
System.out.println("Input output exception while processing file");
ex.printStackTrace();
} finally {
try {
if (input != null) {
input.close();
}
} catch (IOException ex) {
System.out
.println("Input output exception while processing file");
ex.printStackTrace();
}
}
String[] array = contents.toString().split("\n");
return array;
}
public void save(String file, String[] array) throws FileNotFoundException,
IOException {
File aFile = new File(file);
Writer output = null;
try {
output = new BufferedWriter(new FileWriter(aFile));
for (int i = 0; i < array.length; i++) {
output.write(array[i]);
output.write(System.getProperty("line.separator"));
}
} finally {
if (output != null)
output.close();
}
}
public void save(String file, String[][] array)
throws FileNotFoundException, IOException {
File aFile = new File(file);
Writer output = null;
try {
output = new BufferedWriter(new FileWriter(aFile));
for (int i = 0; i < array.length; i++) {
for (int j = 0; j < array[0].length - 1; j++) {
output.write(array[i][j] + ",");
}
output.write(array[i][array[0].length - 1]);
output.write(System.getProperty("line.separator"));
}
} finally {
if (output != null)
output.close();
}
}
}
1 Answer 1
There are a lot of things to improve here. I will focus on the FileIO
class.
The load
method
The load
method returns String[]
.
I suggest to change that to List<String>
.
Arrays are troublesome in many ways.
Unless you have a specific need for arrays,
it's better to use lists.
Unless you really need a thread-safe solution (rare),
StringBuilder
is recommended instead of StringBuffer
.
I would rename the String file
parameter to path
.
That's a bit more clear,
and then you can use the more natural File file
instead of File aFile
.
Reading the file line by line,
at each step appending to a buffer and appending System.getProperty("line.separator")
,
and then splitting the whole thing by \n
is wasteful and not natural.
This is where using a List<String>
helps you as I suggested in the first step:
you can just append to it line by line,
and you'll have something ready to use without splitting.
Applying the suggestions above,
the load
method becomes a bit simpler:
public List<String> load(String path) {
File file = new File(path);
List<String> lines = new ArrayList<>();
BufferedReader input = null;
try {
input = new BufferedReader(new FileReader(file));
String line;
while ((line = input.readLine()) != null) {
lines.add(line);
}
} catch (FileNotFoundException ex) {
System.out.println("Can't find the file - are you sure the file is in this location: " + path);
ex.printStackTrace();
} catch (IOException ex) {
System.out.println("Input output exception while processing file");
ex.printStackTrace();
} finally {
try {
if (input != null) {
input.close();
}
} catch (IOException ex) {
System.out.println("Input output exception while processing file");
ex.printStackTrace();
}
}
return lines;
}
The save
method
I recommend the following changes:
- Use
List<String>
instead of array as inload
- Rename
file
andaFile
as inload
- Use the modern for-each style iteration instead of
for (;;)
- Cache the value of
System.getProperty("line.separator")
before the for loop, to avoid unnecessary multiple lookups - Use braces in
if (output != null) output.close();
like you did everywhere else