In addition to @Christian Hujer @Christian Hujer's answer...
As briefly touched on by @Christian Hujer @Christian Hujer, we can improve on the summarize()
method by having domain classes, instead of dealing with plain String
s. For example, with a Product
class and some suitable methods, the method can be expressed in an even clearer manner:
In addition to @Christian Hujer's answer...
As briefly touched on by @Christian Hujer, we can improve on the summarize()
method by having domain classes, instead of dealing with plain String
s. For example, with a Product
class and some suitable methods, the method can be expressed in an even clearer manner:
In addition to @Christian Hujer's answer...
As briefly touched on by @Christian Hujer, we can improve on the summarize()
method by having domain classes, instead of dealing with plain String
s. For example, with a Product
class and some suitable methods, the method can be expressed in an even clearer manner:
private static Map<String, DoubleSummaryStatistics> summarize(Collection<String> lines) {
return lines.stream()
.map(lrow -> lrow.split(","))
.collect(Collectors.groupingBy(ldata -> l[1]data[1],
Collectors.summarizingDouble(ldata -> Double.valueOf(l[2]data[2]))));
}
- For each
Stream<String>
element (i.e. row of CSV data),map()
into aString[]
array by splitting on","
. This is assuming your CSV data is clean enough to be parsed in the simplest way possible, else you might want to consider using a CSV library to do this for you. - Collect the results
groupingBy()
our category column (l[1]data[1]
) and then summarize on our price column (l[2]data[2]
).
private static Map<String, DoubleSummaryStatistics> summarize(
Collection<Product> products) {
return products.stream()
.collect(Collectors.groupingBy(Product::getCategory,
Collectors.summarizingDouble(Product::getPrice)));
}
Putting it altogether
As commented by @Christian Hujer , once you get to stream processing on the file directly, a possible solution can be just:
public static void main(String[] args) throws IOException {
try (Stream<String> lines = Files.lines(pathToCsvFile)) {
lines.skip(1) // to skip the header row
.map(ProductSummarizer::parse) // String -> Product function
.collect(Collectors.groupingBy(Product::getCategory,
Collectors.summarizingDouble(Product::getPrice)))
.forEach(ProductSummarizer::print);
}
}
Here, we use Files.lines(Path)
to get our Stream
, and we can inline the processing within summarize()
too.
private static Map<String, DoubleSummaryStatistics> summarize(Collection<String> lines) {
return lines.stream()
.map(l -> l.split(","))
.collect(Collectors.groupingBy(l -> l[1],
Collectors.summarizingDouble(l -> Double.valueOf(l[2]))));
}
- For each
Stream<String>
element (i.e. row of CSV data),map()
into aString[]
array by splitting on","
. This is assuming your CSV data is clean enough to be parsed in the simplest way possible, else you might want to consider using a CSV library to do this for you. - Collect the results
groupingBy()
our category column (l[1]
) and then summarize on our price column (l[2]
).
private static Map<String, DoubleSummaryStatistics> summarize(
Collection<Product> products) {
return products.stream()
.collect(Collectors.groupingBy(Product::getCategory,
Collectors.summarizingDouble(Product::getPrice)));
}
private static Map<String, DoubleSummaryStatistics> summarize(Collection<String> lines) {
return lines.stream()
.map(row -> row.split(","))
.collect(Collectors.groupingBy(data -> data[1],
Collectors.summarizingDouble(data -> Double.valueOf(data[2]))));
}
- For each
Stream<String>
element (i.e. row of CSV data),map()
into aString[]
array by splitting on","
. This is assuming your CSV data is clean enough to be parsed in the simplest way possible, else you might want to consider using a CSV library to do this for you. - Collect the results
groupingBy()
our category column (data[1]
) and then summarize on our price column (data[2]
).
private static Map<String, DoubleSummaryStatistics> summarize(
Collection<Product> products) {
return products.stream()
.collect(Collectors.groupingBy(Product::getCategory,
Collectors.summarizingDouble(Product::getPrice)));
}
Putting it altogether
As commented by @Christian Hujer , once you get to stream processing on the file directly, a possible solution can be just:
public static void main(String[] args) throws IOException {
try (Stream<String> lines = Files.lines(pathToCsvFile)) {
lines.skip(1) // to skip the header row
.map(ProductSummarizer::parse) // String -> Product function
.collect(Collectors.groupingBy(Product::getCategory,
Collectors.summarizingDouble(Product::getPrice)))
.forEach(ProductSummarizer::print);
}
}
Here, we use Files.lines(Path)
to get our Stream
, and we can inline the processing within summarize()
too.
In addition to @Christian Hujer's answer...
Relying on return
values than 'result' objects
Map<String, List<Double>> db = new HashMap<>();
FileUtils.ReadFile("./Files/products.csv", db);
// DisplayUtils over DisplayUtil for consistency and convention
DisplayUtils.DisplayProducts(db);
This can be simplified to the following if you rely on FileUtils.readFile()
returning the output, instead of putting them into db
:
DisplayUtils.displayProducts(FileUtils.readFile("./Files/products.csv"));
StringBuilder
usage
When StringBuilder
s are used in looping and collecting String
outputs, only one instance is created. For your usage, you are creating one instance per iteration just to print the results at the end of it, so that is slightly self-defeating. It will be easier, and likely clearer too, to just print them as such:
for (Map.Entry<String,List<Double>> entry : db.entrySet()) {
System.out.println(entry.getKey());
System.out.println("----------------------------------");
System.out.println("average_price = " +
AggregateUtil.getTheAveragePriceForEachCategory(entry.getValue()));
System.out.println("total_sales_value = " +
AggregateUtil.getTotalSalesValueForEachCategory(entry.getValue()));
System.out.println("number_of_products = " +
AggregateUtil.getNumberOfProductsInEachCategory(entry.getValue()));
System.out.println("most_expensive = " +
AggregateUtil.getTheMostExpensiveProductInEachCategory(entry.getValue()));
}
Java 8
There's a much better way to process lines of CSV data, when you rely on Java 8 stream-based processing. The implementation is much simpler (you write less code), and you get your results in a single pass of the data, instead of having to loop per result:
private static Map<String, DoubleSummaryStatistics> summarize(Collection<String> lines) {
return lines.stream()
.map(l -> l.split(","))
.collect(Collectors.groupingBy(l -> l[1],
Collectors.summarizingDouble(l -> Double.valueOf(l[2]))));
}
- For each
Stream<String>
element (i.e. row of CSV data),map()
into aString[]
array by splitting on","
. This is assuming your CSV data is clean enough to be parsed in the simplest way possible, else you might want to consider using a CSV library to do this for you. - Collect the results
groupingBy()
our category column (l[1]
) and then summarize on our price column (l[2]
).
To print from the resulting Map<String, DoubleSummaryStatistics>
object, you just need a method accepting both String, DoubleSummaryStatistics
arguments, so that you can use it as a BiConsumer
in a forEach()
:
private static final Format FORMATTER = new DecimalFormat("#.##");
private static void printNumber(String description, double value) {
System.out.printf("%20s: %s%n", description, FORMATTER.format(value));
}
private static void print(String category, DoubleSummaryStatistics summary) {
System.out.println("Category: " + category);
printNumber("Average price", summary.getAverage());
printNumber("Total", summary.getSum());
printNumber("Number of products", summary.getCount());
printNumber("Most expensive", summary.getMax());
System.out.println("-----");
}
// assuming class is called ProductSummarizer
public static void main(String[] args) {
summarize(parseToCollection("/path/to/file")).forEach(ProductSummarizer::print);
}
Here, print(String, DoubleSummaryStatistics)
is used as a method reference, and a DecimalFormat
instance is used to format the numeric values.
More OOP
As briefly touched on by @Christian Hujer, we can improve on the summarize()
method by having domain classes, instead of dealing with plain String
s. For example, with a Product
class and some suitable methods, the method can be expressed in an even clearer manner:
private static Map<String, DoubleSummaryStatistics> summarize(
Collection<Product> products) {
return products.stream()
.collect(Collectors.groupingBy(Product::getCategory,
Collectors.summarizingDouble(Product::getPrice)));
}