I understand how to extract one kind of data from a file with Java 8 Streams. For example, if we need to get the loaded packages from a log file like this

2015-01-06 11:33:03 b.s.d.task [INFO] Emitting: eVentToRequestsBolt __ack_ack
2015-01-06 11:33:03 c.s.p.d.PackagesProvider [INFO] ===---> Loaded package com.foo.bar
2015-01-06 11:33:04 b.s.d.executor [INFO] Processing received message source: eventToManageBolt:2, stream: __ack_ack, id: {}, [-6722594615019711369 -1335723027906100557]
2015-01-06 11:33:04 c.s.p.d.PackagesProvider [INFO] ===---> Loaded package co.il.boo
2015-01-06 11:33:04 c.s.p.d.PackagesProvider [INFO] ===---> Loaded package dot.org.biz

we can do

List<String> packageList = Files.lines(Paths.get(args[1])).filter(line -> line.contains("===---> Loaded package"))
 .map(line -> line.split(" "))
 .map(arr -> arr[arr.length - 1]).collect(Collectors.toList());

I took (and slightly modified) the code from Parsing File Example.

But what if we also need to get all the dates (and times) of the Emitting: events from the same log file? How can we do this while working with the same Stream?

I can only imagine using collect(groupingBy(...)), which would group the lines with Loaded package and the lines with Emitting: before parsing, and then parsing each group (a map entry) separately. But that would create a map holding all the raw data from the log file, which consumes a lot of memory.

Is there a similar way to effectively extract multiple types of data from Java 8 Streams?

asked Jan 19, 2016 at 18:35
  • You can do this with peek(), however this is not generally recommended. In particular groupingBy will only produce a result once all the data has been processed. Commented Jan 19, 2016 at 18:42
  • Could you expand a bit? The data you posted does not contain "some" so I'm not sure I understand. Could you post a sample input / output of what you'd like? Commented Jan 19, 2016 at 20:52
  • @Tunaki Some is a wrong word. I corrected the question. I need to get all the Loaded packages from the file. They are in lines with arrows ===--->. Commented Jan 20, 2016 at 6:41

2 Answers

You can solve this in a more imperative style, without defining new collectors or using third-party libraries. First, define a class that represents the parsing result. It needs two methods: one that accepts an input line, and one that combines the result with another partial result:

import java.util.ArrayList;
import java.util.List;

class Data {
    List<String> packageDates = new ArrayList<>();
    List<String> emittingDates = new ArrayList<>();

    // Consume a single input line
    void accept(String line) {
        if (line.contains("===---> Loaded package"))
            // Date prefix of a "Loaded package" line
            packageDates.add(line.substring(0, "XXXX-XX-XX".length()));
        if (line.contains("Emitting"))
            // Date and time prefix of an "Emitting" line
            emittingDates.add(line.substring(0, "XXXX-XX-XX XX:XX:XX".length()));
    }

    // Combine two partial results (used when the stream is processed in parallel)
    void combine(Data other) {
        packageDates.addAll(other.packageDates);
        emittingDates.addAll(other.emittingDates);
    }
}

Now you can collect in quite straightforward way:

Data result = Files.lines(Paths.get(args[1]))
 .collect(Data::new, Data::accept, Data::combine);
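
For anyone who wants to try the three-argument collect without a log file, here is a minimal self-contained sketch that runs on an in-memory stream. The names LogData, LogDataDemo and parse are hypothetical and only for illustration. As an assumption about what the question is after, it stores the package names (not the date prefixes) for the "Loaded package" lines and the timestamps for the "Emitting:" lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical mutable result container for collect(supplier, accumulator, combiner).
class LogData {
    private static final String PACKAGE_MARKER = "===---> Loaded package ";
    private static final int TIMESTAMP_LENGTH = "XXXX-XX-XX XX:XX:XX".length();

    final List<String> packages = new ArrayList<>();
    final List<String> emittingTimestamps = new ArrayList<>();

    // Accumulator: inspect one line and store whatever it contributes.
    void accept(String line) {
        int idx = line.indexOf(PACKAGE_MARKER);
        if (idx >= 0) {
            // Assumption: the package name is everything after the marker.
            packages.add(line.substring(idx + PACKAGE_MARKER.length()).trim());
        } else if (line.contains("Emitting:") && line.length() >= TIMESTAMP_LENGTH) {
            emittingTimestamps.add(line.substring(0, TIMESTAMP_LENGTH));
        }
    }

    // Combiner: merge another partial result into this one (parallel streams only).
    void combine(LogData other) {
        packages.addAll(other.packages);
        emittingTimestamps.addAll(other.emittingTimestamps);
    }

    // Single pass over the lines; no intermediate map of raw lines is built.
    static LogData parse(Stream<String> lines) {
        return lines.collect(LogData::new, LogData::accept, LogData::combine);
    }
}

class LogDataDemo {
    public static void main(String[] args) {
        LogData data = LogData.parse(Stream.of(
            "2015-01-06 11:33:03 b.s.d.task [INFO] Emitting: eVentToRequestsBolt __ack_ack",
            "2015-01-06 11:33:03 c.s.p.d.PackagesProvider [INFO] ===---> Loaded package com.foo.bar",
            "2015-01-06 11:33:04 c.s.p.d.PackagesProvider [INFO] ===---> Loaded package co.il.boo"));
        System.out.println(data.packages);            // [com.foo.bar, co.il.boo]
        System.out.println(data.emittingTimestamps);  // [2015-01-06 11:33:03]
    }
}
```

With a real file, the stream returned by Files.lines holds an open file handle, so it is usually wrapped in try-with-resources before being passed to a method like parse.
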
answered Jan 20, 2016 at 3:56

You can use the pairing collector that I wrote in this answer, which is also available in my StreamEx library. For your specific problem you will also need a filtering collector, which is available in JDK 9 early access builds and in StreamEx as well. If you would rather not use a third-party library, you can copy it from this answer.

Also you will need to store everything into some data structure. I declared the Data class for this purpose:

class Data {
    List<String> packageDates;
    List<String> emittingDates;

    public Data(List<String> packageDates, List<String> emittingDates) {
        this.packageDates = packageDates;
        this.emittingDates = emittingDates;
    }
}

Putting everything together you can define a parsingCollector:

Collector<String, ?, List<String>> packageDatesCollector =
    filtering(line -> line.contains("===---> Loaded package"),
        mapping(line -> line.substring(0, "XXXX-XX-XX".length()), toList()));
Collector<String, ?, List<String>> emittingDatesCollector =
    filtering(line -> line.contains("Emitting"),
        mapping(line -> line.substring(0, "XXXX-XX-XX XX:XX:XX".length()), toList()));
Collector<String, ?, Data> parsingCollector = pairing(
    packageDatesCollector, emittingDatesCollector, Data::new);

And use it like this:

Data data = Files.lines(Paths.get(args[1])).collect(parsingCollector);
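
For readers without StreamEx at hand, here is a minimal, simplified sketch of what pairing and filtering collectors could look like on plain Java 8. It is not the StreamEx implementation: the names PairingSketch, Pair, ParsedLog and parse are hypothetical, collector characteristics are ignored, and the first list holds package names rather than date prefixes, purely as an illustration. On newer JDKs, Collectors.filtering (Java 9) and Collectors.teeing (Java 12) cover the same ground.

```java
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.BiFunction;
import java.util.function.Predicate;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical result type holding both extracted lists.
class ParsedLog {
    final List<String> packages;
    final List<String> emittingTimestamps;

    ParsedLog(List<String> packages, List<String> emittingTimestamps) {
        this.packages = packages;
        this.emittingTimestamps = emittingTimestamps;
    }
}

class PairingSketch {
    // Mutable holder for the two downstream accumulation containers.
    private static final class Pair<A, B> {
        A first;
        B second;

        Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }
    }

    // Forwards an element to the downstream collector only if it matches the predicate.
    static <T, A, R> Collector<T, A, R> filtering(
            Predicate<? super T> predicate, Collector<T, A, R> downstream) {
        BiConsumer<A, T> accumulator = downstream.accumulator();
        return Collector.of(
            downstream.supplier(),
            (a, t) -> { if (predicate.test(t)) accumulator.accept(a, t); },
            downstream.combiner(),
            downstream.finisher());
    }

    // Feeds every element to both collectors and merges their two results at the end.
    static <T, A1, A2, R1, R2, R> Collector<T, ?, R> pairing(
            Collector<T, A1, R1> c1,
            Collector<T, A2, R2> c2,
            BiFunction<? super R1, ? super R2, ? extends R> merger) {
        return Collector.<T, Pair<A1, A2>, R>of(
            () -> new Pair<>(c1.supplier().get(), c2.supplier().get()),
            (p, t) -> {
                c1.accumulator().accept(p.first, t);
                c2.accumulator().accept(p.second, t);
            },
            (p, q) -> {
                p.first = c1.combiner().apply(p.first, q.first);
                p.second = c2.combiner().apply(p.second, q.second);
                return p;
            },
            p -> merger.apply(c1.finisher().apply(p.first), c2.finisher().apply(p.second)));
    }

    // Extracts both kinds of data in a single pass over the lines.
    static ParsedLog parse(Stream<String> lines) {
        Collector<String, ?, List<String>> toPackageNames = Collectors.mapping(
            (String line) -> line.substring(line.lastIndexOf(' ') + 1), Collectors.toList());
        Collector<String, ?, List<String>> toTimestamps = Collectors.mapping(
            (String line) -> line.substring(0, "XXXX-XX-XX XX:XX:XX".length()), Collectors.toList());

        Collector<String, ?, List<String>> packages =
            filtering(line -> line.contains("===---> Loaded package"), toPackageNames);
        Collector<String, ?, List<String>> emitting =
            filtering(line -> line.contains("Emitting:"), toTimestamps);

        return lines.collect(pairing(packages, emitting, ParsedLog::new));
    }

    public static void main(String[] args) {
        ParsedLog result = parse(Stream.of(
            "2015-01-06 11:33:03 b.s.d.task [INFO] Emitting: eVentToRequestsBolt __ack_ack",
            "2015-01-06 11:33:03 c.s.p.d.PackagesProvider [INFO] ===---> Loaded package com.foo.bar"));
        System.out.println(result.packages);            // [com.foo.bar]
        System.out.println(result.emittingTimestamps);  // [2015-01-06 11:33:03]
    }
}
```

The design point is that each line is offered to both downstream collectors as it streams past, so nothing beyond the two result lists is retained.
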
answered Jan 20, 2016 at 3:49

1 Comment

That seems to be a more elegant solution if we have pairing and filtering already.
