Commit 1179c72

More detailed docs + fix of manual test description
1 parent e8ebcbc commit 1179c72

1 file changed

+35
-28
lines changed

‎h2o-gbm/readme.md

Lines changed: 35 additions & 28 deletions
@@ -2,76 +2,83 @@
22

33
General info in main [Readme](../readme.md)
44

5-
### Example 1 - Gradient Boosting with H2O.ai for Prediction of Flight Delays
5+
## Example 1 - Gradient Boosting with H2O.ai for Prediction of Flight Delays
66

7-
**Use Case**
7+
### Use Case
88

99
Gradient Boosting Method (GBM) to predict flight delays.
1010
An H2O-generated GBM Java model (POJO) is instantiated and used in a Kafka Streams application to do inference on new events.
1111

12-
**Machine Learning Technology**
12+
### Machine Learning Technology
1313

1414
* [H2O](https://www.h2o.ai)
1515
* Check the [H2O demo](https://github.com/h2oai/h2o-2/wiki/Hacking-Airline-DataSet-with-H2O) to understand the test and how the model was built
1616
* You can re-use the generated Java model attached to this project ([gbm_pojo_test.java](src/main/java/com/github/megachucky/kafka/streams/machinelearning/models/gbm_pojo_test.java)) or build your own model using R, Python, Flow UI or any other technology supported by the H2O framework.
1717

18-
**Source Code**
18+
### Source Code
1919

20+
Business Logic (applying the analytic model to do the prediction):
21+
[Kafka_Streams_MachineLearning_H2O_Application.java](src/main/java/com/github/megachucky/kafka/streams/machinelearning/Kafka_Streams_MachineLearning_H2O_Application.java)
22+
23+
Specification of the used model:
2024
[Kafka_Streams_MachineLearning_H2O_GBM_Example.java](src/main/java/com/github/megachucky/kafka/streams/machinelearning/Kafka_Streams_MachineLearning_H2O_GBM_Example.java)
21-
->Logic in [Kafka_Streams_MachineLearning_H2O_Application.java](src/main/java/com/github/megachucky/kafka/streams/machinelearning/Kafka_Streams_MachineLearning_H2O_Application.java)
2225

23-
**Unit Test**
26+
### Automated Tests
2427

28+
Unit Test using TopologyTestDriver:
2529
[Kafka_Streams_MachineLearning_H2O_GBM_ExampleTest.java](src/test/java/com/github/megachucky/kafka/streams/machinelearning/Kafka_Streams_MachineLearning_H2O_GBM_ExampleTest.java)
26-
[Kafka_Streams_MachineLearning_H2O_GBM_Example_IntegrationTest.java](src/test/java/com/github/megachucky/kafka/streams/machinelearning/test/Kafka_Streams_MachineLearning_H2O_GBM_Example_IntegrationTest.java)
2730

28-
**Manual Testing**
31+
Integration Test using EmbeddedKafkaCluster:
32+
[Kafka_Streams_MachineLearning_H2O_GBM_Example_IntegrationTest.java](src/test/java/com/github/megachucky/kafka/streams/machinelearning/Kafka_Streams_MachineLearning_H2O_GBM_Example_IntegrationTest.java)
33+
34+
### Manual Testing
2935

3036
You can easily test this yourself. Here are the steps:
31-
- Start Kafka, e.g. with Confluent CLI:
37+
38+
* Start Kafka, e.g. with Confluent CLI:
3239

3340
confluent start kafka
34-
- Create topics AirlineInputTopic and AirlineOutputTopic
41+
* Create topics AirlineInputTopic and AirlineOutputTopic
3542

3643
kafka-topics --zookeeper localhost:2181 --create --topic AirlineInputTopic --partitions 3 --replication-factor 1
3744

3845
kafka-topics --zookeeper localhost:2181 --create --topic AirlineOutputTopic --partitions 3 --replication-factor 1
39-
- Start the Kafka Streams app:
46+
* Start the Kafka Streams app:
4047

41-
java -cp target/h2o-gbm-CP51_AK21-jar-with-dependencies.jar com.github.megachucky.kafka.streams.machinelearning.Kafka_Streams_MachineLearning_H2O_GBM_Example
42-
- Send messages, e.g. with kafkacat:
48+
java -cp h2o-gbm/target/h2o-gbm-CP51_AK21-jar-with-dependencies.jar com.github.megachucky.kafka.streams.machinelearning.Kafka_Streams_MachineLearning_H2O_GBM_Example
49+
* Send messages, e.g. with kafkacat:
4350

4451
echo -e "1987,10,14,3,741,730,912,849,PS,1451,NA,91,79,NA,23,11,SAN,SFO,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA,YES,YES" | kafkacat -b localhost:9092 -P -t AirlineInputTopic
45-
- Consume predictions:
52+
* Consume predictions:
4653

4754
kafka-console-consumer --bootstrap-server localhost:9092 --topic AirlineOutputTopic --from-beginning
48-
- Find more details in the unit test...
55+
* Find more details in the unit test...
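
The message sent in the kafkacat step is a single comma-separated airline record. As a rough illustration of how such a record could be turned into named model inputs before scoring, here is a minimal sketch using only the Java standard library. The class name `AirlineRecordSketch`, the method `toFeatures`, and the chosen column names and positions are assumptions based on the public airline data set layout, not taken from this project's source; check the application class linked above for the fields it actually uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AirlineRecordSketch {

    // Maps selected positions of a comma-separated airline record to named features.
    // Column names and indices are assumptions for illustration only.
    static Map<String, String> toFeatures(String csvLine) {
        String[] values = csvLine.split(",");
        if (values.length < 18) {
            throw new IllegalArgumentException(
                "Expected at least 18 fields but got " + values.length);
        }
        Map<String, String> row = new LinkedHashMap<>();
        row.put("Year", values[0]);
        row.put("Month", values[1]);
        row.put("DayofMonth", values[2]);
        row.put("DayOfWeek", values[3]);
        row.put("CRSDepTime", values[5]);
        row.put("UniqueCarrier", values[8]);
        row.put("Origin", values[16]);
        row.put("Dest", values[17]);
        return row;
    }

    public static void main(String[] args) {
        // Same sample record as in the kafkacat command above.
        String line = "1987,10,14,3,741,730,912,849,PS,1451,NA,91,79,NA,23,11,"
            + "SAN,SFO,447,NA,NA,0,NA,0,NA,NA,NA,NA,NA,YES,YES";
        System.out.println(toFeatures(line));
    }
}
```

In a streams application, a map like this would typically be handed to the generated model for scoring, and the prediction written to the output topic.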
4956

50-
51-
**H2O Deep Learning instead of H2O GBM Model**
57+
## H2O Deep Learning instead of H2O GBM Model
5258

5359
The project includes another example with similar code to use an [H2O Deep Learning model](src/main/java/com/github/megachucky/kafka/streams/machinelearning/models/deeplearning_fe7c1f02_08ec_4070_b784_c2531147e451.java) instead of H2O GBM Model: [Kafka_Streams_MachineLearning_H2O_DeepLearning_Example_IntegrationTest.java](src/test/java/com/github/megachucky/kafka/streams/machinelearning/test/Kafka_Streams_MachineLearning_H2O_DeepLearning_Example_IntegrationTest.java)
5460
This shows how you can easily test or replace different analytic models for one use case, or even use them for A/B testing.
5561

56-
**Source Code**
62+
### Source Code
63+
64+
Business Logic (applying the analytic model to do the prediction):
65+
[Kafka_Streams_MachineLearning_H2O_Application.java](src/main/java/com/github/megachucky/kafka/streams/machinelearning/Kafka_Streams_MachineLearning_H2O_Application.java)
5766

67+
Specification of the used model:
5868
[Kafka_Streams_MachineLearning_H2O_DeepLearning_Example.java](src/main/java/com/github/megachucky/kafka/streams/machinelearning/Kafka_Streams_MachineLearning_H2O_DeepLearning_Example.java)
59-
->Logic in [Kafka_Streams_MachineLearning_H2O_Application.java](src/main/java/com/github/megachucky/kafka/streams/machinelearning/Kafka_Streams_MachineLearning_H2O_Application.java)
6069

61-
**Unit Test**
70+
### Unit Test
6271

72+
Unit Test using TopologyTestDriver:
6373
[Kafka_Streams_MachineLearning_H2O_DeepLearning_ExampleTest.java](src/test/java/com/github/megachucky/kafka/streams/machinelearning/Kafka_Streams_MachineLearning_H2O_DeepLearning_ExampleTest.java)
64-
[Kafka_Streams_MachineLearning_H2O_DeepLearning_Example_IntegrationTest.java](src/test/java/com/github/megachucky/kafka/streams/machinelearning/test/Kafka_Streams_MachineLearning_H2O_DeepLearning_Example_IntegrationTest.java)
6574

75+
Integration Test using EmbeddedKafkaCluster:
76+
[Kafka_Streams_MachineLearning_H2O_DeepLearning_Example_IntegrationTest.java](src/test/java/com/github/megachucky/kafka/streams/machinelearning/Kafka_Streams_MachineLearning_H2O_DeepLearning_Example_IntegrationTest.java)
6677

67-
**Manual Testing**
78+
### Manual Testing
6879

6980
Same as above, but start the app with a different class:
7081

71-
- Start the Kafka Streams app:
72-
73-
java -cp target/h2o-gbm-CP51_AK21-jar-with-dependencies.jar com.github.megachucky.kafka.streams.machinelearning.Kafka_Streams_MachineLearning_H2O_DeepLearning_Example
74-
75-
76-
82+
* Start the Kafka Streams app:
7783

84+
java -cp h2o-gbm/target/h2o-gbm-CP51_AK21-jar-with-dependencies.jar com.github.megachucky.kafka.streams.machinelearning.Kafka_Streams_MachineLearning_H2O_DeepLearning_Example
