README.md (39 additions & 19 deletions)
@@ -54,7 +54,6 @@ Original Creator of Apache Spark <br>

 -------

-
 ## [Github Chapter Solutions](./code/)

 * This GitHub repository will host all source code and scripts for
@@ -77,28 +76,49 @@ Original Creator of Apache Spark <br>

 ## Table of Contents

-| Chapter | Title |
-|-----------------------|------------------|
-| Glossary |[Glossary of big data, MapReduce, Spark](https://github.com/mahmoudparsian/big-data-mapreduce-course/blob/master/slides/glossary/glossary_of_big_data_and_mapreduce.md)|
-| Bonus <br> Chapters | <ul><li>[Word Count by RDD and DataFrame](./code/bonus_chapters/wordcount/python/README.md)</li><li>[Tutorials: RDDs and DataFrames](./code/bonus_chapters/)</li><li>[Top-N, UDF, Partitioning, TF-IDF, ...](./code/bonus_chapters/)</li><li>[Correlation, K-mers, anagrams, ...](./code/bonus_chapters/)</li><li>[Monoid: Design Principle](./wiki-spark/docs/monoid/README.md)</li></ul> |
-| Chapter 1 |[Introduction to Data Algorithms](./code/chap01/)|
-| Chapter 2 |[Transformations in Action](./code/chap02/)|
| Chapter 12 |[Feature Engineering in PySpark](./code/chap12/)|
+| Chapter | Title |
+|--------------|------------------|
+| Glossary | [Glossary of Big Data, MapReduce, Spark](https://github.com/mahmoudparsian/big-data-mapreduce-course/blob/master/slides/glossary/glossary_of_big_data_and_mapreduce.md)
+| Chapter 1 |[Introduction to Data Algorithms](./code/chap01/)|
+| Chapter 2 |[Transformations in Action](./code/chap02/)|
| Glossary |[Glossary of big data, MapReduce, Spark](https://github.com/mahmoudparsian/big-data-mapreduce-course/blob/master/slides/glossary/glossary_of_big_data_and_mapreduce.md)|
+| Word Count |[Solutions for Word Count using RDDs and DataFrames](./code/bonus_chapters/wordcount/)|
+| Anagrams |[Find words, which are anagrams](./code/bonus_chapters/anagrams/)|
+| Lambda Expressions |[Using Lambda Expressions in PySpark programs](./code/bonus_chapters/lambda_expressions/)|
+| TF-IDF |[Term Frequency - Inverse Document Frequency](./code/bonus_chapters/TF-IDF/)|
+| K-mers |[K-mers for DNA Sequences](./code/bonus_chapters/k-mers/)|
+| Correlation |[All vs. All Correlation](./code/bonus_chapters/correlation/)|
| UDF |[User-Defined Function Examples](./code/bonus_chapters/UDF/)|
+| DataFrames Transformations |[Examples on Creation and Transformation of DataFrames](./code/bonus_chapters/dataframes/)|
+| DataFrames Tutorials |[DataFrames Tutorials: from collections and CSV text files](./code/bonus_chapters/dataframes/)|
+| Join Operations |[Examples on join of RDDs and DataFrames](./code/bonus_chapters/join/)|
+| PySpark Tutorial 101 |[Examples on using PySpark RDDs and DataFrames](./code/bonus_chapters/pyspark_tutorial/)|
+| Physical Data Partitioning |[Tutorial of Physical Data Partitioning](./code/bonus_chapters/physical_partitioning/README.md)|
+| Monoids and Combiners |[Monoid as a Design Principle](https://github.com/mahmoudparsian/data-algorithms-with-spark/blob/master/wiki-spark/docs/monoid/README.md)|