Pattern Mining: The Online Course (BETA)
Philippe Fournier-Viger
Distinguished professor, Ph.D.
https://www.philippe-fournier-viger.com
Introduction
Welcome! Do you want to learn how to find hidden patterns in data that can help you understand it better and make better decisions?
If yes, then this free online course on pattern mining is for you!
Pattern mining is a subfield of data mining that aims to apply algorithms to discover interesting patterns in data. These patterns can be used to understand the data or to support decision-making and tasks such as prediction.
This course is designed to introduce students and researchers to the different topics of pattern mining, and to explain the key algorithms and concepts.
By taking this course, you will:
- Learn the theory and key algorithms in pattern mining from a leading researcher in the field and founder of the popular SPMF software.
- Explore various topics and applications of pattern mining, such as frequent itemset mining, sequential pattern mining, episode mining, and periodic pattern mining.
- Gain practical skills and experience by using the SPMF software to apply pattern mining techniques to real-world datasets.
- Access all resources for this course for free.
How to study?
This online course consists of multiple recorded lectures. After watching a lecture, you can do the corresponding exercises to test your knowledge.
In general, it is not necessary to watch all the content. You may skip some videos if you are not interested in some topics.
Because this is a beta version of the course, I will keep improving it with more content over time.
I hope you will enjoy the course. If you have any comments or suggestions, you may send me an e-mail or post a message in the data mining forum.
| Topic | Lectures | Exercises |
| 1 |
|
|
| 2 |
Frequent itemset mining and association rule mining
Lecture(s)
Some interesting papers
- Fournier-Viger, P., Lin, J. C.-W., Vo, B., Chi, T. T., Zhang, J., Le, H. B. (2017). A Survey of Itemset Mining. WIREs Data Mining and Knowledge Discovery, Wiley, e1207, DOI: 10.1002/widm.1207, 18 pages.
- Luna, J. M., Fournier-Viger, P., Ventura, S. (2019). Frequent Itemset Mining: a 25 Years Review. WIREs Data Mining and Knowledge Discovery, Wiley, 9(6):e1329. DOI: 10.1002/widm.1329
Some online tools
To test some of the concepts from this lecture, you may try some of these online tools:
- An online tool that gives the list of all itemsets that can be made from a set of items.
- An online tool that demonstrates the Apriori algorithm step by step.
- An online tool that demonstrates how a horizontal database is transformed into a vertical database.
- An online tool that demonstrates the Eclat algorithm step by step.
- An online tool that gives the list of all association rules that can be made from a set of items.
- An online tool to calculate the number of possible itemsets and association rules that can be made from a given number of items.
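To give a rough idea of what these tools compute, here is a minimal Python sketch. It is not the code of the online tools or of SPMF: the function names and the toy database are made up for illustration, and the rule count assumes that a rule X -> Y has a non-empty antecedent and consequent that share no items.

```python
from itertools import combinations

def all_itemsets(items):
    """List every non-empty itemset that can be made from `items`."""
    items = sorted(items)
    return [frozenset(c)
            for k in range(1, len(items) + 1)
            for c in combinations(items, k)]

def count_itemsets(n):
    """Number of non-empty itemsets over n items: 2^n - 1."""
    return 2 ** n - 1

def count_rules(n):
    """Number of rules X -> Y over n items, assuming X and Y are
    non-empty and disjoint: 3^n - 2^(n+1) + 1."""
    return 3 ** n - 2 ** (n + 1) + 1

def to_vertical(horizontal):
    """Turn a horizontal database (list of transactions) into a vertical
    one: each item is mapped to the set of transaction ids containing it."""
    vertical = {}
    for tid, transaction in enumerate(horizontal):
        for item in transaction:
            vertical.setdefault(item, set()).add(tid)
    return vertical

def support_from_tidsets(itemset, vertical):
    """Support of a non-empty itemset, computed as Eclat does: by
    intersecting the tid-sets of its items."""
    tidsets = [vertical.get(item, set()) for item in itemset]
    return len(set.intersection(*tidsets))

# Hypothetical toy database with four transactions (tids 0 to 3).
database = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c", "d"}]
vertical = to_vertical(database)

print(len(all_itemsets("abc")), count_itemsets(3))   # 7 7
print(count_rules(3))                                # 12
print(sorted(vertical["a"]))                         # [0, 1, 3]
print(support_from_tidsets({"a", "b"}, vertical))    # 2
```

The counts grow exponentially with the number of items, which is why algorithms such as Apriori and Eclat prune the search space instead of enumerating everything.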
|
|
| 3 |
Concise representations of patterns
Lecture(s)
Some online tool(s):
- An online tool that uses a brute-force approach to find all frequent itemsets, closed and maximal frequent itemsets in a dataset (inefficient, but useful for quick testing).
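The brute-force idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions (a hypothetical toy database, an absolute minimum support count, made-up function names), not the code of the online tool. A frequent itemset is closed if no proper superset has the same support, and maximal if no proper superset is frequent.

```python
from itertools import combinations

def frequent_itemsets(database, minsup):
    """Brute force: test every possible itemset against every transaction.
    Returns a dict {itemset: support} with support >= minsup."""
    items = sorted(set().union(*database))
    result = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            candidate = frozenset(candidate)
            sup = sum(1 for t in database if candidate <= t)
            if sup >= minsup:
                result[candidate] = sup
    return result

def closed_itemsets(frequent):
    """Keep the itemsets having no proper superset with the same support."""
    return {x: s for x, s in frequent.items()
            if not any(x < y and s == sy for y, sy in frequent.items())}

def maximal_itemsets(frequent):
    """Keep the itemsets having no frequent proper superset."""
    return {x: s for x, s in frequent.items()
            if not any(x < y for y in frequent)}

# Hypothetical toy database and threshold.
database = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c", "d"}]
frequent = frequent_itemsets(database, minsup=2)
closed = closed_itemsets(frequent)
maximal = maximal_itemsets(frequent)

print(len(frequent), len(closed), len(maximal))   # 7 4 1
```

On this toy database, 7 frequent itemsets are summarized by 4 closed itemsets ({c}, {a,c}, {b,c}, {a,b,c}) and a single maximal itemset ({a,b,c}).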
|
|
| 4 |
Rare Pattern Mining
Lecture(s)
|
|
| 5 |
Correlated and statistically significant patterns
Lecture(s)
Some online tool(s):
- An online tool that uses a brute-force approach to find all frequent itemsets in a transaction dataset and calculate their support, bond and all-confidence (inefficient, but useful for quick testing).
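As a small illustration of these measures, here is a Python sketch using the usual definitions: the bond of an itemset is its support divided by the number of transactions containing at least one of its items, and its all-confidence is its support divided by the largest support of any single item it contains. The toy database and function names are assumptions made for this example only.

```python
def conjunctive_support(itemset, database):
    """Number of transactions containing ALL items of the itemset."""
    return sum(1 for t in database if itemset <= t)

def disjunctive_support(itemset, database):
    """Number of transactions containing AT LEAST ONE item of the itemset."""
    return sum(1 for t in database if itemset & t)

def bond(itemset, database):
    """bond(X) = conjunctive support / disjunctive support."""
    dis = disjunctive_support(itemset, database)
    return conjunctive_support(itemset, database) / dis if dis else 0.0

def all_confidence(itemset, database):
    """all-confidence(X) = sup(X) / max support of a single item of X."""
    max_item_sup = max(conjunctive_support(frozenset([i]), database)
                       for i in itemset)
    if max_item_sup == 0:
        return 0.0
    return conjunctive_support(itemset, database) / max_item_sup

# Hypothetical toy database.
database = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c", "d"}]
x = frozenset({"a", "b"})

print(conjunctive_support(x, database))   # 2
print(bond(x, database))                  # 0.5
print(round(all_confidence(x, database), 3))   # 0.667
```

Both measures lie in the [0, 1] interval, and a higher value indicates that the items of the itemset tend to appear together more often.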
|
|
| 6 |
High Utility Itemset Mining
Lecture(s)
Some interesting paper(s)
- Fournier-Viger, P., Lin, J. C.-W., Truong, T., Nkambou, R. (2019). A survey of high utility itemset mining. In: Fournier-Viger et al. (eds). High-Utility Pattern Mining: Theory, Algorithms and Applications, Springer (to appear), p. 1-46.
- Fournier-Viger, P., Lin, J. C.-W., Vo, B., Nkambou, R., Tseng, V. S. (editors). (2019) High-Utility Pattern Mining: Theory, Algorithms and Applications, Springer.
- Truong, T., Fournier-Viger, P. (2019). A survey of high utility sequential pattern mining. In: Fournier-Viger et al. (eds). High-Utility Pattern Mining: Theory, Algorithms and Applications, Springer, p. 97-130.
Some online tool(s):
- An online tool that shows how to calculate the utility of an itemset.
- An online tool that shows how to calculate the utility-list of an itemset.
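The two calculations can be sketched in Python as follows. This is a simplified illustration, not the code of the online tools: the quantities, unit profits and function names are hypothetical, and the utility-list uses alphabetical order as the total order on items, whereas algorithms such as HUI-Miner typically order items by ascending transaction-weighted utilization.

```python
def itemset_utility(itemset, database, unit_profit):
    """Utility of an itemset: for each transaction containing all its items,
    add quantity * unit profit for every item of the itemset."""
    total = 0
    for transaction in database:
        if all(item in transaction for item in itemset):
            total += sum(transaction[i] * unit_profit[i] for i in itemset)
    return total

def utility_list(itemset, database, unit_profit, order):
    """Utility-list of an itemset: one (tid, iutil, rutil) entry for each
    transaction containing it, where iutil is the utility of the itemset in
    that transaction and rutil is the utility of the items that come after
    the itemset according to `order`."""
    rank = {item: pos for pos, item in enumerate(order)}
    last = max(rank[i] for i in itemset)
    entries = []
    for tid, transaction in enumerate(database):
        if all(item in transaction for item in itemset):
            iutil = sum(transaction[i] * unit_profit[i] for i in itemset)
            rutil = sum(qty * unit_profit[i]
                        for i, qty in transaction.items() if rank[i] > last)
            entries.append((tid, iutil, rutil))
    return entries

# Hypothetical data: each transaction maps an item to its purchase quantity.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 2}
database = [
    {"a": 1, "b": 5, "c": 1},
    {"a": 2, "c": 6},
    {"b": 4, "c": 3, "d": 3},
    {"a": 1, "b": 2, "c": 1, "d": 1},
]
order = ["a", "b", "c", "d"]  # assumed total order on items

print(itemset_utility({"a", "c"}, database, unit_profit))      # 28
print(utility_list({"a"}, database, unit_profit, order))
# [(0, 5, 11), (1, 10, 6), (3, 5, 7)]
print(utility_list({"a", "c"}, database, unit_profit, order))
# [(0, 6, 0), (1, 16, 0), (3, 6, 2)]
```

The sum of the iutil values of a utility-list gives the utility of the itemset (here 6 + 16 + 6 = 28 for {a, c}), while the sum of iutil and rutil values gives an upper bound that can be used to prune extensions of the itemset.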
|
|
| 7 |
Sequential pattern mining
Lecture(s)
Some interesting paper(s)
- Fournier-Viger, P., Lin, J. C.-W., Kiran, R. U., Koh, Y. S., Thomas, R. (2017). A Survey of Sequential Pattern Mining. Data Science and Pattern Recognition (DSPR), vol. 1(1), pp. 54-77.
- Tutorial: Using the SPMF software to discover frequent patterns in text documents
|
|
| 8 |
Sequential rule mining
Lecture(s)
|
|
| 9 |
Episode Mining
Lecture(s)
Some interesting paper(s)
|
|
| 10 |
Periodic pattern mining
Lecture(s)
|
|
| 11 |
Other topics
Lecture(s)
- Approximate pattern mining
- Frequent subgraph mining (pdf / ppt / video, 11 min)
- Interactive pattern mining
- Classification using patterns
- ....
|
- Questions about frequent subgraph mining
|
| 12 |
... |
... |
Software, source code and datasets
To try the different pattern mining algorithms discussed in this course, you can download the SPMF data mining software. SPMF is open-source software offering over 250 algorithms. It is implemented in Java, and there are also unofficial wrappers for some other languages. In addition, you can find several public datasets for trying the SPMF algorithms on the datasets page of SPMF.
More videos on pattern mining
If you want to see more videos on pattern mining, you may also check:
- The video page on the SPMF website: Pattern mining videos on the SPMF website
- My YouTube channel: https://www.youtube.com/@philfv
FAQ about this course
- How can I contact you if I find an error in the course?
- Send me an e-mail with your feedback and suggestions. I will try to fix the errors as soon as possible. You will be listed as a contributor on this webpage.
- Where can I get more information about these topics, and also ask questions?
- This webpage lists several resources, and you can find more pattern mining videos on my YouTube channel. You can also try the different algorithms discussed in this course by using the SPMF software, which is free and open-source. If you have questions, you can post them in the data mining forum. I check this forum every few days and will try to answer your questions.
- Can I use and modify your PowerPoint slides to teach a course at my university?
- Yes, I will be very happy about this! The goal of this free course is to share knowledge. But if you reuse my PowerPoint slides, I ask you to cite this website in your modified PPT and indicate that your slides are based on my content.
Bibliography
This course is based on content from the research articles mentioned in the PPTs and PDFs, and on some information from the following books:
- Fournier-Viger, P., Lin, J. C.-W., Vo, B., Nkambou, R., Tseng, V. S. (editors). (2019) High-Utility Pattern Mining: Theory, Algorithms and Applications, Springer.
- Han and Kamber (2011), Data Mining: Concepts and Techniques, 3rd edition, Morgan Kaufmann Publishers.
- Tan, Steinbach & Kumar (2006), Introduction to Data Mining, Pearson Education, ISBN-10: 0321321367.
- Aggarwal (2015), Data Mining: The Textbook.
- Zaki & Meira (2014), Data Mining and Analysis: Fundamental Concepts and Algorithms.
Contributors
Several people have given feedback, shared ideas, or reported errors related to this course:
- Chongsheng Zhang
- Wensheng Gan
- Tai Dinh
- ...