DataScience/GeneralMLPrep.md (27 additions, 0 deletions)
@@ -31,3 +31,30 @@ RNNs are commonly used in:
* Natural Language Processing: Tasks such as language modeling, text generation, and sentiment analysis.
* Speech Recognition: Processing audio signals to convert speech into text.
* Time Series Prediction: Forecasting stock prices or weather conditions based on historical data.

Decision Tree
==========
* A decision tree is a supervised ML algorithm used for classification and regression tasks.
* It models decisions and their possible consequences in a tree-like structure.
* Each internal node represents a test on a `feature`, each branch represents a `decision rule` (one result of that test), and each leaf (terminal) node represents the `outcome`.
* Entropy Calculation: Calculate the entropy of the target variable, both overall and within the subsets produced by each predictor attribute, to measure impurity.
* Information Gain: Determine the information gain for each attribute to identify which feature best splits the data.
* Node Selection: Choose the attribute with the highest information gain as the root node.
* Recursive Splitting: Repeat this process recursively for each branch until every branch ends in a pure leaf or a stopping criterion is met (e.g., maximum depth or minimum samples per leaf).
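The steps above can be sketched in plain Python. This is a minimal ID3-style illustration for categorical attributes only, not a production implementation. The function names (`entropy`, `information_gain`, `build_tree`), the nested-dict tree layout, the `max_depth` default, and the toy rows are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on `attribute`."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def build_tree(rows, attributes, target, depth=0, max_depth=3):
    """Recursively split on the attribute with the highest information gain."""
    labels = [row[target] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # Stopping criteria: pure node, no attributes left, or maximum depth reached.
    if len(set(labels)) == 1 or not attributes or depth >= max_depth:
        return majority
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in {row[best] for row in rows}:
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, remaining, target, depth + 1, max_depth)
    return {"attribute": best, "branches": branches}


# Hypothetical toy data: "outlook" separates the classes perfectly, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
tree = build_tree(rows, ["outlook", "windy"], "play")
print(tree)
```

On this toy data, `outlook` has the higher information gain, so it becomes the root, and each of its branches ends in a pure leaf. Practical libraries typically also handle numerical thresholds, other impurity measures such as Gini, and pruning, which this sketch leaves out.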

Advantages:
==========
* Easy to interpret and visualize.
* Requires little data preprocessing (no need for normalization).
* Can handle both numerical and categorical data.
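One way to see why normalization is not needed: a threshold split depends only on the ordering of feature values, so rescaling a feature leaves the best achievable information gain unchanged. The sketch below illustrates this under stated assumptions. The helper `best_threshold_gain` and the `ages`/`bought` data are hypothetical and exist only for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_threshold_gain(values, labels):
    """Highest information gain over all splits of the form `value <= threshold`."""
    base = entropy(labels)
    best = 0.0
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        best = max(best, base - remainder)
    return best


# Hypothetical numerical feature and binary target.
ages = [22, 25, 31, 40, 52, 60]
bought = ["no", "no", "yes", "yes", "yes", "no"]

# Min-max normalization keeps the ordering of the values, so the candidate
# partitions (and therefore the gains) are the same as for the raw feature.
scaled = [(a - min(ages)) / (max(ages) - min(ages)) for a in ages]

gain_raw = best_threshold_gain(ages, bought)
gain_scaled = best_threshold_gain(scaled, bought)
print(gain_raw == gain_scaled)
```

The same reasoning applies to any strictly increasing transformation of a feature, which is why tree-based models are usually insensitive to feature scaling.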

Disadvantages:
============
* Prone to overfitting, especially with deep trees.