|
135 | 135 | "collapsed": false
|
136 | 136 | },
|
137 | 137 | "outputs": [],
|
138 | | - "source": [ |
139 | | - "#sentiment analysis" |
140 | | - ] |
| 138 | + "source": [] |
141 | 139 | },
|
142 | 140 | {
|
143 | 141 | "cell_type": "code",
|
|
165 | 163 | "\n",
|
166 | 164 | "Consider a binary classification problem, where the task is to assign one of the two labels to a given input. We plot each data item as a point in n-dimensional space as follows:</p>\n",
|
167 | 165 | "\n",
|
168 | | - "" |
| 166 | + "" |
169 | 167 | ]
|
170 | 168 | },
|
171 | 169 | {
|
172 | 170 | "cell_type": "markdown",
|
173 | 171 | "metadata": {},
|
174 | 172 | "source": [
|
175 | 173 | "<p style=\"font-family:verdana; font-size:15px\">We can perform classification by finding the hyperplane that separates the two classes well. As you can see in the image above, many such hyperplanes can be drawn. How do we find the best one? We find the optimal hyperplane by maximizing the <b>margin</b>.</p>\n",
|
176 | | - "" |
| 174 | + "" |
177 | 175 | ]
|
178 | 176 | },
|
179 | 177 | {
|
|
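The margin-maximization idea in the cell above can be sketched in a few lines. This is a minimal illustration on made-up toy data, not part of the original notebook: it fits a hard-margin-like linear SVM with scikit-learn and computes the margin width 2/||w|| from the learned weight vector.

```python
# Illustrative sketch (toy data, hypothetical values): fit a linear SVM
# and compute the width of the margin it maximizes.
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters in 2-D
X = np.array([[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1e3)  # large C approximates a hard margin
clf.fit(X, y)

w = clf.coef_[0]
margin = 2.0 / np.linalg.norm(w)  # geometric width of the margin
print(margin)
```

The separating hyperplane is w·x + b = 0, and maximizing the margin is equivalent to minimizing ||w|| subject to every point being classified with functional margin at least 1.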
755 | 753 | "metadata": {},
|
756 | 754 | "source": [
|
757 | 755 | "<p style=\"font-family:verdana; font-size:15px\">So far, we have seen problems where the input data can be separated by a linear hyperplane. But what if the data points are not linearly separable, as shown below?</p>\n",
|
758 | | - "\n" |
| 756 | + "\n" |
759 | 757 | ]
|
760 | 758 | },
|
761 | 759 | {
|
762 | 760 | "cell_type": "markdown",
|
763 | 761 | "metadata": {},
|
764 | 762 | "source": [
|
765 | 763 | "<p style=\"font-family:verdana; font-size:15px\">To solve problems where the data cannot be separated linearly, we add a new feature. For example, let us add the new feature z = x<sup>2</sup> + y<sup>2</sup>. Now, if we plot the data points on the x and z axes, we get:</p>\n",
|
766 | | - "\n" |
| 764 | + "\n" |
767 | 765 | ]
|
768 | 766 | },
|
769 | 767 | {
|
|
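The feature-mapping step described in the cell above can be checked numerically. This is a small sketch on synthetic data (not from the notebook): points on an inner disc and an outer ring are not linearly separable in (x, y), but a single threshold on z = x² + y² separates them perfectly.

```python
# Sketch of the z = x^2 + y^2 mapping on synthetic circular data.
import numpy as np

rng = np.random.default_rng(0)
angles = rng.uniform(0, 2 * np.pi, 200)
radii = np.concatenate([rng.uniform(0, 1, 100),    # class 0: inner disc
                        rng.uniform(2, 3, 100)])   # class 1: outer ring
x = radii * np.cos(angles)
y_coord = radii * np.sin(angles)
labels = np.array([0] * 100 + [1] * 100)

z = x**2 + y_coord**2  # the new feature; equals radius squared

# In (x, z) space a single horizontal line now separates the classes.
pred = (z > 2.0).astype(int)
print((pred == labels).mean())  # 1.0 on this toy data
```

This is exactly what a kernel does implicitly: it computes inner products in such a higher-dimensional feature space without materializing the new coordinates.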
807 | 805 | },
|
808 | 806 | "source": [
|
809 | 807 | "<p style=\"font-family:verdana; font-size:15px\"> We can get different decision boundaries for different kernels and gamma values. Here is a screenshot from the scikit-learn website.</p>\n",
|
810 | | - "\n" |
| 808 | + "\n" |
811 | 809 | ]
|
812 | 810 | },
|
813 | 811 | {
|
|
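The effect of kernel and gamma described above can be seen directly from training scores. This is a hedged sketch on a standard toy dataset (not the scikit-learn figure the cell references): a very large gamma makes the RBF boundary wiggly enough to fit nearly every training point.

```python
# Sketch: kernel and gamma change how tightly an SVM fits the data.
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=42)

scores = {}
for kernel, gamma in [("linear", "scale"), ("rbf", 0.5), ("rbf", 50)]:
    clf = SVC(kernel=kernel, gamma=gamma).fit(X, y)
    scores[(kernel, gamma)] = clf.score(X, y)
    print(kernel, gamma, scores[(kernel, gamma)])
```

High training accuracy with a huge gamma usually signals overfitting; the right gamma is best chosen by cross-validation rather than training score.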
824 | 822 | "metadata": {},
|
825 | 823 | "source": [
|
826 | 824 | "<p style=\"font-family:verdana; font-size:15px\">Decision Tree is a supervised learning algorithm that can be used for both classification and regression problems. It is very popular because of its interpretability. In this method, we split the population into homogeneous sets by asking a series of questions. Consider a problem where we want to decide what to do on a particular day. We can design a decision tree as follows: (Source: Python Machine Learning by Sebastian Raschka)</p>\n",
|
827 | | - "" |
| 825 | + "" |
828 | 826 | ]
|
829 | 827 | },
|
830 | 828 | {
|
|
874 | 872 | "cell_type": "markdown",
|
875 | 873 | "metadata": {},
|
876 | 874 | "source": [
|
877 | | - "\n", |
| 875 | + "\n", |
878 | 876 | "<p style=\"font-family:verdana; font-size:15px\">\n",
|
879 | 877 | "As you can see, entropy is maximal when p(i=1 | t) = p(i=0 | t) = 0.5, and minimal when all the samples belong to the same class. We define Gini impurity as:\n",
|
880 | 878 | "<br><br>\n",
|
|
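The impurity claims in the cell above are easy to verify numerically. A minimal sketch for the binary case (helper names here are my own, not from the notebook): entropy peaks at 1 bit when the classes are perfectly mixed and drops to 0 for a pure node, while Gini impurity peaks at 0.5.

```python
# Binary-case impurity measures, computed from p = P(class 1).
import numpy as np

def entropy(p):
    """Shannon entropy -p*log2(p) - (1-p)*log2(1-p); 0 for a pure node."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def gini(p):
    """Gini impurity 1 - p^2 - (1-p)^2 for the binary case."""
    return 1 - p**2 - (1 - p)**2

print(entropy(0.5))  # 1.0 -- maximal: classes perfectly mixed
print(entropy(1.0))  # 0.0 -- minimal: all samples in one class
print(gini(0.5))     # 0.5 -- Gini's maximum in the binary case
```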
886 | 884 | },
|
887 | 885 | {
|
888 | 886 | "cell_type": "code",
|
889 | | - "execution_count": 7, |
| 887 | + "execution_count": 13, |
890 | 888 | "metadata": {
|
891 | 889 | "collapsed": false
|
892 | 890 | },
|
|
901 | 899 | " presort=False, random_state=42, splitter='best')"
|
902 | 900 | ]
|
903 | 901 | },
|
904 | | - "execution_count": 7, |
| 902 | + "execution_count": 13, |
905 | 903 | "metadata": {},
|
906 | 904 | "output_type": "execute_result"
|
907 | 905 | }
|
|
929 | 927 | },
|
930 | 928 | {
|
931 | 929 | "cell_type": "code",
|
932 | | - "execution_count": 8, |
| 930 | + "execution_count": 14, |
933 | 931 | "metadata": {
|
934 | 932 | "collapsed": false
|
935 | 933 | },
|
|
940 | 938 | "0.97777777777777775"
|
941 | 939 | ]
|
942 | 940 | },
|
943 | | - "execution_count": 8, |
| 941 | + "execution_count": 14, |
944 | 942 | "metadata": {},
|
945 | 943 | "output_type": "execute_result"
|
946 | 944 | }
|
|
960 | 958 | },
|
961 | 959 | {
|
962 | 960 | "cell_type": "code",
|
963 | | - "execution_count": 9, |
| 961 | + "execution_count": 15, |
964 | 962 | "metadata": {
|
965 | 963 | "collapsed": false
|
966 | 964 | },
|
|
971 | 969 | "True"
|
972 | 970 | ]
|
973 | 971 | },
|
974 | | - "execution_count": 9, |
| 972 | + "execution_count": 15, |
975 | 973 | "metadata": {},
|
976 | 974 | "output_type": "execute_result"
|
977 | 975 | }
|
|
988 | 986 | },
|
989 | 987 | {
|
990 | 988 | "cell_type": "code",
|
991 | | - "execution_count": 10, |
| 989 | + "execution_count": 17, |
992 | 990 | "metadata": {
|
993 | 991 | "collapsed": false
|
994 | 992 | },
|
|
1000 | 998 | "<IPython.core.display.Image object>"
|
1001 | 999 | ]
|
1002 | 1000 | },
|
1003 | | - "execution_count": 10, |
| 1001 | + "execution_count": 17, |
1004 | 1002 | "metadata": {},
|
1005 | 1003 | "output_type": "execute_result"
|
1006 | 1004 | }
|
1007 | 1005 | ],
|
1008 | 1006 | "source": [
|
1009 | 1007 | "from IPython.display import Image\n",
|
1010 | | - "Image(filename=\"./images/tree.png\")" |
| 1008 | + "Image(filename=\"../images/tree.png\")" |
1011 | 1009 | ]
|
1012 | 1010 | },
|
1013 | 1011 | {
|
|
1057 | 1055 | },
|
1058 | 1056 | {
|
1059 | 1057 | "cell_type": "code",
|
1060 | | - "execution_count": 15, |
| 1058 | + "execution_count": 18, |
1061 | 1059 | "metadata": {
|
1062 | 1060 | "collapsed": false
|
1063 | 1061 | },
|
|
1149 | 1147 | "<ul>\n",
|
1150 | 1148 | "<li> Fit an additive model (ensemble) in a forward stage-wise manner.</li>\n",
|
1151 | 1149 | "<li> In each stage, introduce a weak learner to compensate for the shortcomings of the previous weak learners.</li>\n",
|
1152 | | - "<li> Shortcomings are identified by gradients</li>" |
| 1150 | + "<li> Shortcomings are identified by gradients.</li></ul>\n", |
| 1151 | + "\n", |
| 1152 | + "TODO: add theory and code" |
1153 | 1153 | ]
|
| 1154 | + }, |
| 1155 | + { |
| 1156 | + "cell_type": "markdown", |
| 1157 | + "metadata": {}, |
| 1158 | + "source": [ |
| 1159 | + "# Exercise" |
| 1160 | + ] |
| 1161 | + }, |
| 1162 | + { |
| 1163 | + "cell_type": "markdown", |
| 1164 | + "metadata": {}, |
| 1165 | + "source": [ |
| 1166 | + "<p style=\"font-family:verdana; font-size:15px\">\n", |
| 1167 | + "In <a href=\"Sentiment%20Analysis.ipynb\">this</a> exercise, we will implement a sentiment analysis model that can detect the sentiment of a text. We will also go through some feature extraction techniques and learn how to use textual data in machine learning models." |
| 1168 | + ] |
| 1169 | + }, |
| 1170 | + { |
| 1171 | + "cell_type": "code", |
| 1172 | + "execution_count": null, |
| 1173 | + "metadata": { |
| 1174 | + "collapsed": true |
| 1175 | + }, |
| 1176 | + "outputs": [], |
| 1177 | + "source": [] |
1154 | 1178 | }
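The gradient-boosting recipe listed above (additive model, forward stage-wise fitting, gradients pointing at the shortcomings) can be sketched with scikit-learn on the same Iris data used earlier in the notebook. Parameter values here are illustrative, not tuned.

```python
# Minimal gradient-boosting sketch: each stage fits a small tree to the
# gradient of the loss, building the ensemble stage-wise.
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

gbc = GradientBoostingClassifier(
    n_estimators=100,   # number of boosting stages (weak learners)
    learning_rate=0.1,  # shrinks each stage's contribution
    max_depth=3,        # each weak learner is a shallow tree
    random_state=42)
gbc.fit(X_train, y_train)
print(gbc.score(X_test, y_test))
```

The learning rate trades off against the number of stages: smaller steps with more trees typically generalize better than a few large steps.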
|
1155 | 1179 | ],
|
1156 | 1180 | "metadata": {
|