Commit a78f324

authored

Introduction to Machine Learning

1 parent 73274f1 commit a78f324

File tree

1 file changed

+79

-0

lines changed

`‎Introduction to Machine Learning`

Lines changed: 79 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,79 @@`
	`1`	`+Machine Learning`
	`2`	`+------------------`
	`3`	`+Building a model from example inputs to make data-driven predictions, rather than following strictly static program instructions.`
	`4`	`+Applications:`
	`5`	`+`
	`6`	`+1. Is this email spam?`
	`7`	`+2. How can cars drive themselves?`
	`8`	`+3. What will people buy?`
	`9`	`+`
	`10`	`+Machine Learning`
	`11`	`+-----------------`
	`12`	`+2 categories`
	`13`	`+a. Supervised;`
	`14`	`+ - Value Prediction`
	`15`	`+ - Needs training data containing the value being predicted; the trained model then predicts that value for new data;`
	`16`	`+b. Unsupervised;`
	`17`	`+ - Identify clusters of like data;`
	`18`	`+ - Data does not contain cluster membership, but the model provides access to the data by cluster;`
	`19`	`+`
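A minimal sketch of the two categories, in plain Python. The toy data and the names `predict_nearest` and `two_means` are hypothetical and chosen only for illustration; they are not part of the notes. The supervised half learns from examples that carry the value being predicted, while the unsupervised half receives unlabelled values and groups them.

```python
# Supervised: training data contains the value being predicted (the label).
training = [(1.0, "low"), (1.5, "low"), (8.0, "high"), (9.0, "high")]

def predict_nearest(x, examples):
    """Predict the label of x from its nearest labelled example (1-nearest-neighbour)."""
    _, nearest_label = min(examples, key=lambda ex: abs(ex[0] - x))
    return nearest_label

# Unsupervised: data has no labels; the model groups similar values itself.
def two_means(values, iterations=10):
    """Split 1-D values into two clusters with a tiny k-means (k=2)."""
    centers = [min(values), max(values)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            # Assign each value to the closer of the two centers.
            idx = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[idx].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters

print(predict_nearest(2.0, training))
print(two_means([1.0, 1.5, 8.0, 9.0, 2.0]))
```

The clustering function never sees a label, yet its result can still be used to look up values by group, which is the sense of "access to data by cluster" in the notes.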
	`20`	`+url -> https://www.continuum.io/downloads`
	`21`	`+`
	`22`	`+`
	`23`	`+Machine Learning WorkFlow:`
	`24`	`+--------------------------`
	`25`	`+An orchestrated and repeatable pattern which systematically transforms and processes information to create prediction solutions.`
	`26`	`+`
	`27`	`+1. Asking the right question;`
	`28`	`+2. Preparing data;`
	`29`	`+3. Selecting the algorithm;`
	`30`	`+4. Training the model;`
	`31`	`+5. Testing the model;`
	`32`	`+`
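Steps 4 and 5 imply that some data is used to train the model and other data to test it. One common way to arrange that is to hold back part of the prepared data. The sketch below assumes a 70/30 split and a fixed seed; the function name `split_train_test` and both defaults are illustrative choices, not something the notes specify.

```python
import random

def split_train_test(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    # A seeded generator keeps the split repeatable between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(10))
train, test = split_train_test(rows)
print(len(train), len(test))
```

Fixing the seed fits the "repeatable pattern" in the workflow definition: the same rows land in the test set every time the workflow is rerun.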
	`33`	`+1. Asking the Right Question`
	`34`	`+-----------------------------`
	`35`	`+a. Define scope (including data sources);`
	`36`	`+ - Using Pima Indian Diabetes data, predict which people will develop diabetes.`
	`37`	`+`
	`38`	`+b. Define target performance;`
	`39`	`+ - Using Pima Indian Diabetes data, predict with 70% or greater accuracy, which people will develop diabetes.`
	`40`	`+`
	`41`	`+c. Define context for usage;`
	`42`	`+ - Using Pima Indian Diabetes data, predict with 70% or greater accuracy which people are likely to develop diabetes.`
	`43`	`+`
	`44`	`+d. Define how solution is created;`
	`45`	`+ - Use the Machine Learning Workflow to process and transform Pima Indian data to create a prediction model. This model`
	`46`	`+ must predict which people are likely to develop diabetes with 70% or greater accuracy.`
	`47`	`+`
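The target performance in the problem statement can be checked mechanically once predictions exist. The sketch below computes plain accuracy (the fraction of correct predictions) and compares it with the 70% target. The outcome lists are made up for illustration and are not drawn from the Pima Indian Diabetes data; `accuracy` and `TARGET_ACCURACY` are hypothetical names.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the actual outcomes."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

TARGET_ACCURACY = 0.70  # from the problem statement: "70% or greater accuracy"

# Made-up outcomes (1 = develops diabetes, 0 = does not), for illustration only.
actual    = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 0, 0, 1, 0, 1, 1, 0]

score = accuracy(predicted, actual)
print(f"accuracy={score:.2f}, meets target: {score >= TARGET_ACCURACY}")
```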
	`48`	`+ 2. Preparing data`
	`49`	`+ ---------------------`
	`50`	`+ a. Tidy Data`
	`51`	`+ - Tidy datasets are easy to manipulate, model, and visualize, and have a specific structure:`
	`52`	`+ * each variable is a column;`
	`53`	`+ * each observation is a row;`
	`54`	`+ * each type of observational unit is a table;`
	`55`	`+ ** 50-80% of an ML project is spent getting, cleaning, and organizing data;`
	`56`	`+`
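The tidy structure is easiest to see by reshaping a table that violates it. In the sketch below, the "wide" table stores the visit number in its column names, so one row holds two observations. The helper `to_tidy`, the column names, and the values are all invented for illustration.

```python
# "Wide" table: one row per patient, one column per visit (not tidy,
# because the visit number is stored in the column names).
wide = [
    {"patient": "A", "glucose_visit1": 90, "glucose_visit2": 110},
    {"patient": "B", "glucose_visit1": 130, "glucose_visit2": 125},
]

def to_tidy(rows, id_column, prefix):
    """Return one row per (id, visit) observation with one column per variable."""
    tidy = []
    for row in rows:
        for name, value in row.items():
            if name.startswith(prefix):
                tidy.append({
                    id_column: row[id_column],
                    "visit": int(name[len(prefix):]),  # visit number from the column name
                    "glucose": value,
                })
    return tidy

tidy = to_tidy(wide, "patient", "glucose_visit")
for record in tidy:
    print(record)
```

After reshaping, each variable (`patient`, `visit`, `glucose`) is a column and each observation is a row, matching the first two bullet points.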
	`57`	`+Data Rule #1:`
	`58`	`+---------------`
	`59`	`+- The closer the data is to what you are predicting, the better;`
	`60`	`+`
	`61`	`+Data Rule #2:`
	`62`	`+--------------`
	`63`	`+- Data will never be in the format you need;`
	`64`	`+* Columns to eliminate - Not used, no values, duplicates;`
	`65`	`+* Correlated columns - Same information in a different format; they add little value and can confuse the algorithm;`
	`66`	`+* Molding Data - Adjusting data types and creating columns, if required;`
	`67`	`+`
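The column checks under Data Rule #2 can be sketched in plain Python. This assumes a small column-oriented table whose columns are either fully empty or fully numeric; the table contents, the helper names `pearson` and `clean`, and the 0.99 correlation threshold are all assumptions made for the example.

```python
from math import sqrt

# Column-oriented toy table; names and values are invented for illustration.
columns = {
    "age":       [25, 40, 33, 51],
    "age_copy":  [25, 40, 33, 51],              # duplicate of "age"
    "empty":     [None, None, None, None],      # no values
    "weight_kg": [60.0, 82.0, 75.0, 90.0],
    "weight_lb": [132.0, 180.4, 165.0, 198.0],  # same information, different unit
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / sqrt(var_x * var_y)

def clean(table, threshold=0.99):
    """Drop empty columns, exact duplicates, and columns highly correlated with a kept one."""
    kept = {}
    for name, values in table.items():
        if all(v is None for v in values):
            continue  # no values
        if any(values == other for other in kept.values()):
            continue  # exact duplicate
        if any(abs(pearson(values, other)) >= threshold for other in kept.values()):
            continue  # correlated: adds little value
        kept[name] = values
    return kept

print(list(clean(columns)))
```

Here `weight_lb` is dropped because it is `weight_kg` in another unit, while `age` and `weight_kg` both survive: they are related in this toy data but fall below the chosen threshold.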
	`68`	`+Data Rule #3:`
	`69`	`+----------------`
	`70`	`+Accurately predicting rare events is difficult;`
	`71`	`+`
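One way to see why rare events are hard to predict: when an event is rare, a model that never predicts it still scores high accuracy. The sketch below uses made-up labels in which the event occurs 5% of the time; the numbers are illustrative and not taken from any dataset in the notes.

```python
from collections import Counter

# Made-up labels: 1 = rare event, 0 = everything else.
labels = [0] * 95 + [1] * 5

counts = Counter(labels)
rare_fraction = counts[1] / len(labels)

# A "model" that always predicts the majority class.
majority_class = counts.most_common(1)[0][0]
predictions = [majority_class] * len(labels)

accuracy = sum(p == a for p, a in zip(predictions, labels)) / len(labels)
rare_found = sum(p == 1 and a == 1 for p, a in zip(predictions, labels))

print(f"rare fraction: {rare_fraction:.2f}")
print(f"accuracy: {accuracy:.2f}, rare events found: {rare_found}")
```

Checking the class balance before training, as `rare_fraction` does here, shows early whether accuracy alone will be a misleading measure.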
	`72`	`+Data Rule #4:`
	`73`	`+--------------`
	`74`	`+Track how you manipulate data;`
	`75`	`+`
	`76`	`+3.`
	`77`	`+`
	`78`	`+`
	`79`	`+`
