|
16 | 16 | "\n", |
17 | 17 | "The equation of a simple linear regression can be expressed as:\n", |
18 | 18 | "\n", |
19 | | - "$$ \\hat{y} = mx + b \\tag{1} $$\n", |
| 19 | + "$$ \\hat{y} = mx + b -- (1)$$ \n", |
20 | 20 | "\n", |
21 | 21 | "Thus, we have two parameters $m$ and $b$. We will see how can we use gradient descent and find the optimal values for these two parameters $m$ and $b$. \n" |
22 | 22 | ] |
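As a quick illustration of equation (1), the prediction is just a line in NumPy. The values of `m` and `b` below are made up for the example only; the notebook will instead learn them with gradient descent.

```python
import numpy as np

# Illustrative parameter values only; the notebook learns m and b via gradient descent.
m, b = 0.5, -1.0

x = np.array([0.0, 1.0, 2.0])   # a few sample inputs
y_hat = m * x + b               # equation (1): predicted values
print(y_hat)                    # [-1.  -0.5  0. ]
```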
|
33 | 33 | { |
34 | 34 | "cell_type": "code", |
35 | 35 | "execution_count": 1, |
36 | | - "metadata": {}, |
| 36 | + "metadata": { |
| 37 | + "collapsed": true |
| 38 | + }, |
37 | 39 | "outputs": [], |
38 | 40 | "source": [ |
39 | 41 | "import warnings\n", |
|
59 | 61 | { |
60 | 62 | "cell_type": "code", |
61 | 63 | "execution_count": 2, |
62 | | - "metadata": {}, |
| 64 | + "metadata": { |
| 65 | + "collapsed": true |
| 66 | + }, |
63 | 67 | "outputs": [], |
64 | 68 | "source": [ |
65 | 69 | "data = np.random.randn(500, 2)" |
|
162 | 166 | { |
163 | 167 | "cell_type": "code", |
164 | 168 | "execution_count": 6, |
165 | | - "metadata": {}, |
| 169 | + "metadata": { |
| 170 | + "collapsed": true |
| 171 | + }, |
166 | 172 | "outputs": [], |
167 | 173 | "source": [ |
168 | 174 | "theta = np.zeros(2)" |
|
203 | 209 | "\n", |
204 | 210 | "Mean Squared Error (MSE) of Regression is given as:\n", |
205 | 211 | "\n", |
206 | | - "$$J=\\frac{1}{N} \\sum_{i=1}^{N}(y-\\hat{y})^{2} \\tag{2}$$\n", |
| 212 | + "$$J=\\frac{1}{N} \\sum_{i=1}^{N}(y-\\hat{y})^{2} -- (2) $$\n", |
207 | 213 | "\n", |
208 | 214 | "\n", |
209 | 215 | "Where $N$ is the number of training samples, $y$ is the actual value and $\\hat{y}$ is the predicted value.\n", |
|
216 | 222 | { |
217 | 223 | "cell_type": "code", |
218 | 224 | "execution_count": 8, |
219 | | - "metadata": {}, |
| 225 | + "metadata": { |
| 226 | + "collapsed": true |
| 227 | + }, |
220 | 228 | "outputs": [], |
221 | 229 | "source": [ |
222 | 230 | "def loss_function(data,theta):\n", |
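The body of `loss_function` is cut off by the hunk above. Below is a minimal sketch of what equation (2) could look like in code, assuming (this layout is an assumption, not shown in the diff) that the first column of `data` holds $x$, the second holds $y$, and that `theta[0]` is $m$ and `theta[1]` is $b$.

```python
import numpy as np

def loss_function(data, theta):
    # Assumed layout (not visible in the diff): data[:, 0] = x, data[:, 1] = y.
    m, b = theta[0], theta[1]
    y_hat = m * data[:, 0] + b                 # equation (1)
    return np.mean((data[:, 1] - y_hat) ** 2)  # equation (2): MSE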
|
287 | 295 | "Gradients of loss function $J$ with respect to parameter $m$ is given as:\n", |
288 | 296 | "\n", |
289 | 297 | "\n", |
290 | | - "$ \\frac{d J}{d m}=\\frac{2}{N} \\sum_{i=1}^{N}-x_{i}\\left(y_{i}-\\left(m x_{i}+b\\right)\\right) \\tag{3}$\n", |
| 298 | + "$$ \\frac{d J}{d m}=\\frac{2}{N} \\sum_{i=1}^{N}-x_{i}\\left(y_{i}-\\left(m x_{i}+b\\right)\\right) -- (3) $$\n", |
291 | 299 | "\n", |
292 | 300 | "\n", |
293 | 301 | "Gradients of loss function $J$ with respect to parameter $b$ is given as:\n", |
294 | | - "$ \\frac{d J}{d b}=\\frac{2}{N} \\sum_{i=1}^{N}-\\left(y_{i}-\\left(m x_{i}+b\\right)\\right)\\tag{4} $\n", |
| 302 | + "\n", |
| 303 | + "\n", |
| 304 | + "$$ \\frac{d J}{d b}=\\frac{2}{N} \\sum_{i=1}^{N}-\\left(y_{i}-\\left(m x_{i}+b\\right)\\right) -- (4) $$\n", |
295 | 305 | "\n", |
296 | 306 | "We define a function called compute_gradients which takes the data and parameter theta as an input and returns the computed gradients: " |
297 | 307 | ] |
298 | 308 | }, |
299 | 309 | { |
300 | 310 | "cell_type": "code", |
301 | 311 | "execution_count": 10, |
302 | | - "metadata": {}, |
| 312 | + "metadata": { |
| 313 | + "collapsed": true |
| 314 | + }, |
303 | 315 | "outputs": [], |
304 | 316 | "source": [ |
305 | 317 | "def compute_gradients(data, theta):\n", |
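The rest of `compute_gradients` is likewise truncated. Under the same assumed column layout, equations (3) and (4) could be computed along these lines, returning both gradients in one array so they line up with `theta`.

```python
import numpy as np

def compute_gradients(data, theta):
    # Assumed layout (not visible in the diff): data[:, 0] = x, data[:, 1] = y.
    x, y = data[:, 0], data[:, 1]
    m, b = theta[0], theta[1]
    residual = y - (m * x + b)
    dJ_dm = (2.0 / len(data)) * np.sum(-x * residual)  # equation (3)
    dJ_db = (2.0 / len(data)) * np.sum(-residual)      # equation (4)
    return np.array([dJ_dm, dJ_db])
```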
|
360 | 372 | "\n", |
361 | 373 | "After computing gradients we need to update our model paramater according to our update rule as given below:\n", |
362 | 374 | "\n", |
363 | | - "$m=m-\\alpha \\frac{d J}{d m} \\tag{5}$\n", |
| 375 | + "$$m=m-\\alpha \\frac{d J}{d m} -- (5) $$ \n", |
364 | 376 | "\n", |
365 | | - "$ b=b-\\alpha \\frac{d J}{d b}\\tag{6}$\n", |
| 377 | + "$$ b=b-\\alpha \\frac{d J}{d b} --(6) $$\n", |
366 | 378 | "\n", |
367 | 379 | "\n", |
368 | 380 | "Since we stored $m$ in theta[0] and $b$ in theta[1], we can write our update equation as: \n", |
369 | 381 | "\n", |
370 | | - "$\\theta = \\theta - \\alpha \\frac{dJ}{d\\theta} \\tag{7}$\n", |
| 382 | + "$$\\theta = \\theta - \\alpha \\frac{dJ}{d\\theta} -- (7) $$\n", |
371 | 383 | "\n", |
372 | 384 | "As we learned in the previous section, updating gradients for just one time will not lead us to the convergence i.e minimum of the cost function, so we need to compute gradients and the update the model parameter for several iterations:\n", |
373 | 385 | "\n", |
|
378 | 390 | { |
379 | 391 | "cell_type": "code", |
380 | 392 | "execution_count": 12, |
381 | | - "metadata": {}, |
| 393 | + "metadata": { |
| 394 | + "collapsed": true |
| 395 | + }, |
382 | 396 | "outputs": [], |
383 | 397 | "source": [ |
384 | 398 | "num_iterations = 50000" |
|
394 | 408 | { |
395 | 409 | "cell_type": "code", |
396 | 410 | "execution_count": 13, |
397 | | - "metadata": {}, |
| 411 | + "metadata": { |
| 412 | + "collapsed": true |
| 413 | + }, |
398 | 414 | "outputs": [], |
399 | 415 | "source": [ |
400 | 416 | "lr = 1e-2" |
|
410 | 426 | { |
411 | 427 | "cell_type": "code", |
412 | 428 | "execution_count": 14, |
413 | | - "metadata": {}, |
| 429 | + "metadata": { |
| 430 | + "collapsed": true |
| 431 | + }, |
414 | 432 | "outputs": [], |
415 | 433 | "source": [ |
416 | 434 | "loss = []" |
|
426 | 444 | { |
427 | 445 | "cell_type": "code", |
428 | 446 | "execution_count": 15, |
429 | | - "metadata": {}, |
| 447 | + "metadata": { |
| 448 | + "collapsed": true |
| 449 | + }, |
430 | 450 | "outputs": [], |
431 | 451 | "source": [ |
432 | 452 | "theta = np.zeros(2)\n", |
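The training cell is also truncated after the `theta` initialization. Here is a minimal sketch of the loop the text describes, reusing `num_iterations`, `lr`, and `loss` from the earlier cells together with the sketched `compute_gradients` and `loss_function` above, applying equation (7) at each step.

```python
theta = np.zeros(2)
loss = []
for t in range(num_iterations):
    # Equation (7): step theta against the gradient, scaled by the learning rate.
    theta = theta - lr * compute_gradients(data, theta)
    loss.append(loss_function(data, theta))
```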
|
496 | 516 | ], |
497 | 517 | "metadata": { |
498 | 518 | "kernelspec": { |
499 | | - "display_name": "Python 2", |
| 519 | + "display_name": "Python [default]", |
500 | 520 | "language": "python", |
501 | 521 | "name": "python2" |
502 | 522 | }, |
|
510 | 530 | "name": "python", |
511 | 531 | "nbconvert_exporter": "python", |
512 | 532 | "pygments_lexer": "ipython2", |
513 | | - "version": "2.7.12" |
| 533 | + "version": "2.7.11" |
514 | 534 | } |
515 | 535 | }, |
516 | 536 | "nbformat": 4, |
|