Commit 4ee73dd

code update
1 parent 6c3fe5b commit 4ee73dd

File tree

1 file changed: +38 −18 lines changed


03. Gradient Descent and its variants/3.02 Performing Gradient Descent in Regression.ipynb

Lines changed: 38 additions & 18 deletions
@@ -16,7 +16,7 @@
 "\n",
 "The equation of a simple linear regression can be expressed as:\n",
 "\n",
-"$$ \\hat{y} = mx + b \\tag{1} $$\n",
+"$$ \\hat{y} = mx + b -- (1)$$ \n",
 "\n",
 "Thus, we have two parameters $m$ and $b$. We will see how can we use gradient descent and find the optimal values for these two parameters $m$ and $b$. \n"
 ]
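
For reference, equation (1) is the model the rest of the notebook optimises. A minimal prediction sketch in NumPy, assuming the convention stated later in the diff that theta[0] holds the slope m and theta[1] holds the intercept b (predict is a hypothetical helper, not part of the notebook):

import numpy as np

def predict(x, theta):
    # y_hat = m*x + b, with m in theta[0] and b in theta[1] (assumed convention)
    m, b = theta[0], theta[1]
    return m * x + b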
@@ -33,7 +33,9 @@
 {
 "cell_type": "code",
 "execution_count": 1,
-"metadata": {},
+"metadata": {
+"collapsed": true
+},
 "outputs": [],
 "source": [
 "import warnings\n",
@@ -59,7 +61,9 @@
 {
 "cell_type": "code",
 "execution_count": 2,
-"metadata": {},
+"metadata": {
+"collapsed": true
+},
 "outputs": [],
 "source": [
 "data = np.random.randn(500, 2)"
@@ -162,7 +166,9 @@
 {
 "cell_type": "code",
 "execution_count": 6,
-"metadata": {},
+"metadata": {
+"collapsed": true
+},
 "outputs": [],
 "source": [
 "theta = np.zeros(2)"
@@ -203,7 +209,7 @@
 "\n",
 "Mean Squared Error (MSE) of Regression is given as:\n",
 "\n",
-"$$J=\\frac{1}{N} \\sum_{i=1}^{N}(y-\\hat{y})^{2} \\tag{2}$$\n",
+"$$J=\\frac{1}{N} \\sum_{i=1}^{N}(y-\\hat{y})^{2} -- (2) $$\n",
 "\n",
 "\n",
 "Where $N$ is the number of training samples, $y$ is the actual value and $\\hat{y}$ is the predicted value.\n",
@@ -216,7 +222,9 @@
 {
 "cell_type": "code",
 "execution_count": 8,
-"metadata": {},
+"metadata": {
+"collapsed": true
+},
 "outputs": [],
 "source": [
 "def loss_function(data,theta):\n",
@@ -287,19 +295,23 @@
 "Gradients of loss function $J$ with respect to parameter $m$ is given as:\n",
 "\n",
 "\n",
-"$ \\frac{d J}{d m}=\\frac{2}{N} \\sum_{i=1}^{N}-x_{i}\\left(y_{i}-\\left(m x_{i}+b\\right)\\right) \\tag{3}$\n",
+"$$ \\frac{d J}{d m}=\\frac{2}{N} \\sum_{i=1}^{N}-x_{i}\\left(y_{i}-\\left(m x_{i}+b\\right)\\right) -- (3) $$\n",
 "\n",
 "\n",
 "Gradients of loss function $J$ with respect to parameter $b$ is given as:\n",
-"$ \\frac{d J}{d b}=\\frac{2}{N} \\sum_{i=1}^{N}-\\left(y_{i}-\\left(m x_{i}+b\\right)\\right)\\tag{4} $\n",
+"\n",
+"\n",
+"$$ \\frac{d J}{d b}=\\frac{2}{N} \\sum_{i=1}^{N}-\\left(y_{i}-\\left(m x_{i}+b\\right)\\right) -- (4) $$\n",
 "\n",
 "We define a function called compute_gradients which takes the data and parameter theta as an input and returns the computed gradients: "
 ]
 },
 {
 "cell_type": "code",
 "execution_count": 10,
-"metadata": {},
+"metadata": {
+"collapsed": true
+},
 "outputs": [],
 "source": [
 "def compute_gradients(data, theta):\n",
@@ -360,14 +372,14 @@
 "\n",
 "After computing gradients we need to update our model paramater according to our update rule as given below:\n",
 "\n",
-"$m=m-\\alpha \\frac{d J}{d m} \\tag{5}$\n",
+"$$m=m-\\alpha \\frac{d J}{d m} -- (5) $$ \n",
 "\n",
-"$ b=b-\\alpha \\frac{d J}{d b}\\tag{6}$\n",
+"$$ b=b-\\alpha \\frac{d J}{d b} --(6) $$\n",
 "\n",
 "\n",
 "Since we stored $m$ in theta[0] and $b$ in theta[1], we can write our update equation as: \n",
 "\n",
-"$\\theta = \\theta - \\alpha \\frac{dJ}{d\\theta} \\tag{7}$\n",
+"$$\\theta = \\theta - \\alpha \\frac{dJ}{d\\theta} -- (7) $$\n",
 "\n",
 "As we learned in the previous section, updating gradients for just one time will not lead us to the convergence i.e minimum of the cost function, so we need to compute gradients and the update the model parameter for several iterations:\n",
 "\n",
@@ -378,7 +390,9 @@
 {
 "cell_type": "code",
 "execution_count": 12,
-"metadata": {},
+"metadata": {
+"collapsed": true
+},
 "outputs": [],
 "source": [
 "num_iterations = 50000"
@@ -394,7 +408,9 @@
 {
 "cell_type": "code",
 "execution_count": 13,
-"metadata": {},
+"metadata": {
+"collapsed": true
+},
 "outputs": [],
 "source": [
 "lr = 1e-2"
@@ -410,7 +426,9 @@
 {
 "cell_type": "code",
 "execution_count": 14,
-"metadata": {},
+"metadata": {
+"collapsed": true
+},
 "outputs": [],
 "source": [
 "loss = []"
@@ -426,7 +444,9 @@
 {
 "cell_type": "code",
 "execution_count": 15,
-"metadata": {},
+"metadata": {
+"collapsed": true
+},
 "outputs": [],
 "source": [
 "theta = np.zeros(2)\n",
@@ -496,7 +516,7 @@
 ],
 "metadata": {
 "kernelspec": {
-"display_name": "Python 2",
+"display_name": "Python [default]",
 "language": "python",
 "name": "python2"
 },
@@ -510,7 +530,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython2",
-"version": "2.7.12"
+"version": "2.7.11"
 }
 },
 "nbformat": 4,

0 commit comments
