Commit 19dca46
committed: updated perceptron notebook
1 parent 8836172 commit 19dca46

1 file changed: notebooks/Multilayer_Perceptron.ipynb (284 additions, 7 deletions)
@@ -14,42 +14,259 @@
     "**License:** [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/) ([CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/))\n",
     "\n",
     "**Literature:**\n",
+    "\n",
+    "- Samy Baladram \"[Multilayer Perceptron, Explained: A Visual Guide with Mini 2D Dataset](https://towardsdatascience.com/multilayer-perceptron-explained-a-visual-guide-with-mini-2d-dataset-0ae8100c5d1c)\"\n",
     "\n"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {},
    "outputs": [],
    "source": [
-    "import numpy as np\n"
+    "import numpy as np"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": []
+   "source": [
+    "## Data\n",
+    "\n",
+    "We will use the data set from Samy Baladram's article listed above. It records temperature and humidity scores from 0 to 3 together with a corresponding decision on whether playing golf is possible. See [here](https://towardsdatascience.com/support-vector-classifier-explained-a-visual-guide-with-mini-2d-dataset-62e831e7b9e9) for an explanation of the data set."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "training_data = [\n",
+    "    (0, 0, 1),\n",
+    "    (1, 0, 0),\n",
+    "    (1, 1, 0),\n",
+    "    (2, 0, 0),\n",
+    "    (3, 1, 1),\n",
+    "    (3, 2, 1),\n",
+    "    (2, 3, 1),\n",
+    "    (3, 3, 0)\n",
+    "]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "test_data = [\n",
+    "    (0, 1, 0),\n",
+    "    (0, 2, 0),\n",
+    "    (1, 3, 1),\n",
+    "    (2, 2, 1),\n",
+    "    (3, 1, 1)\n",
+    "]"
+   ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Introduction"
+    "## Introduction\n",
+    "\n",
+    "The network consumes an input vector with two dimensions: one dimension is the score for temperature, the other the score for humidity.\n",
+    "\n",
+    "We design a first hidden layer with three nodes, a second hidden layer with two nodes, and an output layer with one node.\n",
+    "\n",
+    "All layers are fully connected. The weights between the input and the first hidden layer form a $2 \\times 3$ matrix $W$, and the weights between the first and the second hidden layer form a $3 \\times 2$ matrix $U$. A dimension check of this forward pass is sketched below."
+   ]
+  },
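As a sanity check on these dimensions, the forward pass chains as $(1 \times 2)(2 \times 3) \to (1 \times 3)$, $(1 \times 3)(3 \times 2) \to (1 \times 2)$, and $(1 \times 2)(2 \times 1) \to (1 \times 1)$. A minimal sketch, separate from the notebook cells, using zero placeholders instead of the random weights initialized below:

```python
import numpy as np

# placeholder arrays with the shapes described above (the values are irrelevant here)
x = np.zeros((1, 2))   # one input row: (temperature score, humidity score)
W = np.zeros((2, 3))   # input -> first hidden layer
U = np.zeros((3, 2))   # first hidden layer -> second hidden layer
O = np.zeros((2, 1))   # second hidden layer -> output

h1 = x.dot(W)          # (1, 2) . (2, 3) -> (1, 3)
h2 = h1.dot(U)         # (1, 3) . (3, 2) -> (1, 2)
y = h2.dot(O)          # (1, 2) . (2, 1) -> (1, 1)
print(h1.shape, h2.shape, y.shape)  # (1, 3) (1, 2) (1, 1)
```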
+  {
+   "cell_type": "code",
+   "execution_count": 37,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "W [[0.57916493 0.1989773 0.71685006]\n",
+      " [0.06420334 0.23917944 0.03679699]]\n",
+      "U [[0.44530666 0.60784364]\n",
+      " [0.77164787 0.40612112]\n",
+      " [0.83222563 0.69558143]]\n",
+      "bias_W [[0.90328775 0.89391968 0.63126251]]\n",
+      "bias_U [[0.93231218 0.7755912 ]]\n",
+      "O [[0.6369282 ]\n",
+      " [0.36734706]]\n",
+      "bias_O [[0.93714153]]\n"
+     ]
+    }
+   ],
+   "source": [
+    "W = np.random.random((2, 3))\n",
+    "print(f\"W {W}\")\n",
+    "U = np.random.random((3, 2))\n",
+    "print(f\"U {U}\")\n",
+    "bias_W = np.random.random((1, 3))\n",
+    "print(f\"bias_W {bias_W}\")\n",
+    "bias_U = np.random.random((1, 2))\n",
+    "print(f\"bias_U {bias_U}\")\n",
+    "O = np.random.random((2, 1))\n",
+    "print(f\"O {O}\")\n",
+    "bias_O = np.random.random((1, 1))\n",
+    "print(f\"bias_O {bias_O}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "input_data [[0 0]\n",
+      " [1 0]\n",
+      " [1 1]\n",
+      " [2 0]\n",
+      " [3 1]\n",
+      " [3 2]\n",
+      " [2 3]\n",
+      " [3 3]]\n",
+      "input_data_ground_truth [[1]\n",
+      " [0]\n",
+      " [0]\n",
+      " [0]\n",
+      " [1]\n",
+      " [1]\n",
+      " [1]\n",
+      " [0]]\n"
+     ]
+    }
+   ],
+   "source": [
+    "input_data = np.array([[x[0], x[1]] for x in training_data])\n",
+    "input_data_ground_truth = np.array([[x[2]] for x in training_data])\n",
+    "print(f\"input_data {input_data}\")\n",
+    "print(f\"input_data_ground_truth {input_data_ground_truth}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([1, 0])"
+      ]
+     },
+     "execution_count": 17,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# dotting a one-hot row vector with the data matrix selects a single row,\n",
+    "# here the second training example (1, 0)\n",
+    "one_hot = np.array([0, 1, 0, 0, 0, 0, 0, 0])\n",
+    "one_hot.dot(input_data)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[0 0] [1]\n",
+      "[1 0] [0]\n",
+      "[1 1] [0]\n",
+      "[2 0] [0]\n",
+      "[3 1] [1]\n",
+      "[3 2] [1]\n",
+      "[2 3] [1]\n",
+      "[3 3] [0]\n"
+     ]
+    }
+   ],
+   "source": [
+    "for row, true_score in zip(input_data, input_data_ground_truth):\n",
+    "    print(row, true_score)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 38,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def sigmoid(z):\n",
+    "    return 1 / (1 + np.exp(-z))"
+   ]
+  },
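The backpropagation section below will need the derivative of the sigmoid. A convenient identity is $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$; the following sketch, with the illustrative helper name `sigmoid_prime` (not defined in the notebook), checks the identity against a central finite difference:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_prime(z):
    # derivative of the sigmoid via the identity s'(z) = s(z) * (1 - s(z))
    s = sigmoid(z)
    return s * (1 - s)

# compare with a central finite difference at a few points
z = np.array([-2.0, 0.0, 1.5])
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
print(np.allclose(sigmoid_prime(z), numeric))  # True
```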
+  {
+   "cell_type": "code",
+   "execution_count": 42,
+   "metadata": {},
+   "outputs": [
+   "source": [
+    "def loss_function(predicted, actual):\n",
+    "    # log-likelihood of the prediction; the binary cross-entropy loss is its negative\n",
+    "    return np.log(predicted) if actual else np.log(1 - predicted)"
    ]
   },
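As written, `loss_function` returns the log-likelihood: 0 for a perfect prediction and increasingly negative for worse ones. The binary cross-entropy loss referenced below is its negative, averaged over a batch. A vectorized form under that convention, with `bce_loss` as an illustrative name not used in the notebook:

```python
import numpy as np

def bce_loss(predicted, actual):
    # binary cross-entropy: -(y * log(p) + (1 - y) * log(1 - p)), batch mean
    predicted = np.clip(predicted, 1e-12, 1 - 1e-12)  # guard against log(0)
    return -np.mean(actual * np.log(predicted)
                    + (1 - actual) * np.log(1 - predicted))

print(bce_loss(np.array([0.9, 0.1]), np.array([1, 0])))  # ~0.105
```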
   {
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": []
+   "source": [
+    "learning_rate = 0.01"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 50,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "output 0.9658545034605426 - true score: 1 - loss -0.03474207364924937\n",
+      "output 0.986959889282255 - true score: 0 - loss -4.3397252318950565\n",
+      "output 0.9894527613414252 - true score: 0 - loss -4.5518911918432865\n",
+      "output 0.995086368253607 - true score: 0 - loss -5.315741947375225\n",
+      "output 0.9985133193959704 - true score: 1 - loss -0.0014877868101581678\n",
+      "output 0.9988002123932317 - true score: 1 - loss -0.0012005079281317262\n",
+      "output 0.9974135571146144 - true score: 1 - loss -0.002589793507494032\n",
+      "output 0.9990317957413032 - true score: 0 - loss -6.940067481896969\n"
+     ]
+    }
+   ],
+   "source": [
+    "for row, true_score in zip(input_data, input_data_ground_truth):\n",
+    "    # print(row, true_score)\n",
+    "    hidden_layer_W = np.maximum(row.dot(W) + bias_W, 0)[0]  # ReLU activation\n",
+    "    # print(f\"hidden_layer_W {hidden_layer_W}\")\n",
+    "    hidden_layer_U = np.maximum(hidden_layer_W.dot(U) + bias_U, 0)[0]  # ReLU activation\n",
+    "    # print(f\"hidden_layer_U {hidden_layer_U}\")\n",
+    "    output = sigmoid(hidden_layer_U.dot(O) + bias_O)[0][0]\n",
+    "    loss = loss_function(output, true_score[0])\n",
+    "    print(f\"output {output} - true score: {true_score[0]} - loss {loss}\")"
+   ]
   },
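With random, untrained weights the sigmoid output is close to 1 for every row, so the negative examples score badly. To reduce the whole data set to a single number, the same forward pass can be wrapped in a function and the log-likelihoods averaged; `forward` and `mean_bce` are illustrative names that reuse `W`, `U`, `O`, the biases, and `loss_function` from the cells above:

```python
def forward(row):
    # same computation as the loop above: two ReLU layers, then a sigmoid output
    hidden_layer_W = np.maximum(row.dot(W) + bias_W, 0)[0]
    hidden_layer_U = np.maximum(hidden_layer_W.dot(U) + bias_U, 0)[0]
    return sigmoid(hidden_layer_U.dot(O) + bias_O)[0][0]

# mean binary cross-entropy = negative mean log-likelihood over the training set
mean_bce = -np.mean([loss_function(forward(row), true_score[0])
                     for row, true_score in zip(input_data, input_data_ground_truth)])
print(mean_bce)
```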
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Inference"
+    "Adding a loss function using binary cross-entropy:"
    ]
   },
   {
@@ -66,6 +283,52 @@
     "## Training"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Backpropagation\n",
+    "\n",
+    "### Derivative Rules\n",
+    "\n",
+    "#### Constant Rule\n",
+    "\n",
+    "For $y = k$ with $k$ a constant: $\\frac{dy}{dx} = 0$\n",
+    "\n",
+    "#### Power Rule\n",
+    "\n",
+    "For $y = x^n$ the derivative is: $\\frac{dy}{dx} = n x^{n-1}$\n",
+    "\n",
+    "#### Exponential Rule\n",
+    "\n",
+    "For $y = e^{kx}$ the derivative is: $\\frac{dy}{dx} = k e^{kx}$\n",
+    "\n",
+    "#### Natural Logarithm Rule\n",
+    "\n",
+    "For $y = \\ln(x)$ the derivative is: $\\frac{dy}{dx} = \\frac{1}{x}$\n",
+    "\n",
+    "#### Sum and Difference Rule\n",
+    "\n",
+    "For $y = u + v$ or $y = u - v$ the derivatives are: $\\frac{dy}{dx} = \\frac{du}{dx} + \\frac{dv}{dx}$ or $\\frac{dy}{dx} = \\frac{du}{dx} - \\frac{dv}{dx}$\n",
+    "\n",
+    "#### Product Rule\n",
+    "\n",
+    "For $y = u v$ the derivative is: $\\frac{dy}{dx} = \\frac{du}{dx} v + \\frac{dv}{dx} u$\n",
+    "\n",
+    "#### Chain Rule\n",
+    "\n",
+    "For $y(x) = u(v(x))$ the derivative is: $\\frac{dy}{dx} = \\frac{du}{dv} \\frac{dv}{dx} = u'(v(x)) \\, v'(x)$\n"
+   ]
+  },
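Backpropagation composes exactly these rules. As one concrete instance, chaining the natural logarithm rule and the exponential rule through the sigmoid output collapses the output-layer gradient of the binary cross-entropy $L = -[y \ln \sigma(z) + (1-y)\ln(1-\sigma(z))]$ to $\frac{dL}{dz} = \sigma(z) - y$. A self-contained numerical check of that identity:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def bce(z, y):
    # binary cross-entropy of sigmoid(z) against label y
    p = sigmoid(z)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

z, y, eps = 0.7, 1.0, 1e-6
numeric = (bce(z + eps, y) - bce(z - eps, y)) / (2 * eps)
analytic = sigmoid(z) - y              # chain rule result: dL/dz = sigma(z) - y
print(np.isclose(numeric, analytic))   # True
```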
   {
    "cell_type": "code",
    "execution_count": null,
@@ -82,8 +345,22 @@
    }
   ],
   "metadata": {
+   "kernelspec": {
+    "display_name": "Python 3",
+    "language": "python",
+    "name": "python3"
+   },
    "language_info": {
-    "name": "python"
+    "codemirror_mode": {
+     "name": "ipython",
+     "version": 3
+    },
+    "file_extension": ".py",
+    "mimetype": "text/x-python",
+    "name": "python",
+    "nbconvert_exporter": "python",
+    "pygments_lexer": "ipython3",
+    "version": "3.12.7"
    }
   },
   "nbformat": 4,
