Commit 19dca46
committed: updated perceptron notebook
1 parent 8836172 commit 19dca46

1 file changed: notebooks/Multilayer_Perceptron.ipynb (284 additions, 7 deletions)
@@ -14,42 +14,259 @@
     "**License:** [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/) ([CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/))\n",
     "\n",
     "**Literature:**\n",
+    "\n",
+    "- Samy Baladram \"[Multilayer Perceptron, Explained: A Visual Guide with Mini 2D Dataset](https://towardsdatascience.com/multilayer-perceptron-explained-a-visual-guide-with-mini-2d-dataset-0ae8100c5d1c)\"\n",
     "\n"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {},
    "outputs": [],
    "source": [
-    "import numpy as np\n"
+    "import numpy as np"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
-   "source": []
+   "source": [
+    "## Data\n",
+    "\n",
+    "We will use the data set from Samy Baladram's article listed above. It records temperature and humidity scores from 0 to 3 together with a corresponding decision on whether playing golf is possible. See [here](https://towardsdatascience.com/support-vector-classifier-explained-a-visual-guide-with-mini-2d-dataset-62e831e7b9e9) for an explanation of the data set."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "training_data = [\n",
+    "    (0, 0, 1),\n",
+    "    (1, 0, 0),\n",
+    "    (1, 1, 0),\n",
+    "    (2, 0, 0),\n",
+    "    (3, 1, 1),\n",
+    "    (3, 2, 1),\n",
+    "    (2, 3, 1),\n",
+    "    (3, 3, 0)\n",
+    "]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "test_data = [\n",
+    "    (0, 1, 0),\n",
+    "    (0, 2, 0),\n",
+    "    (1, 3, 1),\n",
+    "    (2, 2, 1),\n",
+    "    (3, 1, 1)\n",
+    "]"
+   ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Introduction"
+    "## Introduction\n",
+    "\n",
+    "The network consumes an input vector with two dimensions: one dimension is the score for temperature, the other the score for humidity.\n",
+    "\n",
+    "We design a first hidden layer with three nodes, a second hidden layer with two nodes, and an output layer with one node.\n",
+    "\n",
+    "All layers are fully connected. The weights between the input and the first hidden layer form a $2 \\times 3$ matrix $W$, and the weights between the first and the second hidden layer form a $3 \\times 2$ matrix $U$. A dimension check of this forward pass is sketched below."
+   ]
+  },
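As a sanity check on these dimensions, the forward pass chains as $(1 \times 2)(2 \times 3) \to (1 \times 3)$, $(1 \times 3)(3 \times 2) \to (1 \times 2)$, and $(1 \times 2)(2 \times 1) \to (1 \times 1)$. A minimal sketch, separate from the notebook cells, using zero placeholders instead of the random weights initialized below:

```python
import numpy as np

# placeholder arrays with the shapes described above (the values are irrelevant here)
x = np.zeros((1, 2))   # one input row: (temperature score, humidity score)
W = np.zeros((2, 3))   # input -> first hidden layer
U = np.zeros((3, 2))   # first hidden layer -> second hidden layer
O = np.zeros((2, 1))   # second hidden layer -> output

h1 = x.dot(W)          # (1, 2) . (2, 3) -> (1, 3)
h2 = h1.dot(U)         # (1, 3) . (3, 2) -> (1, 2)
y = h2.dot(O)          # (1, 2) . (2, 1) -> (1, 1)
print(h1.shape, h2.shape, y.shape)  # (1, 3) (1, 2) (1, 1)
```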
+  {
+   "cell_type": "code",
+   "execution_count": 37,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "W [[0.57916493 0.1989773 0.71685006]\n",
+      " [0.06420334 0.23917944 0.03679699]]\n",
+      "U [[0.44530666 0.60784364]\n",
+      " [0.77164787 0.40612112]\n",
+      " [0.83222563 0.69558143]]\n",
+      "bias_W [[0.90328775 0.89391968 0.63126251]]\n",
+      "bias_U [[0.93231218 0.7755912 ]]\n",
+      "O [[0.6369282 ]\n",
+      " [0.36734706]]\n",
+      "bias_O [[0.93714153]]\n"
+     ]
+    }
+   ],
+   "source": [
+    "W = np.random.random((2, 3))\n",
+    "print(f\"W {W}\")\n",
+    "U = np.random.random((3, 2))\n",
+    "print(f\"U {U}\")\n",
+    "bias_W = np.random.random((1, 3))\n",
+    "print(f\"bias_W {bias_W}\")\n",
+    "bias_U = np.random.random((1, 2))\n",
+    "print(f\"bias_U {bias_U}\")\n",
+    "O = np.random.random((2, 1))\n",
+    "print(f\"O {O}\")\n",
+    "bias_O = np.random.random((1, 1))\n",
+    "print(f\"bias_O {bias_O}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "input_data [[0 0]\n",
+      " [1 0]\n",
+      " [1 1]\n",
+      " [2 0]\n",
+      " [3 1]\n",
+      " [3 2]\n",
+      " [2 3]\n",
+      " [3 3]]\n",
+      "input_data_ground_truth [[1]\n",
+      " [0]\n",
+      " [0]\n",
+      " [0]\n",
+      " [1]\n",
+      " [1]\n",
+      " [1]\n",
+      " [0]]\n"
+     ]
+    }
+   ],
+   "source": [
+    "input_data = np.array([[x[0], x[1]] for x in training_data])\n",
+    "input_data_ground_truth = np.array([[x[2]] for x in training_data])\n",
+    "print(f\"input_data {input_data}\")\n",
+    "print(f\"input_data_ground_truth {input_data_ground_truth}\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array([1, 0])"
+      ]
+     },
+     "execution_count": 17,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# dotting a one-hot row vector with the data matrix selects a single row,\n",
+    "# here the second training example (1, 0)\n",
+    "one_hot = np.array([0, 1, 0, 0, 0, 0, 0, 0])\n",
+    "one_hot.dot(input_data)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[0 0] [1]\n",
+      "[1 0] [0]\n",
+      "[1 1] [0]\n",
+      "[2 0] [0]\n",
+      "[3 1] [1]\n",
+      "[3 2] [1]\n",
+      "[2 3] [1]\n",
+      "[3 3] [0]\n"
+     ]
+    }
+   ],
+   "source": [
+    "for row, true_score in zip(input_data, input_data_ground_truth):\n",
+    "    print(row, true_score)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 38,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def sigmoid(z):\n",
+    "    return 1 / (1 + np.exp(-z))"
+   ]
+  },
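The backpropagation section below will need the derivative of the sigmoid. A convenient identity is $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$; the following sketch, with the illustrative helper name `sigmoid_prime` (not defined in the notebook), checks the identity against a central finite difference:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_prime(z):
    # derivative of the sigmoid via the identity s'(z) = s(z) * (1 - s(z))
    s = sigmoid(z)
    return s * (1 - s)

# compare with a central finite difference at a few points
z = np.array([-2.0, 0.0, 1.5])
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
print(np.allclose(sigmoid_prime(z), numeric))  # True
```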
+  {
+   "cell_type": "code",
+   "execution_count": 42,
+   "metadata": {},
+   "outputs": [
+   "source": [
+    "def loss_function(predicted, actual):\n",
+    "    # log-likelihood of the prediction; the binary cross-entropy loss is its negative\n",
+    "    return np.log(predicted) if actual else np.log(1 - predicted)"
    ]
   },
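As written, `loss_function` returns the log-likelihood: 0 for a perfect prediction and increasingly negative for worse ones. The binary cross-entropy loss referenced below is its negative, averaged over a batch. A vectorized form under that convention, with `bce_loss` as an illustrative name not used in the notebook:

```python
import numpy as np

def bce_loss(predicted, actual):
    # binary cross-entropy: -(y * log(p) + (1 - y) * log(1 - p)), batch mean
    predicted = np.clip(predicted, 1e-12, 1 - 1e-12)  # guard against log(0)
    return -np.mean(actual * np.log(predicted)
                    + (1 - actual) * np.log(1 - predicted))

print(bce_loss(np.array([0.9, 0.1]), np.array([1, 0])))  # ~0.105
```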
   {
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
    "outputs": [],
-   "source": []
+   "source": [
+    "learning_rate = 0.01"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 50,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "output 0.9658545034605426 - true score: 1 - loss -0.03474207364924937\n",
+      "output 0.986959889282255 - true score: 0 - loss -4.3397252318950565\n",
+      "output 0.9894527613414252 - true score: 0 - loss -4.5518911918432865\n",
+      "output 0.995086368253607 - true score: 0 - loss -5.315741947375225\n",
+      "output 0.9985133193959704 - true score: 1 - loss -0.0014877868101581678\n",
+      "output 0.9988002123932317 - true score: 1 - loss -0.0012005079281317262\n",
+      "output 0.9974135571146144 - true score: 1 - loss -0.002589793507494032\n",
+      "output 0.9990317957413032 - true score: 0 - loss -6.940067481896969\n"
+     ]
+    }
+   ],
+   "source": [
+    "for row, true_score in zip(input_data, input_data_ground_truth):\n",
+    "    # print(row, true_score)\n",
+    "    hidden_layer_W = np.maximum(row.dot(W) + bias_W, 0)[0]  # ReLU activation\n",
+    "    # print(f\"hidden_layer_W {hidden_layer_W}\")\n",
+    "    hidden_layer_U = np.maximum(hidden_layer_W.dot(U) + bias_U, 0)[0]  # ReLU activation\n",
+    "    # print(f\"hidden_layer_U {hidden_layer_U}\")\n",
+    "    output = sigmoid(hidden_layer_U.dot(O) + bias_O)[0][0]\n",
+    "    loss = loss_function(output, true_score[0])\n",
+    "    print(f\"output {output} - true score: {true_score[0]} - loss {loss}\")"
+   ]
   },
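With random, untrained weights the sigmoid output is close to 1 for every row, so the negative examples score badly. To reduce the whole data set to a single number, the same forward pass can be wrapped in a function and the log-likelihoods averaged; `forward` and `mean_bce` are illustrative names that reuse `W`, `U`, `O`, the biases, and `loss_function` from the cells above:

```python
def forward(row):
    # same computation as the loop above: two ReLU layers, then a sigmoid output
    hidden_layer_W = np.maximum(row.dot(W) + bias_W, 0)[0]
    hidden_layer_U = np.maximum(hidden_layer_W.dot(U) + bias_U, 0)[0]
    return sigmoid(hidden_layer_U.dot(O) + bias_O)[0][0]

# mean binary cross-entropy = negative mean log-likelihood over the training set
mean_bce = -np.mean([loss_function(forward(row), true_score[0])
                     for row, true_score in zip(input_data, input_data_ground_truth)])
print(mean_bce)
```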
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Inference"
+    "Adding a loss function using binary cross-entropy:"
    ]
   },
   {
@@ -66,6 +283,52 @@
     "## Training"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Backpropagation\n",
+    "\n",
+    "### Derivative Rules\n",
+    "\n",
+    "#### Constant Rule\n",
+    "\n",
+    "For $y = k$ with $k$ a constant: $\\frac{dy}{dx} = 0$\n",
+    "\n",
+    "#### Power Rule\n",
+    "\n",
+    "For $y = x^n$ the derivative is: $\\frac{dy}{dx} = n x^{n-1}$\n",
+    "\n",
+    "#### Exponential Rule\n",
+    "\n",
+    "For $y = e^{kx}$ the derivative is: $\\frac{dy}{dx} = k e^{kx}$\n",
+    "\n",
+    "#### Natural Logarithm Rule\n",
+    "\n",
+    "For $y = \\ln(x)$ the derivative is: $\\frac{dy}{dx} = \\frac{1}{x}$\n",
+    "\n",
+    "#### Sum and Difference Rule\n",
+    "\n",
+    "For $y = u + v$ or $y = u - v$ the derivatives are: $\\frac{dy}{dx} = \\frac{du}{dx} + \\frac{dv}{dx}$ or $\\frac{dy}{dx} = \\frac{du}{dx} - \\frac{dv}{dx}$\n",
+    "\n",
+    "#### Product Rule\n",
+    "\n",
+    "For $y = u v$ the derivative is: $\\frac{dy}{dx} = \\frac{du}{dx} v + \\frac{dv}{dx} u$\n",
+    "\n",
+    "#### Chain Rule\n",
+    "\n",
+    "For $y(x) = u(v(x))$ the derivative is: $\\frac{dy}{dx} = \\frac{du}{dv} \\frac{dv}{dx} = u'(v(x)) \\, v'(x)$\n"
+   ]
+  },
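Backpropagation composes exactly these rules. As one concrete instance, chaining the natural logarithm rule and the exponential rule through the sigmoid output collapses the output-layer gradient of the binary cross-entropy $L = -[y \ln \sigma(z) + (1-y)\ln(1-\sigma(z))]$ to $\frac{dL}{dz} = \sigma(z) - y$. A self-contained numerical check of that identity:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def bce(z, y):
    # binary cross-entropy of sigmoid(z) against label y
    p = sigmoid(z)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

z, y, eps = 0.7, 1.0, 1e-6
numeric = (bce(z + eps, y) - bce(z - eps, y)) / (2 * eps)
analytic = sigmoid(z) - y              # chain rule result: dL/dz = sigma(z) - y
print(np.isclose(numeric, analytic))   # True
```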
   {
    "cell_type": "code",
    "execution_count": null,
@@ -82,8 +345,22 @@
    }
   ],
   "metadata": {
+   "kernelspec": {
+    "display_name": "Python 3",
+    "language": "python",
+    "name": "python3"
+   },
    "language_info": {
-    "name": "python"
+    "codemirror_mode": {
+     "name": "ipython",
+     "version": 3
+    },
+    "file_extension": ".py",
+    "mimetype": "text/x-python",
+    "name": "python",
+    "nbconvert_exporter": "python",
+    "pygments_lexer": "ipython3",
+    "version": "3.12.7"
    }
   },
   "nbformat": 4,
