@@ -20,7 +20,7 @@ revealOptions:
  - 📍 Principal Data Scientist, DSAI, Moderna
  - 🎓 ScD, MIT Biological Engineering.
- - 🧬 Inverse protein, mRNA, and molecule design
+ - 🧬 Inverse protein, mRNA, and molecule design.

  ---
@@ -34,7 +34,7 @@ If you write automated tests for your work, then:
  ---

  <!-- markdownlint-disable MD026 -->
- ## also...
+ ## 👀 also...
  <!-- markdownlint-enable MD026 -->

  - Tests apply to all software.
@@ -43,7 +43,7 @@ If you write automated tests for your work, then:
  ---

- ## Testing in Software
+ ## 💻 Testing in Software

  - 🤔 Why do testing?
  - 🧪 What does a test look like?
@@ -243,7 +243,7 @@ _Used to check that a system is working properly._
  ---

  <!-- markdownlint-disable MD026 -->
- ## Hadley says...
+ ## 🧔‍♂️ Hadley says...
  <!-- markdownlint-enable MD026 -->

  <!-- markdownlint-disable MD033 -->
@@ -254,7 +254,7 @@ _You can't do data science in a GUI..._
  ----

- ### Data science needs code
+ ### 💻 Data science needs code

  ```python
  >>> code == software
@@ -265,17 +265,17 @@ _...implying that you'll be writing some kind of software to do data science wor
  ----

- ### Test your code
+ ### 👀 Test your code

  Testing your DS code will be good for you!

  ---

- ## Testing in Data Science
+ ## 😎 Testing in Data Science

  ----

- ### 🧠 Machine Learning Model Code
+ ### 🧠 Testing Machine Learning Model Code

  ```python
  from project.models import Model
@@ -355,7 +355,7 @@ Ensure that model can be trained for at least 2 epochs.
  ---

- ### 📀 Data Testing
+ ### 📀 Testing Data

  ----
@@ -382,6 +382,9 @@ import pandera as pa
  df_schema = pa.DataFrameSchema(
      columns={
+         # Declare that `some_column` must exist,
+         # that it must be integer type,
+         # and that it cannot contain any nulls.
          "some_column": pa.Column(int, nullable=False)
      }
  )
@@ -404,6 +407,41 @@ Code is much more readable.
  ---

+ ### 🚇 Testing Pipeline Code
+
+ ----
+
+ #### 💡 Pipelines are functions
+
+ ```python
+ def pipeline(data):
+     d1 = func1(data)
+     d2 = func2(d1)
+     d3 = func3(d1)
+     d4 = func4(d2, d3)
+     return outfunc(d4)
+ ```
+
+ ----
+
+ #### 👆 Each unit function can be unit tested
+
+ ```python
+ def test_func1(data):
+     ...
+
+ def test_func2(data):
+     ...
+
+ def test_func3(data):
+     ...
+
+ def test_func4(data):
+     ...
+ ```
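The test stubs on this slide leave their bodies elided. As a minimal sketch of what filled-in versions might look like, the example below invents simple bodies for `func1` through `func4` and `outfunc` (all hypothetical stand-ins, not the deck's actual functions) and uses inline inputs instead of the `data` argument the slide passes in, which in pytest would typically be a fixture.

```python
# Hypothetical stand-ins for the slide's unit functions; the real ones
# would come from your own project.
def func1(data):
    """Drop missing values."""
    return [x for x in data if x is not None]


def func2(d1):
    """Total of the cleaned values."""
    return sum(d1)


def func3(d1):
    """Number of cleaned values."""
    return len(d1)


def func4(d2, d3):
    """Mean computed from a total and a count."""
    return d2 / d3


def outfunc(d4):
    """Round the result for reporting."""
    return round(d4, 2)


def pipeline(data):
    d1 = func1(data)
    d2 = func2(d1)
    d3 = func3(d1)
    d4 = func4(d2, d3)
    return outfunc(d4)


# One small test per unit function, then one for the whole pipeline.
def test_func1():
    assert func1([1, None, 2]) == [1, 2]


def test_func2():
    assert func2([1, 2, 3]) == 6


def test_func3():
    assert func3([1, 2, 3]) == 3


def test_func4():
    assert func4(6, 3) == 2.0


def test_pipeline():
    assert pipeline([1, None, 2, 3]) == 2.0
```

Because each step is an ordinary function, a failing unit test points at one step, while the pipeline-level test only confirms that the steps compose as intended.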
+
+ ---
+
  ## ☁️ Philosophy

  Integrating testing into your work is one manifestation of _defensive programming_.
@@ -428,7 +466,7 @@ _Do unto others what you would have others do unto you._
  ---

- ## Summary
+ ## 😎 Summary

  1. ✅ Write tests for your **code**.
  2. ✅ Write tests for your **data**.