Commit a7344ae

Update talk with pipeline unit tests
1 parent a2a4400 commit a7344ae

1 file changed: +48 -10 lines changed


‎talk.md

Lines changed: 48 additions & 10 deletions
@@ -20,7 +20,7 @@ revealOptions:

 - 📍 Principal Data Scientist, DSAI, Moderna
 - 🎓 ScD, MIT Biological Engineering.
-- 🧬 Inverse protein, mRNA, and molecule design
+- 🧬 Inverse protein, mRNA, and molecule design.

 ---

@@ -34,7 +34,7 @@ If you write automated tests for your work, then:
 ---

 <!-- markdownlint-disable MD026 -->
-## also...
+## 👀 also...
 <!-- markdownlint-enable MD026 -->

 - Tests apply to all software.
@@ -43,7 +43,7 @@ If you write automated tests for your work, then:

 ---

-## Testing in Software
+## 💻 Testing in Software

 - 🤔 Why do testing?
 - 🧪 What does a test look like?
@@ -243,7 +243,7 @@ _Used to check that a system is working properly._
 ---

 <!-- markdownlint-disable MD026 -->
-## Hadley says...
+## 🧔‍♂️ Hadley says...
 <!-- markdownlint-enable MD026 -->

 <!-- markdownlint-disable MD033 -->
@@ -254,7 +254,7 @@ _You can't do data science in a GUI..._

 ----

-### Data science needs code
+### 💻 Data science needs code

 ```python
 >>> code == software
@@ -265,17 +265,17 @@ _...implying that you'll be writing some kind of software to do data science wor

 ----

-### Test your code
+### 👀 Test your code

 Testing your DS code will be good for you!

 ---

-## Testing in Data Science
+## 😎Testing in Data Science

 ----

-### 🧠 Machine Learning Model Code
+### 🧠 Testing Machine Learning Model Code

 ```python
 from project.models import Model
@@ -355,7 +355,7 @@ Ensure that model can be trained for at least 2 epochs.

 ---

-### 📀 Data Testing
+### 📀 Testing Data

 ----

@@ -382,6 +382,9 @@ import pandera as pa

 df_schema = pa.DataFrameSchema(
     columns={
+        # Declare that `some_column` must exist,
+        # that it must be integer type,
+        # and that it cannot contain any nulls.
         "some_column": pa.Column(int, nullable=False)
     }
 )
@@ -404,6 +407,41 @@ Code is much more readable.

 ---

+### 🚇 Testing Pipeline Code
+
+----
+
+#### 💡 Pipelines are functions
+
+```python
+def pipeline(data):
+    d1 = func1(data)
+    d2 = func2(d1)
+    d3 = func3(d1)
+    d4 = func4(d2, d3)
+    return outfunc(d4)
+```
+
+----
+
+#### 👆 Each unit function can be unit tested
+
+```python
+def test_func1(data):
+    ...
+
+def test_func2(data):
+    ...
+
+def test_func3(data):
+    ...
+
+def test_func4(data):
+    ...
+```
+
+---
+
 ## ☁️ Philosophy

 Integrating testing into your work is one manifestation of _defensive programming_.
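
The pipeline slides added in the hunk above leave every test body as `...`. As a rough sketch of what filling them in might look like, the block below assumes toy implementations of `func1` through `func4` and `outfunc`. These bodies are hypothetical (the talk does not define them) and exist only so the tests have something to check. Each unit gets one small test, plus one test of the composed `pipeline`.

```python
# Hypothetical stand-ins for the slide's func1..func4 and outfunc.
# The talk does not define these; they are assumptions for illustration.
def func1(data):
    """Drop missing values."""
    return [x for x in data if x is not None]


def func2(d1):
    """Total of the cleaned values."""
    return sum(d1)


def func3(d1):
    """Count of the cleaned values."""
    return len(d1)


def func4(d2, d3):
    """Mean computed from total and count."""
    return d2 / d3


def outfunc(d4):
    """Round the result for reporting."""
    return round(d4, 2)


# Same shape as the pipeline on the slide.
def pipeline(data):
    d1 = func1(data)
    d2 = func2(d1)
    d3 = func3(d1)
    d4 = func4(d2, d3)
    return outfunc(d4)


# One small test per unit function.
def test_func1():
    assert func1([1, None, 2]) == [1, 2]


def test_func2():
    assert func2([1, 2, 3]) == 6


def test_func3():
    assert func3([1, 2, 3]) == 3


def test_func4():
    assert func4(6, 3) == 2.0


# One test of the units composed together.
def test_pipeline():
    assert pipeline([1, None, 2, 3]) == 2.0
```

The slide's tests take a `data` argument; under pytest that could be supplied by a fixture named `data`. The sketch inlines its inputs instead so that it runs without any test framework.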
@@ -428,7 +466,7 @@ _Do unto others what you would have others do unto you._

 ---

-## Summary
+## 😎 Summary

 1. ✅ Write tests for your **code**.
 2. ✅ Write tests for your **data**.
