Commit df4e077

Update homework from feedback
1 parent aee38e2 commit df4e077

1 file changed: +25 -45 lines changed

hw/homework.md

Lines changed: 25 additions & 45 deletions
@@ -1,7 +1,8 @@
1-
# Homework
1+
# BMI 565/665 Bioinformatics Programming and Scripting
22

33
Submit source code and write-up (including program output) through Sakai.
44

5+
56
## Background
67

78
A bunch of your friends really like wine, specifically Portuguese wine. One
@@ -52,14 +53,12 @@ You like to deal with comma-separated files (CSVs). Unfortunately, you find out
5253
that the data comes in a "semi-colon" separated file.
5354

5455
Use `sed` to convert these "semi-colon" separated files into comma-separated
55-
files.
56-
57-
Save these converted data into the directory `data`.
56+
files, and save these converted data into the directory `data`.
5857

5958

6059
**Subset Data**
6160

62-
For your analysis you only want a couple physicochemical variables to check.
61+
For your analysis, you only want a couple physicochemical variables to check.
6362
There are a total of 12 variables, but you're only interested in:
6463

6564
- Citric acid
@@ -79,28 +78,28 @@ for good wine.
7978
| `white_wine_poor.csv` | <= 5 | Poor quality white wine |
8079
| `white_wine_good.csv` | > 5 | Good quality white wine |
8180

81+
**Hint**: `awk` can be used to quickly subset the data and create the 4 files.
82+
8283
Put these four files into the `data` directory.
8384

8485

8586
**Compare Low and High Quality**
8687

87-
Let's use Python to help us figure out what makes wine good or not.
88-
89-
Create a Python function to read in data from a given path and calculate the
90-
average value of a given variable name.
88+
Now use Python code to determine what makes wine good or not. Create a Python
89+
function to read in data from a given path and calculate the average value of a
90+
given variable name.
9191

9292
```python
9393
# Example
9494
avg_chloride_results = calculate_avg_value(data, "chlorides")
9595
```
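
A minimal sketch of what `calculate_avg_value` could look like. It assumes the first argument is a path to a comma-separated file with a header row, and that the variable name matches a column header exactly; neither detail is spelled out in the assignment.

```python
import csv


def calculate_avg_value(path, variable):
    """Return the mean of the column named `variable` in the CSV file at `path`.

    Assumes a comma-separated file whose first row holds the column names.
    """
    total = 0.0
    count = 0
    with open(path, newline="") as handle:
        # DictReader maps each row to {column name: value as a string}
        for row in csv.DictReader(handle):
            total += float(row[variable])
            count += 1
    if count == 0:
        raise ValueError("no data rows found in {}".format(path))
    return total / count
```

Reading row by row keeps memory use flat, and a misspelled variable name raises a `KeyError` on the first data row rather than returning a silent wrong answer.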
9696

97-
You want to be lazy and automate as much as possible. So let's create a Python
98-
function that takes in an array of the file names and returns a dictionary.
97+
You want to automate this as much as possible. So create a Python function
98+
that takes in a list of the file names and returns a dictionary.
9999

100-
The dictionary will have four keys equal to just the file names they come from
101-
e.g. the key of `white_wine_good.csv` will be `white_wine_good`. The values of
102-
each key will be another dictionary with each key being the average value of
103-
one of the four variables we're interested in:
100+
The dictionary will have four keys, each equal to a file name without its extension (e.g. the key for `white_wine_good.csv` will be `white_wine_good`). The value of
101+
each file-name key will be another dictionary that maps each variable name to
102+
its average value:
104103

105104
- Citric acid
106105
- Chlorides
@@ -116,37 +115,16 @@ avg_values = find_average_wines(wine_paths)
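
One possible shape for `find_average_wines`, given as a sketch rather than a reference solution. The column names in `VARIABLES` are assumptions about the headers in the subset files, and the helper `calculate_avg_value` is repeated here so the block runs on its own.

```python
import csv
import os

# Assumed column headers for the four variables of interest
VARIABLES = ["citric acid", "chlorides", "pH", "alcohol"]


def calculate_avg_value(path, variable):
    """Return the mean of the column named `variable` in the CSV at `path`."""
    with open(path, newline="") as handle:
        values = [float(row[variable]) for row in csv.DictReader(handle)]
    if not values:
        raise ValueError("no data rows found in {}".format(path))
    return sum(values) / len(values)


def find_average_wines(paths, variables=VARIABLES):
    """Map each file name (without extension) to {variable name: average}."""
    results = {}
    for path in paths:
        # "data/white_wine_good.csv" -> "white_wine_good"
        key = os.path.splitext(os.path.basename(path))[0]
        results[key] = {name: calculate_avg_value(path, name) for name in variables}
    return results
```

With this layout, `avg_values["white_wine_good"]["pH"]` would hold the average pH of the good-quality white wines.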
116115

117116
**Save Results**
118117

119-
120-
Write a Python function to save your dictionary of results to four separate
121-
files. Save your dictionaries as JavaScript Object Notation (JSON) files.
122-
123-
Use the built-in `json` Python package. Here's a hint on using it.
124-
125-
```python
126-
# Example on using the json package
127-
import json
128-
129-
your_dictionary = {"some_date" : "date"}
130-
f = open('destFile.txt', 'w+')
131-
f.write(json.dumps(your_dictionary))
132-
f.close()
133-
```
134-
135-
Save your four results into a directory `results`.
118+
Use the `cPickle` Python module to save the resulting dictionary to a file in a
119+
directory called `results` (Note: you'll have to create this directory
120+
beforehand).
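
A hedged sketch of the save step. `cPickle` exists only in Python 2; in Python 3 the same functionality lives in the standard `pickle` module, so the import below falls back accordingly. The function names and the file name `avg_values.pkl` are hypothetical, and the target directory is assumed to exist already, as the note above requires.

```python
import os

try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle  # Python 3: cPickle was merged into pickle


def save_results(results, directory="results", filename="avg_values.pkl"):
    """Pickle `results` to directory/filename and return the path written."""
    path = os.path.join(directory, filename)
    # Pickle data is binary, so the file is opened in "wb" mode
    with open(path, "wb") as handle:
        pickle.dump(results, handle)
    return path


def load_results(path):
    """Read a pickled object back from `path`."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Loading the file back with `load_results` is a quick way to confirm that the dictionary survived the round trip.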
136121

137122

138-
**Challenge**
123+
**Wrap-Up Workflow**
139124

140-
You want to automate everything as much as possible, so you want to create a
141-
Makefile to make everything. There are two Make rules: `all` and `clean`.
142-
143-
```shell
144-
# Run the entire analysis
145-
make all
146-
147-
# Remove all downloaded and created files from data/, download/, results/
148-
make clean
149-
```
125+
Now, to automate the entire workflow, create bash scripts that will
126+
automatically download and subset the data, then run the analysis (calculating
127+
average values) and save the results.
150128

151129

152130
## Homework File Structure
@@ -156,8 +134,8 @@ analysis.
156134

157135
```
158136
.
159-
|-- analyze_wine.py
160-
|-- analysis.sh
137+
|-- LastName_hw2.sh
138+
|-- LastName_analyze_wine.py
161139
|-- data/
162140
|-- results/
163141
`-- download/
@@ -169,3 +147,5 @@ analysis.
169147
- A single bash script to automate your analysis
170148
- A Python script to calculate the average citric acid, chlorides, pH, and
171149
alcohol values of good and poor quality red and white wine.
150+
- A brief write-up describing the workflow that was implemented and results
151+
produced (`LastName_hw2.doc`)
