senime123/Dataquest-1

Dataquest

Data Scientist In Python - Dataquest.io - Exercises and activities

Course Outline: 8 step Data Science Track

Step 1 of 8: Python Introduction

Course 1 of 2:
Python for Data Science: Fundamentals
 	Note: I had already completed CS1301, a four-part Python programming course 
from GTx on edX, so this was mainly review. 
 	
 Programming in Python
 	 Variables and Data Types
 	 Lists and For Loops
 	 Conditional Statements
 	 Dictionaries and Frequency Tables
 	 Functions: Fundamentals
 	 Functions: Intermediate
 	 Project: Learn and Install Jupyter Notebook
 	 Guided Project: Profitable App Profiles for the App Store and Google Play Markets
 	 	Results in my GitHub account
Course 2 of 2: 
 Python for Data Science: Intermediate
 
 Cleaning and Preparing Data in Python
 Python Data Analysis Basics
 Object-Oriented Python
 Working with Dates and Times in Python
 Guided Project: Exploring Hacker News Posts
 	Results in my GitHub account

Step 2 of 8: Data Analysis and Visualization

Course 1 of 6: 
		NumPy and pandas Fundamentals
 	
 Intro to NumPy
 	 Boolean Indexing with NumPy
 	 Intro to pandas
 	 Exploring Data with pandas: Fundamentals
 	 Exploring Data with pandas: Intermediate
 	 Data Cleaning Basics
 	 Guided Project: Exploring Ebay Car Sales Data
 	 	Results in my GitHub account
Course 2 of 6: 
 Exploratory Data Visualization
 
 Line charts
 	 Multiple plots
 	 Bar plots and scatter plots
 	 Histograms and box plots
 	 Guided Project: Visualize Earnings Based on College Majors
 	 Results in my GitHub account
Course 3 of 6: 
 Storytelling through Data Visualization
 
 Improving Plot Aesthetics
 Color, Layout, and Annotations
 Guided Project: Visualizing the Gender Gap in College Degrees
 	Results in my GitHub account
 Conditional Plots
 Visualizing Geographic Data
 	Note: Uses the Basemap module, which is deprecated 
	and should be replaced with Cartopy.
Course 4 of 6: 
 		Data Cleaning and Analysis
 
 Data Aggregation
 Combining Data With pandas
 Working With Strings In pandas
 Working With Missing And Duplicate Data
 Guided Project: Clean and Analyze Employee Exit Surveys
 	Results in my GitHub account
Course 5 of 6: 
 		Data Cleaning: Advanced
 
 Regular Expression Basics
 Advanced Regular Expressions
 List Comprehensions and Lambda Functions
 Working with Missing Data
Course 6 of 6: 
 		Data Cleaning Project Walkthrough
 
 Data Cleaning Walkthrough
 Data Cleaning Walkthrough: Combining the Data
 Data Cleaning Walkthrough: Analyzing and Visualizing the Data
 Guided Project: Analyzing NYC High School SAT Data
 	Note: Uses the Basemap module, which is deprecated.
	I've created a Colaboratory notebook. 
 	Results in my GitHub account
 Challenge: Cleaning Data
 Guided Project: Star Wars Survey
 	- Results can be found in my GitHub account

Step 3 of 8: The Command Line

Note: Since I was already familiar with these command-line exercises, 
I didn't create a Jupyter notebook. 
Course 1 of 2: 
		Elements of the Command Line
 
 Intro to the Command Line
 	The Filesystem
 	Modifying the Filesystem
 	Glob Patterns and Wildcards
 	Users and Permissions
Course 2 of 2: 
	Text Processing in the Command Line
 
 Getting Help and Reading Documentation
 	File Inspection
 	Text Processing
 	Redirection and Pipelines
 	Standard Streams and File Descriptors

Step 4 of 8: Working with Data Sources

Course 1 of 4: 
	SQL Fundamentals
 
 Intro to SQL
 	Summary Statistics
 	Group Summary Statistics
 	Subqueries
 	Querying SQLite from Python
 	Guided Project: Analyzing CIA Factbook Data Using SQLite and Python
Course 2 of 4: 
	SQL Intermediate: Table Relations and Joins
 
 Joining Data in SQL
 	Intermediate Joins in SQL
 	Building and Organizing Complex Queries
 	Guided Project: Answering Business Questions Using SQL
 	Results in my GitHub account
 	Table Relations and Normalization
 	Guided Project: Designing and Creating a Database
	Note: The mlb.db file used is too large to upload directly. 
Course 3 of 4: 
	SQL Databases: Advanced
 
 Using PostgreSQL
 	Command Line PostgreSQL
 	Project: PostgreSQL installation
 	Introduction to Indexing
 	Multi-Column Indexing
Course 4 of 4: 
		APIs and Web Scraping
 
 Working with APIs
 	Intermediate APIs
 	Challenge: Working with the reddit API
 	Web Scraping

Step 5 of 8: Probability and Statistics

Course 1 of 5: 
	Statistics Fundamentals
 
 Sampling
 	Variables in Statistics
 	Frequency Distributions
 	Visualizing Frequency Distributions
 	Comparing Frequency Distributions
 	Guided Project: Investigating Fandango Movie Ratings
Course 2 of 5: 
	Statistics Intermediate: Averages and Variability
 
 The Mean
 	The Weighted Mean and the Median
 	The Mode
 	Measures of Variability
 	Z-scores
 	Guided Project: Finding the Best Markets to Advertise in
 	Results in my GitHub account
Course 3 of 5: 
	Probability Fundamentals
 
 Estimating Probability
 	Probability Rules
 	Solving Complex Probability Problems
 	Permutations and Combinations
 	Guided Project: Mobile App for Lottery Addiction
 	Results in my GitHub account
Course 4 of 5: 
	Conditional Probability
 
 Conditional Probability Fundamentals
 	Conditional Probability Intermediate
 	Bayes Theorem
 	The Naive Bayes Theorem
 	Guided Project: Building a Spam Filter with Naive Bayes
 	Results in my GitHub account
Course 5 of 5: 
	Hypothesis Testing Fundamentals
 
 Significance Testing
 	Chi-squared Tests
 	Multi category Chi-squared Tests
 	Guided Project: Winning Jeopardy
 	Results in my GitHub account

Step 6 of 8: Machine Learning Intro

Course 1 of 6: 
	Machine Learning Fundamentals
 
 Intro to K-nearest Neighbors
 	Evaluating Model Performance
 	Multivariate K-nearest Neighbors
 	Hyperparameter Optimization
 	Cross Validation
 	Guided Project: Predicting Car Prices
 	Results in my GitHub account
Course 2 of 6: 
	Calculus for Machine Learning
 
 Understanding Linear and Nonlinear Functions
 	Understanding Limits
 	Finding Extreme Points
Course 3 of 6: 
	Linear Algebra for Machine Learning
 
 Linear Systems
 	Vectors
 	Matrix Algebra
 	Solution Sets
Course 4 of 6: 
	Linear Regression for Machine Learning
 
 The Linear Regression Model
 	Feature Selection
 	Gradient Descent
 	Ordinary Least Squares
 	Processing and Transforming Features
 	Guided Project: Predicting House Sale Prices
Course 5 of 6: 
	Machine Learning in Python Intermediate
 
 Logistic Regression
 	Intro to evaluating binary classifiers
 	Multiclass classification
 	Overfitting
 	Clustering basics
 	K-means clustering
 	Guided Project: Predicting the Stock Market
 	Results in my GitHub account
Course 6 of 6: 
	Decision Trees
 
 Intro to Decision Trees
 	Building a Decision Tree
 	Applying a Decision Tree
 	Intro to Random Forests
 	Guided Project: Predicting Bike Rentals
 	Results in my GitHub account

Step 7 of 8: Machine Learning Intermediate

Course 1 of 5: 
Deep Learning Fundamentals
 
 Representing Neural Networks
 	Nonlinear Activation Functions
 	Hidden Layers
 	Guided Project: Building a Handwritten Digits Classifier
 	Results in my GitHub account
Course 2 of 5: 
	Machine Learning Project
 
 Machine Learning Project Walkthrough: Data Cleaning
 	Machine Learning Project Walkthrough: Preparing the Features
 	Machine Learning Project Walkthrough: Making Predictions
Course 3 of 5: 
	Kaggle Fundamentals
 
 Getting Started with Kaggle
 	Feature Preparation, Selection, and Engineering
 	Model Selection and Tuning
 	Guided Project: Creating a Kaggle Workflow
 	Results
Course 4 of 5: 
	Exploring Topics in Data Science
 
 Naive Bayes for Sentiment Analysis
 	An Intro to K-Nearest Neighbors
Course 5 of 5: 
	Natural Language Processing
 
 Intro to NLP

Step 8 of 8: Advanced Topics in Data Science

Course 1 of 6: 
	Functions - Advanced
 
 Best Practices for Writing Functions
 	Context Managers
 	Intro to Decorators
 	Decorators: Advanced
Course 2 of 6: 
Data Structures and Algorithms
 	
 Memory and Unicode
 	Algorithms
 	Binary Search
 	Data Structures
 	Recursion and Advanced Data Structures
 	Guided Project: Investigating Airplane Accidents
 	Results
Course 3 of 6: 
	Python Programming: Advanced
 
 OOP
 	Exception Handling
 	Lambda Functions
 	Intro to Computer Architecture
 	Parallel Processing
Course 4 of 6: 
Command Line Intermediate
 
 Working with programs
 	Command Line Python Scripting
 	Challenge: Working with the Command Line
 	Working with Jupyter Console
 	Piping and redirecting output
 	Challenge: Data munging using the Command Line
 	Data Cleaning and Exploration using Csvkit
Course 5 of 6: 
	Git and Version Control
 
 Intro to Git
 	Git Remotes
 	Git Branches
 	Merge Conflicts
 	Project: Git installation and GitHub integration
Course 6 of 6: 
	Spark and Map-Reduce
 
 Intro to Spark
 	Project: Spark Installation and Jupyter Notebook integration
 	Transformations and Actions
 	Challenge: Transforming Hamlet into a Data Set
 	Spark DataFrames
 	Spark SQL

Thanks for visiting, and happy coding.
