Self-Paced Course

Data Science in Python: Unsupervised Learning

Master the foundations of unsupervised learning in Python, including clustering, anomaly detection, dimensionality reduction, and recommenders

Course Hours: 28 hours

Skills Learned

Machine Learning

Data Analysis

Data Visualization

Tools

Python

Course Level

Intermediate

Course Description

This is a hands-on, project-based course designed to help you master the foundations of unsupervised learning in Python.

We’ll start by reviewing the data science workflow, discussing the techniques & applications of unsupervised learning, and walking through the data prep steps required for modeling. You’ll learn how to set the correct row granularity for modeling, apply feature engineering techniques, select relevant features, and scale your data using normalization and standardization.
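To give a flavor of the scaling step, here is a minimal sketch of normalization and standardization using NumPy. The employee features and their values are hypothetical and are not taken from the course project files.

```python
import numpy as np

# Hypothetical employee features (illustration only):
# column 0 = tenure in years, column 1 = monthly hours worked
X = np.array([
    [1.0, 120.0],
    [3.0, 160.0],
    [5.0, 200.0],
    [9.0, 240.0],
])

# Normalization (min-max): rescale each column to the range [0, 1]
X_norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Standardization (z-score): each column gets mean 0 and standard deviation 1
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_norm.round(2))
print(X_std.round(2))
```

scikit-learn's MinMaxScaler and StandardScaler apply the same formulas. Scaling matters here because distance-based models such as K-Means would otherwise be dominated by the column with the largest raw values.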

From there we’ll fit, tune, and interpret three popular clustering models using scikit-learn. We’ll start with K-Means Clustering, learn to interpret the cluster centers it produces, and use inertia plots to select the right number of clusters. Next, we’ll cover Hierarchical Clustering, where we’ll use dendrograms to identify clusters and cluster maps to interpret them. Finally, we’ll use DBSCAN to detect clusters and noise points, and compare the models using their silhouette scores.
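As a rough preview of the K-Means workflow described above, the sketch below fits scikit-learn's KMeans for several values of k, records the inertia for each, and compares silhouette scores. The data is synthetic (three well-separated blobs, an assumption made for illustration), and the range of k values is arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data (an assumption for illustration): three well-separated blobs
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=1.0, size=(50, 2)) for c in centers])

inertias = {}      # within-cluster sum of squares for each k (for an inertia plot)
silhouettes = {}   # silhouette score for each k (higher is better, between -1 and 1)

for k in range(2, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias[k] = model.inertia_
    silhouettes[k] = silhouette_score(X, model.labels_)

# Pick the k with the highest silhouette score
best_k = max(silhouettes, key=silhouettes.get)
print("Best k by silhouette score:", best_k)

# Refit with the chosen k and inspect the cluster centers
final_model = KMeans(n_clusters=best_k, n_init=10, random_state=42).fit(X)
print(final_model.cluster_centers_.round(1))
```

Plotting `inertias` against k gives the inertia (elbow) plot. On real data the elbow is rarely this clean, so the silhouette score is a useful second opinion.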

We’ll also use DBSCAN and Isolation Forests for anomaly detection, a common application of unsupervised learning models for identifying outliers and anomalous patterns. You’ll learn to tune and interpret the results of each model and visualize the anomalies using pair plots.
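Here is a minimal sketch of both anomaly detection approaches on synthetic data. The two planted outliers and the parameter values (`eps`, `min_samples`, `contamination`) are assumptions chosen for this toy example, not recommended defaults.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.ensemble import IsolationForest

# Synthetic data (illustration only): one dense cloud plus two planted outliers
rng = np.random.default_rng(0)
normal_points = rng.normal(loc=0.0, scale=1.0, size=(100, 2))
outliers = np.array([[8.0, 8.0], [-9.0, 7.0]])
X = np.vstack([normal_points, outliers])  # the outliers are the last two rows

# DBSCAN labels points that belong to no dense region as -1 (noise)
dbscan_labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)

# Isolation Forest returns -1 for anomalies and 1 for inliers;
# contamination is the share of points we expect to be anomalous
forest = IsolationForest(contamination=0.05, random_state=0)
forest_labels = forest.fit_predict(X)

print("DBSCAN noise points:", int((dbscan_labels == -1).sum()))
print("Isolation Forest anomalies:", int((forest_labels == -1).sum()))
print("Labels for the planted outliers:", dbscan_labels[-2:], forest_labels[-2:])
```

Both models may also flag a few points on the fringe of the dense cloud, which is why tuning the parameters and inspecting the flagged points visually is part of the process.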

Next, we’ll introduce the concept of dimensionality reduction, discuss its benefits for data science, and explore the stages in the data science workflow in which it can be applied. We’ll then cover two popular techniques: Principal Component Analysis, which is great for both feature extraction and data visualization, and t-SNE, which is ideal for data visualization.
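The sketch below illustrates the idea behind Principal Component Analysis with scikit-learn's PCA. The three-feature dataset is synthetic, and the third feature is deliberately built from the first two (an assumption for illustration) so that two components capture almost all of the variance.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (illustration only): the third feature is nearly a
# linear combination of the first two, so it adds little new information
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + rng.normal(scale=0.01, size=200)
X = np.column_stack([x1, x2, x3])

# Reduce three features to two principal components
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X_2d.shape)
print(pca.explained_variance_ratio_)
```

The `explained_variance_ratio_` attribute shows how much of the original variance each component retains. t-SNE is available as `sklearn.manifold.TSNE` with a similar `fit_transform` call, but its output is meant for plotting rather than as input features.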

Last but not least, we’ll introduce recommendation engines, and you'll practice creating both content-based and collaborative filtering recommenders using techniques such as Cosine Similarity and Singular Value Decomposition.
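To make the two recommender ideas concrete, here is a small NumPy sketch. The item features, the ratings, and the helper names `cosine_similarity_matrix` and `low_rank_approximation` are all hypothetical. For simplicity it treats the ratings matrix as fully observed, which real collaborative filtering data usually is not.

```python
import numpy as np

def cosine_similarity_matrix(M):
    """Pairwise cosine similarity between the rows of M."""
    unit_rows = M / np.linalg.norm(M, axis=1, keepdims=True)
    return unit_rows @ unit_rows.T

def low_rank_approximation(R, k):
    """Rebuild R from its top k singular values and vectors."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Content-based: hypothetical items described by four binary attributes
item_features = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
], dtype=float)

item_sim = cosine_similarity_matrix(item_features)
scores = item_sim[0].copy()
scores[0] = -np.inf                      # exclude the item itself
most_similar_to_first = int(np.argmax(scores))
print("Item most similar to item 0:", most_similar_to_first)

# Collaborative filtering: hypothetical user-by-item ratings (rows = users)
ratings = np.array([
    [5, 4, 1, 1],
    [4, 5, 2, 1],
    [1, 1, 5, 4],
    [2, 1, 4, 5],
], dtype=float)

# Keep two latent factors and use the rebuilt matrix as predicted ratings
predicted = low_rank_approximation(ratings, k=2)
print(predicted.round(1))
```

Keeping only a few singular values compresses each user and item into a handful of latent factors, which is what lets the model generalize from similar users' ratings.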

Throughout the course you'll play the role of an Associate Data Scientist on the HR Analytics team at a software company trying to increase employee retention. Applying the skills from each section, you'll use Python to segment the employees, visualize the clusters, and recommend next steps to increase retention.

If you're an aspiring or seasoned data scientist looking for a practical overview of unsupervised learning techniques in Python with a focus on interpretation, this is the course for you.

COURSE CONTENTS:

16.5 hours on-demand video
22 homework assignments
7 quizzes
3 projects
2 skills assessments (1 benchmark, 1 final)

WHO SHOULD TAKE THIS COURSE?

Data scientists who want to learn how to build and interpret unsupervised learning models in Python
Analysts or BI experts looking to learn about unsupervised learning or transition into a data science role
Anyone interested in learning one of the most popular open source programming languages in the world

WHAT ARE THE COURSE REQUIREMENTS?

We strongly recommend taking our Data Prep & EDA course first
Jupyter Notebooks (free download, we'll walk through the install)
Familiarity with base Python and Pandas is recommended, but not required

Start learning for FREE, no credit card required!

Every subscription includes access to the following course materials

Interactive Project files
Downloadable e-books
Graded quizzes and assessments
1-on-1 Expert support
100% satisfaction guarantee
Verified credentials & accredited badges

Meet Your Instructor

Alice Zhao

Lead Data Science Instructor

Alice Zhao is a seasoned data scientist and the author of SQL Pocket Guide, 4th Edition (O'Reilly). She has taught numerous courses in Python, SQL, and R as a data science instructor at Maven Analytics, Northwestern, and O'Reilly, and as a co-founder of Best Fit Analytics.

Qualifications

Python & SQL Expert
MS in Analytics

Testimonials

"A very strong entry into the data science series from Maven Analytics. The teaching style is excellent, there are no assumptions made about previous exposure to data science concepts, and Alice makes the explanations accessible to all levels."

- Colin T.

"Many courses are either too academic or too quick to jump into the modeling itself. Alice has done a great job in bringing her practical experience and knowledge for us to go through the full spectrum as a Data Scientist."

- Jian Z.

"This is exactly what I was looking for! The course is concise yet still covers the necessary things any data analyst or data scientist needs to know or will use."

- Yuansheng X.

Ready to become a data rockstar?