Data Science, Deep Learning, & Machine Learning with Python – Course by Frank Kane

Level: Beginner
Duration: 12 hours
Delivery: Online
Certification: Unknown
Cost: $160
Course Provider: Frank Kane / Sundog


This comprehensive specialisation course includes over 80 lectures spanning 12 hours of video; most topics include Python code examples you can use for reference and for practice. You will need some programming or scripting experience.

This course will teach you the techniques used by real data scientists and machine learning practitioners in the tech industry – and prepare you for a move into this hot career path. The instructor draws on his 9 years of experience at Amazon and IMDb to guide you through what matters, and what doesn’t.

Topics include data distributions, probability mass functions, and probability density functions; visualising data with matplotlib; using covariance and correlation metrics; using Bayes’ Theorem to identify false positives; making predictions using linear regression, polynomial regression, and multivariate regression; using train/test and K-Fold cross-validation to choose the right model; using decision trees to predict hiring decisions; clustering data using K-Means and classifying it using Support Vector Machines (SVM); building recommender systems using item-based and user-based collaborative filtering; predicting classifications using K-Nearest-Neighbor (KNN); applying dimensionality reduction with Principal Component Analysis (PCA); implementing machine learning, clustering, and search using TF/IDF at massive scale with Apache Spark’s MLLib; and more.

Training Course Content

  • Introduction
  • Python Basics, Part 1
  • Running Python Scripts
  • Introducing the Pandas Library


Statistics and Probability Refresher, and Python Practice

  • Types of Data
  • Mean, Median, Mode
  • Probability Density Function; Probability Mass Function
  • Common Data Distributions
  • [Activity] Percentiles and Moments
  • [Activity] A Crash Course in matplotlib
  • [Activity] Covariance and Correlation
  • [Exercise] Conditional Probability
  • Exercise Solution: Conditional Probability of Purchase by Age
  • Bayes’ Theorem


Predictive Models

  • [Activity] Linear Regression
  • [Activity] Polynomial Regression
  • [Activity] Multivariate Regression, and Predicting Car Prices
  • Multi-Level Models


Machine Learning with Python

  • Supervised vs. Unsupervised Learning, and Train/Test
  • [Activity] Using Train/Test to Prevent Overfitting a Polynomial Regression
  • Bayesian Methods: Concepts
  • [Activity] Implementing a Spam Classifier with Naive Bayes
  • K-Means Clustering
  • Measuring Entropy
  • [Activity] Install GraphViz
  • Decision Trees: Concepts
  • [Activity] Decision Trees: Predicting Hiring Decisions
  • Ensemble Learning
  • Support Vector Machines (SVM) Overview
  • [Activity] Using SVM to cluster people using scikit-learn


Recommender Systems

  • User-Based Collaborative Filtering
  • Item-Based Collaborative Filtering
  • [Activity] Finding Movie Similarities
  • [Activity] Improving the Results of Movie Similarities
  • [Activity] Making Movie Recommendations to People
  • [Exercise] Improve the recommender’s results


More Data Mining and Machine Learning Techniques

  • K-Nearest-Neighbors: Concepts
  • [Activity] Using KNN to predict a rating for a movie
  • Dimensionality Reduction; Principal Component Analysis
  • [Activity] PCA Example with the Iris data set
  • Data Warehousing Overview: ETL and ELT
  • Reinforcement Learning


Dealing with Real-World Data

  • Bias/Variance Tradeoff
  • [Activity] K-Fold Cross-Validation to avoid overfitting
  • Data Cleaning and Normalization
  • [Activity] Cleaning web log data
  • Normalizing numerical data
  • [Activity] Detecting outliers


Apache Spark: Machine Learning on Big Data

  • Warning about Java 9!
  • [Activity] Installing Spark – Part 1
  • [Activity] Installing Spark – Part 2
  • Spark Introduction
  • Spark and the Resilient Distributed Dataset (RDD)
  • Introducing MLLib
  • [Activity] Decision Trees in Spark
  • [Activity] K-Means Clustering in Spark
  • TF / IDF
  • [Activity] Searching Wikipedia with Spark
  • [Activity] Using the Spark 2.0 DataFrame API for MLLib


Experimental Design

  • A/B Testing Concepts
  • T-Tests and P-Values
  • [Activity] Hands-on With T-Tests
  • Determining How Long to Run an Experiment
  • A/B Test Gotchas


Deep Learning and Neural Networks

  • Deep Learning Pre-Requisites
  • The History of Artificial Neural Networks
  • [Activity] Deep Learning in the Tensorflow Playground
  • Deep Learning Details
  • Introducing Tensorflow
  • [Activity] Using Tensorflow, Part 1
  • [Activity] Using Tensorflow, Part 2
  • [Activity] Introducing Keras
  • [Activity] Using Keras to Predict Political Affiliations
  • Convolutional Neural Networks (CNNs)
  • [Activity] Using CNNs for handwriting recognition
  • Recurrent Neural Networks (RNNs)
  • [Activity] Using an RNN for sentiment analysis
  • The Ethics of Deep Learning
  • Learning More about Deep Learning


Final Project

  • Final project review

Who Is It For?

Some prior coding or scripting experience and at least high-school-level math skills are required. The course is geared toward software developers or programmers who want to transition into the lucrative data science career path, and toward data analysts in finance or other non-tech industries who want to move into the tech industry.



About the Provider

Frank Kane spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers, all the time. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology, and teaching others about big data analysis.

User Reviews

· November 1, 2018

Frank is an accomplished instructor who really knows what he's talking about. The $160 price tag is minuscule compared to the value he is providing.
