
Over 150 of the Best Machine Learning, NLP, and Python Tutorials I’ve Found

Tags: python machine-learning nlp natural-language-processing neural-networks
Clipped on: 2017-06-26

Image: Einstein’s desk a few hours after his death. Source: LIFE Magazine

While machine learning has a rich history dating back to 1959, the field is evolving at an unprecedented rate. In a recent article, I discussed why the broader artificial intelligence field is booming and will likely continue to boom for some time. Those interested in learning ML may find it daunting to get started.

As I prepare to start my Ph.D. program in the fall, I’ve been scouring the web for good resources on all aspects of machine learning and NLP. Typically, I’ll find an interesting tutorial or video, and that leads to three or four more tutorials or videos, and before I know it, I have 20 tabs of new material I need to go through. (On a side note, Tab Bundler has been helpful for staying organized.)

After finding over 25 ML-related “cheat sheets”, I created a post that links to all the good ones.

To help others who are going through a similar discovery process, I’ve put together a list of the best tutorial content that I’ve found so far. It’s by no means an exhaustive list of every ML-related tutorial on the web; that would be overwhelming and duplicative. Plus, there is a lot of mediocre content out there. My goal was to link to the best tutorials I found on the important subtopics within machine learning and NLP.

By tutorial, I’m referring to introductory content that is intended to teach a concept succinctly. I’ve avoided including chapters of books, which have a greater breadth of coverage, and research papers, which generally don’t do a good job of teaching concepts. Why not just buy a book? Tutorials are helpful when you’re trying to learn a specific niche topic or want to get different perspectives.

I’ve split this post into four sections: Machine Learning, NLP, Python, and Math. I’ve included a sampling of topics within each section, but given the vastness of the material, I can’t cover every topic.

For future posts, I may create similar lists of books, online videos, and code repos, as I’m compiling a growing collection of those resources too.

If there are good tutorials you are aware of that I’m missing, please let me know! I’m trying to limit each topic to five or six tutorials since much beyond that would be repetitive. Each link should have different material from the other links or present information in a different way (e.g. code versus slides versus long-form) or from a different perspective.

Machine Learning

Machine Learning is Fun! (

Machine Learning Crash Course: Part I, Part II, Part III (Machine Learning at Berkeley)

An Introduction to Machine Learning Theory and Its Applications: A Visual Tutorial with Examples (

A Gentle Guide to Machine Learning (

Which machine learning algorithm should I use? (

Activation and Loss Functions

Sigmoid neurons (

What is the role of the activation function in a neural network? (

Comprehensive list of activation functions in neural networks with pros/cons (

Activation functions and it’s types-Which is better? (

Making Sense of Logarithmic Loss (

Loss Functions (Stanford CS231n)

L1 vs. L2 Loss function (

The cross-entropy cost function (

Bias

Role of Bias in Neural Networks (

Bias Nodes in Neural Networks (

What is bias in artificial neural network? (

Perceptron

Perceptrons (

The Perceptron (

Single-layer Neural Networks (Perceptrons) (

From Perceptrons to Deep Networks (

Regression

Introduction to linear regression analysis (

Linear Regression (

Linear Regression (

Logistic Regression (

Simple Linear Regression Tutorial for Machine Learning (

Logistic Regression Tutorial for Machine Learning (

Softmax Regression (

Gradient Descent

Learning with gradient descent (

Gradient Descent (

How to understand Gradient Descent algorithm (

An overview of gradient descent optimization algorithms (

Optimization: Stochastic Gradient Descent (Stanford CS231n)

Generative Learning

Generative Learning Algorithms (Stanford CS229)

A practical explanation of a Naive Bayes classifier (

Support Vector Machines

An introduction to Support Vector Machines (SVM) (

Support Vector Machines (Stanford CS229)

Linear classification: Support Vector Machine, Softmax (Stanford CS231n)

Backpropagation

Yes you should understand backprop (

Can you give a visual explanation for the back propagation algorithm for neural networks? (

How the backpropagation algorithm works (

Backpropagation Through Time and Vanishing Gradients (

A Gentle Introduction to Backpropagation Through Time (

Backpropagation, Intuitions (Stanford CS231n)

Deep Learning

Deep Learning in a Nutshell (

A Tutorial on Deep Learning (Quoc V. Le)

What is Deep Learning? (

What’s the Difference Between Artificial Intelligence, Machine Learning, and Deep Learning? (

Optimization and Dimensionality Reduction

Seven Techniques for Data Dimensionality Reduction (

Principal components analysis (Stanford CS229)

Dropout: A simple way to improve neural networks (Hinton @ NIPS 2012)

How to train your Deep Neural Network (

Long Short-Term Memory (LSTM)

A Gentle Introduction to Long Short-Term Memory Networks by the Experts (

Understanding LSTM Networks (

Exploring LSTMs (

Anyone Can Learn To Code an LSTM-RNN in Python (

Convolutional Neural Networks (CNNs)

Introducing convolutional networks (

Deep Learning and Convolutional Neural Networks (

Conv Nets: A Modular Perspective (

Understanding Convolutions (

Recurrent Neural Nets (RNNs)

Recurrent Neural Networks Tutorial (

Attention and Augmented Recurrent Neural Networks (

The Unreasonable Effectiveness of Recurrent Neural Networks (

A Deep Dive into Recurrent Neural Nets (

Reinforcement Learning

Simple Beginner’s guide to Reinforcement Learning & its implementation (

A Tutorial for Reinforcement Learning (

Learning Reinforcement Learning (

Deep Reinforcement Learning: Pong from Pixels (

Generative Adversarial Networks (GANs)

What’s a Generative Adversarial Network? (

Abusing Generative Adversarial Networks to Make 8-bit Pixel Art (

An introduction to Generative Adversarial Networks (with code in TensorFlow) (

Generative Adversarial Networks for Beginners (

Multi-task Learning

An Overview of Multi-Task Learning in Deep Neural Networks (

NLP

A Primer on Neural Network Models for Natural Language Processing (Yoav Goldberg)

The Definitive Guide to Natural Language Processing (

Introduction to Natural Language Processing (

Natural Language Processing Tutorial (

Natural Language Processing (almost) from Scratch (

Deep Learning and NLP

Deep Learning applied to NLP (

Deep Learning for NLP (without Magic) (Richard Socher)

Understanding Convolutional Neural Networks for NLP (

Deep Learning, NLP, and Representations (

Embed, encode, attend, predict: The new deep learning formula for state-of-the-art NLP models (

Understanding Natural Language with Deep Neural Networks Using Torch (

Deep Learning for NLP with Pytorch (

Word Vectors

Bag of Words Meets Bags of Popcorn (

On word embeddings Part I, Part II, Part III (

The amazing power of word vectors (

word2vec Parameter Learning Explained (

Word2Vec Tutorial — The Skip-Gram Model, Negative Sampling (


Encoder-Decoder

Attention and Memory in Deep Learning and NLP (

Sequence to Sequence Models (

Sequence to Sequence Learning with Neural Networks (NIPS 2014)

Machine Learning is Fun Part 5: Language Translation with Deep Learning and the Magic of Sequences (

How to use an Encoder-Decoder LSTM to Echo Sequences of Random Integers (

tf-seq2seq (

Python

7 Steps to Mastering Machine Learning With Python (

An example machine learning notebook (

Examples

How To Implement The Perceptron Algorithm From Scratch In Python (

Implementing a Neural Network from Scratch in Python (

A Neural Network in 11 lines of Python (

Implementing Your Own k-Nearest Neighbour Algorithm Using Python (

Demonstration of Memory with a Long Short-Term Memory Network in Python (

How to Learn to Echo Random Integers with Long Short-Term Memory Recurrent Neural Networks (

How to Learn to Add Numbers with seq2seq Recurrent Neural Networks (

SciPy and NumPy

Scipy Lecture Notes (

Python Numpy Tutorial (Stanford CS231n)

An introduction to Numpy and Scipy (UCSB CHE210D)

A Crash Course in Python for Scientists (

scikit-learn

PyCon scikit-learn Tutorial Index (

scikit-learn Classification Algorithms (

scikit-learn Tutorials (

Abridged scikit-learn Tutorials (

TensorFlow

Tensorflow Tutorials (

Introduction to TensorFlow — CPU vs GPU (

TensorFlow: A primer (

RNNs in Tensorflow (

Implementing a CNN for Text Classification in TensorFlow (

How to Run Text Summarization with TensorFlow (

PyTorch

PyTorch Tutorials (

A Gentle Intro to PyTorch (

Tutorial: Deep Learning in PyTorch (

PyTorch Examples (

PyTorch Tutorial (

PyTorch Tutorial for Deep Learning Researchers (

Math

Math for Machine Learning (

Math for Machine Learning (UMIACS CMSC422)

Linear algebra

An Intuitive Guide to Linear Algebra (

A Programmer’s Intuition for Matrix Multiplication (

Understanding the Cross Product (

Understanding the Dot Product (

Linear Algebra for Machine Learning (U. of Buffalo CSE574)

Linear algebra cheat sheet for deep learning (

Linear Algebra Review and Reference (Stanford CS229)

Probability

Understanding Bayes Theorem With Ratios (

Review of Probability Theory (Stanford CS229)

Probability Theory Review for Machine Learning (Stanford CS229)

Probability Theory (U. of Buffalo CSE574)

Probability Theory for Machine Learning (U. of Toronto CSC411)

Calculus

How To Understand Derivatives: The Quotient Rule, Exponents, and Logarithms (

How To Understand Derivatives: The Product, Power & Chain Rules (

Vector Calculus: Understanding the Gradient (

Differential Calculus (Stanford CS224n)

Calculus Overview (
