This book is a first step towards embedded machine learning. It presents techniques for optimizing and compressing deep learning models, which make it easier to deploy a lightweight yet high-performing model on resource-constrained devices such as smartphones and microcontrollers. The book also explores knowledge distillation, a knowledge transfer technique of current interest. Knowledge distillation improves the performance of a lightweight deep learning model by transferring to it the knowledge of a complex, high-performance deep learning model. All of these techniques are detailed in the book and illustrated with practical Python implementations, generally based on the PyTorch and TensorFlow libraries.
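To give a flavour of what knowledge distillation involves, here is a minimal sketch of one common formulation of the distillation loss: a weighted sum of the divergence between temperature-softened teacher and student outputs and the usual cross-entropy on the true label. It uses only the Python standard library so that it runs anywhere. The function names and the `temperature` and `alpha` parameters are illustrative assumptions, not the book's own implementation.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_total for s in scaled]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Illustrative distillation loss for a single example (assumed formulation).

    alpha weights the soft-target term (teacher vs. student) against the
    hard-label cross-entropy term; both default values are arbitrary.
    """
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)

    # KL(teacher || student) on the softened distributions.
    soft = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))

    # Standard cross-entropy of the student against the true label.
    hard = -log_softmax(student_logits)[true_label]

    # The temperature**2 factor keeps the soft term's gradient scale
    # comparable across temperatures.
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard


# Example: a student that disagrees with the teacher is penalised more
# than one that matches it.
matching = distillation_loss([0.0, 1.0, 3.0], [0.0, 1.0, 3.0], true_label=2)
mismatched = distillation_loss([3.0, 1.0, 0.0], [0.0, 1.0, 3.0], true_label=2)
print(matching < mismatched)
```

In a PyTorch or TensorFlow training loop, the same quantity would be computed on batched tensors with the library's own softmax and divergence functions, and the teacher's outputs would be treated as fixed targets.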