Python® Machine Learning

Wei-Meng Lee

I dedicate this book with love to my dearest wife (Sze Wa) and girl (Chloe), who have to endure my irregular work schedule, and I thank them for their companionship when I am trying to meet writing deadlines!

About the Author

Wei‐Meng Lee is a technologist and founder of Developer Learning Solutions (http://www.learn2develop.net), a company specializing in hands‐on training on the latest technologies.

Wei‐Meng has many years of training experience, and his training courses place special emphasis on the learning‐by‐doing approach. His hands‐on approach to learning programming makes understanding the subject much easier than just reading books, tutorials, and documentation.

Wei‐Meng's name regularly appears in online and print publications such as DevX.com, MobiForge.com, and CoDe Magazine. You can contact Wei‐Meng at: weimenglee@learn2develop.net.

About the Technical Editor

Doug Mahugh is a software developer who began his career in 1978 as a Fortran programmer for Boeing. Doug has worked for Microsoft since 2005 in a variety of roles including developer advocacy, standards engagement, and content development. Since learning Python in 2008, Doug has written samples and tutorials on topics ranging from caching and continuous integration to Azure Active Directory authentication and Microsoft Graph. Doug has spoken at industry events in over 20 countries, and he has been Microsoft's technical representative to standards bodies including ISO/IEC, Ecma International, OASIS, CalConnect, and others.

Doug currently lives in Seattle with his wife Megan and two Samoyeds named Jamie and Alice.

Credits

  • Acquisitions Editor

    Devon Lewis

  • Associate Publisher

    Jim Minatel

  • Editorial Manager

    Pete Gaughan

  • Production Manager

    Katie Wisor

  • Project Editor

    Gary Schwartz

  • Production Editor

    Barath Kumar Rajasekaran

  • Technical Editor

    Doug Mahugh

  • Copy Editor

    Kim Cofer

  • Proofreader

    Nancy Bell

  • Indexer

    Potomac Indexing, LLC

  • Cover Designer

    Wiley

  • Cover Image

    © Lidiia Moor/iStockphoto‐background texture

    © Rick_Jo/iStockphoto‐digital robotic brain

Acknowledgments

Writing a book is always exciting, but along with it come long hours of hard work, straining to get things done accurately and correctly. To make a book possible, a lot of unsung heroes work tirelessly behind the scenes. For this, I would like to take this opportunity to thank a number of special people who made this book possible.

First, I want to thank my acquisitions editor Devon Lewis, who was my first point of contact for this book. Thank you, Devon, for giving me this opportunity and for your trust in me!

Next, a huge thanks to Gary Schwartz, my project editor, who was always a pleasure to work with. Gary is always contactable, even when he is at the airport! Gary has been very patient with me, even though I have missed several of my deadlines for the book. I know it threw a spanner into his plan, but he is always accommodating. Working with him, I know my book is in good hands. Thank you very much, Gary!

Equally important is my technical editor, Doug Mahugh. Doug has been very eagle‐eyed in editing and testing my code, and he never fails to let me know if things do not work the way I intended. Thanks for catching my errors and making the book a better read, Doug! I would also like to take this opportunity to thank my production editor, Barath Kumar Rajasekaran. Without his hard work, this book would not even be possible. Thanks, Barath!

Last, but not least, I want to thank my parents and my wife, Sze Wa, for all the support they have given me. They have selflessly adjusted their schedules to accommodate my busy schedule when I was working on this book. I love you all!

Introduction

This book covers machine learning, one of the hottest topics in recent years. With computing power increasing exponentially and prices decreasing simultaneously, there is no better time for machine learning. With machine learning, tasks that usually require huge processing power are now possible on desktop machines. Nevertheless, machine learning is not for the faint of heart: it requires a good foundation in statistics, as well as programming knowledge. Most books on the market are either too superficial or go into so much depth that beginning readers are often left gasping for air.

This book will take a gentle approach to this topic. First, it will cover some of the fundamental libraries used in Python that make machine learning possible. In particular, you will learn how to manipulate arrays of numbers using the NumPy library, followed by using the Pandas library to deal with tabular data. Once that is done, you will learn how to visualize data using the matplotlib library, which allows you to plot different types of charts and graphs so that you can visualize your data easily.

Once you have a firm foundation in the basics, I will discuss machine learning using Python and the Scikit‐Learn library. This will give you a solid understanding of how the various machine learning algorithms work behind the scenes.

For this book, I will cover the common machine learning algorithms, such as regression, clustering, and classification.

This book also contains a chapter where you will learn how to perform machine learning using the Microsoft Azure Machine Learning Studio, which allows developers to start building machine learning models using drag‐and‐drop without needing to code and, most importantly, without requiring a deep knowledge of machine learning.

Finally, I will discuss how you can deploy the models that you have built, so that they can be used by client applications running on mobile and desktop devices.

It is my key intention to make this book accessible to as many developers as possible. To get the most out of this book, you should have some basic knowledge of Python programming and a foundational understanding of basic statistics. And just as you will never be able to learn how to swim just by reading a book, I strongly suggest that you try out the sample code while you are going through the chapters. Go ahead and modify the code and see how the output varies; very often you will be surprised by what you can do.

All the sample code in this book is available as Jupyter Notebooks (available for download from Wiley’s support page for this book, www.wiley.com/go/leepythonmachinelearning), so you can download the notebooks and try them out immediately.

Without further delay, welcome to Python Machine Learning!