Title page image

TensorFlow® For Dummies®

Published by: John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, www.wiley.com

Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the Publisher. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

Trademarks: Wiley, For Dummies, the Dummies Man logo, Dummies.com, Making Everything Easier, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and may not be used without written permission. TensorFlow is a registered trademark of Google, LLC. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND THE AUTHOR MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT. NEITHER THE PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ.

For general information on our other products and services, please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993, or fax 317-572-4002. For technical support, please visit https://hub.wiley.com/community/support/dummies.

Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.

Library of Congress Control Number: 2018933981

ISBN 978-1-119-46621-5 (pbk); ISBN 978-1-119-46619-2 (ePub); 978-1-119-46620-8 (ePDF)

TensorFlow® For Dummies®

To view this book's Cheat Sheet, simply go to www.dummies.com and search for “TensorFlow For Dummies Cheat Sheet” in the Search box.

Cover
Introduction
1. About This Book
2. Foolish Assumptions
3. Icons Used in this Book
4. Beyond the Book
5. Where to Go from Here
Part 1: Getting to Know TensorFlow
1. Chapter 1: Introducing Machine Learning with TensorFlow
  1. Understanding Machine Learning
  2. The Development of Machine Learning
  3. Machine Learning Frameworks
2. Chapter 2: Getting Your Feet Wet
  1. Installing TensorFlow
  2. Exploring the TensorFlow Installation
  3. Running Your First Application
  4. Setting the Style
3. Chapter 3: Creating Tensors and Operations
  1. Creating Tensors
  2. Creating Tensors with Known Values
  3. Creating Tensors with Random Values
  4. Transforming Tensors
  5. Creating Operations
  6. Putting Theory into Practice
4. Chapter 4: Executing Graphs in Sessions
  1. Forming Graphs
  2. Creating and Running Sessions
  3. Writing Messages to the Log
  4. Visualizing Data with TensorBoard
  5. Putting Theory into Practice
5. Chapter 5: Training
  1. Training in TensorFlow
  2. Formulating the Model
  3. Looking at Variables
  4. Determining Loss
  5. Minimizing Loss with Optimization
  6. Feeding Data into a Session
  7. Monitoring Steps, Global Steps, and Epochs
  8. Saving and Restoring Variables
  9. Working with SavedModels
  10. Putting Theory into Practice
  11. Visualizing the Training Process
  12. Session Hooks
Part 2: Implementing Machine Learning
1. Chapter 6: Analyzing Data with Statistical Regression
  1. Analyzing Systems Using Regression
  2. Linear Regression: Fitting Lines to Data
  3. Polynomial Regression: Fitting Polynomials to Data
  4. Binary Logistic Regression: Classifying Data into Two Categories
  5. Multinomial Logistic Regression: Classifying Data into Multiple Categories
2. Chapter 7: Introducing Neural Networks and Deep Learning
  1. From Neurons to Perceptrons
  2. Improving the Model
  3. Layers and Deep Learning
  4. Training with Backpropagation
  5. Implementing Deep Learning
  6. Tuning the Neural Network
  7. Managing Variables with Scope
  8. Improving the Deep Learning Process
3. Chapter 8: Classifying Images with Convolutional Neural Networks (CNNs)
  1. Filtering Images
  2. Convolutional Neural Networks (CNNs)
  3. Putting Theory into Practice
  4. Performing Image Operations
  5. Putting Theory into Practice
4. Chapter 9: Analyzing Sequential Data with Recurrent Neural Networks (RNNs)
  1. Recurrent Neural Networks (RNNs)
  2. Creating RNN Cells
  3. Long Short-Term Memory (LSTM) Cells
  4. Gated Recurrent Units (GRUs)
Part 3: Simplifying and Accelerating TensorFlow
1. Chapter 10: Accessing Data with Datasets and Iterators
  1. Datasets
  2. Iterators
  3. Putting Theory into Practice
  4. Bizarro Datasets
2. Chapter 11: Using Threads, Devices, and Clusters
  1. Executing with Multiple Threads
  2. Configuring Devices
  3. Executing TensorFlow in a Cluster
3. Chapter 12: Developing Applications with Estimators
  1. Introducing Estimators
  2. Training an Estimator
  3. Testing an Estimator
  4. Running an Estimator
  5. Creating Input Functions
  6. Using Feature Columns
  7. Creating and Using Estimators
  8. Running Estimators in a Cluster
  9. Accessing Experiments
4. Chapter 13: Running Applications on the Google Cloud Platform (GCP)
  1. Overview
  2. Working with GCP Projects
  3. The Cloud Software Development Kit (SDK)
  4. The gcloud Utility
  5. Google Cloud Storage
  6. Preparing for Deployment
  7. Executing Applications with the Cloud SDK
  8. Configuring a Cluster in the Cloud
Part 4: The Part of Tens
1. Chapter 14: The Ten Most Important Classes
  1. Tensor
  2. Operation
  3. Graph
  4. Session
  5. Variable
  6. Optimizer
  7. Estimator
  8. Dataset
  9. Iterator
  10. Saver
2. Chapter 15: Ten Recommendations for Training Neural Networks
  1. Select a Representative Dataset
  2. Standardize Your Data
  3. Use Proper Weight Initialization
  4. Start with a Small Number of Layers
  5. Add Dropout Layers
  6. Train with Small, Random Batches
  7. Normalize Batch Data
  8. Try Different Optimization Algorithms
  9. Set the Right Learning Rate
  10. Check Weights and Gradients
About the Author
Advertisement Page
Connect with Dummies
Index
End User License Agreement

Introduction

Machine learning is one of the most fascinating and most important fields in modern technology. As I write this book, NASA has discovered faraway planets by using machine learning to analyze telescope images. After only three days of training, Google’s AlphaGo program learned the complex game of Go and defeated the world’s foremost master.

Despite the power of machine learning, few programmers know how to take advantage of it. Part of the problem is that writing machine learning applications requires a different mindset than regular programming. The goal isn’t to solve a specific problem, but to write a general application capable of solving many unknown problems.

Machine learning draws from many different branches of mathematics, including statistics, calculus, linear algebra, and optimization theory. Unfortunately, the real world doesn’t feel any obligation to behave mathematically. Even if you use the best mathematical models, you can still end up with lousy results. I’ve encountered this frustration on many occasions, and I’ve referred to neural networks more than once as “high-tech snake oil.”

TensorFlow won’t give you the ideal model for analyzing a system, but it will reduce the time and frustration involved in machine learning development. Instead of coding activation functions and normalization routines from scratch, you can access the many built-in features of the framework. TensorFlow For Dummies explains how to access these features and put them to use.

About This Book

TensorFlow is a difficult subject to write about. Not only does the toolset contain thousands of classes, but many of them perform similar roles. Furthermore, some classes are deprecated, while others are simply “not recommended for use.”

Despite the vast number of classes, there are three classes that every TensorFlow developer should be familiar with: Tensor, Graph, and Session. The chapters in the first part of this book discuss these classes in detail and present many examples of their usage.

The chapters in Part 2 explain how you can use TensorFlow in practical machine learning tasks. I start with statistical methods, including linear regression, polynomial regression, and logistic regression. Then I delve into the fascinating topic of neural networks. I explore the operation of basic neural networks, and then I present convolutional neural networks (CNNs) and recurrent neural networks (RNNs).

The chapters in Part 3 present high-level TensorFlow classes that you can use to simplify and accelerate your applications. Of the many topics discussed, the most important is the Estimator API, which allows you to implement powerful machine learning algorithms with minimal code. I explain how to code estimators and execute them at high speed using the Google Cloud Platform (GCP).

Foolish Assumptions

In essence, this book covers two topics: the theory of machine learning and the implementation of the theory using TensorFlow. With regard to theory, I make few assumptions. I expect you to know the basics of linear algebra, but I don't expect you to know anything about machine learning. I also don’t expect you to know about statistical regression or neural networks, so I provide a thorough introduction to these and other concepts.

With regard to TensorFlow development, I made assumptions related to your programming background. TensorFlow supports a handful of programming languages, but the central language is Python. For this reason, this book is Python-centric, and I provide all of the example code in Python modules. I explain how to install TensorFlow and access its modules and classes, but I don’t explain what modules and classes are.

Icons Used in this Book

To help you navigate through the text, I inserted icons in the book’s margin. Here’s what they mean:

This icon indicates that the text contains suggestions for developing machine learning applications.

This icon precedes content that delves into the technical theory of machine learning. Many readers may find this theory helpful, but you don’t need to know all the gritty details.

As much as I love TensorFlow, I admit that it isn’t simple to use or understand. There are many critical points to be familiar with, and in many cases, I use this icon to emphasize concepts that are particularly important.

Beyond the Book

This book covers a great deal of the TensorFlow API, but there’s still a lot more to learn. The first place to look is the official documentation, which you can find at www.tensorflow.org. If you’re interested in TensorFlow’s functions and data structures, the best place to look is www.tensorflow.org/api_docs.

If you have a problem that you can’t solve using this book or the official documentation, a great resource is StackOverflow. This site enables programmers to present questions and receive answers, and in my career, I’ve provided plenty of both. For TensorFlow-specific questions, I recommend visiting www.stackoverflow.com/questions/tagged/tensorflow.

In addition to what you’re reading right now, this product also comes with a free access-anywhere Cheat Sheet that gives you some pointers on using TensorFlow. To get this Cheat Sheet, simply go to www.dummies.com and search for “TensorFlow For Dummies Cheat Sheet” in the Search box.

I also provide a great deal of example code that demonstrates how to put the theory into practice. Here’s how to download the tfbook.zip file for this book.

On www.dummies.com, search for TensorFlow For Dummies or the book's ISBN.
When the book comes up, click on the More about this book link.

You are taken to the book’s product page, and the code should be on the Downloads tab.

After decompressing the archive, you’ll find a series of folders named after chapters of this book. The example code for Chapter 3 is in the ch3 folder, the code for Chapter 6 is in ch6, and so on.

Where to Go from Here

The material in this book proceeds from the simple to the complex and from the general to the recondite. If you’re already a TensorFlow expert, feel free to skip any chapters you’re already familiar with. But if you’re new to the toolset, I strongly recommend starting with Chapter 1 and proceeding linearly through Chapters 2, 3, 4, and so on.

I’ve certainly enjoyed writing this book, and I hope you enjoy the journey of discovery. Bon voyage!

Part 1

Getting to Know TensorFlow

IN THIS PART …

Explore the fascinating field of machine learning and discover why TensorFlow is so vital to machine learning development.

Download the TensorFlow package to your computer and install the complete toolkit.

Discover the fundamental data types of TensorFlow and the many operations that you can perform on tensors.

Understand how tensors and operations are stored in graphs and how graphs can be executed in sessions.

Investigate the process of TensorFlow training, which minimizes the disparity between a mathematical model and a real-world system.

Chapter 1

Introducing Machine Learning with TensorFlow

IN THIS CHAPTER

check Looking at machine learning over time

check Exploring machine learning frameworks

TensorFlow is Google’s powerful framework for developing applications that perform machine learning. Much of this book delves into the gritty details of coding TensorFlow modules, but this chapter provides a gentle introduction. I provide an overview of the subject and then discuss the developments that led to the creation of TensorFlow and similar machine learning frameworks.

Understanding Machine Learning

Like most normal, well-adjusted people, I consider The Terminator to be one of the finest films ever made. I first saw it at a birthday party when I was 13, and though most of the story went over my head, one scene affected me deeply: The heroine calls her mother and thinks she’s having a warm conversation, but she’s really talking to an evil robot from the future!

The robot wasn’t programmed in advance with the mother’s voice or the right sequence of phrases. It had to figure these things out on its own. That is, it had to analyze the voice of the real mother, examine the rules of English grammar, and generate acceptable sentences for the conversation. When a computer obtains information from data without receiving precise instructions, it’s performing machine learning.

The Terminator served as my first exposure to machine learning, but it wouldn’t be my last. As I write this book, machine learning is everywhere. My email provider knows that messages involving an “online pharmacy” are spam, but messages about “cheap mescaline” are important. Google Maps always provides the best route to my local Elvis cult, and Amazon.com always knows when I need a new horse head mask. Is it magic? No, it’s machine learning!

Machine learning applications achieve this power by discovering patterns in vast amounts of data. Unlike regular programs, machine learning applications deal with uncertainties and probabilities. It should come as no surprise that the process of coding a machine learning application is completely different than that of coding a regular application. Developers need to be familiar with an entirely new set of concepts and data structures.

Thankfully, many frameworks have been developed to simplify development. At the time of this writing, the most popular is TensorFlow, an open-source toolset released by Google. In writing this book, my goal is to show you how to harness TensorFlow to develop your own machine learning applications.

Although this book doesn’t cover the topic of ethics, I feel compelled to remind readers that programming evil robots is wrong. Yes, you’ll impress your professor, and it will look great on a resume. But society frowns on such behavior, and your friends will shun you. Still, if you absolutely have to program an evil robot, TensorFlow is the framework to use.

The Development of Machine Learning

In my opinion, machine learning is the most exciting topic in modern software development, and TensorFlow is the best framework to use. To convince you of TensorFlow’s greatness, I’d like to present some of the developments that led to its creation. Figure 1-1 presents an abbreviated timeline of machine learning and related software development.

FIGURE 1-1: Developments in machine learning extend from academia to corporations.

Once you understand why researchers and corporations have spent so much time developing the technology, you’ll better appreciate why studying TensorFlow is worth your own time.

Statistical regression

Just as petroleum companies drill into the ground to obtain oil, machine learning applications analyze data to obtain information and insight. The formal term for this process is statistical inference, and its first historical record comes from ancient Greece. But for this purpose, the story begins with a nineteenth-century scientist named Francis Galton. Though his primary interest was anthropology, he devised many of the concepts and tools used by modern statisticians and machine learning applications.

Galton was obsessed with inherited traits, and while studying dogs, he noticed that the offspring of exceptional dogs tend to acquire average characteristics over time. He referred to this as the regression to mediocrity. Galton observed this phenomenon in humans and sweet peas, and while analyzing his data, he employed modern statistical concepts like the normal curve, correlation, variance, and standard deviation.

To illustrate the relationship between a child’s height and the average height of the parents, Galton developed a method for determining which line best fits a series of data points. Figure 1-2 shows what this looks like. (Galton’s data is provided by the University of Alabama.)

FIGURE 1-2: Linear regression identifies a clear trend amidst unclear data points.

Galton’s technique for fitting lines to data became known as linear regression, and the term regression has come to be used for a variety of statistical methods. Regression plays a critical role in machine learning, and Chapter 6 discusses the topic in detail.

Reverse engineering the brain

In 1905, Ramón y Cajal examined tissue from a chicken’s brain and studied the interconnections between the cells, later called neurons. Cajal’s findings fascinated scientists throughout the world, and in 1943, Warren McCulloch and Walter Pitts devised a mathematical model for the neuron. They demonstrated that their artificial neurons could implement the common Boolean AND and OR operations.

While researching statistics, a psychologist named Frank Rosenblatt developed another model for a neuron that expanded on the work of McCulloch and Pitts. He called his model the perceptron, and by connecting perceptrons into layers, he created a circuit capable of recognizing images. These interconnections of perceptrons became known as neural networks.

Rosenblatt followed his demonstrations with grand predictions about the future of perceptron computing. His predictions deeply influenced the Office of Naval Research, which funded the development of a custom computer based on perceptrons. This computer was called the Mark 1 Perceptron, and Figure 1-3 shows what it looks like.

Credit: Cornell Aeronautical Laboratory.

FIGURE 1-3: The Mark 1 Perceptron was the first computer created for machine learning.

The future of perceptron-based computing seemed bright, but in 1969, calamity struck. Marvin Minsky and Seymour Papert presented a deeply critical view of Rosenblatt’s technology in their book, Perceptrons (MIT Press). They mathematically proved many limitations of two-layer feed-forward neural networks, such as the inability to learn nonlinear functions or implement the Boolean Exclusive OR (XOR) operation.

Neural networks have progressed dramatically since the 1960s, and in hindsight, modern readers can see how narrow-minded Minsky and Papert were in their research. But at the time, their findings caused many, including the Navy and other large organizations, to lose interest in neural networks.

Steady progress

Despite the loss of popular acclaim, researchers and academics continued to investigate machine learning. Their work led to many crucial developments, including the following:

In 1965, Ivakhnenko and Lapa demonstrated multilayer perceptrons with nonlinear activation functions.
In 1974, Paul Werbos used backpropagation to train a neural network.
In 1980, Kunihiko Fukushima proposed the neocognitron, a multilayer neural network for image recognition.
In 1982, John Hopfield developed a type of recurrent neural network known as the Hopfield network.
In 1986, Sejnowski and Rosenberg developed NETtalk, a neural network that learned how to pronounce words.

These developments expanded the breadth and capabilities of machine learning, but none of them excited the world’s imagination. The problem was that computers lacked the speed and memory needed to perform real-world machine learning in a reasonable amount of time. That was about to change.

The computing revolution

As the 1980s progressed into the 1990s, improved semiconductor designs led to dramatic leaps in computing power. Researchers harnessed this new power to execute machine learning routines. Finally, machine learning could tackle real-world problems instead of simple proofs of concept.

As the Cold War intensified, military experts grew interested in recognizing targets automatically. Inspired by Fukushima’s neocognitron, researchers focused on neural networks specially designed for image recognition, called convolutional neural networks (CNNs). One major step forward took place in 1994, when Yann LeCun successfully demonstrated handwriting recognition with his CNN-based LeNet5 architecture.

But there was a problem. Researchers used similar theories in their applications, but they wrote all their code from scratch. This meant researchers couldn’t reproduce the results of their peers, and they couldn’t re-use one another’s code. If a researcher’s funding ran out, it was likely that the entire codebase would vanish.

In the late 1990s, my job involved programming convolutional neural networks to recognize faces. I loved the theory behind neural networks, but I found them deeply frustrating in practice. Machine learning applications require careful tuning and tweaking to get acceptable results. But each change to the code required a new training run, and training a CNN could take days. Even then, I still didn’t have enough training data to ensure accurate recognition.

One problem facing me and other researchers was that, while machine learning theory was mature, the process of software development was still in its infancy. Programmers needed frameworks and standard libraries so that they weren’t coding everything by themselves. Also, despite Intel’s best efforts, practical machine learning still required faster processors that could access larger amounts of data.

The rise of big data and deep learning

As the 21st century dawned, the Internet’s popularity skyrocketed, and the price of data storage plummeted. Large corporations could now access terabytes of data about potential consumers. These corporations developed improved tools for analyzing their data, and this revolution in data storage and analysis has become known as the big data revolution.

Now CEOs were faced with a difficult question: How could they use their wealth of data to create wealth for their corporations? One major priority was advertising — companies make more money if they know which advertisements to show to their customers. But there were no clear rules for associating customers with products.

Many corporations launched in-house research initiatives to determine how best to analyze their data. But in 2006, Netflix tried something different. They released a large part of their database online and offered one million dollars to whoever developed the best recommendation engine. The winner, BellKor’s Pragmatic Chaos, combined a number of machine learning algorithms to improve Netflix’s algorithm by 10 percent.

Netflix wasn’t the only high-profile corporation using machine learning. Google’s AdSense used machine learning to determine which advertisements to display on its search engine. Google and Tesla demonstrated self-driving cars that used machine learning to follow roads and join traffic.

Across the world, large organizations sat up and paid notice. Machine learning had left the realm of wooly-headed science fiction and had become a practical business tool. Entrepreneurs continue to wonder what other benefits can be gained by applying machine learning to big data.

Researchers paid notice as well. A major priority involved distinguishing modern machine learning, with its high complexity and vast data processing, from earlier machine learning, which was simple and rarely effective. They agreed on the term deep learning for this new machine learning paradigm. Chapter 7 goes into greater detail regarding the technical meaning of deep learning.

Machine Learning Frameworks

One of the most important advances in practical machine learning involved the creation of frameworks. Frameworks automate many aspects of developing machine learning applications, and they allow developers to re-use code and take advantage of best practices. This discussion introduces five of the most popular frameworks: Torch, Theano, Caffe, Keras, and TensorFlow.

Torch

Torch is the first machine learning framework to attract a significant following. Originally released in 2002 by Ronan Collobert, it began as a toolset for numeric computing. Torch’s computations involve multidimensional arrays called tensors, which can be processed with regular vector/matrix operations. Over time, Torch acquired routines for building, training, and evaluating neural networks.

Torch garnered a great deal of interest from academics and corporations like IBM and Facebook. But its adoption has been limited by its reliance on Lua as its interface language. The other frameworks in this discussion —Theano, Caffe, Keras, and TensorFlow — can be interfaced through Python, which has emerged as the language of choice in the machine learning domain.

Theano

In 2010, a machine learning group at the University of Montreal released Theano, a library for numeric computation. Like NumPy, Theano provides a wide range of Python routines for operating on multidimensional arrays. Unlike NumPy, Theano stores operations in a data structure called a graph, which it compiles into high-performance code. Theano also supports symbolic differentiation, which makes it possible to find derivatives of functions automatically.

Because of its high performance and symbolic differentiation, many machine learning developers have adopted Theano as their numeric computation toolset of choice. Developers particularly appreciate Theano’s ability to execute graphs on graphics processing units (GPUs) as well as central processing units (CPUs).

Caffe

As part of his PhD dissertation at UC Berkeley, Yangqing Jia created Caffe, a framework for developing image recognition applications. As others joined in the development, Caffe expanded to support other machine learning algorithms and many different types of neural networks.

Caffe is written in C++, and like Theano, it supports GPU acceleration. This emphasis on performance has endeared Caffe to many academic and corporate developers. Facebook has become particularly interested in Caffe, and in 2007 it released a reworked version called Caffe2. This version improves Caffe’s performance and makes executing applications on smartphones possible.

Keras

While other offerings focus on performance and breadth of capabilities, Keras is concerned with modularity and simplicity of development. François Chollet created Keras as an interface to other machine learning frameworks, and many developers access Theano through Keras to combine Keras’s simplicity with Theano’s performance.

Keras’s simplicity stems from its small API and intuitive set of functions. These functions focus on accomplishing standard tasks in machine learning, which makes Keras ideal for newcomers to the field but of limited value for those who want to customize their operations.

François Chollet released Keras under the MIT License, and Google has incorporated his interface into TensorFlow. For this reason, many TensorFlow developers prefer to code their neural networks using Keras.

TensorFlow

As the title implies, this book centers on TensorFlow, Google’s gift to the world of machine learning. The Google Brain team released TensorFlow 1.0 in 2015, and as of the time of this writing, the current version is 1.4. It’s provided under the Apache 2.0 open source license, which means you’re free to use it, modify it, and distribute your modifications.

TensorFlow’s primary interface is Python, but like Caffe, its core functionality is written in C++ for improved performance. Like Theano, TensorFlow stores operations in a graph that can be deployed to a GPU, a remote system, or a network of remote systems. In addition, TensorFlow provides a utility called TensorBoard, which makes visualizing graphs and their operations possible.

Like other frameworks, TensorFlow supports execution on CPUs and GPUs. In addition, TensorFlow applications can be executed on the Google Cloud Platform (GCP). The GCP provides world-class processing power at relatively low cost, and in my opinion, GCP processing is TensorFlow’s most important advantage. Chapter 13 discusses this important topic in detail.

Chapter 2

Getting Your Feet Wet

IN THIS CHAPTER

check Obtaining and installing TensorFlow

check Exploring the TensorFlow package

check Running a simple application

check Understanding style conventions

Many chapters of this book present complex technical subjects and lengthy mathematical formulas. But not this one. This chapter is dead simple, and its goal is to walk you through the process of installing TensorFlow and running your first TensorFlow application.

A complete TensorFlow installation contains a vast number of files and directories. This chapter explores the installation and explains what the many files and folders are intended to accomplish. The discussion touches on many of TensorFlow’s packages and the modules they contribute.

Once you’ve installed the TensorFlow toolset, it’s easy to start coding and running applications. The end of the chapter presents a basic application that provides a cheery welcome to TensorFlow development.

Installing TensorFlow

Google provides two methods for installing TensorFlow, and the simpler option involves installing precompiled packages. This discussion presents a three-step process for installing these packages:

Install Python on your development system.
Install the pip package manager.
Use pip to install TensorFlow.

The second installation method involves compiling TensorFlow from its source code. This option takes time and effort, but you can obtain better performance because your TensorFlow package will take the fullest advantage of your processor’s capabilities. Chapter 12 explains how to obtain and compile TensorFlow’s source code.

Python and pip/pip3

TensorFlow supports development with Java and C++, but this book focuses on Python. I use Python 3 in the example code, but you’re welcome to use Python 2. As I explain in the upcoming section “Setting the Style,” TensorFlow applications should be accessible to both versions.

Python’s official package manager is pip, which is a recursive acronym that stands for “pip installs Python.” To install packages like TensorFlow, you can use pip on Python 2 systems or pip3 on Python 3 systems. Package management commands have the following format:

pip <command-name> <command-options>

pip and pip3 accept similar commands and perform similar operations. For example, executing pip list or pip3 list prints all the Python packages installed on your system. Table 2-1 lists this and five other commands.

TABLE 2-1 Package Management Commands

Command Name	Description
`install`	Installs a specified package
`uninstall`	Uninstalls a specified package
`download`	Downloads a package, but doesn't install it
`list`	Lists installed packages
`show`	Prints information about a specified package
`search`	Searches for a package whose name or summary contains the given text

For this discussion, the most important command to know is pip install and pip3 install. But keep in mind that pip/pip3 can perform many other operations.

If you execute a TensorFlow application using a precompiled package, you may receive messages like “The TensorFlow library wasn't compiled to use XYZ instructions, but these are available on your machine and could speed up CPU computations.” To turn off these messages, create an environment variable named TF_CPP_MIN_LOG_LEVEL and set its value to 3.

Installing on Mac OS

Many versions of Mac OS have Python already installed, but I recommend obtaining and installing a new Python package. If you visit www.python.org/downloads, you see one button for Python 2 and another for Python 3. If you click one of these buttons, your browser downloads a PKG file that serves as the Python installer.

When you launch the installer, the Python installation dialog box appears. To install the package, follow these five steps:

In the Introduction page, click the button labeled Continue.
In the Read Me page, click the button labeled Continue.
In the License page, click the button labeled Continue and then click Agree to accept the software license agreement.
In the Installation Type page, click Install to begin the installation process, entering your password, if necessary.
When the installation is complete, click Close to close the dialog box.

If the installation completes successfully, you can run pip or pip3 on a command line. You can install TensorFlow with the following command:

pip install tensorflow

This command tells the package manager to download TensorFlow, TensorBoard, and a series of dependencies. One dependency is six, which supports compatibility between Python 2 and 3. If the installation fails due to a preinstalled six package, you can fix the issue by executing the following command:

pip install --ignore-installed six

This command tells pip to install six on top of the existing installation. After this installation completes, you should be able to run pip install tensorflow without error. On my system, the installer stores the TensorFlow files in the /Library/Frameworks/Python.framework/Versions/<ver>/lib/python<ver>/site-packages/tensorflow directory.

Installing on Linux

Many popular distributions of Linux are based on Debian, including Ubuntu and Linux Mint. These distributions rely on the Advanced Package Tool (APT) to manage packages, which you can access on the command line by entering apt-get. This discussion explains how to install TensorFlow on these and similar operating systems.

Most Linux distributions already have Python installed, but it's a good idea to install the full development version and pip/pip3. The following command installs both for Python 2:

sudo apt-get install python-pip python-dev

Alternatively, the following command performs the installation for Python 3:

sudo apt-get install python3-pip python3-dev

After installation completes, you should be able to execute pip or pip3 on the command line. The following command installs the TensorFlow package and its dependencies (use pip3 for Python 3):

sudo pip install tensorflow

This command installs TensorFlow, TensorBoard, and their dependencies. On my Ubuntu system, the installer stores the files in the /usr/local/lib/python<ver>/dist-packages/tensorflow directory.

Installing on Windows

For Windows users, TensorFlow's documentation specifically recommends installing a 64-bit version of Python 3.5. To download the installer, visit www.python.org/downloads/windows, find a version of Python 3, and click the link entitled Windows x86-64 executable installer. This downloads an *.exe file that serves as the installer.

When you launch the installer, the Python setup dialog box appears. The following steps install Python on your system:

Check the checkbox for adding the Python installation directory to the PATH variable.
Click the link labeled Install Now.
When installation finishes, click the Close button to close the installer.

After you install Python, you should be able to run pip3 on a command line. You can install TensorFlow with the following command:

pip3 install tensorflow

The package manager downloads TensorFlow, TensorBoard, and the packages' dependencies. On my Windows system, the installer stores the files to the C:\Users\<name>\AppData\Local\Programs\Python\Python<ver>\Lib\site-packages\tensorflow directory.

Exploring the TensorFlow Installation

Once you install TensorFlow, you have a directory named tensorflow that contains a wide variety of files and folders. Two top-level folders are particularly important. The core directory contains the TensorFlow's primary packages and modules. The contrib directory contains secondary packages that may later be merged into core TensorFlow.

When you write a TensorFlow application, it’s important to be familiar with the different packages and the modules they provide. Table 2-2 lists the all-important tensorflow package and nine other packages.

TABLE 2-2 Important TensorFlow Packages

Package	Content
`tensorflow`	Central package of the TensorFlow framework, commonly accessed as `tf`
`tf.train`	Optimizers and other classes related to training
`tf.nn`	Neural network classes and related math operations
`tf.layers`	Functions related to multilayer neural networks
`tf.contrib`	Volatile or experimental code
`tf.image`	Image-processing functions
`tf.estimator`	High-level tools for training and evaluation
`tf.logging`	Functions that write data to a log
`tf.summary`	Classes needed to generate summary data
`tf.metrics`	Functions for measuring the outcome of machine learning

The first package, tensorflow, is TensorFlow's central package. Most applications import this package as tf, so when you see tf in code or an example, remember that it refers to the tensorflow package.

As I explain in Chapter 5, training is a crucial operation in machine learning applications. The tf.train package provides many of the modules and classes needed for TensorFlow training. In particular, it provides the optimizer classes that determine which algorithm should be used for training.

The tf.nn and tf.layers packages provide functions that create and configure neural networks. The two packages overlap in many respects, but the functions in tf.layers focus on multilayer networks, while the functions in tf.nn are suited toward general purpose machine learning.

Many of the packages in tf.contrib contain variants of core capabilities. For example, tf.contrib.nn contains variants of the features in tf.nn and tf.contrib.layers contains variants of the features in tf.layers. tf.contrib also provides a wealth of interesting and experimental packages, including the following:

tf.contrib.keras: Makes it possible to interface TensorFlow using the Keras interface
tf.contrib.ffmpeg: Enables audio processing through the open-source FFMPEG toolset
tf.contrib.bayesflow: Contains modules related to Bayesian learning
tf.contrib.integrate: Provides the odeint function, which integrates ordinary differential equations

The last three packages in Table 2-2 enable developers to analyze their applications and produce output. The functions in tf.logging enable logging and can be used to write messages to the log. The classes and functions in tf.summary generate data that can be read by TensorBoard, a utility for visualizing machine learning applications. The functions in tf.metrics analyze the accuracy of machine learning operations.

Running Your First Application

After you install TensorFlow, you're ready to start creating and executing applications. This section walks through the process of running an application that prints a simple message.

Exploring the example code

You can download this book’s example code from www.dummies.com by searching for TensorFlow For Dummies and going to the Downloads tab. The archive’s name is tf_dummies.zip, and if you decompress it, you see that it contains folders named after chapters (ch2, ch3, and so on).

Each chapter folder contains one or more Python files (*.py). In each case, you can execute the module by changing to the directory and running python or python3 followed by the filename.

For example, if you have Python 2 installed, you can execute the code in simple_math.py by changing to the ch3 directory and entering the following command:

python simple_math.py

The code for Chapter 13 is special because it's intended to be executed on the Google Cloud Platform, but that topic is far too exciting to be discussed here.

I haven’t provided any official license for this book’s example code, so you’re free to use it in professional products, academic work, and morally questionable experiments. But if you use any of this code to program evil robots, I will know, and I’ll be disappointed.

Launching Hello TensorFlow!

Programming books have a long tradition of introducing their topic with a simple example that prints a welcoming message. This book is no exception. If you open the ch2 directory in this book’s example code, you find a module named hello_tensorflow.py. Listing 2-1 presents the code.

LISTING 2-1 Hello TensorFlow!

"""A simple TensorFlow application"""

from __future__ import absolute_import

from __future__ import division

from __future__ import print_function

import tensorflow as tf

# Create tensor

msg = tf.string_join(["Hello ", "TensorFlow!"])

# Launch session

with tf.Session() as sess:

print(sess.run(msg))

This code performs three important tasks:

Creates a Tensor named msg that contains two string elements.
Creates a Session named sess and makes it the default session.
Launches the new Session and prints its result.

Running the code is simple. Open a command line and change to the ch2 directory in this book's example code. Then, if you’re using Python 2, you can execute the following command:

python hello_tensorflow.py

If you’re using Python 3, you can run the module with the following command:

python3 hello_tensorflow.py

As the Python interpreter does its magic, you should see the following message:

b'Hello TensorFlow!'

The welcome message is straightforward, but the application’s code probably isn’t as clear. A Tensor instance is an n-dimensional array that contains numeric or string data. Tensors play a central role in TensorFlow development, and Chapter 3 discusses them in detail.

A Session serves as the environment in which TensorFlow operations can be executed. All TensorFlow operations, from addition to optimization, must be executed through a session. Chapter 4 explains how you can create, configure, and execute sessions.

Setting the Style

Google provides the TensorFlow Style Guide at www.tensorflow.org/community/style_guide. Four of its guidelines are as follows:

Code in TensorFlow applications should be compatible with both Python 2 and Python 3.
In keeping with the first guideline, every module should have import statements for absolute_import, division, and print_function.
Indenting should use two spaces instead of four.
TensorFlow modules should rely on the guidelines in the PEP (Python Enhancement Proposal) 8 Style Guide except where they conflict with the TensorFlow Style Guide.

You can find the PEP8 guide at www.python.org/dev/peps/pep-0008. Its many recommendations include the use of docstrings, uppercase for class names, and lowercase for functions and modules. You can check Python code against the PEP8 by installing the pylint package and running pylint filename.py.

The example code in this book follows all of Google's recommendations except two. First, I use four spaces because that’s the Python way. Second, I prefer to name constants with simple lowercase names, such as the msg constant in Listing 2-1, earlier in this chapter.

I don’t blame you if you find my rebellion inexcusable. But if you send the Python police after me, they’ll never take me alive.

Chapter 3

Creating Tensors and Operations

IN THIS CHAPTER

check Creating tensors with known and random values

check Calling functions that transform tensors

check Processing tensors with operations

In grad school, I took a course on tensor mathematics that covered the usage of tensors in electromagnetism. The professor assured us that the theory was “beautiful” and “elegant,” but we beleaguered students described the relativistic mathematics as “indecipherable” and “terrifying.”

TensorFlow’s central data type is the tensor, and happily, it has nothing to do with electromagnetism or relativity. In this book, a tensor is just a regular array. If you’re familiar with Torch’s Tensors or NumPy's ndarrays, you’ll be glad to know that TensorFlow’s tensors are similar in many respects.

Unfortunately, you can’t access these tensors with regular Python routines. For this reason, the TensorFlow API provides a vast assortment of functions for creating, transforming, and operating on tensors. This chapter presents many of these functions and demonstrates how you can use them.

Creating Tensors

Just as most programs start by declaring variables, most TensorFlow applications start by creating tensors. A tensor is an array with zero or more dimensions. A zero-dimensional tensor is called a scalar, a one-dimensional tensor is called a vector, and a two-dimensional tensor is called a matrix. Keep in mind these three points about tensors:

Every tensor is an instance of the Tensor class.
A tensor may contain numbers, strings, or Boolean values. Every element of a tensor must have the same type.
Tensors can be created, transformed, and operated upon using functions of the tf package.

This discussion explains how to create tensors with known values and random values. Then I also present functions that transform a tensor's content. Once you understand these topics, you’ll have no trouble coding simple routines for tensor processing.

TensorFlow® For Dummies®

To view this book's Cheat Sheet, simply go to www.dummies.com and search for “TensorFlow For Dummies Cheat Sheet” in the Search box.

Table of Contents

Introduction

About This Book

Foolish Assumptions

Icons Used in this Book

Beyond the Book

Where to Go from Here

Getting to Know TensorFlow

Introducing Machine Learning with TensorFlow

Understanding Machine Learning

The Development of Machine Learning

Statistical regression

Reverse engineering the brain

Steady progress

The computing revolution

The rise of big data and deep learning

Machine Learning Frameworks

Torch

Theano

Caffe

Keras

TensorFlow

Getting Your Feet Wet

Installing TensorFlow

Python and pip/pip3

Installing on Mac OS

Installing on Linux

Installing on Windows

Exploring the TensorFlow Installation

Running Your First Application

Exploring the example code

Launching Hello TensorFlow!

Setting the Style

Creating Tensors and Operations

Creating Tensors