This edition first published 2020
© 2020 John Wiley & Sons Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Nan Zheng and Pinaki Mazumder to be identified as the authors of this work has been asserted in accordance with law.
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial Office
The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging‐in‐Publication Data
Names: Zheng, Nan, 1989‐ author. | Mazumder, Pinaki, author.
Title: Learning in energy‐efficient neuromorphic computing : algorithm and
architecture co‐design / Nan Zheng, Pinaki Mazumder.
Description: Hoboken, NJ : Wiley‐IEEE Press, [2020] | Includes
bibliographical references and index.
Identifiers: LCCN 2019029946 (print) | LCCN 2019029947 (ebook) | ISBN
9781119507383 (cloth) | ISBN 9781119507390 (adobe pdf) | ISBN
9781119507406 (epub)
Subjects: LCSH: Neural networks (Computer science)
Classification: LCC QA76.87 .Z4757 2019 (print) | LCC QA76.87 (ebook) |
DDC 006.3/2–dc23
LC record available at https://lccn.loc.gov/2019029946
LC ebook record available at https://lccn.loc.gov/2019029947
Cover design: Wiley
Cover image: © monsitj/Getty Images
Learning is the beginning of wealth.
Learning is the beginning of health.
Learning is the beginning of spirituality.
Searching and learning is where the miracle process all begins.
–Emanuel James Rohn
In loving memory of my father Animesh Chandra Mazumder (1918–2011).
In 1987, when I was wrapping up my doctoral thesis at the University of Illinois, I had a rare opportunity to listen to Prof. John Hopfield of the California Institute of Technology describe his groundbreaking research in neural networks to spellbound students in the Loomis Laboratory of Physics at Urbana-Champaign. He didactically described how to design and fabricate a recurrent neural network chip to rapidly solve the benchmark Traveling Salesman Problem (TSP), which is NP-complete: no algorithm is known, or believed to exist, that solves it in polynomial time as the number of cities grows very large.
This discovery of algorithmic hardware to solve intractable combinatorial problems was a major milestone in the field of neural networks, as the prior art of perceptron-type feedforward neural networks could merely classify a limited set of simple patterns. Admittedly, the founder of neural computing, Prof. Frank Rosenblatt of Cornell University, had built the Mark I Perceptron computer in the late 1950s, when the first wave of digital computers such as the IBM 650 was just being commercialized. Subsequent advances in neural hardware design were stymied mainly by the inability to integrate large synaptic networks with the technology of the day, which comprised vacuum tubes, relays, and passive components such as resistors, capacitors, and inductors. Therefore, in 1985, when AT&T Bell Labs fabricated the first solid-state proof-of-concept TSP chip in MOS technology to verify Prof. John Hopfield's neural net architecture, it opened the vista for non-Boolean, brain-like computing on silicon.
Prof. John Hopfield's seminal work established that if the "objective function" of a combinatorial algorithm can be expressed in quadratic form, the synaptic links in a recurrent artificial neural network can be programmed accordingly to reduce (i.e. locally minimize) the value of the objective function through massive interactions between the constituent neurons. Hopfield's neural network consists of laterally connected neurons that are randomly initialized; the network then iteratively reduces its intrinsic Lyapunov energy function until it reaches a local minimum. Notably, the Lyapunov function decreases monotonically under the dynamics of the recurrent neural network, provided the neurons have no self-feedback.1
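For reference, the Lyapunov energy function that such a network monotonically decreases can be written, in one common notation (with symmetric weights $w_{ij} = w_{ji}$ and no self-feedback, $w_{ii} = 0$), as

```latex
E = -\frac{1}{2}\sum_{i}\sum_{j \neq i} w_{ij}\, v_i v_j \;-\; \sum_{i} \theta_i v_i ,
```

where $v_i$ is the output of neuron $i$ and $\theta_i$ its threshold. Each asynchronous neuron update can only lower $E$, which is why the network is guaranteed to settle into a local minimum.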
Prof. Hopfield used a combination of four separate quadratic functions to represent the objective function of the TSP. The first part of the objective function ensures that the energy is minimized only if the traveling salesman visits each city exactly once; the second part ensures that the salesman visits every city in the itinerary; the third part ensures that no two cities are visited simultaneously; and the fourth part is designed to select the shortest route connecting all the cities. Because of massive simultaneous interactions between neurons through connecting synapses that are precisely adjusted to meet the constraints in these quadratic functions, a simple recurrent neural network can rapidly generate a solution of very good quality. However, unlike well-tested software procedures such as simulated annealing, dynamic programming, and branch-and-bound, neural networks generally fail to find the best solution because of their simplistic connectionist structures.
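For concreteness, the four-part quadratic objective of the Hopfield–Tank TSP formulation is commonly written as follows, where $V_{xi} = 1$ when city $x$ occupies position $i$ of the tour, $d_{xy}$ is the distance between cities $x$ and $y$, $n$ is the number of cities, and $A$, $B$, $C$, $D$ are penalty weights:

```latex
\begin{aligned}
E &= \frac{A}{2}\sum_{x}\sum_{i}\sum_{j \neq i} V_{xi} V_{xj}
   + \frac{B}{2}\sum_{i}\sum_{x}\sum_{y \neq x} V_{xi} V_{yi} \\
  &\quad + \frac{C}{2}\Bigl(\sum_{x}\sum_{i} V_{xi} - n\Bigr)^{2}
   + \frac{D}{2}\sum_{x}\sum_{y \neq x}\sum_{i} d_{xy}\, V_{xi}\,\bigl(V_{y,i+1} + V_{y,i-1}\bigr).
\end{aligned}
```

The first three terms vanish only on valid tours (each city appears once, each position holds one city, exactly $n$ entries in total), while the fourth term equals the tour length, so minimizing $E$ favors short valid tours.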
Therefore, after listening to Prof. Hopfield's fascinating talk, I harbored mixed feelings about the potential benefit of his innovation. On the one hand, I was thrilled to learn from his lecture how computationally hard algorithmic problems could be solved very quickly by simple neuromorphic CMOS circuits with very small hardware overheads. On the other hand, I thought that the TSP, which Prof. Hopfield had selected to demonstrate the ability of neural networks to solve combinatorial optimization problems, was not the right candidate, since software algorithms are so well crafted to obtain nearly optimal solutions that neural networks can hardly match them. I started contemplating self-healing VLSI chips, in which the power of neural-inspired self-repair algorithms could be used to restructure faulty VLSI chips automatically. Low overheads and the ability to solve a problem concurrently through parallel interactions between neurons are two salient features that I thought could be elegantly deployed for automatically repairing VLSI chips with built-in neural net circuitry.
Soon after I joined the University of Michigan as an assistant professor, I worked with one of my doctoral students [2] to develop, at first, a CMOS analog neural net circuit with asynchronous state updates, which lacked robustness due to process variation within a die. To improve the reliability of the self-repair circuitry, an MS student [3] and I then designed a digital neural net circuit with synchronous state updates. These neural circuits were designed to repair VLSI chips by formulating the repair problem as finding a node cover, edge cover, or node-pair matching in a bipartite graph. In our graph formalism, one set of vertices in the bipartite graph represented the faulty circuit elements, and the other set represented the spare circuit elements. To restructure a faulty VLSI chip into a fault-free operational chip, the spare circuit elements were automatically invoked through programmable switching elements after the faulty elements had been identified by embedded built-in self-test circuitry.
Most importantly, like the TSP, two-dimensional array repair can be shown to be an NP-complete problem, because the repair algorithm seeks the optimal assignment of spare rows and spare columns to bypass faulty components such as memory cells, word-line and bit-line drivers, and sense-amplifier bands located inside the memory array. Simple digital circuits comprising counters and other blocks therefore fail woefully at such intractable self-repair problems. Notably, one cannot use an external digital computer to determine how to repair embedded arrays, as the input and output pins of the VLSI chip cannot be used to access the fault patterns in the deeply embedded arrays.
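The underlying spare-allocation problem can be made concrete with a small sketch. The snippet below is illustrative only: the function name and the greedy heuristic are mine, not the neural-network formulation used in the chips, which instead minimizes a Hopfield-style energy through parallel neuron interactions. It covers a set of faulty cell coordinates with a limited budget of spare rows and columns:

```python
from collections import Counter

def greedy_repair(faults, spare_rows, spare_cols):
    """Cover every faulty (row, col) cell using at most the given numbers
    of spare rows and columns; returns (rows_used, cols_used) or None.
    Greedy heuristic: repeatedly spend a spare on the line (row or
    column) that covers the most remaining faults."""
    faults = set(faults)
    rows_used, cols_used = [], []
    while faults:
        row_hits = Counter(r for r, _ in faults).most_common(1)[0]
        col_hits = Counter(c for _, c in faults).most_common(1)[0]
        # Prefer whichever spare line covers more remaining faults,
        # subject to the remaining spare budget.
        take_row = len(rows_used) < spare_rows and (
            row_hits[1] >= col_hits[1] or len(cols_used) >= spare_cols)
        if take_row:
            rows_used.append(row_hits[0])
            faults = {(r, c) for r, c in faults if r != row_hits[0]}
        elif len(cols_used) < spare_cols:
            cols_used.append(col_hits[0])
            faults = {(r, c) for r, c in faults if c != col_hits[0]}
        else:
            return None  # spare budget exhausted before all faults covered
    return rows_used, cols_used

# Row 0 holds two faults and column 1 holds two faults, so one spare
# row plus one spare column suffices.
print(greedy_repair([(0, 0), (0, 3), (2, 1), (4, 1)], 1, 1))  # → ([0], [1])
```

Because the exact version of this covering problem is NP-complete, such a greedy pass can fail on instances that are in fact repairable, which is precisely why heuristic and neural approaches to the allocation were attractive.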
In 1989 and 1992, I received two NSF grants to extend the neuromorphic self-healing design style to a wider class of embedded VLSI modules such as memory arrays [4], processor arrays [5], programmable logic arrays, and so on [6]. However, this approach to improving VLSI chip yield through built-in self-test and self-repair was a bit ahead of its time: state-of-the-art microprocessors in the early 1990s contained only a few hundred thousand transistors, and the submicron CMOS technology of the day was relatively robust. Therefore, after developing the neural-net-based self-healing VLSI chip design methodology for various types of embedded circuit blocks, I stopped working on CMOS neural networks. I was not particularly interested in pursuing applications of neural networks to other types of engineering problems, as I wanted to remain focused on solving emerging problems in VLSI research.
On the other hand, in the late 1980s there were mounting concerns among CMOS technology prognosticators about the impending red brick wall heralding the end of the shrinking era in CMOS. Therefore, to promote several types of emerging technologies that might push the frontier of VLSI technology, the Defense Advanced Research Projects Agency (DARPA) in the USA initiated (around 1990) the Ultra Electronics: Ultra Dense, Ultra Fast Computing Components research program. Concurrently, the Ministry of International Trade and Industry (MITI) in Japan launched the Quantum Functional Devices (QFD) project. Early successes with a plethora of innovative non-CMOS technologies in both research programs led to the launch of the National Nanotechnology Initiative (NNI), a U.S. Government research and development (R&D) initiative involving 20 departments and independent agencies, intended to bring about a revolution in nanotechnology that would impact industry and society at large.
Between 1995 and 2010, my research group focused first on quantum-physics-based device and circuit modeling for quantum tunneling devices, and then worked extensively on cellular neural network (CNN) circuits for image and video processing using one-dimensionally confined (double-barrier resonant tunneling device), two-dimensionally confined (self-assembled nanowire), and three-dimensionally confined (quantum dot array) quantum devices. Subsequently, we developed learning-based neural network circuits using resistive synaptic devices (commonly known as memristors) and CMOS neurons. We also developed analog voltage-programmable nanocomputing architectures by hybridizing quantum tunneling and memristive devices in the computing nodes of a two-dimensional processing element (PE) ensemble. Our research on nanoscale neuromorphic circuits will soon be published in our new book, Neuromorphic Circuits for Nanoscale Devices, River Publishing, U.K., 2019.
After spending a little over a decade developing neuromorphic circuits with various types of emerging nanoscale electronic and spintronic devices, I decided to embark on research on learning-based digital VLSI neuromorphic chips using nanoscale CMOS technology in both the subthreshold and superthreshold modes of operation. My student and the coauthor of this book, Dr. Nan Zheng, conducted his doctoral dissertation work on architectures and algorithms for digital neural networks. We approached the problem from both machine-learning and biological-learning perspectives to design and fabricate energy-efficient VLSI chips using TSMC 65 nm CMOS technology.
From the machine-learning perspective, we implemented on VLSI chips actor-critic reinforcement learning (RL) [7] and Q-learning [8], an off-policy form of temporal difference (TD) learning. From the biological perspective, we captured the spiking-correlation-based synaptic plasticity commonly used in unsupervised learning. We also formulated hardware-friendly spike-timing-dependent plasticity (STDP) learning rules [9], which achieved classification rates of 97.2% and 97.8% for one-hidden-layer and two-hidden-layer neural networks, respectively, on the Modified National Institute of Standards and Technology (MNIST) database benchmark. The hardware-friendly learning rule enabled both energy-efficient hardware design [10] and implementations that were robust to the process-voltage-temperature (PVT) variations associated with chip manufacturing [11]. We demonstrated that the hardware-accelerator VLSI chip for the actor-critic network solved several control-theoretic benchmark problems by emulating adaptive dynamic programming (ADP), which is at the heart of RL software. Compared with traditional software RL running on a general-purpose processor, the VLSI accelerator operating at 175 MHz achieved a two-orders-of-magnitude improvement in computational time while consuming merely 25 mW [12].
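The Q-learning rule mentioned above, in its standard textbook form, updates the action-value estimate after observing reward $r_{t+1}$ and next state $s_{t+1}$:

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \Bigl[\, r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \Bigr],
```

where $\alpha$ is the learning rate and $\gamma$ the discount factor. Taking the $\max$ over the next action, rather than the action the agent actually takes, is what makes the update off-policy.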
The chip layout diagrams included in the Preface show a sample of the numerous digital CMOS neural network chips that my research group has designed over the course of the last three decades. In the left column: a self-healing chip, designed in 1991, repairs faulty VLSI memory arrays automatically by running node-covering algorithms on a bipartite graph representing the set of faulty components and the available spare circuit elements; the STDP chip, designed in 2013, controls the motion of a virtual insect from an initial source to a selected destination, avoiding collisions while navigating through a set of arbitrarily shaped blocked spaces; and a deep learning chip, described in the previous paragraph, was designed in 2016.
The right column shows the RL chip described in the preceding paragraph, also designed in 2016, together with two ultra-low-power (ULP) CMOS chips biased in the subthreshold mode for wearable health-care applications. In one application, Kohonen's self-organizing map (SOM) was implemented to classify electrocardiogram (ECG) waveforms; in the other, a body-sensing network with a wireless transceiver was designed to sense analog neuronal signals through an implantable multielectrode sensor and deliver the digitized data, via a built-in wake-up transceiver, to doctors monitoring the efficacy of drugs at the neuronal and synaptic levels in brain-related diseases such as schizophrenia, chronic depression, Alzheimer's disease, and so on.
Initially, when we decided to publish a monograph highlighting our work on CMOS neuromorphic chips for brain-like computing, we intended to aggregate the results of the papers cited in this Preface to compose the contents of the book. In the course of preparing the manuscript, however, we moved beyond that narrow goal, as such a book would have been too limiting to adopt in a regular course for teaching undergraduate and graduate students about the latest generation of neural networks with learning capabilities.
Instead, we decided to write a comprehensive book on energy-efficient hardware design for neural networks with various types of learning capability, surveying the expansive ongoing research in neural hardware. This was evidently a Herculean task, requiring us to mull through hundreds of archival references and to describe co-design and co-optimization methodologies for building hardware neural networks that can learn to perform various tasks. We have attempted to provide a comprehensive perspective, from high-level algorithms to low-level implementation details, covering the fundamentals and essentials of neural networks (e.g. deep learning) as well as the hardware implementation of neural networks. In a nutshell, the present version of the book has the following salient features:
Finally, owing to the stringent time constraint of completing this book while concurrently finishing the complementary book (Neuromorphic Circuits for Nanoscale Devices, River Publishing, U.K., 2019), the present version has been completed without the pedagogical treatment expected of a textbook, such as exercise problems at the end of each chapter. Hopefully, those goals will be achieved in the next edition after we gather valuable feedback from students, instructors, practicing engineers, and other readers. I shall truly appreciate such guiding feedback, both positive and negative, as it will enable me to prepare the Second Edition of the book. My contact information is included below for your convenience.
Pinaki Mazumder February 14, 2019
Address:
4765 BBB Building
Division of Computer Science and Engineering
Department of Electrical Engineering and Computer Science
University of Michigan, Ann Arbor, MI 48109-2122
Ph: 734-763-2107
E-mail: mazum@eecs.umich.edu, pinakimazum@gmail.com
Website: http://www.eecs.umich.edu/∼mazum
First, I would like to thank several senior colleagues who encouraged me to carry on my research in neural computing over the past three decades, after I published my first paper, in 1989, on the self-healing of VLSI memories using the concept of the Hopfield network. Specifically, I would like to thank Prof. Leon O. Chua and Prof. Ernest S. Kuh of the University of California at Berkeley; Prof. Steve M. Kang, Prof. Kent W. Fuchs, and Prof. Janak H. Patel of the University of Illinois at Urbana-Champaign; Prof. Jacob A. Abraham of the University of Texas at Austin; Prof. Supriyo Bandyopadhyay of Virginia Commonwealth University; Prof. Sudhakar M. Reddy of the University of Iowa; and Prof. Tamas Roska and Prof. Arpad Csurgay of the Technical University of Budapest, Hungary.
Second, I would like to thank several of my colleagues at the National Science Foundation, where I served in the Directorate for Computer and Information Science and Engineering (CISE) as Program Director of the Emerging Models and Technologies program from January 2007 to December 2008, and then in the Engineering Directorate as Program Director of the Adaptive Intelligent Systems program from January 2008 to December 2009. Specifically, I would like to thank Dr. Robert Grafton and Dr. Sankar Basu of the Computing and Communication Foundations (CCF) Division of CISE, and Dr. Radhakrishnan Baheti, Dr. Paul Werbos, and Dr. Jenshan Lin of the Electrical, Communications and Cyber Systems (ECCS) Division of the Engineering Directorate, for providing me with research funds over the years to conduct research on learning-based systems, which enabled me to delve deep into CMOS chip design for brain-like computing. I also had the distinct pleasure of interacting with Dr. Michael Roco during my stint at NSF, and subsequently when I was invited to present our group's brain-like computing research at the US-Korea Forums on Nanotechnology in 2016 in Seoul, Korea, and in 2017 in Falls Church, Virginia, USA.
Third, I would like to thank Dr. Jih Shyr Yih, my first doctoral student, who began working with me soon after I joined the University of Michigan. After taking a course with me in which I taught the memory-repair algorithms, he enthusiastically implemented the first self-healing VLSI chip using the analog Hopfield network. Next, I would like to acknowledge the contributions of Mr. Michael Smith, who implemented the digital self-healing chip mentioned above. My other doctoral students, Dr. W. H. Lee, Dr. I. Ebong, Dr. J. Kim, Dr. Y. Yalcin, and Ms. S. R. Li, worked on RTDs, quantum dots, and memristors to build neuromorphic circuits; their research is included in a separate book. The bulk of this book is drawn from the doctoral dissertation of my student and coauthor, Dr. N. Zheng. It was a joy to work with such an assiduous student. I would also like to thank Dr. M. Erementchouk for scrutinizing the manuscript and providing suggestions.
Finally, I would like to thank my wife, Sadhana, my son, Bhaskar, my daughter, Monika, and their spouses, Pankti Pathak and Thomas Parker, respectively, for their understanding and support despite the fact that I spent most of my time with my research group.