“In his new book *Advances in Financial Machine Learning*, noted financial scholar Marcos López de Prado strikes a well-aimed karate chop at the naive and often statistically overfit techniques that are so prevalent in the financial world today. He points out that not only are business-as-usual approaches largely impotent in today's high-tech finance, but in many cases they are actually prone to lose money. But López de Prado does more than just expose the mathematical and statistical sins of the finance world. Instead, he offers a technically sound roadmap for finance professionals to join the wave of machine learning. What is particularly refreshing is the author's empirical approach—his focus is on real-world data analysis, not on purely theoretical methods that may look pretty on paper but which, in many cases, are largely ineffective in practice. The book is geared to finance professionals who are already familiar with statistical data analysis techniques, but it is well worth the effort for those who want to do real state-of-the-art work in the field.”

Dr. David H. Bailey, former Complex Systems Lead, Lawrence Berkeley National Laboratory. Co-discoverer of the BBP spigot algorithm

“Finance has evolved from a compendium of heuristics based on historical financial statements to a highly sophisticated scientific discipline relying on computer farms to analyze massive data streams in real time. The recent highly impressive advances in machine learning (ML) are fraught with both promise and peril when applied to modern finance. While finance offers up the nonlinearities and large data sets upon which ML thrives, it also offers up noisy data and the human element which presently lie beyond the scope of standard ML techniques. To err is human, but if you really want to f**k things up, use a computer. Against this background, Dr. López de Prado has written the first comprehensive book describing the application of modern ML to financial modeling. The book blends the latest technological developments in ML with critical life lessons learned from the author's decades of financial experience in leading academic and industrial institutions. I highly recommend this exciting book to both prospective students of financial ML and the professors and supervisors who teach and guide them.”

Prof. Peter Carr, Chair of the Finance and Risk Engineering Department, NYU Tandon School of Engineering

“Marcos is a visionary who works tirelessly to advance the finance field. His writing is comprehensive and masterfully connects the theory to the application. It is not often you find a book that can cross that divide. This book is an essential read for both practitioners and technologists working on solutions for the investment community.”

Landon Downs, President and Cofounder, 1QBit

“Academics who want to understand modern investment management need to read this book. In it, Marcos López de Prado explains how portfolio managers use machine learning to derive, test, and employ trading strategies. He does this from a very unusual combination of an academic perspective and extensive experience in industry, allowing him to both explain in detail what happens in industry and to explain how it works. I suspect that some readers will find parts of the book that they do not understand or that they disagree with, but everyone interested in understanding the application of machine learning to finance will benefit from reading this book.”

Prof. David Easley, Cornell University. Chair of the NASDAQ-OMX Economic Advisory Board

“For many decades, finance has relied on overly simplistic statistical techniques to identify patterns in data. Machine learning promises to change that by allowing researchers to use modern nonlinear and highly dimensional techniques, similar to those used in scientific fields like DNA analysis and astrophysics. At the same time, applying those machine learning algorithms to model financial problems would be dangerous. Financial problems require very distinct machine learning solutions. Dr. López de Prado's book is the first one to characterize what makes standard machine learning tools fail when applied to the field of finance, and the first one to provide practical solutions to unique challenges faced by asset managers. Everyone who wants to understand the future of finance should read this book.”

Prof. Frank Fabozzi, EDHEC Business School. Editor of *The Journal of Portfolio Management*

“This is a welcome departure from the knowledge hoarding that plagues quantitative finance. López de Prado defines for all readers the next era of finance: industrial scale scientific research powered by machines.”

John Fawcett, Founder and CEO, Quantopian

“Marcos has assembled in one place an invaluable set of lessons and techniques for practitioners seeking to deploy machine learning techniques in finance. If machine learning is a new and potentially powerful weapon in the arsenal of quantitative finance, Marcos's insightful book is laden with useful advice to help keep a curious practitioner from going down any number of blind alleys, or shooting oneself in the foot.”

Ross Garon, Head of Cubist Systematic Strategies. Managing Director, Point72 Asset Management

“The first wave of quantitative innovation in finance was led by Markowitz optimization. Machine Learning is the second wave, and it will touch every aspect of finance. López de Prado's *Advances in Financial Machine Learning* is essential for readers who want to be ahead of the technology rather than being replaced by it.”

Prof. Campbell Harvey, Duke University. Former President of the American Finance Association

“The complexity inherent to financial systems justifies the application of sophisticated mathematical techniques. *Advances in Financial Machine Learning* is an exciting book that unravels a complex subject in clear terms. I wholeheartedly recommend this book to anyone interested in the future of quantitative investments.”

Prof. John C. Hull, University of Toronto. Author of *Options, Futures, and Other Derivatives*

“Prado’s book clearly illustrates how fast this world is moving, and how deep you need to dive if you are to excel and deliver top of the range solutions and above the curve performing algorithms... Prado’s book is clearly at the bleeding edge of the machine learning world.”

Irish Tech News

“Financial data is special for a key reason: The markets have only one past. There is no ‘control group’, and you have to wait for true out-of-sample data. Consequently, it is easy to fool yourself, and with the march of Moore’s Law and the new machine learning, it’s easier than ever. López de Prado explains how to avoid falling for these common mistakes. This is an excellent book for anyone working, or hoping to work, in computerized investment and trading.”

Dr. David J. Leinweber, Former Managing Director, First Quadrant. Author of *Nerds on Wall Street: Math, Machines and Wired Markets*

“In his new book, Dr. López de Prado demonstrates that financial machine learning is more than standard machine learning applied to financial datasets. It is an important field of research in its own right. It requires the development of new mathematical tools and approaches, needed to address the nuances of financial datasets. I strongly recommend this book to anyone who wishes to move beyond the standard Econometric toolkit.”

Dr. Richard R. Lindsey, Managing Partner, Windham Capital Management. Former Chief Economist, U.S. Securities and Exchange Commission

“Dr. López de Prado, a well-known scholar and an accomplished portfolio manager who has made several important contributions to the literature on machine learning (ML) in finance, has produced a comprehensive and innovative book on the subject. He has illuminated numerous pitfalls awaiting anyone who wishes to use ML in earnest, and he has provided much needed blueprints for doing it successfully. This timely book, offering a good balance of theoretical and applied findings, is a must for academics and practitioners alike.”

Prof. Alexander Lipton, Connection Science Fellow, Massachusetts Institute of Technology. *Risk*’s Quant of the Year (2000)

“How does one make sense of today’s financial markets in which complex algorithms route orders, financial data is voluminous, and trading speeds are measured in nanoseconds? In this important book, Marcos López de Prado sets out a new paradigm for investment management built on machine learning. Far from being a “black box” technique, this book clearly explains the tools and process of financial machine learning. For academics and practitioners alike, this book fills an important gap in our understanding of investment management in the machine age.”

Prof. Maureen O'Hara, Cornell University. Former President of the American Finance Association

“Marcos López de Prado has produced an extremely timely and important book on machine learning. The author's academic and professional first-rate credentials shine through the pages of this book—indeed, I could think of few, if any, authors better suited to explaining both the theoretical and the practical aspects of this new and (for most) unfamiliar subject. Both novices and experienced professionals will find insightful ideas, and will understand how the subject can be applied in novel and useful ways. The Python code will give the novice readers a running start and will allow them to gain quickly a hands-on appreciation of the subject. Destined to become a classic in this rapidly burgeoning field.”

Prof. Riccardo Rebonato, EDHEC Business School. Former Global Head of Rates and FX Analytics at PIMCO

“A tour de force on practical aspects of machine learning in finance, brimming with ideas on how to employ cutting-edge techniques, such as fractional differentiation and quantum computers, to gain insight and competitive advantage. A useful volume for finance and machine learning practitioners alike.”

Dr. Collin P. Williams, Head of Research, D-Wave Systems

Cover image: © Erikona/Getty Images

Cover design: Wiley

Copyright © 2018 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.

Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600, or on the Web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. The views expressed in this book are the author's and do not necessarily reflect those of the organizations he is affiliated with.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993, or fax (317) 572-4002.

Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.

ISBN 978-1-119-48208-6 (Hardcover)

ISBN 978-1-119-48211-6 (ePDF)

ISBN 978-1-119-48210-9 (ePub)

Dedicated to the memory of my coauthor and friend,

Professor Jonathan M. Borwein, FRSC, FAAAS,

FBAS, FAustMS, FAA, FAMS, FRSNSW

(1951–2016)

There are very few things which we know, which are not capable of being reduced to a mathematical reasoning. And when they cannot, it's a sign our knowledge of them is very small and confused. Where a mathematical reasoning can be had, it's as great a folly to make use of any other, as to grope for a thing in the dark, when you have a candle standing by you.

—*Of the Laws of Chance*, Preface (1692)

John Arbuthnot (1667–1735)

**Dr. Marcos López de Prado** manages multibillion-dollar funds using machine learning (ML) and supercomputing technologies. He founded Guggenheim Partners’ Quantitative Investment Strategies (QIS) business, where he developed high-capacity strategies that consistently delivered superior risk-adjusted returns. After managing up to $13 billion in assets, Marcos acquired QIS and spun that business out from Guggenheim in 2018.

Since 2011, Marcos has been a research fellow at Lawrence Berkeley National Laboratory (U.S. Department of Energy, Office of Science). One of the top-10 most read authors in finance (SSRN's rankings), he has published dozens of scientific articles on ML and supercomputing in the leading academic journals, and he holds multiple international patent applications on algorithmic trading.

Marcos earned a PhD in Financial Economics (2003) and a second PhD in Mathematical Finance (2011) from Universidad Complutense de Madrid, and is a recipient of Spain's National Award for Academic Excellence (1999). He completed his post-doctoral research at Harvard University and Cornell University, where he teaches a Financial ML course at the School of Engineering. Marcos has an Erdős #2 and an Einstein #4 according to the American Mathematical Society.

For additional details, visit www.QuantResearch.org

- Chapter 1 Financial Machine Learning as a Distinct Subject