Details

The Book of Alternative Data


A Guide for Investors, Traders and Risk Managers
1st ed.

by: Alexander Denev, Saeed Amen

32,99 €

Publisher: Wiley
Format: EPUB
Published: 29.06.2020
ISBN/EAN: 9781119601807
Language: English
Number of pages: 416

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Descriptions

<p><b>The first and only book to systematically address methodologies and processes of leveraging non-traditional information sources in the context of investing and risk management</b></p> <p>Harnessing non-traditional data sources to generate alpha, analyze markets, and forecast risk is a subject of intense interest for financial professionals. A growing number of regularly held conferences on alternative data are being established, complemented by an upsurge in new papers on the subject. Alternative data is starting to be steadily incorporated by conventional institutional investors and risk managers throughout the financial world. Methodologies to analyze and extract value from alternative data, and guidance on how to source data and integrate data flows within existing systems, are currently not treated in the literature. Filling this significant gap in knowledge, <i>The Book of Alternative Data</i> is the first and only book to offer a coherent, systematic treatment of the subject.</p> <p>This groundbreaking volume provides readers with a roadmap for navigating the complexities of an array of alternative data sources, and delivers the appropriate techniques to analyze them. The authors, leading experts in financial modeling, machine learning, and quantitative research and analytics, employ a step-by-step approach to guide readers through the dense jungle of generated data. 
A first-of-its-kind treatment of alternative data types, sources, and methodologies, this innovative book:</p> <ul> <li>Provides an integrated modeling approach to extract value from multiple types of datasets</li> <li>Treats the processes needed to make alternative data signals operational</li> <li>Helps investors and risk managers rethink how they engage with alternative datasets</li> <li>Features practical use case studies in many different financial markets and real-world techniques</li> <li>Describes how to avoid potential pitfalls and missteps in starting the alternative data journey</li> <li>Explains how to integrate information from different datasets to maximize informational value</li> </ul> <p><i>The Book of Alternative Data</i> is an indispensable resource for anyone wishing to analyze or monetize different non-traditional datasets, including Chief Investment Officers, Chief Risk Officers, risk professionals, investment professionals, traders, economists, and machine learning developers and users.</p>
<p>Preface xv</p> <p>Acknowledgments xvii</p> <p><b>Part 1 Introduction and Theory 1</b></p> <p><b>1 Alternative Data: The Lay of the Land 3</b></p> <p>1.1 Introduction 3</p> <p>1.2 What is “Alternative Data”? 5</p> <p>1.3 Segmentation of Alternative Data 7</p> <p>1.4 The Many Vs of Big Data 9</p> <p>1.5 Why Alternative Data? 11</p> <p>1.6 Who is Using Alternative Data? 15</p> <p>1.7 Capacity of a Strategy and Alternative Data 16</p> <p>1.8 Alternative Data Dimensions 19</p> <p>1.9 Who Are the Alternative Data Vendors? 23</p> <p>1.10 Usage of Alternative Datasets on the Buy Side 24</p> <p>1.11 Conclusion 26</p> <p><b>2 The Value of Alternative Data 27</b></p> <p>2.1 Introduction 27</p> <p>2.2 The Decay of Investment Value 27</p> <p>2.3 Data Markets 29</p> <p>2.4 The Monetary Value of Data (Part I) 31</p> <p>2.4.1 Cost Value 34</p> <p>2.4.2 Market Value 34</p> <p>2.4.3 Economic Value 35</p> <p>2.5 Evaluating (Alternative) Data Strategies with and without Backtesting 35</p> <p>2.5.1 Systematic Investors 36</p> <p>2.5.2 Discretionary Investors 38</p> <p>2.5.3 Risk Managers 39</p> <p>2.6 The Monetary Value of Data (Part II) 39</p> <p>2.6.1 The Buyer’s Perspective 40</p> <p>2.6.2 The Seller’s Perspective 41</p> <p>2.7 The Advantages of Maturing Alternative Datasets 45</p> <p>2.8 Summary 46</p> <p><b>3 Alternative Data Risks and Challenges 47</b></p> <p>3.1 Legal Aspects of Data 47</p> <p>3.2 Risks of Using Alternative Data 50</p> <p>3.3 Challenges of Using Alternative Data 51</p> <p>3.3.1 Entity Matching 52</p> <p>3.3.2 Missing Data 54</p> <p>3.3.3 Structuring the Data 55</p> <p>3.3.4 Treatment of Outliers 56</p> <p>3.4 Aggregating the Data 57</p> <p>3.5 Summary 58</p> <p><b>4 Machine Learning Techniques 59</b></p> <p>4.1 Introduction 59</p> <p>4.2 Machine Learning: Definitions and Techniques 60</p> <p>4.2.1 Bias, Variance, and Noise 60</p> <p>4.2.2 Cross-Validation 61</p> <p>4.2.3 Introducing Machine Learning 62</p> <p>4.2.4 Popular Supervised Machine Learning 
Techniques 64</p> <p>4.2.5 Clustering-Based Unsupervised Machine Learning Techniques 70</p> <p>4.2.6 Other Unsupervised Machine Learning Techniques 71</p> <p>4.2.7 Machine Learning Libraries 71</p> <p>4.2.8 Neural Networks and Deep Learning 72</p> <p>4.2.9 Gaussian Processes 80</p> <p>4.3 Which Technique to Choose? 82</p> <p>4.4 Assumptions and Limitations of the Machine Learning Techniques 84</p> <p>4.4.1 Causality 84</p> <p>4.4.2 Non-stationarity 85</p> <p>4.4.3 Restricted Information Set 86</p> <p>4.4.4 The Algorithm Choice 86</p> <p>4.5 Structuring Images 87</p> <p>4.5.1 Features and Feature Detection Algorithms 87</p> <p>4.5.2 Deep Learning and CNNs for Image Classification 89</p> <p>4.5.3 Augmenting Satellite Image Data with Other Datasets 90</p> <p>4.5.4 Imaging Tools 91</p> <p>4.6 Natural Language Processing (NLP) 91</p> <p>4.6.1 What is Natural Language Processing (NLP)? 91</p> <p>4.6.2 Normalization 93</p> <p>4.6.3 Creating Word Embeddings: Bag-of-Words 94</p> <p>4.6.4 Creating Word Embeddings: Word2vec and Beyond 94</p> <p>4.6.5 Sentiment Analysis and NLP Tasks as Classification Problems 96</p> <p>4.6.6 Topic Modeling 96</p> <p>4.6.7 Various Challenges in NLP 97</p> <p>4.6.8 Different Languages and Different Texts 98</p> <p>4.6.9 Speech in NLP 99</p> <p>4.6.10 NLP Tools 100</p> <p>4.7 Summary 102</p> <p><b>5 The Processes behind the Use of Alternative Data 105</b></p> <p>5.1 Introduction 105</p> <p>5.2 Steps in the Alternative Data Journey 106</p> <p>5.2.1 Step 1. Set up a Vision and Strategy 106</p> <p>5.2.2 Step 2. Identify the Appropriate Datasets 107</p> <p>5.2.3 Step 3. Perform Due Diligence on Vendors 108</p> <p>5.2.4 Step 4. Pre-assess Risks 109</p> <p>5.2.5 Step 5. Pre-assess the Existence of Signals 109</p> <p>5.2.6 Step 6. Data Onboarding 110</p> <p>5.2.7 Step 7. Data Preprocessing 110</p> <p>5.2.8 Step 8. Signal Extraction 111</p> <p>5.2.9 Step 9. 
Implementation (or Deployment in Production) 112</p> <p>5.2.10 Maintenance Process 113</p> <p>5.3 Structuring Teams to Use Alternative Data 114</p> <p>5.4 Data Vendors 116</p> <p>5.5 Summary 118</p> <p><b>6 Factor Investing 119</b></p> <p>6.1 Introduction 119</p> <p>6.1.1 The CAPM 119</p> <p>6.2 Factor Models 120</p> <p>6.2.1 The Arbitrage Pricing Theory 122</p> <p>6.2.2 The Fama-French 3-Factor Model 123</p> <p>6.2.3 The Carhart Model 124</p> <p>6.2.4 Other Approaches (Data Mining) 125</p> <p>6.3 The Difference between Cross-Sectional and Time Series Trading Approaches 126</p> <p>6.4 Why Factor Investing? 126</p> <p>6.5 Smart Beta Indices Using Alternative Data Inputs 127</p> <p>6.6 ESG Factors 128</p> <p>6.7 Direct and Indirect Prediction 129</p> <p>6.8 Summary 132</p> <p><b>Part 2 Practical Applications 133</b></p> <p><b>7 Missing Data: Background 135</b></p> <p>7.1 Introduction 135</p> <p>7.2 Missing Data Classification 136</p> <p>7.2.1 Missing Data Treatments 137</p> <p>7.3 Literature Overview of Missing Data Treatments 139</p> <p>7.3.1 Luengo et al. (2012) 139</p> <p>7.3.2 Garcia-Laencina et al. (2010) 143</p> <p>7.3.3 Grzymala-Busse et al. (2000) 146</p> <p>7.3.4 Zou et al. (2005) 147</p> <p>7.3.5 Jerez et al. (2010) 147</p> <p>7.3.6 Farhangfar et al. (2008) 148</p> <p>7.3.7 Kang et al. 
(2013) 149</p> <p>7.4 Summary 149</p> <p><b>8 Missing Data: Case Studies 151</b></p> <p>8.1 Introduction 151</p> <p>8.2 Case Study: Imputing Missing Values in Multivariate Credit Default Swap Time Series 152</p> <p>8.2.1 Missing Data Classification 153</p> <p>8.2.2 Imputation Metrics 154</p> <p>8.2.3 CDS Data and Test Data Generation 154</p> <p>8.2.4 Multiple Imputation Methods 157</p> <p>8.2.5 Deterministic and EOF-Based Techniques 160</p> <p>8.2.6 Results 164</p> <p>8.3 Case Study: Satellite Images 173</p> <p>8.4 Summary 176</p> <p>8.5 Appendix: General Description of the MICE Procedure 178</p> <p>8.6 Appendix: Software Libraries Used in This Chapter 179</p> <p><b>9 Outliers (Anomalies) 181</b></p> <p>9.1 Introduction 181</p> <p>9.2 Outliers Definition, Classification, and Approaches to Detection 182</p> <p>9.3 Temporal Structure 183</p> <p>9.4 Global Versus Local Outliers, Point Anomalies, and Micro-Clusters 184</p> <p>9.5 Outlier Detection Problem Setup 184</p> <p>9.6 Comparative Evaluation of Outlier Detection Algorithms 185</p> <p>9.7 Approaches to Outlier Explanation 189</p> <p>9.7.1 Micenkova et al. 189</p> <p>9.7.2 Duan et al. 191</p> <p>9.7.3 Angiulli et al. 
192</p> <p>9.8 Case Study: Outlier Detection on Fed Communications Index 194</p> <p>9.9 Summary 201</p> <p>9.10 Appendix 202</p> <p>9.10.1 Model-Based Techniques 202</p> <p>9.10.2 Distance-Based Techniques 202</p> <p>9.10.3 Density-Based Techniques 203</p> <p>9.10.4 Heuristics-Based Approaches 203</p> <p><b>10 Automotive Fundamental Data 205</b></p> <p>10.1 Introduction 205</p> <p>10.2 Data 206</p> <p>10.3 Approach 1: Indirect Approach 211</p> <p>10.3.1 The Steps Followed 212</p> <p>10.3.2 Stage 1 213</p> <p>10.4 Approach 2: Direct Approach 223</p> <p>10.4.1 The Data 223</p> <p>10.4.2 Factor Generation 224</p> <p>10.4.3 Factor Performance 225</p> <p>10.4.4 Detailed Factor Results 229</p> <p>10.5 Gaussian Processes Example 238</p> <p>10.6 Summary 239</p> <p>10.7 Appendix 240</p> <p>10.7.1 List of Companies 240</p> <p>10.7.2 Description of Financial Statement Items 241</p> <p>10.7.3 Ratios Used 242</p> <p>10.7.4 IHS Markit Data Features 243</p> <p>10.7.5 Reporting Delays by Country 244</p> <p><b>11 Surveys and Crowdsourced Data 245</b></p> <p>11.1 Introduction 245</p> <p>11.2 Survey Data as Alternative Data 245</p> <p>11.3 The Data 247</p> <p>11.4 The Product 247</p> <p>11.5 Case Studies 249</p> <p>11.5.1 Case Study: Company Event Study (Pooled Survey) 249</p> <p>11.5.2 Case Study: Oil and Gas Production (Q&A Survey) 252</p> <p>11.6 Some Technical Considerations on Surveys 254</p> <p>11.7 Crowdsourcing Analyst Estimates Survey 255</p> <p>11.8 Alpha Capture Data 256</p> <p>11.9 Summary 256</p> <p>11.10 Appendix 256</p> <p><b>12 Purchasing Managers’ Index 259</b></p> <p>12.1 Introduction 259</p> <p>12.2 PMI Performance 261</p> <p>12.3 Nowcasting GDP Growth 262</p> <p>12.4 Impacts on Financial Markets 263</p> <p>12.5 Summary 266</p> <p><b>13 Satellite Imagery and Aerial Photography 267</b></p> <p>13.1 Introduction 267</p> <p>13.2 Forecasting US Export Growth 269</p> <p>13.3 Car Counts and Earnings Per Share for Retailers 271</p> <p>13.4 Measuring Chinese PMI 
Manufacturing with Satellite Data 277</p> <p>13.5 Summary 280</p> <p><b>14 Location Data 283</b></p> <p>14.1 Introduction 283</p> <p>14.2 Shipping Data to Track Crude Oil Supplies 283</p> <p>14.3 Mobile Phone Location Data to Understand Retail Activity 287</p> <p>14.3.1 Trading REIT ETF Using Mobile Phone Location Data 288</p> <p>14.3.2 Estimating Earnings per Share with Mobile Phone Location Data 291</p> <p>14.4 Taxi Ride Data and New York Fed Meetings 295</p> <p>14.5 Corporate Jet Location Data and M&A 296</p> <p>14.6 Summary 298</p> <p><b>15 Text, Web, Social Media, and News 299</b></p> <p>15.1 Introduction 299</p> <p>15.2 Collecting Web Data 299</p> <p>15.3 Social Media 300</p> <p>15.3.1 Hedonometer Index 302</p> <p>15.3.2 Using Twitter Data to Help Forecast US Change in Nonfarm Payrolls 305</p> <p>15.3.3 Twitter Data to Forecast Stock Market Reaction to FOMC 308</p> <p>15.3.4 Liquidity and Sentiment from Social Media 309</p> <p>15.4 News 309</p> <p>15.4.1 Machine-Readable News to Trade FX and Understand FX Volatility 310</p> <p>15.4.2 Federal Reserve Communications and US Treasury Yields 316</p> <p>15.5 Other Web Sources 320</p> <p>15.5.1 Measuring Consumer Price Inflation 321</p> <p>15.6 Summary 322</p> <p><b>16 Investor Attention 323</b></p> <p>16.1 Introduction 323</p> <p>16.2 Readership of Payrolls to Measure Investor Attention 323</p> <p>16.3 Google Trends Data to Measure Market Themes 325</p> <p>16.4 Investopedia Search Data to Measure Investor Anxiety 328</p> <p>16.5 Using Wikipedia to Understand Price Action in Cryptocurrencies 330</p> <p>16.6 Online Attention for Countries to Inform EMFX Trading 330</p> <p>16.7 Summary 333</p> <p><b>17 Consumer Transactions 335</b></p> <p>17.1 Introduction 335</p> <p>17.2 Credit and Debit Card Transaction Data 336</p> <p>17.3 Consumer Receipts 337</p> <p>17.4 Summary 340</p> <p><b>18 Government, Industrial, and Corporate Data 341</b></p> <p>18.1 Introduction 341</p> <p>18.2 Using Innovation Measures to Trade Equities 
342</p> <p>18.3 Quantifying Currency Crisis Risk 344</p> <p>18.4 Modeling Central Bank Intervention in Currency Markets 346</p> <p>18.5 Summary 348</p> <p><b>19 Market Data 351</b></p> <p>19.1 Introduction 351</p> <p>19.2 Relationship between Institutional FX Flow Data and FX Spot 351</p> <p>19.3 Understanding Liquidity Using High-Frequency FX Data 355</p> <p>19.4 Summary 357</p> <p><b>20 Alternative Data in Private Markets 359</b></p> <p>20.1 Introduction 359</p> <p>20.2 Defining Private Equity and Venture Capital Firms 360</p> <p>20.3 Private Equity Datasets 362</p> <p>20.4 Understanding the Performance of Private Firms 363</p> <p>20.5 Summary 364</p> <p><b>Conclusions 365</b></p> <p>Some Last Words 365</p> <p>References 367</p> <p>About the Authors 373</p> <p>Index 375</p>
<p><b>ALEXANDER DENEV</b> is Head of AI, Financial Services - Risk Advisory at Deloitte LLP. Prior to that he led Quantitative Research & Advanced Analytics at IHS Markit. Previously, he held roles at the Royal Bank of Scotland, Societe Generale, and European Investment Bank. Denev is a visiting lecturer at the University of Oxford, where he graduated with a degree in Mathematical Finance. He is the author of numerous papers and books on novel methods of financial modeling with applications ranging from stress testing to asset allocation.</p> <p><b>SAEED AMEN</b> is the founder of Cuemacro, where he consults on systematic trading. For 15 years, he has developed systematic trading strategies and quantitative indices, including at the major investment banks Lehman Brothers and Nomura. He is also a visiting lecturer at Queen Mary University of London and a co-founder of the Thalesians, a quant think tank.</p>
<p><b>Praise for The Book of Alternative Data</b> <p>"Alternative data is one of the hottest topics in the investment management industry today. Whether it is used to forecast global economic growth in real-time, parse the entrails of a company with more granularity than that offered by a quarterly report, or to better understand stock market behavior, alternative data is something that everyone in asset management needs to get to grips with. Alexander Denev and Saeed Amen are able guides to a convoluted subject with many pitfalls, both technical and theoretical, even for those that still think Python is a snake best avoided."<br> <b> —Robin Wigglesworth, Global Finance Correspondent,<i> Financial Times</i></b> <p>"Congratulations to the authors for producing such a timely, comprehensive, and accessible discussion of alternative data. As we move further into the 21st Century, this book will rapidly become the go-to work on the subject."<br> <b> —David Hand, Senior Research Investigator and Emeritus Professor of Mathematics, Imperial College London</b> <p>"Over the last decade, Alternative Data has become central to the quest for temporary monopoly of information. Yet, despite its frequent use, little has been written about the end-to-end pipeline necessary to extract value. This book fills the omission, providing not just practical overviews of machine learning methods and data sources, but placing as much importance on data ingestion, preparation, and pre-processing as on the models that map to outcomes. The authors do not consider methodology alone, but also provide insightful case studies, practical examples and highlight the importance of cost-benefit analysis throughout. 
For value extraction from Alternative Data, they provide informed insights and deep conceptual understanding—crucial if we are to successfully embed such technology at the heart of trading."<br> <b> —Stephen Roberts, Royal Academy of Engineering and Man Group Professor of Machine Learning, University of Oxford, UK; director, Oxford-Man Institute of Quantitative Finance</b> <p>"True investment outperformance comes from the triad of data + machine learning + supercomputing. Alexander Denev and Saeed Amen have written the first comprehensive exposition of alternative data, revealing sources of alpha that are not tapped by structured datasets. Asset managers unfamiliar with the contents of this book are not earning the fees they charge to investors."<br> <b> —Dr. Marcos López de Prado, Professor of Practice, Cornell University; CIO, True Positive Technologies LP</b> <p>"Alexander and Saeed have written an important book about an important topic. I am involved with alternative data every day, but I still enjoyed the perspectives in the book and learned a lot. I highly recommend it to everybody looking to harness the power of alt data (and avoid the pitfalls!)."<br> <b> —Jens Nordvig, Founder and CEO, Exante Data</b>
