The Data Bonanza



Improving Knowledge Discovery in Science, Engineering, and Business
Wiley Series on Parallel and Distributed Computing, Volume 90, 1st edition

by: Malcolm Atkinson, Rob Baxter, Peter Brezany, Oscar Corcho, Michelle Galea, Mark Parsons, David Snelling, Jano van Hemert

99,99 €

Publisher: Wiley
Format: PDF
Published: 19 March 2013
ISBN/EAN: 9781118540244
Language: English
Number of pages: 576

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Descriptions

<p><b>Complete guidance for mastering the tools and techniques of the digital revolution</b></p> <p>With the digital revolution opening up tremendous opportunities in many fields, there is a growing need for skilled professionals who can develop data-intensive systems and extract information and knowledge from them. This book frames for the first time a new systematic approach for tackling the challenges of data-intensive computing, providing decision makers and technical experts alike with practical tools for dealing with our exploding data collections.</p> <p>Emphasizing data-intensive thinking and interdisciplinary collaboration, <i>The Data Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business</i> examines the essential components of knowledge discovery, surveys many of the current research efforts worldwide, and points to new areas for innovation. Complete with a wealth of examples and DISPEL-based methods demonstrating how to gain more from data in real-world systems, the book:</p> <ul> <li>Outlines the concepts and rationale for implementing data-intensive computing in organizations</li> <li>Covers from the ground up problem-solving strategies for data analysis in a data-rich world</li> <li>Introduces techniques for data-intensive engineering using the Data-Intensive Systems Process Engineering Language DISPEL</li> <li>Features in-depth case studies in customer relations, environmental hazards, seismology, and more</li> <li>Showcases successful applications in areas ranging from astronomy and the humanities to transport engineering</li> <li>Includes sample program snippets throughout the text as well as additional materials on a companion website</li> </ul> <p><i>The Data Bonanza</i> is a must-have guide for information strategists, data analysts, and engineers in business, research, and government, and for anyone wishing to be on the cutting edge of data mining, machine learning, databases, distributed systems, or large-scale 
computing.</p>
<p>CONTRIBUTORS xv</p> <p>FOREWORD xvii</p> <p>PREFACE xix</p> <p>THE EDITORS xxix</p> <p><b>PART I STRATEGIES FOR SUCCESS IN THE DIGITAL-DATA REVOLUTION 1</b></p> <p><b>1. The Digital-Data Challenge 5</b><br /> <i>Malcolm Atkinson and Mark Parsons</i></p> <p>1.1 The Digital Revolution 5</p> <p>1.2 Changing How We Think and Behave 6</p> <p>1.3 Moving Adroitly in this Fast-Changing Field 8</p> <p>1.4 Digital-Data Challenges Exist Everywhere 8</p> <p>1.5 Changing How We Work 9</p> <p>1.6 Divide and Conquer Offers the Solution 10</p> <p>1.7 Engineering Data-to-Knowledge Highways 12</p> <p><b>2. The Digital-Data Revolution 15</b><br /> <i>Malcolm Atkinson</i></p> <p>2.1 Data, Information, and Knowledge 16</p> <p>2.2 Increasing Volumes and Diversity of Data 18</p> <p>2.3 Changing the Ways We Work with Data 28</p> <p><b>3. The Data-Intensive Survival Guide 37</b><br /> <i>Malcolm Atkinson</i></p> <p>3.1 Introduction: Challenges and Strategy 38</p> <p>3.2 Three Categories of Expert 39</p> <p>3.3 The Data-Intensive Architecture 41</p> <p>3.4 An Operational Data-Intensive System 42</p> <p>3.5 Introducing DISPEL 44</p> <p>3.6 A Simple DISPEL Example 45</p> <p>3.7 Supporting Data-Intensive Experts 47</p> <p>3.8 DISPEL in the Context of Contemporary Systems 48</p> <p>3.9 Datascopes 51</p> <p>3.10 Ramps for Incremental Engagement 54</p> <p>3.11 Readers’ Guide to the Rest of This Book 56</p> <p><b>4. Data-Intensive Thinking with DISPEL 61</b><br /> <i>Malcolm Atkinson</i></p> <p>4.1 Processing Elements 62</p> <p>4.2 Connections 64</p> <p>4.3 Data Streams and Structure 65</p> <p>4.4 Functions 66</p> <p>4.5 The Three-Level Type System 72</p> <p>4.6 Registry, Libraries, and Descriptions 81</p> <p>4.7 Achieving Data-Intensive Performance 86</p> <p>4.8 Reliability and Control 108</p> <p>4.9 The Data-to-Knowledge Highway 116</p> <p><b>PART II DATA-INTENSIVE KNOWLEDGE DISCOVERY 123</b></p> <p><b>5. 
Data-Intensive Analysis 127</b><br /> <i>Oscar Corcho and Jano van Hemert</i></p> <p>5.1 Knowledge Discovery in Telco Inc. 128</p> <p>5.2 Understanding Customers to Prevent Churn 130</p> <p>5.3 Preventing Churn Across Multiple Companies 134</p> <p>5.4 Understanding Customers by Combining Heterogeneous Public and Private Data 137</p> <p>5.5 Conclusions 144</p> <p><b>6. Problem Solving in Data-Intensive Knowledge Discovery 147</b><br /> <i>Oscar Corcho and Jano van Hemert</i></p> <p>6.1 The Conventional Life Cycle of Knowledge Discovery 148</p> <p>6.2 Knowledge Discovery Over Heterogeneous Data Sources 155</p> <p>6.3 Knowledge Discovery from Private and Public, Structured and Nonstructured Data 158</p> <p>6.4 Conclusions 162</p> <p><b>7. Data-Intensive Components and Usage Patterns 165</b><br /> <i>Oscar Corcho</i></p> <p>7.1 Data Source Access and Transformation Components 166</p> <p>7.2 Data Integration Components 172</p> <p>7.3 Data Preparation and Processing Components 173</p> <p>7.4 Data-Mining Components 174</p> <p>7.5 Visualization and Knowledge Delivery Components 176</p> <p><b>8. Sharing and Reuse in Knowledge Discovery 181</b><br /> <i>Oscar Corcho</i></p> <p>8.1 Strategies for Sharing and Reuse 182</p> <p>8.2 Data Analysis Ontologies for Data Analysis Experts 185</p> <p>8.3 Generic Ontologies for Metadata Generation 188</p> <p>8.4 Domain Ontologies for Domain Experts 189</p> <p>8.5 Conclusions 190</p> <p><b>PART III DATA-INTENSIVE ENGINEERING 193</b></p> <p><b>9. Platforms for Data-Intensive Analysis 197</b><br /> <i>David Snelling</i></p> <p>9.1 The Hourglass Reprise 198</p> <p>9.2 The Motivation for a Platform 200</p> <p>9.3 Realization 201</p> <p><b>10. 
Definition of the DISPEL Language 203</b><br /> <i>Paul Martin and Gagarine Yaikhom</i></p> <p>10.1 A Simple Example 204</p> <p>10.2 Processing Elements 205</p> <p>10.3 Data Streams 213</p> <p>10.4 Type System 217</p> <p>10.5 Registration 222</p> <p>10.6 Packaging 224</p> <p>10.7 Workflow Submission 225</p> <p>10.8 Examples of DISPEL 227</p> <p>10.9 Summary 235</p> <p><b>11. DISPEL Development 237</b><br /> <i>Adrian Mouat and David Snelling</i></p> <p>11.1 The Development Landscape 237</p> <p>11.2 Data-Intensive Workbenches 239</p> <p>11.3 Data-Intensive Component Libraries 247</p> <p>11.4 Summary 248</p> <p><b>12. DISPEL Enactment 251</b><br /> <i>Chee Sun Liew, Amrey Krause, and David Snelling</i></p> <p>12.1 Overview of DISPEL Enactment 251</p> <p>12.2 DISPEL Language Processing 253</p> <p>12.3 DISPEL Optimization 255</p> <p>12.4 DISPEL Deployment 266</p> <p>12.5 DISPEL Execution and Control 268</p> <p><b>PART IV DATA-INTENSIVE APPLICATION EXPERIENCE 275</b></p> <p><b>13. The Application Foundations of DISPEL 277</b><br /> <i>Rob Baxter</i></p> <p>13.1 Characteristics of Data-Intensive Applications 277</p> <p>13.2 Evaluating Application Performance 280</p> <p>13.3 Reviewing the Data-Intensive Strategy 283</p> <p><b>14. Analytical Platform for Customer Relationship Management 287</b><br /> <i>Maciej Jarka and Mark Parsons</i></p> <p>14.1 Data Analysis in the Telecoms Business 288</p> <p>14.2 Analytical Customer Relationship Management 289</p> <p>14.3 Scenario 1: Churn Prediction 291</p> <p>14.4 Scenario 2: Cross Selling 293</p> <p>14.5 Exploiting the Models and Rules 296</p> <p>14.6 Summary: Lessons Learned 299</p> <p><b>15. 
Environmental Risk Management 301</b><br /> <i>Ladislav Hluchy, Ondrej Habala, Viet Tran, and Branislav Simo</i></p> <p>15.1 Environmental Modeling 302</p> <p>15.2 Cascading Simulation Models 303</p> <p>15.3 Environmental Data Sources and Their Management 305</p> <p>15.4 Scenario 1: ORAVA 309</p> <p>15.5 Scenario 2: RADAR 313</p> <p>15.6 Scenario 3: SVP 318</p> <p>15.7 New Technologies for Environmental Data Mining 321</p> <p>15.8 Summary: Lessons Learned 323</p> <p><b>16. Analyzing Gene Expression Imaging Data in Developmental Biology 327</b><br /> <i>Liangxiu Han, Jano van Hemert, Ian Overton, Paolo Besana, and Richard Baldock</i></p> <p>16.1 Understanding Biological Function 328</p> <p>16.2 Gene Image Annotation 330</p> <p>16.3 Automated Annotation of Gene Expression Images 331</p> <p>16.4 Exploitation and Future Work 341</p> <p>16.5 Summary 345</p> <p><b>17. Data-Intensive Seismology: Research Horizons 353</b><br /> <i>Michelle Galea, Andreas Rietbrock, Alessandro Spinuso, and Luca Trani</i></p> <p>17.1 Introduction 354</p> <p>17.2 Seismic Ambient Noise Processing 356</p> <p>17.3 Solution Implementation 358</p> <p>17.4 Evaluation 369</p> <p>17.5 Further Work 372</p> <p>17.6 Conclusions 373</p> <p><b>PART V DATA-INTENSIVE BEACONS OF SUCCESS 377</b></p> <p><b>18. Data-Intensive Methods in Astronomy 381</b><br /> <i>Thomas D. Kitching, Robert G. Mann, Laura E. Valkonen, Mark S. Holliman, Alastair Hume, and Keith T. Noddle</i></p> <p>18.1 Introduction 381</p> <p>18.2 The Virtual Observatory 382</p> <p>18.3 Data-Intensive Photometric Classification of Quasars 383</p> <p>18.4 Probing the Dark Universe with Weak Gravitational Lensing 387</p> <p>18.5 Future Research Issues 392</p> <p>18.6 Conclusions 392</p> <p><b>19. 
The World at One's Fingertips: Interactive Interpretation of Environmental Data 395</b><br /> <i>Jon Blower, Keith Haines, and Alastair Gemmell</i></p> <p>19.1 Introduction 395</p> <p>19.2 The Current State of the Art 397</p> <p>19.3 The Technical Landscape 401</p> <p>19.4 Interactive Visualization 403</p> <p>19.5 From Visualization to Intercomparison 406</p> <p>19.6 Future Development: The Environmental Cloud 409</p> <p>19.7 Conclusions 411</p> <p><b>20. Data-Driven Research in the Humanities—the DARIAH Research Infrastructure 417</b><br /> <i>Andreas Aschenbrenner, Tobias Blanke, Christiane Fritze, and Wolfgang Pempe</i></p> <p>20.1 Introduction 417</p> <p>20.2 The Tradition of Digital Humanities 420</p> <p>20.3 Humanities Research Data 422</p> <p>20.4 Use Case 426</p> <p>20.5 Conclusion and Future Development 429</p> <p><b>21. Analysis of Large and Complex Engineering and Transport Data 431</b><br /> <i>Jim Austin</i></p> <p>21.1 Introduction 431</p> <p>21.2 Applications and Challenges 432</p> <p>21.3 The Methods Used 434</p> <p>21.4 Future Developments 438</p> <p>21.5 Conclusions 439</p> <p>References 440</p> <p><b>22. Estimating Species Distributions—Across Space, Through Time, and with Features of the Environment 441</b><br /> <i>Steve Kelling, Daniel Fink, Wesley Hochachka, Ken Rosenberg, Robert Cook, Theodoros Damoulas, Claudio Silva, and William Michener</i></p> <p>22.1 Introduction 442</p> <p>22.2 Data Discovery, Access, and Synthesis 443</p> <p>22.3 Model Development 448</p> <p>22.4 Managing Computational Requirements 449</p> <p>22.5 Exploring and Visualizing Model Results 450</p> <p>22.6 Analysis Results 452</p> <p>22.7 Conclusion 454</p> <p><b>PART VI THE DATA-INTENSIVE FUTURE 459</b></p> <p><b>23. Data-Intensive Trends 461</b><br /> <i>Malcolm Atkinson and Paolo Besana</i></p> <p>23.1 Reprise 461</p> <p>23.2 Data-Intensive Applications 469</p> <p><b>24. 
Data-Rich Futures 477</b><br /> <i>Malcolm Atkinson</i></p> <p>24.1 Future Data Infrastructure 478</p> <p>24.2 Future Data Economy 485</p> <p>24.3 Future Data Society and Professionalism 489</p> <p>References 494</p> <p><b>Appendix A: Glossary 499</b><br /> <i>Michelle Galea and Malcolm Atkinson</i></p> <p><b>Appendix B: DISPEL Reference Manual 507</b><br /> <i>Paul Martin</i></p> <p><b>Appendix C: Component Definitions 531</b><br /> <i>Malcolm Atkinson and Chee Sun Liew</i></p> <p>INDEX 537</p>
<p><b>MALCOLM ATKINSON, PhD,</b> is Professor of e-Science in the School of Informatics at the University of Edinburgh in Scotland. He is also Data-Intensive Research Group leader, Director of the e-Science Institute, IT architect for the ADMIRE and VERCE EU projects and UK e-Science Envoy. Professor Atkinson has been leading research projects for several decades and served on many advisory bodies.</p>

You might also be interested in these products:

Nonparametric Regression Methods for Longitudinal Data Analysis
by: Hulin Wu, Jin-Ting Zhang
Price: 135,99 €
Statistics for Microarrays
by: Ernst Wit, John McClure
Price: 90,99 €
Statistics and the Evaluation of Evidence for Forensic Scientists
by: Colin Aitken, Franco Taroni
Price: 103,99 €