Details

Data Analysis Using SQL and Excel


2nd ed.

by: Gordon S. Linoff

39,99 €

Publisher: Wiley
Format: EPUB
Published: 3 December 2015
ISBN/EAN: 9781119021445
Language: English
Number of pages: 800

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Descriptions

<b>A practical guide to data mining using SQL and Excel</b> <p><i>Data Analysis Using SQL and Excel, 2nd Edition</i> shows you how to leverage the two most popular tools for data query and analysis—SQL and Excel—to perform sophisticated data analysis without the need for complex and expensive data mining tools. Written by a leading expert on business data mining, this book shows you how to extract useful business information from relational databases. You'll learn the fundamental techniques before moving into the "where" and "why" of each analysis, and then learn how to design and perform these analyses using SQL and Excel. Examples include SQL and Excel code, and the appendix shows how non-standard constructs are implemented in other major databases, including Oracle and IBM DB2/UDB. The companion website includes datasets and Excel spreadsheets, and the book provides hints, warnings, and technical asides to help you every step of the way.</p> <p><i>Data Analysis Using SQL and Excel, 2nd Edition</i> shows you how to perform a wide range of sophisticated analyses using these simple tools, sparing you the significant expense of proprietary data mining tools like SAS.</p> <ul> <li>Understand core analytic techniques that work with SQL and Excel</li> <li>Ensure your analytic approach gets you the results you need</li> <li>Design and perform your analysis using SQL and Excel</li> </ul> <p><i>Data Analysis Using SQL and Excel, 2nd Edition</i> shows you how to best use the tools you already know to achieve expert results.</p>
<p>Foreword</p> <p>Introduction</p>
<p><b>Chapter 1 A Data Miner Looks at SQL</b></p> <p>Databases, SQL, and Big Data</p> <p>Picturing the Structure of the Data</p> <p>Picturing Data Analysis Using Dataflows</p> <p>SQL Queries</p> <p>Subqueries and Common Table Expressions Are Our Friends</p> <p>Lessons Learned</p>
<p><b>Chapter 2 What’s in a Table? Getting Started with Data Exploration</b></p> <p>What Is Data Exploration?</p> <p>Excel for Charting</p> <p>Sparklines</p> <p>What Values Are in the Columns?</p> <p>More Values to Explore: Min, Max, and Mode</p> <p>Exploring String Values</p> <p>Exploring Values in Two Columns</p> <p>From Summarizing One Column to Summarizing All Columns</p> <p>Lessons Learned</p>
<p><b>Chapter 3 How Different Is Different?</b></p> <p>Basic Statistical Concepts</p> <p>How Different Are the Averages?</p> <p>Sampling from a Table</p> <p>Counting Possibilities</p> <p>Ratios and Their Statistics</p> <p>Chi-Square</p> <p>What Months and Payment Types Have Unusual Affinities for Which Types of Products?</p> <p>Lessons Learned</p>
<p><b>Chapter 4 Where Is It All Happening? Location, Location, Location</b></p> <p>Latitude and Longitude</p> <p>Census Demographics</p> <p>Geographic Hierarchies</p> <p>Mapping in Excel</p> <p>Lessons Learned</p>
<p><b>Chapter 5 It’s a Matter of Time</b></p> <p>Dates and Times in Databases</p> <p>Starting to Investigate Dates</p> <p>How Long Between Two Dates?</p> <p>Year-over-Year Comparisons</p> <p>Counting Active Customers by Day</p> <p>Simple Chart Animation in Excel</p> <p>Lessons Learned</p>
<p><b>Chapter 6 How Long Will Customers Last? Survival Analysis to Understand Customers and Their Value</b></p> <p>Background on Survival Analysis</p> <p>The Hazard Calculation</p> <p>Survival and Retention</p> <p>Comparing Different Groups of Customers</p> <p>Comparing Survival over Time</p> <p>Important Measures Derived from Survival</p> <p>Using Survival for Customer Value Calculations</p> <p>Forecasting</p> <p>Lessons Learned</p>
<p><b>Chapter 7 Factors Affecting Survival: The What and Why of Customer Tenure</b></p> <p>Which Factors Are Important and When</p> <p>Left Truncation</p> <p>Time Windowing</p> <p>Competing Risks</p> <p>Before and After</p> <p>Lessons Learned</p>
<p><b>Chapter 8 Customer Purchases and Other Repeated Events</b></p> <p>Identifying Customers</p> <p>RFM Analysis</p> <p>Which Households Are Increasing Purchase Amounts Over Time?</p> <p>Time to Next Event</p> <p>Lessons Learned</p>
<p><b>Chapter 9 What’s in a Shopping Cart? Market Basket Analysis</b></p> <p>Exploring the Products</p> <p>Products and Customer Worth</p> <p>Product Geographic Distribution</p> <p>Which Customers Have Particular Products?</p> <p>Lessons Learned</p>
<p><b>Chapter 10 Association Rules and Beyond</b></p> <p>Item Sets</p> <p>The Simplest Association Rules</p> <p>One-Way Association Rules</p> <p>Two-Way Associations</p> <p>Extending Association Rules</p> <p>Lessons Learned</p>
<p><b>Chapter 11 Data Mining Models in SQL</b></p> <p>Introduction to Directed Data Mining</p> <p>Look-Alike Models</p> <p>Lookup Model for Most Popular Product</p> <p>Lookup Model for Order Size</p> <p>Lookup Model for Probability of Response</p> <p>Naive Bayesian Models (Evidence Models)</p> <p>Lessons Learned</p>
<p><b>Chapter 12 The Best-Fit Line: Linear Regression Models</b></p> <p>The Best-Fit Line</p> <p>Measuring Goodness of Fit Using R<sup>2</sup></p> <p>Direct Calculation of Best-Fit Line Coefficients</p> <p>Weighted Linear Regression</p> <p>More Than One Input Variable</p> <p>Lessons Learned</p>
<p><b>Chapter 13 Building Customer Signatures for Further Analysis</b></p> <p>What Is a Customer Signature?</p> <p>Designing Customer Signatures</p> <p>Operations to Build Customer Signatures</p> <p>Extracting Features</p> <p>Summarizing Customer Behaviors</p> <p>Lessons Learned</p>
<p><b>Chapter 14 Performance Is the Issue: Using SQL Effectively</b></p> <p>Query Engines and Performance</p> <p>Considerations When Thinking About Performance</p> <p>Performance: Its Meaning and Measurement</p> <p>Performance Improvement 101</p> <p>Using Indexes Effectively</p> <p>When OR Is a Bad Thing</p> <p>Pros and Cons: Different Ways of Expressing the Same Thing</p> <p>Window Functions</p> <p>Lessons Learned</p>
<p>Appendix Equivalent Constructs Among Databases</p> <p>Index</p>
<p><b>GORDON S. LINOFF</b> has been working with databases for more decades than he cares to admit. He started learning about SQL by memorizing the SQL 92 standard while leading a development team (at the now-defunct Thinking Machines Corporation) writing the first high-performance database focused on the complex queries needed for decision support.</p>
<p>After that endeavor, in 1998, Gordon co-founded Data Miners, a consulting practice devoted to data mining, analytics, and big data. A constant theme in his work is data, often data in relational databases. His SQL skills have only gotten stronger over the years. In 2014, he was the top contributor to Stack Overflow, the leading question-and-answer site for technical questions.</p>
<p>His other books include the bestselling <i>Data Mining Techniques, Third Edition</i>; <i>Mastering Data Mining</i>; and <i>Mining the Web</i>, all of which focus on data mining and analysis. This book follows on the popularity of the first edition, with a practical focus on how to actually get and interpret results.</p>
<p><b>Learn to perform sophisticated data analysis using SQL and Excel</b></p>
<p>SQL is the essential language for querying databases, and Excel is the most popular tool for data presentation and analysis. Combined, they create a powerful, accessible tool for business data analysis. Many important types of analysis do not require complex and expensive data mining tools. The answers are on your desktop.</p>
<p>This no-nonsense guide, written by a leading expert on business data mining, shows you how to design and perform sophisticated data analysis using SQL and Excel. The highly regarded first edition has been revised to cover the newest enhancements to SQL and Excel, including new techniques and real-world examples. This edition features the up-to-date information business managers and data analysts need.</p>
<p>The book begins with the basics of SQL for data mining, Excel to present results, and simple ideas from statistics to understand your data. Core analytic techniques are explained as you learn to run them on real data using Excel and SQL. The chapters progress from basic queries to increasingly detailed applications as you learn why and when to perform specific types of analysis, how to design and perform them, and powerful ways of presenting the results. Each step explains the business context, the technical approach, and the implementation in these familiar tools.</p>
<p>As you progress, you'll discover the importance of geography, how to chart changes in data over time, how to use survival analysis to understand customer tenure and churn, and the factors that affect survival. You will explore methods for analyzing customer purchase patterns, market basket analysis, and association rules. Also included are important data mining models in SQL; linear regression models; naive Bayesian models; guidance on building a customer signature; methods for analyzing results, including cumulative gains charts and ROC charts; and best practices for using SQL and getting the best performance from your queries.</p>
<p>With more than 100 pages of new material, the fully revised second edition of <i>Data Analysis Using SQL and Excel</i> enables you to:</p>
<ul> <li>Understand core analytic techniques that work with SQL and Excel</li> <li>Analyze and interpret data in a table</li> <li>Present data professionally in Excel charts</li> <li>Apply the chi-square measure and other important statistical techniques in both SQL and Excel</li> <li>Understand best practices for SQL queries, with a chapter devoted to performance</li> <li>Use survival analysis to understand time-to-event problems, both for single events and for repeated events</li> <li>Use market basket analysis to understand purchasing behavior</li> <li>Identify the analytic approach that gets the result you're looking for</li> <li>Avoid common pitfalls</li> <li>Maximize the value of the data you have about your customers and your business</li> </ul>
<p>The companion website includes datasets for all examples in the book as well as related Excel spreadsheets.</p>
<p>www.wiley.com/go/dataanalysisusingsqlandexcel2e</p>
