
Data Science with Semantic Technologies


Theory, Practice and Application
Advances in Intelligent and Scientific Computing, 1st ed.

by: Archana Patel, Narayan C. Debnath, Bharat Bhusan

173,99 €

Publisher: Wiley
Format: EPUB
Published: 26 October 2022
ISBN/EAN: 9781119865315
Language: English
Number of pages: 464

DRM-protected eBook; to read it you need, for example, Adobe Digital Editions and an Adobe ID.

Descriptions

<b>DATA SCIENCE WITH SEMANTIC TECHNOLOGIES</b> <p><b>This book serves as an important guide to applications of data science with semantic technologies for the upcoming generation, making it a unique resource for scholars, researchers, professionals, and practitioners in this field.</b> <p>To create intelligence in data science, it is necessary to use semantic technologies, which allow machine-readable representation of data. This intelligence uniquely identifies and connects data with common business terms, and it also enables users to communicate with data. Rather than merely structuring the data, semantic technologies help users understand its meaning through the concepts of semantics, ontology, OWL, linked data, and knowledge graphs. These technologies help organizations understand all their stored data, add value to it, and gain insights that were not available before. Because data is the most important asset of any organization, applying semantic technologies in data science is essential to meeting organizational needs. <p><i>Data Science with Semantic Technologies</i> provides a roadmap for deploying semantic technologies in the field of data science. It also highlights how data science enables users to create intelligence through these technologies by exploring the opportunities and addressing the challenges, both current and future. In addition, the book answers questions such as: Can semantic technologies facilitate data science? Which types of data science problems can semantic technologies tackle? How can data scientists benefit from these technologies? What is knowledge data science? How does knowledge data science relate to other domains? What is the role of semantic technologies in data science? What is the current progress and future of data science with semantic technologies? 
Which types of problems require the immediate attention of researchers? <p><b> Audience</b> <p>Researchers in the fields of data science, semantic technologies, artificial intelligence, big data, and other related domains, as well as industry professionals, software engineers/scientists, and project managers who are developing software for data science. Students across the globe will gain basic and advanced knowledge of the current state and potential future of data science.
<p>Preface xv</p> <p><b>1 A Brief Introduction and Importance of Data Science 1<br /></b><i>Karthika N., Sheela J. and Janet B.</i></p> <p>1.1 What is Data Science? What Does a Data Scientist Do? 2</p> <p>1.2 Why Data Science is in Demand? 2</p> <p>1.3 History of Data Science 4</p> <p>1.4 How Does Data Science Differ from Business Intelligence? 9</p> <p>1.5 Data Science Life Cycle 11</p> <p>1.6 Data Science Components 13</p> <p>1.7 Why Data Science is Important 14</p> <p>1.8 Current Challenges 15</p> <p>1.8.1 Coordination, Collaboration, and Communication 16</p> <p>1.8.2 Building Data Analytics Teams 16</p> <p>1.8.3 Stakeholders vs Analytics 17</p> <p>1.8.4 Driving with Data 17</p> <p>1.9 Tools Used for Data Science 19</p> <p>1.10 Benefits and Applications of Data Science 28</p> <p>1.11 Conclusion 28</p> <p>References 29</p> <p><b>2 Exploration of Tools for Data Science 31<br /></b><i>Qasem Abu Al-Haija</i></p> <p>2.1 Introduction 32</p> <p>2.2 Top Ten Tools for Data Science 35</p> <p>2.3 Python for Data Science 35</p> <p>2.3.1 Python Datatypes 36</p> <p>2.3.2 Helpful Rules for Python Programming 37</p> <p>2.3.3 Jupyter Notebook for IPython 37</p> <p>2.3.4 Your First Python Program 38</p> <p>2.4 R Language for Data Science 39</p> <p>2.4.1 R Datatypes 39</p> <p>2.4.2 Your First R Program 41</p> <p>2.5 SQL for Data Science 44</p> <p>2.6 Microsoft Excel for Data Science 48</p> <p>2.6.1 Detection of Outliers in Data Sets Using Microsoft Excel 48</p> <p>2.6.2 Regression Analysis in Excel Using Microsoft Excel 50</p> <p>2.7 D3.JS for Data Science 57</p> <p>2.8 Other Important Tools for Data Science 58</p> <p>2.8.1 Apache Spark Ecosystem 58</p> <p>2.8.2 MongoDB Data Store System 60</p> <p>2.8.3 MATLAB Computing System 62</p> <p>2.8.4 Neo4j for Graphical Database 63</p> <p>2.8.5 VMWare Platform for Virtualization 65</p> <p>2.9 Conclusion 66</p> <p>References 68</p> <p><b>3 Data Modeling as Emerging Problems of Data Science 71<br /></b><i>Mahyuddin K. M. 
Nasution and Marischa Elveny</i></p> <p>3.1 Introduction 72</p> <p>3.2 Data 72</p> <p>3.2.1 Unstructured Data 74</p> <p>3.2.2 Semistructured Data 74</p> <p>3.2.3 Structured Data 76</p> <p>3.2.4 Hybrid (Un/Semi)-Structured Data 77</p> <p>3.2.5 Big Data 78</p> <p>3.3 Data Model Design 79</p> <p>3.4 Data Modeling 81</p> <p>3.4.1 Records-Based Data Model 81</p> <p>3.4.2 Non–Record-Based Data Model 84</p> <p>3.5 Polyglot Persistence Environment 87</p> <p>References 88</p> <p><b>4 Data Management as Emerging Problems of Data Science 91<br /></b><i>Mahyuddin K. M. Nasution and Rahmad Syah</i></p> <p>4.1 Introduction 92</p> <p>4.2 Perspective and Context 92</p> <p>4.2.1 Life Cycle 93</p> <p>4.2.2 Use 95</p> <p>4.3 Data Distribution 98</p> <p>4.4 CAP Theorem 100</p> <p>4.5 Polyglot Persistence 101</p> <p>References 102</p> <p><b>5 Role of Data Science in Healthcare 105<br /></b><i>Anidha Arulanandham, A. Suresh and Senthil Kumar R.</i></p> <p>5.1 Predictive Modeling—Disease Diagnosis and Prognosis 106</p> <p>5.1.1 Supervised Machine Learning Models 107</p> <p>5.1.2 Clustering Models 110</p> <p>5.1.2.1 Centroid-Based Clustering Models 110</p> <p>5.1.2.2 Expectation Maximization (EM) Algorithm 110</p> <p>5.1.2.3 DBSCAN 111</p> <p>5.1.3 Feature Engineering 111</p> <p>5.2 Preventive Medicine—Genetics/Molecular Sequencing 111</p> <p>5.2.1 Technologies for Sequencing 113</p> <p>5.2.2 Sequence Data Analysis with BioPython 114</p> <p>5.2.2.1 Sequence Data Formats 114</p> <p>5.2.2.2 BioPython 117</p> <p>5.3 Personalized Medicine 121</p> <p>5.4 Signature Biomarkers Discovery from High Throughput Data 122</p> <p>5.4.1 Methodology I — Novel Feature Selection Method with Improved Mutual Information and Fisher Score 123</p> <p>5.4.1.1 Algorithm for the Novel Feature Selection Method with Improved Mutual Information and Fisher Score 124</p> <p>5.4.1.2 Computing F-Score Values for the Features 125</p> <p>5.4.1.3 Block Diagram for the Method-1 125</p> <p>5.4.1.4 Data Set 126</p> <p>5.4.1.5 
Identification of Biomarkers Using the Feature Selection Technique-I 127</p> <p>5.4.2 Feature Selection Methodology-II — Entropy Based Mean Score with mRMR 128</p> <p>5.4.2.1 Algorithm for the Feature Selection Methodology-II 130</p> <p>5.4.2.2 Introduction to mRMR Feature Selection 132</p> <p>5.4.2.3 Data Sets 132</p> <p>5.4.2.4 Identification of Biomarkers Using Rank Product 133</p> <p>5.4.2.5 Fold Change Values 133</p> <p>Conclusion 136</p> <p>References 136</p> <p><b>6 Partitioned Binary Search Trees (P(h)-BST): A Data Structure for Computer RAM 139<br /></b><i>Pr. D.E Zegour</i></p> <p>6.1 Introduction 140</p> <p>6.2 P(h)-BST Structure 141</p> <p>6.2.1 Preliminary Analysis 143</p> <p>6.2.2 Terminology and Conventions 143</p> <p>6.3 Maintenance Operations 143</p> <p>6.3.1 Operations Inside a Class 145</p> <p>6.3.2 Operations Between Classes (Outside a Class) 148</p> <p>6.4 Insert and Delete Algorithms 153</p> <p>6.4.1 Inserting a New Element 153</p> <p>6.4.2 Deleting an Existing Element 157</p> <p>6.5 P(h)-BST as a Generator of Balanced Binary Search Trees 160</p> <p>6.6 Simulation Results 162</p> <p>6.6.1 Data Structures and Abstract Data Types 164</p> <p>6.6.2 Analyzing the Insert and Delete Process in Random Case 164</p> <p>6.6.3 Analyzing the Insert Process in Ascending (Descending) Case 168</p> <p>6.6.4 Comparing P(2)-BST/P(∞)-BST to Red-Black/AVL Trees 174</p> <p>6.7 Conclusion 175</p> <p>Acknowledgments 176</p> <p>References 176</p> <p><b>7 Security Ontologies: An Investigation of Pitfall Rate 179<br /></b><i>Archana Patel and Narayan C. 
Debnath</i></p> <p>7.1 Introduction 179</p> <p>7.2 Secure Data Management in the Semantic Web 184</p> <p>7.3 Security Ontologies in a Nutshell 187</p> <p>7.4 InFra_OE Framework 189</p> <p>7.5 Conclusion 193</p> <p>References 193</p> <p><b>8 IoT-Based Fully-Automated Fire Control System 199<br /></b><i>Lalit Mohan Satapathy</i></p> <p>8.1 Introduction 200</p> <p>8.2 Related Works 201</p> <p>8.3 Proposed Architecture 203</p> <p>8.4 Major Components 205</p> <p>8.4.1 Arduino UNO 205</p> <p>8.4.2 Temperature Sensor 207</p> <p>8.4.3 LCD Display (16X2) 208</p> <p>8.4.4 Temperature Humidity Sensor (DHT11) 209</p> <p>8.4.5 Moisture Sensor 210</p> <p>8.4.6 CO<sub>2</sub> Sensor 211</p> <p>8.4.7 Nitric Oxide Sensor 212</p> <p>8.4.8 CO Sensor (MQ-9) 212</p> <p>8.4.9 Global Positioning System (GPS) 212</p> <p>8.4.10 GSM Modem 213</p> <p>8.4.11 Photovoltaic System 214</p> <p>8.5 Hardware Interfacing 216</p> <p>8.6 Software Implementation 218</p> <p>8.7 Conclusion 222</p> <p>References 223</p> <p><b>9 Phrase Level-Based Sentiment Analysis Using Paired Inverted Index and Fuzzy Rule 225<br /></b><i>Sheela J., Karthika N. and Janet B.</i></p> <p>9.1 Introduction 226</p> <p>9.2 Literature Survey 228</p> <p>9.3 Methodology 233</p> <p>9.3.1 Construction of Inverted Wordpair Index 234</p> <p>9.3.1.1 Sentiment Analysis Design Framework 235</p> <p>9.3.1.2 Sentiment Classification 236</p> <p>9.3.1.3 Preprocessing of Data 237</p> <p>9.3.1.4 Algorithm to Find the Score 240</p> <p>9.3.1.5 Fuzzy System 240</p> <p>9.3.1.6 Lexicon-Based Sentiment Analysis 241</p> <p>9.3.1.7 Defuzzification 242</p> <p>9.3.2 Performance Metrics 243</p> <p>9.4 Conclusion 244</p> <p>References 244</p> <p><b>10 Semantic Technology Pillars: The Story So Far 247<br /></b><i>Michael DeBellis, Jans Aasman and Archana Patel</i></p> <p>10.1 The Road that Brought Us Here 248</p> <p>10.2 What is a Semantic Pillar? 
249</p> <p>10.2.1 Machine Learning 249</p> <p>10.2.2 The Semantic Approach 250</p> <p>10.3 The Foundation Semantic Pillars: IRI’s, RDF, and RDFS 252</p> <p>10.3.1 Internationalized Resource Identifier (IRI) 254</p> <p>10.3.2 Resource Description Framework (RDF) 254</p> <p>10.3.2.1 Alternative Technologies to RDF: Property Graphs 256</p> <p>10.3.3 RDF Schema (RDFS) 257</p> <p>10.4 The Semantic Upper Pillars: OWL, SWRL, SPARQL, and SHACL 259</p> <p>10.4.1 The Web Ontology Language (OWL) 260</p> <p>10.4.1.1 Axioms to Define Classes 262</p> <p>10.4.1.2 The Open World Assumption 263</p> <p>10.4.1.3 No Unique Names Assumption 263</p> <p>10.4.1.4 Serialization 264</p> <p>10.4.2 The Semantic Web Rule Language 264</p> <p>10.4.2.1 The Limitations of Monotonic Reasoning 267</p> <p>10.4.2.2 Alternatives to SWRL 267</p> <p>10.4.3 SPARQL 268</p> <p>10.4.3.1 The SERVICE Keyword and Linked Data 268</p> <p>10.4.4 SHACL 271</p> <p>10.4.4.1 The Fundamentals of SHACL 272</p> <p>10.5 Conclusion 274</p> <p>References 274</p> <p><b>11 Evaluating Richness of Security Ontologies for Semantic Web 277<br /></b><i>Ambrish Kumar Mishra, Narayan C. 
Debnath and Archana Patel</i></p> <p>11.1 Introduction 277</p> <p>11.2 Ontology Evaluation: State-of-the-Art 280</p> <p>11.2.1 Domain-Dependent Ontology Evaluation Tools 281</p> <p>11.2.2 Domain-Independent Ontology Evaluation Tools 282</p> <p>11.3 Security Ontology 284</p> <p>11.4 Richness of Security Ontologies 287</p> <p>11.5 Conclusion 295</p> <p>References 295</p> <p><b>12 Health Data Science and Semantic Technologies 299<br /></b><i>Haleh Ayatollahi</i></p> <p>12.1 Health Data 300</p> <p>12.2 Data Science 301</p> <p>12.3 Health Data Science 301</p> <p>12.4 Examples of Health Data Science Applications 304</p> <p>12.5 Health Data Science Challenges 306</p> <p>12.6 Health Data Science and Semantic Technologies 308</p> <p>12.6.1 Natural Language Processing (NLP) 309</p> <p>12.6.2 Clinical Data Sharing and Data Integration 310</p> <p>12.6.3 Ontology Engineering and Quality Assurance (QA) 311</p> <p>12.7 Application of Data Science for COVID-19 313</p> <p>12.8 Data Challenges During COVID-19 Outbreak 314</p> <p>12.9 Biomedical Data Science 315</p> <p>12.10 Conclusion 316</p> <p>References 317</p> <p><b>13 Hybrid Mixed Integer Optimization Method for Document Clustering Based on Semantic Data Matrix 323<br /></b><i>Tatiana Avdeenko and Yury Mezentsev</i></p> <p>13.1 Introduction 324</p> <p>13.2 A Method for Constructing a Semantic Matrix of Relations Between Documents and Taxonomy Concepts 327</p> <p>13.3 Mathematical Statements for Clustering Problem 330</p> <p>13.3.1 Mathematical Statements for PDC Clustering Problem 330</p> <p>13.3.2 Mathematical Statements for CC Clustering Problem 334</p> <p>13.3.3 Relations between PDC Clustering and CC Clustering 336</p> <p>13.4 Heuristic Hybrid Clustering Algorithm 340</p> <p>13.5 Application of a Hybrid Optimization Algorithm for Document Clustering 342</p> <p>13.6 Conclusion 344</p> <p>Acknowledgment 344</p> <p>References 344</p> <p><b>14 Role of Knowledge Data Science During COVID-19 Pandemic 347<br /></b><i>Veena Kumari 
H. M. and D. S. Suresh</i></p> <p>14.1 Introduction 348</p> <p>14.1.1 Global Health Emergency 350</p> <p>14.1.2 Timeline of the COVID-19 351</p> <p>14.2 Literature Review 354</p> <p>14.3 Model Discussion 356</p> <p>14.3.1 COVID-19 Time Series Dataset 357</p> <p>14.3.2 FBProphet Forecasting Model 358</p> <p>14.3.3 Data Preprocessing 360</p> <p>14.3.4 Data Visualization 360</p> <p>14.4 Results and Discussions 362</p> <p>14.4.1 Analysis and Forecasting: The World 362</p> <p>14.4.2 Performance Metrics 371</p> <p>14.4.3 Analysis and Forecasting: The Top 20 Countries 377</p> <p>14.5 Conclusion 388</p> <p>References 389</p> <p><b>15 Semantic Data Science in the COVID-19 Pandemic 393<br /></b><i>Michael DeBellis and Biswanath Dutta</i></p> <p>15.1 Crises Often Are Catalysts for New Technologies 393</p> <p>15.1.1 Definitions 394</p> <p>15.1.2 Methodology 395</p> <p>15.2 The Domains of COVID-19 Semantic Data Science Research 397</p> <p>15.2.1 Surveys 398</p> <p>15.2.2 Semantic Search 399</p> <p>15.2.2.1 Enhancing the CORD-19 Dataset with Semantic Data 399</p> <p>15.2.2.2 CORD-19-on-FHIR – Semantics for COVID-19 Discovery 400</p> <p>15.2.2.3 Semantic Search on Amazon Web Services (AWS) 400</p> <p>15.2.2.4 COVID*GRAPH 402</p> <p>15.2.2.5 Network Graph Visualization of CORD-19 403</p> <p>15.2.2.6 COVID-19 on the Web 404</p> <p>15.2.3 Statistics 405</p> <p>15.2.3.1 The Johns Hopkins COVID-19 Dashboard 405</p> <p>15.2.3.2 The NY Times Dataset 406</p> <p>15.2.4 Surveillance 406</p> <p>15.2.4.1 An IoT Framework for Remote Patient Monitoring 406</p> <p>15.2.4.2 Risk Factor Discovery 408</p> <p>15.2.4.3 COVID-19 Surveillance in a Primary Care Network 408</p> <p>15.2.5 Clinical Trials 409</p> <p>15.2.6 Drug Repurposing 411</p> <p>15.2.7 Vocabularies 414</p> <p>15.2.8 Data Analysis 415</p> <p>15.2.8.1 CODO 415</p> <p>15.2.8.2 COVID-19 Phenotypes 416</p> <p>15.2.8.3 Detection of “Fake News” 417</p> <p>15.2.8.4 Ontology-Driven Weak Supervision for Clinical Entity Classification 417</p> 
<p>15.2.9 Harmonization 418</p> <p>15.3 Discussion 418</p> <p>15.3.1 Privacy Issues 420</p> <p>15.3.2 Domains that May Currently be Under Utilized 421</p> <p>15.3.2.1 Detection of Fake News 421</p> <p>15.3.2.2 Harmonization 421</p> <p>15.3.3 Machine Learning and Semantic Technology: Synergy Not Competition 422</p> <p>15.3.4 Conclusion 423</p> <p>Acknowledgment 423</p> <p>References 423</p> <p>Index 427</p>
<p><b> Archana Patel, PhD,</b> is a faculty member in the Department of Software Engineering, School of Computing and Information Technology, Binh Duong Province, Vietnam. She completed her postdoctoral research at Freie Universität Berlin, Berlin, Germany. Dr. Patel has authored or co-authored more than 30 publications in refereed journals and conference proceedings. She has received the Best Paper award three times at international conferences. Her research interests are ontological engineering, the semantic web, big data, expert systems, and knowledge warehouses. </p> <p><b> Narayan C. Debnath, PhD,</b> is the Founding Dean of the School of Computing and Information Technology at Eastern International University, Vietnam. He also serves as the Head of the Department of Software Engineering at Eastern International University, Vietnam. Dr. Debnath has been the Director of the International Society for Computers and their Applications (ISCA), USA since 2014. Formerly, Dr. Debnath served as a Full Professor of Computer Science at Winona State University, Minnesota, USA for 28 years. <p><b> Bharat Bhusan, PhD,</b> is an assistant professor in the Department of Computer Science and Engineering, School of Engineering and Technology, Sharda University, India. In the last three years, he has published more than 80 research papers in international conferences and SCI-indexed journals and has edited 11 books.
