Advances in Data Science and Analytics

Edited by M. Niranjanamurthy, Hemant Kumar Gianey, and Amir H. Gandomi
Series: Advances in Data Engineering and Machine Learning
Copyright: 2023   |   Status: Published
ISBN: 9781119791881  |  Hardcover  |  327 pages
Price: $225 USD

One Line Description
Written and edited by a global team of experts, this volume presents the concepts and advances of data science and analytics, along with practical applications that can be used across multiple disciplines and industries by both the engineer and the student, focusing on machine learning, big data, business intelligence, and analytics.


Audience
Professionals, researchers, software developers, managers, policymakers, data scientists, instructors, and students in computer science and engineering, data analytics, electronics, communication engineering, information technology, and other developers and professionals across multiple industries

Description
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data. Data science is related to data mining, deep learning, and big data. Data analytics is a more focused version of this and can even be considered part of the larger process. Analytics is devoted to realizing actionable insights that can be applied immediately based on existing queries. For the purposes of this volume, data science is an umbrella term that encompasses data analytics, data mining, machine learning, and several other related disciplines. While a data scientist is expected to forecast the future based on past patterns, data analysts extract meaningful insights from various data sources.

Although data mining and other related areas have been around for a few decades, data science and analytics are still evolving quickly, and the processes and technologies change almost daily. This volume provides an overview of some of the most important advances in these areas today, including practical coverage of everyday applications. Valuable as a learning tool for beginners in this area as well as a daily reference for engineers and scientists working in these areas, this is a must-have for any library.


Author / Editor Details
M. Niranjanamurthy, PhD, is an assistant professor in the Department of Computer Applications, M S Ramaiah Institute of Technology, Bangalore, Karnataka. He earned his PhD in computer science at JJTU. He has over 10 years of teaching experience and two years of industry experience as a software engineer. He has published four books and 56 papers in technical journals and conferences. He has two patents to his credit and has won numerous awards.

Hemant Kumar Gianey, PhD, is an assistant professor at Vellore University, India. He obtained his PhD from Rajasthan and was a post-doc in Computer Science and Engineering at the National Cheng Kung University, Taiwan. He has over 15 years of teaching and industry experience and has also published papers in and acted as a reviewer and guest editor for several technical and scientific journals.

Amir H. Gandomi, PhD, is a professor of data science in the Department of Engineering and Information Technology, University of Technology Sydney. Prior to joining UTS, he was an assistant professor at the School of Business, Stevens Institute of Technology, NJ, and a distinguished research fellow at the BEACON center, Michigan State University. He has published over 150 journal papers and four books, which collectively have been cited more than 14,000 times. He has been named one of the world’s most influential scientific minds and a Highly Cited Researcher (top 1%) for three consecutive years, from 2017 to 2019. He has also served as associate editor, editor, and guest editor for several prestigious journals and has delivered several keynote talks. He is also part of a NASA technology cluster on Big Data, Artificial Intelligence, and Machine Learning.

Table of Contents
Preface
1. Implementation Tools for Generating Statistical Consequence Using Data Visualization Techniques
Dr. Ajay B. Gadicha, Dr. Vijay B Gadicha, Prof. Sneha Bohra and Dr. Niranjanamurthy M
1.1 Introduction
1.2 Literature Review
1.3 Tools in Data Visualization
1.4 Methodology
1.4.1 Plotting the Data
1.4.2 Plotting the Model on Data
1.4.3 Quantifying Linear Relationships
1.4.4 Covariance vs. Correlation
1.5 Conclusion
References
2. Decision Making and Predictive Analysis for Real Time Data
Umesh Pratap Singh
2.1 Introduction
2.2 Data Analytics
2.2.1 Descriptive Analytics
2.2.2 Diagnostic Analytics
2.2.3 Predictive Analytics
2.2.4 Prescriptive Analytics
2.3 Predictive Modeling
2.4 Categories of Predictive Models
2.5 Process of Predictive Modeling
2.5.1 Requirement Gathering
2.5.2 Data Gathering
2.5.3 Data Analysis and Massaging
2.5.4 Machine Learning Statistics
2.5.5 Predictive Modeling
2.5.6 Prediction and Decision Making
2.6 Predictive Analytics Opportunities
2.6.1 Detecting Fraud
2.6.2 Reduction of Risk
2.6.3 Marketing Campaign Optimization
2.6.4 Operation Improvement
2.6.5 Clinical Decision Support System
2.7 Classification of Predictive Analytics Models
2.7.1 Predictive Models
2.7.2 Descriptive Models
2.7.3 Decision Models
2.8 Predictive Analytics Techniques
2.8.1 Predictive Analytics Software
2.8.2 The Importance of Good Data
2.8.3 Predictive Analytics vs. Business Intelligence
2.8.4 Pricing Information
2.9 Data Analysis Tools
2.9.1 Excel
2.9.2 Tableau
2.9.3 Power BI
2.9.4 Fine Report
2.9.5 R & Python
2.10 Advantages & Disadvantages of Predictive Modeling
2.10.1 Advantages
2.10.2 Disadvantages
2.10.2.1 Data Labeling
2.10.2.2 Obtaining Massive Training Datasets
2.10.2.3 The Explainability Problem
2.10.2.4 Generalizability of Learning
2.10.2.5 Bias in Algorithms and Data
2.11 Predictive Analytics Biggest Impact
2.11.1 Predicting Demand
2.11.2 Transformation using Technology and Process
2.11.3 Improved Pricing
2.11.4 Predictive Maintenance
2.12 Application of Predictive Analytics
2.12.1 Financial and Banking Services
2.12.2 Retail
2.12.3 Health and Insurance
2.12.4 Oil and Gas Utilities
2.12.5 Public Sector
2.13 Future Scope of Predictive Modeling
2.13.1 Technological Advancements
2.13.2 Changes in Work
2.13.3 Risk Mitigation
2.14 Conclusion
References
3. Optimizing Water Quality with Data Analytics and Machine Learning
Bin Liang, Zhidong Li, Hongda Tian, Shuming Liang, Yang Wang and Fang Chen
3.1 Introduction
3.2 Related Work
3.3 Data Sources and Collection
3.4 Water Demand Forecasting
3.4.1 Network Flow and Zone Demand Estimation
3.4.2 Demand Forecasting
3.4.2.1 Feature Importance
3.4.2.2 Forecast Horizon
3.4.3 Performance Characterization
3.5 Re-Chlorination Optimization
3.5.1 Data
3.5.2 Water Age Estimation
3.5.2.1 Travel Time Estimation
3.5.2.2 Residential Time Estimation
3.5.3 Ammonia Prediction
3.5.4 Optimization Model Definition
3.5.5 Improvements in Customer Water Quality
3.5.6 Plant Dosing Optimization
3.6 Conclusion
Acknowledgements
Reference
4. Lip Reading Framework using Deep Learning and Machine Learning
Hemant Kumar Gianey, Parth Khandelwal, Prakhar Goel, Rishav Maheshwari, Bhannu Galhotra and Divyanshu Pratap Singh
4.1 Introduction
4.1.1 Overview
4.1.2 Motivation
4.1.3 Lip Reading System Outcomes and Deliverables
4.2 The Emergence and Definition of the Lip-Reading System
4.2.1 Background of Domain
4.2.2 Identified Problems
4.2.3 Tools and Technologies Used
4.2.4 Implementation Aspects
4.2.4.1 Data Preparation
4.3 Design and Components of Lip-Reading System
4.4 Lip Reading System Architecture
4.5 Testing
4.6 Problems Encountered during Implementation
4.6.1 Assumptions and Constraints
4.7 Conclusion
4.8 Future Work
References
5. New Perspectives on Economic Growth and Debt Nexus Analysis: Evidence from Indian Economy
Edmund Ntom Udemba, Festus Victor Bekun, Dervis Kirikkaleli and Esra Sipahi
5.1 Introduction
5.2 Literature Review
5.2.1 External Debt and Economic Growth
5.2.2 Trade Openness, FDI, and Economic Growth
5.2.3 FDI and Economic Growth
5.3 Data
5.3.1 Analytical Framework and Data Description
5.3.2 Theoretical Background and Specifications
5.3.2.1 Model Specification
5.4 Methodology and Findings
5.4.1 Unit Root Testing
5.4.2 Cointegration
5.4.3 Vector Error Correction Model
5.4.4 Long-Run Relationship Estimation
5.4.5 Causality Test
5.5 Conclusion and Policy Implications
Declarations
Availability of Data and Materials
Competing Interests
Funding
Authors’ Contributions
Acknowledgments
References
6. Data-Driven Delay Analysis with Applications to Railway Networks
Boyu Li, Ting Guo, Yang Wang and Fang Chen
6.1 Introduction
6.2 Related Works
6.3 Background Knowledge
6.3.1 Background and Problem Formulation
6.3.1.1 Train Delay
6.3.1.2 Delay Propagation
6.3.2 Preliminaries
6.3.2.1 Bayesian Inference
6.3.2.2 Markov Property
6.4 Delay Propagation Model
6.4.1 Conditional Bayesian Delay Propagation
6.4.1.1 Delay Self-Propagation
6.4.1.2 Incremental Run-Time Delay
6.4.1.3 Incremental Dwell Time Delay
6.4.1.4 Accumulative Departure Delay
6.4.2 Cross-Line Propagation, Backward Propagation and Train Connection Propagation
6.5 Primary Delay Tracing Back
6.5.1 Delay Candidates Selection
6.5.2 Relation Construction
6.5.2.1 Preceding and Following Trains
6.5.2.2 Preceding and Connecting Trains
6.6 Evaluation on Dwell Time Improvement Strategy
6.7 Experiments
6.7.1 Experiment Setting
6.7.2 Temporal Prediction of Delay Propagation
6.7.3 Spatial Prediction of Delay Propagation
6.7.4 Case Study of Primary Delay Tracing Down
6.7.5 Evaluation of Dwell Time Improvement Strategy
6.8 Conclusion
References
7. Proposing a Framework to Analyze Breast Cancer in Mammogram Images using Global Thresholding, Gray Level Co-Occurrence Matrix, and Convolutional Neural Network (CNN)
Ms. Tanishka Dixit and Ms. Namrata Singh
7.1 Introduction & Purpose of Study
7.1.1 Segmentation
7.1.1.1 Types of Segmentation
7.1.2 Compression
7.2 Literature Review & Motivation
7.3 Proposed Work
7.3.1 Algorithm
7.3.2 Explanation
7.3.3 Flowchart
7.4 Observation Tables and Figures
7.5 Conclusion
7.6 Future Work
References
8. IoT Technologies for Smart Healthcare
Rehab A. Rayan, Imran Zafar and Christos Tsagkaris
8.1 Introduction
8.2 Literature Review
8.2.1 IoT-based Smart Health
8.2.2 Advantages of Applying IoT in Health
8.3 Findings
8.3.1 Significant Features and Applications of IoT in Health
8.3.1.1 Simultaneous Monitoring and Reporting
8.3.1.2 End-to-End Connectivity and Affordability
8.3.1.3 Data Analysis
8.3.1.4 Tracking, Alerts, and Remote Medical Care
8.3.1.5 Research
8.3.1.6 Patient-Generated Health Data (PGHD)
8.3.1.7 Management of Chronic Diseases and Preventative Care
8.3.1.8 Home-based and Short-term Care
8.4 Case Study: CyberMed as an IoT-based Smart Health Model
8.5 Discussions
8.5.1 Limitations of Adopting IoT in Health
8.5.1.1 Data Security and Privacy
8.5.1.2 Connectivity
8.5.1.3 Compatibility and Data Integration
8.5.1.4 Implementation Cost
8.5.1.5 Complexity and Risk of Errors
8.6 Future Insights
8.7 Conclusions
References
Index Terms
9. Enhancement of Scalability of SVM Classifiers for Big Data
Vijaykumar Bhajantri, Shashikumar G. Totad and Geeta R. Bharamagoudar
9.1 Introduction
9.2 Support Vector Machine
9.2.1 Challenges
9.3 Parallel and Distributed Mechanism
9.3.1 Shared-Memory Parallelism
9.4 Distributed Big Data Architecture
9.4.1 Hadoop MapReduce
9.4.2 Spark
9.4.3 AKKA
9.5 Distributed High Performance Computing
9.5.1 GasNet
9.5.2 Charm++
9.6 GPU Based Parallelism
9.6.1 CUDA
9.6.2 OpenCL
9.7 Parallel and Distributed SVM Algorithms
9.7.1 LS-SVM
9.7.2 Cascade SVM
9.7.3 DC SVM
9.7.4 Parallel Distributed Multiclass SVM Algorithms
9.8 Conclusion and Future Research Directions
References
10. Electrical Network-Related Incident Prediction Based on Weather Factors
Hongda Tian, Jessie Nghiem and Fang Chen
10.1 Introduction
10.2 Related Work
10.3 Methodology
10.3.1 Binary Classification of Incident and Normality
10.3.2 Incident Categorization using Natural Language Processing
10.3.3 Classification of Multiple Types of Incidents
10.4 Experiments
10.4.1 Data Sets
10.4.2 Evaluation Metrics
10.4.3 Binary Classification
10.4.4 Incident Categorization
10.4.5 Multi-Class Classification
10.5 Conclusion and Future Work
Acknowledgements
References
11. Green IoT: Environment-Friendly Approach to IoT
Abhishek Goel and Siddharth Gautam
11.1 Introduction
11.2 G-IoT (Green Internet of Things)
11.3 Layered Architecture of G-IoT
11.3.1 Data Center/Cloud
11.3.2 Data Analytics and Control Applications
11.3.3 Data Aggregation and Storage
11.3.4 Edge Computing
11.3.5 Communication and Processing Unit
11.4 Techniques for Implementation of G-IoT
11.5 Power Saving Methods Based on Components
11.6 Applications of G-IoT
11.7 Challenges and Future Scope
11.8 Case Study
11.9 Conclusion
References
12. Big-Data Analytics: A New Paradigm Shift in Micro Finance Industry
Vinay Pal Singh, Rohit Bansal and Ram Singh
12.1 Introduction
12.2 Reality of Area and Transcendent Difficulties
12.2.1 Probable Overlending
12.2.2 Information Imbalance
12.2.3 Retreating Not-for-Profit Sector
12.2.4 Neighbourhood Pressure
12.3 Data Analytics in Microfinance
12.3.1 Types of Data Analytics used in Microfinance
12.3.2 Use of Big Data in Microfinance Industry
12.3.3 Risk and Data based Credit Decisions
12.3.4 Product Development and Selection
12.3.5 Product or Service Positioning
12.3.6 M-Commerce and E-Payments
12.3.7 Making Reliable Credit Decisions
12.3.8 Big Data-Driven Model Promises Psychometric Evaluations
12.3.9 Product Build-up, Service Positioning, and Offering
12.4 Opportunities and Risks in Using Data Analytics
12.5 Risk in Utilizing Big Data
12.6 Conclusion
References
13. Big Data Storage and Analysis
Namrata Dhanda
13.1 Introduction
13.1.1 6 V’s of Big Data
13.1.2 Types of Data
13.1.3 Issues in Handling Big Data
13.2 Hadoop as a Solution to Challenges of Big Data
13.2.1 The Hadoop Ecosystem
13.2.2 Rack Awareness Policy in HDFS
13.3 In-Memory Storage and NoSQL
13.3.1 Key-Value Data Stores
13.3.2 Document Stores
13.3.3 Wide Column Stores
13.3.4 Graph Stores
13.3.5 Multi-Modal Databases
13.4 Advantages of NoSQL Database
13.5 Conclusion
References
Index Terms
14. A Framework for Analysing Social Media and Digital Data by Applying Machine Learning Techniques for Pandemic Management
Mutyala Sridevi
14.1 Introduction
14.2 Literature Review
14.3 Understanding Pandemic Analogous to a Disaster
14.4 Application of Machine Learning Techniques at Various Phases of Pandemic Management
14.4.1 Mitigation Phase
14.4.2 Preparedness Phase
14.4.3 Response Phase
14.4.4 Recovery Phase
14.5 Generalized Framework to Apply Machine Learning Techniques for Pandemic Management
14.6 Conclusion
References
About the Editors
Index

