Shamsi, Jawwad Ahmad,

Big data systems : a 360-degree approach / Jawwad Ahmad Shamsi, Muhammad Khojaye. - 1st. - 1 online resource : illustrations (black and white, and colour). - Chapman & Hall/CRC big data series .

Previously issued in print: 2020.

Preface
Author Bios
Acknowledgements
List of Figures
List of Tables


Introduction to Big Data Systems
1.1 INTRODUCTION: REVIEW OF BIG DATA SYSTEMS
1.2 UNDERSTANDING BIG DATA
1.3 TYPE OF DATA: TRANSACTIONAL OR ANALYTICAL
1.4 REQUIREMENTS AND CHALLENGES OF BIG DATA
1.5 CONCLUDING REMARKS
1.6 FURTHER READING
1.7 EXERCISE QUESTIONS

Architecture and Organization of Big Data Systems
2.1 ARCHITECTURE FOR BIG DATA SYSTEMS
2.2 ORGANIZATION OF BIG DATA SYSTEMS: CLUSTERS
2.3 CLASSIFICATION OF CLUSTERS: DISTRIBUTED MEMORY VS. SHARED MEMORY
2.4 CONCLUDING REMARKS
2.5 FURTHER READING
2.6 EXERCISE QUESTIONS

Cloud Computing for Big Data
3.1 CLOUD COMPUTING
3.2 VIRTUALIZATION
3.3 PROCESSOR VIRTUALIZATION
3.4 CONTAINERIZATION
3.5 VIRTUALIZATION OR CONTAINERIZATION
3.6 FOG COMPUTING
3.7 EXAMPLES
3.8 CONCLUDING REMARKS
3.9 FURTHER READING
3.10 EXERCISE QUESTIONS

HADOOP: An Efficient Platform for Storing and Processing Big Data
4.1 REQUIREMENTS FOR PROCESSING AND STORING BIG DATA
4.2 HADOOP -- THE BIG PICTURE
4.3 HADOOP DISTRIBUTED FILE SYSTEM
4.4 MAPREDUCE
4.5 HBASE
4.6 CONCLUDING REMARKS
4.7 FURTHER READING
4.8 EXERCISE QUESTIONS

Enhancements in Hadoop
5.1 ISSUES WITH HADOOP
5.2 YARN
5.3 PIG
5.4 HIVE
5.5 DREMEL
5.6 IMPALA
5.7 DRILL
5.8 DATA TRANSFER
5.9 AMBARI
5.10 CONCLUDING REMARKS
5.11 FURTHER READING
5.12 EXERCISE QUESTIONS

Spark
6.1 LIMITATIONS OF MAPREDUCE
6.2 INTRODUCTION TO SPARK
6.3 SPARK CONCEPTS
6.4 SPARK SQL
6.5 SPARK MLLIB
6.6 STREAM BASED SYSTEM
6.7 SPARK STREAMING
6.8 CONCLUDING REMARKS
6.9 FURTHER READING
6.10 EXERCISE QUESTIONS

NoSQL Systems
7.1 INTRODUCTION
7.2 HANDLING BIG DATA SYSTEMS -- PARALLEL RDBMS
7.3 EMERGENCE OF NOSQL SYSTEMS
7.4 KEY-VALUE DATABASE
7.5 DOCUMENT-ORIENTED DATABASE
7.6 COLUMN-ORIENTED DATABASE
7.7 GRAPH DATABASE
7.8 CONCLUDING REMARKS
7.9 FURTHER READING
7.10 EXERCISE QUESTIONS

NewSQL Systems
8.1 INTRODUCTION
8.2 TYPES OF NEWSQL SYSTEMS
8.3 FEATURES
8.4 NEWSQL SYSTEMS: CASE STUDIES
8.5 CONCLUDING REMARKS
8.6 FURTHER READING
8.7 EXERCISE QUESTIONS


Networking for Big Data
9.1 NETWORK ARCHITECTURE FOR BIG DATA SYSTEMS
9.2 CHALLENGES AND REQUIREMENTS
9.3 NETWORK PROGRAMMABILITY AND SOFTWARE DEFINED NETWORKING
9.4 LOW LATENCY AND HIGH SPEED DATA TRANSFER
9.5 AVOIDING TCP INCAST -- ACHIEVING LOW LATENCY AND HIGH THROUGHPUT
9.6 FAULT TOLERANCE
9.7 CONCLUDING REMARKS
9.8 FURTHER READING
9.9 EXERCISE QUESTIONS

Security for Big Data
10.1 INTRODUCTION
10.2 SECURITY REQUIREMENTS
10.3 SECURITY: ATTACK TYPES AND MECHANISMS
10.4 ATTACK DETECTION AND PREVENTION
10.5 CONCLUDING REMARKS
10.6 FURTHER READING
10.7 EXERCISE QUESTIONS

Privacy for Big Data
11.1 INTRODUCTION
11.2 UNDERSTANDING BIG DATA AND PRIVACY
11.3 PRIVACY VIOLATIONS AND THEIR IMPACT
11.4 TYPES OF PRIVACY VIOLATIONS
11.5 PRIVACY PROTECTION SOLUTIONS AND THEIR LIMITATIONS
11.6 CONCLUDING REMARKS
11.7 FURTHER READING
11.8 EXERCISE QUESTIONS

High Performance Computing for Big Data
12.1 INTRODUCTION
12.2 SCALABILITY: NEED FOR HPC
12.3 GRAPHIC PROCESSING UNIT
12.4 TENSOR PROCESSING UNIT
12.5 HIGH SPEED INTERCONNECTS
12.6 MESSAGE PASSING INTERFACE
12.7 OPENMP
12.8 OTHER FRAMEWORKS
12.9 CONCLUDING REMARKS
12.10 FURTHER READING
12.11 EXERCISE QUESTIONS

Deep Learning with Big Data
13.1 INTRODUCTION
13.2 FUNDAMENTALS
13.3 NEURAL NETWORK
13.4 TYPES OF DEEP NEURAL NETWORK
13.5 BIG DATA APPLICATIONS USING DEEP LEARNING
13.6 CONCLUDING REMARKS
13.7 FURTHER READING
13.8 EXERCISE QUESTIONS

Big Data Case Studies
14.1 GOOGLE EARTH ENGINE
14.2 FACEBOOK MESSAGES APPLICATION
14.3 HADOOP FOR REAL-TIME ANALYTICS
14.4 BIG DATA PROCESSING AT UBER
14.5 BIG DATA PROCESSING AT LINKEDIN
14.6 DISTRIBUTED GRAPH PROCESSING AT GOOGLE
14.7 FUTURE TRENDS
14.8 CONCLUDING REMARKS
14.9 FURTHER READING
14.10 EXERCISE QUESTIONS

Bibliography
Index



Big data systems pose major challenges related to data diversity, storage mechanisms, and the need for massive computational power. The capabilities of big data systems also vary with the type of problem. For instance, distributed memory systems are not recommended for iterative algorithms. Big data systems likewise differ with respect to consistency and fault tolerance. The purpose of this book is to provide a detailed explanation of big data systems. The book covers various topics including Networking, Security, Privacy, Storage, Computation, Cloud Computing, NoSQL and NewSQL systems, High Performance Computing, and Deep Learning. It adopts an illustrative and practical approach in which theoretical topics are supported by well-explained programming and illustrative examples.

Key Features:
Introduces concepts and evolution of Big Data technology.
Illustrates examples for thorough understanding.
Contains programming examples for hands-on development.
Explains a variety of topics including NoSQL systems, NewSQL systems, Security, Privacy, Networking, Cloud, High Performance Computing, and Deep Learning.
Exemplifies widely used big data technologies such as Hadoop and Spark.
Includes discussion of case studies and open issues.
Provides end-of-chapter questions for enhanced learning.

9780429531576 0429531575 9780429155444 0429155441 9781498752718 1498752713 9780429546273 0429546270

DOI: 10.1201/9780429155444


Big data.
Systems engineering.
COMPUTERS / Database Management / Data Mining
BUSINESS & ECONOMICS / Statistics
COMPUTERS / Database Management / General

QA76.9.B45

005.7