As data is generated at an ever-faster rate, processing large volumes of it is becoming a challenge for data analysis software. Addressing these performance issues, Cloud Computing: Data-Intensive Computing and Scheduling explores the evolution of classical techniques and describes new methods and algorithms. The book covers many of the concepts, models, methods, algorithms, and software used in cloud computing.
After a general introduction to the field, the text covers resource management, including scheduling algorithms for real-time tasks and practical algorithms for user bidding and auctioneer pricing. It next explains approaches to analytical query processing, including pre-computing, data indexing, and data partitioning. Applications of MapReduce, a new parallel programming model, are then presented. The authors also discuss how to optimize multiple group-by query processing and introduce a MapReduce real-time scheduling algorithm.
A useful reference for studying and using MapReduce and cloud computing platforms, this book presents various technologies that demonstrate how cloud computing can meet business requirements and serve as the infrastructure of multidimensional data analysis applications.
Overview of Cloud Computing
Resource Scheduling for Cloud Computing
Game Theoretical Allocation in a Cloud Datacenter
Multidimensional Data Analysis in a Cloud Datacenter
Data-Intensive Applications with MapReduce
Large-Scale Multidimensional Data Aggregation
Multidimensional Data Analysis Optimization
Improvements by speed-up measurements
Improvements by affecting factors
Improvement by cost estimation
Compressed data structures
Real-Time Scheduling with MapReduce
Future for Cloud Computing
Frédéric Magoulès is a professor at École Centrale Paris, where he leads the high performance computing research group. His research focuses on the algorithmic interface between parallel computing and the numerical analysis of PDEs and algebraic differential equations. He earned a Ph.D. in applied mathematics from Université Pierre et Marie Curie.
Jie Pan is a Java developer at the Klee Group Company. She earned a Ph.D. in applied mathematics. During her doctoral work, she focused on large-scale data analysis on distributed systems.
Fei Teng is a researcher in the Key Lab of Cloud Computing and Intelligent Technology at Southwest Jiaotong University. Her research interests are mainly in cloud computing, data mining, resource allocation, and distributed scheduling algorithms.