Title: Performance Modeling for Cloud Platforms: A Data Driven Approach
Ph.D. Candidate: Kewen Wang
Major Advisor: Dr. Mohammad Maifi Hasan Khan
Associate Advisors: Dr. Swapna Gokhale, Dr. Song Han
Date/Time: Monday, January 7, 2019 1:00pm
Location: ITE 336
Software service providers are increasingly adopting cloud-based solutions to maximize resource utilization and minimize operating costs. At the same time, performance predictability is becoming paramount as such systems are increasingly deployed in safety-critical settings (e.g., IoT applications, infrastructure monitoring). However, the large scale, high degree of concurrency, and dynamic resource allocation of cloud platforms make traditional performance modeling and tuning frameworks ill-suited and difficult to extend.
To address this challenge, this thesis develops a data-driven performance modeling framework. First, hierarchical performance models are developed that can predict the execution time of a given job with high accuracy from limited-scale execution data. These models are then extended to capture the interactions among concurrently running jobs and to predict a job's execution time under interference from other jobs. Building on the extended models, a dynamic job scheduler is designed and implemented that automatically predicts potential interference and reschedules jobs to significantly reduce both interference and execution time.

Second, analytical models are developed to predict the likelihood of suboptimal performance caused by inefficient partitioning of input data and/or skewed task distribution across worker nodes, and to recommend remedies: repartitioning the input data (in the case of the task straggler problem) and/or changing the locality configuration setting (in the case of the skewed task distribution problem).

Finally, the thesis investigates the relationship between resource contention at the operating system layer and application performance. Specifically, kernel-level resource contention models are integrated with the application-layer analytical models to determine when, and which types of, resources (e.g., CPU, memory, network bandwidth) should be allocated to relieve runtime bottlenecks. The developed models are evaluated on a real cluster, and the results are presented in the thesis.
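To give a flavor of the data-driven idea of predicting execution time from limited-scale runs, the sketch below fits a simple linear scaling model to execution times measured at small input sizes and extrapolates to a larger size. This is a minimal illustration only, not the thesis's hierarchical models; the model form, input sizes, and timings are all hypothetical.

```python
# Minimal sketch (not the thesis framework): fit t(n) = a + b * n
# to execution times measured at small input sizes, then extrapolate
# to a larger target size. All numbers below are illustrative.

def fit_linear(sizes, times):
    """Closed-form ordinary least squares for t = a + b * n."""
    k = len(sizes)
    mean_n = sum(sizes) / k
    mean_t = sum(times) / k
    b = sum((n - mean_n) * (t - mean_t) for n, t in zip(sizes, times)) \
        / sum((n - mean_n) ** 2 for n in sizes)
    a = mean_t - b * mean_n
    return a, b

def predict(a, b, n):
    """Predicted execution time at input size n."""
    return a + b * n

# Hypothetical limited-scale runs: (input size in GB, execution time in s)
sizes = [1, 2, 4, 8]
times = [12.0, 21.5, 41.0, 80.5]

a, b = fit_linear(sizes, times)
print(round(predict(a, b, 64), 1))  # → 629.2
```

In practice a single linear term rarely suffices; the thesis's hierarchical models capture per-stage behavior rather than a single end-to-end curve, but the workflow — measure small, fit, predict large — is the same.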