Mitigation of Straggler Nodes through Virtualization for Resource Optimization

Ajay Kumar Bansal, Manmohan Sharma, Ashu Gupta

Abstract

Modern computing systems are generally enormous in scale, consisting of hundreds to thousands of heterogeneous machine nodes, to meet rising demands for Cloud services. MapReduce and other parallel computing frameworks are frequently used on such cluster architectures to offer consumers dependable and timely services. However, the complex features of Cloud workloads, such as multi-dimensional resource requirements, and dynamically changing system conditions, such as varying node performance, pose new difficulties for providers in terms of both customer experience and system efficiency. The straggler problem occurs when a small subset of parallelized tasks takes an excessively long time to execute compared with their siblings, resulting in a delayed job response and the possibility of late-timing failure. Speculative execution is the state-of-the-art approach to straggler mitigation. It has been used in numerous real-world systems with a variety of implementation improvements, but the research in this thesis demonstrates that it is typically wasteful. The failure rate of speculative execution can be as high as 71 percent, according to production trace logs from several data centers. Straggler mitigation is a difficult task in and of itself:

1) stragglers may have varying degrees of severity in parallel job execution;

2) whether a task should be considered a straggler is highly subjective, depending on various application and system conditions;

3) the efficiency of speculative execution would be improved if dynamic node quality could be adequately modeled and predicted;

4) other kinds of stragglers, such as those caused by data skew, are beyond the capabilities of speculative execution.
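
As an illustration of why the straggler definition in point 2 is subjective, the following is a minimal sketch of one common threshold-based detection rule, not the method proposed in the article. It assumes a task counts as a straggler when its duration exceeds a chosen multiple of the median duration of its sibling tasks. The function name `find_stragglers`, the default threshold of 1.5, and the sample durations are all hypothetical and chosen only for illustration.

```python
from statistics import median


def find_stragglers(durations, threshold=1.5):
    """Return the sorted IDs of tasks slower than threshold x the median duration.

    durations: dict mapping a task ID to its execution time in seconds.
    threshold: assumed multiplier; the right value depends on the application.
    """
    if not durations:
        return []
    typical = median(durations.values())
    return sorted(
        task_id
        for task_id, seconds in durations.items()
        if seconds > threshold * typical
    )


# Hypothetical execution times (seconds) for five sibling tasks of one job.
durations = {"t1": 10.0, "t2": 11.0, "t3": 9.5, "t4": 31.0, "t5": 16.0}

print(find_stragglers(durations))       # threshold 1.5
print(find_stragglers(durations, 1.4))  # a slightly stricter threshold
```

With these sample values, the default threshold flags only `t4`, while lowering it to 1.4 also flags `t5`. The same set of tasks therefore yields different stragglers depending on a parameter that has to be tuned per application and system.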

Published
2021-09-29
How to Cite
Ajay Kumar Bansal, Manmohan Sharma, Ashu Gupta. (2021). Mitigation of Straggler Nodes through Virtualization for Resource Optimization. Design Engineering, 15293-15298. Retrieved from http://thedesignengineering.com/index.php/DE/article/view/4812