We show effective mitigation methods can alleviate both scheduled and statistical ShadowSync reducing the latency long tail to less than 20% of the original at the 99.9 th percentile. Our experimental results show that with typical settings of benchmark components such as RocksDB, ShadowSync can prolong request message latency by up to 2 seconds. For scheduling and statistical reasons, significant latency long tail can arise from ShadowSync caused by asynchronous recurrent operations. Although asynchronous, such recurrent maintenance operations can cause frequent millibottlenecks, particularly when they overlap, a problem we call ShadowSync. Outside of the critical path, LSM-tree relies on maintenance operations (e.g., flushing/compaction during a checkpoint) to reorganize LSM-tree in memory and on disk to keep data access latency short. Using a traffic analysis stream processing application benchmark, we evaluated the impact of asynchronous workload bursts generated by a multi-layer data structure called LSM-tree (log-structured merge-tree) for continuous checkpointing. This paper shows that asynchronous, very short but intense resource demands (called millibottlenecks) outside of critical paths can also cause significant latency long tail. Recent work showed a positive correlation between latency long tail and variance in the execution time of synchronous invocation chains (critical paths) in microservices benchmarks. For example, e-commerce websites like Amazon improve their marketing strategy by performing real-time advertising based on customers' behavior, and latency long tail can cause significant revenue loss. Mission-critical, real-time, continuous stream processing applications that interact with the real world have stringent latency requirements.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |