CloudRaid: Hunting Concurrency Bugs in the Cloud via Log-Mining
Cloud systems suffer from distributed concurrency bugs, which are notoriously difficult to detect and often lead to data loss and service outages. This paper presents CloudRaid, a new and effective tool for combating distributed concurrency bugs. CloudRaid automatically detects concurrency bugs in cloud systems by analyzing and testing those message orderings that are likely to expose errors. We observe that large-scale online cloud applications process millions of user requests per second, extensively exercising many permutations of message orderings. Message orderings that are already sufficiently tested are unlikely to expose errors. Hence, CloudRaid mines logs from previous executions to uncover message orderings that are feasible but not sufficiently tested. Specifically, CloudRaid tries to flip the order of a pair of messages $\langle S, P \rangle$ if they may happen in parallel but $S$ always arrives before $P$ in the existing logs; that is, it exercises the order $P \rightarrowtail S$. The log-based approach makes CloudRaid suitable for live systems.
We have applied CloudRaid to automatically test four representative distributed systems: Apache Hadoop2/Yarn, HBase, HDFS and Cassandra. CloudRaid can automatically test 40 different versions of the 4 systems (10 versions per system) in 35 hours, and successfully triggers 28 concurrency bugs, including 8 new bugs that had never been found before. The 8 new bugs have all been confirmed by their original developers, and 3 of them are considered critical and have already been fixed.
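The pair-selection idea in the abstract can be illustrated with a minimal sketch. This is not CloudRaid's implementation: the function name `candidate_flips`, the representation of each logged execution as a list of message names, and the precomputed `may_be_parallel` set are all assumptions made for illustration. The sketch also assumes each message appears at most once per run.

```python
from itertools import combinations

def candidate_flips(runs, may_be_parallel):
    """Return pairs (S, P) worth flipping: S precedes P in every logged run
    that contains both, and the pair is marked as possibly concurrent.

    runs            -- hypothetical input: list of runs, each a list of message names
    may_be_parallel -- hypothetical input: set of frozenset({a, b}) pairs
    """
    seen_orders = {}  # frozenset({a, b}) -> set of observed (first, second) orders
    for run in runs:
        position = {msg: i for i, msg in enumerate(run)}
        for a, b in combinations(sorted(position), 2):
            order = (a, b) if position[a] < position[b] else (b, a)
            seen_orders.setdefault(frozenset((a, b)), set()).add(order)

    flips = []
    for pair, orders in seen_orders.items():
        # Only one order was ever observed, yet the other order is feasible.
        if len(orders) == 1 and pair in may_be_parallel:
            flips.append(next(iter(orders)))
    return sorted(flips)

# Toy logs: B and C appear in both orders, A always precedes B and C.
runs = [["A", "B", "C"], ["A", "C", "B"], ["A", "B", "C"]]
may_be_parallel = {frozenset(("A", "B")), frozenset(("B", "C"))}
print(candidate_flips(runs, may_be_parallel))
```

In this toy input, only the pair (A, B) is reported: B and C have already been seen in both orders, and A and C are not marked as possibly parallel. A tester would then try to force B to arrive before A.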
Tue 6 Nov (displayed time zone: Guadalajara, Mexico City, Monterrey)
10:30 - 12:00 | Concurrency and Races (Research Papers) at Horizons 10-11 | Chair(s): Willem Visser, Stellenbosch University
10:30 (22m, Talk) | CloudRaid: Hunting Concurrency Bugs in the Cloud via Log-Mining | Research Papers | Jie Lu; Feng Li, Institute of Computing Technology at Chinese Academy of Sciences, China; Lian Li, Institute of Computing Technology at Chinese Academy of Sciences, China; Xiaobing Feng, ICT CAS
10:52 (22m, Talk) | Testing Multithreaded Programs via Thread Speed Control | Research Papers | Dongjie Chen; Yanyan Jiang, Nanjing University; Chang Xu, Nanjing University; Xiaoxing Ma, Nanjing University; Jian Lu, Nanjing University
11:15 (22m, Talk) | Data Race Detection on Compressed Traces | Research Papers | Dileep Kini, University of Illinois at Urbana-Champaign; Umang Mathur, University of Illinois at Urbana-Champaign; Mahesh Viswanathan, University of Illinois at Urbana-Champaign
11:37 (22m, Talk) | Practical AJAX Race Detection for JavaScript Web Applications | Research Papers | Christoffer Quist Adamsen, Aarhus University; Anders Møller, Aarhus University; Saba Alimadadi, Northeastern University; Frank Tip, Northeastern University