Insertion bottlenecks lie at the heart of database and file-system innovations, best practices, and system workarounds. Most databases and file systems are based on B-tree data structures, and suffer from the performance cliffs and unpredictable run times of B-trees. In this talk, we introduce the Fractal Tree data structure and explain how it works and how it provides dramatically improved performance in both theory and in practice. Although our company Tokutek is selling a transaction-safe fractal-tree storage engine for MySQL, this talk is primarily about the underlying technology.
From a theoretical perspective, if B is the block-transfer size, the B-tree performs O(log_B N) block transfers per insert in the worst case. In contrast, the Fractal Tree structure performs O((log_B N)/B) memory transfers per insert, which translates to run-time improvements of two orders of magnitude.
To relate that theory to practice, we present an algorithmic model for B-tree performance bottlenecks. We explain how the bottlenecks affect best practice and how database designers typically modify B-trees to try to mitigate the bottlenecks. Then we show how Fractal Tree structures can attain faster insertion rates, intuitively by transforming disk-seek bottlenecks into disk-bandwidth bottlenecks
We conclude with performance results showing how a Fractal Tree storage engine can maintain rich indexes more efficiently than B-trees. Surprisingly, Fractal Tree structures seem to maintain their order-of-magnitude competitive advantage over B-trees on SSDs as well as traditional rotating media.
Dr. Kuszmaul’s research focuses on developing computer systems and hardware that behave well both in practice and in theory. His entry won 5 out of 6 categories in Jim Gray’s 2007 sorting benchmark contest, sorting a terabyte in 197 seconds. He formerly architected Akamai’s distributed data collection system, was a Yale Professor of Computer Science and was a principal network architect for the Thinking Machines Connection Machine CM-5 Supercomputer. He was one of the developers of the MIT Cilk multithreaded programming system and wrote, in Cilk++, winning entries for 4 out of 12 of problems in the 2009 Intel Threading Challenge. Dr. Kuszmaul a founder and Chief Architect at Tokutek and is a Research Scientist in the Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory (MIT CSAIL).
Comments on this page are now closed.
For information on exhibition and sponsorship opportunities at the conference, contact Yvonne Romaine at firstname.lastname@example.org
For media-related inquiries, contact Maureen Jennings at email@example.com
To stay abreast of conference news and to receive email notification when registration opens, please sign up for the O'Reilly MySQL Conference newsletter (login required).
View a complete list of O'Reilly MySQL Conference contacts.