Introducing kewpie - a feedback-based query generation and testing framework

Patrick Crews (HP Cloud Services)
Drizzle
Location: Ballroom C
Tags: drizzle, qa

Testing database systems is a daunting task. The possible combinations of data, queries, and system configurations make for a practically infinite test domain. One consequence of this is that the time-honored tradition of using hand-crafted tests is becoming ineffective. Research from Microsoft’s SQL Server team has indicated that reliance on such methods allows far too many defects to remain undetected.

One solution to this problem has been the rise of random testing tools. Software such as Microsoft’s RAGS system and the Random Query Generator (randgen) relies on stochastic models that define the realm of possible queries. Unfortunately, purely random systems tend to force a choice between domain coverage and accuracy. If the net of possible queries is cast too broadly, the tool is likely to produce a high volume of completely invalid tests. If the generator is tuned to produce more valid queries, the price is paid in time and query coverage.
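The coverage-versus-validity trade-off can be illustrated with a small sketch. It is not the grammar format of RAGS or randgen. The template, the `NARROW` and `BROAD` choice tables, and the use of an in-memory SQLite database are all assumptions made for illustration.

```python
import random
import sqlite3

# Hypothetical query template; each {slot} is filled by a random choice.
QUERY_TEMPLATE = "SELECT {col} FROM {table} WHERE {col} {op} {value}"

# Narrow model: only choices known to be valid against the test schema.
NARROW = {"col": ["a", "b"], "table": ["t"], "op": ["<", "="], "value": ["1", "5"]}

# Broad model: more variety, but some choices name objects that do not exist.
BROAD = {
    "col": ["a", "b", "no_such_col"],
    "table": ["t", "no_such_table"],
    "op": ["<", "=", "<>", ">=", "LIKE"],
    "value": ["1", "5", "'x'", "NULL"],
}


def generate(grammar, rng):
    """Build one query by picking a random option for every slot."""
    choices = {slot: rng.choice(options) for slot, options in grammar.items()}
    return QUERY_TEMPLATE.format(**choices)


def validity_rate(grammar, samples=200, seed=0):
    """Return the fraction of generated queries the database accepts."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
    valid = 0
    for _ in range(samples):
        try:
            conn.execute(generate(grammar, rng)).fetchall()
            valid += 1
        except sqlite3.Error:
            pass  # rejected by the database: an invalid test
    conn.close()
    return valid / samples


if __name__ == "__main__":
    print("narrow:", validity_rate(NARROW))
    print("broad: ", validity_rate(BROAD))
```

In this toy setup the narrow model only ever emits valid queries but can express very few distinct ones, while the broad model reaches more of the query space at the cost of many rejected statements.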

Microsoft’s solution to this problem was to adopt a genetic-algorithm-based approach: they maintain a database of known valid queries. When they wish to generate a new set of tests, such as for a new database component, they pull from their existing tests, define rules that tell the system whether a query is interesting or useful, and provide rules for how to mutate a query so that it exercises the intended code. Using this system, they are able to build progressively upon valid queries until they have hit their test target. Their research has shown a 10% increase in bug detection over purely random systems.

Kewpie (the query probulator) is the Drizzle team’s effort at creating such a system. Utilizing an execute/evaluate/mutate paradigm, each new set of queries is created as a result of the user-specified evaluation and mutation criteria. This presentation will discuss the history and design motivations behind the system, describe the architecture of the system itself, and provide several examples of how it can be applied to wreak havoc on your favorite DBMS.
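A minimal sketch of an execute/evaluate/mutate loop follows. It is not kewpie's actual code or API. The function names, the "returns at least one row" evaluation rule, the "append a predicate" mutation rule, and the in-memory SQLite target are all hypothetical stand-ins for the user-specified criteria described above.

```python
import random
import sqlite3


def execute(conn, query):
    """Run a query; return its rows, or None if the database rejects it."""
    try:
        return conn.execute(query).fetchall()
    except sqlite3.Error:
        return None


def evaluate(rows):
    """Hypothetical evaluation rule: keep queries that are valid and return rows."""
    return rows is not None and len(rows) > 0


def mutate(query, rng):
    """Hypothetical mutation rule: append one more predicate to the WHERE clause."""
    column = rng.choice(["a", "b"])
    op = rng.choice(["<", ">", "=", "!="])
    value = rng.randint(0, 9)
    return f"{query} AND {column} {op} {value}"


def run(seed_queries, generations=4, seed=0):
    """Execute each query, keep the interesting ones, and mutate them for the next round."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", [(i, 9 - i) for i in range(10)])

    population = list(seed_queries)
    kept = []
    for _ in range(generations):
        survivors = [q for q in population if evaluate(execute(conn, q))]
        kept.extend(survivors)
        # The next generation is built only from queries that passed evaluation.
        population = [mutate(q, rng) for q in survivors]
    conn.close()
    return kept


if __name__ == "__main__":
    for query in run(["SELECT a, b FROM t WHERE 1 = 1"]):
        print(query)
```

Because every new query is derived from one that already passed evaluation, the loop grows more complex queries without returning to the large pool of invalid statements a purely random generator would produce.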

Patrick Crews

HP Cloud Services

Currently working QA on the Rackspace Drizzle team, Patrick previously worked as a QA Engineer at MySQL / Sun / Oracle, beginning in 2007, where he focused on code coverage tools, maintaining and updating the MySQL test suite, and testing with the Random Query Generator.

Prior to that, he worked in Blue Cross and Blue Shield of Florida’s data warehousing division, did internships for the Mayo Clinic and the NFL as a student, and spent 5 years in the Navy doing electrical work on the mighty P-3C Orion aircraft.

Patrick currently lives in Jacksonville, FL with his wife and a crazed Boston Terrier. He holds an M.S. in computer science from the University of North Florida and is a MySQL certified developer. In his free time, he likes to run, read, and think of ways to work with obscenely large amounts of data.
