Modeling Denormalization - The Speed You Need, the Order You Crave

Duncan Beevers (Kongregate)
13:40 Wednesday, 3-09-2008
General
Location: Salon 4

Brief:

Most developers have been trained to normalize data wherever possible. When business requirements start walking all over your nicely normalized data, your queries can grow complex and inefficient. Denormalization can ease the pressure when your queries get out of hand, but it shouldn’t be handled as an afterthought. Creating first-class representations of your denormalized data makes it easy to keep data in sync and developers on the same page.

Abstract:

As a dataset grows, it becomes more and more costly to retrieve records from it, and as your application requirements change, new queries are introduced against existing datasets. Adding more indices to the dataset to support these new queries reduces both the clarity (why is this index here again?) and the scalability (additional indices increase write times) of your application.

Instead of shoe-horning denormalization functionality into your core models, break the behavior out into models that express answers to the kinds of questions you want to ask.

Leveraging ActiveRecord

Denormalization models built on top of ActiveRecord provide a number of benefits:

  • AR callbacks make keeping your denorm models up to date a cinch
  • AR associations and association extensions provide a natural and expressive grammar for working with denormalized data
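As a rough illustration of the callback idea, here is a minimal plain-Ruby sketch. It does not load ActiveRecord; the class names GamePlay and GamePlayCount are hypothetical, and the explicit after_create call stands in for what would be declared as an after_create callback on a real model.

```ruby
# Plain-Ruby stand-in for the pattern; no ActiveRecord is loaded here.
# GamePlay and GamePlayCount are hypothetical names chosen for illustration.

# Denormalized model: answers "how many plays does this game have?"
class GamePlayCount
  @counts = Hash.new(0)

  class << self
    # Record one more play for the given game.
    def increment(game_id)
      @counts[game_id] += 1
    end

    # Read the precomputed answer without scanning individual plays.
    def for_game(game_id)
      @counts[game_id]
    end
  end
end

# Canonical model: one record per play.
class GamePlay
  attr_reader :game_id, :user_id

  def self.create(game_id:, user_id:)
    play = new(game_id, user_id)
    # With ActiveRecord this hook would be declared as an after_create callback.
    play.after_create
    play
  end

  def initialize(game_id, user_id)
    @game_id = game_id
    @user_id = user_id
  end

  # The canonical record is the one object responsible for the sync.
  def after_create
    GamePlayCount.increment(game_id)
  end
end

GamePlay.create(game_id: 1, user_id: 10)
GamePlay.create(game_id: 1, user_id: 11)
GamePlay.create(game_id: 2, user_id: 10)
```

After these three creates, GamePlayCount.for_game(1) returns 2 without touching the individual play records.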

Key Speaking Points

  • be explicit about which objects are responsible for keeping data in sync
  • use commutative updates
  • rolling-window coalescing can be achieved with data-overlap, further denormalization

Coalescing data discretely is simpler and more efficient than using rolling windows. Stick to it if your business requirements allow.
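The difference between discrete and rolling-window coalescing can be sketched in plain Ruby. The timestamps, the daily bucket size, and the helper names daily_bucket and rolling_count are assumptions made for illustration, not details from the talk.

```ruby
# Hypothetical play timestamps (UTC) for one game.
play_times = [
  Time.utc(2008, 9, 1, 9, 15),
  Time.utc(2008, 9, 1, 23, 59),
  Time.utc(2008, 9, 2, 0, 1),
  Time.utc(2008, 9, 3, 12, 0),
]

# Discrete coalescing: every event falls into exactly one fixed daily bucket,
# so each write touches a single counter and old buckets never change.
def daily_bucket(time)
  time.getutc.strftime("%Y-%m-%d")
end

daily_counts = Hash.new(0)
play_times.each { |t| daily_counts[daily_bucket(t)] += 1 }

# A rolling window, by contrast, is defined relative to "now", so its
# answer has to be recomputed (or maintained) as time moves forward.
def rolling_count(times, now, window_seconds)
  times.count { |t| t > now - window_seconds && t <= now }
end

now = Time.utc(2008, 9, 2, 12, 0)
last_24h = rolling_count(play_times, now, 24 * 60 * 60)
```

The daily counters stay valid forever once their day has passed, while last_24h is only correct for the specific value of now.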

Leveraging the DB

Commutative updates that change existing row values in place are valuable because they eliminate race conditions for row-specific changes. Broad-reaching queries that update multiple rows simultaneously are helpful as well, since they can update several denormalized models of different granularities residing in the same table.
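A small sketch of the idea, using an in-memory Hash as a stand-in for a database table. The table layout (rows keyed by game and period), the record_play helper, and the SQL in the comment are all hypothetical; the point is only that delta-style updates give the same result in any order, and that one update can touch rows of several granularities.

```ruby
# In-memory stand-in for a denormalized table; rows are keyed by [game_id, period].
# A hypothetical SQL equivalent of one `record_play` call would be a single statement:
#   UPDATE play_counts SET plays = plays + 1
#   WHERE game_id = 1 AND period IN ('2008-09-03', '2008-09', 'all-time');
def new_table
  Hash.new(0)
end

# Commutative update: apply a delta to every granularity that covers the day.
def record_play(table, game_id, day, delta = 1)
  periods = [day, day[0, 7], "all-time"] # day row, month row, all-time row
  periods.each { |period| table[[game_id, period]] += delta }
  table
end

events = [[1, "2008-09-03"], [1, "2008-09-04"], [1, "2008-10-01"]]

# Apply the same events in two different orders.
in_order = new_table
events.each { |game_id, day| record_play(in_order, game_id, day) }

reversed = new_table
events.reverse_each { |game_id, day| record_play(reversed, game_id, day) }
```

Both tables end up identical because each update adds a delta instead of writing an absolute value computed from an earlier read.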

Use the features of your database. Blobs can be used to store marshaled objects. Using Ruby’s built-in marshaling instead of Rails’ YAML is faster (benchmarks provided!) and maps easily to database blobs.
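The talk's own benchmarks are not reproduced here. As a stand-in, the following sketch round-trips a hypothetical payload through Marshal and YAML using only the Ruby standard library and prints a rough timing for each. The payload shape and iteration count are assumptions, and the timings will vary with Ruby version and data.

```ruby
require "yaml"
require "benchmark"

# Hypothetical denormalized payload; only strings, integers, arrays, and hashes.
payload = {
  "game_id" => 1,
  "top_scores" => (1..100).map { |i| { "user_id" => i, "score" => i * 10 } },
}

marshaled = Marshal.dump(payload) # binary String, suitable for a BLOB column
yamled = YAML.dump(payload)       # text String

restored_from_marshal = Marshal.load(marshaled)
restored_from_yaml = YAML.load(yamled)

# Rough timing comparison of a full dump-and-load round trip.
iterations = 200
marshal_time = Benchmark.realtime do
  iterations.times { Marshal.load(Marshal.dump(payload)) }
end
yaml_time = Benchmark.realtime do
  iterations.times { YAML.load(YAML.dump(payload)) }
end

puts format("Marshal: %.4fs, YAML: %.4fs", marshal_time, yaml_time)
```

Note that Marshal.load should only be used on data your own application wrote, since loading untrusted marshaled data is unsafe.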

Other Benefits of Denormalization

  • Automatic snapshotting – Data that is denormalized and then not updated along with the canonical version provides a history.
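Snapshotting can be illustrated with a short plain-Ruby sketch. Product and OrderLine are hypothetical names, and Struct stands in for persisted records; the copied price is deliberately never synced with the canonical one.

```ruby
# Hypothetical models; Struct stands in for persisted records.
Product = Struct.new(:name, :price_cents)

OrderLine = Struct.new(:product, :quantity, :unit_price_cents) do
  # Copy the canonical price at purchase time and never sync it afterwards.
  def self.snapshot(product, quantity)
    new(product, quantity, product.price_cents)
  end

  def total_cents
    quantity * unit_price_cents
  end
end

widget = Product.new("widget", 500)
line = OrderLine.snapshot(widget, 3)

# The canonical value changes later; the order line keeps the price it was sold at.
widget.price_cents = 650
```

Here line.total_cents still reflects the original price of 500 per unit, which is the history the denormalized copy preserves.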

Duncan Beevers

Kongregate

Unfailingly polite.
