Faster Data Reduction and Smoothing for Analysis & Archival in MySQL

Michael McFadden (Sciencelogic.com)
Average rating: ****.
(4.00, 2 ratings)

We will discuss the technique of data smoothing in MySQL using group averaging and possibly running/moving averages. Data smoothing is the act of translating high-resolution data into lower resolution data for lower frequency trend analysis over a long period, and more importantly, to reduce unneeded resolution for data archival storage. Topics coverered in this session will include business needs and use cases, techniques for different normalization needs and different data sets, and a benchmark comparison between scripted/programmed solutions verses database-side, SQL query-side solution.

Any application that performs frequent polling for data and archives this data for long (or infinite duration) has run into the situation of ‘too much data’. Tables become huge and unsearchable, performance goes down, and over time – the resolution of this data becomes worthless compared to the cost to store it.

We will discuss the simple ‘gauge data’ smoothing process, the more complicated ‘counter data’ smoothing process, and the ‘waterfall smoothing’ technique for multiple levels of archival and smoothing. We will also discuss optimal indexing for efficient grouping. If time permits, we will show examples of stored procedures for the very tough methods, and also look at solving moving averages in SQL.

Many times developers fall back on programming and scripting languages such as php, perl, or python to solve this problem, as it can be hard to do strictly in SQL, but here we show some comparative run times that show magnitudes of performance gain.

If time permits, we will also go into using partitioning for maintenance and quick deletion of archived data, and using stored procedures for more complex smoothing applications.

Photo of Michael McFadden

Michael McFadden

Sciencelogic.com

Michael “Flip” McFadden earned his degree in Computer Science from Appalachian State University in 2000 and was hired on by the university to support the NASA Nearby Stars Survey as a Research Assistant and Data Reduction Specialist. He went on to earn his Linux Professionals Institute Level 1 certification in April 2004, and earned his MySQL 5.0 Certified Developer in May of 2008. Michael currently works for ScienceLogic as a Sr. Software Architect and can occasionally be found trolling on the #mysql IRC channel as ‘flipmcf’.

Comments on this page are now closed.

Comments

Tom Hanlon
04/22/2009 1:42am PDT

Wow !! what a session

Co-presented By:

O'Reilly Media MySQL/Sun Microsystems
  • Kickfire
  • Virident
  • Infobright, Inc
  • JasperSoft
  • Intel
  • Advanced Micro Devices
  • BIRT Exchange by Actuate
  • Calpont
  • Canonical
  • Continuent
  • Dolphin Interconnect Solutions
  • Facebook
  • HiT Software, Inc.
  • IBM
  • iDashboards
  • Oracle
  • Pentaho
  • R1Soft
  • Schooner Information Technology
  • SQLstream
  • Ticketmaster
  • Zmanda, Inc.
  • Linux Journal

Sponsorship Opportunities

For information on exhibition and sponsorship opportunities at the conference, contact Sharon Cordesse at scordesse@oreilly.com

Download the MySQL Sponsor/Exhibitor Prospectus

Media Partner Opportunities

Download the Media & Promotional Partner Brochure (PDF) for information on trade opportunities with O'Reilly conferences or contact mediapartners@ oreilly.com

Press and Media

For media-related inquiries, contact Maureen Jennings at maureen@oreilly.com

MySQL Conference Newsletter

To stay abreast of conference news and to receive email notification when registration opens, please sign up for the MySQL Conference newsletter.

Contact Us

View a complete list of MySQL contacts.