AOL deployed its large-scale Real Time News (RTN) system in 2007. This system receives news updates from over 30,000 sources every second, around the clock. Today, its data store, MySQL, has accumulated several billion rows and terabytes of data. Even so, news is delivered to end users in close to real time. This presentation shares how it is done and the lessons learned.
This talk will cover forecasting and planning future growth for high-volume MySQL instances occupying many (500+) servers. It will include the metrics tracked (and how to filter through noise), when and how to migrate, operational optimizations such as upgrades that can be incorporated, and how to deal with server technology that evolves faster than server lifetimes.
Founded by two high school students in 2005, myYearbook.com has grown to become one of the top 5 social networks and one of the top 25 most trafficked web sites in the United States. In this session, we'll review the growing pains, unique architectural decisions, and methodologies employed to support the consistent growth and demand of a social network.
The Facebook database engineering team works with the community and on its own to make MySQL better for data center deployments. This work is visible in the Facebook patch, bugs fixed in official MySQL and features sponsored in other distributions. We will describe work to support a large number of large databases. We focus on backup, replication and quality of service.
GitHub's history with MySQL and what we've built off of it.
In the midst of many attempts to "solve" the RDBMS high availability problems, the vast majority of Yahoo sites are still using plain old boring MySQL replication to accomplish HA. This talk will cover the principles of this architecture, its advantages and disadvantages, as well as what we see as needed for future HA advances. It's old-school, it's crude, but somehow it solves most HA problems.
A real-world example of how re-sharding and table partitioning cut load data times in Facebook's analytics infrastructure from greater than 24 hours to less than 5 minutes.
Site failures can blow your business out of the water unless you have a disaster recovery site already set up, tested, and ready to go. This talk presents a cookbook approach for setting up and managing MySQL DR. Standard architectures, failover procedures, and failback are covered. Finally, we talk about how to test it all so you know it works.
Amazon engineers share experiences managing a large fleet of MySQL databases.
We will examine the challenges faced by Zynga in running a large-scale MySQL plant in EC2. Serving our social games to millions of players around the globe has required significant investment in automation and performance optimization for the thousands of MySQL instances that drive the games. Delivering high performance in the cloud requires a unique approach to support high CPU and I/O demands.
People talk about NoSQL in the context of distributed cloud-based web applications, but what if your application needs to be deployed throughout rural Africa, with limited computing resources, intermittent power, and above all, extremely unreliable internet? This talk discusses the features of CouchDB that make it uniquely suited for developing-world health applications.
As CTO of Outside.in, and in my new stealth company, I've seen my share of challenging scenarios keeping a very busy PostgreSQL-based startup online and responsive during tremendous growth. EC2 + PostgreSQL + PostGIS + no downtime. Others can probably learn from my battle scars!
This talk will discuss the ongoing evolution of data storage at Craigslist, starting from a homogeneous one-size-fits-all "MySQL everywhere" approach and moving toward a heterogeneous environment that considers our real data and performance needs and the plethora of tools available today (including Redis, MongoDB, MySQL, and Sphinx).