EC2, MapReduce, and Distributed Processing
Rails makes simple request/response web interaction easy, and provides some great tools, like migrations, for working with an application from the outside. However, Rails doesn’t offer much help with long-running processes, massive datasets, or complex computations. One approach to these problems is MapReduce, the programming model that powers much of Google, and Amazon EC2 is a natural place to run it.
In this talk, I will provide an overview of MapReduce, and will show a simple Ruby implementation. I will also demonstrate how to perform easy parallel processing using Amazon EC2. Finally, I will discuss ways in which Rails applications may benefit from this sort of distributed processing, and will compare MapReduce to other approaches to distributed/background processing.
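To give a flavor of the idea, here is a minimal single-process sketch of the MapReduce pattern in Ruby. It is an illustration only, not the implementation shown in the talk: the `map_reduce` helper, the `word_mapper` and `sum_reducer` lambdas, and the sample input are all hypothetical names chosen for this example. A distributed version would run the map and reduce phases on separate workers (for instance, EC2 instances) instead of in one loop.

```ruby
# Hypothetical helper: a single-process sketch of the MapReduce pattern.
# A real deployment would distribute the map and reduce calls across workers.
def map_reduce(inputs, mapper, reducer)
  # Map phase: each input produces a list of [key, value] pairs.
  pairs = inputs.flat_map { |input| mapper.call(input) }

  # Shuffle phase: collect all values that share a key.
  grouped = Hash.new { |hash, key| hash[key] = [] }
  pairs.each { |key, value| grouped[key] << value }

  # Reduce phase: collapse each key's values into a single result.
  grouped.each_with_object({}) do |(key, values), result|
    result[key] = reducer.call(key, values)
  end
end

# Example job (assumed for illustration): count words across lines of text.
word_mapper = ->(line) { line.downcase.scan(/\w+/).map { |word| [word, 1] } }
sum_reducer = ->(_word, counts) { counts.inject(0) { |total, n| total + n } }

lines = ["the quick brown fox", "the lazy dog", "The end"]
counts = map_reduce(lines, word_mapper, sum_reducer)
```

Because the mapper handles each line independently and the reducer handles each key independently, both phases can be split across machines without changing the job's logic, which is what makes the pattern a good fit for parallel processing.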
Jonathan Dahl
Tumblon
Jonathan Dahl is a developer and entrepreneur who started using Ruby on Rails in 2005 as a founding partner at Slantwise Design. He has written gems and plugins and led development on more than a dozen Rails applications, which span the spectrum from Web 2.0 to the enterprise.
Most recently, he has focused on Zencoder, a distributed video transcoding product, and Tumblon, a web-based service for parents to track and share their children’s growth.













