MongoSV
December 3rd Mountain View, CA
About
MongoSV was a four-track, one-day conference on December 3, 2010 at Microsoft Research Silicon Valley in Mountain View, CA. The main conference track featured 10gen founders Dwight Merriman and Eliot Horowitz, as well as Roger Bodamer, the head of 10gen's west coast operations, and several of the key engineers developing the MongoDB project. These sessions were geared towards developers and administrators interested in learning how to use the database, with sessions on schema design, indexing, administration, deployment strategies, scaling, and other features. A second track showcased several high-profile deployments of the database at Shutterfly, Craigslist, IGN, Intuit, Wordnik, and more. For more experienced users of the database, there were several advanced sessions, covering the storage engine, replication, sharding, and consistency models.
Video
Recordings of the sessions are available on the Video page.
Sessions
Introduction to MongoDB Dwight Merriman, the CEO and Co-Founder of 10gen, demos the key features of MongoDB, the open source, non-relational, document-oriented database.
Sharing Life's Joy using MongoDB: A Shutterfly Case Study Shutterfly is an Internet-based social expression and personal publishing service. MongoDB is used for various persistent data storage requirements within Shutterfly. MongoDB helps Shutterfly build an unrivaled service that enables deeper, more personal relationships between customers and those who matter most in their lives.
MongoDB and Craigslist's Data Storage Evolution This talk will discuss the ongoing evolution of data storage at Craigslist, starting from a homogeneous, one-size-fits-all "MySQL everywhere" approach and moving toward a heterogeneous environment that considers our real data and performance needs and the plethora of tools available today. I'll discuss how and why we chose MongoDB to be part of our future infrastructure and highlight what has gone well (and not so well) in our multi-billion document deployment. This will include a discussion of some new 1.6.x features, including replica sets and auto-sharding.
Schema Design: Data as Documents One of the challenges that comes with moving to MongoDB is figuring out how best to model your data. While most developers have internalized the rules of thumb for designing schemas for RDBMSs, these rules don't always apply to MongoDB. The simple fact that documents can represent rich, schema-free data structures means that we have a lot of viable alternatives to the standard, normalized, relational model. Not only that, MongoDB has several unique features, such as atomic updates and indexed array keys, that greatly influence the kinds of schemas that make sense. Understandably, this begets good questions:
- Are foreign keys permissible, or is it better to represent one-to-many relations within a single document?
- Are join tables necessary, or is there another technique for building out many-to-many relationships?
- What level of denormalization is appropriate?
- How do my data modeling decisions affect the efficiency of updates and queries?
In this session, we'll answer these questions and more, provide a number of data modeling rules of thumb, and discuss the tradeoffs of various data modeling strategies.
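As a rough illustration of the embed-versus-reference question raised above, the following Python sketch models a one-to-many relation both ways using plain dictionaries. The blog post and comment fields, and both helper functions, are hypothetical and are not taken from the session itself.

```python
# Hypothetical data: one blog post with two comments, modeled two ways.

# Option 1: embed the "many" side inside the parent document.
post_embedded = {
    "_id": 1,
    "title": "Hello",
    "comments": [
        {"author": "ann", "text": "First!"},
        {"author": "bob", "text": "Nice post"},
    ],
}

# Option 2: keep comments in their own collection and reference the parent,
# much like a foreign key in a relational schema.
posts = [{"_id": 1, "title": "Hello"}]
comments = [
    {"_id": 10, "post_id": 1, "author": "ann", "text": "First!"},
    {"_id": 11, "post_id": 1, "author": "bob", "text": "Nice post"},
]


def comments_embedded(post):
    """One lookup: the comments travel with the post document."""
    return [c["text"] for c in post["comments"]]


def comments_referenced(post_id, all_comments):
    """Second lookup: an application-side join on post_id."""
    return [c["text"] for c in all_comments if c["post_id"] == post_id]
```

Both functions return the same comment texts. The difference is that the embedded form needs a single read, while the referenced form needs a second query but keeps each comment independently addressable.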
Indexing and Query Optimizer We all know that MongoDB is one of the most flexible and feature-rich databases available. In this session we'll discuss how you can leverage this feature set and maintain high performance with your project's massive data sets and high loads. We'll cover how indexes can be designed to optimize the performance of MongoDB. We'll also discuss tips for diagnosing and fixing performance issues should they arise.
Administration This talk centers around the responsibilities of anyone administering MongoDB: security, server status, replication, and backups are all covered in detail. We'll describe how to integrate with monitoring services, discuss the output of various important diagnostic commands, provide hardware recommendations, and hint at a few tricky issues surrounding backups. There will be ample time at the end for any administrative questions.
Deployment Strategies Now that you have developed your application, it is time to deploy it into production. In this session we examine the tradeoffs of some of the standard deployment configurations. We'll highlight how to scale for read/write and maximum availability.
Scaling with MongoDB For applications that outgrow the resources of a single database server, MongoDB can convert to a sharded cluster, automatically managing failover and balancing of nodes, with few or no changes to the original application code. This talk starts by discussing when to shard and continues on to describe MongoDB's sharding architecture. We'll describe how to configure a shard cluster and provide several example topologies. We'll also give some advice on schema design for sharding and how to pick the best shard key.
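To make the shard key point concrete, here is a minimal Python sketch of range-based routing, assuming keys are mapped to chunks by fixed split points. The split points, key values, and function names are hypothetical, and the real balancer splits and migrates chunks over time, which this sketch does not model.

```python
import bisect
from collections import Counter

# Hypothetical split points: chunk 0 holds keys below 100, chunk 1 holds
# keys in [100, 200), chunk 2 holds [200, 300), chunk 3 holds 300 and up.
SPLIT_POINTS = [100, 200, 300]


def chunk_for(key):
    """Return the index of the chunk whose key range contains `key`."""
    return bisect.bisect_right(SPLIT_POINTS, key)


def load_per_chunk(keys):
    """Count how many writes each chunk would receive."""
    return Counter(chunk_for(k) for k in keys)


# An ever-increasing key (for example a counter) always lands in the
# last chunk.
increasing_keys = range(300, 400)

# Keys spread across the key space are distributed over all chunks.
spread_keys = [7, 150, 260, 390, 42, 180, 299, 305]
```

With these inputs, every one of the increasing keys routes to chunk 3, while the spread keys hit each of the four chunks twice. This is one reason a monotonically increasing shard key tends to concentrate insert load on a single shard.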
Map/reduce, geospatial indexing, and other cool features This session will demonstrate how to build a simple location-based application with geospatial indexing, map/reduce and other interesting MongoDB features.
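For readers unfamiliar with the map/reduce pattern, the Python sketch below imitates it in memory. It is not MongoDB's implementation, where map and reduce functions are written in JavaScript and run on the server. The check-in documents and all function names are hypothetical.

```python
from collections import defaultdict


def map_reduce(docs, map_fn, reduce_fn):
    """Group the (key, value) pairs emitted by map_fn, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# Hypothetical check-in documents for a location-based application.
checkins = [
    {"user": "ann", "city": "Mountain View"},
    {"user": "bob", "city": "Palo Alto"},
    {"user": "cat", "city": "Mountain View"},
]


def map_checkin(doc):
    """Emit one (city, 1) pair per check-in."""
    yield doc["city"], 1


def reduce_counts(key, values):
    """Sum the emitted counts for one city."""
    return sum(values)


counts = map_reduce(checkins, map_checkin, reduce_counts)
```

Here `counts` maps each city to its number of check-ins. The same shape of computation covers most per-key aggregations.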
MongoDB and Windows Azure Windows Azure is a cloud services operating system that serves as a development, service hosting, and service management environment.
Building a Social Graph at Eventbrite with MongoDB Eventbrite is an online ticketing solution that inherently connects users who attend events together. Using MongoDB, Eventbrite has been able to expose this social graph as well as integrate connections from other social networks to create custom event recommendations. Brian will talk about how and why they chose MongoDB as the platform for this work as well as their experiences with it in a production environment. For more details, read the September TechCrunch article or Eventbrite blog.
Confessions of a Recovering Relational Addict Chandra Patni of IGN will talk about his experiences going from multiversion concurrency control in relational databases to distributed database design using MongoDB.
Deriving deep customer insights using MongoDB One of the biggest opportunities for innovation on the web today is analyzing the huge set of customer data. Unfortunately, supporting these datasets is labor-intensive due to immature database solutions. With MongoDB's document-based approach, Intuit created a solution that provided real-time, minute-by-minute data on their customers' websites and helped create "websites that work" for small businesses. We will discuss Intuit's experience in tracking and analyzing the data using MongoDB. We also cover the set of challenges faced by Internet-scale systems, including dependency management, database changes, data collection and aggregation, and reusable patterns associated with each challenge.
Mastering the MongoDB Shell (Advanced Topic) This session will cover the many operations that can be performed with MongoDB through a command-line interface. We'll cover basic CRUD operations, database commands, scripting with JavaScript, and ways to get help on the fly. By the end of the session, and with a little extra practice, you'll have yourself some respectable MongoDB-shell-fu.
Low Latency Event Logging with BSON Use BSON as a high performance serialization format for event logging in C++.
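To show what a BSON-serialized log event looks like on the wire, here is a minimal Python sketch of an encoder for flat documents. The byte layout follows the public BSON specification, but the function names and the choice of supported types are assumptions made for illustration. The talk itself concerned C++, and a production logger would use a real BSON library.

```python
import struct


def encode_element(name, value):
    """Encode one field as: type byte, NUL-terminated name, then the value."""
    key = name.encode("utf-8") + b"\x00"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return b"\x08" + key + (b"\x01" if value else b"\x00")
    if isinstance(value, float):  # 0x01: 64-bit little-endian double
        return b"\x01" + key + struct.pack("<d", value)
    if isinstance(value, int):  # 0x12: 64-bit little-endian integer
        return b"\x12" + key + struct.pack("<q", value)
    if isinstance(value, str):  # 0x02: int32 byte length, UTF-8 bytes, NUL
        data = value.encode("utf-8") + b"\x00"
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    raise TypeError(f"unsupported type for field {name!r}")


def encode_document(fields):
    """Encode a flat dict as: int32 total size, elements, trailing NUL."""
    body = b"".join(encode_element(k, v) for k, v in fields.items())
    # The size counts the 4-byte prefix and the 1-byte terminator as well.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


# Hypothetical log event.
event = encode_document({"level": "info", "latency_ms": 1.5, "ok": True})
```

Because every value is either fixed-width or length-prefixed, a writer can append fields without escaping and a reader can skip fields it does not need. That property is part of what makes the format attractive for logging.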
Hadoop Plugin for MongoDB Learn how to integrate MongoDB with Hadoop for large-scale distributed data processing. Using tools like MapReduce, Pig and Streaming you will learn how to do analytics and ETL on large datasets with the ability to load and save data against MongoDB. With Hadoop MapReduce, Java and Scala programmers will find a native solution for using MapReduce to process their data with MongoDB. Programmers of all kinds will find a new way to work with ETL using Pig to extract and analyze large datasets and persist the results to MongoDB. Python and Ruby Programmers can rejoice as well in a new way to write native Mongo MapReduce using the Hadoop Streaming interfaces.
Scalable event analytics with Ruby on Rails & MongoDB Are you developing a big data application and wish that it was as easy as building a Ruby on Rails application? Are you frustrated with MySQL and the bottlenecks it imposes on your system? Are you fighting with Hadoop and how to make it work seamlessly with your Rails apps? At Yottaa, we faced these challenges: we need to process lots and lots of data very quickly and we want to be able to build these apps as quickly and efficiently as any other Rails applications. In this talk I'll show you how you can use Ruby on Rails and MongoDB to build a scalable data processing platform with just a fraction of the resources of competing solutions.
MongoDB Internals: The Storage Engine (Advanced Topic) This session will discuss MongoDB internals including the storage engine: how files are stored on disk, collections, extents, records, internal data management, and other topics.
Hosted MongoDB Learn about MongoHQ, MongoMachine, and MongoLab.
MongoDB Consistency Models and Transactional Semantics This session will cover various deployment strategies for MongoDB which can be used, including strong/immediate consistency and eventually consistent read models. Atomic operations on documents will be discussed, including what sorts of transactions are possible and not possible with MongoDB, as well as a discussion of the commit model.
MongoDB is Powering ShareThis Count System ShareThis makes it easy to share content online. ShareThis is the world's largest sharing network, reaching over 400 million users on 900,000+ domains across the web.
Powering Social Games through MongoDB PlayFirst is powering its data driven Social Games platform through the use of MongoDB. The session will discuss the architecture behind PlayFirst's innovative Real Time Analytics Data Warehouse and how PlayFirst is managing its high volume user data transactions to drive its Social Games.
Crazy Stuff : hacks, internals, and sneaky tricks Fun stuff and tips for those who already know MongoDB well.
Monitoring MongoDB What you need to keep tabs on to make sure your MongoDB cluster stays running healthily. At Boxed Ice, David uses MongoDB to handle billions of database documents.
Welcome the Official C# Driver An introduction to the official C# driver developed by 10gen.
Full-Auto Mongo: Taking the Pain out of Deploying and Scaling MongoDB in the Cloud Deploying MongoDB can be tedious and error-prone without even thinking about sharding or backups. A minimum of 3 nodes must be created. Config servers must be started on each. Routers must then be started to configure these config servers. At this point, we can finally start shard servers, create replica sets, and finally create shards with each replica set. Then rinse, lather, and repeat every time you want to scale up or down. But, still you have no monitoring, and as we’ve all learned, this unnecessarily increases our exposure to the risk of dramatic and extended outage. Makara has set out to end this misery. This hands-on talk will show you how to automate deployment, scaling, and monitoring for MongoDB in the cloud. By the end of this talk, you will be well on your way to MongoDB nirvana!
MongoDB with Morphia: Web Scale Java Development Morphia is a lightweight type-safe library for mapping your Java objects to and from MongoDB.
Mobile Apps with iOS and MongoDB Fast and flexible document stores make MongoDB a great backbone for networked mobile apps. We'll look at tools and examples that show how you can work with MongoDB directly within an iOS app or to build simple and powerful networked backends for your iOS and other mobile client apps.
Logging Application Behavior to MongoDB An overview of libraries and techniques for storing and analyzing application log events in MongoDB. The presenter is a committer on log4mongo-java, but other languages will also be covered.
Replication in Detail MongoDB supports asynchronous replication of data between servers for failover and redundancy. In this session, we'll introduce the different modes of replication, including master-slave and replica sets, and we'll describe how to achieve better durability by adjusting the write concern. We'll also discuss backups and provide some tips on scaling with replication alone.
Keeping the Lights on with MongoDB Wordnik stores its entire text corpus in MongoDB - 3.5TB of data in over 11 billion records. At MongoSF, Tony presented on his experiences migrating from MySQL to MongoDB. In this talk, Tony will present on tools he's been working on to utilize the oplog for backups, point-in-time recovery, disaster recovery, multi-data center replication, and master-master "mesh" modes. This session will be of interest to administrators who are concerned about supporting MongoDB from the administrative point of view.
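The idea behind oplog-based point-in-time recovery can be sketched in a few lines of Python: replay a log of operations in timestamp order and stop at the chosen moment. The entry format below is a simplified, hypothetical stand-in loosely modeled on oplog entries (an operation code plus a document). It is not the real oplog schema, and it does not represent the tools described in the talk.

```python
# Hypothetical, simplified operation log: "i" = insert, "u" = update,
# "d" = delete. Each entry carries a timestamp and a document with an _id.
OPLOG = [
    {"ts": 1, "op": "i", "o": {"_id": "a", "n": 1}},
    {"ts": 2, "op": "u", "o": {"_id": "a", "n": 2}},
    {"ts": 3, "op": "d", "o": {"_id": "a"}},
]


def replay(oplog, until_ts):
    """Rebuild collection state by applying entries with ts <= until_ts."""
    state = {}
    for entry in sorted(oplog, key=lambda e: e["ts"]):
        if entry["ts"] > until_ts:
            break  # stop at the requested point in time
        doc_id = entry["o"]["_id"]
        if entry["op"] == "i":
            state[doc_id] = dict(entry["o"])
        elif entry["op"] == "u":
            # Merge the changed fields into the existing document.
            state.setdefault(doc_id, {"_id": doc_id}).update(entry["o"])
        elif entry["op"] == "d":
            state.pop(doc_id, None)
    return state
```

Calling `replay(OPLOG, 2)` recovers the document as it stood just before the delete at timestamp 3, which is the essence of recovering from an accidental removal.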
Sharding: The Details Learn how auto-sharding works under the hood. This talk will cover routing, balancing, and other aspects of MongoDB's sharding architecture.
Indexing and Query Optimizer: The Details We will take an in-depth look at the MongoDB indexing system and query optimizer, with a number of examples. We will discuss recent query optimizer enhancements including support for $or queries and rudimentary support for multidimensional queries.
MongoDB Spring Integration Chris Richardson of VMware presents on MongoDB and Spring.
Lightning Talk: Monitaur Monitaur is instant server monitoring. Set it up in under 30 seconds (seriously, it's that easy), and rest assured your server is staying healthy. Monitaur was built in 48 hours for Rails Rumble. MongoDB bridged the gap between the Rails 3 website and the Node.js metric collection + real-time comet server.
Lightning Talk: Cryptographically Signing Data in BSON Lightning talk by Andy Rondeau.
Lightning Talk: Scala and Casbah Lightning talk by Gregg Carrier.
Location
Microsoft Research Silicon Valley
1065 La Avenida St
Mountain View, CA 94043