MongoSV 2012
December 4th, Santa Clara, CA, United States
MongoSV is an annual one-day conference in Silicon Valley, CA, dedicated to the open source, non-relational database MongoDB. This year’s conference includes over 50 sessions by 10gen engineers and MongoDB users from companies such as foursquare, GitHub, Apollo Group (University of Phoenix), AOL, and more. For the first time, we are offering a full track dedicated to operations for those interested in learning about maintenance strategies and best practices for MongoDB clusters.
For more information, check out the agenda below or our blog post Get Ready for MongoSV.
Slides & Video
Slides and video from MongoSV are available at 10gen.com/presentations.
Hashtag
Follow the #MongoSV hashtag to stay up-to-date on all things MongoSV.
Agenda
See below, or download the PDF agenda here.
MongoSV Office Hours
M101 (MongoDB for Developers) and M102 (MongoDB for DBAs) Office Hours
At MongoSV, we're hosting office hours for students enrolled in 10gen's online MongoDB classes. M101 instructor Andrew Erlichson and M102 instructor Dwight Merriman will be available to answer your questions about the course and MongoDB in general. Office Hours will run from 11:00am to 12:30pm.
Ask the Experts
At MongoSV, we're hosting office hours for anyone who wants to ask a 10gen engineer a question directly. Sign-up is on site and first come, first served; time is limited to 15 minutes per attendee.
MongoDB Sponsors - learn more here
Schedule
| Room B4 (10gen Dev Track) | Room B5 (10gen Ops Track) | Room M1 | Room M3 | Room H | Room G | |
|---|---|---|---|---|---|---|
|
|
Registration | |||||
|
|
Welcome to MongoSV (Eliot Horowitz, CTO/Co-Founder, 10gen). Located in room B4 | |||||
|
|
Session Transition, Coffee, and Registration | |||||
|
|
Schema Design (Sridhar Nanjundeswaran, Software Engineer, 10gen). One of the challenges that comes with moving to MongoDB is figuring out how best to model your data. While most developers have internalized the rules of thumb for designing schemas for RDBMSs, these rules don't always apply to MongoDB. The simple fact that documents can represent rich, schema-free data structures means that we have a lot of viable alternatives to the standard, normalized, relational model. Not only that, MongoDB has several unique features, such as atomic updates and indexed array keys, that greatly influence the kinds of schemas that make sense. Understandably, this begets good questions: Are foreign keys permissible, or is it better to represent one-to-many relations within a single document? Are join tables necessary, or is there another technique for building out many-to-many relationships? What level of denormalization is appropriate? How do my data modeling decisions affect the efficiency of updates and queries? In this session, we'll answer these questions and more, provide a number of data modeling rules of thumb, and discuss the tradeoffs of various data modeling strategies. |
Sharding (Brandon Black, 10gen). Sharding allows you to distribute load across multiple servers and keep your data balanced across those servers. This session will review MongoDB’s sharding support, including an architectural overview, design principles, and automation. |
How We Evaluated MongoDB as a Relational Database Replacement (Brig Lamoreaux, Senior Software Engineer, Apollo Group). This talk explains the process, methodology, and results of Apollo Group's evaluation of MongoDB as a replacement for Oracle in a core platform component. |
Lightning Talks (Niall O'Higgins; Robert Vandehey, Senior Director, Rovi Corp; Michael Calabrese, Senior Developer, Lunar Logic) - NOMP Stack has arrived: Node.JS, MongoDB and PaaS: Niall O'Higgins, CTO, Beyond Fog |
Build an App Track - Part 1 (Eliot Horowitz, CTO/Co-Founder, 10gen). We will build an IRC server based on MongoDB. In this first session we will review how the server works and how to manage the necessary data. We will look in detail at how to build a Message Bus that is the backbone of our IRC server. |
Sailthru: Moving from Cloud to Colo (Ian White, CTO and Co-Founder, Sailthru). For nearly two years, Sailthru ran our entire MongoDB deployment on Amazon EC2, serving terabytes of data and processing thousands. We had a number of successes and a number of challenges, and ultimately decided to move our primaries to physical hardware. I'll reflect on the move and the pros and cons of both cloud and metal, from hard experience. |
|
|
Indexing and Query Optimization (Max Schireson, CEO, 10gen). MongoDB supports a wide range of indexing options to enable fast querying of your data. In this talk we’ll cover how indexing works, the various indexing options, and use cases where each might be useful. |
Deployment Preparedness (Alvin Richards, 10gen Technical Director for EMEA, 10gen). The last bugs are fixed, testing is complete, and the business is ready. What do you do next? In this talk we will cover the topics that ensure you are prepared for a successful launch of your MongoDB-based product, including: • Key counters and metrics: Page Faulting? IO Bound? What's my working set? How do I know? • Load Testing and Capacity Planning: How much resource is my MongoDB going to use? When do I need to add replicas and shards? • Monitoring: What should I be watching and how do I know if things are running correctly? We will map the theory to the practice by illustrating with real-world examples. |
High Performance, Scalable MongoDB in the SoftLayer Bare Metal Cloud (Duke S. Skarda, CTO, SoftLayer). Duke Skarda joined SoftLayer in November, 2010. As the CTO, he is responsible for designing, implementing and enhancing the proprietary SoftLayer Infrastructure Management System (IMS). Prior to joining SoftLayer, Mr. Skarda served as the Vice President of Information Technology and Software Development at The Planet. He served in this role from June, 2009 to October, 2010. Previously, Mr. Skarda spent 10 years with Level 3 Communications in a series of increasingly responsible positions. As Senior Vice President for its Content Markets Group, he led Engineering and IT development for its Content Distribution Network platform and IT support systems. As Senior Vice President for IT Architecture and Application Development, he led a broad range of programs, including business process management, order-entry and service assurance development. Mr. Skarda also served as Vice President of IT Architecture, where he led the development of long-range systems roadmaps, systems merger and acquisition planning, and the development of the company's Enterprise Architecture team. Additionally, he has four years of experience with Sprint. Mr. Skarda earned a B.S. in Computer Engineering and Electrical Engineering from The University of Texas at El Paso. |
Lightning Talks (Mark Nielsen, Programming DBA Geek, Reputation.com; Michael Salera, Software Architect, MercadoLibre) - Database Administration Dashboard for MongoDB: Mark Nielsen, Programming DBA Geek, Reputation.com |
Building an IRC App on MongoDB: Deployment, Replication and Monitoring (Eliot Horowitz, CTO/Co-Founder, 10gen; Shaun Verch, Software Engineer, 10gen). Now that we've built our app, we will look at how to deploy it in production. We will design and deploy a replica set to support a highly available backend for the server. We'll also show you how to monitor your server, both from the shell and using the MongoDB Monitoring Service (MMS). Additionally, we will try to break our cluster and show how the service stays running throughout failures as well as how to recover from catastrophic failures. |
Automated Slow Query Analysis: Dex the Index Robot (Eric Sedor, Engineer, MongoLab). A well-indexed query improves performance by several orders of magnitude. The trick is to identify an ideal set of indexes for a particular use case. Even for experts, hand-crawling MongoDB logs for slow queries is a laborious process. Introducing Dex: an open-source automated tool for analyzing the slow query log or system.profile collection. Dex's primary author Eric Sedor demonstrates Dex's usage and elaborates on indexing topics from the basic to the advanced, including how to pick indexes in an elegant, practical way. You will learn how Dex categorizes slow queries and recommends indexes to help keep your application running smoothly. Eric is an engineer at MongoLab, a cloud hoster of MongoDB, where Dex is used daily to optimize customer indexes. |
|
|
Coffee Break | |||||
|
|
Replication (Hannes Magnusson, PHP Evangelist, 10gen). MongoDB supports replication for failover and redundancy. In this session we will introduce the basic concepts around replica sets, which provide automated failover and recovery of nodes. We'll show you how to set up, configure, and initiate a replica set, and methods for using replication to scale reads. We'll also discuss proper architecture for durability. |
Capacity Planning (Scott Hernandez, Software Engineer, 10gen). Deploying MongoDB can be a challenge if you don't understand how resources are used or how to plan for the capacity of your systems. If you need to deploy, or grow, a MongoDB single instance, replica set, or tens of sharded clusters, then you probably share the same challenges in trying to size that deployment. This talk will cover what resources MongoDB uses, and how to plan for their use in your deployment. Topics covered will include understanding how to model and plan capacity needs from the perspective of a new deployment, growing an existing one, and defining the steps along your path to scale. The goal of this presentation is to provide you with the tools needed to be successful in managing your MongoDB capacity planning tasks. |
The three aaS's of MongoDB in Windows Azure (David Makogon, Microsoft; Doug Mahugh, Microsoft; Sridhar Nanjundeswaran, Software Engineer, 10gen). MongoDB can be deployed to Windows Azure via PaaS or IaaS. And - surprise! - now there's a SaaS option as well! This session will show all three approaches and compare them. |
Lightning Talks (Rick Copeland, Synapp.io; Katia Aresti, Developer) - Multi-Master Replication for MongoDB: Rick Copeland, Principal, Arborian |
Building an IRC App on MongoDB: Build a Scalable Message Logging Service and Then Shard It - Live! (Eliot Horowitz, CTO/Co-Founder, 10gen; Shaun Verch, Software Engineer, 10gen). We will extend the IRC server to log the full chat history. This requires a scalable backend to store an infinitely growing volume of messages. We'll look at several ways of designing the message storage and the limitations and tradeoffs of each. We will then deploy our updated IRC server and upgrade from a replica set to a sharded cluster without any downtime. |
MongoDB at Foursquare: From the Cloud to Bare Metal (Jon Hoffman, Server Engineer, foursquare). Foursquare recently migrated their Mongo infrastructure from servers running on Amazon's EC2 to bare metal hardware in their own DC. I'll talk about why and how we moved and what we learned along the way. |
|
|
Aggregation (Bryan Reinero, Software Engineer, 10gen). MongoDB's native tools for processing data can help you make sense of your data. This presentation will focus on our native implementations of the new Aggregation Framework and Map Reduce. This session will include examples and practical strategies for aggregating data. |
Understanding MongoDB Storage for Performance and Data Safety (Antoine Girbal, Solution Architect, 10gen). MongoDB supports write-ahead journaling of operations to facilitate fast crash recovery and durability in the storage engine. In this session, we'll give an overview of durability with MongoDB, demo journaling, and discuss journaling internals. |
Exploring Public Datasets and APIs with MongoDB and Analytica (Nosh Petigara, President, Analytica). With its JSON-based data model, MongoDB is the ideal database for storing and analyzing data from public APIs. In this session, we'll use a few different examples to show how one can import data from public APIs directly into MongoDB using tools built into MongoDB or simple command line scripts. Next, using some examples from the Twitter, foursquare, and NYTimes APIs, we will go through how you can explore, analyze, and visualize these datasets to extract useful (and often surprising!) conclusions. For this session we'll be using MongoDB's aggregation framework as well as Analytica, a new tool for analyzing and reporting on data in MongoDB. |
Lightning Talks (Venky Vaddineni, CEO/CTO, Sphata Systems; Mike Saparov, Director of Engineering, MuleSoft) - Connecting Healthcare in remote locations (Hospitals, Physicians and Patients): Venky Vaddineni, CEO/CTO, Sphata Systems |
Building an IRC App on MongoDB: Backups, Monitoring and Ops for a Replicated and Sharded Deployment (Eliot Horowitz, CTO/Co-Founder, 10gen; Shaun Verch, Software Engineer, 10gen). We will explore the various options for backups, go through the rich information available through the monitoring service (MMS), and observe the automated chunk split and automated chunk migration across shards. We will also provoke several failures in the cluster and see how to recover. |
Managing Large Scale Data Streams with MongoDB (Tao Cheng, System Architect, AOL). Managing hundreds of millions of structured data records with real-time updates is a daunting task by itself. Meshing the data with semantic linking such as trending, clustering, and relationships makes the task even more challenging, if not impossible. In this presentation, the author shares why a document-oriented database (like MongoDB) is a must in such real-world projects, and the numerous benefits that come with it. Schema design techniques, trade-offs, query optimization, performance statistics, caveats, and comparisons with other technologies, such as MySQL and SOLR, are discussed. |
|
|
Lunch Break | |||||
|
|
Data Modeling Examples From the Real World (Jared Rosoff, 10gen). In this session, we'll examine schema design insights and trade-offs using real-world examples. We'll look at three example applications: building an email inbox, selecting a shard key for a large scale web application, and using MongoDB to store user profiles. You should leave the session with an idea of the advantages and disadvantages of various approaches to modeling your data in MongoDB. Attendees should be well versed in basic schema design and familiar with concepts in the morning's basic schema design talk. No beginner topics will be covered in this session. |
Advanced Sharding Features (Bernie Hackett, Software Engineer, 10gen). In this session we will take an in-depth look at shard keys and look at multi-data center and tag aware sharding. Attendees should be well versed in basic sharding and familiar with concepts in the morning's basic sharding talk. No beginner topics will be covered in this session. |
MongoDB for Analytics (John Nunemaker, Developer, GitHub). The flexibility of MongoDB makes it perfect for storing analytics. I'll discuss a few patterns for storing data that we have learned while growing Gaug.es from zero to millions of page views a day. You'll leave with a desire to measure everything and the ability to do it. |
Simplify your MongoDB Java cloud apps with Spring Data (Thomas Risberg, VMware). Let Spring Data make developing MongoDB Java cloud applications a lot easier. We'll show you step-by-step how to create a simple application using Spring, Spring Data and MongoDB and deploy it to a Platform as a Service cloud like Cloud Foundry or AppFog. We will also cover in more detail, using code examples, what the Spring Data project provides for MongoDB Java developers, including object mapping, annotation-based mapping metadata, query/criteria/update DSLs, automatic implementation of Repository interfaces and more. |
Real-Time Location Based Social Discovery Using MongoDB (Fredrik Björk, Director of Engineering, Banjo). Banjo offers the ability for users to go anywhere in the world to discover content from social networks like Foursquare, Facebook, Instagram and Twitter. Using MongoDB's geospatial indexing and TTL collections we allow users to see friends, mutual friends and other relevant geo updates in real-time. |
Putting 3 Billion Ancestors in Your Pocket: FamilySearch's Journey from RDBMS to MongoDB (Judson Flamm, Cloud Architect / Principal Engineer, LDS Church). FamilySearch, owned by the Church of Jesus Christ of Latter-day Saints, is one of the largest web properties in the family history industry, holding over 3 billion vital records and images used to help patrons in discovering their roots. Only the social network Facebook rivals it for the sheer amount of data it must parse and serve: The social graph, or lineage graph in this case, poses a very significant Big Data challenge. FamilySearch is curating the family history of mankind, and holds approximately one billion records in a lineage-linked “tree of mankind.” This session will outline why FamilySearch is moving off traditional relational database technology to tackle its family history Big Data problems, and will include a discussion of its process for evaluating MongoDB to significantly increase both the performance and scale of its family tree application. It will also feature an early demonstration of a special “Bumping” application that allows two users to locate where their family trees may have linked up generations ago. |
|
|
Advanced Replication Features (Dwight Merriman, Chairman/Co-Founder, 10gen). In this session we will cover wide area replica sets, using tags for backup, and how the changes in 2.2 make replication better. Attendees should be well versed in basic replication and familiar with concepts in the morning's basic replication talk. No beginner topics will be covered in this session. |
MongoDB Security Features (Ron Avnur, VP Products and Services, 10gen). In this session, we'll provide a preview of the security features that we are working on for the next version of MongoDB. |
MongoDB + Pig (K Young, CEO, Mortar). Learn to process Mongo data with Hadoop, specifically with Apache Pig. We will cover the steps needed to read JSON from Mongo into Pig, parallel process it on Hadoop with sophisticated functions, and write back to Mongo. This talk will demonstrate its concepts with Mortar. Mortar is a PaaS for developing and executing Hadoop data pipelines. Mortar has contributed to the Mongo Hadoop connector, extending it to work with Pig. |
MongIOPS - Your Favorite Data Store, Only Faster (Matthew Kennedy, Solution Architect, Fusion-io; Dale Russell, Chief Technical Officer, Talksum, Inc.). Today's real-time world leaves no room for waiting. With more data to process, and less time at hand, pairing MongoDB with Fusion-io flash memory solutions is a winning combination. In this talk we’ll present the basics of architecting MongoDB for real-time results, and the ability to process more data faster. We’ll walk through the architecture, results and ways you can harness the power of MongoDB along with the speed of NAND flash memory today. |
MongoDB at Coupons, Inc. (Hemant Bist, Senior Engineer, Coupons.com). MongoDB is playing a critical role in new initiatives at Coupons, Inc., where it is our primary database. We use it for a wide variety of cases, including real-time site statistics for business development and as the primary backend for the coupon codes content management system. We would like to talk about our experience with MongoDB: the things that worked for us (ease of data modeling, rapid prototyping, scaling, etc.) and the things that we felt would have helped us in migration and in our future projects, e.g. integration with Solr, features in the PHP client, and better join support (when we need it!). |
NodeStack: Node.js, MongoDB and SmartOS as the Modern Stack for the Real-Time Web (Jason Hoffman, Founder and CTO, Joyent). The combination of Node.js and MongoDB has emerged as the framework of choice for people building real-time applications, and with SmartOS as the foundation it is on the path to becoming the replacement for the common LAMP stack. This talk will suggest some reasons why and discuss some of the observed patterns in these types of applications. |
|
|
Coffee Break | |||||
|
|
MongoDB and Hadoop (Steve Francia, Chief Evangelist, 10gen). Learn how to integrate MongoDB with Hadoop for large-scale distributed data processing. Using tools like MapReduce, Pig and Streaming you will learn how to do analytics and ETL on large datasets with the ability to load and save data against MongoDB. With Hadoop MapReduce, Java and Scala programmers will find a native solution for using MapReduce to process their data with MongoDB. Programmers of all kinds will find a new way to work with ETL using Pig to extract and analyze large datasets and persist the results to MongoDB. Python and Ruby programmers can rejoice as well in a new way to write native Mongo MapReduce using the Hadoop Streaming interfaces. |
Lessons from the Field: Performance and Operations (Scott Hernandez, Software Engineer, 10gen). In this session we’ll talk through a series of examples to distill some of our best operational practices. The format of this talk is an interactive and fun adventure through some real-world cases that come from real systems and large deployments. This session will touch on backups, network availability, performance pitfalls, indexing/schema-design, log management, monitoring and alerting along with some good examples of diagnostic techniques with a goal of finding good solutions. |
MongoDB Performance Tuning (Kenny Gorman, Founder, ObjectRocket). Kenny Gorman will explain how to find, diagnose, and remedy bad statements, bad configurations, and poorly set up systems. He will take an in-depth look at the MongoDB profiler, and how to use the new Aggregation Framework for powerful new profiling techniques. He will cover techniques in data locality, read and write tuning, SSD usage, tools, and performance enhancements with the latest version of MongoDB. |
White Board Session (Eliot Horowitz, CTO/Co-Founder, 10gen) |
MongoDB for Multi-Dimension Spatial Indexing (Nick Knize, Senior Geospatial Engineer, Thermopylae). Learn how Thermopylae Sciences and Technology adapted MongoDB for tackling multi-dimension geospatial indexing at massive scale. During the presentation we will cover: - Why MongoDB over alternatives - What are the enhancements to the spatial indexer for multi-dimension spatial data - Where is it being used and what are the next steps |
Bringing Spatial Love to Your Java Application (Steve Citron-Pousty, OpenShift by Red Hat). You have seen the stuff that Foursquare has done with spatial and you want some of that hotness for your app. But, where to start? MongoDB offers an easy way to get started and enables a variety of location-based applications - ranging from field resource management to social check-ins. In this session we are going to show you how easy it is to add spatial functionality to your application using MongoDB. We will load up a spatial database and then create web services, using a straight Java MongoDB driver and Hibernate OGM, then let your web or mobile application take advantage of the spatial functionality in MongoDB. Our application will be hosted on OpenShift, Red Hat's Platform as a Service, which provides multiple language development and native MongoDB hosting. By the end of this session, you will be ready to go home and start using MongoDB to add some great functionality and spatial love to your Java application. |
|
|
MongoDB Schema Design: Insights and Tradeoffs (Montse Medina, COO, Jetlore). I will describe the challenges we faced when designing a MongoDB database for processing large data streams and the solutions we applied. Some of the difficulties included write-intensive loads, uneven access patterns (posts with many followers get many more hits than posts with few followers), and non-trivial support of privacy. I will describe the choices we made for schema design to optimize writes and enable efficient querying/retrieval. I will also talk about indexing strategies, tradeoffs we made to work around MongoDB design, and the reasoning we applied to find the optimal denormalization of collections. |
Monitor and Optimize your cluster with the MongoDB Monitoring Service (MMS) (Antoine Girbal, Solution Architect, 10gen). This talk will cover MMS, the MongoDB Monitoring Service. MMS is a free MongoDB monitoring SaaS solution built by 10gen and used by many MongoDB users. Monitoring is a necessary activity for any production database system to detect upcoming or ongoing issues. In addition it gives insight into all the vitals of your system and can help detect bottlenecks and inefficiencies for improved performance. This talk will focus on: - what is MMS and how to get started - understanding each metric and graph - what are signs of trouble, when to take actions or panic - what are signs that your hardware resources are not properly used - how did we build MMS, the high performance time series system |
High Performance Real Time Analytics (David Mytton, Server Density). Real time analytics is a compelling use case for MongoDB, but only if everything is up and running smoothly. We'll talk about how to set up and configure MongoDB to maintain high performance and redundancy. This will cover what to consider for high write throughput performance from hardware configuration through to the use of replica sets, multi-data centre deployments, monitoring and sharding to ensure your database is fast and stays online. |
White Board Session (Dwight Merriman, Chairman/Co-Founder, 10gen) |
Schema Documentation and Design Using a Mind Map (Tavo De Leon, Solution Architect, Steelhead Consulting). Most people are visual learners. Often, visual learners will find that information "clicks" when it is explained with the aid of a chart or picture. In the RDBMS world a database schema is "visualized" through an Entity Relationship (ER) diagram. An ER diagram is the primary communication tool for a data model. MongoDB provides a powerful dynamic database schema. However, it is sometimes difficult to visualize. An accurate visualization of a schema dramatically increases the ability to communicate the flexibility and power of MongoDB among developers, architects, DBAs and end users. A mind map is a visual thinking tool that helps structure information, do better analysis, comprehend, synthesize, recall and generate new ideas. Its power lies in its simplicity, much like MongoDB. Using open source mind mapping tools, a clear and vibrant visualization of a dynamic MongoDB schema can be created that "clicks." |
Optimizing Your MongoDB Database on AWS (Miles Ward, Amazon Web Services). MongoDB is one of the fastest growing NoSQL workloads on AWS due to its simplicity and scalability, and recent product additions by the AWS team have only improved those traits. Join us for a deep-dive on MongoDB best practices, including installation, configuration, orchestration, performance, and durability optimization, as well as operational management using tools from AWS and 10gen. |
|
|
Session Transition | |||||
|
|
Closing Session and MongoDB Roadmap (Dwight Merriman, Chairman/Co-Founder, 10gen). Located in room B4 | |||||
|
|
Conference After Party: Gordon Biersch, 33 East San Fernando St., San Jose, CA | |||||
