Historically, the advertising industry has distributed centralized brand messages to consumers en masse. Founded in 2008, Struq is transforming online advertising from a generic push communication to a personalized communication, resulting in users seeing display, video and mobile ads that are meaningful and relevant.
Forgoing the cost and resource drain of a traditional relational database, Struq became an early adopter of MongoDB in 2010, using the NoSQL database to create a scalable ad personalization platform that dynamically creates and customizes ads in real time. Thanks in part to the performance, flexibility and ease of use of MongoDB, Struq is able to provide brands like Adidas, Hilton Hotels and Levi’s with click-through rates 12 times higher than standard retargeted ads, and up to $30 in revenue for every $1 of marketing spend.
In the world of online advertising, milliseconds matter; real-time bidding (RTB) auctions impose a 50ms deadline. To make informed decisions based on user behavior and place a bid within that window, Struq needs to access data and run complex algorithms in real time.
According to Struq’s Head of Data Engineering, Pierre Dane, running this kind of operation on a traditional database would require a cluster of at least 8 SQL Servers, SQL replication, and a large number of licenses – a costly proposition that would also stifle their ability to scale.
In addition to scale and latency requirements, Struq must support diverse and evolving schemas for their data storage. Each time they improve their targeting algorithms, they need to add new fields to their data store. Additionally, since they are storing customer-specific data – such as product views on the advertiser’s website – they need a database that can flexibly accommodate customer-specific fields. This data is stored against an anonymous cookie ID and cannot be linked back to the user.
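As a rough illustration of what customer-specific fields stored against an anonymous cookie ID can look like, here is a minimal Python sketch. It uses plain dictionaries rather than a database driver, and every field name and value in it (`cookie_id`, `advertiser`, `shoe_size`, `nights` and so on) is a hypothetical example, not Struq's actual schema.

```python
# Illustrative only: field names and values are hypothetical,
# not Struq's actual schema.

def make_event(cookie_id, advertiser, **custom_fields):
    """Build one document keyed by an anonymous cookie ID."""
    doc = {"cookie_id": cookie_id, "advertiser": advertiser}
    doc.update(custom_fields)  # advertiser-specific fields, no fixed columns
    return doc

# Two advertisers record different attributes for the same visitor.
events = [
    make_event("c-1001", "shoe-retailer", product_id="sku-42", shoe_size=9),
    make_event("c-1001", "hotel-chain", city="London", nights=2),
]

# Documents with different shapes sit side by side and are found by cookie ID.
for_cookie = [e for e in events if e["cookie_id"] == "c-1001"]
```

In a document store, both records could live in the same collection even though they share only the two common keys.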
The backbone of Struq's proprietary technology is its understanding of which users are valuable to an advertiser – those consumers who show behavioral patterns that make them statistically more likely to purchase. MongoDB enables this by allowing Struq to store terabytes of data – billions of objects such as recorded page views, product information and other anonymous user details that are populated on the fly – and query it in real time. The result is the ability to make tens of thousands of ad serving decisions per second.
EASE OF USE
Soon after Struq determined that MongoDB was a good fit for their ad platform, they were able to quickly release the application, putting it into production in days. Compared to administering SQL Server, which is full-time work, MongoDB “just works,” according to Dane. There have been periods of months when he hasn’t had to look at the MongoDB cluster. “It’s phenomenal how MongoDB handles this volume of data and this number of queries every day,” he said.
LOW COST PER QUERY
By combining open source software with commodity hardware and efficient use of staff time (no excess DBA or admin time!), MongoDB has proven to be an extraordinarily cost-effective way for Struq to support a large number of queries.
Struq uses MongoDB Monitoring Service (MMS) for retrospective analysis. The graphical view makes it easy to see all the different metrics, such as MongoDB stats on replica sets, at any given moment, especially if there’s a “hiccup.” Of particular use to Struq are MMS features such as replica set self-discovery, page fault stats and locking stats.
Struq also uses MongoDB for its own internal monitoring, some of its message queuing requirements, email queues and a host of other purposes.
In 2010, Struq had 2 master-slave MongoDB instances and 20GB of data. Two years later, they run 7 replica sets, each with at least three nodes and a terabyte of data.
- OS: Windows
- Deployment platform: SoftLayer
- Server hardware configuration: 16-64GB RAM, 200GB-1TB SSD per server; RAID0 – between 1 and 5 drives per RAID array, depending on size of replica set
- Replica Sets: 7 replica sets; each running on a dedicated server
- Sharding: not yet (Struq splits its user data over replica sets to keep document sizes small – this reduces overhead of moving documents when they grow)
- Application Language: C#, Java, Python
- Other database technologies: Hive on Hadoop for analytics, Microsoft SQL Server and PostgreSQL for configuration, Redis for non-persistent in-memory caching
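The list above notes that user data is split over replica sets to keep documents small, but it does not say how the split is made. One plausible reading is that each kind of per-user data goes to its own replica set. The Python sketch below illustrates that reading only; the replica set names, the data kinds and the `split_user_data` helper are all hypothetical.

```python
# Hypothetical mapping from kind of user data to replica set name.
# The source does not describe Struq's actual layout.
REPLICA_SET_FOR = {
    "page_views": "rs-pageviews",
    "product_views": "rs-products",
    "ad_impressions": "rs-impressions",
}

def split_user_data(cookie_id, user_data):
    """Split one user's data into small per-kind documents.

    Returns a list of (replica_set_name, document) pairs, one per kind
    of data, instead of a single large document per user.
    """
    writes = []
    for kind, items in user_data.items():
        target = REPLICA_SET_FOR[kind]  # raises KeyError for an unmapped kind
        writes.append((target, {"cookie_id": cookie_id, kind: items}))
    return writes

writes = split_user_data(
    "c-1001",
    {"page_views": ["/home", "/shoes"], "product_views": ["sku-42"]},
)
```

Under this assumption, a growing list of page views enlarges only the small page-view document, which is the kind of overhead reduction the list describes.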
MongoDB services billions of requests a day with an average latency of under a millisecond. “Flexibility is the opportune word,” said Dane, in explaining how MongoDB helps Struq to provide clients with optimum return on their marketing dollars.
Since real-time decision-making is at the core of Struq’s success, they often need to add new fields – e.g. product attributes, or interesting metrics like revenue per click – which they can do simply by amending the object model in the code base. “It’s very different than having to add an extra column to a SQL Server database with a hundred million rows,” said Dane. He doesn’t have any reservations about adding a replica set (which he can set up in an hour), and if it turns out they don’t need the data, he can very quickly wind it down. MongoDB also makes it very simple to add a couple of nodes when Struq needs more processing power or I/O availability.
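To make the idea of amending the object model concrete, here is a minimal sketch in Python, one of the application languages listed above. The `ProductView` class, its fields and the `from_document` helper are hypothetical illustrations, not Struq's code. The sketch shows a new field with a default value being added to a class while documents stored before the change still load.

```python
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class ProductView:
    cookie_id: str
    product_id: str
    # Hypothetical field added in a later release; older documents lack it.
    revenue_per_click: Optional[float] = None

def from_document(doc):
    """Build a ProductView from a stored document, ignoring unknown keys."""
    known = {f.name for f in fields(ProductView)}
    return ProductView(**{k: v for k, v in doc.items() if k in known})

# A document written before the new field existed, and one written after.
old_doc = {"cookie_id": "c-1001", "product_id": "sku-42"}
new_doc = {"cookie_id": "c-1002", "product_id": "sku-7", "revenue_per_click": 0.35}

old_view = from_document(old_doc)  # revenue_per_click falls back to None
new_view = from_document(new_doc)
```

No existing document has to be rewritten for the new field to be usable, which is the contrast Dane draws with adding a column to a large SQL table.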
Struq anticipates that their data will continue to grow exponentially, and in the near future, they’ll add MongoDB sharding.
Industry: Online Advertising
Location: London, UK
- Ad personalization platform enabled by MongoDB services billions of ad requests a day with an average latency of under a millisecond
- Continuously improves targeting algorithms by adding new metrics and details without schema migrations
- Optimizes customers’ marketing spend with high-performance, scalable data store