Oct 26, 2020
Community Spotlight: LiquidM powers real-time highly-targeted adtech with Apache Druid
LiquidM provides modular cloud-based software that allows agencies and trading desks to run their adtech activities and campaigns on a customizable, standardized, open platform. LiquidM provides real-time efficiency, control, and insights into media planning and buying.
I recently enjoyed a lively and far-reaching discussion with Hagen Rother, LiquidM’s Lead Architect, about the systems that they’ve built on Apache Druid. The first thing I learned is that LiquidM runs Druid directly on bare metal. “We only do bare metal. We scripted what we could out of the APIs of all the ISPs. We’ve been with seven different ISPs. We can order 20 boxes in a drop down menu, it may take an hour or two for them to be provisioned, and then in another hour we have those 20 machines in production,” says Hagen.
So many bids, so little time
Druid is a vital part of LiquidM’s stack. Because LiquidM is a demand-side platform (DSP), performance and scale are far and away the most important drivers of its business. The system gives buyers of digital ads a way to manage their campaigns across multiple ad exchanges and data exchanges through a single interface. The online advertising industry revolves around real-time bidding, and by using a DSP like LiquidM, marketers can manage their ad placements and the associated data used to target an audience.
All of this happens over HTTP across the global internet in a fraction of a second. LiquidM’s infrastructure is capable of responding in less than 100 ms to millions of requests per second. To make this more complicated, the results of the auctions are published as separate event streams by each exchange. LiquidM must then merge these event streams at high volume in real time, and does so before ingesting them into Druid, all within seconds. Additional apps – live reporting, billing, monitoring – are built on Druid.
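As a rough illustration of the merge step, the sketch below interleaves several per-exchange streams, each already sorted by timestamp, into a single time-ordered stream. The event fields, the exchange names, and the choice of Python's `heapq.merge` are assumptions made for illustration; the article does not describe how LiquidM actually implements the merge.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    # All field names are hypothetical, chosen only for this sketch.
    timestamp_ms: int  # event time in milliseconds
    exchange: str      # which exchange published the event
    auction_id: str    # identifier of the auction
    kind: str          # e.g. "win" or "click"


def merge_streams(*streams: Iterable[Event]) -> Iterator[Event]:
    """Lazily merge streams that are each already sorted by timestamp_ms."""
    return heapq.merge(*streams, key=lambda e: e.timestamp_ms)


# Two made-up exchange streams, each in timestamp order.
exchange_a = [Event(100, "a", "x1", "win"), Event(300, "a", "x3", "win")]
exchange_b = [Event(200, "b", "x2", "win"), Event(400, "b", "x4", "click")]

merged = list(merge_streams(exchange_a, exchange_b))
```

Because `heapq.merge` is lazy, it holds only one pending event per input stream in memory, which is the property that makes this style of merge plausible for unbounded streams.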
Marketers combine targeting with price as they make ad spend decisions, and LiquidM has built it all on top of Druid. Marketers can filter the reach, or potential audience, to find exactly the right people for their ads, and cost scales with reach. Hagen explains, “We filter on 50+ dimensions, so we can do something like ‘all male iPhone users over 40 that are less than one mile from a Starbucks’. Highly targeted ads require LiquidM to process millions of events per second and make it manageable for media buyers with Druid. It’s the perfect use case for a service.”
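To make the kind of filter Hagen describes concrete, here is a minimal sketch that builds a Druid native `timeseries` query as a Python dictionary. The filter types (`and`, `selector`, `bound`) and the `longSum` aggregator are standard Druid native query elements, but the datasource name, the dimension names, the metric name, and the idea of a precomputed distance dimension are all assumptions; LiquidM's actual schema is not described in the article.

```python
import json


def build_reach_query(datasource: str, interval: str) -> dict:
    """Build a Druid native timeseries query for a narrowly targeted audience.

    Every dimension and metric name below is hypothetical.
    """
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "intervals": [interval],
        "granularity": "all",
        "filter": {
            "type": "and",
            "fields": [
                {"type": "selector", "dimension": "gender", "value": "male"},
                {"type": "selector", "dimension": "device_model", "value": "iPhone"},
                # age > 40, compared numerically
                {"type": "bound", "dimension": "age", "lower": "40",
                 "lowerStrict": True, "ordering": "numeric"},
                # distance < 1 mile, assuming a precomputed distance dimension
                {"type": "bound", "dimension": "miles_to_starbucks", "upper": "1",
                 "upperStrict": True, "ordering": "numeric"},
            ],
        },
        "aggregations": [
            {"type": "longSum", "name": "events", "fieldName": "count"}
        ],
    }


query = build_reach_query("bid_events", "2020-10-01/2020-10-02")
payload = json.dumps(query, indent=2)
```

The resulting JSON in `payload` is the kind of body that would be sent to a Druid broker's native query endpoint; each additional targeting criterion becomes one more entry in the `fields` list.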
Getting started with Druid
Flashback to the year 2012. It’s Hagen’s first day on the job, and his manager asks him, “Hey, we asked Metamarkets if we can license their engine and, before they answered, they actually open-sourced it as Druid. Can you evaluate and set that up, please?” So Hagen did, comparing Druid to anything that might meet LiquidM’s requirements, including in-memory technologies like SAP HANA. He ruled out solutions that required vast amounts of expensive RAM and opted for Druid, which could use less expensive NVMe storage. Hagen recalls, “Druid made the integration easy. A few patches to our system and I had introduced Kafka 0.7 and the initial Druid release was up and running. We built our platform on top, and just when it was feature complete, the old reporting component died because we couldn’t index as fast as the exchanges could stream. So we coded ruby-druid to integrate it into our reporting platform. That was the crazy ride of the startup days, and while lots of things have changed since then, Druid is still here.”
Druid today at LiquidM
LiquidM runs Druid to be resilient to failures and to provide the highest level of performance available. They’re currently running two tiers: one consists of 10 dual AMD 7402 machines (96 vcores each), the other of 9 AMD 7502P machines (64 vcores each). Each machine contains about 1 TB of NVMe storage, which Hagen refers to as “the sweet spot”. The tier running on 96 vcores also runs druid-indexer, while the tier running on 64 vcores also runs middle-manager. Each Druid tier is built with one replica in order to balance reliability with ingestion, compaction, and query performance.
They use compaction tasks to gradually reduce granularity and dimensions over time. Most users are focused on what happens in real time and on the first day, so it makes sense to eliminate needless data as time passes. On day one, LiquidM ingests one million requests per second, and via a series of compaction tasks brings the deep storage required down to about three gigabytes for the year.
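The effect of reducing granularity and dimensions can be illustrated with a small, purely conceptual sketch: timestamps are truncated to a coarser bucket, a dimension is dropped, and rows that become identical are summed into one. This is not Druid's implementation or LiquidM's compaction configuration; the row layout and the `roll_up` helper are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical row layout: (timestamp_s, country, device, count)
Row = Tuple[int, str, str, int]


def roll_up(rows: List[Row], bucket_s: int, keep_device: bool) -> List[Row]:
    """Truncate timestamps to bucket_s, optionally drop the device
    dimension, and sum the counts of rows that become identical."""
    totals: Counter = Counter()
    for ts, country, device, count in rows:
        bucket = ts - ts % bucket_s
        key = (bucket, country, device if keep_device else "")
        totals[key] += count
    return [(b, c, d, n) for (b, c, d), n in sorted(totals.items())]


rows = [
    (0, "DE", "ios", 1),
    (30, "DE", "android", 1),
    (3599, "DE", "ios", 1),
    (3600, "US", "ios", 1),
]

hourly = roll_up(rows, 3600, keep_device=True)    # 4 rows -> 3 rows
coarse = roll_up(rows, 86400, keep_device=False)  # 4 rows -> 2 rows
```

Totals are preserved at every step while the row count shrinks. At one million requests per second, a single day is 86.4 billion raw events, so each reduction in granularity or dimensionality compounds into large storage savings.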
Hagen is an active participant on the #druid Apache Slack channel, and enjoys putting his years of Druid experience to use helping others.