Comparison of the Open Source OLAP Systems for Big Data: ClickHouse, Druid, and Pinot

Sources of Information

Similarities between the Systems

Coupled Data and Compute

Differences from Big Data SQL Systems: Indexes and Static Data Distribution

  • They have their own formats for storing data with indexes, tightly integrated with their query processing engines. SQL-on-Hadoop systems are generally agnostic to the data format and therefore less “intrusive” in Big Data backends.
  • They have data distributed relatively “statically” between the nodes, and distributed query execution takes advantage of this knowledge. On the flip side, ClickHouse, Druid and Pinot don’t support queries that require moving large amounts of data between the nodes, e.g. joins between two large tables.
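The “static” distribution point can be illustrated with a minimal sketch (all names and the routing scheme here are hypothetical, for illustration only): each node owns a fixed set of segments, so a query fans out to exactly the nodes that hold the relevant segments, and no data is shuffled between nodes at query time.

```python
# Minimal sketch of static data distribution in the spirit of
# ClickHouse/Druid/Pinot (hypothetical names, not any system's real API).

class StaticCluster:
    def __init__(self, nodes):
        # node name -> set of segment ids statically assigned to that node
        self.nodes = nodes

    def nodes_for(self, segments):
        """A query is routed only to the nodes holding the needed segments."""
        wanted = set(segments)
        return {name for name, held in self.nodes.items() if held & wanted}

cluster = StaticCluster({
    "node-1": {"seg-2018-01", "seg-2018-02"},
    "node-2": {"seg-2018-03"},
})

# A scan over January data touches only node-1; no inter-node data movement.
print(sorted(cluster.nodes_for(["seg-2018-01"])))  # ['node-1']
```

Because the segment-to-node mapping is known up front, each node can aggregate its own segments locally and only small partial results travel over the network, which is exactly why large shuffling joins don’t fit this model.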

No Point Updates and Deletes

Big Data Style Ingestion

Proven at Large Scale

Immature

On Performance Comparisons and Choice of the System

CloudFlare: ClickHouse vs. Druid

When choosing a Big Data OLAP system, don’t compare how optimal they currently are for your use case: all of them are very suboptimal at the moment. Instead, compare how fast your organization could make those systems move in the direction of being more optimal for your use case.

  • Either your organization should have engineers who are able to read, understand and modify the source code of the chosen system, and who have the capacity to do so. Note that ClickHouse is written in C++, while Druid and Pinot are written in Java.
  • Or, your organization should sign a contract with a company that provides support for the chosen system: there is Altinity for ClickHouse, and Imply and Hortonworks for Druid. There are no such companies for Pinot at the moment.
  • ClickHouse developers at Yandex have stated that they spend 50% of their time building features that they need internally at the company, and the other 50% on the features with the most “community votes”. However, to benefit from this, the features you need in ClickHouse should match the features that most other people in the community need.
  • Druid developers from Imply are motivated to build widely applicable features in order to maximize their future business.
  • Druid’s development process is very similar to the Apache model: for several years it has been developed by several companies with quite different priorities, without dominance by any one of them. ClickHouse and Pinot are currently far from that state; they are developed almost exclusively at Yandex and LinkedIn, respectively. Contributions to Druid have the least chance of being declined, or reverted later because they misalign with the goals of the primary developer, since Druid doesn’t have a “primary” developer company.
  • Druid commits to supporting a “developer API” that allows contributing custom column types, aggregation algorithms, “deep storage” options, etc., and maintaining them separately from the core Druid codebase. Druid developers document this API and track its compatibility with prior versions. However, the API is not yet mature, and it breaks in pretty much every Druid release. As far as I know, ClickHouse and Pinot don’t maintain similar APIs.
  • According to GitHub, Pinot has the most people working on it; it seems that at least 10 man-years were invested in Pinot in the last year, versus probably 6 for ClickHouse and about 7 for Druid. This means that, in theory, Pinot is being improved the fastest among the subject systems.

Differences between ClickHouse and Druid/Pinot

Data Management: Druid and Pinot

Data Management: ClickHouse

Data Management: Comparison

Data Distribution Tradeoff in ClickHouse

Tiering of Query Processing Nodes in Druid


Data Replication: Druid and Pinot

Data Replication: ClickHouse

Data Ingestion: Druid and Pinot

Data Ingestion: ClickHouse

Data Ingestion: Comparison

Query Execution

ClickHouse vs. Druid or Pinot: Conclusions

Differences between Druid and Pinot

Segment Management in Druid

  • Less data is stored in ZooKeeper. Only minimal information, namely the mapping from segment ids to the query processing nodes on which each segment is loaded, is kept in ZooKeeper. The remaining extended metadata, such as the size of the segment, the list of dimensions and metrics in its data, etc., is stored only in the SQL database.
  • When segments of data are evicted from the cluster because they become too old (a commonplace feature of time-series databases; ClickHouse, Druid and Pinot all have it), they are offloaded from the query processing nodes and their metadata is removed from ZooKeeper, but not from the “deep storage” or the SQL database. As long as they are not removed manually from those places, this allows really old data to be “revived” quickly, in case it is needed for some reporting or investigation.
  • It was unlikely the original intention, but there are now plans in Druid to make the dependency on ZooKeeper optional. Currently, ZooKeeper is used for three different things: segment management, service discovery, and as a property store, e.g. for real-time data ingestion management. Service discovery and property-store functionality could be provided by Consul. Segment management could be implemented with HTTP announcements and commands, which is partially enabled by the fact that the persistence function of ZooKeeper is “backed up” by the SQL database.
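The metadata split described above can be sketched as a toy data model (a hypothetical illustration, not actual Druid code): ZooKeeper holds only volatile routing entries, while the SQL database and deep storage keep everything needed to revive an evicted segment.

```python
# Hypothetical sketch of Druid's segment-metadata split (illustration only):
# ZooKeeper keeps only segment-id -> serving nodes; the SQL database keeps
# the extended metadata, which survives eviction from the cluster.

zookeeper = {"seg-2018-01": ["historical-1", "historical-2"]}  # minimal, volatile
sql_db = {
    "seg-2018-01": {"size_bytes": 512_000_000,
                    "dimensions": ["country", "browser"]},
}
deep_storage = {"seg-2018-01": "deep-storage://bucket/seg-2018-01"}

def evict(segment_id):
    """Eviction of old data: drop the routing entry from ZooKeeper only.
    Deep storage and the SQL database are untouched."""
    zookeeper.pop(segment_id, None)

def revive(segment_id, node):
    """Old data can be 'revived' because deep storage and SQL metadata remain."""
    assert segment_id in deep_storage and segment_id in sql_db
    zookeeper[segment_id] = [node]

evict("seg-2018-01")      # segment disappears from the cluster's routing state
revive("seg-2018-01", "historical-1")  # but can be brought back quickly
```

The key design point is that ZooKeeper is never the sole owner of any durable fact, which is what makes replacing it with HTTP announcements plausible.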

Segment Management in Pinot

“Predicate pushdown” in Pinot

“Pluggable” Druid and Opinionated Pinot

  • HDFS, or Cassandra, or Amazon S3, or Google Cloud Storage, or Azure Blob Storage, etc. as “deep storage”;
  • Kafka, or RabbitMQ, or Samza, or Flink, or Spark, or Storm, etc. (via Tranquility) as a real-time data ingestion source;
  • Druid itself, or Graphite, or Ambari, or StatsD, or Kafka as a sink for telemetry of Druid cluster (metrics).
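This kind of pluggability boils down to a small interface with swappable implementations. A minimal sketch follows; the interface and class names are hypothetical and do not reflect Druid’s actual extension API.

```python
from abc import ABC, abstractmethod

# Hypothetical sketch of a pluggable "deep storage" extension point in the
# spirit of Druid's pluggability; illustrative, not Druid's real API.

class DeepStorage(ABC):
    @abstractmethod
    def push(self, segment_id: str, data: bytes) -> str:
        """Persist a segment; return a locator for it."""

    @abstractmethod
    def pull(self, segment_id: str) -> bytes:
        """Load a segment back by id."""

class InMemoryStorage(DeepStorage):
    """Stand-in for HDFS / S3 / GCS / Azure Blob Storage implementations."""
    def __init__(self):
        self._blobs = {}

    def push(self, segment_id, data):
        self._blobs[segment_id] = data
        return f"mem://{segment_id}"

    def pull(self, segment_id):
        return self._blobs[segment_id]

storage: DeepStorage = InMemoryStorage()  # swapped via configuration in practice
storage.push("seg-1", b"column data")
print(storage.pull("seg-1"))  # b'column data'
```

The core system only ever talks to the abstract interface, so each backend (HDFS, S3, Cassandra, and so on) lives in its own module and can be maintained outside the core codebase.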

Data Format and Query Execution Engine are Optimized Better in Pinot

  • Compression of indexed columns with bit granularity, versus byte granularity in Druid.
  • The inverted index is optional for each column; in Druid it’s obligatory, which is sometimes unneeded and consumes a lot of space. The difference in space consumption between Druid and Pinot observed by Uber could probably be attributed to this.
  • Min and max values in numeric columns are recorded per segment.
  • Out-of-the-box support for data sorting. In Druid this can be achieved only manually, in a hackish way, as explained in the section “CloudFlare: ClickHouse vs. Druid” above. Sorted data compresses better, so this feature of Pinot is another probable reason for the difference in space consumption (and query performance!) between Druid and Pinot observed by Uber.
  • A more optimized format is used for multi-value columns than in Druid.
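The bit- versus byte-granularity point can be made concrete with a small sketch (a generic bit-packing size calculation, not Pinot’s or Druid’s actual on-disk formats): dictionary-encoded ids of a low-cardinality column need only a few bits each, while byte granularity rounds every id up to a whole number of bytes.

```python
import math

# Generic bit-packing size sketch (not Pinot's or Druid's real formats):
# sizes of a dictionary-encoded column packed at bit vs byte granularity.

def bit_packed_size(values, cardinality):
    """Bytes needed when each id takes ceil(log2(cardinality)) bits."""
    bits_per_value = max(1, math.ceil(math.log2(cardinality)))
    return math.ceil(len(values) * bits_per_value / 8)

def byte_packed_size(values, cardinality):
    """Bytes needed when each id is rounded up to whole bytes."""
    bits_per_value = max(1, math.ceil(math.log2(cardinality)))
    return len(values) * math.ceil(bits_per_value / 8)

column = [i % 5 for i in range(1000)]   # cardinality 5 -> 3 bits per value
print(bit_packed_size(column, 5))       # 375 bytes
print(byte_packed_size(column, 5))      # 1000 bytes
```

For a cardinality-5 column, bit granularity cuts storage by more than half, which hints at why such format details can add up to the space differences Uber observed.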

Druid Has a Smarter Segment Assignment (Balancing) Algorithm

Pinot is More Fault Tolerant on the Query Execution Path

Tiering of Query Processing Nodes in Druid

Summary

Roman Leventov

Roman Leventov

Writing about systems, technology, philosophy.

More from Medium

MLOps vs DevOps: Why is MLOps different? (MLOps-2)

MLOps at MoMo: Feature Store

MLOps with Databricks: (I) process flow design

Serving Spark NLP via API (3/3): Databricks and MLFlow Serve APIs