Dangerous Musings

🚤 Achieving 10x Faster Queries by Switching to Rockset

To drastically improve query performance on transaction aggregations, Neo Financial1 implemented an online analytical processing (OLAP) database solution: Rockset. Instead of creating aggregation pipelines to group transactions by spend category or month, we would store the data pre-aggregated, significantly improving query times.

One month later: The project was a huge development success, with an average query latency of less than 200ms! Our team presented Rockset to the company shortly thereafter, which resulted in other teams wanting to adopt Rockset for themselves.

Unlike that presentation, this post covers some of the challenges our four-person team encountered during the four-week project.

Rockset

Rockset stood out among Druid and ClickHouse for its ease of setup, configuration, and maintenance, thanks in large part to its fully managed, serverless platform.

Diagram of the Rockset interface

Rockset is the interface encompassing:

1. RocksDB, a Meta-developed key-value store for ingested data
2. S3 for cold, inactive data
3. Compute resources, including:
   - Queries
   - Autoscaling
   - Dashboard metrics
   - Monitoring
   - etc.

Rockset's secret sauce is its Converged Index. Instead of using the same indexing strategy for each query and dataset, the query optimizer selects the fastest indexing method (row, column, or search) to use. No manual index creation is required, and the query is super fast!

Migrations Don't Exist

Meme showing that you can't make Rockset migrations

You can't migrate ingested data2 since that would mess up responses for anyone querying that data.

Alternatively, Rockset uses aliases:

1. Create a transactions_v1 collection with the original ingested data
2. Create an alias, transactions
3. Point transactions to the transactions_v1 collection
4. Query transactions

Diagram of Rockset aliases

Now, when you need to re-ingest (migrate) all of your data, you can:

1. Create a collection, transactions_v2, with the new migrated data
2. Point transactions to transactions_v2 when your clients are ready
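
If you manage aliases in code rather than in the console, that cutover is a two-call script. Here's a minimal sketch with the official JS client; I'm assuming the SDK mirrors Rockset's REST alias endpoints, so the exact method names and request shapes may differ:

import rockset from '@rockset/client';

const rocksetClient = rockset(process.env.ROCKSET_APIKEY!);

// Create the alias once, pointed at the original collection (steps 1-3).
// Collections are referenced as 'workspace.collection' strings.
await rocksetClient.aliases.createAlias('workspace_name', {
  name: 'transactions',
  collections: ['workspace_name.transactions_v1'],
});

// Later, once transactions_v2 has finished ingesting the migrated data,
// repoint the alias. Queries against workspace_name.transactions cut over
// with no client-side changes.
await rocksetClient.aliases.updateAlias('workspace_name', 'transactions', {
  collections: ['workspace_name.transactions_v2'],
});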

The Rockset Client

We chose to build on top of the official Rockset JS Client/SDK by wrapping it in our own "Neo Rockset Client." Some improvements included:

1. Generic types on query results
2. A simpler interface
3. A retry strategy in case Rockset returned a 500 error

Here's what querying looked like before:

const results = await rocksetClient.queries
  .query({
    sql: {
      query: 'SELECT * FROM workspace_name.example_collection ORDER BY createdAt',
      parameters: [],
    },
  }) as QueryResult

Here's what it looks like with the modified client SDK:

const results = await neoRocksetClient.query<QueryResult>(
  {
    query: `SELECT * FROM workspace_name.example_collection ORDER BY createdAt`,
    parameters: [],
  },
  { retryCount: 1 }
);
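
For the curious, here's a stripped-down sketch of what such a wrapper can look like. This is not our production code; it's a minimal illustration of the generics and retry strategy layered on the official client, and it assumes the SDK surfaces the HTTP status on error.code:

import rockset from '@rockset/client';

const rocksetClient = rockset(process.env.ROCKSET_APIKEY!);

interface SqlQuery {
  query: string;
  parameters?: { name: string; type: string; value: string }[];
}

interface QueryOptions {
  retryCount?: number;
}

// Minimal sketch: generic result typing plus a naive retry on 5xx errors.
async function query<T>(sql: SqlQuery, options: QueryOptions = {}): Promise<T[]> {
  const attempts = (options.retryCount ?? 0) + 1;
  let lastError: unknown;

  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      const response = await rocksetClient.queries.query({ sql });
      // Rockset returns the rows under `results`; cast them to the caller's type.
      return (response.results ?? []) as T[];
    } catch (error: any) {
      lastError = error;
      // Retry only server-side (5xx) failures; rethrow everything else.
      // (Assumes the client exposes the HTTP status as error.code.)
      if (!error?.code || error.code < 500) throw error;
    }
  }
  throw lastError;
}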

Test Workspaces

Another major improvement we made over the official Rockset client was the test workspace methods. For our use case, it was important to understand whether LIMIT, ORDER BY, WHERE, >=, and other operators worked correctly against our mock data.

Test Workspace Setup/Teardown Flow Chart

To accomplish this, each integration test encompassed the full lifecycle of a Rockset query:

1. Spin up a test workspace
2. Create a collection and associated alias
3. Populate/ingest the documents
4. Run the queries under test
5. Delete the records, alias, collection, and workspace
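
In Jest, that lifecycle looked roughly like the sketch below. The test-workspace helpers (createTestWorkspace, createCollectionWithAlias, ingestDocuments, destroyTestWorkspace), the Transaction type, and mockTransactions are hypothetical names for illustration, not part of the official SDK:

// Illustrative sketch: the test-workspace helpers below are hypothetical
// wrapper methods, not official Rockset SDK calls.
describe('transaction queries', () => {
  let workspace: string;

  beforeAll(async () => {
    // Steps 1-3: a unique workspace per run so parallel CI jobs don't collide.
    workspace = await neoRocksetClient.createTestWorkspace();
    await neoRocksetClient.createCollectionWithAlias(workspace, 'transactions');
    await neoRocksetClient.ingestDocuments(workspace, 'transactions', mockTransactions);
  });

  afterAll(async () => {
    // Step 5: tear everything down, even if an assertion failed.
    await neoRocksetClient.destroyTestWorkspace(workspace);
  });

  it('orders transactions by createdAt', async () => {
    // Step 4: run the query under test against the seeded mock data.
    const results = await neoRocksetClient.query<Transaction>({
      query: `SELECT * FROM ${workspace}.transactions ORDER BY createdAt`,
      parameters: [],
    });
    expect(results).toHaveLength(mockTransactions.length);
  });
});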

Fortunately, the only tradeoff of using the test workspaces was that we needed to write integration tests in a Jest unit testing environment.


Challenges

The Converged Index and Comparisons

We experimented early on with our queries in prod to make sure all users were getting a good response time. After adding a filter with a >= comparison, our response latency increased from 200ms to 2000ms.

We tried several optimizations, such as using Common Table Expressions (CTEs), to no avail. After a long chat with the Rockset team, we found out that the Converged Index was using the wrong index type! We never learned why the query optimizer chose it, but the fix was to include a Rockset SQL hint to force a different index.
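
The hint attaches to the offending predicate itself. Here's a sketch of the shape; I'm recalling the access_path hint from Rockset's docs, so treat the exact hint name and value as an assumption:

SELECT
    *
FROM
    workspace_name.transactions
WHERE
    createdAt >= :startDate HINT(access_path=column_scan)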

Of course, this wasn't the only challenge we faced on our Rockset implementation journey. However, the Rockset team was always quick to support us; one engineer even released a new Terraform module feature the day after we needed it.

Transforming Datetimes w/ Timezones

During ingestion, the timezone offset associated with each transaction is stripped away, leaving only an inaccurate UTC datetime. One way to fix this is to apply the timezone on ingestion:

SELECT DATETIME(createdAt, 'America/Edmonton') as createdAt FROM transactions

We couldn't do that, because not every user in Canada lives in Edmonton. Instead, we had to dynamically look up each user's timezone and apply it to the datetime at query time:

EXTRACT(
    DATE
    FROM
        DATE_TRUNC(
            'MONTH',
            DATETIME(
                TRY_CAST(createdAt AS TIMESTAMP),
                'America/Edmonton'
            )
        )
) as createdAt

Yeah, it's completely different. You have to cast the datetime back to a timestamp, convert it to a datetime in the right timezone, and then extract just the date from that.
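
Because the timezone varies per user, we bind it as a query parameter rather than hardcoding 'America/Edmonton'. Here's a sketch using the wrapped client; getUserTimezone, the amount field, and the result type are illustrative, not real names from our codebase:

// Look up the user's IANA timezone (hypothetical helper), then bind it
// as a Rockset query parameter instead of baking it into the SQL.
const userTimezone = await getUserTimezone(userId); // e.g. 'America/Toronto'

const results = await neoRocksetClient.query<{ createdAt: string; total: number }>(
  {
    query: `
      SELECT
          EXTRACT(
              DATE
              FROM
                  DATE_TRUNC(
                      'MONTH',
                      DATETIME(TRY_CAST(createdAt AS TIMESTAMP), :timezone)
                  )
          ) as createdAt,
          SUM(amount) as total
      FROM workspace_name.transactions
      GROUP BY 1
    `,
    parameters: [{ name: 'timezone', type: 'string', value: userTimezone }],
  },
  { retryCount: 1 }
);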

It took hours of experimentation just to get Rockset-flavored SQL dates to work.

Conclusion

With any new technology there are challenges, but despite them, Rockset has been wonderful for our customers and our developers.

Acknowledgements: Thanks to Malcolm Fischer and Nick Poltz for proofreading this post.


  1. Disclaimer: This post does not accurately represent the technology used or implemented at Neo Financial. All opinions are my own and are not shared by Neo. 

  2. You can make a migration if you choose to use the Write API instead of ingesting from a third-party data source. 
