Rockset Is Acquired by OpenAI. What Does It Mean for Its Users?

StarRocks Engineering
5 min read · Jun 21, 2024


On June 21st, 2024, OpenAI announced its acquisition of Rockset, a real-time analytics database known for its data indexing and querying capabilities. The acquisition marks a significant change for Rockset users: they have limited time to migrate off the platform, and many are wondering about their next steps. This article guides Rockset users through the transition, covering why OpenAI made this move, what immediate actions are necessary, and which solutions could serve as alternatives for their real-time analytics needs.

Why Did OpenAI Acquire Rockset?

OpenAI aims to integrate Rockset’s technology to power its retrieval infrastructure across products. This is a fairly clear indicator of the importance of real-time data access and processing in the battle for AI supremacy. Additionally, by acquiring Rockset, OpenAI has absorbed an experienced team of real-time analytics experts who will continue to bolster OpenAI’s capabilities.

The First Thing Rockset Users Need to Do

The clock is ticking for current Rockset users. According to the detailed FAQ from Rockset, all month-to-month paid users without contracts must offboard by September 30th, 2024. While contracted customers will be able to coordinate with their Rockset account teams to develop a suitable offboarding plan, all customers will need to find an alternative to Rockset quickly. With the acquisition in place, it’s time for Rockset users to make their next move.

Rockset users should start by taking the following steps:

  1. Assess your current usage and requirements: Do this first so you know what you’re looking for before you begin evaluating solutions. It can save you a lot of time.
  2. Make a list of alternative platforms that offer similar or better capabilities: Your business’s needs can range from simple to complex depending on how you’ve been using Rockset to date, and every platform has its pros and cons. Knowing what a platform must do to keep your business running without disruption helps you avoid wasting valuable time on solutions that cannot deliver the performance and capabilities you care about.
  3. Begin planning the migration process to avoid any disruptions to your operations: Whether you go open source or commercial, it’s vital to evaluate the support or community available with your solution. A partner who will guide you through a successful POC, or an active Slack community that can help you troubleshoot at all hours of the day, can be just as important as the technology itself for ensuring your migration goes smoothly.
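
One way to make the first two steps concrete is a simple weighted scorecard. The sketch below is only an illustration: the requirement names, weights, candidate names, and scores are all hypothetical placeholders, not measurements of any real platform. Replace them with the requirements you identified in step 1 and the results of your own evaluation.

```python
from dataclasses import dataclass, field

# Hypothetical requirement weights (they sum to 1.0); substitute your own priorities.
WEIGHTS = {
    "query_latency": 0.3,
    "data_freshness": 0.3,
    "join_support": 0.2,
    "support_or_community": 0.2,
}


@dataclass
class Candidate:
    name: str
    # requirement -> score from 0 (unmet) to 5 (fully met), filled in from your own tests
    scores: dict = field(default_factory=dict)


def weighted_score(candidate, weights=WEIGHTS):
    """Weighted sum of scores; a requirement that was not scored counts as 0."""
    return sum(w * candidate.scores.get(req, 0) for req, w in weights.items())


def rank(candidates, weights=WEIGHTS):
    """Return candidates ordered from best to worst weighted score."""
    return sorted(candidates, key=lambda c: weighted_score(c, weights), reverse=True)


# Placeholder shortlist: the names and numbers are made up for illustration.
shortlist = [
    Candidate("candidate_b", {"query_latency": 5, "data_freshness": 3,
                              "join_support": 2, "support_or_community": 5}),
    Candidate("candidate_a", {"query_latency": 4, "data_freshness": 5,
                              "join_support": 3, "support_or_community": 4}),
]

ranked = rank(shortlist)
for c in ranked:
    print(f"{c.name}: {weighted_score(c):.2f}")
```

A scorecard like this does not replace a proof of concept, but it forces you to write down what matters before vendor demos start shaping the list.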

Alternatives for Rockset Users

As Rockset users plan their next steps, exploring every reasonable alternative is essential. Depending on your specific use cases and performance needs, different platforms may offer the capabilities you’re looking for. Here are some options to consider:

For open-source real-time analytics SQL workloads:

  • Apache Druid: Druid is a high-performance, real-time analytics database that delivers sub-second queries on streaming and batch data at scale and under load.
  • ClickHouse: ClickHouse is a fast open-source column-oriented database management system that allows generating analytical data reports in real-time using SQL queries.
  • StarRocks: Well suited to running scalable JOIN queries and delivering real-time analytics without denormalization pipelines. With out-of-the-box real-time data upsert support, StarRocks can deliver second-level data freshness on mutable data directly in its columnar storage.

For proprietary (commercial) managed solutions for real-time analytics SQL workloads:

  • Imply: Managed Apache Druid on the cloud with enterprise support.
  • CelerData: Cloud-managed StarRocks, supported by the initiators and maintainers of the StarRocks project.

For open-source vector search (VectorDB):

  • Weaviate: Weaviate is an open-source vector database that stores both objects and vectors, combining vector search and structured filtering with the fault tolerance and scalability of a cloud-native database.
  • Milvus: A cloud-native vector database and storage for next-generation AI applications.
  • Qdrant: A high-performance, massive-scale vector database for the next generation of AI.

For managed vector search (VectorDB):

  • SingleStore: Beyond its SQL capabilities, SingleStore also provides managed vector search functionalities, making it a comprehensive solution for both types of workloads.
  • Zilliz: The company behind Milvus, Zilliz offers a managed service for vector search, providing the benefits of Milvus with added support and maintenance.
  • Pinecone: A fully managed vector search platform that simplifies the deployment and scaling of vector search applications, ensuring high availability and performance.

The urgency to transition is real. You need to ensure your critical infrastructure remains intact and operational. Each of these platforms has unique strengths, and evaluating them based on your specific requirements will ensure a successful migration.

Why StarRocks is the Best Next Step for Rockset Refugees

Many Rockset users adopted it for their real-time analytics needs, so it’s important to call out one of the current leaders in the real-time space specifically: StarRocks. For Rockset users looking for a powerful and efficient alternative for real-time analytics, StarRocks presents a compelling case. Here’s why:

  1. Scalable JOIN Queries: StarRocks allows users to run scalable JOIN queries, and delivers real-time analytics without the need for denormalization pipelines, simplifying data processing and enhancing performance.
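  2. Real-Time Data Upserts: With out-of-the-box upsert support, StarRocks delivers second-level data freshness on mutable data directly in its columnar storage, so you don’t lose the data freshness you relied on with Rockset.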
  3. Superior Performance: Utilizing columnar storage, vectorization, and SIMD, StarRocks achieves better performance than Rockset, while only requiring a fraction of the storage, making it a cost-effective solution.
  4. Open Source Community: Being an Apache-licensed, Linux Foundation project, StarRocks has a massive and growing global community ready to help you troubleshoot anytime.
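
To illustrate what the first two points mean in practice, here is a minimal, database-free sketch of upsert semantics combined with a query-time join. It is plain Python, not StarRocks code, and it says nothing about how StarRocks implements either feature; the table layout, column names, and helper functions are hypothetical and exist only to show the behavior.

```python
def upsert(table, rows, key):
    """Insert each row, replacing any existing row with the same primary key."""
    for row in rows:
        table[row[key]] = dict(row)  # last write for a given key wins
    return table


def join_orders_customers(orders, customers):
    """Resolve customer fields at read time instead of storing a denormalized copy."""
    return [
        {**order, "customer_name": customers[order["customer_id"]]["name"]}
        for order in orders.values()
        if order["customer_id"] in customers
    ]


# Hypothetical tables keyed by primary key.
orders, customers = {}, {}

upsert(customers, [{"customer_id": 10, "name": "Acme"}], key="customer_id")
upsert(orders, [{"order_id": 1, "customer_id": 10, "status": "placed"}], key="order_id")

# Later changes arrive as upserts on the same keys.
upsert(orders, [{"order_id": 1, "customer_id": 10, "status": "shipped"}], key="order_id")
upsert(customers, [{"customer_id": 10, "name": "Acme Corp"}], key="customer_id")

result = join_orders_customers(orders, customers)
print(result)
```

Because the join runs at read time, the renamed customer shows up on the order without rewriting the order row. With a denormalized table, the same change would require a pipeline to update every affected order.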

Your Next Move

The acquisition of Rockset by OpenAI presents both challenges and opportunities for its users. While the transition may seem daunting, it’s also an opportunity to upgrade to a platform that offers superior performance and scalability. To learn more about StarRocks’ real-time analytics performance and to get support during your migration, join the community on Slack.

Don’t wait until the last minute — start planning your transition today to ensure a seamless move off Rockset.

Behind the Article

Sida Shen is a contributor to the StarRocks project and a product marketing manager at CelerData. As an engineer with a background in building machine learning and big data infrastructure, he oversees the company’s market research while working closely with engineers and developers across the analytics industry to tackle challenges related to data lakehouse analytics.
