Tencent Brings AI Capabilities to StarRocks
Tencent uses StarRocks extensively across many well-known applications, such as Tencent Games, WeChat, and Tencent Video, in a wide range of scenarios including ad-hoc analysis and BI dashboards. All of these use cases add up to a StarRocks deployment of over twenty thousand cores, which is a lot of cores given how efficiently StarRocks runs compared to other OLAP databases, especially for Tencent's PB-scale analytical queries.
Before we dig into features and implementations, let’s take a look at an actual business scenario at Tencent that utilizes vector similarity search:
Many folks have used text-to-image models before. These models take text as input and generate relevant images in return, and Tencent offers such services. Training a text-to-image model requires a large number of images. To support this, Tencent has built a content library that houses billions of images. Model trainers retrieve images from this library for training or fine-tuning purposes. The raw images are stored in the cloud, and each image is associated with metadata such as the image URL, image size, and image quality. Furthermore, each image carries two feature vectors: one capturing the image itself and the other capturing its text description.
There are two primary requirements to query the metadata and feature vectors:
- The first is BI Dashboards: These dashboards are needed to display trends in the number of images, statistics on image sizes, and other information to help manage images.
- The second is Similarity Search: This includes both image-based and text-based search. Users upload an image or a text description, and its feature vector is used to find the most similar items in the database. For example, when a user uploads a picture of a tiger, its visual feature vector is used to find similar pictures in the content library.
For scenarios related to vectors, Tencent’s first thought was to use a VectorDB to solve the problem. However, they faced some challenges using an existing VectorDB. Generally speaking, there are three main challenges they faced:
- Scale and compatibility
- Requirements for complex search
- Heterogeneous Workloads
Let’s examine each of these.
Scale and Compatibility
Regarding scale, Tencent processes billions of rows of data, each containing metadata and two feature vectors. This data exceeds one hundred TB. Due to this, they had concerns about whether existing vector databases could maintain stability and availability at such a massive scale.
Complex Search
Tencent's scenario also required complex searches over both vector data and scalar data. For example, they use top-k search to find the k images most similar to a specific query, range search to find images within a defined similarity range, and hybrid search to find similar objects that satisfy specific structural constraints. At the time, no existing VectorDB could cover all of these complex search requirements.
Heterogeneous Workloads
Tencent also had to account for handling two different workloads over a single dataset: an OLAP workload of relational queries for BI dashboards, and vector search. VectorDBs are efficient for vector search, but they fall short on analytical queries such as aggregations and joins. Although deploying multiple systems can solve this problem, it may cause data consistency issues and high maintenance costs.
To tackle this challenge, one path forward for Tencent was to implement VectorDB capabilities within a system that already ran in Tencent's production environment. At Tencent, StarRocks had been working well for years and had been tested across many different scenarios. It has rich tools for system management and maintenance, and Tencent's staff and developers were familiar with StarRocks and its tools. After consideration, Tencent decided to build a VectorDB inside StarRocks.
Adding Vector Index Support to StarRocks
In addition to working seamlessly with Tencent's current tools, StarRocks was specifically designed for big data applications and can easily handle scale in the hundreds of terabytes. Tencent was also able to accommodate its complex search requirements by developing a dedicated library and algorithms. StarRocks has a high-performance query engine, and equipped with Tencent's dedicated library and algorithms, it was able to efficiently process complex searches. Furthermore, with vector index support, StarRocks became capable of processing both analytical queries and vector searches.
Tencent's vector index implementation supports two well-known types of vector indexes: HNSW and IVF-PQ. Additionally, either L2 distance or cosine similarity can be used as the similarity measure.
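For reference, the two similarity measures can be written out in a few lines of plain Python (a real index evaluates these over high-dimensional vectors, typically with vectorized code):

```python
import math

def l2_distance(a, b):
    # Euclidean distance: smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    # Cosine of the angle between vectors, in [-1, 1];
    # larger means more similar, independent of vector magnitude.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(l2_distance((1.0, 0.0), (0.0, 1.0)))       # sqrt(2) for orthogonal unit vectors
print(cosine_similarity((1.0, 0.0), (0.0, 1.0)))  # 0.0 for orthogonal vectors
```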
Through this implementation, Tencent can perform top-k, range, and hybrid searches through an easy-to-use SQL interface. For example, a query can select the top ten most similar objects from a table by L2 distance, and the query optimizer will generate an efficient plan that scans the vector indexes and returns the results. Within a single table, data is logically divided into several partitions, and each partition is further sharded into multiple tablets. Tablets serve as the basic units for distributed storage and processing, and each tablet has multiple replicas for high availability.
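As a rough sketch of how that tablet layout can serve a top-k query (an assumed simplification, not StarRocks' actual execution plan), each tablet answers locally from its own index and a coordinator merges the candidates:

```python
import heapq

def tablet_topk(rows, query, k):
    # rows: list of (row_id, vector) pairs held by one tablet.
    # Each tablet returns its local k nearest candidates as
    # (squared_distance, row_id) pairs.
    dists = [(sum((x - y) ** 2 for x, y in zip(vec, query)), rid)
             for rid, vec in rows]
    return heapq.nsmallest(k, dists)

def global_topk(tablets, query, k):
    # The coordinator merges per-tablet candidates into the global top-k;
    # since each tablet already returned its k best, this is exact.
    merged = [pair for t in tablets for pair in tablet_topk(t, query, k)]
    return [rid for _, rid in heapq.nsmallest(k, merged)]

tablets = [
    [(1, (0.0, 0.0)), (2, (1.0, 1.0))],   # tablet A
    [(3, (0.1, 0.1)), (4, (5.0, 5.0))],   # tablet B
]
result = global_topk(tablets, query=(0.0, 0.0), k=2)
print(result)
```

Replicas fit the same picture: any replica of a tablet can serve the local top-k step, which is one reason sharding plus replication gives both scale and availability.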
Tencent’s key contribution lies in enhancing the storage layer with vector index support. The tablet data is stored in segment files, and Tencent’s vector indexes are maintained in separate index files.
Benefiting from the MVCC feature of StarRocks, Tencent's implementation naturally supports data updates. Whenever data is updated, new versions of both data files and indexes are generated. Each flush of data files triggers the creation and flushing of vector indexes. Both data files and index files remain immutable, and the system periodically compacts these versions to optimize storage and performance.
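The versioning scheme described above can be sketched as follows; the class, file names, and compaction trigger are invented for illustration and are only an assumed simplification of StarRocks' versioning:

```python
# MVCC sketch: every flush produces a new immutable (data, index) file
# pair; compaction later folds all versions into a single new pair.
class TabletVersions:
    def __init__(self):
        self.versions = []  # list of (segment_file, vector_index_file)

    def flush(self, n):
        # Flushing a data segment also creates the matching vector
        # index file, so the two always stay in sync.
        self.versions.append((f"seg_v{n}.dat", f"idx_v{n}.vec"))

    def compact(self):
        # Replace all versions with one compacted pair; in a real
        # system the old immutable files are removed only once no
        # reader still references them.
        n = len(self.versions)
        self.versions = [(f"seg_compacted_{n}.dat", f"idx_compacted_{n}.vec")]

t = TabletVersions()
t.flush(1); t.flush(2); t.flush(3)
t.compact()
print(t.versions)
```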
Most importantly, Tencent’s implementation is fully compatible with all existing data importing methods in StarRocks, including Stream Load, Routine Load, Broker Load, and others. This ensures seamless integration with existing data ingestion tools.
A Dedicated Library and Algorithms
Along with the implementations above, Tencent also developed TNN, a high-level vector index library. All vector index operations in StarRocks are now proxied by Tencent's TNN library. TNN serves as a general index library and was created to integrate and enhance existing index libraries. Currently, it functions like a supercharged version of the well-known vector index library Faiss, with some cool extra features. For instance, it supports an index cache, range search, and an advanced block cache. TNN also has advanced range search algorithms for the well-known graph-based index HNSW and the quantization-based index IVF-PQ; range search over those two indexes is not supported in Faiss.
Another novel feature in TNN is the Block Cache. Where Faiss reads indexes file by file, TNN reads and caches indexes block by block. Compared with Faiss, TNN can achieve equivalent performance and accuracy while using half the memory thanks to the Block Cache.
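A minimal sketch of the block-cache idea, assuming an LRU eviction policy (the `BlockCache` class and its parameters are hypothetical, not TNN's actual API): instead of loading a whole index file into memory, only the fixed-size blocks a search actually touches are read, and memory use is bounded by the cache capacity.

```python
from collections import OrderedDict

class BlockCache:
    def __init__(self, capacity, block_size, read_block):
        self.capacity = capacity        # max number of cached blocks
        self.block_size = block_size    # bytes per block
        self.read_block = read_block    # function: block_no -> bytes
        self.cache = OrderedDict()      # block_no -> bytes, in LRU order

    def get(self, offset):
        block_no = offset // self.block_size
        if block_no in self.cache:
            self.cache.move_to_end(block_no)    # mark most recently used
        else:
            self.cache[block_no] = self.read_block(block_no)
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)  # evict least recently used
        return self.cache[block_no]

reads = []
def fake_read(block_no):
    # Stand-in for reading one index block from disk.
    reads.append(block_no)
    return b"x" * 4096

cache = BlockCache(capacity=2, block_size=4096, read_block=fake_read)
cache.get(0); cache.get(100); cache.get(5000); cache.get(0)
print(reads)  # only two distinct blocks were ever read from "disk"
```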
Processing and Workload Isolation
With vector index support, StarRocks is now capable of handling both analytical and vector search workloads. What’s more, Tencent is actively developing new features, such as a Serving Processing Engine and Workload Isolation, to further enhance support for heterogeneous workloads. These developments are currently in progress, but more details will come in the future.
After conducting some benchmarks on the system, Tencent found that, compared to dedicated VectorDBs, StarRocks displayed a medium level of performance. This is mainly caused by the fixed overhead of the system, such as the complex query scheduler and query optimizer. Tencent is continuing to reduce this overhead and further optimize the performance of vector search. Additional results from Tencent's production environment show that at the scale of millions of rows, queries complete in tens of milliseconds; at the scale of billions of rows, Tencent reports query times in the tens of seconds.
Tencent’s Solution in Action
RAG, or Retrieval-Augmented Generation, is a promising technology that combines the power of large language models with external knowledge sources to enhance the quality and reliability of generated content. It’s like having a smart assistant that can look up the latest information on the spot, making sure you get the most accurate and up-to-date answers.
Imagine you're asking a question: instead of relying only on what the model learned during training, RAG can go out and find relevant information from a vast database. It takes the question, searches the vector database to retrieve relevant documents, and then uses the retrieved content to generate a more detailed and accurate response. It's here that Tencent is using StarRocks as a vector database for RAG with their chatbots.
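The retrieval step of that pipeline can be sketched in a few lines. The documents, vectors, and prompt template here are toy stand-ins: a real system would call an embedding model to vectorize the question and an LLM to answer from the assembled prompt.

```python
import math

# Toy "vector store": (document text, embedding) pairs with
# hand-assigned 2-D vectors instead of real model embeddings.
docs = [
    ("StarRocks supports HNSW and IVF-PQ vector indexes.", (0.9, 0.1)),
    ("Tigers are large cats native to Asia.", (0.1, 0.9)),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, k=1):
    # Most similar documents first (vector search step).
    return [t for t, _ in sorted(docs, key=lambda d: -cosine(d[1], query_vec))[:k]]

def build_prompt(question, query_vec):
    # Augmentation step: splice retrieved context into the LLM prompt.
    context = "\n".join(retrieve(query_vec))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

prompt = build_prompt("Which vector indexes does StarRocks support?", (0.95, 0.05))
print(prompt)
```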
Tencent has a massive content library containing billions of images. Each image is associated with metadata like image size and image quality. Additionally, there are two feature vectors that represent textual and visual features, respectively. Two different workloads run on the same data: OLAP queries for BI dashboards and vector search requests from users. Previously, two separate systems served these BI queries and vector searches. StarRocks has now successfully unified them into a single system, not only doubling the speed of vector search but also cutting total costs in half.
Try This for Yourself
In this article, we outlined the challenges Tencent faced in vector similarity search, such as scale and compatibility issues, complex search requirements, and heterogeneous workloads, and how StarRocks helped address these challenges when used as a vector database. If you're looking to:
- Process data scaling into the billions, with rich tool support
- Perform complex searches with SQL
- Support analytical and vector search workloads within a single system
then it's worth taking a serious look at StarRocks. The best way to get started is by joining the StarRocks Slack community, learning about the project, and connecting with its contributors.
Join Us on Slack
If you’re interested in the StarRocks project, have questions, or simply seek to discover solutions or best practices, join our StarRocks community on Slack. It’s a great place to connect with project experts and peers from your industry.