Semantic Search Engine with ChromaDB

In the realm of artificial intelligence (AI), data representation plays a pivotal role. Gone are the days when we relied solely on keywords and simple numerical data. Today, the concept of embeddings is revolutionizing the way AI systems understand and process information. Let’s explore the world of embeddings and how they lead us to the powerful concept of vector databases.
An embedding is a numerical representation of a word, phrase, image, or even a whole piece of code. These numerical representations, or vectors, carry the essence of the original data point within them. Take the word “love”: its vector might capture elements of relationships, emotion, and positive sentiment. This vector sits within a vast, multi-dimensional space.
The magic lies in this space:

- Proximity Means Similarity: Words with similar meanings or associations, such as “affection,” “adoration,” and “devotion,” will have embedding vectors that reside close to the vector for “love”.
- Context Matters: Depending on the training data, embeddings can pick up on different shades of love (romantic, familial, platonic), and words and concepts associated with that type of love will cluster near it.
- Beyond Synonyms: Embeddings aren’t just about direct synonyms. Words that evoke the feelings or actions tied to love, like “cherish” or “protect,” might also be situated nearby.

Embeddings give AI systems the ability to calculate similarity. With vectors in hand, computers can determine which concepts are more closely related by measuring the distance between their corresponding vectors. This opens up incredible possibilities across many AI applications.
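
To make this concrete, here is a small sketch that embeds a few words and scores them against “love” by cosine similarity. It assumes the sentence-transformers library and the all-MiniLM-L6-v2 model, which are illustrative choices rather than anything this article prescribes; any embedding model would show the same pattern.

```python
# A minimal sketch: embed a few words and compare them by cosine similarity.
# Assumes the sentence-transformers package and the "all-MiniLM-L6-v2" model;
# any embedding model would illustrate the same idea.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

words = ["love", "affection", "devotion", "spreadsheet"]
vectors = model.encode(words)  # one fixed-length vector per word

# Cosine similarity of every word against "love"
scores = util.cos_sim(vectors[0], vectors)[0]
for word, score in zip(words, scores):
    print(f"love vs {word}: {float(score):.3f}")
# Related words ("affection", "devotion") should score noticeably
# higher than an unrelated one ("spreadsheet").
```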

Embeddings in Action

Here’s how embeddings are transforming the world of AI:

- Search Engines: Ever notice how search engines understand your intent beyond specific keywords? Embeddings power this change, helping retrieve results that are thematically related to your query even if they don’t contain the exact same words.
- Recommendation Systems: Recommendation systems in streaming platforms and online stores analyze your preferences with the help of embeddings. Products, movies, and music with similar embedding vectors are likely to be suggested to you.
- Image Recognition: Notice how your phone can now group your photos by the people in them? Embedding-based similarity measurements let AI models recognize similarities between faces, objects, and scenes within images.
- Chatbots: Modern chatbots understand the nuances of human language due in part to embeddings. This allows them to go beyond rigid word matching and understand the underlying meaning of what you’re saying.
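
As a tiny illustration of keyword-free retrieval, the sketch below matches a query against a three-document “corpus” even though the query shares no words with the best match. The model and the documents are placeholder assumptions for the example only.

```python
# Sketch of keyword-free retrieval: the query shares no words with the best
# match, but their embeddings are close. The model choice and the tiny
# "corpus" are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "Cheap places to grab dinner downtown",
    "Quarterly earnings report for 2023",
    "How to repot a succulent",
]
query = "affordable restaurants near me"

corpus_vecs = model.encode(corpus)
query_vec = model.encode(query)

scores = util.cos_sim(query_vec, corpus_vecs)[0]
best = int(scores.argmax())
print(f"Best match: {corpus[best]!r} (score {float(scores[best]):.3f})")
```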

Vector Databases

As AI applications rely more heavily on embeddings, traditional databases start to fall short. Here’s where vector databases swoop in to save the day! Vector databases are purpose-built to:

- Store Embeddings: They are designed to efficiently store and manage massive collections of embedding vectors.
- Search by Similarity: The magic of vector databases lies in their ability to perform blazing-fast similarity searches. Think of finding the most similar images in a collection of millions, or identifying relevant documents based on a conceptual query.
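
Here is a minimal sketch of that workflow with ChromaDB: add a few documents to a collection and query it by meaning. The collection name and documents are placeholders; by default Chroma embeds the text with its built-in embedding function, and you could pass precomputed vectors instead.

```python
# Minimal ChromaDB sketch: store a few documents and query them by similarity.
# Collection name and documents are placeholders.
import chromadb

client = chromadb.Client()  # in-memory client; PersistentClient keeps data on disk
collection = client.create_collection(name="articles")

collection.add(
    ids=["1", "2", "3"],
    documents=[
        "Embeddings map words and images to vectors",
        "How to bake sourdough bread at home",
        "Vector databases index high-dimensional data",
    ],
)

# Find the two documents most similar to the query text
results = collection.query(query_texts=["semantic search with vectors"], n_results=2)
print(results["documents"][0])
```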

How Vector Databases Work

Let’s dive deeper into the mechanics of how vector databases work, focusing on indexing, querying, and the technologies that make them efficient and scalable.

Indexing in Vector Databases

The primary challenge that vector databases address is the efficient indexing of high-dimensional data. Traditional indexing methods, effective for structured data like integers and strings, falter with the high dimensionality and unstructured nature of vector data. To overcome this, vector databases employ specialized indexing strategies:

- Tree-based Indexing: Techniques such as KD-trees and Ball trees partition the vector space into regions, organizing vectors in a way that reflects their spatial distribution. This structure allows the database to quickly eliminate large portions of the dataset that are unlikely to contain the query’s nearest neighbors.
- Hashing-based Indexing: Locality-sensitive hashing (LSH) hashes vectors so that similar items are more likely to land in the same “buckets,” which limits the search to a handful of relevant buckets instead of the whole collection.
- Quantization: Methods like product quantization approximate each vector with centroids from a small codebook (in product quantization, one codebook per sub-vector), significantly reducing storage requirements and speeding up distance computations.
- Graph-based Indexing: Some vector databases use navigable small world (NSW) graphs or hierarchical navigable small world (HNSW) graphs, in which each vector is a node linked to a set of its near neighbors, so queries can be answered by efficiently traversing the graph.
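
To give a feel for the hashing idea, here is a toy random-hyperplane LSH (SimHash-style) sketch in NumPy. It is a standalone illustration of the general technique, not the scheme used by any particular database.

```python
# Toy locality-sensitive hashing with random hyperplanes (SimHash-style).
# Vectors pointing in similar directions tend to fall on the same side of
# each hyperplane, so they are more likely to share a bucket.
import numpy as np

rng = np.random.default_rng(0)
dim, n_planes = 64, 8

# Each hyperplane is a random direction; each hash bit records the side of the plane.
planes = rng.normal(size=(n_planes, dim))

def bucket(vector: np.ndarray) -> str:
    bits = (planes @ vector) > 0
    return "".join("1" if b else "0" for b in bits)

base = rng.normal(size=dim)
similar = base + 0.05 * rng.normal(size=dim)  # small perturbation of base
unrelated = rng.normal(size=dim)              # independent vector

print(bucket(base))       # e.g. "10110010"
print(bucket(similar))    # usually identical, or differs in a bit or two
print(bucket(unrelated))  # typically differs in many bits
```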

Querying Process

The querying process in vector databases is designed to efficiently find the “nearest neighbors” to a given query vector, which are the vectors in the database that are most similar to the query. This process involves:

- Distance Metrics: The similarity between vectors is typically measured with similarity or distance functions such as cosine similarity or Euclidean distance. The choice of measure depends on the specific application and the nature of the data.
- Search Algorithms: Depending on the indexing method, different algorithms are used to traverse the index and identify the nearest neighbors. For tree-based methods, this might involve traversing down the tree, while graph-based methods involve navigating the graph’s nodes.
- Approximate Nearest Neighbor (ANN) Searches: Given the computational expense of exact nearest neighbor searches in high-dimensional spaces, vector databases often resort to ANN searches. These searches sacrifice a degree of accuracy for significant improvements in speed and resource efficiency, providing “good enough” results much faster.
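
Below is a short sketch of the two measures mentioned above, together with an exact brute-force nearest-neighbor search, which is the baseline that ANN indexes trade a little accuracy to approximate. The data is random and purely illustrative.

```python
# Cosine similarity vs. Euclidean distance, plus exact brute-force k-NN,
# the baseline that ANN indexes approximate. Random data for illustration.
import numpy as np

rng = np.random.default_rng(1)
database = rng.normal(size=(1000, 128))  # 1,000 stored vectors
query = rng.normal(size=128)

# Euclidean distance: smaller means more similar
euclidean = np.linalg.norm(database - query, axis=1)

# Cosine similarity: larger means more similar
cosine = (database @ query) / (
    np.linalg.norm(database, axis=1) * np.linalg.norm(query)
)

k = 5
nearest_by_euclidean = np.argsort(euclidean)[:k]
nearest_by_cosine = np.argsort(-cosine)[:k]
print(nearest_by_euclidean, nearest_by_cosine)
```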