Commit 97ab6e75 authored by Dellandrea Emmanuel

Create chromadb_example.ipynb

parent 136605a9
%% Cell type:markdown id: tags:
# Chroma database
Chroma is an open-source vector database. Like Milvus, it stores embeddings for similarity search, and it can be used on Windows systems. The following example illustrates its use.
%% Cell type:code id: tags:
``` python
# Installing the chromadb package, plus sentence-transformers (imported later for the embedding model)
!pip install chromadb sentence-transformers
```
%% Cell type:code id: tags:
``` python
# Importing the necessary module
from chromadb import PersistentClient
```
%% Cell type:code id: tags:
``` python
# Creating a database client stored in the "ragdb" folder, or loading it if it already exists
client = PersistentClient(path="./ragdb")
```
%% Cell type:code id: tags:
``` python
# Creating the collection in ChromaDB, or loading it if it already exists
collection_name = "my_rag_collection"
collection = client.get_or_create_collection(name=collection_name)
```
%% Cell type:code id: tags:
``` python
from sentence_transformers import SentenceTransformer

# Load an embedding model
embedding_model = SentenceTransformer("BAAI/bge-small-en-v1.5")

# Define an embedding function
def text_embedding(text):
    return embedding_model.encode(text).tolist()
```
%% Cell type:code id: tags:
``` python
# Adding documents with their embeddings and unique identifiers
documents = [
    "The sun rises in the east and sets in the west.",
    "Raindrops create soothing sounds as they hit the ground.",
    "Stars twinkle brightly in the clear night sky.",
    "The ocean waves crash gently against the shore.",
    "Mountains stand tall and majestic, covered in snow.",
    "Birds chirp melodiously during the early morning hours.",
    "The forest is alive with the sounds of rustling leaves and wildlife.",
    "A gentle breeze flows through the meadow, carrying the scent of flowers."
]
# One embedding per document, in the same order as the documents
embeddings = [text_embedding(document) for document in documents]
# Identifiers must be unique strings; here they are the document positions
ids = [str(i) for i in range(len(documents))]
collection.add(
    documents=documents,
    embeddings=embeddings,
    ids=ids
)
```
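%% Cell type:markdown id: tags:
The cell above stores documents, embeddings and identifiers only. `collection.add` also accepts a `metadatas` argument, a list with one dictionary per document. The following is a minimal sketch of how such a payload could be assembled. The helper name `build_add_payload` and the metadata keys `source` and `n_chars` are assumptions made for illustration and are not part of the original notebook.
%% Cell type:code id: tags:
```python
def build_add_payload(documents, sources):
    """Hypothetical helper: build keyword arguments for collection.add."""
    if len(documents) != len(sources):
        raise ValueError("documents and sources must have the same length")
    return {
        "documents": list(documents),
        # One metadata dictionary per document; the keys are illustrative
        "metadatas": [
            {"source": source, "n_chars": len(document)}
            for document, source in zip(documents, sources)
        ],
        # Unique string identifiers based on the document positions
        "ids": [str(i) for i in range(len(documents))],
    }

sample_documents = [
    "The sun rises in the east and sets in the west.",
    "Stars twinkle brightly in the clear night sky.",
]
payload = build_add_payload(sample_documents, ["nature_notes", "nature_notes"])
print(payload["metadatas"][0])

# With the collection and text_embedding defined in this notebook,
# the payload could then be stored along these lines:
# collection.add(
#     embeddings=[text_embedding(d) for d in payload["documents"]],
#     **payload
# )
```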
%% Cell type:code id: tags:
``` python
# Querying to find the documents most similar to a given phrase
query = "What happens in the forest during the day?"
# query = "Describe how stars appear in a clear night sky."
query_embedding = text_embedding(query)
results = collection.query(
    query_embeddings=[query_embedding],
    n_results=2  # Number of desired similar results
)
```
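%% Cell type:markdown id: tags:
Conceptually, the query compares the query embedding with the stored embeddings and keeps the closest ones. The sketch below illustrates that idea with an exhaustive cosine-similarity search over toy 3-dimensional vectors. It is not how Chroma is implemented: Chroma relies on an index rather than a full scan, and the distance it uses depends on the collection's configuration. The function names and the toy vectors are assumptions made for illustration.
%% Cell type:code id: tags:
```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vector, vectors, k=2):
    """Return the indices of the k vectors most similar to query_vector."""
    scored = [(cosine_similarity(query_vector, v), i) for i, v in enumerate(vectors)]
    scored.sort(reverse=True)  # highest similarity first
    return [index for _, index in scored[:k]]

# Toy vectors standing in for document embeddings (not real model output)
toy_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query = [1.0, 0.05, 0.0]

print(top_k(toy_query, toy_vectors, k=2))  # [0, 1]
```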
%% Cell type:code id: tags:
``` python
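# Displaying the results
# results['documents'] holds one list per query embedding; a single query was sent, so index 0 is used
for document in results['documents'][0]:
    print("Similar document:", document)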
```
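%% Cell type:markdown id: tags:
The dictionary returned by `collection.query` groups its fields (`ids`, `documents`, `distances`, among others) as lists containing one inner list per query embedding. The sketch below shows one way to pair these fields for a single query. The helper `flatten_results` is hypothetical, and `mock_results` contains placeholder values that only mimic the shape of a result; they are not output from the query above.
%% Cell type:code id: tags:
```python
def flatten_results(results, query_index=0):
    """Hypothetical helper: pair ids, documents and distances for one query."""
    ids = results["ids"][query_index]
    documents = results["documents"][query_index]
    distances = results["distances"][query_index]
    return list(zip(ids, documents, distances))

# Placeholder values with the same nesting as a query result (not real output)
mock_results = {
    "ids": [["a", "b"]],
    "documents": [["first placeholder document", "second placeholder document"]],
    "distances": [[0.25, 0.5]],
}

for doc_id, document, distance in flatten_results(mock_results):
    print(f"id={doc_id} distance={distance:.2f} document={document}")
```
%% Cell type:markdown id: tags:
With the real `results` object from the query cell, the same helper could be called as `flatten_results(results)`.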