Build Multilingual AI Memory using Cohere Embedding Models
Langbase AI memory now supports Cohere embeddings. This guide will highlight their performance, cost, and use cases, and walk you through the process of creating and using Cohere embeddings in Langbase Memory.
Memory allows you to store, organize, and retrieve information. It can be used to build powerful Retrieval Augmented Generation (RAG) based AI agents. These agents can answer queries using your own data, leading to more accurate and relevant responses.
What are Embeddings?
Embeddings are a crucial part of Memory. They are mathematical representations of text as vectors in a multi-dimensional space. They capture the semantic meaning of the input, making it easier for LLMs to understand and process the information.
For example, the phrases "I like cats" and "I enjoy felines" will have similar embeddings because they mean roughly the same thing.
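To make "similar embeddings" concrete, here is a minimal sketch of cosine similarity, a common way to compare two embedding vectors. The three-dimensional vectors are made up for illustration; they are not real model output, and real embedding models return far longer vectors.

```javascript
// Cosine similarity: values near 1 mean the vectors point in almost the same
// direction, values near 0 mean they are unrelated.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Made-up toy vectors standing in for real embeddings.
const likeCats = [0.9, 0.1, 0.3];
const enjoyFelines = [0.85, 0.15, 0.35];
const stockPrices = [0.1, 0.9, 0.2];

console.log(cosineSimilarity(likeCats, enjoyFelines)); // close to 1
console.log(cosineSimilarity(likeCats, stockPrices)); // noticeably lower
```

Retrieval in a RAG setup typically ranks stored chunks by a score like this one and returns the highest-scoring chunks for a query.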
In a RAG pipeline, high-quality embeddings help the model retrieve the most relevant information from memory to generate accurate responses. Good embedding models offer multilingual support, strong semantic understanding, and good performance.
Cohere Embeddings
At Langbase, we are introducing support for embedding models from various providers, starting with Cohere. Currently, the following embedding models are supported:
- embed-multilingual-v3.0 (by Cohere)
- embed-multilingual-light-v3.0 (by Cohere)
- text-embedding-3-large (by OpenAI)
Cohere embed has emerged as an impressive OpenAI alternative, and here are two reasons that stand out:
- Multilingual Advantage: Cohere's embed models excel at multilingual applications, showing better performance across different languages in benchmarks compared to OpenAI.
- Cost: Cohere can be more cost-effective than OpenAI, especially for high-volume data, due to its flat rates. Cohere charges $1 per 1000 embeddings, while OpenAI charges on a per-token basis.
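The difference between flat and per-token pricing can be sketched with a little arithmetic. In the snippet below, the $1 per 1000 embeddings default comes from this guide; the per-token rate used in the usage lines is a made-up placeholder, not a real price, so substitute current figures from each provider's pricing page. All function names are hypothetical.

```javascript
// Flat pricing: cost depends only on how many embeddings you create.
function flatRateCost(embeddingCount, pricePerThousandEmbeddings = 1) {
  return (embeddingCount / 1000) * pricePerThousandEmbeddings;
}

// Per-token pricing: cost grows with the length of each embedded chunk.
function perTokenCost(embeddingCount, avgTokensPerEmbedding, pricePerMillionTokens) {
  const totalTokens = embeddingCount * avgTokensPerEmbedding;
  return (totalTokens / 1_000_000) * pricePerMillionTokens;
}

// Average chunk length (in tokens) at which both pricing models cost the same.
// Chunks longer than this favor the flat rate; shorter chunks favor per-token.
function breakEvenTokens(pricePerThousandEmbeddings, pricePerMillionTokens) {
  return (pricePerThousandEmbeddings * 1000) / pricePerMillionTokens;
}

// Placeholder rate for illustration only: $2 per million tokens.
const PLACEHOLDER_PRICE_PER_MILLION_TOKENS = 2;

console.log(flatRateCost(50000));
console.log(perTokenCost(50000, 400, PLACEHOLDER_PRICE_PER_MILLION_TOKENS));
console.log(breakEvenTokens(1, PLACEHOLDER_PRICE_PER_MILLION_TOKENS));
```

The break-even figure is the useful part: a flat rate pays off when your chunks are long, because their length no longer affects the bill.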
It's bananas easy to try it out. Let's take a look at how we can build an AI memory using Cohere's embeddings on Langbase.
Step #0: Sign up
We will be building an AI memory agent using Langbase. So please go ahead and create an account on Langbase.
Step #1: Set up a Node project and install the Langbase SDK
Now let's set up a Node project. To do it, run the following command in your project terminal:
npm init -y
This will create a package.json file with basic information. The code in this guide uses ES module import syntax, so also add "type": "module" to package.json. We will use the Langbase SDK and the dotenv package to read environment variables in our code. Run the command below.
npm install langbase dotenv
Lastly, let's create an index.js file in our project directory. This will contain all the code we will write to build an AI memory agent.
Step #2: Generate a user API key
The next step is to generate a user API key on Langbase. We will use this key to authenticate our Langbase API requests.
Now go ahead and create a .env file in your project directory and add your API key there.
LANGBASE_API_KEY=<REPLACE_WITH_LANGBASE_API_KEY>
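Before making any API calls, it can help to fail fast when the key is missing instead of getting an authentication error later. The helper below is a hypothetical sketch (requireEnv is not part of the Langbase SDK or dotenv); it only reads from process.env, which dotenv populates from your .env file.

```javascript
// Hypothetical helper: return the variable's value or throw a clear error.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing environment variable ${name}. Add it to your .env file.`);
  }
  return value;
}

// Usage, after dotenv has loaded the .env file:
// const apiKey = requireEnv('LANGBASE_API_KEY');
```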
Step #3: Create an AI memory agent
Now we will create an AI memory agent on Langbase using the Langbase SDK. Following the docs, we will write a function that creates a new memory in the index.js file.
import 'dotenv/config';
import { Langbase } from 'langbase';

// Create a Langbase client authenticated with the API key from .env
const langbase = new Langbase({
  apiKey: process.env.LANGBASE_API_KEY,
});

// Create a memory that embeds its documents with Cohere's multilingual model
async function createNewMemory() {
  const response = await langbase.memory.create({
    name: 'multilingual-knowledge-base',
    description: 'Advanced memory with multilingual support using Cohere',
    embedding_model: 'cohere:embed-multilingual-v3.0'
  });

  const newMemory = await response.json();
  return newMemory;
}
This function will create a multilingual-knowledge-base memory on Langbase. The embedding_model parameter specifies the embedding model you want to use for the memory. In this case, we are using Cohere's embed-multilingual-v3.0.
Step #4: Upload data to memory
Now when you upload docs to the multilingual-knowledge-base memory, Langbase will use Cohere's embed-multilingual-v3.0 to embed them.
Let's say you have a markdown file product-faqs.md with product FAQs in different languages like this:
1. What is the battery life of the smartphone? (English)
   The smartphone has a battery life of up to 24 hours on a single charge.

2. ¿Cuál es la duración de la batería del teléfono inteligente? (Spanish)
   El teléfono inteligente tiene una duración de batería de hasta 24 horas con una sola carga.

3. Quelle est l'autonomie de la batterie du smartphone ? (French)
   Le smartphone a une autonomie de batterie allant jusqu'à 24 heures avec une seule charge.

4. Wie lange hält der Akku des Smartphones? (German)
   Das Smartphone hat eine Akkulaufzeit von bis zu 24 Stunden mit einer einzigen Ladung.

5. Qual è la durata della batteria dello smartphone? (Italian)
   Lo smartphone ha una durata della batteria di fino a 24 ore con una sola carica.

---

6. Does the smartphone support wireless charging? (English)
   Yes, the smartphone supports fast wireless charging.

7. ¿El teléfono inteligente admite carga inalámbrica? (Spanish)
   Sí, el teléfono inteligente admite carga inalámbrica rápida.

8. Le smartphone prend-il en charge la charge sans fil ? (French)
   Oui, le smartphone prend en charge la charge sans fil rapide.

9. Unterstützt das Smartphone kabelloses Laden? (German)
   Ja, das Smartphone unterstützt schnelles kabelloses Laden.

10. Lo smartphone supporta la ricarica wireless? (Italian)
   Sì, lo smartphone supporta la ricarica wireless veloce.

---

11. What storage options are available for the smartphone? (English)
   The smartphone is available in 64GB, 128GB, and 256GB storage options.

12. ¿Qué opciones de almacenamiento están disponibles para el teléfono inteligente? (Spanish)
   El teléfono inteligente está disponible en opciones de almacenamiento de 64GB, 128GB y 256GB.

13. Quelles options de stockage sont disponibles pour le smartphone ? (French)
   Le smartphone est disponible en options de stockage de 64 Go, 128 Go et 256 Go.

14. Welche Speicheroptionen sind für das Smartphone verfügbar? (German)
   Das Smartphone ist in den Speicheroptionen 64 GB, 128 GB und 256 GB erhältlich.

15. Quali opzioni di archiviazione sono disponibili per lo smartphone? (Italian)
   Lo smartphone è disponibile nelle opzioni di archiviazione da 64 GB, 128 GB e 256 GB.

---

16. Is there a warranty for the smartphone? (English)
   Yes, the smartphone comes with a one-year warranty.

17. ¿Hay garantía para el teléfono inteligente? (Spanish)
   Sí, el teléfono inteligente viene con un año de garantía.

18. Y a-t-il une garantie pour le smartphone ? (French)
   Oui, le smartphone est livré avec une garantie d'un an.

19. Gibt es eine Garantie für das Smartphone? (German)
   Ja, das Smartphone hat eine einjährige Garantie.

20. C'è una garanzia per lo smartphone? (Italian)
   Sì, lo smartphone viene fornito con una garanzia di un anno.
You can upload this file to the memory using the Langbase SDK:
import { readFileSync } from 'fs';

// Read the local file and upload it to the memory along with its metadata
async function uploadDocument(filePath) {
  return await langbase.memory.documents.upload({
    memoryName: 'multilingual-knowledge-base',
    contentType: 'text/markdown',
    documentName: 'faqs.md',
    document: readFileSync(filePath),
    meta: {
      url: 'https://example.com/faqs.md'
    }
  });
}
This function will upload the product-faqs.md file from your project folder (using the given file path) to the multilingual-knowledge-base memory.
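The upload above hardcodes contentType to text/markdown. If you plan to upload other kinds of files, one option is to derive the content type from the file extension. The helper below is a hypothetical sketch; the extension list is an assumption, so check the Langbase docs for the content types a memory actually accepts.

```javascript
// Assumed mapping from file extension to MIME type (not taken from the Langbase docs).
const CONTENT_TYPES = {
  '.md': 'text/markdown',
  '.txt': 'text/plain',
  '.pdf': 'application/pdf',
  '.csv': 'text/csv'
};

// Hypothetical helper: look up the MIME type for a file path, or throw if unknown.
function contentTypeFor(filePath) {
  const dotIndex = filePath.lastIndexOf('.');
  const extension = dotIndex === -1 ? '' : filePath.slice(dotIndex).toLowerCase();
  const contentType = CONTENT_TYPES[extension];
  if (!contentType) {
    throw new Error(`No content type known for "${filePath}"`);
  }
  return contentType;
}

console.log(contentTypeFor('product-faqs.md')); // text/markdown
```

You could then pass contentTypeFor(filePath) as the contentType value instead of a fixed string.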
Once uploaded, Langbase will use Cohere's embed-multilingual-v3.0 to process it. You can view the file's status in the memory tab of Langbase Studio. Once ready, you can test the memory by asking questions in different languages to verify if it provides relevant answers.
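Because processing takes a moment, a script that uploads and then retrieves right away may query the memory before the document is ready. A generic polling helper is one way to wait. This is a sketch under assumptions: waitUntil is a hypothetical helper, and the check function you pass in (isDocumentReady in the usage comment) is something you would implement yourself with whatever status check is available to you.

```javascript
// Call check() repeatedly until it returns a truthy value or the timeout passes.
// Resolves to true if the check succeeded, false if the timeout was reached.
async function waitUntil(check, { intervalMs = 2000, timeoutMs = 60000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await check()) return true;
    // Pause before the next attempt
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}

// Usage sketch (isDocumentReady is hypothetical, not a Langbase SDK function):
// const ready = await waitUntil(() => isDocumentReady('faqs.md'));
// if (!ready) throw new Error('Document was not processed in time');
```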
Step #5: Retrieve relevant data
We will use the Memory Retrieve function to get the relevant chunks from the memory.
const englishQuery = `Is there a warranty for the smartphone?`;
const spanishQuery = `¿Hay garantía para el teléfono inteligente?`;

// Retrieve the chunks most similar to the given query
async function retrieveSimilarChunks(query) {
  const response = await langbase.memory.retrieve({
    query,
    memory: [{ name: 'multilingual-knowledge-base' }]
  });

  const similarChunks = await response.json();
  return similarChunks;
}
Here is what the final code will look like:
import 'dotenv/config';
import { readFileSync } from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
import { Langbase } from 'langbase';

// __dirname is not defined in ES modules, so derive it from import.meta.url
const __dirname = path.dirname(fileURLToPath(import.meta.url));

const langbase = new Langbase({
  apiKey: process.env.LANGBASE_API_KEY
});

async function createNewMemory() {
  const response = await langbase.memory.create({
    name: 'multilingual-knowledge-base',
    description: 'Advanced memory with multilingual support using Cohere',
    embedding_model: 'cohere:embed-multilingual-v3.0'
  });

  const newMemory = await response.json();
  return newMemory;
}

async function uploadDocument(filePath) {
  return await langbase.memory.documents.upload({
    memoryName: 'multilingual-knowledge-base',
    contentType: 'text/markdown',
    documentName: 'faqs.md',
    document: readFileSync(filePath),
    meta: {
      url: 'https://example.com/faqs.md'
    }
  });
}

async function retrieveSimilarChunks(query) {
  const response = await langbase.memory.retrieve({
    query: query,
    memory: [{ name: 'multilingual-knowledge-base' }]
  });

  const similarChunks = await response.json();
  return similarChunks;
}

(async function () {
  const newMemory = await createNewMemory();

  // Update file path to use your local file
  const filePath = path.join(__dirname, 'product-faqs.md');
  await uploadDocument(filePath);

  // Wait for the uploaded document to be processed, then retrieve similar chunks

  const englishQuery = `What kind of warranty do I get after purchasing the phone?`;
  const spanishQuery = `¿Qué tipo de garantía tengo después de comprar el teléfono?`;

  const englishQueryResult = await retrieveSimilarChunks(englishQuery);
  console.log(JSON.stringify(englishQueryResult, null, 2));

  const spanishQueryResult = await retrieveSimilarChunks(spanishQuery);
  console.log(JSON.stringify(spanishQueryResult, null, 2));
})();
Step #6: Run AI memory agent
Lastly, we will run our index.js file. It will create a memory, upload the markdown document to it along with its metadata, and then retrieve chunks from the memory for the following indirect user queries:
- What kind of warranty do I get after purchasing the phone? (English)
- ¿Qué tipo de garantía tengo después de comprar el teléfono? (Spanish)
node index.js
It should show an output close to this for the English and Spanish queries:
// English query response
[
  {
    "text": "Yes, the smartphone comes with a one-year warranty.",
    "similarity": 0.99,
    "metadata": {
      "url": "https://example.com/faqs.md"
    }
  }
]
// Spanish query response
[
  {
    "text": "Sí, el teléfono inteligente viene con un año de garantía.",
    "similarity": 0.99,
    "metadata": {
      "url": "https://example.com/faqs.md"
    }
  }
]
You can see that the memory agent is able to retrieve the relevant information for both the English and Spanish queries. Feel free to play around with different queries.
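Retrieved chunks are usually passed to an LLM as context. The sketch below shows one way to turn a result array of the shape shown above (text, similarity, metadata) into a prompt. The buildContext and buildPrompt helpers, the 0.5 threshold, and the prompt wording are all assumptions for illustration, not part of the Langbase SDK.

```javascript
// Keep chunks at or above a similarity threshold, best match first,
// and join them into a numbered context string.
function buildContext(chunks, minSimilarity = 0.5) {
  return chunks
    .filter((chunk) => chunk.similarity >= minSimilarity)
    .sort((a, b) => b.similarity - a.similarity)
    .map((chunk, i) => `[${i + 1}] ${chunk.text}`)
    .join('\n');
}

// Combine the user's question with the retrieved context.
function buildPrompt(question, chunks) {
  const context = buildContext(chunks);
  return `Answer the question using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}

// Sample data mirroring the response shape shown in this guide.
const sampleChunks = [
  { text: 'The smartphone has a battery life of up to 24 hours.', similarity: 0.31, metadata: {} },
  { text: 'Yes, the smartphone comes with a one-year warranty.', similarity: 0.99, metadata: {} }
];

console.log(buildPrompt('What kind of warranty do I get?', sampleChunks));
```

Filtering by similarity keeps weakly related chunks, such as the battery answer in the sample data, out of the prompt.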
Wrap up
In just a few minutes, your multilingual knowledge base memory is ready to assist users in multiple languages. That's all from this guide.
Dive deeper into Langbase's Memory API to explore more features and functionalities. You can also check out the guide on building multi-agent AI support.