AI Tool Review - Langflow
In this post, I’ll show you how to use Langflow to build a RAG pipeline in a drag-and-drop no-code interface, generate embeddings with OpenAI and save them in ChromaDB, then deploy it all to Azure.
Setup
First, we need to install Langflow.
Langflow
Make sure to install the v1.0.0a56 pre-release (as of June 14, 2024, the stable version is missing a lot of nice updates).
pip install langflow==1.0.0a56
ChromaDB
Chroma is a vector database. We will store our embeddings locally here.
pip install chromadb
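To get a feel for what Chroma does, here’s a minimal standalone sketch (the path, collection name, and documents are just placeholders for this example; Chroma’s default embedding function is used here rather than OpenAI’s):

import chromadb

# Persist the collection to disk rather than keeping it in memory.
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("demo")

# Chroma embeds these documents with its default embedding function.
collection.add(
    documents=["Langflow is a visual framework for building LLM apps."],
    ids=["doc-1"],
)

results = collection.query(query_texts=["What is Langflow?"], n_results=1)
print(results["documents"])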
How to Use
I’ll walk through how to set up your local Langflow server, connect it to ChromaDB, and run a query against it using Python.
1. Start Langflow server (local)
First, start the Langflow server by running:
langflow run
This will start a server at http://localhost:7860
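Once it’s running, you can sanity-check the server from Python (a quick sketch that just confirms the UI is being served):

import requests

resp = requests.get("http://localhost:7860")
print(resp.status_code)  # expect 200 once startup has finished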
2. Set up API keys / variables
We’ll start by creating a variable for securely storing our OpenAI API key. Drag and drop the “OpenAI Embeddings” component onto the canvas, then click “+ Add New Variable”.
Fill out the popup form with your API key:
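Alternatively, if you’d rather not store the key in the UI, Langflow can pick it up from the environment (a sketch; the key value is a placeholder, and this pairs with the fallback_to_env_vars option used later in the Flask app):

import os

# Placeholder; in practice, export OPENAI_API_KEY in your shell before `langflow run`.
os.environ["OPENAI_API_KEY"] = "sk-..."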
3. Set up the RAG pipeline
Next, we need to set up a pipeline to load our content, embed it, and save it in a database for later retrieval.
To do so, we drag and drop the following elements:
- Add the “URL” component to load content from a URL.
- Add “Recursive Character Text Splitter” to split the URL’s content into chunks for embedding.
- Add “OpenAI Embeddings” and set OPENAI_API_KEY as the variable for the OpenAI API key.
- Add the “Chroma” component and set a path for the “Persist Directory” to save our embeddings.
- Finally, connect everything together as shown in the screenshot below.
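For intuition, here’s roughly what this flow does, sketched in plain Python with LangChain (an approximation of the components’ behavior, not Langflow’s actual internals; the URL, chunk sizes, and directory are placeholders):

from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# Load the page, split it into overlapping chunks, embed, and persist.
docs = WebBaseLoader("https://example.com/article").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)
store = Chroma.from_documents(chunks, OpenAIEmbeddings(), persist_directory="./chroma_db")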
4. Set up the Chat pipeline
Now that we’ve embedded our documents, we need to tie them into a chatbot.
To do so, we connect a chat input to the “Chroma Search” component (pointed at the same Persist Directory as before), feed the retrieved chunks into a prompt for the model, and wire the model’s response to a chat output.
Note: You’ll need to click the “Edit Code” button on the Chroma Search component and change the last line from:
return self.search_with_vector_store(input_value, search_type, vector_store, k=number_of_results)
to
return self.search_with_vector_store(input_value.text, search_type, vector_store, k=number_of_results)
to fix a bug where a Message object is passed to the vector store instead of its text.
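The retrieval side is then a plain similarity search over that store. Roughly, again sketched with LangChain under the same placeholder persist directory:

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

store = Chroma(persist_directory="./chroma_db", embedding_function=OpenAIEmbeddings())
for doc in store.similarity_search("What does the page say about deployment?", k=4):
    print(doc.page_content[:200])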
5. End-to-End Test
To test our setup, click on the “Playground” button in the bottom right of the screen.
This will open up a chat window that will allow you to interact with your chatbot and view its intermediate outputs:
6. Package for Deployment
To package our Langflow flow for deployment, we need to follow a few steps.
First, export the flow as a .json file by clicking the “Export” button in the top-left dropdown menu. The .json download for this tutorial was ~90KB.
Second, click the “</> API” button in the bottom right of the page to open the code popup.
Third, click the “Python Code” tab and copy all of the code in the panel.
Fourth, move the exported .json into your Flask app’s directory and add the following route:
from flask import Flask, request, render_template
from langflow.load import run_flow_from_json

app = Flask(__name__)
TWEAKS = {}  # paste the TWEAKS dict from the “Python Code” tab here

@app.route("/chat", methods=['POST'])
def chat():
    question = request.form.get('question')
    answer = None
    if question:
        result = run_flow_from_json(flow="Azure RAG.json",
                                    input_value=question,
                                    fallback_to_env_vars=False,  # False by default
                                    tweaks=TWEAKS)
        answer = result[0].outputs[0].results
        print("Question=", question, "Answer=", answer)
    return render_template('chat.html', question=question, answer=answer)
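To smoke-test the route locally, you can POST a question to it from Python (a minimal sketch; it assumes the Flask app is running on localhost:5000 and that chat.html renders the answer):

import requests

resp = requests.post(
    "http://localhost:5000/chat",
    data={"question": "What does the page say about deployment?"},
)
print(resp.status_code)
print(resp.text[:500])  # start of the rendered chat.html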
7. Deploy to Azure
Follow the instructions here.
Local testing:
docker build --tag azure_rag .
docker run --detach --publish 5000:80 azure_rag
Deployment to Azure:
# Create resource group
az group create --name azure_rag --location eastus
# Create Azure container registry
az acr create --resource-group azure_rag --name azureragregistry --sku Basic --admin-enabled true
# Get password
ACR_PASSWORD=$(az acr credential show \
--resource-group azure_rag \
--name azureragregistry \
--query "passwords[?name == 'password'].value" \
--output tsv)
# Build Docker image
az acr build --resource-group azure_rag --registry azureragregistry --image azure_rag:latest .
# Create App Service
az appservice plan create --name azure_rag_webplan --resource-group azure_rag --sku B1 --is-linux
# Create Web App
az webapp create --resource-group azure_rag --plan azure_rag_webplan --name azure-rag-web-app --container-registry-password $ACR_PASSWORD --container-registry-user azureragregistry --role acrpull --deployment-container-image-name azureragregistry.azurecr.io/azure_rag:latest
To redeploy:
az acr build --resource-group azure_rag --registry azureragregistry --image azure_rag:latest .
az webapp restart --name azure-rag-web-app --resource-group azure_rag
Some tips:
- You can’t deploy a container using the free trial Azure subscription, so you’ll need to create a Pay-as-you-go subscription.
- To save logs on your Linux App Service, follow the instructions in the Azure documentation here.
- You need to change gunicorn’s port from 50505 to 80 in the Dockerfile, because Azure App Service expects the app to be running on port 80. If you don’t do this, you’ll get a 504 gateway error when you try to access the app.
Takeaways
I only scratched the surface of what these two tools can do, but even just building and querying a simple RAG chatbot was extremely useful.
Strengths
- Easy setup: Getting Langflow running locally was super simple.
- Pretty UI: The Langflow web UI is really nice. I like how it shows you all the possible inputs/outputs for each component. The components are organized logically, and it comes with out-of-the-box support for a ton of different services.
- Customization: I liked the ability to directly edit the code of each component.
Weaknesses
- Debugging is hard. The abstractions make it difficult to pinpoint bugs and fix them. This lack of transparency was a huge drawback and honestly made the rest of the benefits not worth it.
  - I kept getting a bug in ChromaSearch that threw the error ValueError: Error building vertex Chroma Search: Invalid inputs provided. The cause was that a Message object was getting passed to the Chroma vector object instead of Text. I fixed this by using the “Edit Code” feature on the component itself, but it was hard to debug.
  - When I ran the Flask app on my exported .json, I kept getting ValueError: Error running graph: Error building vertex Ollama: Ollama call failed with status code 400. Details: {"error":"time: missing unit in duration \"-1\""}. I fixed this by adding a tweak of "timeout": "100s", but then I got the cryptic ValueError: Error running graph: Error building vertex Ollama: Could not initialize Ollama LLM. I couldn’t figure out how to fix this and gave up, so if anyone has advice, I’d appreciate it!
- Ollama support buggy? I tried getting Ollama to work but had trouble exporting it, so I switched to OpenAI for the model/embeddings.
- Exporting .json is very coarse. It felt very ham-handed to export a ~90KB JSON file to represent the flow I created, as I couldn’t edit it after exporting or debug/optimize it. I’d much rather have it convert to Python code so that I could directly modify it.