AI Tool Review - Langflow
In this post, I’ll show you how to use Langflow to build a RAG pipeline in a drag-and-drop no-code interface, generate embeddings with OpenAI and save them in ChromaDB, then deploy it all to Azure.
Setup
First, we need to install Langflow.
Langflow
Make sure to install the v1.0.0a56 pre-release (as of June 14, 2024, the stable version is missing a lot of nice updates).
pip install langflow==1.0.0a56
ChromaDB
Chroma is a vector database. We will store our embeddings locally here.
pip install chromadb
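To get a feel for what Chroma does, here’s a minimal standalone sketch (the path, collection name, and documents are just placeholders for this example; Chroma’s default embedding function is used here rather than OpenAI’s):

import chromadb

# Persist the collection to disk rather than keeping it in memory.
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("demo")

# Chroma embeds these documents with its default embedding function.
collection.add(
    documents=["Langflow is a visual framework for building LLM apps."],
    ids=["doc-1"],
)

results = collection.query(query_texts=["What is Langflow?"], n_results=1)
print(results["documents"])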
How to Use
I’ll walk through how to set up your local Langflow server, connect it to ChromaDB, and run a query against it using Python.
1. Start Langflow server (local)
First, start the Langflow server by running:
langflow run
This will start a server at http://localhost:7860
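Once it’s running, you can sanity-check the server from Python (a quick sketch that just confirms the UI is being served):

import requests

resp = requests.get("http://localhost:7860")
print(resp.status_code)  # expect 200 once startup has finished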
2. Set up API keys / variables
We’ll start by creating a variable for securely storing our OpenAI API key. Drag and drop the “OpenAI Embeddings” component onto the canvas, then click “+ Add New Variable”.
Fill out the popup form with your API key:
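Alternatively, if you’d rather not store the key in the UI, Langflow can pick it up from the environment (a sketch; the key value is a placeholder, and this pairs with the fallback_to_env_vars option used later in the Flask app):

import os

# Placeholder; in practice, export OPENAI_API_KEY in your shell before `langflow run`.
os.environ["OPENAI_API_KEY"] = "sk-..."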
3. Set up the RAG pipeline
Next, we need to set up a pipeline to load our content, embed it, and save it in a database for later retrieval.
To do so, we drag and drop the following elements:
- Add the “URL” component to load content from a URL.
- Add “Recursive Character Text Splitter” to split the URL’s content into chunks for embedding.
- Add “OpenAI Embeddings” and set OPENAI_API_KEY as the variable for the OpenAI API key.
- Add the “Chroma” component and set a path for the “Persist Directory” to save our embeddings.
- Finally, connect everything together as shown in the screenshot below.
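For intuition, here’s roughly what this flow does, sketched in plain Python with LangChain (an approximation of the components’ behavior, not Langflow’s actual internals; the URL, chunk sizes, and directory are placeholders):

from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# Load the page, split it into overlapping chunks, embed, and persist.
docs = WebBaseLoader("https://example.com/article").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)
store = Chroma.from_documents(chunks, OpenAIEmbeddings(), persist_directory="./chroma_db")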
4. Set up the Chat pipeline
Now that we’ve embedded our documents, we need to tie them into a chatbot.
To do so, we connect a chat input to the “Chroma Search” component (pointed at the same Persist Directory as before), feed the retrieved chunks into a prompt for the model, and wire the model’s response to a chat output.
Note: You’ll need to click the “Edit Code” button on the Chroma Search component and change the last line from:
return self.search_with_vector_store(input_value, search_type, vector_store, k=number_of_results)
to
return self.search_with_vector_store(input_value.text, search_type, vector_store, k=number_of_results)
to fix a bug where a Message object is passed to the vector store instead of its text.
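The retrieval side is then a plain similarity search over that store. Roughly, again sketched with LangChain under the same placeholder persist directory:

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

store = Chroma(persist_directory="./chroma_db", embedding_function=OpenAIEmbeddings())
for doc in store.similarity_search("What does the page say about deployment?", k=4):
    print(doc.page_content[:200])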
5. End-to-End Test
To test our setup, click on the “Playground” button in the bottom right of the screen.
This will open up a chat window that will allow you to interact with your chatbot and view its intermediate outputs:
6. Package for Deployment
To package our Langflow flow for deployment, we need to follow a few steps.
First, export the flow as a .json file by clicking the “Export” button in the top-left dropdown menu. The .json download for this tutorial was ~90KB.
Second, click the “</> API” button in the bottom right of the page to open the code popup.
Third, click the “Python Code” tab and copy all of the code in the panel.
Fourth, move the exported .json into your Flask app’s directory and add the following route:
from flask import Flask, request, render_template
from langflow.load import run_flow_from_json

app = Flask(__name__)
TWEAKS = {}  # paste the TWEAKS dict from the “Python Code” tab here

@app.route("/chat", methods=['POST'])
def chat():
    question = request.form.get('question')
    answer = None
    if question:
        result = run_flow_from_json(flow="Azure RAG.json",
                                    input_value=question,
                                    fallback_to_env_vars=False,  # False by default
                                    tweaks=TWEAKS)
        answer = result[0].outputs[0].results
        print("Question=", question, "Answer=", answer)
    return render_template('chat.html', question=question, answer=answer)
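To smoke-test the route locally, you can POST a question to it from Python (a minimal sketch; it assumes the Flask app is running on localhost:5000 and that chat.html renders the answer):

import requests

resp = requests.post(
    "http://localhost:5000/chat",
    data={"question": "What does the page say about deployment?"},
)
print(resp.status_code)
print(resp.text[:500])  # start of the rendered chat.html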
7. Deploy to Azure
Follow the instructions here.
Local testing:
docker build --tag azure_rag .
docker run --detach --publish 5000:80 azure_rag
Deployment to Azure:
# Create resource group
az group create --name azure_rag --location eastus
# Create Azure container registry
az acr create --resource-group azure_rag --name azureragregistry --sku Basic --admin-enabled true
# Get password
ACR_PASSWORD=$(az acr credential show \
--resource-group azure_rag \
--name azureragregistry \
--query "passwords[?name == 'password'].value" \
--output tsv)
# Build Docker image
az acr build --resource-group azure_rag --registry azureragregistry --image azure_rag:latest .
# Create App Service
az appservice plan create --name azure_rag_webplan --resource-group azure_rag --sku B1 --is-linux
# Create Web App
az webapp create --resource-group azure_rag --plan azure_rag_webplan --name azure-rag-web-app --container-registry-password $ACR_PASSWORD --container-registry-user azureragregistry --role acrpull --deployment-container-image-name azureragregistry.azurecr.io/azure_rag:latest
To redeploy:
az acr build --resource-group azure_rag --registry azureragregistry --image azure_rag:latest .
az webapp restart --name azure-rag-web-app --resource-group azure_rag
Some tips:
- You can’t deploy a container using the free trial Azure subscription, so you’ll need to create a Pay-as-you-go subscription.
- To save logs on your Linux App Service, follow the instructions in the Azure documentation here.
- You need to change gunicorn’s port from 50505 to 80 in the Dockerfile, because Azure App Service expects the app to be running on port 80. If you don’t do this, you’ll get a 504 gateway error when you try to access the app.
Takeaways
I only scratched the surface of what these two tools can do, but even just building and querying a simple RAG chatbot was extremely useful.
Strengths
- Easy setup: Getting Langflow running locally was super simple.
- Pretty UI: The Langflow web UI is really nice. I like how it shows you all the possible inputs/outputs for each component. The components are organized logically, and it comes with out-of-the-box support for a ton of different services.
- Customization: I liked the ability to directly edit the code of each component.
Weaknesses
- Debugging is hard. The abstractions make it difficult to pinpoint bugs and fix them. This lack of transparency was a huge drawback and honestly made the rest of the benefits not worth it.
  - I kept getting a bug in ChromaSearch that threw the error ValueError: Error building vertex Chroma Search: Invalid inputs provided. The cause was that a Message object was getting passed to the Chroma vector object instead of Text. I fixed this by using the “Edit Code” feature on the component itself, but it was hard to debug.
  - When I ran the Flask app on my exported .json, I kept getting ValueError: Error running graph: Error building vertex Ollama: Ollama call failed with status code 400. Details: {"error":"time: missing unit in duration \"-1\""}. I fixed this by adding a tweak of "timeout": "100s", but then I got the cryptic ValueError: Error running graph: Error building vertex Ollama: Could not initialize Ollama LLM. I couldn’t figure out how to fix this and gave up, so if anyone has advice, I’d appreciate it!
- Ollama support buggy? I tried getting Ollama to work but had trouble exporting it, so I switched to OpenAI for the model/embeddings.
- Exporting .json is very coarse. It felt very ham-handed to export a ~90KB JSON file to represent the flow I created, as I couldn’t edit it after exporting or debug/optimize it. I’d much rather have it convert to Python code so that I could directly modify it.