AI Tool Review - Langfuse + LiteLLM
In this post, I’ll show you how to use LiteLLM to query 100+ different LLMs using the same unified interface, then track all of these calls automatically in a locally hosted Langfuse dashboard.
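To give a taste of that unified interface, here's a minimal sketch (assuming you've already set the relevant provider API keys, e.g. OPENAI_API_KEY and ANTHROPIC_API_KEY; the model names are just illustrative):

import litellm

messages = [{"role": "user", "content": "What is 1+1?"}]

# Same completion() call, different providers -- only the model string changes.
openai_response = litellm.completion(model="gpt-3.5-turbo", messages=messages)
anthropic_response = litellm.completion(model="claude-3-haiku-20240307", messages=messages)

print(openai_response["choices"][0]["message"]["content"])
print(anthropic_response["choices"][0]["message"]["content"])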
Langfuse logs the following info for each query (aka “trace”):
- Total tokens (prompt and completion)
- Total cost ($)
- Latency (s)
- Input and output
- Model
- and more!
End result of this tutorial
Setup
First, we need to install LiteLLM and Langfuse. We’ll use the self-hosted Docker version of Langfuse.
LiteLLM
pip install litellm
Langfuse
First, you’ll need to install Docker:
brew install docker
brew install docker-compose
(Note: on macOS these formulas install only the CLIs; if you don’t already have a Docker daemon running, install Docker Desktop instead with brew install --cask docker, which bundles Compose.)
Second, the Python library:
pip install "langfuse>=2.0.0"
Third, clone the repo for the self-hosted Docker option:
git clone https://github.com/langfuse/langfuse.git
cd langfuse
How to Use
I’ll walk through how to set up your local Langfuse server, connect it to LiteLLM, and run a logged query using Python.
1. Start Langfuse server (local)
First, run the Langfuse server by cd-ing into the langfuse/ directory we cloned previously, and running:
docker compose up
This will start a server at http://localhost:3000.
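If you want to confirm the server is ready before moving on, a quick sanity check from Python (my own addition, using the requests library; not part of Langfuse) is to poll that URL:

import time

import requests

# Poll the local Langfuse server until it responds (or we give up).
for _ in range(30):
    try:
        if requests.get("http://localhost:3000").ok:
            print("Langfuse is up!")
            break
    except requests.ConnectionError:
        pass
    time.sleep(2)  # server may still be starting; try again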
2. Get Langfuse API keys (local)
Go to http://localhost:3000 in your browser.
Click “Sign up” to create a local account:
Create your account. (Note: this isn’t an official Langfuse account, just a locally hosted one. I’m not sure why this step is necessary for self-hosting.)
You should now see a page that looks like this. This is where your LLM calls will be logged:
Retrieve your API keys. Click on “Settings”, then scroll down to “API keys”. Click “Create new API keys” to create your new public/secret key pair.
Copy down your public/secret keys. We’ll use them for LiteLLM in the step below.
3. LiteLLM
Here is a minimal working example of using LiteLLM to query GPT-3.5 and log the results in Langfuse:
from typing import Dict, List
import os

import litellm

# Set environment variables + connect Langfuse <> LiteLLM
os.environ["LANGFUSE_HOST"] = "http://localhost:3000"
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-..."  # from Step 2
os.environ["LANGFUSE_SECRET_KEY"] = "sk-..."  # from Step 2

# Log every successful LLM call to Langfuse
litellm.success_callback = ["langfuse"]

messages: List[Dict[str, str]] = [{
    "role": "user",
    "content": "What is 1+1?",
}]

response = litellm.completion(model="gpt-3.5-turbo", messages=messages)
print(response["choices"][0]["message"]["content"])
If you go to http://localhost:3000, you should now see this query logged as a “trace” in the Langfuse Dashboard:
And clicking on this trace should reveal a lot of details about it:
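LiteLLM can also enrich these traces. As a sketch based on LiteLLM's Langfuse integration (the exact metadata keys may vary by version, so check the LiteLLM docs), you can pass a metadata dict to completion() to name generations and tag traces:

import litellm

litellm.success_callback = ["langfuse"]

# Optional metadata picked up by LiteLLM's Langfuse callback
# (field names per the LiteLLM docs at the time of writing).
response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What is 2+2?"}],
    metadata={
        "generation_name": "math-question",  # name shown on the trace
        "trace_user_id": "user-123",         # attribute the trace to an end user
        "tags": ["tutorial"],                # filterable in the dashboard
    },
)

This makes the dashboard much easier to filter once you have more than a handful of traces.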
Takeaways
I only scratched the surface of what these two libraries can do, but even just querying GPT-3.5 and logging the calls is extremely useful.
Strengths
- Easy setup: Getting Langfuse running locally was super simple.
- Pretty UI: The Langfuse dashboard is super clean. It makes it very easy to track everything you’d want to know about your LLM queries.
- In-depth tracking: Langfuse logs pretty much everything about your LLM queries, so I haven’t found any need yet for additional logging.
Weaknesses
- None yet