If you're writing Python around LLM APIs like OpenAI, Anthropic or Hugging Face, let me introduce you to LiteLLM, which can save you a lot of time.

It's a Python library that lets you interact with a whole range of LLM APIs through the OpenAI format. It provides a simple, standardized interface for calling these models, making it easier for you to use them for things like text generation, translation or even chat…

Installation couldn’t be easier:

pip install litellm

Then all you have to do is call LiteLLM's completion function and pass it the name of the model to use along with your messages. For example, to connect to OpenAI, the code is as follows:

from litellm import completion
import os

## set ENV variables
os.environ["OPENAI_API_KEY"] = "your-api-key"

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"content": "Hello, how are you?", "role": "user"}]
)
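
Since LiteLLM returns responses in the OpenAI format, reading the generated text looks the same whichever provider you call:

print(response.choices[0].message.content)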

For Claude 2 it will be:

from litellm import completion
import os

## set ENV variables
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

response = completion(
    model="claude-2",
    messages=[{"content": "Hello, how are you?", "role": "user"}]
)

With Ollama, it would look like this:

from litellm import completion

response = completion(
    model="ollama/llama2",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
    api_base="http://localhost:11434"
)

So not much changes.

In other words, LiteLLM lets you write your code once and talk to all the current AI vendors (as well as the existing open-source models). The very same call works everywhere, as the sketch below shows.
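
Here is a minimal sketch (the model names are only examples, and it assumes the matching API keys are set in your environment and that a local Ollama server is running):

from litellm import completion

# the exact same call for three different providers
for model in ["gpt-3.5-turbo", "claude-2", "ollama/llama2"]:
    response = completion(
        model=model,
        messages=[{"content": "Hello, how are you?", "role": "user"}],
    )
    print(model, "->", response.choices[0].message.content)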

LiteLLM also supports streaming responses (that is, text that is displayed little by little), exception handling and logging, not forgetting cost calculation and usage tracking for these APIs, so they don't wreck your bank account.
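
For example, here is a minimal sketch of streaming and cost tracking (assuming OPENAI_API_KEY is set; completion_cost is LiteLLM's pricing helper):

from litellm import completion, completion_cost

# stream=True yields the response in chunks as the model generates it
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")

# cost calculation works on a complete (non-streamed) response
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"content": "Hello, how are you?", "role": "user"}],
)
print(completion_cost(completion_response=response))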

LiteLLM also ships an OpenAI-compatible proxy that routes your requests to the model of your choice. To install it:

pip install 'litellm[proxy]'

Then start the proxy with the model of your choice:

litellm --model huggingface/bigcode/starcoder

Then send your requests to it directly from your Python code:

import openai  # openai v1.0.0+

# point the client at the proxy instead of api.openai.com
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")

# the request is sent to the model set on the litellm proxy, `litellm --model`
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ]
)

print(response)
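
And since the proxy speaks the OpenAI API, any OpenAI-compatible client or tool can point at it; swapping the backend model is just a matter of restarting the proxy with a different --model.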

If LiteLLM interests you, you can find all the information on the GitHub page, as well as the list of supported endpoints in the documentation.

