
Show HN: Torrix, self-hosted LLM observability (no Postgres, no Redis)

Torrix is a self-hosted LLM observability tool that tracks tokens, cost, latency, full prompt traces, and reasoning tokens, and masks PII. It works with many LLM providers and deploys with Docker, with no Postgres or Redis required. It offers SDKs for Python, Node.js, Go, C#, and Java, plus a LangChain callback and an HTTP proxy.

Article intelligence


Key points

  • Self-hosted LLM observability without Postgres or Redis.
  • Tracks tokens, cost, latency, prompt traces, and reasoning tokens, with PII masking.
  • Supports many LLM providers and offers SDKs for multiple languages plus HTTP proxy.

Why it matters

This matters because it allows self-hosted LLM observability without running Postgres or Redis.

Technical impact

May affect model selection, inference cost, product capability, and evaluation benchmarks.


Track every LLM request: tokens, cost, latency, full prompt traces, reasoning token capture, and PII masking. Works with OpenAI, Anthropic, Google Gemini, Groq, Mistral, Azure OpenAI, DeepSeek, Perplexity, Fireworks, Together AI, Cohere, HuggingFace, Replicate, Ollama, and any HTTP endpoint. Self-hosted, no data leaves your machine.

Getting Started

The only requirement is Docker Desktop.

Mac

Open Terminal and run:

curl -o docker-compose.yml https://raw.githubusercontent.com/torrix-ai/install/main/docker-compose.community.yml
docker compose up

This downloads the community edition config and saves it as docker-compose.yml so Docker picks it up automatically.

Windows

Open PowerShell and run:

curl.exe -o docker-compose.yml https://raw.githubusercontent.com/torrix-ai/install/main/docker-compose.community.yml
docker compose up

This downloads the community edition config and saves it as docker-compose.yml so Docker picks it up automatically.

Or download the file manually:

  1. Go to github.com/torrix-ai/install
  2. Click docker-compose.community.yml, then click Raw
  3. Save the file as docker-compose.yml
  4. Open a terminal in that folder and run docker compose up

After startup

  1. Open http://localhost:8088
  2. Create your account
  3. Copy your API key from Settings
  4. Start sending LLM calls through the proxy or SDK

Verify your setup

Check the server is running (no API key needed):

curl http://localhost:8088/health

Expected response:

{"ok":true,"name":"Torrix","version":"2.0.0"}
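The same check can be scripted, for example as a smoke test after `docker compose up`. The following is a minimal sketch that relies only on the `/health` URL and the response shape shown above. The function names are hypothetical, and treating `"ok": true` as the only health signal is an assumption, not documented Torrix behavior.

```python
import json
import urllib.request

HEALTH_URL = "http://localhost:8088/health"  # endpoint taken from the section above


def parse_health(body: str) -> bool:
    """Return True when the body is a JSON object with "ok": true."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("ok") is True


def check_health(url: str = HEALTH_URL, timeout: float = 5.0) -> bool:
    """Fetch the health endpoint and report whether the server looks healthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_health(resp.read().decode("utf-8"))
    except OSError:  # covers refused connections, timeouts, and HTTP errors
        return False
```

Calling `check_health()` once the container is up should return `True` if the server answers as described. Any connection failure is reported as `False` rather than raised.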

Check runs are being logged (requires your API key from Settings):

Mac / Linux:

curl http://localhost:8088/api/runs -H "Authorization: Bearer <TORRIX_API_KEY>"

Windows (PowerShell):

Invoke-WebRequest http://localhost:8088/api/runs -Headers @{Authorization="Bearer <TORRIX_API_KEY>"} | Select-Object -ExpandProperty Content

Returns a list of all logged runs. An empty array [] means the server is working but no runs have been sent yet.
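For scripted checks, the request and the "empty array" interpretation can be expressed in a few lines of Python. This is a sketch under stated assumptions: it uses only the `/api/runs` URL and Bearer header shown above, the helper names are hypothetical, and nothing is assumed about the fields inside each run object.

```python
import json
import urllib.request

RUNS_URL = "http://localhost:8088/api/runs"  # endpoint taken from the section above


def build_runs_request(api_key: str, url: str = RUNS_URL) -> urllib.request.Request:
    """Build a GET request carrying the Torrix API key as a Bearer token."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def describe_runs(body: str) -> str:
    """Summarize the JSON array returned by the runs endpoint."""
    runs = json.loads(body)
    if not isinstance(runs, list):
        raise ValueError("expected a JSON array of runs")
    if not runs:
        return "server is working, no runs logged yet"
    return f"{len(runs)} run(s) logged"


def fetch_runs_summary(api_key: str) -> str:
    """Call the endpoint and summarize the result (needs a running server)."""
    with urllib.request.urlopen(build_runs_request(api_key), timeout=5) as resp:
        return describe_runs(resp.read().decode("utf-8"))
```

`fetch_runs_summary("<TORRIX_API_KEY>")` performs the network call. The other two helpers are pure and can be tested offline.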

Send a test run

Send a real request through the Torrix proxy to confirm runs appear in the dashboard. Even if the OpenAI key is invalid, Torrix will still log the attempt.

Mac / Linux:

curl -X POST http://localhost:8088/proxy \
  -H "Authorization: Bearer <TORRIX_API_KEY>" \
  -H "x-target-url: https://api.openai.com/v1/chat/completions" \
  -H "x-upstream-authorization: Bearer <OPENAI_API_KEY>" \
  -H "x-torrix-name: test-run" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}'

Windows (PowerShell):

Invoke-WebRequest -Method Post http://localhost:8088/proxy `
  -Headers @{
    "Authorization"="Bearer <TORRIX_API_KEY>"
    "x-target-url"="https://api.openai.com/v1/chat/completions"
    "x-upstream-authorization"="Bearer <OPENAI_API_KEY>"
    "x-torrix-name"="test-run"
  } `
  -ContentType "application/json" `
  -Body '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}' |
  Select-Object -ExpandProperty Content

Then open http://localhost:8088. The run should appear in your dashboard.

Sending data to Torrix

Option 1: Python SDK

pip install torrix

OpenAI:

import torrix
from openai import OpenAI

# Replace the placeholders with your Torrix and OpenAI keys.
torrix.init(api_key="<TORRIX_API_KEY>", base_url="http://localhost:8088")
client = torrix.wrap(OpenAI(api_key="<OPENAI_API_KEY>"))

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    torrix_name="my-run",  # label shown for this run in Torrix
)
print(response.choices[0].message.content)

Anthropic:

import torrix
from anthropic import Anthropic

# Assumes torrix.init(...) has already been called as in the OpenAI example.
client = torrix.wrap(Anthropic(api_key="<ANTHROPIC_API_KEY>"))

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
    torrix_name="my-run",
)
print(response.content[0].text)

Streaming:

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)

Option 2: Node.js SDK

npm install torrix openai

or: npm install torrix @anthropic-ai/sdk

OpenAI:

import * as torrix from 'torrix'
import OpenAI from 'openai'

// Replace the placeholders with your Torrix and OpenAI keys.
torrix.init('<TORRIX_API_KEY>', 'http://localhost:8088')
const client = torrix.wrap(new OpenAI({ apiKey: '<OPENAI_API_KEY>' }))

const response = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hello!' }],
  torrix_name: 'my-run',
})
console.log(response.choices[0].message.content)

Anthropic:

import Anthropic from '@anthropic-ai/sdk'

// Assumes torrix has been imported and initialized as in the OpenAI example.
const client = torrix.wrap(new Anthropic({ apiKey: '<ANTHROPIC_API_KEY>' }))

const response = await client.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello!' }],
  torrix_name: 'my-run',
})
console.log(response.content[0].text)

Option 3: Go SDK

go get torrix.ai/sdk/go

package main

import (
	"context"
	"os"

	openai "github.com/sashabaranov/go-openai"
	torrix "torrix.ai/sdk/go"
)

// ptr returns a pointer to v, for the optional fields of IngestPayload.
func ptr[T any](v T) *T { return &v }

func main() {
	torrix.Init(
		os.Getenv("TORRIX_API_KEY"),
		torrix.WithBaseURL("http://localhost:8088"),
	)

	client := openai.NewClient(os.Getenv("OPENAI_API_KEY"))
	userMsg := "What is the capital of France?"

	// Time the LLM call so the latency can be reported to Torrix.
	var resp openai.ChatCompletionResponse
	latency, err := torrix.Measure(func() error {
		var e error
		resp, e = client.CreateChatCompletion(context.Background(), openai.ChatCompletionRequest{
			Model:    openai.GPT4oMini,
			Messages: []openai.ChatCompletionMessage{{Role: openai.ChatMessageRoleUser, Content: userMsg}},
		})
		return e
	})
	if err != nil {
		panic(err)
	}

	reply := resp.Choices[0].Message.Content

	// Send the run to Torrix.
	torrix.Ingest(torrix.IngestPayload{
		Model:        &resp.Model,
		InputTokens:  ptr(int(resp.Usage.PromptTokens)),
		OutputTokens: ptr(int(resp.Usage.CompletionTokens)),
		LatencyMs:    ptr(latency.Milliseconds()),
		Status:       ptr(200),
		Prompt:       &userMsg,
		Response:     &reply,
	})
}

See docs/go-sdk.md for the full reference.

Option 4: C# / .NET SDK

dotnet add package Torrix

using OpenAI.Chat; // ChatClient comes from the official OpenAI .NET package
using TorrixAI;

// Replace the placeholders with your Torrix and OpenAI keys.
Torrix.Init("<TORRIX_API_KEY>", new TorrixOptions { BaseUrl = "http://localhost:8088" });

var chatClient = new ChatClient("gpt-4o-mini", "<OPENAI_API_KEY>");
var userMessage = "What is the capital of France?";

// Time the LLM call so the latency can be reported to Torrix.
var (response, latencyMs) = await Torrix.MeasureAsync(async () =>
    await chatClient.CompleteChatAsync(userMessage));

Torrix.Ingest(new IngestPayload
{
    Model = "gpt-4o-mini",
    Provider = "openai",
    LatencyMs = latencyMs,
    InputTokens = response.Value.Usage.InputTokenCount,
    OutputTokens = response.Value.Usage.OutputTokenCount,
    Prompt = userMessage,
    Response = response.Value.Content[0].Text,
});

Targets .NET 6 and above. Zero external dependencies. Works with OpenAI, Azure OpenAI, and SAP AI Core.

See docs/csharp-sdk.md for Azure OpenAI, SAP AI Core examples, and the full API reference.

Option 5: Java SDK

Maven:

<dependency>
  <groupId>ai.torrix</groupId>
  <artifactId>torrix</artifactId>
  <version>0.2.0</version>
</dependency>

Gradle:

implementation 'ai.torrix:torrix:0.2.0'

import ai.torrix.*;

Torrix.init(System.getenv("TORRIX_API_KEY"), "http://localhost:8088");

long start = System.currentTimeMillis();
// ... your LLM call ...
long latencyMs = System.currentTimeMillis() - start;

// `usage` is the token-usage object returned by your LLM client.
Torrix.ingest(IngestPayload.builder()
    .model("gpt-4o-mini")
    .provider("openai")
    .latencyMs(latencyMs)
    .inputTokens(usage.getPromptTokens())
    .outputTokens(usage.getCompletionTokens())
    .build());

Java 11+. Zero external dependencies. Works with Spring AI, LangChain4j, OpenAI Java SDK, and any Java HTTP client.

See docs/java-sdk.md for the full reference.

Option 6: LangChain callback

Use TorrixCallbackHandler to trace every LLM call made through a LangChain LLM or ChatModel.

pip install torrix langchain-core langchain-openai

import torrix
from torrix.wrappers.langchain_callback import TorrixCallbackHandler
from langchain_openai import ChatOpenAI

torrix.init(api_key="<TORRIX_API_KEY>", base_url="http://localhost:8088")
handler = TorrixCallbackHandler()

llm = ChatOpenAI(model="gpt-4o-mini", callbacks=[handler])
response = llm.invoke("What is the capital of France?")

Every invocation is logged to Torrix with model, token counts, latency, prompt, and response. Works with any LangChain LLM or ChatModel.

Option 7: HTTP Proxy (any language or tool)

Route any HTTP request through Torrix. Works with Google Gemini, Azure OpenAI, Groq, Mistral, DeepSeek, Perplexity, Fireworks, Together AI, Cohere, HuggingFace, Replicate, SAP AI Core, GitHub Copilot, n8n, Make, curl, and any OpenAI-compatible API.

curl -X POST http://localhost:8088/proxy \
  -H "Authorization: Bearer <TORRIX_API_KEY>" \
  -H "x-target-url: https://api.openai.com/v1/chat/completions" \
  -H "x-upstream-authorization: Bearer <OPENAI_API_KEY>" \
  -H "x-torrix-name: my-run" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}'

| Header | Description |
| --- | --- |
| Authorization | Your Torrix API key (from Settings) |
| x-target-url | The real LLM endpoint to forward to |
| x-upstream-authorization | Your LLM provider API key (omit if using ?key= in URL) |
| x-torrix-name | Optional label for this run |
| x-torrix-provider | Optional provider hint: openai, anthropic, google |
| x-torrix-trace | Optional trace ID to group multiple calls into one agent run |
| x-torrix-session | Optional session ID to group a multi-turn conversation |
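When several proxied calls belong to one agent run, it helps to build the headers in one place so the required ones are always present and the optional ones are only sent when set. The sketch below is illustrative: `build_proxy_headers` and `new_trace_id` are hypothetical helper names, and using a UUID as the trace ID is an assumption, since the header list above does not specify an ID format.

```python
import uuid
from typing import Dict, Optional


def build_proxy_headers(
    torrix_key: str,
    target_url: str,
    upstream_key: Optional[str] = None,
    name: Optional[str] = None,
    provider: Optional[str] = None,
    trace: Optional[str] = None,
    session: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble the proxy headers listed above, omitting unset optional ones."""
    headers = {
        "Authorization": f"Bearer {torrix_key}",
        "x-target-url": target_url,
        "Content-Type": "application/json",
    }
    optional = {
        # Omitted for providers that take the key in the URL (?key=...).
        "x-upstream-authorization": f"Bearer {upstream_key}" if upstream_key else None,
        "x-torrix-name": name,
        "x-torrix-provider": provider,
        "x-torrix-trace": trace,
        "x-torrix-session": session,
    }
    headers.update({key: value for key, value in optional.items() if value})
    return headers


def new_trace_id() -> str:
    """Generate an ID to share across calls (UUID format is an assumption)."""
    return uuid.uuid4().hex
```

Passing the same `trace` value to every call in an agent run is how this sketch would group them. The resulting dictionary can be handed to any HTTP client that accepts a headers mapping.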

Google Gemini (uses ?key= instead of Bearer token):

import requests

response = requests.post(
    "http://localhost:8088/proxy",
    headers={
        "Authorization": "Bearer <TORRIX_API_KEY>",
        # The Gemini key goes in the URL, so x-upstream-authorization is omitted.
        "x-target-url": "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key=<GEMINI_API_KEY>",
        "x-torrix-provider": "google",
        "x-torrix-name": "gemini-test",
    },
    json={"contents": [{"parts": [{"text": "Hello!"}]}]},
)

Azure OpenAI:

curl -X POST http://localhost:8088/proxy \
  -H "Authorization: Bearer <TORRIX_API_KEY>" \
  -H "x-target-url: https://<your-resource>.openai.azure.com/openai/deployments/<your-deployment>/chat/completions?api-version=2024-02-01" \
  -H "x-upstream-authorization: Bearer <AZURE_OPENAI_API_KEY>" \
  -H "x-torrix-name: azure-test" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello"}]}'

Groq:

curl -X POST http://localhost:8088/proxy \
  -H "Authorization: Bearer <TORRIX_API_KEY>" \
  -H "x-target-url: https://api.groq.com/openai/v1/chat/completions" \
  -H "x-upstream-authorization: Bearer <GROQ_API_KEY>" \
  -H "x-torrix-name: groq-test" \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3-8b-8192","messages":[{"role":"user","content":"Hello"}]}'

Mistral:

curl -X POST http://localhost:8088/proxy \
  -H "Authorization: Bearer <TORRIX_API_KEY>" \
  -H "x-target-url: https://api.mistral.ai/v1/chat/completions" \
  -H "x-upstream-authorization: Bearer <MISTRAL_API_KEY>" \
  -H "x-torrix-name: mistral-test" \
  -H "Content-Type: application/json" \
  -d '{"model":"mistral-small-latest","messages":[{"role":"user","content":"Hello"}]}'

DeepSeek:

curl -X POST http://localhost:8088/proxy \
  -H "Authorization: Bearer <TORRIX_API_KEY>" \
  -H "x-target-url: https://api.deepseek.com/chat/completions" \
  -H "x-upstream-authorization: Bearer <DEEPSEEK_API_KEY>" \
  -H "x-torrix-name: deepseek-test" \
  -H "Content-Type: application/json" \
  -d '{"model":"deepseek-chat","messages":[{"role":"user","content":"Hello!"}]}'

Ollama (local models):

curl -X POST http://localhost:8088/proxy \
  -H "Authorization: Bearer <TORRIX_API_KEY>" \
  -H "x-target-url: http://host.docker.internal:11434/v1/chat/completions" \
  -H "x-torrix-name: ollama-test" \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3.2","messages":[{"role":"user","content":"Hello!"}]}'

No API key needed for Ollama. Omit x-upstream-authorization. Use host.docker.internal instead of localhost when running Torrix in Docker on Mac or Windows. On Linux, use your machine's actual IP address (e.g. 172.17.0.1) instead.
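Scripts that run on more than one operating system can pick the Ollama host automatically. The following is a small sketch of the rule described above, assuming the script runs on the same machine as the Torrix container. The function name is hypothetical, and `172.17.0.1` is only the example address from the text, so pass your own IP on Linux if it differs.

```python
import platform
from typing import Optional


def ollama_target_url(
    system: Optional[str] = None,
    linux_host_ip: str = "172.17.0.1",  # example address from the text above
    port: int = 11434,
) -> str:
    """Return the x-target-url value for a local Ollama server."""
    system = system or platform.system()  # "Linux", "Darwin", or "Windows"
    # Docker on Mac and Windows resolves host.docker.internal to the host;
    # on Linux the text above says to use the machine's IP address instead.
    host = linux_host_ip if system == "Linux" else "host.docker.internal"
    return f"http://{host}:{port}/v1/chat/completions"
```

The returned string is meant to be used as the `x-target-url` header value in the proxy request.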

n8n workflow: Use the HTTP Request node pointed at http://host.docker.internal:8088/proxy with these headers:

| Header | Value |
| --- | --- |
| Authorization | Bearer <TORRIX_API_KEY> |
| x-target-url | https://api.openai.com/v1/chat/completions |
| x-upstream-authorization | Bearer <OPENAI_API_KEY> |
| Content-Type | application/json |

n8n Community Node

Install the officia

[truncated for AI cost control]