2026-04-29 · Original article

DeepInfra on Hugging Face Inference Providers 🔥

DeepInfra joins Hugging Face Hub as an Inference Provider, offering cost-effective serverless inference on over 100 models, starting with conversational and text-generation tasks, accessible via UI and SDKs.

Article intelligence

Engineers · Advanced

Key points

DeepInfra is now an Inference Provider on Hugging Face Hub, providing serverless inference for 100+ models.
Initial support covers conversational and text-generation models such as DeepSeek V4, Kimi-K2.6, and GLM-5.1; more tasks (text-to-image, text-to-video, embeddings) are coming soon.
Users can use custom keys or route through Hugging Face; billing is direct or via HF with no markup.
PRO users get $2 in inference credits each month; signed-in free users get a small free quota.

Why it matters

DeepInfra's catalog of 100+ models is now available for serverless inference directly from the Hugging Face Hub, through the model pages and the client SDKs.

Technical impact

May affect model selection, inference cost, product capability, and evaluation benchmarks.

Published April 29, 2026

Authors:

Aray Sultanbekova (araikin, guest)
Shang-Pin (shang-pin-deepinfra, guest)
Utemuratov (Pernekhan, guest)
Yessen K (yessenzhar, guest)
Oguz Vuruskaner (ovuruska, guest)
Célina Hanouti (celinah)
Simon Brandeis (sbrandeis)
Lucain Pouget (Wauplin)

How it works

In the website UI

From the client SDKs

Billing

Feedback and next steps

We're thrilled to share that DeepInfra is now a supported Inference Provider on the Hugging Face Hub!

DeepInfra joins our growing ecosystem, enhancing the breadth and capabilities of serverless inference directly on the Hub's model pages. Inference Providers are also seamlessly integrated into our client SDKs (for both JS and Python), making it super easy to use a wide variety of models with your preferred providers.

DeepInfra is a serverless AI inference platform offering some of the most cost-effective per-token pricing in the industry. With a catalog of over 100 models, DeepInfra makes it easy for developers to integrate a wide range of AI capabilities into their applications with minimal setup.

DeepInfra supports a broad spectrum of model types - from LLMs to text-to-image, text-to-video, embeddings, and more. As part of this initial integration, DeepInfra is launching support for conversational and text-generation tasks on Hugging Face, enabling access to popular open-weight LLMs such as DeepSeek V4, Kimi-K2.6, GLM-5.1, and many more. Support for additional tasks (text-to-image, text-to-video, embeddings, and more) will roll out soon!

Read more about how to use DeepInfra as an Inference Provider in its dedicated documentation page.

See the full list of models supported by DeepInfra here.

Follow DeepInfra on Hugging Face: https://huggingface.co/DeepInfra.

How it works

In the website UI

In your user account settings, you are able to:

Set your own API keys for the providers you've signed up with. If no custom key is set, your requests will be routed through HF.

Order providers by preference. This applies to the widget and code snippets in the model pages.

As mentioned, there are two modes when calling Inference Providers:

Custom key (calls go directly to the inference provider, using your own API key of the corresponding inference provider)

Routed by HF (in that case, you don't need a token from the provider, and the charges are applied directly to your HF account rather than the provider's account)

Model pages showcase third-party inference providers (the ones that are compatible with the current model, sorted by user preference)
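As a rough illustration of the two modes above, the sketch below picks which credential to send and who bills for the call. The `CallMode` type and the `choose_call_mode` helper are hypothetical names invented for this example; they are not part of any Hugging Face SDK, and the real routing decision is made by the Hub and the SDKs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CallMode:
    """Hypothetical summary of how one request is authenticated and billed."""
    mode: str       # "custom_key" or "routed_by_hf"
    api_key: str    # credential that would be sent with the request
    billed_by: str  # account that is charged for the call


def choose_call_mode(
    hf_token: Optional[str],
    provider_key: Optional[str] = None,
    provider: str = "deepinfra",
) -> CallMode:
    """Mirror the rule from the text: a custom provider key wins, otherwise route via HF."""
    if provider_key:
        # Custom key: the call goes directly to the provider, which bills you.
        return CallMode("custom_key", provider_key, provider)
    if hf_token:
        # No custom key set: the request is routed through HF and charged to the HF account.
        return CallMode("routed_by_hf", hf_token, "huggingface")
    raise ValueError("Need either a provider API key or a Hugging Face token")
```

Calling `choose_call_mode("hf_example_token")` yields the routed mode, while passing a `provider_key` as well switches to the custom-key mode. Both token values here are placeholders.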

From the client SDKs

DeepInfra is available through the Hugging Face SDKs - huggingface_hub (>= 1.11.2) for Python and @huggingface/inference for JavaScript.

The following examples show how to use DeepSeek V4 Pro through DeepInfra. Use a Hugging Face token to authenticate - the request will be routed to DeepInfra automatically.

From your favorite Agent Harness

Hugging Face Inference Providers are integrated in most Agent Harnesses - including Pi, OpenCode, Hermes Agents, OpenClaw, and more. This means you can plug DeepInfra-hosted models straight into your favorite tools without any extra glue code. Browse the full list of integrations here.

from Python

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V4-Pro:deepinfra",
    messages=[
        {
            "role": "user",
            "content": "Write a Python function that returns the nth Fibonacci number using memoization."
        }
    ],
)

print(completion.choices[0].message)
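The snippet above goes through the OpenAI client, while the text names `huggingface_hub` as the Python SDK. A minimal sketch of the same request with `huggingface_hub`'s `InferenceClient` might look like the following. It assumes the provider identifier is `"deepinfra"` (mirroring the `:deepinfra` suffix above) and that `HF_TOKEN` is set in the environment; `build_messages` and `ask_deepinfra` are hypothetical helper names used only for this example.

```python
import os
from typing import Dict, List


def build_messages(prompt: str) -> List[Dict[str, str]]:
    """Wrap a single user prompt in the chat-completion message format."""
    return [{"role": "user", "content": prompt}]


def ask_deepinfra(prompt: str, model: str = "deepseek-ai/DeepSeek-V4-Pro") -> str:
    """Send one chat request to a DeepInfra-hosted model, routed by Hugging Face."""
    # Imported lazily so build_messages() works even without the package installed.
    from huggingface_hub import InferenceClient

    client = InferenceClient(
        provider="deepinfra",            # assumption: provider name matches the ":deepinfra" suffix
        api_key=os.environ["HF_TOKEN"],  # Hugging Face token, so the request is routed by HF
    )
    completion = client.chat.completions.create(
        model=model,
        messages=build_messages(prompt),
    )
    return completion.choices[0].message.content
```

With a valid token and `huggingface_hub` (>= 1.11.2) installed, `ask_deepinfra("Write a Python function that returns the nth Fibonacci number using memoization.")` should return the model's reply as a string.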

from JS

import { OpenAI } from "openai";

const client = new OpenAI({
  baseURL: "https://router.huggingface.co/v1",
  apiKey: process.env.HF_TOKEN,
});

const chatCompletion = await client.chat.completions.create({
  model: "deepseek-ai/DeepSeek-V4-Pro:deepinfra",
  messages: [
    {
      role: "user",
      content: "Write a Python function that returns the nth Fibonacci number using memoization.",
    },
  ],
});

console.log(chatCompletion.choices[0].message);
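Both snippets select the provider by appending `:deepinfra` to the Hub model ID. The Python sketch below illustrates that naming convention. `split_model_id` and `with_provider` are hypothetical helpers written for this example, and the sketch assumes only what the examples show: a single `:` separating the model ID from the provider name.

```python
from typing import Optional, Tuple


def split_model_id(model: str) -> Tuple[str, Optional[str]]:
    """Split "org/name:provider" into ("org/name", "provider").

    Returns None for the provider when no suffix is present.
    """
    repo_id, sep, provider = model.partition(":")
    return repo_id, (provider if sep and provider else None)


def with_provider(repo_id: str, provider: str) -> str:
    """Build the model string used in the examples, e.g. "org/name:deepinfra"."""
    return f"{repo_id}:{provider}"
```

For instance, `with_provider("deepseek-ai/DeepSeek-V4-Pro", "deepinfra")` rebuilds the exact model string passed to `chat.completions.create` in the examples.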

Billing

For direct requests, i.e. when you use the key from an inference provider, you are billed by the corresponding provider. For instance, if you use a DeepInfra API key you're billed on your DeepInfra account.

For routed requests, i.e. when you authenticate via the Hugging Face Hub, you'll only pay the standard provider API rates. There's no additional markup from us; we just pass through the provider costs directly. (In the future, we may establish revenue-sharing agreements with our provider partners.)

Important Note ‼️ PRO users get $2 worth of Inference credits every month. You can use them across providers. 🔥

Subscribe to the Hugging Face PRO plan to get access to Inference credits, ZeroGPU, Spaces Dev Mode, 20x higher limits, and more.

We also provide free inference with a small quota for our signed-in free users, but please upgrade to PRO if you can!
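To make the pass-through billing concrete, here is a small arithmetic sketch of what a routed request would cost and how it draws down the $2 monthly PRO credits. The per-token prices in the usage note are made-up placeholders, not DeepInfra's actual rates, and `request_cost_usd` and `credits_left` are hypothetical helper names.

```python
from decimal import Decimal

# From the article: PRO users get $2 of Inference credits every month.
PRO_MONTHLY_CREDITS_USD = Decimal("2.00")


def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: Decimal,
    output_price_per_million: Decimal,
) -> Decimal:
    """Cost of one request at the provider's rates, with no extra markup added."""
    total = (
        Decimal(input_tokens) * input_price_per_million
        + Decimal(output_tokens) * output_price_per_million
    )
    return total / Decimal(1_000_000)


def credits_left(spent_usd: Decimal, monthly: Decimal = PRO_MONTHLY_CREDITS_USD) -> Decimal:
    """Remaining monthly credits, never below zero."""
    return max(monthly - spent_usd, Decimal("0"))
```

With placeholder prices of $0.50 per million input tokens and $1.50 per million output tokens, a request with 2,000 input and 1,000 output tokens costs `request_cost_usd(2000, 1000, Decimal("0.50"), Decimal("1.50"))`, which is $0.0025, leaving $1.9975 of the monthly credits.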

Feedback and next steps

We would love to get your feedback! Share your thoughts and/or comments here: https://huggingface.co/spaces/huggingface/HuggingDiscussions/discussions/49

Models mentioned in this article (3):

deepseek-ai/DeepSeek-V4-Pro (Text Generation, 862B)
moonshotai/Kimi-K2.6 (Image-Text-to-Text, 1.1T)
zai-org/GLM-5.1 (Text Generation, 754B)