
Your GPUs Can Do More Than Transcode Video — Here's How to Put Them to Work on Livepeer


A guide for Livepeer Orchestrators ready to serve AI inference and unlock a new revenue stream.


If you're running a Livepeer orchestrator, you already have the infrastructure most AI companies would kill for: GPUs connected to a decentralized network, Docker expertise, and skin in the game. What if those same GPUs could serve LLM chat completions, text embeddings, and image generation — with demand already waiting?

That's exactly what BlueClaw Network makes possible.

What Is BlueClaw?

BlueClaw is an OpenAI-compatible AI inference gateway built on top of the Livepeer GPU network. It provides:

  • Chat completions (/v1/chat/completions)
  • Text embeddings (/v1/embeddings)
  • Image generation (/v1/images/generations)

All accessible at https://openai.blueclaw.network/v1 — the same API shape developers already use with OpenAI, so any application using the OpenAI SDK can switch to BlueClaw by changing a single line: the base_url.
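To make the "one line" claim concrete, here is a minimal sketch of an OpenAI-shaped chat completion request pointed at BlueClaw. It only constructs the request (nothing is sent), using the standard library rather than the OpenAI SDK; the model name comes from the supported list below, and the API key handling is an assumption for illustration.

```python
import json
import urllib.request

# Switching from OpenAI to BlueClaw is just a matter of the base URL.
BASE_URL = "https://openai.blueclaw.network/v1"  # was: https://api.openai.com/v1

def chat_completion_request(model, messages, api_key="sk-..."):
    """Build an OpenAI-shaped chat completion request (construction only, not sent)."""
    payload = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = chat_completion_request("qwen3:8b", [{"role": "user", "content": "Hello"}])
print(req.full_url)  # https://openai.blueclaw.network/v1/chat/completions
```

An application already using the OpenAI SDK would make the same switch by passing this base URL to its client constructor.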

BlueClaw doesn't run its own GPUs. Your GPUs power it. The gateway discovers orchestrators through the on-chain AI Service Registry, routes inference requests to them, and your infrastructure does the work. You set your own pricing. You keep your earnings.

Why Should Orchestrators Care?

New Workloads, Same Hardware

If you're running RTX 3090s, 4090s, or better — you already meet the requirements. BlueClaw's BYOC (Bring Your Own Compute) framework lets you deploy lightweight runner containers alongside your existing orchestrator setup. No new hardware needed.

Real Demand from Day One

BlueClaw isn't speculative. It's designed for autonomous AI agent builders who need unlimited, always-on inference without rate limits or per-token billing. These workloads are persistent and growing — agents don't sleep, and they don't stop sending requests at 5 PM.

The AI Inference Market Is Massive

Video transcoding put Livepeer on the map. AI inference is where it scales. LLM inference, embeddings for RAG pipelines, image generation — these are the workloads every company on the planet is trying to provision right now. BlueClaw gives you a seat at that table, powered by infrastructure you already run.

Expanding Capabilities on the Horizon

Beyond the three core capabilities live today, Cloud SPE is actively building reranking (Cohere-compatible /v1/rerank) and video generation runners. Orchestrators who onboard now will be first in line when these capabilities go live on BlueClaw.

What You'll Need

Here's the honest picture of what's required:

| Requirement | Details |
| --- | --- |
| GPU | NVIDIA RTX 3090+ (24 GB VRAM minimum for chat + embeddings) |
| OS | Linux only — Windows is not supported, macOS is unverified |
| Software | Docker, NVIDIA Container Toolkit, latest NVIDIA drivers |
| Orchestrator | Existing Livepeer AI Orchestrator with stake on Arbitrum One |
| Registry | Registered on the AI Service Registry |
| Networking | A domain you control + Cloudflare Tunnel (free tier) for valid HTTPS |
| ETH | Small amount on Arbitrum One for gas fees |

GPU requirements by capability:

| Capability | Minimum GPU | What You'll Serve |
| --- | --- | --- |
| Chat (small models) | RTX 3090 | qwen3:8b, gemma-3-4b-it |
| Chat (medium/large) | RTX 4090+ / A100 | Qwen2.5-14B-AWQ, Llama-3.3-70B-FP8 |
| Text Embeddings | RTX 3090 | nomic-embed-text, SFR-Embedding-2_R |
| Image Generation | RTX 4090+ | RealVisXL V4.0, FLUX.1-dev |

A 3090 operator can serve chat completions and embeddings on day one. Image generation requires a 4090 or better, on a dedicated GPU.

How It Works — The Architecture

The flow is clean and modular:

BlueClaw Gateway (discovers you on-chain)
        │
        ▼
  Your AI Orchestrator (go-livepeer)
        │
        ▼  (via Cloudflare Tunnel)
  BYOC Runners (chat / embeddings / image gen)
        │
        ▼
  Inference Backend (Ollama or vLLM)
        │
        ▼
     Your GPU

You deploy:

  1. Your AI Orchestrator — the go-livepeer node you likely already run
  2. A Cloudflare Tunnel — provides valid HTTPS without managing certificates
  3. An inference backend — Ollama (simpler) or vLLM (higher throughput, larger models)
  4. BYOC runner containers — lightweight proxies that register capabilities with your orchestrator and route requests to your backend

Each component is a Docker container. The runners are open source under Cloud-SPE on GitHub. If you want to build a custom runner for a new workload, the framework only requires an HTTP endpoint and a capability registration sidecar.
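To show how little the runner contract demands, here is a toy sketch of that HTTP endpoint. The real runners are Go services under Cloud-SPE; this Python stand-in is purely illustrative, returns a canned OpenAI-shaped reply instead of calling a backend, and omits the capability registration sidecar.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class RunnerHandler(BaseHTTPRequestHandler):
    """Toy runner: accept a POST, return an OpenAI-shaped body.

    A real BYOC runner would forward the request body to Ollama or vLLM
    and also register its capability with the orchestrator via a sidecar.
    """
    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        reply = {
            "object": "chat.completion",
            "model": body.get("model", "unknown"),
            "choices": [{"message": {"role": "assistant", "content": "stub"}}],
        }
        data = json.dumps(reply).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):  # keep the sketch quiet
        pass

# Demo: spin the toy runner up on an ephemeral port and exercise it once.
server = HTTPServer(("127.0.0.1", 0), RunnerHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

req = urllib.request.Request(
    f"http://127.0.0.1:{port}/v1/chat/completions",
    data=json.dumps({"model": "qwen3:8b", "messages": []}).encode(),
    headers={"Content-Type": "application/json"},
)
resp = json.load(urllib.request.urlopen(req))
print(resp["model"])  # qwen3:8b
server.shutdown()
```

Anything that speaks HTTP like this, packaged as a container, can slot into the same place in the architecture diagram above.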

The Onboarding Guide

Cloud SPE has published a comprehensive BlueClaw GPU Provider Onboarding Guide (v1.3) that walks you through every step:

  • Step 1: Create Docker networks and volumes
  • Step 2: Deploy and configure your AI Orchestrator (including ticket redemption wallet setup and AI Service Registry registration)
  • Step 3: Set up your Cloudflare Tunnel with published application routes for each capability
  • Step 4: Deploy your inference backend — complete Docker Compose files for both Ollama and vLLM, with tested model configurations per GPU (3090, 4090, 5090)
  • Step 5: Deploy BYOC runners for chat completions, text embeddings, and image generation — each with its own compose file and environment variable reference
  • Step 6: Start everything in the correct order and verify runner registration
  • Step 7: Verify end-to-end with BlueClaw's playground

The guide includes Docker Compose files you can use directly, tested vLLM configurations by GPU type, a full environment variable reference, troubleshooting for common issues (TLS errors, capability registration failures, VRAM management), and a quick-reference section for 3090 operators who want the shortest path to serving jobs.

Quick-Start: The 3090 Operator Path

If you have a 3090 and want the fastest route:

  1. Run tztcloud/go-livepeer:latest registered on the AI Service Registry
  2. Set up one Cloudflare Tunnel with one subdomain
  3. Deploy Ollama, pull qwen3:8b and nomic-embed-text:latest
  4. Deploy the chat completions runner and embeddings runner
  5. Sign up at blueclaw.network, test via the playground

That's it. You're serving AI inference on a decentralized network.
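As a rough illustration of what the backend piece of that quick-start looks like, here is a Compose sketch for Ollama. The service name, volume name, and GPU reservation are assumptions for illustration; the v1.3 guide's own Compose files are authoritative.

```yaml
# Illustrative sketch only; use the Compose files from the v1.3 guide.
services:
  ollama:
    image: ollama/ollama
    restart: unless-stopped
    volumes:
      - ollama-models:/root/.ollama   # persists pulled models (qwen3:8b, nomic-embed-text)
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
volumes:
  ollama-models:
```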

What Models Are Supported?

BlueClaw currently supports a growing roster across all three capabilities:

Chat: qwen3:8b, gemma-3-4b-it, Qwen2.5-14B-Instruct-AWQ, Llama-3.3-70B-Instruct-FP8
Embeddings: nomic-embed-text, SFR-Embedding-2_R
Image Generation: RealVisXL V4.0 Lightning, FLUX.1-dev

The model list will expand as more orchestrators come online and new runners are developed.

Open Source, All the Way Down

Every BYOC runner is open source. The full suite lives under github.com/Cloud-SPE:

  • Chat + Embeddings runners — Go
  • Capability registration — Go
  • Rerank runner — Python/FastAPI (experimental)
  • Video generation runner — Python/FastAPI (experimental)
  • Gateway proxy — Go

Want to build a runner for a workload that doesn't exist yet? The framework is designed for exactly that.

Get Started

The full onboarding guide is available now. To get your copy and start the process:

👉 Reach out to @mike_zoop on the Livepeer Discord

Mike will share the complete guide, answer your questions, and help you through any setup issues. You can also find him in the #orchestrating channel.

Whether you're running a single 3090 or a rack of 4090s, there's a path for you. The guide covers it all — from the minimal setup to multi-GPU, multi-backend deployments with vLLM and image generation.


The Bigger Picture

Livepeer started as a video transcoding network. With BlueClaw, it becomes something larger: a decentralized GPU compute layer for AI inference. The same orchestrators who built the network's video infrastructure are now positioned to power the next generation of AI applications.

The demand is here. The tooling is ready. The guide is written.

The only question is whether your GPUs are going to sit idle — or get to work.


BlueClaw Network — Decentralized AI inference on the Livepeer network.
Built by Cloud SPE for the Livepeer community.
Questions? Find @mike_zoop on Livepeer Discord.