Cheap LLM Inference API for Developers

Submit LLM, embedding, document, or container jobs asynchronously. Get results at a fraction of the cost of dedicated GPU instances. No infrastructure to manage, no minimum commitment, pay per compute-second.

§01 · CAPABILITIES

Everything you need to integrate.

CAP · 01

Simple REST API

Clean, well-documented endpoints. Integrates with any language or framework.

CAP · 02

Pay per use

Only pay for compute time. No idle costs, no minimum commitments.

CAP · 03

Batch processing

Submit thousands of jobs at once. Built for data pipelines and bulk work.

CAP · 04

100+ models

Llama, Mistral, Qwen, and more. All major open-source LLMs.

§02 · QUICK START

Get started in minutes.

Install the Python SDK, point it at MicroDC.ai, submit a job. Poll, stream, or hand us a webhook.


from microdc import MicroDC

client = MicroDC(api_key="your-api-key")

# Submit an inference job
job = client.submit_job(
    model="llama-3.1-8b",
    prompt="Explain quantum computing in simple terms",
    max_tokens=500,
)

# Check job status
status = client.get_job(job.id)
print(f"Status: {status.state}")

# Get results when ready
result = client.wait_for_result(job.id)
print(result.output)
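If you prefer to poll rather than block on wait_for_result, a backoff loop along these lines is one option. This is a minimal sketch, not part of the SDK: poll_until_done and the terminal state names ("completed", "failed", "cancelled") are assumptions, since the only status the examples here show is "queued". A scripted status source stands in for client.get_job so the snippet runs without network access.

```python
import time
from typing import Callable

# Assumed terminal state names; check the API reference for the real ones.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def poll_until_done(
    get_state: Callable[[], str],
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_state() with exponential backoff until a terminal state or timeout."""
    delay = initial_delay
    waited = 0.0
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if waited >= timeout:
            raise TimeoutError(f"job still {state!r} after {waited:.0f}s")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)  # double the wait, capped at max_delay


# Scripted stand-in for `lambda: client.get_job(job.id).state`.
states = iter(["queued", "running", "running", "completed"])
final_state = poll_until_done(lambda: next(states), sleep=lambda seconds: None)
print(final_state)
```

With the real client, get_state would be something like `lambda: client.get_job(job.id).state`, using the state attribute shown in the quick start.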

curl https://api.microdc.ai/v1/jobs \
  -H "Authorization: Bearer $MDC_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.1-8b",
    "prompt": "Explain quantum computing in simple terms",
    "max_tokens": 500
  }'

# Returns: { "id": "J-a1b2c3", "status": "queued" }

curl https://api.microdc.ai/v1/jobs/J-a1b2c3 \
  -H "Authorization: Bearer $MDC_KEY"
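For environments without the SDK, the same submit request can be assembled with the Python standard library. The sketch below only builds the request and does not send it; the URL, headers, and body fields are copied from the curl example, while build_job_request is a hypothetical helper name.

```python
import json
import os
import urllib.request

API_URL = "https://api.microdc.ai/v1/jobs"  # endpoint from the curl example


def build_job_request(
    api_key: str, model: str, prompt: str, max_tokens: int
) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_job_request(
    api_key=os.environ.get("MDC_KEY", "your-api-key"),
    model="llama-3.1-8b",
    prompt="Explain quantum computing in simple terms",
    max_tokens=500,
)
print(req.get_method(), req.full_url)

# To actually send it (needs a valid key and network access):
# with urllib.request.urlopen(req, timeout=30) as resp:
#     job = json.loads(resp.read())
#     print(job["id"], job["status"])
```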

§03 · JOB TYPES

Four job types, one API.

Type	What it runs	Routing	Pricing
llm	Chat completions on any supported LLM	By model name	Token-based
embed	Vector embeddings for RAG, search, classification	By model name	Token-based
document	Summarize, extract, analyze uploaded files	By model name	Flat-rate
container	Any Docker image with your script or code	By capability (docker)	GPU-hour or CPU-core-hour
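The Routing column can be read as a simple eligibility rule. The sketch below is only an illustration of that rule as described in the table, not MicroDC's scheduler: the Worker class, its fields, and eligible_workers are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Worker:
    """Hypothetical worker record: the models it serves and capabilities it advertises."""
    name: str
    models: Set[str] = field(default_factory=set)
    capabilities: Set[str] = field(default_factory=set)


# Per the table: llm, embed, and document route by model name;
# container routes by the "docker" capability.
MODEL_ROUTED = {"llm", "embed", "document"}


def eligible_workers(
    job_type: str, workers: List[Worker], model: Optional[str] = None
) -> List[str]:
    """Return the names of workers that could take a job of the given type."""
    if job_type in MODEL_ROUTED:
        return [w.name for w in workers if model in w.models]
    if job_type == "container":
        return [w.name for w in workers if "docker" in w.capabilities]
    raise ValueError(f"unknown job type: {job_type!r}")


pool = [
    Worker("gpu-1", models={"llama-3.1-8b"}),
    Worker("cpu-1", capabilities={"docker"}),
]
print(eligible_workers("llm", pool, model="llama-3.1-8b"))
print(eligible_workers("container", pool))
```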

§04 · OPENAI-COMPATIBLE

Drop-in replacement for the OpenAI client.

Point your existing openai client at MicroDC.ai and keep your code. Multimodal content lists supported. Works with LangChain, LlamaIndex, Instructor, and any OpenAI-shaped library.

No code rewrite — just change base_url
Multimodal content lists (text + image parts)
Async-native under the hood

from openai import OpenAI

client = OpenAI(
    api_key="your-microdc-key",
    base_url="https://api.microdc.ai/v1",
)

resp = client.chat.completions.create(
    model="llama-3.1-8b",
    messages=[
        {"role": "user",
         "content": "Summarize quantum tunneling."}
    ],
)
print(resp.choices[0].message.content)
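A multimodal content list replaces the plain string in content with a list of typed parts. The sketch below builds one such message in the standard OpenAI chat format (text and image_url parts). It assumes MicroDC accepts that same part shape and that the chosen model can read images; multimodal_user_message is a hypothetical helper, and the image URL is a placeholder.

```python
from typing import Any, Dict, List


def multimodal_user_message(text: str, image_url: str) -> Dict[str, Any]:
    """Build one user message whose content is a list of text and image parts."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


messages: List[Dict[str, Any]] = [
    multimodal_user_message(
        "What does this chart show?",
        "https://example.com/chart.png",  # placeholder URL
    )
]
print([part["type"] for part in messages[0]["content"]])

# With the OpenAI client configured as above (model name is a placeholder):
# resp = client.chat.completions.create(
#     model="<vision-capable-model>",
#     messages=messages,
# )
```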

§05 · CONTAINER JOBS

Bring your own Docker image.

Submit any image and upload script files (.py, .sh, .js, .ts, .go, .rs, .java, ...) as inputs. Logs stream live via a per-job heartbeat. Container jobs route only to workers advertising the docker capability.

Batch ETL · scientific compute · headless browsers
Video transcoding · custom ML pipelines
GPU-hour or CPU-core-hour billing — your call

# Reuses the `client` created in the quick start
job = client.submit_job(
    type="container",
    model="my-registry/pdf-extract:v2",
    payload={
        "image": "my-registry/pdf-extract:v2",
        "args": ["--input", "/data/report.pdf"],
        "env": {"LOG_LEVEL": "info"},
    },
    files=["report.pdf"],
)

# Stream logs while it runs
for line in client.stream_logs(job.id):
    print(line)

result = client.wait_for_result(job.id)

§06 · ENCRYPTION

End-to-end encrypted, zero-knowledge results.

Prompts are encrypted on your machine before they leave it, and results can only be decrypted by you. The server stores and routes ciphertext: it never inspects, logs, or decrypts your content, and the payload key it escrows for the worker is destroyed on acknowledgment. Results are encrypted to your public key, so no one else can read them, including the worker pool and MicroDC. Built for regulated industries, IP-sensitive workflows, and anyone who simply doesn't want their prompts logged.

STEP 01

You encrypt

Client encrypts the payload with AES-256-GCM on your machine and sends your RSA public key alongside. Your private key never leaves your machine.

STEP 02

Managed worker decrypts & runs

Encrypted jobs route only to MicroDC-managed workers with the admin-approved encryption capability. The worker receives the key on claim, decrypts in memory, and runs inference against a local model — your decrypted data never touches an external network.

STEP 03

Result re-encrypted

The worker encrypts the result with a fresh one-time key wrapped to your public key. Only your private key can decrypt it. All key material is deleted on acknowledgment.
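The result path in step 03 is a standard hybrid (envelope) scheme: a one-time AES-256-GCM key encrypts the data, and that key is wrapped to the recipient's RSA public key. The sketch below illustrates the idea with the third-party cryptography package. It is not MicroDC's wire format: the OAEP/SHA-256 padding, the 2048-bit key size, the envelope field names, and the seal/open_sealed helpers are all assumptions, since the text above specifies only AES-256-GCM and RSA.

```python
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Assumption: RSA-OAEP with SHA-256 for key wrapping.
OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def seal(public_key, plaintext: bytes) -> dict:
    """Encrypt with a fresh one-time AES-256-GCM key, then wrap that key to public_key."""
    key = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)  # 96-bit nonce; the key is used once, so it never repeats
    return {
        "wrapped_key": public_key.encrypt(key, OAEP),
        "nonce": nonce,
        "ciphertext": AESGCM(key).encrypt(nonce, plaintext, None),
    }


def open_sealed(private_key, envelope: dict) -> bytes:
    """Unwrap the one-time key with the private key, then decrypt and authenticate."""
    key = private_key.decrypt(envelope["wrapped_key"], OAEP)
    return AESGCM(key).decrypt(envelope["nonce"], envelope["ciphertext"], None)


# The key pair is generated and kept on the client; only the public half is shared.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
envelope = seal(private_key.public_key(), b"model output")
print(open_sealed(private_key, envelope))
```

Because GCM is authenticated, open_sealed raises an exception if the ciphertext or nonce has been modified in transit, rather than returning corrupted plaintext.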

Build AI apps without the infrastructure headaches.

Ship your first job.
