Submit LLM, embedding, document, or container jobs asynchronously. Get results at a fraction of the cost of dedicated GPU instances. No infrastructure to manage, no minimum commitment, pay per compute-second.
Clean, well-documented endpoints. Integrates with any language or framework.
Only pay for compute time. No idle costs, no minimum commitments.
Submit thousands of jobs at once. Built for data pipelines and bulk work.
Llama, Mistral, Qwen, and more. All major open-source LLMs.
Install the Python SDK, point it at MicroDC.ai, submit a job. Poll, stream, or hand us a webhook.
from microdc import MicroDC

client = MicroDC(api_key="your-api-key")

# Submit an inference job
job = client.submit_job(
    model="llama-3.1-8b",
    prompt="Explain quantum computing in simple terms",
    max_tokens=500,
)

# Check job status
status = client.get_job(job.id)
print(f"Status: {status.state}")

# Get results when ready
result = client.wait_for_result(job.id)
print(result.output)
curl https://api.microdc.ai/v1/jobs \
-H "Authorization: Bearer $MDC_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "llama-3.1-8b",
"prompt": "Explain quantum computing in simple terms",
"max_tokens": 500
}'
# Returns: { "id": "J-a1b2c3", "status": "queued" }
curl https://api.microdc.ai/v1/jobs/J-a1b2c3 \
-H "Authorization: Bearer $MDC_KEY"
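The second request above is the polling half of the flow. A minimal Python sketch of that loop, using only the standard library, might look like the following. The terminal state names (`completed`, `failed`), the backoff values, and the helper names `fetch_job` and `poll_job` are assumptions for illustration; only the `/v1/jobs/{id}` path, the bearer header, and the `status` field come from the example above.

```python
import json
import time
import urllib.request

API_BASE = "https://api.microdc.ai/v1"
# Assumed terminal states; the example above only shows "queued".
TERMINAL_STATES = {"completed", "failed"}


def fetch_job(job_id, api_key):
    """GET /v1/jobs/{id} and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{API_BASE}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def poll_job(job_id, fetch, interval=1.0, max_interval=30.0,
             timeout=600.0, sleep=time.sleep):
    """Call fetch(job_id) with exponential backoff until a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id)
        if job.get("status") in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job.get('status')!r}")
        sleep(interval)
        # Double the wait each round, capped at max_interval.
        interval = min(interval * 2, max_interval)
```

Passing the fetch function in as an argument keeps the loop easy to test without network access. In real use it would be something like `poll_job("J-a1b2c3", lambda jid: fetch_job(jid, api_key))`.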
| Type | What it runs | Routing | Pricing |
|---|---|---|---|
| llm | Chat completions on any supported LLM | By model name | Token-based |
| embed | Vector embeddings for RAG, search, classification | By model name | Token-based |
| document | Summarize, extract, analyze uploaded files | By model name | Flat-rate |
| container | Any Docker image with your script or code | By capability (docker) | GPU-hour or CPU-core-hour |
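The Routing column can be read as a simple matching rule. The sketch below is one way to picture it, not a description of MicroDC.ai's actual scheduler: the `Worker` and `Job` shapes and the `eligible_workers` helper are hypothetical, and only the job types, the split between routing by model and routing by capability, and the `docker` capability name come from the table.

```python
from dataclasses import dataclass, field

# Job types from the table that route by model name.
MODEL_ROUTED = {"llm", "embed", "document"}
# Job types that route by capability, mapped to the capability they need.
CAPABILITY_ROUTED = {"container": "docker"}


@dataclass
class Worker:
    # Hypothetical worker record: what it serves and what it advertises.
    name: str
    models: set = field(default_factory=set)
    capabilities: set = field(default_factory=set)


@dataclass
class Job:
    type: str
    model: str = ""


def eligible_workers(job, workers):
    """Return the names of workers that could claim this job."""
    if job.type in MODEL_ROUTED:
        return [w.name for w in workers if job.model in w.models]
    if job.type in CAPABILITY_ROUTED:
        needed = CAPABILITY_ROUTED[job.type]
        return [w.name for w in workers if needed in w.capabilities]
    raise ValueError(f"unknown job type: {job.type!r}")
```

Under this reading, a `container` job ignores which models a worker serves and matches only on the advertised capability.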
Point your existing openai client at MicroDC.ai and keep the rest of your code. Multimodal content lists are supported. Works with LangChain, LlamaIndex, Instructor, and any OpenAI-shaped library.
from openai import OpenAI

client = OpenAI(
    api_key="your-microdc-key",
    base_url="https://api.microdc.ai/v1",
)

resp = client.chat.completions.create(
    model="llama-3.1-8b",
    messages=[
        {"role": "user",
         "content": "Summarize quantum tunneling."}
    ],
)
print(resp.choices[0].message.content)
Submit any Docker image. Upload script files (.py, .sh, .js, .ts, .go, .rs, .java, and others) as inputs. Logs stream live via a per-job heartbeat. Jobs route only to workers advertising the docker capability.
# `client` is the MicroDC SDK client created earlier
job = client.submit_job(
    type="container",
    model="my-registry/pdf-extract:v2",
    payload={
        "image": "my-registry/pdf-extract:v2",
        "args": ["--input", "/data/report.pdf"],
        "env": {"LOG_LEVEL": "info"},
    },
    files=["report.pdf"],
)

# Stream logs while it runs
for line in client.stream_logs(job.id):
    print(line)

result = client.wait_for_result(job.id)
The server never sees your prompts or results. Per-job symmetric keys, public-key result encryption, automatic key destruction on acknowledgment. For regulated industries, IP-sensitive workflows, and anyone who simply doesn't want their prompts logged.
The client generates a per-job symmetric key and encrypts the payload. A public key is sent alongside so results return encrypted to you.
Only workers with the encryption capability can claim the job. They receive the key on claim, decrypt the payload in memory, and run the job.
The worker encrypts the result with your public key and submits it. You decrypt with your private key. Per-job keys are deleted on acknowledgment.
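Taken together, the steps above describe a key lifecycle: create a key per job, release it only to a capable worker on claim, and destroy it on acknowledgment. The toy model below illustrates that lifecycle only. It performs no encryption, and the `JobKeyRegistry` class, its method names, the literal capability string `"encryption"`, and the choice to hold keys in an in-memory dict are all assumptions; the text does not say where keys live between submission and claim.

```python
import secrets


class JobKeyRegistry:
    """Toy in-memory model of per-job key handling (no real crypto here)."""

    def __init__(self):
        self._keys = {}

    def create(self, job_id):
        """Generate a fresh 256-bit key for exactly one job."""
        if job_id in self._keys:
            raise ValueError(f"key already exists for {job_id}")
        self._keys[job_id] = secrets.token_bytes(32)
        return self._keys[job_id]

    def claim(self, job_id, worker_capabilities):
        """Release the key only to a worker advertising 'encryption'."""
        if "encryption" not in worker_capabilities:
            raise PermissionError("worker lacks the encryption capability")
        return self._keys[job_id]

    def acknowledge(self, job_id):
        """Delete the key once the client has acknowledged the result."""
        self._keys.pop(job_id, None)

    def has_key(self, job_id):
        return job_id in self._keys
```

A real client would pair this lifecycle with an authenticated symmetric cipher for the payload and a public-key scheme for the result. Neither is shown here, because the text does not name the algorithms.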
Free credits to start. No credit card. No minimum. Full API and SDK access from day one.