Explore our catalog of AI models available for distributed inference: LLMs, embeddings, image generation, and multimodal models. Pick one and submit a job.
| Type | Code tag | Description |
|---|---|---|
| LLM | llm | Classic chat-completion prompts. Routed by model name, priced per token. |
| Embeddings | embed | Vector embeddings for RAG, semantic search, and classification. Priced per token. CPU-only workers excel here. |
| Document | document | Summarize, extract, or analyze uploaded files using the models in this catalog. Flat-rate per job. |
| Container | container | Any Docker image. Routed to workers with docker capability. GPU-hour or CPU-core-hour billing. |
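
As a rough illustration of the table above, the four code tags and their billing models could be represented client-side as follows. This is a minimal sketch, not the platform's SDK: `JobType`, `BILLING`, and `billing_for` are hypothetical names introduced here, and only the tag strings and billing descriptions come from the table.

```python
from enum import Enum


class JobType(Enum):
    """Job type code tags, as listed in the catalog table."""
    LLM = "llm"
    EMBED = "embed"
    DOCUMENT = "document"
    CONTAINER = "container"


# Billing model per job type, paraphrased from the table.
# The mapping structure itself is an assumption made for illustration.
BILLING = {
    JobType.LLM: "per token",
    JobType.EMBED: "per token",
    JobType.DOCUMENT: "flat rate per job",
    JobType.CONTAINER: "GPU-hour or CPU-core-hour",
}


def billing_for(tag: str) -> str:
    """Return the billing model for a code tag; raises ValueError for unknown tags."""
    return BILLING[JobType(tag)]


if __name__ == "__main__":
    for job_type in JobType:
        print(f"{job_type.value}: {billing_for(job_type.value)}")
```

Validating the tag through an enum means a mistyped tag fails locally before any job is submitted.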
Jobs are tagged with a context tier (1–4) at submission based on prompt size. Workers declare their capacity, so you always land on hardware that can run your prompt efficiently. See "How it works" for details.
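
To make the tiering idea concrete, here is a minimal sketch of how a prompt size might map to a tier and be matched against a worker. The token thresholds in `TIER_LIMITS` are hypothetical placeholders, since the actual boundaries are not stated here, and modelling worker capacity as a single maximum tier is likewise an assumption.

```python
from bisect import bisect_left

# HYPOTHETICAL upper bounds (in tokens) for tiers 1, 2, and 3.
# Anything above the last bound falls into tier 4.
TIER_LIMITS = [4_096, 16_384, 65_536]


def context_tier(prompt_tokens: int) -> int:
    """Map a prompt size to a context tier from 1 to 4."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # bisect_left counts how many bounds are strictly smaller than the prompt,
    # so a prompt exactly at a bound stays in the lower tier.
    return bisect_left(TIER_LIMITS, prompt_tokens) + 1


def can_run(worker_max_tier: int, job_tier: int) -> bool:
    """Assumed matching rule: a worker accepts jobs up to its declared tier."""
    return worker_max_tier >= job_tier


if __name__ == "__main__":
    tier = context_tier(20_000)
    print(tier, can_run(worker_max_tier=2, job_tier=tier))
```

With these placeholder limits, a 20,000-token prompt would be tagged tier 3 and skipped by a worker that declared tier 2.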
Sign up and start running inference jobs across our distributed network.