Trending labs
Pre-optimised LLM containers, one click
Pull NVIDIA's pre-optimised inference microservices from NGC, deploy them via the GenAI App Engine, and front them with the AI Application Gateway. Token-auth, RBAC, auto-scaling — the fast path to production-grade LLM endpoints.
Vector store + retriever + LLM, governed end-to-end
Ingest a corpus, embed it, deploy the LLM, wire the retriever, and ship a chat endpoint behind the AI Application Gateway. A complete retrieval-augmented generation pipeline on infrastructure you own.
Vector generation services for search, RAG, and recsys
Stand up text and image embedding endpoints — sentence-transformers, BGE, NV-Embed — with an OpenAI-compatible API. Auto-scaled, RBAC-controlled, observable end-to-end through the AI Application Gateway.
Disaggregated, low-latency LLM serving
Run NVIDIA Dynamo's distributed inference framework on your ClearML fleet — disaggregated prefill / decode, KV-cache reuse, multi-node tensor parallelism. All governed through the AI Application Gateway.
Auto-scaling inference behind the AI Application Gateway
Take a deployed LLM and scale it across multiple GPU workers behind the AI Application Gateway — token-auth, RBAC, session-affine load balancing, and live latency observability out of the box.
SLURM clusters spun up as containers, on demand
ClearML provisions a SLURM controller + worker pods on your K8s fleet. Submit with sbatch and run HPC-style workloads on the same governed compute as the rest of your AI work. No bare metal required.
A secured, sandboxed K8s namespace per team
Spin up a fully isolated Kubernetes cluster inside your ClearML tenant — its own control plane, networking, and RBAC. Give a team or customer hands-on K8s without exposing the parent fleet.
Set up projects, users, RBAC, and credentials
Master the operational side of ClearML — multi-team projects, access policies, credential rotation, audit. The day-2 admin track.
Coordinate large-scale training across GPU nodes
Run a distributed training job across multiple GPU workers, track the metrics live, and inspect artifact and experiment lineage on a clean tenant.