ClearML Hands-On Lab

Coming soon

Deploy NVIDIA NIM Microservices

Pre-optimised LLM containers, one click

NVIDIA NIM, Model Serving, LLM

Pull NVIDIA's pre-optimised inference microservices from NGC, deploy them via the GenAI App Engine, and front them with the AI Application Gateway. Token-auth, RBAC, auto-scaling — the fast path to production-grade LLM endpoints.

Coming soon

Build & Deploy an End-to-End RAG System

Vector store + retriever + LLM, governed end-to-end

RAG, LLM, Model Serving

Ingest a corpus, embed it, deploy the LLM, wire the retriever, and ship a chat endpoint behind the AI Application Gateway. A complete retrieval-augmented generation pipeline on infrastructure you own.

Coming soon

Deploy Embedding Models at Scale

Vector generation services for search, RAG, and recsys

Embeddings, Model Serving, LLM

Stand up text and image embedding endpoints — sentence-transformers, BGE, NV-Embed — with an OpenAI-compatible API. Auto-scaled, RBAC-controlled, observable end-to-end through the AI Application Gateway.

Coming soon

Scale Inference with NVIDIA Dynamo

Disaggregated, low-latency LLM serving

NVIDIA Dynamo, Model Serving, LLM

Run NVIDIA Dynamo's distributed inference framework on your ClearML fleet — disaggregated prefill / decode, KV-cache reuse, multi-node tensor parallelism. All governed through the AI Application Gateway.

Coming soon

Scalable LLM Endpoints

Auto-scaling inference behind the AI Application Gateway

Model Serving, LLM, Auto-scaling

Take a deployed LLM and scale it across multiple GPU workers behind the AI Application Gateway — token-auth, RBAC, session-affine load balancing, and live latency observability out of the box.

Coming soon

Run SLURM Workloads on Kubernetes

SLURM clusters spun up as containers, on demand

SLURM, HPC, Kubernetes

ClearML provisions a SLURM controller + worker pods on your K8s fleet — submit with sbatch and run HPC-style workloads on the same governed compute as the rest of your AI work. No bare-metal required.

Coming soon

Isolated Kubernetes with K3K

A secured, sandboxed K8s cluster per team

K3K, Multi-tenant, Kubernetes

Spin up a fully isolated Kubernetes cluster inside your ClearML tenant — its own control plane, networking, and RBAC. Give a team or customer hands-on K8s without exposing the parent fleet.

Coming soon

ClearML Administrator Essentials

Set up projects, users, RBAC, and credentials

User Management, RBAC, Governance

Master the operational side of ClearML — multi-team projects, access policies, credential rotation, audit. The day-2 admin track.

Coming soon

Distributed Model Training

Coordinate large-scale training across GPU nodes

Training, Parallelisation, Artifacts

Run a distributed training job across multiple GPU workers, track the metrics live, and inspect artifact + experiment lineage on a clean tenant.

Configure Dynamic Fractional Compute for AI Teams

GPU-as-a-Service Overview