Inference Observability
Track 100+ GPU and inference metrics, monitor XID errors and code exceptions, record CUDA profiles, and trace LLM generations: all in one place, all automatically.

NVIDIA
PyTorch
Hugging Face
vLLM

Inference tracing
Trace LLM generations, communication, kernel launches, and more.
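
As a minimal sketch of the idea, here is one way to trace a single generation with vLLM, timing the request and recording token counts. The record_span helper and the model name are illustrative placeholders, not this product's API; a real tracer would export spans to a backend instead of printing them.

```python
import time
from vllm import LLM, SamplingParams

# Hypothetical trace-record helper; stands in for a real span exporter.
def record_span(name, start, end, **attrs):
    print(f"{name}: {(end - start) * 1e3:.1f} ms {attrs}")

llm = LLM(model="facebook/opt-125m")  # any small model works for a demo
params = SamplingParams(max_tokens=64)

prompt = "Explain GPU observability in one sentence."
t0 = time.perf_counter()
outputs = llm.generate([prompt], params)
t1 = time.perf_counter()

completion = outputs[0].outputs[0]
record_span(
    "llm.generate",
    t0,
    t1,
    prompt_tokens=len(outputs[0].prompt_token_ids),
    completion_tokens=len(completion.token_ids),
)
```
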
Inference profiling
Identify top contributors to inference latency.
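
For a sense of what this looks like in practice, torch.profiler can rank operators by GPU time to surface the top contributors to latency. The Linear layer below is a toy stand-in for a real model's forward pass.

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Toy stand-in for an inference step; substitute your model's forward pass.
model = torch.nn.Linear(4096, 4096).cuda()
x = torch.randn(8, 4096, device="cuda")

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
    with torch.no_grad():
        model(x)

# Rank ops by total GPU time to surface the top latency contributors.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```
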
GPU monitoring
Monitor inference performance, CPU/GPU utilization, and errors.
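
Under the hood, utilization metrics like these can be sampled through NVML. A minimal sketch using the pynvml bindings, assuming the nvidia-ml-py package is installed and at least one GPU is visible:

```python
import time
import pynvml  # NVIDIA Management Library bindings (pip install nvidia-ml-py)

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

# Poll utilization and memory once per second; a real exporter would
# push these samples to a metrics backend instead of printing them.
for _ in range(5):
    util = pynvml.nvmlDeviceGetUtilizationRates(handle)
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    print(f"GPU {util.gpu}% | mem {mem.used / 2**20:.0f}/{mem.total / 2**20:.0f} MiB")
    time.sleep(1)

pynvml.nvmlShutdown()
```
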
Issue detection
Track and get alerts on errors and inefficiencies.
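
XID errors, for instance, surface in the kernel log. A minimal sketch that scans dmesg for them, assuming the driver's usual "NVRM: Xid" line format; a real monitor would raise an alert rather than print, and reading dmesg may require elevated privileges.

```python
import re
import subprocess

# NVIDIA driver XID errors land in the kernel log as lines like:
#   NVRM: Xid (PCI:0000:3b:00): 79, pid=..., GPU has fallen off the bus.
XID_PATTERN = re.compile(r"NVRM: Xid \((PCI:[0-9a-fA-F:.]+)\): (\d+)")

log = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
for line in log.splitlines():
    match = XID_PATTERN.search(line)
    if match:
        # A real monitor would fire an alert here instead of printing.
        print(f"XID {match.group(2)} on {match.group(1)}: {line.strip()}")
```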