AI Surgical Planning AI Specialist
An AI Surgical Planning AI Specialist designs, validates, and deploys machine learning systems that transform preoperative medical…
Skill Guide
The engineering discipline of converting, optimizing, and deploying deep learning models for real-time inference on GPU hardware using TensorRT and ONNX Runtime, specifically for latency-critical applications on surgical robotic consoles and medical imaging systems.
Scenario
You have a PyTorch U-Net model for instrument segmentation that runs at 15 FPS on the target NVIDIA Jetson AGX Orin in the surgical console. The requirement is ≥30 FPS.
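As a rough illustration of the gap in this scenario, the sketch below converts the frame rates into per-frame latency and assembles a trtexec invocation for building an FP16 engine. It assumes the model has already been exported to ONNX; the file names `unet.onnx` and `unet.plan` and the helper names are hypothetical. Whether FP16 alone closes the gap would have to be measured on the target device.

```python
from typing import List


def frame_time_ms(fps: float) -> float:
    """Per-frame latency in milliseconds implied by a frame rate."""
    return 1000.0 / fps


def required_speedup(current_fps: float, target_fps: float) -> float:
    """Factor by which per-frame latency must shrink to hit target_fps."""
    return target_fps / current_fps


def trtexec_command(onnx_path: str, engine_path: str, fp16: bool = True) -> List[str]:
    """Assemble a trtexec command line (paths are hypothetical placeholders).

    --onnx, --saveEngine and --fp16 are standard trtexec options.
    """
    cmd = ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]
    if fp16:
        cmd.append("--fp16")
    return cmd


if __name__ == "__main__":
    print(f"current: {frame_time_ms(15):.1f} ms/frame")
    print(f"target:  {frame_time_ms(30):.1f} ms/frame")
    print(f"needed speedup: {required_speedup(15, 30):.1f}x")
    print(" ".join(trtexec_command("unet.onnx", "unet.plan")))
```

Going from 15 FPS to 30 FPS means cutting end-to-end latency from about 66.7 ms to about 33.3 ms per frame, so pre- and post-processing count against the budget as well as the network itself.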
Scenario
The surgical console requires simultaneous inference for three models: (1) real-time tissue depth estimation, (2) instrument detection, and (3) anatomical landmark recognition. Each model must complete within a 10 ms frame budget while sharing GPU memory on an embedded platform.
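One way to reason about this scenario is a simple budget check over profiled numbers. The sketch below is a minimal example under stated assumptions: the latency and memory figures are invented placeholders, not measurements, and `ModelProfile` and `check_budget` are hypothetical names. It flags models that exceed the per-model budget, checks the combined memory footprint, and reports the worst case in which the models end up running one after another.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ModelProfile:
    name: str
    latency_ms: float  # profiled per-inference latency (placeholder value)
    memory_mb: float   # peak GPU memory use (placeholder value)


def check_budget(models: List[ModelProfile],
                 frame_budget_ms: float,
                 memory_budget_mb: float) -> Dict[str, object]:
    """Compare profiled models against a per-model latency and shared memory budget."""
    over_latency = [m.name for m in models if m.latency_ms > frame_budget_ms]
    total_memory = sum(m.memory_mb for m in models)
    serial_latency = sum(m.latency_ms for m in models)
    return {
        "over_latency": over_latency,
        "total_memory_mb": total_memory,
        "fits_memory": total_memory <= memory_budget_mb,
        # Worst case: no overlap between models on the GPU.
        "serial_latency_ms": serial_latency,
    }


if __name__ == "__main__":
    profiles = [
        ModelProfile("depth", 7.5, 512.0),
        ModelProfile("instruments", 4.0, 256.0),
        ModelProfile("landmarks", 11.25, 384.0),
    ]
    print(check_budget(profiles, frame_budget_ms=10.0, memory_budget_mb=2048.0))
```

With these placeholder numbers the landmark model would be the one to optimize first, and the serialized total shows how much the design depends on the models actually overlapping on the GPU.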
Scenario
Develop a production-grade inference service for a Class IIa surgical device that must maintain function during GPU thermal throttling or transient hardware faults, guaranteeing a fallback to a less accurate but faster model.
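The fallback behaviour in this scenario can be sketched as a small wrapper around two inference callables. This is an illustrative sketch, not a certified design: `FallbackInference`, `deadline_ms`, and `max_misses` are hypothetical names, and the two models are stand-in functions. It counts consecutive deadline misses or exceptions from the primary model and switches to the faster model once a threshold is reached. It does not preempt a primary call that hangs, and it has no recovery path back to the primary; a real device would likely need a watchdog and a defined recovery policy.

```python
import time
from typing import Any, Callable, Tuple


class FallbackInference:
    """Run a primary model; degrade to a faster fallback after repeated misses."""

    def __init__(self,
                 primary: Callable[[Any], Any],
                 fallback: Callable[[Any], Any],
                 deadline_ms: float,
                 max_misses: int = 3,
                 clock: Callable[[], float] = time.perf_counter) -> None:
        self._primary = primary
        self._fallback = fallback
        self._deadline_ms = deadline_ms
        self._max_misses = max_misses
        self._clock = clock
        self._misses = 0       # consecutive late or failed primary calls
        self.degraded = False  # True once the fallback has taken over

    def infer(self, frame: Any) -> Tuple[Any, str]:
        """Return (result, source) where source is 'primary' or 'fallback'."""
        if self.degraded:
            return self._fallback(frame), "fallback"

        start = self._clock()
        try:
            result = self._primary(frame)
            ok = True
        except Exception:
            # Treat any primary failure (e.g. a transient fault) as a miss.
            result, ok = None, False
        elapsed_ms = (self._clock() - start) * 1000.0

        on_time = ok and elapsed_ms <= self._deadline_ms
        self._misses = 0 if on_time else self._misses + 1
        if self._misses >= self._max_misses:
            self.degraded = True

        if ok:
            # A late result is still returned; only the miss is recorded.
            return result, "primary"
        return self._fallback(frame), "fallback"


if __name__ == "__main__":
    service = FallbackInference(lambda f: "mask-accurate",
                                lambda f: "mask-fast",
                                deadline_ms=33.0)
    print(service.infer(frame=None))
```

Injecting the clock keeps the timing logic testable without real hardware, which matters when the behaviour has to be verified deterministically.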
TensorRT is the primary compiler and runtime for NVIDIA GPU inference; it performs layer and kernel fusion and supports reduced-precision execution (FP16, and INT8 with calibration). ONNX Runtime provides cross-platform inference through pluggable execution providers (TensorRT, DirectML, CoreML). trtexec is the TensorRT CLI for building engines and benchmarking them. ONNX GraphSurgeon is used for advanced ONNX graph manipulation before TensorRT ingestion.
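To make the ONNX Runtime backend choice concrete, the sketch below picks execution providers in preference order from whatever the local build reports. The helper `select_providers` and the preference order are assumptions for illustration; the provider name strings are the ones ONNX Runtime uses. The function takes the available list as a parameter so it runs without ONNX Runtime installed.

```python
from typing import List, Sequence

# Preference order is an assumption: TensorRT first, then CUDA, then CPU.
PREFERRED_PROVIDERS = (
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
)


def select_providers(available: Sequence[str],
                     preferred: Sequence[str] = PREFERRED_PROVIDERS) -> List[str]:
    """Return the preferred providers that are actually available, in order."""
    chosen = [p for p in preferred if p in available]
    if not chosen:
        raise RuntimeError("none of the preferred execution providers are available")
    return chosen


if __name__ == "__main__":
    # In a real deployment `available` would come from
    # onnxruntime.get_available_providers(), and the result would be passed as
    # onnxruntime.InferenceSession(model_path, providers=chosen).
    available = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    print(select_providers(available))
```

Keeping the CPU provider at the end of the list gives ONNX Runtime somewhere to place operators that the accelerated providers do not support.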
Nsight Systems provides system-wide timeline visualization (CPU activity, GPU kernels, memory operations). Nsight Compute offers detailed kernel-level GPU performance analysis. TensorBoard is used for profiling TF-TRT (TensorFlow-TensorRT) execution. Polygraphy is a TensorRT utility for validating ONNX-to-TensorRT conversions and for layer-wise debugging.
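Validating a conversion comes down to comparing the outputs of two backends within a tolerance. The sketch below is not Polygraphy code; it is a plain-Python illustration of an element-wise check with absolute and relative tolerances, and `compare_outputs` and its default tolerances are assumptions.

```python
from typing import Sequence, Tuple


def compare_outputs(reference: Sequence[float],
                    candidate: Sequence[float],
                    atol: float = 1e-3,
                    rtol: float = 1e-3) -> Tuple[bool, float]:
    """Check |ref - cand| <= atol + rtol * |ref| element-wise.

    Returns (all_within_tolerance, max_absolute_difference).
    """
    if len(reference) != len(candidate):
        raise ValueError("outputs have different lengths")
    max_abs = 0.0
    within = True
    for ref, cand in zip(reference, candidate):
        diff = abs(ref - cand)
        max_abs = max(max_abs, diff)
        if diff > atol + rtol * abs(ref):
            within = False
    return within, max_abs


if __name__ == "__main__":
    onnx_out = [0.10, 0.25, 0.90]    # placeholder values from a reference backend
    trt_out = [0.10, 0.2504, 0.90]   # placeholder values from an optimized engine
    print(compare_outputs(onnx_out, trt_out))
```

FP16 or INT8 engines rarely match FP32 bit for bit, so the tolerance has to be chosen against what the downstream surgical task can accept rather than set to zero.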
JetPack SDK provides the Linux for Tegra (L4T) OS, CUDA, cuDNN, and TensorRT for Jetson platforms. Jetson AGX Orin is the reference high-compute edge hardware for surgical consoles. Containerization supports reproducible deployment. Triton Inference Server can serve multiple optimized models with concurrent execution and metrics on edge devices.