[Remote] Senior Software Engineer, AI and DL Kernel Libraries
Note: This is a remote position open to candidates in the USA. NVIDIA is a leading technology company specializing in AI and deep learning solutions. It is seeking a Senior Software Engineer to develop innovative AI systems technologies, with a focus on optimizing kernels for high-impact AI workloads and collaborating across teams to enhance NVIDIA's hardware architecture.
Responsibilities
- Innovating and developing new AI systems technologies for efficient inference
- Designing, implementing, and optimizing kernels for high-impact AI workloads
- Designing and implementing extensible abstractions for LLM serving engines
- Building efficient just-in-time, domain-specific compilers and runtimes
- Collaborating closely with other engineers at NVIDIA across deep learning frameworks, libraries, kernels, and GPU arch teams
- Contributing to open-source communities such as FlashInfer, vLLM, and SGLang
Skills
- Master's degree in Computer Science, Electrical Engineering, or a related field (or equivalent experience); PhD preferred
- 6+ years of academic or industry experience with ML/DL systems development preferred
- Strong experience in developing or using deep learning frameworks (e.g. PyTorch, JAX, TensorFlow, ONNX) and, ideally, inference engines and runtimes such as vLLM, SGLang, and MLC
- Strong Python and C/C++ programming skills
- Strong experience in GPU kernel development and performance optimization (especially using CUDA C/C++, cuTile, Triton, or similar)
- Background in domain-specific compiler and library solutions for LLM inference and training (e.g. FlashInfer, Flash Attention)
- Expertise in inference engines like vLLM and SGLang
- Expertise in machine learning compilers (e.g. Apache TVM, MLIR)
- Open source project ownership or contributions
Benefits
- Equity
- Benefits