
[Remote] Principal Software Engineer, DGX Cloud Production Engineering

Work from home, full-time role

Note: This is a remote position open to candidates in the USA. NVIDIA is a leader in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. The company is looking for Principal Software Engineers to help shape the technical direction for production engineering, Kubernetes-based operations, automation, and reliability across large-scale GPU clusters.

Responsibilities

  • Define and execute the technical strategy for DGX Cloud cluster operations, building the automation, GitOps, and Day 2 reliability needed to operate large-scale GPU clusters across NVIDIA Cloud Partners (NCPs) and on-prem environments
  • Lead design and implementation of systems for cluster lifecycle, validation, repair, upgrades, observability, and readiness
  • Establish patterns for Kubernetes-based GPU cluster operations across partner and on-prem environments
  • Identify and eliminate operational toil through software, APIs, automation, and agent-assisted workflows
  • Set technical standards for production readiness, SLOs, incident response, handoff gates, and operational acceptance
  • Mentor engineers and influence platform, infrastructure, storage, networking, security, and workload teams

Skills

  • 15+ years of experience building and operating large-scale distributed systems or cloud infrastructure
  • Deep experience with Kubernetes, Linux, infrastructure automation, and production operations
  • Strong programming experience in Go, Python, or similar
  • Proven ability to lead complex cross-org technical initiatives
  • Experience designing reliable systems with clear SLOs, observability, incident response, and automation
  • BS/MS in Computer Science or equivalent experience
  • Experience with GPU clusters, AI/ML infrastructure, Kubernetes operators, GitOps, BMaaS/VMaaS, managed Kubernetes, or multi-cloud fleet operations
  • Experience building internal platforms, control planes, lifecycle automation, or production readiness frameworks
  • Track record of turning operational pain into reusable software, APIs, and engineering standards

Benefits

  • Equity
  • Benefits

Company Overview

  • NVIDIA is a computing platform company operating at the intersection of graphics, HPC, and AI. It was founded in 1993 and is headquartered in Santa Clara, California, USA, with a workforce of more than 10,000 employees. Its website is https://www.nvidia.com.

Company H-1B Sponsorship

  • NVIDIA has a track record of offering H-1B sponsorships, with 448 in 2026, 1872 in 2025, 1354 in 2024, 976 in 2023, 835 in 2022, 601 in 2021, and 529 in 2020. Please note that this does not guarantee sponsorship for this specific role.