Sr. Software Engineer - K8s - GPU Orchestration - REMOTE Job at Living Talent, San Jose, CA

  • Living Talent
  • San Jose, CA

Job Description

GPU Orchestration
  • Startup
  • Company size: 30
  • Remote within North America
  • Compensation: Base Salary 250k + Equity

Key Responsibilities

  • Lead the design, architecture, and development of K8s-based cloud infrastructure.
  • Use K8s controllers, operators, and CRs to implement scalable, high-availability solutions.
  • Integrate Karpenter and/or other advanced tools for infrastructure optimization.
  • Architect MLOps Middleware integration (dynamic workload migration, resource disaggregation).
  • Build monitoring, logging & alerting systems.
  • Drive infrastructure cost optimization through FinOps best practices in K8s deployments.
  • Promote K8s best practices & mentor software engineers.
  • Collaborate across teams to drive K8s adoption in multi-cloud and hybrid environments.
  • Contribute to open-source projects in the Kubernetes community.

Qualifications

Kubernetes Expertise

  • Designing, deploying, and managing K8s clusters (AKS, EKS, GKE, OpenStack, etc.).
  • Hands-on experience with K8s components and ecosystem tooling (Karpenter, Cluster Autoscaler, CNI, CSI, CRI, CRDs, operators).
  • 5+ years in Kubernetes infrastructure.
  • Contributing to open-source Kubernetes projects.
  • 10+ years of software engineering experience.
  • Proficiency in one or more of Go, Python, Bash, etc.
  • Excellent communication skills for both technical and non-technical stakeholders.
  • Bachelor’s or Master’s degree in Computer Science or related field (preferred).

Preferred Experience

  • GPU scheduling, container orchestration, HPC (high-performance computing) workloads.
  • Familiarity with multi-cloud and hybrid cloud deployments.
  • Experience with MLOps platforms (Kubeflow, TFX, etc.).
  • Experience with or knowledge of FinOps practices and cloud cost management.

Job Tags

Remote job
