Principal Engineer - AI Infrastructure Abstractions

Company: Diversity Talent Scouts
Location: San Jose
Posted on: February 16, 2026

Job Description:

As a Principal AI Infrastructure Abstraction Engineer, you will design and implement the foundational systems that make shared AI compute environments scalable, secure, and developer-friendly. Your work will focus on creating abstractions that hide hardware complexity while providing predictable, cloud-native interfaces for AI workloads. This position bridges infrastructure and applied AI, turning raw GPUs and accelerators into programmable, elastic, and multi-tenant resources for both internal developers and enterprise clients.

Key Responsibilities:

Architect abstractions that map logical compute constructs (vGPUs, GPU pools, workload queues) to physical devices.
Build APIs, services, and control planes that expose GPU and accelerator resources with strong isolation and quality-of-service guarantees.
Develop mechanisms for secure GPU sharing, including time-slicing, partitioning, and namespace isolation.
Work with orchestration and scheduling systems to ensure intelligent mapping of resources based on utilization, priority, and network topology.
Define policies for quotas, fair allocation, and resource elasticity in shared environments.
Integrate with AI/ML frameworks (PyTorch, TensorFlow, Triton, etc.) to optimize model training and inference workflows.
Deliver observability and monitoring capabilities that trace resource usage from logical abstractions to hardware.
Partner with platform security teams to strengthen access controls, onboarding processes, and tenant isolation.
Support internal developer adoption of abstraction APIs while maintaining high performance and low overhead.
Contribute to long-term compute platform strategy with a focus on modularity, abstraction, and scale.

Minimum Qualifications:

Bachelor's degree with 15 years of experience, Master's with 12 years, or PhD with 8 years.
Proven track record building production-grade infrastructure systems, preferably in Go, Python, or C++.
Strong experience with containerization and orchestration platforms (Kubernetes, Docker, KubeVirt).
Background in designing logical abstractions for compute, storage, or networking in multi-tenant systems.
Familiarity with integrating machine learning platforms (e.g., PyTorch, TensorFlow, Triton, MLflow).

Preferred Qualifications:

Hands-on experience with GPU sharing, scheduling, or isolation (MIG, MPS, vGPUs, time-slicing, or device plugin models).
Deep knowledge of resource management: quotas, prioritization, fairness, elasticity.
Strong ability to think across hardware/software boundaries and design abstractions that scale.
