Deploy, configure, and manage workloads on GPU-enabled hardware.
Design, implement, and maintain systems that handle massive data volumes for both training and low-latency inference.
Provision GPU-enabled infrastructure and specialized hardware to host and run large language models (LLMs).
Leverage orchestration tools (e.g., Kubernetes) and Infrastructure as Code (IaC) practices to automate and scale infrastructure (a minimal GPU scheduling sketch follows this list).
Monitor, troubleshoot, and optimize the performance of GPU-enabled environments (see the telemetry sketch after this list).
Collaborate with cross-functional teams to ensure reliability, efficiency, and scalability of systems.
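To make the Kubernetes responsibility concrete, here is a minimal sketch of scheduling a workload onto a GPU node using the official `kubernetes` Python client: the pod requests the `nvidia.com/gpu` resource exposed by NVIDIA's device plugin. The pod name and container image are placeholders, and the cluster is assumed to already have the device plugin installed.

```python
from kubernetes import client, config

# Load credentials from the default kubeconfig (~/.kube/config).
config.load_kube_config()

# Pod spec requesting a single NVIDIA GPU via the device-plugin
# resource name "nvidia.com/gpu". Image and pod name are illustrative.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="llm-inference-demo"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="inference",
                image="nvcr.io/nvidia/pytorch:24.01-py3",  # hypothetical image tag
                command=["nvidia-smi"],  # smoke test: confirm the GPU is visible
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

In practice this manifest would live in version-controlled YAML or an IaC tool rather than imperative client calls; the Python form is used here only to keep the sketch self-contained.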
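For the monitoring responsibility, a minimal telemetry sketch using NVIDIA's NVML bindings (the `pynvml` package, an assumption about the tooling in use) samples utilization and memory for every GPU on a host. A production setup would feed such samples to an exporter such as Prometheus rather than printing them.

```python
import pynvml

# Initialize NVML and sample basic utilization/memory telemetry
# for every GPU visible on the host.
pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        if isinstance(name, bytes):  # older pynvml versions return bytes
            name = name.decode()
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(
            f"GPU {i} ({name}): "
            f"sm={util.gpu}% mem_bw={util.memory}% "
            f"vram={mem.used / 2**30:.1f}/{mem.total / 2**30:.1f} GiB"
        )
finally:
    pynvml.nvmlShutdown()
```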
3–5 years of professional experience in Linux systems engineering or related roles.
Hands-on experience with GPU-enabled infrastructure and workload management.
Proven ability to design and manage systems handling massive data volumes.
Strong background in hosting and running machine learning/LLM workloads.
Experience with Kubernetes for orchestration and with IaC tooling.