Job Description
About the Opportunity
My client is seeking a Technical Project Manager (TPM) to oversee the delivery of an Enterprise AI Platform that extends across Azure Cloud, on-premises data centers, and edge environments. This role entails managing cross-functional engineering programs related to infrastructure, DevSecOps, and MLOps teams to ensure efficient deployment, governance, and optimization of AI/ML workloads in hybrid environments. The position requires strong technical proficiency in cloud platforms, particularly Microsoft Azure, along with robust program management skills.
Responsibilities
- Lead planning and execution of large-scale AI infrastructure and platform initiatives across Azure and on-prem environments.
- Translate business and architecture goals into actionable project plans with measurable milestones and success criteria.
- Manage dependencies between infrastructure, MLOps, and DevSecOps workstreams.
- Oversee sprint planning, backlog prioritization, and release schedules using Azure DevOps Boards or Jira.
- Drive delivery of hybrid workloads — spanning compute, storage, and GPU-accelerated clusters — while ensuring performance, security, and compliance.
- Maintain unified visibility of project health, risks, and budget across all streams.
- Collaborate with engineering leads to track design and delivery of cloud resource provisioning and hybrid connectivity.
- Coordinate integration with Azure services such as Azure Arc, Azure Policy, Azure Monitor, and Key Vault.
- Validate infrastructure readiness for GPU and high-performance computing workloads.
- Review architecture documentation and ensure consistency across cloud and on-prem environments.
- Define KPIs and OKRs for platform delivery — uptime, deployment velocity, cost efficiency, and governance adherence.
- Ensure compliance with Azure governance frameworks (e.g., policies, role-based access control, and tagging).
- Manage platform budgets and reporting using Azure Cost Management and other observability tools.
- Serve as the bridge between technical teams, business leadership, and external vendors.
- Communicate technical status, priorities, and risks in clear, outcome-focused terms.
- Manage relationships with cloud providers, GPU vendors, and implementation partners.
- Coordinate budgeting, licensing, and resource forecasting with operations and finance teams.
- Promote Agile and DevOps practices across engineering teams.
- Drive post-project reviews and implement lessons learned into future platform releases.
- Mentor technical project coordinators and engineers on structured delivery methodologies.
About You
Required Knowledge & Skills:
- Strong understanding of Microsoft Azure architecture and services, including Azure Machine Learning, Azure DevOps, Azure Arc, Azure Policy, Azure Monitor, and Key Vault.
- Experience managing hybrid connectivity (VPN, ExpressRoute, private endpoints).
- Broad knowledge of cloud computing, hybrid architectures, and on-prem data center integration.
- Familiarity with MLOps/DevOps practices — CI/CD pipelines, infrastructure as code, and automated testing.
- Exposure to GPU-based or high-performance computing (HPC) environments.
- Ability to interpret architectural documentation and convert strategy into delivery roadmaps.
- Strong background in project governance, stakeholder communication, and risk management.
- Proficiency in Agile/Scrum project tools such as Azure DevOps Boards, Jira, or equivalent.
Experience & Background:
- 8–12 years of experience in technology project or program management, with exposure to AI/ML platform delivery.
- Proven success managing hybrid or cloud-native projects, ideally leveraging Microsoft Azure.
- Experience leading cross-functional engineering teams across infrastructure, data, and platform domains.
- Knowledge of CI/CD, automation, and DevSecOps governance in enterprise environments.
- Certifications preferred: PMP, PMI-ACP, Azure Fundamentals / Azure Administrator, or SAFe Agilist.