Senior DevOps Engineer (AWS-Focused)
Position Overview
We are seeking a highly experienced Senior DevOps Engineer to design, build, and operate scalable cloud infrastructure with a primary focus on AWS. The ideal candidate has extensive experience managing complex distributed systems, deploying machine learning workloads, implementing secure and automated infrastructure, and supporting multi-service architectures in production environments.
This role requires a strong infrastructure engineering mindset, deep cloud expertise, and the ability to collaborate with software, data, and machine learning teams to deliver highly available and scalable platforms.
Key Responsibilities
Cloud Infrastructure & Platform Engineering
Design, implement, and manage cloud-native infrastructure primarily on AWS.
Build and maintain highly available, fault-tolerant, and scalable production environments.
Develop Infrastructure as Code (IaC) using tools such as Terraform or CloudFormation.
Establish cloud governance, security, networking, and operational best practices.
Optimize infrastructure costs while maintaining performance and reliability.
Kubernetes & Container Orchestration
Design and operate production Kubernetes environments.
Manage containerized applications and platform services across multiple environments.
Implement autoscaling, service discovery, ingress routing, and workload isolation strategies.
Optimize cluster performance, reliability, and resource utilization.
Machine Learning Infrastructure
Deploy, manage, and scale machine learning workloads in production environments.
Support GPU-based and CPU-based workloads for training and inference.
Build deployment pipelines for ML models and AI services.
Collaborate with ML engineers and data scientists to operationalize machine learning systems.
Manage model-serving infrastructure and inference scaling requirements.
Networking & Multi-Service Architecture
Design and maintain complex networking architectures across cloud environments.
Configure and manage:
Load balancers
API gateways
Service meshes
Reverse proxies
Traffic routing policies
Support multi-service and microservice-based platforms.
Implement secure communication between distributed services.
CI/CD & Automation
Build and maintain robust CI/CD pipelines.
Automate infrastructure provisioning, deployments, testing, and operational workflows.
Implement deployment strategies including:
Blue/green deployments
Canary releases
Rolling updates
Improve engineering productivity through platform automation.
Reliability & Observability
Implement monitoring, logging, tracing, and alerting solutions.
Establish SLOs, SLIs, and operational metrics.
Lead incident response and root-cause analysis activities.
Continuously improve platform reliability and operational excellence.
Required Qualifications
Experience
5+ years of DevOps, Platform Engineering, Site Reliability Engineering (SRE), or Infrastructure Engineering experience.
Proven experience operating production environments at scale.
Experience supporting mission-critical systems with high availability requirements.
AWS Expertise
Strong hands-on experience with AWS services including:
EC2
VPC
IAM
Route 53
Application Load Balancer (ALB)
Network Load Balancer (NLB)
ECS and/or EKS
S3
RDS
ElastiCache
CloudWatch
Secrets Manager
Lambda (preferred)
Kubernetes & Containers
Extensive Kubernetes production experience.
Strong understanding of:
Networking
Ingress controllers
Storage management
Cluster operations
Security policies
Advanced Docker experience.
Infrastructure as Code
Experience with:
Terraform (strongly preferred)
CloudFormation
Pulumi (nice to have)
CI/CD
Hands-on experience with one or more of:
GitHub Actions
GitLab CI/CD
Jenkins
ArgoCD
CircleCI
Networking
Strong understanding of:
DNS
TLS/SSL
VPNs
Routing
Reverse proxies
Service-to-service communication
Network security architecture
Preferred Qualifications
Google Cloud Platform Exposure
Experience with:
GKE
Cloud Run
Compute Engine
Cloud Storage
IAM
VPC Networking
Machine Learning Infrastructure
Experience deploying and operating:
ML inference services
GPU workloads
Model-serving platforms
MLOps workflows
Vector databases
LLM applications and AI infrastructure
Observability & Operations
Experience with:
Prometheus
Grafana
OpenTelemetry
ELK/OpenSearch
Datadog
New Relic
Security
Cloud security best practices.
IAM design and access controls.
Secrets management.
Vulnerability management and compliance frameworks.
Required Application Submission
Applicants should include:
Resume/CV
LinkedIn profile (optional)
GitHub profile (if applicable)
Description of the largest production infrastructure they have managed
Summary of Kubernetes and AWS environments they have operated
Details of machine learning or AI workloads they have deployed
Examples of CI/CD pipelines and Infrastructure-as-Code projects they have implemented
Success Criteria
The successful candidate will:
Architect and maintain highly reliable cloud infrastructure.
Independently manage AWS-based production environments.
Deploy and operate complex ML and AI workloads.
Design secure and scalable multi-service routing architectures.
Drive automation, observability, and operational excellence.
Serve as a technical leader for cloud infrastructure and platform engineering initiatives.
Improve deployment velocity, system reliability, and infrastructure scalability across the organization.
Pay: ₹647,045.59 - ₹804,652.19 per year
Benefits:
Health insurance
Paid sick time
Paid time off
Provident Fund
Work Location: In person
Employer: VSynergize Outsourcing Pvt Ltd