Site Reliability Engineer
About the role
Location: Phoenix, AZ (Hybrid)
We are seeking a hands-on Site Reliability Engineer (SRE) to help build, maintain, and improve highly available, scalable, and resilient production systems. This role will partner closely with engineering, infrastructure, and operations teams to ensure platform reliability, automation, and operational excellence across cloud environments.
Responsibilities
Design, implement, and support scalable, fault-tolerant cloud infrastructure.
Monitor system performance, availability, and reliability across production environments.
Automate operational processes and improve deployment efficiency through CI/CD pipelines.
Define and manage SLOs, SLIs, and error budgets.
Lead incident response, root cause analysis, and post-incident improvements.
Collaborate with development teams to improve application reliability and system performance.
Build and maintain monitoring, logging, and alerting solutions.
Continuously optimize infrastructure, tooling, and operational workflows.
Required Skills & Qualifications
5+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering.
Strong experience with cloud platforms such as AWS, Azure, or GCP.
Proficiency with scripting/programming languages like Python, Bash, or Go.
Experience with Infrastructure as Code tools such as Terraform or CloudFormation.
Hands-on experience with CI/CD tools and automation pipelines.
Strong knowledge of Linux systems administration and networking fundamentals.
Experience with monitoring and observability tools such as Prometheus, Grafana, Datadog, or New Relic.
Excellent troubleshooting, analytical, and communication skills.
Preferred Qualifications
Experience with Kubernetes and containerized environments.
Familiarity with GitOps, service mesh, and modern deployment strategies.
Understanding of security best practices and compliance frameworks.
Work Environment
Onsite work model based in Phoenix, AZ.
Collaborative and fast-paced engineering environment.
Opportunity to work on large-scale, mission-critical systems.
Questions about this role
How do I apply to this Site Reliability Engineer role at Mastech Digital?
Click "Apply with AI Applyd" above. We auto-fill the application from your resume and answer screening questions in seconds. No copy and paste, no juggling tabs.
What's the typical salary for DevOps / SRE in the United States?
Compensation for DevOps / SRE roles in the United States varies widely by seniority, employer size, and remote vs. onsite arrangement. Check the salary range on this listing when published, or browse our DevOps / SRE hub for United States medians across recent openings.