SVP, Team Lead, SRE Engineer (Observability Platforms)
At a glance
Highlights
- Enterprise observability platform
- Leadership role shaping SRE strategy
- Integration of agentic AI and RAG
- Collaboration across finance and engineering teams
- Full-time position at a major bank
Why this role might suit you
The role provides leadership over a high-impact observability team, exposure to cutting-edge AI-driven reliability tools, and the opportunity to influence SRE strategy within a leading financial institution.
About the role
As the Team Lead, Site Reliability Engineering (SRE), you will lead a group of technical engineering staff to develop, maintain, and scale enterprise Observability and Command Center platforms and services. This role requires a leader with a deep understanding of SRE principles, observability, and the application of agentic AI, combined with a hands-on approach to problem-solving. The ideal candidate will have a strong background in observability, software engineering, data engineering, analytics, and applied AI, and a proven track record of driving reliability improvements in complex technical environments.
Key Responsibilities:
Leadership & Strategy:
Develop and execute the SRE strategy and roadmap for enhancing Observability capabilities in alignment with the bank's business goals and technological vision.
Lead, mentor, and manage a team of SRE engineers, fostering a culture of collaboration, innovation, and continuous improvement.
Reliability & Performance:
Design, implement, and maintain robust situational awareness, monitoring, and alerting capabilities to ensure high availability and performance of banking services.
Drive the adoption of best practices in system design, capacity planning, and performance optimization.
Identify and mitigate potential risks to system reliability, proactively addressing issues before they impact customers.
Engineering and Services:
Develop and implement enterprise observability and monitoring strategies, platforms and services for the organization.
Develop and implement enterprise command center platforms and services for managing incidents and situational awareness.
Collaborate with cross-functional teams to establish monitoring tools and metrics, ensuring alignment with business objectives and goals.
Automation & Tooling:
Champion automation efforts to streamline operational processes, reduce manual intervention, and increase system efficiency.
Develop and maintain tools and scripts for infrastructure management, deployment, and monitoring.
Collaboration & Communication:
Work closely with the application and infrastructure teams to ensure that reliability is built into the architecture and design of new features and services.
Communicate reliability goals, progress, and challenges to executive leadership and other stakeholders.
Promote a culture of transparency and accountability within the SRE team and across the organization.
Qualifications & Requirements:
Education & Experience:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
Minimum 10 years of experience in software engineering, data engineering, infrastructure management, or a related technical field.
Minimum 5 years of experience in a leadership role within an SRE or DevOps team, preferably in the banking or financial services industry.
Technical Skills:
Proficiency in programming languages such as Python, Java, or similar.
Deep knowledge of monitoring and observability tools (Grafana, ELK stack, etc.).
Experience with and good knowledge of building agentic AI applications, including prompt engineering and Retrieval-Augmented Generation (RAG).
Experience building web and workflow applications.
Strong understanding of containerization technologies (Docker, Kubernetes).
Experience with CI/CD pipelines.
Good understanding of and experience with ITIL processes and best practices.
Other Skills:
Excellent leadership, mentoring, and team-building skills.
Strong problem-solving and analytical abilities.
Effective communication and interpersonal skills, with the ability to convey complex technical concepts to non-technical stakeholders.
Strategic thinking and a proactive approach to identifying and addressing potential issues.
Location:
DBS Asia Hub
Job:
Technology
Schedule:
Regular
Employee Status:
Full time