Site Reliability Engineer at JFrog in Tel Aviv, IL

Skills

Kubernetes, Prometheus, LangChain, Jenkins, Grafana, GitHub, Python, Azure, CI/CD, Google Cloud, AWS, Go

About the role

At JFrog, we’re reinventing DevOps to help the world’s greatest companies innovate -- and we want you along for the ride. This is a special place with a unique combination of brilliance, spirit, and just all-around great people. If you’re willing to do more, your career can take off. And since software plays a central role in everyone’s lives, you’ll be part of an important mission. Thousands of customers, including the majority of the Fortune 100, trust JFrog to manage, accelerate, and secure their software delivery from code to production -- a concept we call “liquid software.” Wouldn't it be amazing if you could join us on our journey?

We’re hiring an SRE to help improve the availability, performance, scalability, and operational excellence of our SaaS environments. You’ll work closely with Engineering and Cloud teams to automate operations, grow JFrog’s large-scale, multi-cloud, Kubernetes-based SaaS environments, strengthen observability, and improve incident response using modern SRE practices (SLOs/SLIs, error budgets, postmortems). This role is hands-on, collaborative, and impact-focused. If you're eager to make a significant impact in a fast-paced, high-growth environment, we encourage you to apply.

As a Site Reliability Engineer at JFrog, you will…

Support the reliability, availability, performance, and scalability of JFrog’s large-scale, multi-cloud, Kubernetes-based SaaS environments

Investigate and troubleshoot production issues across distributed systems, infrastructure, Kubernetes, and cloud environments in close collaboration with Engineering teams

Design and develop backend services, internal platforms, and production engineering tools using Python, Go, or similar technologies

Improve reliability, observability, and operational readiness through SRE practices, monitoring and alerting, capacity awareness, postmortems, and safer CI/CD and production change processes

Evaluate and contribute to AI-assisted and agentic automation solutions that improve operational efficiency, troubleshooting, and production workflows

Support resilience initiatives, including disaster recovery validation, service readiness, health checks, and production readiness reviews

Participate in on-call rotations, lead incident response when needed, and drive follow-up actions to prevent recurrence

Continuously learn and evaluate new technologies that can improve reliability, automation, and operational excellence

To be a Site Reliability Engineer at JFrog, you need…

2-4 years of experience in SRE, Production Engineering, DevOps, or a similar role with hands-on production exposure

Strong troubleshooting and analytical skills, with the ability to investigate production issues in a structured and methodical way

Hands-on experience with Kubernetes-based containerized workloads

Experience with at least one public cloud provider: AWS, GCP, or Azure

Experience developing backend services, internal platforms, automation, or production engineering tools using Python, Go, or another programming language

Practical understanding of Linux fundamentals, networking concepts, HTTP, DNS, service connectivity, and production troubleshooting

Familiarity with CI/CD tools such as Jenkins, ArgoCD, GitHub Actions, or similar

Exposure to observability tools covering metrics, logs, and traces, such as Prometheus, Grafana, Coralogix, New Relic, or similar platforms

Understanding of incident management processes, alerting systems, and production support workflows

Ability to learn quickly, take ownership, communicate clearly, and work well in a collaborative production environment

Experience using AI-assisted operational workflows such as log analysis, incident summarization, triage support, or troubleshooting – an advantage

Familiarity with agentic automation frameworks such as LangGraph, LangChain, CrewAI, or similar – an advantage

Experience using AI-assisted development tools such as Cursor, Claude Code, GitHub Copilot, ChatGPT, or similar tools – an advantage
