Site Reliability Engineering, Sr Staff

Bengaluru, Karnataka, India

We Are

Synopsys is the leader in engineering solutions from silicon to systems, enabling customers to rapidly innovate AI-powered products. We deliver industry-leading silicon design, IP, simulation and analysis solutions, and design services. We partner closely with our customers across a wide range of industries to maximize their R&D capability and productivity, powering innovation today that ignites the ingenuity of tomorrow.

You Are

You have spent years keeping complex systems running when everyone else is asleep, and you have learned that reliability is not about firefighting; it is about building systems that do not catch fire in the first place. You know the difference between a metric that tells you something broke and one that tells you something is about to break, and you have strong opinions about which matters more.

You are comfortable working across the stack, from Kubernetes clusters to cloud infrastructure to the Python scripts that tie it all together. At Synopsys, you will work on platforms that support semiconductor design tools used globally, and the reliability work you do will matter to thousands of engineers every day.

What You'll Be Doing

Own the reliability and availability of on-premises and SaaS systems that support Synopsys engineering platforms, ensuring they perform as expected under real-world load
Design, build, and operate observable and self-healing services using OpenTelemetry (OTel), the Elastic Stack, Grafana, and Beats to reduce mean time to detection (MTTD) and mean time to resolution (MTTR) across deployed environments
Define and maintain SLIs, SLOs, and error budgets for platform teams, translating reliability goals into measurable outcomes that drive prioritization
Deploy and manage infrastructure on Azure using Terraform, Kubernetes, Helm, Docker, and Azure native services including AKS, Azure Monitor, Key Vault, and networking components
Evaluate and integrate new observability, automation, and cloud technologies to improve system resilience and operational efficiency
Partner with engineering teams to recommend architecture improvements and process changes that reduce toil and increase platform stability
Serve as the subject matter expert in observability tooling and incident resolution, mentoring teams on best practices for monitoring, alerting, and root cause analysis

The Impact You Will Have

Reduce mean time to detection and resolution across critical platforms, giving engineering teams more time to build and less time firefighting
Build self-healing capabilities into services that eliminate entire classes of recurring incidents and reduce on-call burden
Establish an SLO-driven reliability culture that helps teams make informed tradeoffs between feature velocity and system stability
Enable faster, safer deployments by improving observability and automation across cloud and on-premises infrastructure
Drive architectural decisions that prevent outages before they happen rather than merely responding to them after the fact
Create runbooks, dashboards, and tooling that make the next engineer more effective and the next incident less painful
Influence platform roadmaps by surfacing reliability gaps and recommending technology investments that improve long-term operational health

What You'll Need

7+ years of hands-on experience as a Site Reliability Engineer or in a similar platform reliability role
Strong proficiency in Python and TypeScript, including data structures, algorithms, object-oriented programming, and design patterns
Deep expertise in deploying and managing infrastructure on Azure, including AKS, Azure Monitor, Virtual Networks, Azure SQL, Cosmos DB, Key Vault, and Azure AD (now Microsoft Entra ID)
Hands-on experience with GitHub, Helm, Docker, Kubernetes, and Terraform in production environments
Proven ability to build and maintain observability pipelines using tools like OpenTelemetry, the Elastic Stack, Grafana, and Beats
Working knowledge of ITIL and Agile processes, and experience with ITSM tools like ServiceNow or Rootly
PowerShell programming experience is a plus, as is familiarity with GCP, AWS, Azure Machine Learning, or Azure OpenAI services
A proactive approach to incident response and a willingness to work on-call to support business-critical services

Who You Are

You can walk into a production incident, cut through the noise, and identify the actual problem while others are still gathering logs
You push back when a team asks for a new feature without considering the operational cost or the reliability impact
You can explain the tradeoff between reliability and velocity to a product manager in two sentences without losing the nuance
You stay current on cloud and observability tooling not because it is trendy, but because you care about solving problems better

The Team You'll Be Part Of

You will join a rapidly growing cloud development and SRE team focused on delivering state-of-the-art cloud solutions. The team is scaling to meet increasing demand and building the next generation of reliability and automation capabilities across Synopsys SaaS products.

Rewards and Benefits

We offer a comprehensive range of health, wellness, and financial benefits to cater to your needs. Our total rewards include both monetary and non-monetary offerings. Your recruiter will provide more details about the salary range and benefits during the hiring process.

At Synopsys, we want talented people of every background to feel valued and supported to do their best work. Synopsys considers all applicants for employment without regard to race, color, religion, national origin, gender, sexual orientation, age, military veteran status, or disability.
