Reliability Engineering Lead
Hyderabad, India | Permanent | Posted on Dec. 12, 2025 | Application deadline: Dec. 27, 2025
Position Title: Reliability Engineering Lead
Location: Hyderabad
Role Description (Process-First Responsibilities)
1. Service Level Management & Reliability Framework
Process Owner: SLO-driven reliability decision making across digital services
Establish SLO Foundation: Define, implement, and maintain Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for critical services, ensuring alignment with business impact and patient safety requirements
Error Budget Management: Implement error budget policies that balance feature velocity with reliability, using budget consumption as the primary decision-making tool for release management and incident prioritization
Reliability Governance: Create and maintain reliability standards that comply with GxP, SOX, and other pharmaceutical regulatory frameworks while enabling innovation velocity
Business Impact Correlation: Translate technical reliability metrics into business language, demonstrating clear connections between SLO compliance and revenue, patient safety, or operational efficiency
2. Incident Management & Learning Culture
Process Owner: Blameless incident response and organizational learning
Incident Command: Lead critical incident response using structured protocols, focusing on rapid detection, mitigation, and recovery while maintaining detailed audit trails for regulatory compliance
Blameless Postmortem Leadership: Facilitate blameless postmortems that focus on system improvements rather than individual accountability, creating a culture of psychological safety for honest analysis
Learning Repository Management: Maintain and curate incident learning repositories with transparent sharing across digital units, enabling pattern recognition and systemic improvement
Predictive Issue Prevention: Implement proactive monitoring and alerting systems that identify potential failures before they impact users, shifting from reactive to preventive operations
3. Toil Elimination & Engineering Balance
Process Owner: Systematic automation of operational overhead
Toil Measurement & Reduction: Keep operational work (toil) below 50% of working time through systematic identification, measurement, and elimination of manual, repetitive tasks
Automation Strategy: Design and implement automation solutions using cost-benefit analysis, prioritizing work that scales linearly with service growth and requires minimal human judgment
Engineering Project Delivery: Dedicate at least 50% of working time to engineering projects that improve reliability, performance, or developer experience, delivering measurable improvements quarterly
Knowledge Transfer: Create self-service documentation, runbooks, and automation tools that reduce dependency on human intervention and enable team scaling
4. Platform Engineering Integration & AI Enablement
Process Owner: Reliability integration in AI-first platform services
AI Workload Reliability: Design and implement reliability practices for AI/ML workloads, including agent-to-agent communication systems, model serving infrastructure, and data pipeline reliability
Platform Collaboration: Partner with platform teams to embed reliability principles into Internal Developer Platforms (IDPs), enabling self-service infrastructure with built-in reliability guardrails
Agentic System Support: Provide reliability engineering expertise for Sanofi's agentic AI ecosystem, ensuring conversational AI systems meet enterprise reliability and compliance standards
Developer Experience Enhancement: Contribute to CI/CD pipeline reliability, infrastructure-as-code best practices, and observability integration that accelerates developer productivity
5. Observability & Performance Engineering
Process Owner: Comprehensive system visibility and performance optimization
Full-Stack Observability: Implement and maintain observability platforms covering metrics, logs, traces, and business KPIs, providing end-to-end visibility into service health and user experience
Performance Optimization: Conduct systematic performance engineering including capacity planning, bottleneck identification, and scalability improvements aligned with business growth projections
Intelligent Monitoring: Deploy AI-powered monitoring and alerting systems that reduce noise, provide intelligent root cause analysis, and enable predictive maintenance
Cross-System Correlation: Establish monitoring federation across diverse technology stacks (cloud, on-premises, legacy) while maintaining regulatory audit trails
6. Security & Compliance Integration
Process Owner: Reliability practices within regulatory frameworks
Secure Reliability Engineering: Implement reliability practices that enhance rather than compromise security posture, integrating DevSecOps principles with pharmaceutical compliance requirements
Compliance Automation: Automate compliance checks, audit trail generation, and regulatory reporting while maintaining system reliability and performance
Risk Assessment Integration: Conduct reliability impact assessments for changes affecting GxP systems, balancing innovation speed with regulatory validation requirements
Disaster Recovery: Design and test disaster recovery procedures that meet both technical recovery objectives and regulatory continuity requirements
7. Team Leadership
Process Owner: Represent the reliability engineering discipline
Team Development: Develop a team of SREs who can work independently across the key SRE principles
Communication: Provide crisp and strategic updates to the leadership team
Lead by Example: Demonstrate expertise by taking on complex scenarios and providing innovative solutions that can be leveraged by the team, documented for knowledge sharing, and scaled across the organization to drive systematic reliability improvements
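Pursue Progress. Discover Extraordinary.
Progress requires every one of us. Whatever our background, location, or profession, we share one desire: to make miracles happen. You can be one of us. We constantly pursue change, embrace new ideas, and explore every opportunity we can offer. Let's pursue progress together and discover extraordinary together.
At Sanofi, we provide equal opportunities to all, regardless of race, colour, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, or gender identity.
Watch "One Day at Sanofi" and see Sanofi's diversity, equity, and inclusion initiatives on the official website (sanofi.com)!
Shared Centers
From Bogotá to Budapest, from Kuala Lumpur to Hyderabad, your commitment is felt everywhere on our map. If you choose to pursue your ambitions at a shared center, you will be at the heart of global change. We face challenges without fear, working side by side to shorten the time it takes for new medicines to reach patients. You will bring your full creativity and be uniquely yourself, helping others live healthy lives. Let's explore advanced science and technology and change more lives.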