Principal Site Reliability Developer

Responsibilities:
Join an early-stage team that is developing, teaching, implementing, and refining SRE principles
Discover and analyze the current estate of systems, applications, processes, tools, teams, and solutions. Identify strengths, gaps, and areas for improvement in collaboration with TechOps, developer teams, and business stakeholders
Contribute to a set of patterns and blueprints for deploying cloud solutions in a secure, reliable, scalable, cost-effective and fast manner
Define and support the evolution of CI/CD capabilities as we progress to rapid, safe continuous deployment
Embed within TechOps to improve overall security, reliability, performance, scalability, and speed of deployment of platforms and solutions
Support TechOps in Incident Management, primarily through post-mortems, RCA, and post-incident improvements
Support the design, development and deployment of services through activities such as collaborating with developers and architects on system design, reuse of blueprints and frameworks, capacity planning, and readiness reviews
Work across divisional and corporate teams to align strategies, blueprints, and solutions
Collaborate with teams to define, monitor and measure SLIs, SLOs, and SLAs for services, infrastructure, and processes running in production
Support the definition, implementation and refinement of an observability strategy and framework
Work with teams to eliminate toil through automation of infrastructure provisioning, configuration management, deployment, testing, and operation
Work with security teams and developers to shift left, embedding security design principles and capabilities early in the development process
Maintain an excellent understanding of the business’s long-term goals and strategy, ensuring that design, architecture, scale, and availability align with these goals
Research and experiment with emerging technologies and tools related to performance, availability, observability, CI/CD, service design and consumption, microservices, and other SRE-related areas
Collaborate to promote and reinforce disciplined production software engineering processes and best practices

Organization	S&P Global
Industry	Web Development / Design Jobs
Occupational Category	Principal Site Reliability Developer
Job Location	Toronto, Canada
Shift Type	Morning
Job Type	Full Time
Gender	No Preference
Career Level	Intermediate
Experience	2 Years
Posted at	2023-01-19 12:12 pm
Expires on	2025-06-06