Site Reliability Engineer

Description

As a Site Reliability Engineer, you will be part of a team that secures, deploys, manages, and automates our cloud systems. Drawing on your ambition and technical know-how, you will make informed decisions to build solutions, implement new technology, provide training, maintain our infrastructure, and assist our development team.

Responsibilities

Participate in the design, development, implementation, and maintenance of the cloud infrastructure platform alongside your team.
Communicate with different teams to identify system dependencies.
Deploy, configure, and maintain servers, network appliances, and cloud PaaS services.
Ensure that adequate monitoring is in place to detect and escalate security and availability issues.
Ensure industry best practices are applied uniformly across the environment.
Make suggestions to optimize system performance.
Create extensible, automated continuous integration and delivery pipelines to test and release application code and IaC.
Build CI/CD Pipelines as Code.
Troubleshoot and diagnose technical issues.
Maintain and update technical documentation and internal processes.
Follow business processes to document changes.
Be willing to work outside standard operating hours.

Requirements

3+ years of SysOps, SRE or DevOps experience.
Excellent communication skills.
Strong problem-solving skills.
Excellent knowledge of public cloud environments (Azure preferred). AWS and GCP are a plus.
Advanced experience with at least one scripting or programming language (PowerShell, Bash, Python, Ruby, Go, etc.).
Strong pipeline experience (Azure DevOps preferred).
Experience with web traffic load balancing tools such as Azure Application Gateway.
Experience with Infrastructure as Code deployments using declarative languages such as ARM templates and Bicep (preferred). Terraform and Ansible are a plus.
Knowledge of deploying applications written in .NET (desirable).
Experience with monitoring tools (Azure Monitor, Dynatrace, Datadog, etc.).
Knowledge of cloud security best practices to protect data from intrusion, loss, and corruption.
Knowledge of Windows system orchestration.
Knowledge of and experience with Docker and container orchestration platforms such as ECS and Kubernetes.

Experience with New Relic and Packer.
Experience with Windows, Linux, and IIS administration.
Experience with Elasticsearch and Redis administration.
Experience with JMeter performance and load testing.

Organization	Descartes
Industry	Engineering Jobs
Occupational Category	Site Reliability Engineer
Job Location	Toronto, Canada
Shift Type	Morning
Job Type	Full Time
Gender	No Preference
Career Level	Intermediate
Experience	3 Years
Posted at	2024-07-11 7:09 am
Expires on	2025-03-25