Site Reliability Engineer

 

Description

As a Site Reliability Engineer, you will be part of a team that secures, deploys, manages, and automates our cloud systems. Drawing on your ambition and technical know-how, you will make informed decisions to build solutions, implement new technology, provide training, maintain our infrastructure, and assist our development team.

Responsibilities

  • Participate in the design, development, implementation, and maintenance of the cloud infrastructure platform alongside your team.
  • Communicate with different teams to identify system dependencies.
  • Deploy, configure, and maintain servers, network appliances, and cloud PaaS services.
  • Ensure that adequate monitoring is in place to detect and escalate security and availability issues.
  • Ensure industry best practices are applied uniformly across the environment.
  • Make suggestions to optimize system performance.
  • Create extensible, automated continuous integration and delivery pipelines to test and release application code and IaC.
  • Build CI/CD pipelines as code.
  • Troubleshoot and diagnose technical issues.
  • Maintain and update technical documentation and internal processes.
  • Follow business processes to document changes.
  • Work outside standard operational hours when required.


Requirements

  • 3+ years of SysOps, SRE or DevOps experience.
  • Excellent communication skills.
  • Strong problem-solving skills.
  • Excellent knowledge of public cloud environments (Azure preferred). AWS and GCP are a plus.
  • Advanced experience with at least one scripting or programming language (PowerShell, Bash, Python, Ruby, Go, etc.).
  • Strong pipeline experience (Azure DevOps preferred).
  • Experience with web traffic load balancing tools such as Azure Application Gateway.
  • Experience with Infrastructure as Code deployments using declarative languages such as ARM templates and Bicep (preferred). Terraform and Ansible are a plus.
  • Knowledge of deploying applications written in .NET (Desirable).
  • Experience with monitoring tools (Azure Monitor, Dynatrace, Datadog, etc.).
  • Knowledge of cloud security best practices to protect data from intrusion, loss, and corruption.
  • Knowledge of Windows system orchestration.
  • Knowledge of and experience with Docker and container orchestration tools such as ECS and Kubernetes.
  • Experience with New Relic and Packer.
  • Experience with Windows, Linux, and IIS administration.
  • Experience with Elasticsearch and Redis administration.
  • Experience with JMeter performance and load testing.

Organization Descartes
Industry Engineering Jobs
Occupational Category Site Reliability Engineer
Job Location Toronto, Canada
Shift Type Morning
Job Type Full Time
Gender No Preference
Career Level Intermediate
Experience 3 Years
Posted at 2024-07-11 7:09 am
Expires on 2024-10-08