Sr Site Reliability Engineer (SRE) [REMOTE]

Posted 3 years ago

Braintrust

JOB TYPE: Contract Position (no agencies/C2C – see notes below)
LOCATION: Remote – North America (US and Canada only) (Time Zone: partial overlap with EST)
HOURLY RANGE: Our client is looking to pay $100 – $125/hr
ESTIMATED DURATION: 40hr/week – Long-term

THE OPPORTUNITY

Our client is looking for a Senior Site Reliability Engineer who will interface with senior management, Platform Engineering, QA, and the Precisely development teams to continuously improve the stability, reliability, and efficiency of their global SaaS platform.
This individual will work with the team to architect and deploy the tools and systems that will make their production environment more resilient, so that the team can respond to and resolve incidents effectively.

YOUR RESPONSIBILITIES

Partner closely with SaaS Development, Pipeline Engineering, and Platform Engineering teams to ensure that SRE is an integral part of the continuous delivery model for SaaS applications
Build the tooling and automation needed for our client to manage their cloud-native infrastructure in a reliable, maintainable, observable, and secure way
Build the infrastructure needed to host the client's cloud-native SaaS solution
Follow a true one-team culture across globally distributed teams
Follow a 24×7 incident response process that meets the SLAs for SaaS products through efficient alerting, playbook documentation, and blameless postmortems
Build relationships across product management, development, and support organizations to socialize the culture of SRE
Drive a culture of observability throughout the SaaS development organization
Lead prioritization of reliability features and contribute to the design, development, and delivery of effective tooling, alerts, and automated responses to identify and address reliability risks
Ensure appropriate cloud security tooling is planned for and implemented in the production environment
Regularly defend the quality, scalability, and reliability of the production SaaS environment

REQUIREMENTS

At least 5 years of experience in a global multi-tenant production environment
Hands-on skills with Kubernetes, AWS/GCP/Azure, and Terraform/CloudFormation/Ansible
Strong knowledge of Linux fundamentals and experience troubleshooting production issues
Experience working in a 24×7 production environment
Strong understanding of SRE and general SaaS service management principles
Past experience working with SRE teams and handling on-call coordination challenges
Strong collaboration, communication and interpersonal skills
The ability to operate calmly in challenging and stressful situations
A deep understanding of Kubernetes and cloud networking, or previous experience in infrastructure
Exposure to any programming language (Go/Python/C/C++) is a big plus
