SRE (Site Reliability Engineer)

Who Are We?

Obol Labs is a remote-first research and software development team focused on Proof of Stake infrastructure for public blockchain networks.
Specific topics of focus are Internet Bonds, Distributed Validator Technology, and Multi-Operator Validation.
The core team includes 14 members spread across 8 countries.
The core team is building the Obol Network, a protocol to foster trust-minimized staking through multi-operator validation.
This will enable low-trust access to Ethereum staking yield, which can be used as a core building block in various Web3 products.

The Network

The network is best visualized as a work layer that sits directly on top of the base layer consensus.
This work layer is designed to provide the base layer with more resiliency and decentralization as it scales.
In this chapter of Ethereum, we will move on to the next great scaling challenge, which is stake centralization.
Layers like Obol are critical to the long-term viability and resiliency of public networks, especially networks like Ethereum.
Obol as a layer is focused on scaling main chain staking by providing permissionless access to Distributed Validators.
The network utilizes a middleware implementation of Distributed Validator Technology (DVT) to enable the operation of distributed validator clusters that can preserve validators' current client and remote signing configurations.
Similar to how roll-up technology laid the foundation for L2 scaling implementations, we believe DVT will do the same for scaling the consensus layer while preserving decentralization.
Staking infrastructure is entering its protocol phase of evolution, which must include trust-minimized staking networks that can be plugged into at scale.
We believe DVT will evolve into a widely used primitive and will ensure the security, resiliency, and decentralization of public networks.
The Obol Network develops and maintains four core public goods that will eventually work together through circular economics:

    • The DV Launchpad, a user interface for bootstrapping and managing Distributed Validators
    • Charon, a middleware Golang client that enables validators to run in a fault-tolerant, distributed manner
    • Obol Managers, a set of Solidity libraries for the formation of Distributed Validators tailored to different use cases such as DeFi, Liquid Staking, and Fractionalized Deposits
    • Obol Testnets, a set of ongoing public incentivized testnets that enable operators of any size to test their deployment before serving on Ethereum mainnet

Sustainable Public Goods

Obol is inspired by previous work on Ethereum public goods and is experimenting with circular economics.
We believe that to unlock innovation in staking use cases, a credibly neutral layer must exist for innovation to flow and evolve vertically.

Without this layer, highly available uptime will continue to be a moat.

The Obol Network will become an open, community-governed, self-sustaining project over the coming months and years.

Together we will incentivize, build, and maintain distributed validator technology that makes public networks a more secure and resilient foundation to build on top of.

The Platform Engineering team at Obol is looking for a talented and experienced SRE (Site Reliability Engineer) to help us build and support our global infrastructure and operations.
Join our growing organization and you will get a chance to be in the driving seat of innovation and change at Obol.
As a site reliability engineer, you will be responsible for building, monitoring, securing, and ensuring the reliability of our globally distributed infrastructure that supports Obol's network of thousands of Distributed Validator clusters and deployments.


Responsibilities

    • Responsible for infrastructure automation and observability on different Cloud and on-prem
    • and troubleshoot incidents and issues in Obol’s infrastructure and ensure the incident management and post-mortem standard procedures are
    • Collaborate with staking operators to ensure optimal performance of Obol’s DVT and seamless deployments and rollouts, and to identify and fix issues
    • This collaboration includes synchronous communication through calls and asynchronous communication via Discord, Telegram, emails,
    • and enhance the reliability and performance of Obol’s Distributed Validator Client for running Ethereum validators in a fault-tolerant
    • Participate in the engineering on-call rotations to ensure systems uptime and incident resolution
    • Apply platform engineering and cloud-native best practices and standards to the software you write

Requirements

    • At least 2 years of experience in Site Reliability Engineering or a similar role
    • in one of the public cloud platforms GCP, AWS, or
    • in containerization technologies with Docker, Docker-compose, and in developing infrastructure as code with
    • and scripting experience, preferably with bash, Python, or with monitoring using Prometheus, and
    • Experience in Web3 and blockchain technologies such as Ethereum is highly preferred, particularly an understanding of how an Ethereum proof of stake validator works
    • Excellent communication and delivery skills to represent Obol externally by working with enterprise staking node operators deploying our software

Nice to have

    • Experience with Ansible, Helm, and Prometheus Loki
    • Experience in networking and distributed systems
    • Experience working with remote teams

Benefits

    • Work with a team of talented engineers building amazing stuff
    • We are fully remote
    • We care about work-life balance: you are important as a professional and also as a person!
    • Annual global offsite (Cape Town ‘23) 
    • Unlimited paid time off (based on our company policy)
    • Personal hardware & professional training budget

Apply to our role and be part of our growing team!
