Senior DevOps Engineer (Network Specialist)

Overview

About Us

BitMEX is one of the first crypto exchanges for derivatives.
It exists to provide institutional and professional traders with a platform that caters to their needs.

BitMEX created the Perpetual Swap, the most popular crypto trading product in history.
It is the only major global exchange that continues to create new cryptocurrency derivative products, most recently the ETH Staking Swap.
 

On BitMEX, users can trade cryptocurrency derivatives on a professional trading platform that provides low latency, deep liquidity and constant availability.
Since 2014, no cryptocurrency has been lost through intrusion or hacking, allowing BitMEX users to trade safely in the knowledge that their funds are secure.
Spot trading is also supported, as well as the purchase and conversion of cryptocurrency. BitMEX supports 45+ derivatives contracts, 11 pairs for spot trading and the ability for users to convert 30+ cryptocurrencies.
 

We are looking for individuals who are determined, responsible and collaborative to join BitMEX as we continue to build a thriving cryptocurrency ecosystem.
We value attention to detail, speed and simplicity.
As a global business operating a 24/7 exchange, we seek out those who are adaptable and can work across markets and timezones.
 


Role Overview

As a member of the Platform Engineering team, you will be responsible for managing and supporting the infrastructure that drives our platform.
The reliability and scalability of our technology are key to our success, and this position will work with our development and security teams to help design highly available and fault-tolerant systems.

In particular, you will focus on monitoring and optimizing our network performance to support the low-latency, high-throughput operation of our trading exchange.

Key Responsibilities

  • Continuously improve the resiliency, throughput and latency profiles of our trading systems, by working hand-in-hand with our trading technology teams
  • Manage and support our AWS cloud infrastructure, EC2 instances and physical servers
  • Develop and manage IaC to ensure consistency of our infrastructure
  • Ensure security hardening of our OS builds and configurations
  • Manage and maintain config management tooling to ensure consistency
  • Integrate our stack with Kubernetes
  • Ensure SRE best practices for design and operation of the stack
  • Design, implement and test disaster recovery capabilities to ensure our business can continue to operate in the event of a technology failure
  • Participate in an on-call rota for escalations

Qualifications

  • Theoretical and practical networking knowledge, including but not limited to unicast and multicast routing protocols, the Linux kernel's TCP stack implementation, congestion avoidance/control (BBR), traffic control, network simulation, AWS VPC / TGW and Kubernetes VPC CNI.
    DPDK experience is a plus.
  • Professional experience with kernel troubleshooting: strace, bpftrace, perf profiling/tracing, navigating / reading / building the relevant kernel code
  • Professional experience with userland monitoring (Thanos/Prometheus/Alertmanager), logging (Splunk/Loki), alerting, troubleshooting, profiling/tracing, etc.
  • Strong practical AWS knowledge, with a minimum of 5 years of SRE / DevOps experience supporting and managing Linux-based systems.
    A computer science or engineering degree is preferred; a strong understanding of fundamental computer science principles is required.
  • Familiarity with Kubernetes / Ansible / Chef, and with one or more programming languages: Python, Golang, C, NodeJS
