Site Reliability Engineer

Employment Type: Full-Time

Industry: Miscellaneous




Who we're looking for
Site Reliability Engineers specialize in developing scalable methods for building, deploying, and supporting our client's cloud-based enterprise services and systems. This is a highly collaborative role in which you will work closely with the client's CTO and software developers to deploy and operate solutions, automate and streamline processes, build and maintain deployment tools, monitor IT operations, and troubleshoot and resolve issues in the dev, test, and production environments. You will interact, develop, engineer, and communicate at the highest technical levels of organizational decision-making.

Responsibilities

  • Design, deploy, maintain, and administer cloud infrastructure to power the company's stack
  • Optimize stack for performance and fault tolerance
  • Design and implement security policies and procedures
  • Build logging, monitoring, and alerting systems to identify bottlenecks and assist with debugging, analysis, and optimization
  • Experiment with and recommend new technologies that simplify or improve the company's stack
  • Participate in an off-hours on-call rotation, and perform periodic off-hours work during maintenance windows

Requirements

  • Fluency in Python and shell scripting, with experience implementing automation and monitoring using these and related tools
  • Experience working with configuration management frameworks (ideally Ansible)
  • Knowledge of Linux (ideally Debian/Ubuntu) architecture, administration, performance monitoring/tuning, troubleshooting, and production operations
  • System monitoring experience (ideally Zabbix, CloudWatch and PagerDuty)
  • Securing networks, servers, and applications
  • Understanding of TCP/IP, DNS, network routing, and subnetting
  • Configuring and managing cloud infrastructure (ideally AWS)
  • Managing and tuning database clusters (ideally Elasticsearch, MySQL, and Aerospike)
  • Experience with an always-on and high-volume web server stack
  • Experience with containerization, orchestration, and infrastructure-as-code tools (Docker, Kubernetes, Terraform)
  • Experience with building out a CI/CD pipeline
  • BS/MS in Computer Science or a related technical field, or equivalent relevant work experience



Benefits

  • Comprehensive benefits, including medical, 401(k), and more.
