LiveOps SRE

What We Do

The Epic LiveOps team provides the best possible experience for our players. We dive deep into the data to understand player needs, minimize disruption, and manage Epic's incident response process.

What You'll Do

You will be the voice of the customer in a wide variety of contexts across Epic's business. You will dive deep into incidents to make sure we provide our players the best possible experience, hold a relentlessly high bar for Epic's service quality, focus the attention of our business and tech teams on the right priorities, and operate Epic's incident management process. When other mechanisms fail, LiveOps is the backstop that ensures Epic operates in the best interest of our players' experience.

In this role, you will

- Respond to alerts and manage issues in the production environment
- Manage the development and operation of our Incident Management Tooling, ensuring robust tooling support for the Incident process
- Produce specifications and determine the operational feasibility of our tooling
- Develop quality standards, documentation, and testing for our tools codebase
- Maintain, improve, troubleshoot, debug, and update our codebases
- Develop automated tooling features to drive incident management improvements and reduce the operational cost of the Incident process
- Work across the stack (backend, frontend, infrastructure, and operations) to test, deploy, and iterate based on stakeholder feedback

What we're looking for

- You thrive on ambiguity. You can understand a diverse set of product features, and you can both identify how an issue impacts a single customer and quantify the business impact. You're capable of identifying larger trends across disparate issues and enabling product teams to solve the real underlying issues
- You have a strong technical basis and know how to learn new things. Strong analysis and problem-solving skills are essential to do this role successfully (we live in Grafana, Tableau, and similar tools)
- You are a problem solver. Experience with AWS and other cloud infrastructure tools will make you comfortable in this role, and the ability to script and automate actions in languages like Python, Ruby, or Go is a bonus
- You have experience working cross-functionally or across a large number of teams in multiple organizations
- You have extensive experience working with and building reliable services on AWS or other major cloud infrastructure providers
- You have a passion for the reliability engineering space

EPIC JOB + EPIC BENEFITS = EPIC LIFE

We pay 100% for benefits except for PMI (for dependents). Our current benefits package includes pension, private medical insurance, health care cash plan, dental insurance, disability and life insurance, critical illness, cycle to work scheme, flu shots, health checks, and meals. We also offer a robust mental well-being program through Modern Health, which provides free therapy and coaching for employees and dependents.