LIVEOPS

What We Do
The Epic LiveOps team provides the best possible experience for our players. We dive deep into the data to understand player needs, minimize disruption, and manage Epic's incident response process.

What You'll Do
You will be the voice of the customer in a wide variety of contexts across Epic's business. You will dive deep into incidents to make sure we are providing our players the best possible experience, hold a relentlessly high bar for Epic's service quality, focus the attention of our business and tech teams on the right priorities, and operate Epic's incident management process. When other mechanisms fail, LiveOps is the backstop that ensures that Epic operates in the best interest of our players' experience.

In this role, you will
Respond to alerts and manage issues in the production environment
Our Site Reliability Engineers manage the development and operation of our Incident Management Tooling, ensuring robust tooling support for the Incident process
Produce specifications and determine the operational feasibility of our tooling
Develop quality standards, documentation, and testing for our tools codebase
Maintain, improve, troubleshoot, debug, and update our codebases
Develop automated tooling features to drive incident management improvements and reduce the operational cost of the Incident process
Work across the stack (backend, frontend, infrastructure, and operations) to test, deploy, and iterate based on stakeholder feedback

What we're looking for
You thrive on ambiguity. You can understand a diverse set of product features and both identify how an issue impacts a single customer and quantify the business impact. You're capable of identifying larger trends surrounding disparate issues, enabling product teams to solve the real underlying issues
You have a strong technical basis and know how to learn new things.
Strong analysis and problem-solving skills are essential to do this role successfully (we live in Grafana, Tableau, and similar tools)
You are a problem solver. Experience with AWS and other cloud infrastructure tools will make you comfortable in this role, and the ability to script and automate actions in languages like Python, Ruby, or Go is a bonus
You have experience working cross-functionally or across a large number of teams in multiple organizations
You have extensive experience working with and building reliable services on AWS or other major cloud infrastructure providers
You have a passion for the reliability engineering space

EPIC JOB + EPIC BENEFITS = EPIC LIFE
Our intent is to cover all things that are medically necessary and improve the quality of life. We pay 100% of the premiums for both you and your dependents. Our coverage includes Medical, Dental, a Vision HRA, Long Term Disability, Life Insurance, and a 401k with competitive match. We also offer a robust mental well-being program through Modern Health, which provides free therapy and coaching for employees and dependents. Throughout the year we celebrate our employees with events and company-wide paid breaks. We offer unlimited PTO and sick time and recognize individuals for 7 years of employment with a paid sabbatical.