Stefanini Group is looking for a Business Analyst for a globally recognized company!
Interested applicants can click the Apply button or reach out to Alfher Hidalgo at (248) 728-2627 for faster processing. Thank you!

Responsibilities:
Drive continuous improvement in software quality and in infrastructure reliability and resilience.
Perform analytics on previous incidents to understand root causes and better predict and prevent future issues.
Create dashboards and reports to communicate key metrics.
Deploy technology to improve the performance, scalability, and stability of systems.
Track performance against SLOs in partnership with monitoring teams or other stakeholders, and ensure systems continue to meet SLOs over time.
Remain current with site reliability engineering methods and trends such as observability-driven development and chaos engineering.
May oversee, design, implement, and manage DevOps capabilities using continuous integration/continuous delivery toolsets and automation.
Collaborate with development teams to promote reliability engineering during all phases of the software development lifecycle, in order to detect and correct performance issues and meet availability goals.
Deliver software to automate manual operational work (i.e., "toil").
Work with stakeholders such as product owners to define service level objectives (SLOs) for system operations, such as mean time to detect (MTTD), mean time to triage (MTTT), mean time to mobilize (MTTM), mean time to acknowledge (MTTA), and mean time to resolve (MTTR).
Participate in operational support, including major incidents (MI) and on-call rotation shifts for supported systems and products.
Conduct blameless postmortems to troubleshoot priority incidents.
Use automation to reduce the probability and/or impact of problem recurrence.
Identify, evaluate, and recommend monitoring tools and diagnostic techniques to improve system observability.
Participate in system design consulting, platform management, capacity planning, and launch reviews.
Collaborate and share lessons learned regarding performance and reliability issues with all stakeholders, including developers, other SMEs, operations teams, and project management teams.
Participate in communities of practice to share knowledge and foster continuous improvement.