This critical Service Reliability Engineering, Data Analytics Engineer - Platform role on the Pluto TV team focuses on service performance reporting and data visualization, including service monitoring, alerting, and tracking service performance metrics across our tech stacks. The Engineer will dig into data that exposes insights into reliability metrics across multiple platforms and backend services, analyzing it to identify patterns, uncover optimizations, and build an understanding of how services perform individually and in combination with other services. The Engineer will use various measurement platforms to track performance, anomalies, and deviations caused by new product feature releases, traffic fluctuations (requests per second) driven by daily and seasonal patterns, and service and Lambda performance and timing patterns. The Engineer will partner closely with other teams to identify service performance enhancements and optimizations, and to refine data modeling across all backend services, testing platforms, data visualization, and data architecture. The Engineer will also help lead the response during critical incidents, enabling the Production Operations group to increase reliability across various tools and technologies. We're looking for a star Engineer for our Service/Site Reliability group who is proficient in backend systems, confident with high-volume, high-performance Node.js servers, and deeply knowledgeable about AWS. This person will dive into data modeling to advise on short- and long-term fixes for code and infrastructure, and will implement proactive monitoring and alerting solutions.
This is a critical role with a wide range of responsibilities, including:
We believe the right individual will have the following skills and experience in order to be successful in the role: