The Customer Reliability Engineer (CRE) will work with other Reliability Engineers (RE), Product Managers, and Developers to produce mission-critical infrastructure, tools, performance improvements, actionable and meaningful performance measurements, and communication to stakeholders. The CRE is expected to work with management, peers, and customers to define and implement the technical vision; improve monitoring tools, error detection, and defect elimination; reduce Mean Time to Detection/Resolution; and raise overall service availability and customer satisfaction. The CRE role at Dick’s Sporting Goods (DSG) provides an opportunity to blend system design and software engineering skills with a passion for troubleshooting and defect elimination to address ever-changing applications and environments with scalability and reliability challenges.
Troubleshoot high-severity performance and availability issues in e-commerce, infrastructure, and legacy business applications/websites, and manage the incident lifecycle through to resolution.
Drive root cause analysis and investigations by identifying, analyzing, and remediating service performance and availability issues to ensure maximum service uptime. Conduct blameless post-incident reviews.
Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
Maintain services once they are live by measuring and monitoring availability, latency, and overall system health. You're expected to be on-call, have strong written communication skills, and be able to develop working relationships with coworkers.
Supervise ITSD, Operations and ESOC technicians in service reliability, metrics, sustainability, technical debt, and operational toil for live services running at scale.
Work across multiple project teams simultaneously to support rapid development efforts.
Solve complex, business-critical issues that impact bottom-line financial numbers and customer loyalty/experience.
Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
Contribute positively to open source projects developed by DSG and join existing communities. Navigate this broader ecosystem and structure projects with upstream/downstream opportunities in mind.
Identify and integrate with third-party solutions where it makes the most sense.
Use data to understand the availability, reliability, and sustainability of our software.
Bring experience, pragmatism, empathy, and composure to interactions with teams outside of the RE organization.
Work frequently with Product teams on shared goals and cross-team projects.
Balance planned and reactive work using basic project planning techniques and technical roadmaps.
Work and collaborate across teams such as Application Services, Capacity Planning, Hardware, Network, and Datacenter Operations.
Participate in building advanced tooling for testing, monitoring, administration, and operations of multiple clusters across multiple environments.
Experience negotiating service level indicators (SLIs), service level objectives (SLOs), and service level agreements (SLAs) with product owners.
Our teammates know that there is an athlete behind every in-store and eCommerce transaction. We go beyond the expected to build technology that makes the DICK’S Sporting Goods experience innovative and hassle-free.
COMMITTED TO INCLUSION & DIVERSITY.
We actively seek to create an inclusive and diverse workforce, reflecting the communities we serve. Doing so strengthens our ability to serve all our athletes and drive innovation and growth.
HAVE A PASSION FOR SPORTS.
We believe that sports make people better and we’re determined to be the best sports company in the world. Whether you’re an athlete or sports enthusiast, we bring our passion for the game into everything we do.
GET BETTER EVERY DAY.
The journey is never over. We know that to be the best, we must get a little better each day. We focus on delivering 1% more in everything we do.