Staff Engineer, Reliability Insights & Excellence

Stripe

United States

Remote

Published: today

Other

Who we are

About Stripe

Stripe is a financial infrastructure platform for businesses. Millions of companies - from the world’s largest enterprises to the most ambitious startups - use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Our mission is to increase the GDP of the internet, and we have a staggering amount of work ahead. That means you have an unprecedented opportunity to put the global economy within everyone's reach while doing the most important work of your career.

About the team

Stripe’s infrastructure team powers businesses all over the world. Our customers trust us with their businesses, and every request that Stripe handles is critical. We process billions of dollars every year for millions of users, from the largest enterprises to startups making their first sale. That is why world-class reliability and seamless infrastructure scaling are table stakes for supporting massive economic transactions for our customers.

We are the team leading Stripe’s reliability and scalability efforts, with a focus on delivering world-class availability and certifying Stripe’s systems to handle unprecedented levels of traffic during big events like Black Friday and Cyber Monday, as well as our merchants' key events. You can learn more about our contributions at https://stripe.com/newsroom/news/bfcm2023. We own the core preventative reliability platforms and tools used by infrastructure and product teams across the company to build resiliency into their systems and scale them to handle the projected peak load.

What you’ll do

We’re looking for an experienced distributed systems engineer with outstanding technical and leadership skills, strong collaboration skills, and a deep passion for customers. You’ll help deliver the foundation of our reliability infrastructure and work with teams across the entire stack to deliver world-class reliability solutions. In this role you’ll not only design, implement, and test infrastructure components, but also play an influential role in enabling engineering teams to make their services more reliable by identifying, creating, and deploying engineering practices, processes, and solutions.

You will:

Design, build, and maintain distributed cloud infrastructure and platform services
Debug production issues across services and multiple levels of the stack, and work on the scaling, automation, reliability, and observability of infrastructure services
Mentor other engineers in the organization and review code and design documentation
Participate in roadmap planning and prioritization

Who you are

We're looking for someone who meets the minimum requirements to be considered for the role. If you meet these requirements, you are encouraged to apply. The preferred qualifications are a bonus, not a requirement.

Minimum requirements:

10+ years of engineering experience or equivalent combined work experience reflecting domain expertise
Hands-on experience designing, building, and operating large-scale distributed systems, identifying shortcomings and optimization opportunities, and making data-driven cost-performance tradeoffs to influence design decisions
Demonstrated experience leading initiatives that span multiple teams and leveraging deep domain expertise to influence tech roadmap planning and execution
Demonstrated ability to effectively collaborate across multiple teams and stakeholders to drive business outcomes
Experience mentoring and investing in the development of engineers and peers

Preferred Qualifications:

Genuine interest and/or experience in debugging and troubleshooting complex distributed systems problems
Experience in fault modeling and tolerance, chaos engineering, and load testing
Familiarity with the common patterns and practices for building reliable software
Experience with C, C++, Go, Ruby, and/or Java

This role is available either in an office or a remote location (typically, 35+ miles or 56+ km from a Stripe office).

Office-assigned Stripes spend at least 50% of the time in a given month in their local office or with users. This strikes a balance between bringing people together for in-person collaboration and learning from each other, and supporting flexibility about how to do this in a way that makes sense for individuals and their teams.

A remote location, in most cases, is defined as being 35 miles (56 kilometers) or more from one of our offices. While you would be welcome to come into the office for team/business meetings, on-sites, meet-ups, and events, our expectation is you would regularly work from home rather than a Stripe office. Stripe does not cover the cost of relocating to a remote location. We encourage you to apply for roles that match the location where you currently live or plan to live.

The annual US base salary range for this role is $209,800 - $314,800. For sales roles, the range provided is the role’s On Target Earnings ("OTE") range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role. This salary range may be inclusive of several career levels at Stripe and will be narrowed during the interview process based on a number of factors, including the candidate’s experience, qualifications, and location. Applicants interested in this role who are not located in the US may request the annual salary range for their location during the interview process.

Additional benefits for this role may include: equity, company bonus or sales commissions/bonuses; 401(k) plan; medical, dental, and vision benefits; and wellness stipends.

Office locations

Seattle

Remote locations

Remote in United States

Team

Infrastructure & Corporate Tech

Job type

Full time
