Site Reliability Engineer, Compute
Vercel
About Vercel:
Vercel’s Frontend Cloud provides the developer experience and infrastructure to build, scale, and secure a faster, more personalized web. Customers like Under Armour, eBay, The Washington Post, Johnson & Johnson, and Zapier use Vercel to build dynamic user experiences on the web.
At Vercel, our mission is to enable the world to ship the best products, and that goes hand in hand with creating an environment where you can do the best work of your life.
About the Role:
We are looking for experienced SREs to help grow our small team into a global footprint that can provide expert engagement across our core serving systems. As an early member of the SRE team, you will report directly to the Director of Managed Infrastructure and play a foundational role in expanding our SRE practice, integrating reliability principles more deeply into Vercel’s engineering process as we expand.
Within the team, your focus will be on enhancing our Compute infrastructure in close partnership with our EU-based developer team. You will design for reliability and performance while managing risk as we introduce major innovations to our compute stack.
What You Will Do:
Ensure that our products are built for reliability and scale by engaging in the end-to-end design, development, and deployment of new software.
Drive continuous risk mitigation and reduction through direct involvement in incident management, blameless postmortems, and follow-ups.
Drive measurable improvements to the reliability, performance, and efficiency of our production systems through instrumentation, analysis, and implementation of engineering improvements.
Devise repeatable, low-toil operational practices through the development of automated systems for software delivery, system failover, and capacity management.
About You:
At least 3 years of experience in an SRE role, or at least 5 years of experience in an adjacent role (e.g., platform engineering), operating in a scaled environment.
Firm grasp of the SRE philosophy and mindset, with practical experience working on or directly with SRE teams that have proactively engaged in system design and improvement.
Strong sense of accountability and commitment to problem solving, backed by a curiosity to dig deep and identify root causes.
Willingness to proactively engage with development teams to influence the course of software design and operational practices.
Capability to manage risk, make decisions, and exhibit sound judgment.
Demonstrated ability to plan and deliver long-term projects.
Experience with distributed system design.
Experience with containers, virtual machines, and Linux.
Bonus: Experience working with Terraform and/or Golang.
Benefits:
Great compensation package and stock options.
Inclusive Healthcare Package.
Learn and Grow - we provide mentorship and send you to events that help you build your network and skills.
Flexible Time Off - Flexible vacation policy with a recommended 4 weeks per year, plus paid holidays.
Remote Friendly - Work with teammates from different time zones across the globe.
We will provide you with the gear you need to perform your role, and a WFH budget for you to outfit your space as needed.
Vercel is committed to fostering and empowering an inclusive community within our organization. We do not discriminate on the basis of race, religion, color, gender expression or identity, sexual orientation, national origin, citizenship, age, marital status, veteran status, disability status, or any other characteristic protected by law. Vercel encourages everyone to apply for our available positions, even if they don't necessarily check every box on the job description.
Perks:
- Generous Gear Credit
- Flexible Time Off
- Stock Options
- Remote Friendly