Senior Site Reliability Engineer
Xero
Software Engineering
San Mateo, CA, USA
Posted on Jan 14, 2025
Xero is a beautiful, easy-to-use platform that helps small businesses and their accounting and bookkeeping advisors grow and thrive.
At Xero, our purpose is to make life better for people in small business, their advisors, and communities around the world. This purpose sits at the centre of everything we do. We support our people to do the best work of their lives so that they can help small businesses succeed through better tools, information and connections. Because when they succeed they make a difference, and when millions of small businesses are making a difference, the world is a more beautiful place.
At Xero, we’re here to make running a business beautiful. By making small businesses more efficient every day, connecting them with big business technology and empowering a community behind them, their potential is limitless. When that happens, we’re not only helping small businesses, we’ll be building a stronger economy that can change the world.
What You'll Do:
- As a senior member of the Product SRE team, you will assume responsibility for the deliverables of the embedded product SRE for your assigned product. The team will consist of dedicated world class SRE engineers who are embedded into product teams to drive enduring reliability, world class Observability and high performing services. The lead senior engineer will provide technical proficiency to ensure the teams can drive reliability across the product landscape.
- This position requires a highly technical senior engineer with a strong Engineering background, deep experience in SRE and a passion for enabling high performing products. As a seasoned and relentless engineer, they will contribute to the Product SRE strategic objectives and contribute to the ongoing transformation of the Xero SRE culture. As an expert communicator, this senior engineer will manage change and ensure the value of robust systems is communicated clearly across the business.
How You'll Make Impact:
- Provides technical ownership to ensure completion of the day to day deliverables of a dedicated product SRE team of highly experienced Site Reliability Engineers, who will take ownership of reliability.
- Demonstrates high technical proficiency in all aspects of reliability, observability, operability and performance of the product you are assigned to through continued delivery of high quality solutions.
- Builds a long-term relationship with product engineering teams to ensure they can deliver on system reliability that is continuously improving.
- Demonstrates an ongoing dedication to “automation first” and ensures quality delivery at all times.
- Has experience of building and delivering an Error Budget culture associated with consistent breaches of SLA/SLO.
- Ensures observability best practice is implemented across products to ensure fast detection of impactful events. Contributes to a culture of continuous improvement so that product reliability keeps improving and the impact of issues is reduced.
- Provides ongoing training as required across the business to ensure reliability requirements are well understood and incorporated into product designs.
- Contributes to the deliverables of a team of engineers who continuously raise the bar of quality in deliverables and outcomes for the SRE organization.
- Contributes to, and actively monitors, quality standards for the SRE team and reports regularly on its adherence.
What You'll Bring With You:
- Demonstrable experience of delivery of Reliability systems and solutions in a highly technical team.
- Extremely technical with strong engineering and hands-on SRE background.
- Deep and proven experience in providing technical leadership and mentoring in world class embedded SRE teams in a fast growing company.
- Proven track record in senior technical roles, with the ability to inspire and empower cross-functional teams to achieve operational excellence and drive continuous improvement.
- Has a strong product mindset and can understand and anticipate customer needs. Obsessed with delivering a high quality and highly stable customer experience. Drives a culture of customer-first thinking. 24/7 focus on incident response and remediation.
- Broad and deep technical understanding of modern cloud technologies (AWS, Azure, GCP) and their incident and problem management practices, particularly for high-growth, high-availability SaaS-based transactional systems.
- Is highly data-driven and analytical.
- Proficiency in one or more object-oriented programming languages (C#, JavaScript, Java, Python etc.) or experience with infrastructure-as-code (e.g. Terraform, CloudFormation).
- Experience using observability tooling to monitor the health of highly distributed systems.
- Experience with agile software development methodology including continuous integration and delivery.
Preferred Skills:
- Experience with designing, developing and operating distributed systems and large scale software systems.
- Strong experience delivering technical initiatives in an operational, site reliability or platform engineering capacity.
- The ability to solve engineering challenges outside of your own team, including using influence rather than authority to enact change.
- Demonstrated experience in reliability concepts like capacity management, autoscaling, deployment and release safety, software strategies for reliability, fault tolerance and graceful failure.
- Experienced in implementing customer focused Service Level Objectives (SLOs).
- Experience using software engineering to solve operational and reliability challenges.
- Understanding of human factors, safety science and resilience engineering.
- Experience working in environments with advanced security and networks.
Why Xero?
Diversity of people brings diversity of thought, and we like that. Our human-first culture of respect, fairness, and inclusion is what helps Xeros thrive at work and beyond. Offering very generous paid leave to use however you’d like (plus statutory holidays!), dedicated paid leave to care for your physical and mental wellbeing as well as an Employee Assistance Program to access mental health care for you and your family, employee resource groups, wellbeing programming and allowances, medical, dental, vision, and disability insurance, fertility and family forming financial support, 401k contribution matching, 26 weeks of paid parental leave for primary caregivers, an Employee Share Plan, beautiful offices with snacks and break areas, flexible working, career development and many other benefits that reflect our human value, you’ll do the best work of your life at Xero.
Research has shown that women and underrepresented groups are less likely to apply to jobs unless they meet every single competency or experience. If you are excited about this role, but your past experience doesn't align perfectly, we encourage you to apply anyway. You could be just the right person for this role and Xero. If you have any support or access requirements, we encourage you to advise us at time of application and throughout the interview process.