Lead Site Reliability Engineer (Security)

Jifflenow

Software Engineering

India · Gurugram, Haryana, India

Posted on Jun 24, 2026
Overview:

Cvent is a leading meetings, events, and hospitality technology provider with more than 5,000 employees and 24,000 customers worldwide, including 60% of the Fortune 500. Founded in 1999, Cvent delivers a comprehensive event marketing and management platform for marketers and event professionals and offers software solutions to hotels, special event venues and destinations to help them grow their group/MICE and corporate travel business. Our technology brings millions of people together at events around the world. In short, we’re transforming the meetings and events industry through innovative technology that powers the human connection.
Cvent's strength lies in its people, fostering a culture where everyone is encouraged to think like entrepreneurs, taking risks and making decisions confidently. We value diverse perspectives and celebrate differences, working together with colleagues and clients to build strong connections.

AI at Cvent: Leading the Future:
Are you ready to shape the future of work at the intersection of human expertise and AI innovation?
At Cvent, we’re committed to continuous learning and adaptation—AI isn’t just a tool for us, it’s part of our DNA. We’re looking for candidates who are eager to evolve alongside technology. If you love to experiment boldly, share your discoveries, and help define best practices for AI-augmented work, you’ll thrive here. Our team values professionals who thoughtfully integrate AI into their daily work, delivering exceptional results while relying on the human judgment and creativity that drive real innovation.
Throughout our interview process, you’ll have the chance to demonstrate how you use AI to learn, iterate, and amplify your impact. If you’re excited to be part of a team that’s leading the way in AI-powered collaboration, we’d love to meet you.

Disclaimer: Beware of Recruitment Scams – Legitimate Cvent recruiting communications will always come from an official ‘[email protected]’ email. We never request any payments or ask for sensitive personal or financial information via chat or social media platforms. For more information, please visit: https://www.cvent.com/en/notice-recruitment-fraud

About the Role:
Site Reliability is about combining development and operations knowledge and skills to help make the organization better. Whether you have a development background and are interested in learning more about operations and security or have an operations or security background and are interested in developing internal tools and automation – Cvent SRE can benefit from your skillsets. Ultimately, we are looking for passionate people who love learning, love technology and always want to make things better.
As a Lead SRE on the SRE Security team, you will be responsible for mentoring others and helping Cvent to both envision and achieve our DevSecOps goals. We are looking for someone with the drive, ownership and ability to take on challenging problems, both technical and process related, in a dynamic, collaborative and highly distributed, multi-disciplinary team environment. You will use your background as a generalist to work closely with product development teams, Information Security, Cloud Infrastructure and other SRE teams to ensure the effective and efficient maintenance of our platforms' security. You must be able to see the big picture and work collaboratively with teams to solve hard multi-disciplinary problems.
Technical expertise in topics such as cloud operations, the software development lifecycle, and security vulnerability management will be of great help to you. However, excellent soft skills in mentorship, communication and the ability to drive alignment are must-haves. We use SRE principles such as blameless postmortems and a focus on automation to ensure we're constantly improving our knowledge and maintaining a good quality of life.
Overall, we're passionate about continuous improvement, learning and participating in dynamic day-to-day work where success is rewarded with recognition and upward mobility.


In This Role, You Will:

  • Enlighten, Enable and Empower a fast-growing set of multi-disciplinary teams, across multiple applications and locations.
  • Tackle complex development, automation and business process problems. Champion Cvent standards and best practices.
  • Ensure the scalability, performance, and resilience of security related systems and processes.
  • Work with product development teams, Information Security, Cloud Automation and other SRE teams to ensure a holistic understanding of security concerns and their effective and efficient identification and resolution.
  • Identify recurring problems and anti-patterns in development, operational and security processes.
  • Develop build, test and deployment automation that seamlessly targets multiple on-premises and AWS regions.
  • Give back by working on and contributing to Open-Source projects.

Here's What You Need:

Must Have Skills:

  • 7–10 years of hands-on experience in Site Reliability Engineering, with a demonstrated track record of owning reliability, security, and operational excellence at scale in production environments.
  • Excellent communication skills and a track record of driving alignment across multi-disciplinary teams.
  • A passion for and track record in making things better for your peers.
  • Hands-on experience with AWS WAF — including rule authoring, rate-based rules, bot control integration, WAF rule group management, and multi-product WAF sharing strategies (e.g., managing WAF rule limits across applications sharing the same WebACL).
  • Experience designing and implementing DDoS protection using AWS Shield Advanced — including transitioning endpoints from count to block mode, building observability solutions (Lambda + CloudWatch alarms), and self-service enablement for product teams.
  • Experience with bot mitigation strategies — including AWS Bot Control, silent challenge / token-based traffic classification (verified humans, verified bots, unknown traffic), JA4+ASN fingerprinting, and evaluation of third-party bot mitigation vendors (e.g., DataDome).
  • Experience managing AWS services and operational knowledge of running applications in AWS — ideally via automation and Infrastructure as Code (IaC) using CloudFormation or CDK.
  • Strong understanding of CI/CD pipelines — experience with Jenkins or equivalent, PR-based deployment workflows, build/test/deploy automation, and troubleshooting pipeline failures in distributed environments.
  • Incident management experience — able to act as IC, write clear incident summaries, drive RCA, and coordinate resolution across teams under pressure.
  • Change management discipline — ability to communicate changes proactively to stakeholders, document rollout strategies, and manage phased production deployments with rollback plans.
  • Fluent in at least one scripting language such as TypeScript, JavaScript, Python, Ruby, or Bash.
  • Experience with SDLC methodologies (preferably Agile).

AI & Automation Literacy (Must Have):
Practical understanding and hands-on exposure to AI fundamentals as applied to SRE and operational workflows:

  • Prompt Engineering — ability to design effective prompts for LLMs to assist with incident analysis, RCA generation, runbook creation, and on-call triage.
  • Retrieval-Augmented Generation (RAG) — basic understanding of RAG patterns; ability to leverage or contribute to RAG-based internal tools that surface relevant runbooks, past incidents, and knowledge base articles during operational events.
  • AI-assisted Workflow & Process Automation — experience using or building AI-powered automations in operational contexts, such as automated incident summarization, alert enrichment, change risk assessment, or post-mortem drafting using LLM integrations (e.g., via MCP tools, Slack bots, or custom pipelines).

Good to Have Skills:

  • Disaster recovery planning and execution — experience with multi-region failover, DR runbooks, and recovery time / recovery point objective (RTO/RPO) management.
  • Experience managing CloudFront distributions, API Gateways, and ALBs as part of a layered security posture.
  • Experience with APM, monitoring and logging tools (Datadog, New Relic, Splunk).
  • Familiarity with security assessment tools and methodologies:
    • Cloud Security Posture Management (CSPM)
    • Infrastructure Vulnerability Scanning
    • Static Code Analysis
    • Software Composition Analysis (SCA)
    • Static, Interactive and Dynamic Application Security Testing (SAST, IAST and DAST)
    • Runtime Application Self Protection (RASP)
  • Good understanding of containerization concepts — Docker, ECS, EKS, Kubernetes.
  • Experience managing 3-tier application stacks.
  • Understanding of basic networking concepts.
  • Familiarity with risk assessment and management concepts and practices.
  • Experience with IaC tools such as CloudFormation, CDK (preferred), or Terraform.