Lead Data Scientist

Airkit

Data Science
San Francisco, CA, USA
Posted on Dec 7, 2025

Description

We are seeking an experienced Salesforce Software Quality Lead Member of Technical Staff (LMTS) with a strong background in Artificial Intelligence and Machine Learning (AI/ML), with a focus on delivering solutions based on Large Language Models (LLMs). Prior experience as a member of a Data Science, ML Science, or ML Engineering team is a plus. This role requires a deep understanding of these areas, and the ability to eventually deliver effective solutions that are well integrated into the Salesforce technology stack and ecosystem.

As a Quality LMTS, you will play a key role in defining, implementing, and maintaining the quality strategy for Salesforce products, with a focus on integrating cutting-edge AI technologies. You will collaborate with cross-functional teams, including product development, data science, and AI teams, to ensure that Salesforce’s AI-driven solutions meet the highest quality standards for enterprise applications.

Key Responsibilities:

  • Drive Innovation & Adoption: Identify opportunities to integrate AI/ML solutions into existing workflows and propose new workflows that are high-quality and efficient. Positively influence engineering decision-making by evaluating and communicating solution options and by facilitating agreement among key stakeholders. Ensure that we continuously raise our standard of engineering excellence and produce high business impact.
  • Continued Excellence in AI/ML: Stay current with developments in the field of AI/ML by tracking publicly available models, frameworks, libraries, and publications, with a focus on their practical application and relevance to Salesforce.
  • Technical Leadership on Proof of Concepts: Influence the direction of R&D as a whole through your technical, process, or product knowledge leadership. Bring an industry-expert understanding of quality concerns and validation techniques. Drive behaviors and initiatives that focus on both quality and throughput.
  • Collaborate on AI Model Improvements: Work closely with data scientists and machine learning engineers to provide feedback on LLM performance and recommend improvements based on quality testing results.
  • Evaluate LLM Performance: Assess and optimize Large Language Models (LLMs) for quality, accuracy, relevance, and safety in various use cases.
  • Human Evaluation: Oversee human evaluation processes for subjective quality dimensions, such as response engagement, user safety, and contextual accuracy.
  • Define Quality Strategies: Develop and implement robust quality strategies for Salesforce’s SaaS products, focusing on AI/ML and LLM integrations.
  • Multiplier: Provide technical leadership for critical areas that significantly impact customer success. You have depth of expertise in key technologies and you are often consulted on the design and delivery of new solutions. You bring new best practices to R&D and actively ensure that they are being used. You continue to deepen and widen your understanding of the application, and you drive discussion of sometimes complex or controversial issues in open forums such as concept reviews, VAT discussions, and cross-team discussions.
  • Cross-Platform Collaboration: Use your understanding of customers’ needs across industries and multiple technology landscapes (CRM, Modern Data Stack, Analytics & BI, and AI) to develop solutions across Salesforce's technology stack. Work in a consultative fashion to improve communication, teamwork, and alignment among teams inside and outside of the organization.
  • Mentor and Organization Builder: Be a cornerstone in the infrastructure of technical expertise represented by the organization's senior engineers. You challenge and engage with them to develop their expertise and leadership contributions. You are a key resource for engineers seeking to advance to the next level.

Required Skills and Qualifications:

  • AI & Machine Learning Expertise: Strong understanding of AI/ML principles, particularly in evaluating and optimizing model performance, response consistency, and safety. Although the focus is LLMs, the candidate is also expected to be familiar with non-LLM AI/ML techniques and practices, such as model selection, choice of appropriate metrics, and embeddings; areas such as fairness and explainable AI; and traditional models such as Random Forest and Gradient Boosting.
  • LLM Expertise: Hands-on experience evaluating Large Language Models (LLMs) for response quality, including familiarity with GPT-like models and AI evaluation frameworks.
  • Focus on Delivery: Ability to deliver practical solutions for an active user base and to follow up with enhancements and improvements as needed.
  • Collaboration & Communication: Excellent communication skills, with the ability to work cross-functionally with product managers, AI/ML engineers, and data scientists.
  • Problem-Solving & Analytical Skills: Strong analytical and problem-solving abilities with a focus on identifying quality gaps and driving improvement.

Preferred Qualifications:

  • Prior experience on a Data Science, Machine Learning Science, or Machine Learning Engineering team.
  • Experience in ethical AI practices and evaluating models for fairness, bias, and user safety.
  • Strong understanding of public cloud infrastructure (AWS, Azure, or GCP); Salesforce certification.
  • Ability to architect, design, implement, test, and deliver test frameworks for highly scalable products.

For roles in San Francisco and Los Angeles: Pursuant to the San Francisco Fair Chance Ordinance and the Los Angeles Fair Chance Initiative for Hiring, Salesforce will consider for employment qualified applicants with arrest and conviction records.