Description
The AgentForce Data Science team powers the core Large Language Models (LLMs) and reasoning engines behind Salesforce’s production-grade AI agents. Our work sits at the critical junction of generative AI research and massive-scale engineering, enabling trustworthy, high-performance AI systems across sales, service, marketing, and analytics.
We are looking for a visionary technical leader to architect the next generation of our AI platform, bridging the gap between cutting-edge model development and robust, scalable production infrastructure.
Role Overview
We are seeking an Architect, Applied Science to define the technical vision and system design for AgentForce’s AI capabilities. In this role, you will not just develop models; you will design the ecosystem in which they operate. You will determine how we orchestrate complex agentic workflows, optimize inference at scale, and architect feedback loops that enable continuous learning.
You will act as the technical glue between the Research, Applied Science, Product and Engineering teams, ensuring that our scientific breakthroughs are translated into viable, cost-effective, and low-latency architectural patterns.
Key Responsibilities
System Architecture & Technical Strategy
Define the end-to-end architecture for AgentForce’s model serving, inference orchestration, and agentic reasoning loops.
Make high-stakes technical decisions regarding "build vs. buy," model sizing, context window management, and retrieval-augmented generation (RAG) strategies.
Architect scalable pipelines for continuous learning (RLHF/RLAIF) that integrate seamlessly with production traffic without compromising latency or stability.
Design systems for multi-turn agent state management, memory persistence, and tool invocation (function calling).
Product & Application Architecture
Own the end-to-end architectural design of AgentForce AI capabilities from product requirements through model design, system implementation, and production rollout.
Translate product use cases (e.g., agent experiences, workflows, UI features) into concrete system architectures, including APIs, service contracts, and model interaction patterns.
Define reference architectures for AI-powered applications (web, backend services, agent runtimes) that standardize how products integrate with AgentForce models.
Partner with Product Engineering to ensure AI capabilities are designed for usability, reliability, and developer experience, not just model quality.
Applied Science Leadership
Translate abstract research concepts into concrete engineering specifications (e.g., "How do we architect a system that supports Speculative Decoding or KV-Caching at our specific scale?").
Lead the design of evaluation frameworks that move beyond academic benchmarks to measure real-world system performance (latency, cost-per-token, reliability).
Collaborate with scientists to optimize models for deployment (quantization, distillation, pruning) without sacrificing reasoning capabilities.
Cross-Functional Collaboration
Serve as the primary architectural liaison between Applied Science, Product Engineering, Infrastructure/AI Engineering, and Product Management, ensuring cohesive end-to-end solutions.
Act as a technical partner to product teams to shape roadmaps, feature designs, and architectural trade-offs involving AI capabilities.
Establish best practices for MLOps, model versioning, and safe rollout strategies (canary deployments, shadow testing) specific to GenAI.
Mentor Principal Scientists and Staff Engineers on system design principles and architectural patterns.
Required Qualifications
Education & Experience
PhD or Master’s in Computer Science, AI, Machine Learning, or Distributed Systems.
10+ years of technical experience, with a specific focus on deploying ML models at scale.
Proven experience acting as an Architect or Principal-level technical lead for large-scale AI or data platforms.
Technical Expertise
Experience designing and building production-grade AI-powered applications or platforms.
Experience defining public/internal APIs, SDKs, and service interfaces for ML/AI capabilities consumed by product teams.
Familiarity with frontend–backend–model interaction patterns for low-latency user-facing AI experiences.
Deep Learning & LLMs: Profound understanding of Transformer architectures, attention mechanisms, and the math behind LLMs (not just API usage).
Inference & Optimization: Experience with high-performance inference serving (e.g., vLLM, TensorRT-LLM, TGI, Triton) and optimization techniques (quantization, LoRA adapters, paged attention).
Distributed Systems: Strong background in designing distributed systems, microservices, and event-driven architectures (Kafka, gRPC, Kubernetes).
Coding: Advanced proficiency in Python; familiarity with C++ or CUDA is a strong plus.
Architectural Competencies
Ability to design for constraints: balancing model performance (accuracy) against system constraints (latency, throughput, COGS/compute costs).
Experience designing architectures for "Agentic" workflows (planning, reasoning, tool use, memory).
Familiarity with vector stores and search infrastructure (e.g., FAISS, Weaviate, Elasticsearch) for RAG implementations.
Preferred Qualifications
Experience architecting platforms for Reinforcement Learning (RL) in production environments.
Ability to map product requirements → system architecture → model design → infrastructure choices.
Strong intuition for user experience constraints (latency, streaming, partial results, fallbacks).
Experience balancing feature velocity vs. platform stability.
Active contributor to open-source LLM infrastructure projects (e.g., Ray, LangChain, Hugging Face).
Experience with safety guardrails and governance architectures for Enterprise AI.
Why Join AgentForce?
Architect the Future: Move beyond experimentation and design the structural foundation of the world’s leading Enterprise AI agent platform.
Scale & Impact: Your architectural decisions will directly affect millions of users and shape systems that process billions of tokens.
Hybrid Innovation: Work at the cutting edge where deep research meets high-scale distributed systems.
For roles in San Francisco and Los Angeles: Pursuant to the San Francisco Fair Chance Ordinance and the Los Angeles Fair Chance Initiative for Hiring, Salesforce will consider for employment qualified applicants with arrest and conviction records.
