Member of Technical Staff - Research Engineer (Harness Engineering and Agentic Orchestration)

Tessl

IT

London, UK

Posted on May 28, 2026

Location

London Office

Employment Type

Full time

Location Type

Hybrid

Department

R&D

Tessl is a fast-growing Series A startup based in London, founded by Guy Podjarny. We’ve raised over $100M from world-class investors including Index Ventures, Accel, GV, and Boldstart, and in 2025 we were ranked #2 in Sifted EU’s B2B SaaS Rising 100 and #20 in Sifted's AI 100.

At Tessl, we are building the context layer for AI coding agents, and a platform for AI-native software development. As an early member of the team, you’ll help shape how we build, scale and support a company operating at the edge of AI and software development.

Overview of the role

We're hiring a Research Engineer to join our AI Research (AIR) team. You'll work on the components that make the outer loop real: how agent harnesses orchestrate model behaviour, how we evaluate what's actually working, how pipelines turn production traces into the next round of improvement, and how we diagnose the failure modes that matter to real users.

These aren't four separate workstreams — they're parts of one system, and we want people who see them that way.

We expect you to sit close to customers — joining calls, watching sessions, reading traces — and to let real workflows shape your research priorities. You'll have meaningful autonomy and the resources to run substantial experiments where the bar for success is shipped impact.

You'll report to our AI Research Lead, and collaborate closely with engineering, product, and design.

What we're looking for

We're explicitly building coverage across four skill areas. You don't need to be strong in all of them — but you should bring depth in at least one:

  • Agent harness and orchestration design — how tools, context, and control flow combine to make a useful agent.

  • Agentic eval methodology — task and repo-level evals, dataset curation, the craft of measuring what actually matters.

  • Outer-loop and pipeline thinking — feedback loops, training-data flywheels, bandit-style optimisation, anything that goes beyond a single agent session.

  • Failure-mode analysis — instrumenting agents, reading traces at volume, surfacing patterns engineering can act on.

Essential

  • 4+ years shipping AI/ML products in a startup or applied industry setting, with recent hands-on experience with LLMs and agentic systems.

  • Demonstrated depth in at least one of the four skill areas above.

  • Strong product and customer instincts: comfort joining customer calls, watching session recordings, and letting real workflows shape what you work on.

  • Sharp evaluation judgement: benchmarks where they exist, vibes and quick prototypes where they don't, and the taste to know which is appropriate.

  • Experience building datasets for evaluation or training, including the pipeline work that goes with it.

  • Deeply curious about agents and excited about reshaping how software is built.

Nice to have

  • A Master's or PhD in a relevant computational field.

  • Direct experience with coding agents or code-generation systems.

  • Background in RL, bandits, or other outer-loop optimisation frameworks applied to LLMs.

  • Experience building synthetic data, dataset infrastructure, or internal tooling that other engineers actually used.

  • A project you can show us (GitHub links welcome) and a thoughtful answer to "Why Tessl?"

What you'll do

No two weeks will look the same. A flavour:

  • Sit in on a customer session, understand how their agents are failing, design an eval that captures it, and drive a fix through to shipped improvement.

  • Close a piece of the outer loop end to end: production signal in, dataset out, eval scored, harness change shipped, metric moved.

  • Own a slice of our eval infrastructure: dataset curation, harness configuration, runner, analysis, and the comms back to engineering.

  • Prototype a new harness or context configuration and measure whether it actually moves the needle on real customer tasks.

  • Dig through pages of agent traces, build the tooling you need to make sense of them, and brief the team on what you found.

  • Partner with product and engineering on near-term shipping problems by bringing research rigour.

  • Pull a recent paper apart, work out what's actually transferable to our platform, and turn it into a concrete experiment.

You’ll be successful if…

In your first 3 months, you might have shipped a new eval suite for a real customer workflow, improved an agent harness based on trace analysis, or built a pipeline that turns production failures into reusable test cases.

Salary and benefits

Competitive salary commensurate with experience. Health insurance extending to partners and dependents, pension contributions, and the rest of what you'd expect.

Our office is a couple of minutes from King's Cross — pet friendly, with regular team lunches, drinks, and socials. We're hybrid, with Monday, Tuesday, and Thursday as the primary in-office days.

Application process

  • Intro call to understand "Why Tessl?" and to tell you a bit about us.

  • A call with our AI Research Lead to understand your ways of working and how you use agents.

  • A 4-hour technical take-home exercise extending our one-shot implementation.

  • A half-day on-site session including whiteboarding and hands-on activities.

  • Leadership chats with our Head of People, Head of Engineering and CEO.


We care deeply about the warm, inclusive environment we’re building at Tessl and we value diversity – we welcome applications from those typically underrepresented in tech. If you like the sound of this role but are not totally sure whether you’re the right person, do apply anyway!
