Systems Tooling & Infrastructure - Member of Technical Staff

Callosum

Other Engineering, IT

London, UK

Posted on May 21, 2026

Location

London

Employment Type

Full time

Location Type

On-site

Department

Intelligent Systems Engineering

About Us

Artificial intelligence scaled on a bet - that bigger models, more identical chips, and more data would keep delivering. As problems grow more complex and the requirements of intelligence more diverse, that bet is breaking down. The next era belongs to heterogeneous intelligence: diverse models on diverse chips, each with distinct strengths, co-evolving into systems of capability unreachable by any single model or accelerator.

Callosum is the Intelligent Systems company. We built the infrastructure to make that possible. Our co-evolution engine optimises simultaneously across workflows, agents, and silicon. We launched in early 2026 showing orders-of-magnitude improvements in performance and a shift in the cost-performance frontier that no single chip or model provider can match.

We believe intelligence comes from the system, not the model.

We are scientists and engineers solving what others consider impossible. If you thrive on hard problems and are energised by the scale of the challenge, we'd love to hear from you.

About the Role

Callosum believes that orders-of-magnitude improvements in AI systems will come through application-aware orchestration across heterogeneous hardware. We are building that vision: infrastructure that treats the full landscape of compute, well beyond GPUs, as a unified, co-evolving system.

This role owns the developer experience of Callosum's stack, turning complex, low-level systems into something observable, debuggable, and usable by the rest of the team. You'll build the profiling, tracing, and developer tooling that defines how engineers interact with heterogeneous systems, enabling fast experimentation with new accelerators and complex inference workflows. You will own the abstractions, CLIs, and instrumentation that the engineering organisation is built on - primitives that don't yet exist for the next generation of compute infrastructure. As multi-stage and multi-agent workflows grow in complexity, your work is what keeps execution paths visible and tractable, ensuring the organisation can scale without losing insight or control.

What You’ll Build

  • Extend profiling and tracing tooling for new accelerators, including collection, compression, and visualisation of performance data

  • Develop CLI tools and automation wrappers that simplify common workflows - spinning up inference stacks, launching benchmarks, managing configurations

  • Convert prototypes of internal tooling into high-performance, scalable, accessible commands

  • Build tooling to support multi-agent serving workflows: request tracing across agent boundaries, pipeline visualisation, and debugging tools for complex inference DAGs

  • Create internal libraries and abstractions that let other teams move faster without reinventing shared infrastructure

What You Bring

  • Strong software engineering fundamentals: clean APIs, good error handling, sensible defaults, and clear documentation

  • Experience with profiling and tracing systems (perf, Nsight, Tracy, or similar) and a good sense of how to make trace data actionable rather than overwhelming

  • Familiarity with observability stacks (Prometheus, Grafana, OpenTelemetry, or equivalent) in varied infrastructure environments

  • Comfort working across the stack - from low-level trace collection to dashboards and developer-facing CLI tools