That's the question we're building toward, and the engineering problem we're solving every day.
Doubleword is a London-based team of researchers and systems engineers obsessed with inference efficiency. We've built our own inference engines, published research, optimized kernels in CUDA and Triton, and deployed production infrastructure inside regulated enterprises. We're interested in what becomes possible when inference is 100x cheaper, and we put all of our effort into making that a reality.
We're looking for engineers who care deeply about performance and want to work on problems at the intersection of systems engineering and inference infrastructure.
See open roles