Doubleword
    About

    What becomes possible when inference is 100x cheaper?

    That's the question we're building toward, and the engineering problem we're solving every day.

    Who we are

    Doubleword is a London-based team of researchers and systems engineers obsessed with inference efficiency. We've built our own inference engines, published research, optimized kernels in CUDA and Triton, and deployed production infrastructure inside regulated enterprises. We're interested in what becomes possible when inference is 100x cheaper, and we put all of our effort into making that a reality.

    We're hiring

    We're looking for engineers who care deeply about performance and want to work on problems at the intersection of systems engineering and inference infrastructure.

    See open roles
