MPU bakes neural network weights directly into silicon: no memory, no bottleneck. A full multimodal LLM on a chip smaller than a nano SIM card.
Any model, any framework
NN-optimized datapath
On-chip SRAM, deterministic
Transformer-only silicon
Weights are the circuit. Zero memory.
Every inference chip today reads model weights from memory. That read is the bottleneck.
MPU skips it. We encode weights as physical wiring on the chip — no DRAM, no SRAM, no memory wall. The model doesn't run on the chip. It is the chip.
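As a rough software analogy only (the source does not describe the actual circuit design), the difference between fetching weights at run time and hardwiring them can be sketched in Python. The names `matvec_from_memory` and `hardwire`, and the tiny 2x3 weight matrix, are hypothetical and chosen purely for illustration.

```python
# Software analogy for "weights as wiring". All names and values here are
# hypothetical; this is not the MPU design, only an illustration of the idea.

def matvec_from_memory(weights, x):
    """Conventional path: every weight is fetched from a table at run time."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def hardwire(weights):
    """Generate a function whose weights are literal constants in its body.

    Zero weights produce no term at all, much as a zero weight would need
    no wire. The returned function never reads a weight table.
    """
    rows = []
    for row in weights:
        terms = [f"{w!r} * x[{i}]" for i, w in enumerate(row) if w != 0]
        rows.append(" + ".join(terms) if terms else "0")
    source = "def fixed(x):\n    return [" + ", ".join(rows) + "]\n"
    namespace = {}
    exec(source, namespace)  # build the specialised function once
    return namespace["fixed"], source


W = [[2, 0, -1],
     [0, 3, 0]]
fixed, source = hardwire(W)
x = [1, 2, 3]

# Both paths compute the same product; only the second has no weight table.
assert fixed(x) == matvec_from_memory(W, x)
```

In this sketch, changing the model means regenerating `fixed`, which mirrors the one-chip-one-model trade-off: the specialised version drops the weight lookup but cannot be reloaded with different weights.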
One chip, one model. When models are cheap and inference is expensive, dedicated silicon wins.
Models are stable enough for silicon. Agent frameworks reach human performance in targeted domains. Architectures aren't changing every quarter — it's time to commit to hardware.
60%+ of AI compute is inference. H100s rent for $2–7/hr and are too costly and power-hungry for edge devices. The bottleneck isn't training; it's running models at scale.
Billions of edge devices need local AI. AR glasses, drones, robots, vehicles — all need real-time inference at under 5 watts. No existing chip can do this.
First-mover window is open. Etched ($5B), Groq (~$20B), and Taalas ($219M) prove the thesis — but all target cloud. No one has shipped a model-specific edge ASIC.
A complete multimodal LLM — text, speech, and vision — on a single chip smaller than a nano SIM card.
Text, speech, and vision share a single semantic core hardwired in metal. Frontends and task heads are lightweight and swappable. Adding a modality costs a small endpoint — not another chip.
Per-block power gating adapts to the task. A simple voice command uses a fraction of the chip. Full multimodal reasoning lights up everything — still under 5 watts.
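A minimal sketch of how per-block power gating could be budgeted, assuming a made-up block list and made-up power figures (the source gives only the 5 W ceiling). `BLOCK_POWER_MW`, `TASK_BLOCKS`, and `active_power_mw` are hypothetical names, and the split of the semantic core into four slices is an assumption for illustration.

```python
# Hypothetical block names and power figures (milliwatts), illustration only.
BLOCK_POWER_MW = {
    "audio_frontend": 200,
    "vision_frontend": 800,
    "text_frontend": 100,
    "core_slice_0": 750,
    "core_slice_1": 750,
    "core_slice_2": 750,
    "core_slice_3": 750,
    "task_heads": 400,
}

# Blocks each task powers on; every other block stays gated off.
TASK_BLOCKS = {
    "voice_command": {"audio_frontend", "core_slice_0", "task_heads"},
    "full_multimodal": set(BLOCK_POWER_MW),
}

POWER_BUDGET_MW = 5000  # the "under 5 watts" ceiling from the text


def active_power_mw(task):
    """Sum the power of only the blocks that the task switches on."""
    return sum(BLOCK_POWER_MW[block] for block in TASK_BLOCKS[task])


for task in TASK_BLOCKS:
    power = active_power_mw(task)
    assert power < POWER_BUDGET_MW
    print(f"{task}: {power} mW of {POWER_BUDGET_MW} mW budget")
```

With these assumed numbers, the voice command powers three of eight blocks, while full multimodal reasoning powers all of them and still stays below the ceiling.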
120 FPS, <10 ms latency, <5 W, ~100 mm². No existing GPU or NPU meets all four constraints simultaneously.
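The four targets can be turned into per-frame budgets with simple arithmetic. The sketch below uses only the figures stated above plus the standard nano SIM (4FF) outline of 12.3 mm x 8.8 mm; the variable names are illustrative, and the power figure is treated as an upper bound, not a measurement.

```python
# Targets taken from the text; everything derived below is plain arithmetic.
FPS = 120
LATENCY_LIMIT_MS = 10.0
POWER_LIMIT_W = 5.0
CHIP_AREA_MM2 = 100.0  # the text says "~100 mm^2", so this is approximate

# Time between consecutive frames at the target frame rate.
frame_period_ms = 1000.0 / FPS

# Upper bound on energy per frame if the chip drew the full power limit.
energy_per_frame_mj = POWER_LIMIT_W / FPS * 1000.0  # joules to millijoules

# Nano SIM (4FF) outline: 12.3 mm x 8.8 mm.
nano_sim_area_mm2 = 12.3 * 8.8

print(f"frame period:      {frame_period_ms:.2f} ms")
print(f"energy per frame:  {energy_per_frame_mj:.1f} mJ (upper bound)")
print(f"nano SIM area:     {nano_sim_area_mm2:.2f} mm^2")
print(f"fits in nano SIM:  {CHIP_AREA_MM2 < nano_sim_area_mm2}")
```

Under these figures, a frame arrives roughly every 8.3 ms, the energy ceiling is roughly 41.7 mJ per frame, and a ~100 mm² die is slightly smaller than the ~108 mm² nano SIM outline.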
On drones, on-board AI currently cuts flight time by 80%. Ultra-low-power inference changes the trade-off entirely.
Full-stack from model to silicon. We have previously shipped production AI chips at top tech companies across autonomous driving, mobile, and datacenter.