June 28, 2026

Loihi 2 vs GPU Energy: Real Joule Numbers for Spiking Inference

"Neuromorphic chips use less energy" is true and also frequently misrepresented. Here is the actual per-spike vs per-MAC energy model, the real published constants behind it, and why almost every public comparison - including ours - is modeled, not measured on silicon.

Direct answer

GPUs spend roughly 50 picojoules per multiply-accumulate (MAC) operation whether or not the activation on that connection matters; Loihi 2's published figures are on the order of 1 picojoule per synaptic operation, and that cost is paid only when a neuron actually spikes. The energy advantage of a spiking chip scales with sparsity: at 80% sparsity (80% of neurons silent), the same network can spend roughly 50-300x less modeled energy than its dense GPU equivalent - but this is a calculation from published constants and measured spike counts, not a power-meter reading on physical Loihi 2 hardware.

Two different ways of spending energy

A conventional ANN running on a GPU computes a dense matrix multiply: every input connects to every output through a weight, and the GPU pays the energy cost of that multiply-accumulate (MAC) operation whether the resulting activation turns out to matter or not. The energy cost per MAC depends on the process node and memory hierarchy, but a commonly cited published figure for 7nm-class CMOS is on the order of 50 picojoules per MAC.

A spiking neural network running on event-driven hardware like Loihi 2 works differently: energy is spent per synaptic operation, and a synaptic operation only happens when a neuron actually fires a spike. Intel's published figures put this at roughly 1 picojoule per synaptic operation (the exact number varies by Intel's specific test configuration and operation type, and should be read as an order-of-magnitude figure, not a single universal constant). If most neurons stay silent most of the time - which is the normal, expected behavior of a well-trained SNN - most of the network simply never spends energy at all.

A GPU pays for every connection that exists. A spiking chip pays only for the connections that actually fire. The gap between those two accounting methods is where the energy story comes from.
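To make the two accounting methods concrete, here is a minimal Python sketch that counts operations for a single fully connected layer. The function names and the toy spike raster are illustrative assumptions, not taken from any Loihi 2 or GPU toolchain, and the sketch counts operations only; it says nothing about real power draw.

```python
def dense_mac_count(n_in: int, n_out: int, timesteps: int) -> int:
    """MACs for a dense layer: every weight is used at every timestep."""
    return n_in * n_out * timesteps


def synaptic_op_count(spike_raster: list[list[int]], n_out: int) -> int:
    """Synaptic operations for an event-driven layer.

    spike_raster[t][i] is 1 if input neuron i spiked at timestep t, else 0.
    Each input spike triggers one operation per outgoing synapse (n_out).
    """
    return sum(sum(step) * n_out for step in spike_raster)


# Hypothetical toy raster: 4 input neurons, 3 timesteps, 3 spikes in total.
raster = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
]
n_out = 5

dense_ops = dense_mac_count(n_in=4, n_out=n_out, timesteps=len(raster))
event_ops = synaptic_op_count(raster, n_out=n_out)
print(f"dense MACs: {dense_ops}, event-driven synaptic ops: {event_ops}")
```

The dense count depends only on the layer shape and the number of timesteps, while the event-driven count depends on how many spikes actually occurred.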

The actual math behind an energy comparison

For a network with N total synapses run over T timesteps:

Dense GPU/ANN energy ≈ N × T × E_MAC - every synapse pays the MAC cost every timestep, regardless of activity.
Spike-driven energy ≈ N × T × (1 − sparsity) × E_SPIKE - only the fraction of synapses that actually carried a spike, (1 − sparsity), pays the (much smaller) per-spike cost.

This is exactly the calculation that produces headline numbers like "99% less energy" - and it's also exactly why those numbers need a sparsity figure attached to be meaningful. A network with low sparsity (most neurons constantly firing) gets a much smaller energy advantage than one with high sparsity, because the per-spike cost gets paid far more often.
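One consequence of the two formulas is that N and T cancel when you take the ratio, so the modeled advantage depends only on the two energy constants and on the fraction of synapse-timesteps that carried a spike (1 minus sparsity). A small sketch, assuming the round-number constants used in this post (50 pJ per MAC, 1 pJ per synaptic operation); `modeled_ratio` is a hypothetical helper name:

```python
def modeled_ratio(sparsity: float,
                  e_mac_pj: float = 50.0,
                  e_spike_pj: float = 1.0) -> float:
    """Dense energy divided by spike-driven energy, both modeled.

    N and T appear in both formulas and cancel, leaving
    E_MAC / ((1 - sparsity) * E_SPIKE).
    The default constants are assumptions taken from this post.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    active_fraction = 1.0 - sparsity
    if active_fraction == 0.0:
        # No spikes at all: the spike-driven model predicts zero energy.
        return float("inf")
    return e_mac_pj / (active_fraction * e_spike_pj)


for sparsity in (0.0, 0.5, 0.8, 0.9, 0.99):
    print(f"sparsity {sparsity:.0%}: ~{modeled_ratio(sparsity):.0f}x modeled")
```

Sweeping sparsity like this makes it easy to see why a headline multiplier means little without the sparsity figure that produced it.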

A worked example, with real measured sparsity

Here is the calculation applied to a real measured case: an MLP-MNIST spiking network with 268,800 synapses, T=16 timesteps, and 80.5% measured spike sparsity (meaning roughly 1 in 5 synaptic connections actually carried a spike at any given timestep):

Backend	Energy model used	Modeled energy per inference
Spike-driven (Loihi 2 constants)	`N × T × 0.195 × 0.9 pJ`	~0.75 µJ
Dense GPU/ANN equivalent	`N × T × 50 pJ`	~215.04 µJ

That's roughly a 285x reduction in modeled energy for this specific network, at this specific measured sparsity. The number is real arithmetic on real measured sparsity and real published constants - but it is not a measurement of actual joules consumed by physical silicon, and shouldn't be cited as one.
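The arithmetic in the table can be reproduced in a few lines of Python. This is a sketch of the modeled calculation only, using the network size, timestep count, spike fraction, and constants quoted above; the variable names are our own, and nothing here touches hardware.

```python
# Inputs quoted in the worked example above.
N_SYNAPSES = 268_800      # synapses in the MLP-MNIST network
TIMESTEPS = 16            # T
SPIKE_FRACTION = 0.195    # 1 - 0.805 measured spike sparsity
E_SPIKE_PJ = 0.9          # modeled cost per synaptic operation, in pJ
E_MAC_PJ = 50.0           # modeled cost per dense MAC, in pJ

PJ_PER_UJ = 1_000_000     # 1 µJ = 1e6 pJ

synapse_timesteps = N_SYNAPSES * TIMESTEPS

# Spike-driven: only the spiking fraction of synapse-timesteps pays E_SPIKE.
spike_uj = synapse_timesteps * SPIKE_FRACTION * E_SPIKE_PJ / PJ_PER_UJ

# Dense: every synapse pays E_MAC at every timestep.
dense_uj = synapse_timesteps * E_MAC_PJ / PJ_PER_UJ

print(f"spike-driven (modeled): {spike_uj:.2f} µJ")
print(f"dense GPU/ANN (modeled): {dense_uj:.2f} µJ")
print(f"modeled ratio: {dense_uj / spike_uj:.0f}x")
```

Changing `SPIKE_FRACTION` or either constant and re-running is a quick way to see how sensitive the headline ratio is to each input.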

Why "modeled" and "measured" are not the same claim

This distinction matters enough that NeuroBench, the standard benchmark suite for neuromorphic computing (Yik et al., Nature Communications, 2025), builds it directly into its methodology. NeuroBench's System Track requires actual measured power and energy - on-chip instrumentation for hardware like Loihi 2 or Xylo, external multimeters for CPU - as the only acceptable evidence. Its Algorithm Track, which is hardware-independent, accepts modeled estimates like the calculation above explicitly as a lower-fidelity proxy, not as a substitute for measurement.

Almost every public energy comparison you'll find for spiking neural networks - including the worked example above - is Algorithm Track-grade evidence: a real calculation from published constants and measured spike counts, not a power-meter reading on a chip. That's not dishonest as long as it's labeled correctly. It becomes misleading exactly when a result like the one above, "N times less energy, modeled from published constants", gets shortened to "N times less energy" with the methodology dropped.

How to label your own energy numbers correctly

If you multiplied a published per-spike or per-MAC constant by a measured spike count or operation count: say "modeled energy" and name the constants used.
If you put a multimeter or on-chip power sensor on actual hardware running the actual workload: say "measured energy" and name the instrumentation.
Never present the first as the second. A modeled number is real and useful for comparing architectures and sparsity levels - it is not a substitute for a measured number when the claim is about actual power draw.
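The labeling rule above can also be enforced in code, so that a number cannot be reported without its provenance. A minimal sketch, assuming a hypothetical `EnergyClaim` record of our own design (it is not part of NeuroBench or any Intel tooling); the value in the usage example is a placeholder, not a result:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EnergyClaim:
    """Hypothetical record that ties an energy number to how it was obtained."""
    value_uj: float
    method: str                                     # "modeled" or "measured"
    constants: dict = field(default_factory=dict)   # required when modeled
    instrumentation: str = ""                       # required when measured

    def __post_init__(self):
        if self.method not in ("modeled", "measured"):
            raise ValueError("method must be 'modeled' or 'measured'")
        if self.method == "modeled" and not self.constants:
            raise ValueError("a modeled claim must name the constants used")
        if self.method == "measured" and not self.instrumentation:
            raise ValueError("a measured claim must name the instrumentation")

    def label(self) -> str:
        """Render the number together with its method and provenance."""
        if self.method == "modeled":
            used = ", ".join(f"{k}={v}" for k, v in self.constants.items())
            return f"{self.value_uj:.2f} µJ modeled energy (constants: {used})"
        return (f"{self.value_uj:.2f} µJ measured energy "
                f"(instrumentation: {self.instrumentation})")


# Placeholder value for illustration only.
claim = EnergyClaim(
    value_uj=1.5,
    method="modeled",
    constants={"E_SPIKE": "0.9 pJ", "E_MAC": "50 pJ"},
)
print(claim.label())
```

Because the constructor rejects a modeled claim without constants and a measured claim without instrumentation, the methodology travels with the number instead of being dropped along the way.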

Sources & further reading

Intel's published Loihi 2 per-synaptic-operation energy figures
NeuroBench (Yik et al.), Nature Communications, 2025 - Algorithm Track vs. System Track methodology, DOI 10.1038/s41467-025-56739-4
NeuroCUDA ROS2 multi-backend energy benchmark node, github.com/Krishnav1/neurocuda

Frequently asked questions

How much energy does Loihi 2 use per spike?

Intel's published figures for Loihi 2 are on the order of 1 picojoule per synaptic operation, depending on the specific operation and configuration. This is a per-spike, event-driven cost - energy is spent only when a neuron actually fires, unlike a GPU's dense matrix multiply, which spends energy on every connection regardless of activity.

Are neuromorphic energy savings claims measured or estimated?

Many public energy comparisons, including most academic SNN papers, are modeled: they take a published per-spike or per-MAC energy constant and multiply it by measured spike counts or operation counts, rather than measuring actual power draw on physical hardware with a multimeter or on-chip instrumentation. NeuroBench's System Track explicitly requires measured energy; its Algorithm Track accepts modeled estimates as a lower-fidelity proxy.

Why do GPUs use more energy per inference than neuromorphic chips for sparse workloads?

A GPU's dense matrix multiplication computes every weight-activation product whether or not the activation contributes anything, spending the same energy whether 5% or 95% of neurons are firing. A spike-driven architecture like Loihi 2 spends energy only on synaptic operations triggered by an actual spike, so high sparsity (most neurons silent most of the time) translates directly into proportionally lower energy in the modeled accounting.