The Raspberry Pi AI HAT+ 2 is a new PCIe add-on for the Raspberry Pi 5 that combines a Hailo-10H accelerator with 8 GB of onboard LPDDR4X. The point of putting memory on the board itself is straightforward: keep more of a model’s working set close to the accelerator instead of leaning on the Pi’s system memory across the PCIe link.
Previous Raspberry Pi-branded Hailo add-ons for the Pi 5 were framed mainly as vision accelerators for camera-first pipelines such as object detection, segmentation and pose estimation. AI HAT+ 2 still fits that “bolt-on NPU” pattern, but the 8 GB of onboard memory shifts the workload mix toward models that are memory-hungry even when compute is plentiful.
The board is positioned as a PCIe AI/ML accelerator add-on for Raspberry Pi 5, aimed at vision plus selected LLM/VLM workloads.
Some details still look provisional. For example, Hailo’s public Hailo-10H documentation commonly calls out “4 | 8 GB LPDDR4/4X” without stating a speed grade; if LPDDR4X-4267 is accurate, it’s a sensible choice for memory-sensitive GenAI workloads. Likewise, “similar to the 26 TOPS AI HAT+” is hard to validate from headline specs alone because the Hailo-8 family and Hailo-10H have different design goals.
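What the rumoured speed grade would buy is easy to sanity-check, though. The sketch below assumes a 32-bit DRAM interface on the Hailo-10H, which Hailo’s public material does not confirm, so treat the result as illustrative rather than a datasheet figure:

```python
# Back-of-envelope peak bandwidth for the rumoured LPDDR4X-4267 part.
# ASSUMPTION: a 32-bit DRAM bus; Hailo does not publicly state the
# Hailo-10H's DDR interface width, so this is illustrative only.
transfer_rate_mt_s = 4267          # mega-transfers per second (LPDDR4X-4267)
bus_width_bits = 32                # assumed interface width
peak_gb_s = transfer_rate_mt_s * bus_width_bits / 8 / 1000
print(f"Theoretical peak: ~{peak_gb_s:.1f} GB/s")  # ~17.1 GB/s
```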
Raspberry Pi 5 exposes a single-lane PCIe connection on its external interface (Gen 2 by default, with an opt-in, uncertified Gen 3 mode), while accelerator modules are often designed with wider links in mind. If a workload requires frequent host↔accelerator transfers, bandwidth becomes part of the performance story. The onboard 8 GB is there specifically to reduce that traffic, so the real impact will depend on how well the chosen runtime keeps activations and intermediate data local.
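To put those numbers side by side, here is the same back-of-envelope treatment for the Pi 5’s single-lane link; these are theoretical per-direction payload rates after line encoding, and real-world throughput will sit lower still:

```python
# Rough ceilings for the Pi 5's single-lane PCIe link versus local DRAM.
# Theoretical per-direction payload rates after line encoding only;
# protocol overheads push real-world throughput lower.
links_gb_s = {
    "PCIe Gen 2 x1 (Pi 5 default)": 5.0 * (8 / 10) / 8,        # 8b/10b -> ~0.50 GB/s
    "PCIe Gen 3 x1 (opt-in, uncertified)": 8.0 * (128 / 130) / 8,  # 128b/130b -> ~0.98 GB/s
}
onboard_gb_s = 17.1  # LPDDR4X-4267 estimate from above (assumed 32-bit bus)
for name, bw in links_gb_s.items():
    print(f"{name}: {bw:.2f} GB/s (~{onboard_gb_s / bw:.0f}x below onboard DRAM)")
```

Even at the theoretical ceiling, a trip across the link costs roughly 17 to 34 times more in bandwidth terms than a local DRAM access on the HAT, which is the whole argument for keeping activations and intermediate state on the accelerator side.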
For background on the Pi’s PCIe add-on ecosystem, we previously looked at HAT+ as a PCIe-based standard in our HAT+ PCIe overview, and at Raspberry Pi’s earlier Hailo bundle approach in our AI Kit coverage. Raspberry Pi’s own accessory page for the earlier AI HAT+ is here: Raspberry Pi AI HAT+.
AI HAT+ 2 is being pitched for “selected” generative and multimodal workloads: small instruct models, compact VLMs and speech-to-text. Early targets include Llama-3.2-3B-Instruct and Qwen2.5-VL-3B-Instruct, plus Whisper-class speech workloads (overview: Whisper; example in Hailo’s explorer: Whisper-Base). Hailo’s positioning for the 10H is about generative AI at the edge with a direct DDR interface to scale working sets; see the Hailo-10H product page.
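A quick footprint estimate shows why 8 GB is the enabling number for those targets. The sketch assumes 4-bit weight quantization and Llama-3.2-3B-style attention dimensions (28 layers, 8 KV heads, head dimension 128); none of these are confirmed for the shipped Hailo configuration:

```python
# Why 8 GB on the accelerator matters: a rough footprint estimate for a
# 3B-parameter instruct model. Quantization width and KV-cache shape are
# ASSUMPTIONS for illustration, not the shipped configuration.
params = 3.2e9                 # Llama-3.2-3B-class weight count
bytes_per_weight = 0.5         # 4-bit quantization (assumed)
weights_gb = params * bytes_per_weight / 1e9

# KV cache: 2 (K and V) * layers * kv_heads * head_dim * context * 2 bytes (fp16)
layers, kv_heads, head_dim, context = 28, 8, 128, 4096  # Llama-3.2-3B-style
kv_gb = 2 * layers * kv_heads * head_dim * context * 2 / 1e9

print(f"weights ~{weights_gb:.1f} GB, KV cache ~{kv_gb:.2f} GB at {context} tokens")
```

Even with generous rounding, weights plus a 4K-token KV cache land near 2 GB, leaving room for activations, a compact VLM’s vision tower, or a Whisper-class model alongside.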
If your priority is classic vision throughput, the existing AI HAT+ boards remain the simpler path. AI HAT+ 2 is about expanding what a Pi 5 can credibly run by putting memory on the accelerator side, which is often what determines whether small LLM/VLM and Whisper-class workloads feel practical in the first place.