
GPU Dominance in Edge AI Over NPUs

May 30, 2025


Artificial Intelligence (AI) isn’t just a technological breakthrough — it’s a permanent evolution in how software is written, understood, and executed. Traditional software development, built on deterministic logic and largely sequential processing, is giving way to a new paradigm: probabilistic models, trained behaviours, and data-driven computation. This isn’t a fleeting trend. AI represents a fundamental and irreversible shift in computer science — from rule-based programming to adaptive, learning-based systems that are being applied to an ever-wider range of computing problems.

This transformation demands a corresponding change in the hardware that powers it. The old model of building highly specialised chips for narrowly defined tasks no longer scales in a world where AI architectures and algorithms are — and always will be — in constant flux. To meet the evolving needs of AI, especially at the edge, we need compute platforms that are as dynamic and adaptable as the workloads they run.

That’s why general-purpose parallel processors, GPUs, are emerging as the future of edge AI, displacing specialised processors like Neural Processing Units (NPUs). It’s not just a question of performance — it’s about flexibility, scalability, and alignment with the future of software itself.

The Makimoto Wave and the Return of Flexibility

To understand this shift, we need only look to Makimoto’s Wave, a concept proposed by Tsugio Makimoto of Hitachi. It describes the oscillation between standardisation and customisation in computing over time — driven by changes in market demand, technological innovation, and software complexity.

(Figure: Makimoto’s Wave shows a historical pattern of oscillating priorities in computing — from flexibility to specialisation and back again. AI’s current trajectory marks a swing back toward flexibility and general-purpose platforms.)

This model maps directly onto the evolution of AI hardware. In AI’s early years, when workloads were well-defined and stable, NPUs and other fixed-function accelerators made sense. They were highly optimised for specific tasks like image classification or object detection using CNNs.

But now, AI is evolving rapidly. We’ve moved beyond simple, static models into an era of hybrid networks, transformer-based architectures, foundation models, and continual innovation. Hardware that was custom-built for last year’s AI simply can’t keep pace with this velocity.

We are, once again, at a Makimoto inflection point — moving from specialisation back toward general-purpose compute as the scalable, adaptable solution.

AI is a parallel compute problem — not a specialised one

AI is fundamentally a problem of parallelism. Deep learning relies heavily on concurrent operations — matrix math, tensor multiplications, vector operations — precisely the type of workload GPUs were built for in the first place. It’s no coincidence that the architecture developed for rendering millions of pixels simultaneously is now perfect for processing millions of neuron activations at once.
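
To make that parallelism concrete, here is a minimal sketch of a single dense neural-network layer expressed through OpenCL, one of the open standards cited later in this piece. The file name, kernel, and sizes are my own illustrative choices, error handling is omitted, and it assumes a standard OpenCL 3.0 setup — an outline of the idea, not production code:

```c
/*
 * dense_layer.c — a neural-network layer as a massively parallel GPU job.
 * A minimal sketch: sizes illustrative, error checks omitted for brevity.
 * Build (Linux): gcc dense_layer.c -lOpenCL -o dense_layer
 */
#define CL_TARGET_OPENCL_VERSION 300
#include <stdio.h>
#include <CL/cl.h>

enum { IN = 256, OUT = 512 };

/* Each work-item computes one output neuron: a dot product plus a ReLU.
 * All OUT of them run concurrently — the same execution model the GPU
 * uses to shade millions of pixels at once. */
static const char *kSrc =
    "__kernel void dense(__global const float *W, __global const float *x,\n"
    "                    __global float *y, const int inDim)\n"
    "{\n"
    "    int row = get_global_id(0);\n"
    "    float acc = 0.0f;\n"
    "    for (int k = 0; k < inDim; ++k)\n"
    "        acc += W[row * inDim + k] * x[k];\n"
    "    y[row] = fmax(acc, 0.0f);   /* ReLU activation */\n"
    "}\n";

int main(void)
{
    static float W[IN * OUT], x[IN], y[OUT];
    for (int i = 0; i < IN * OUT; ++i) W[i] = 0.001f;
    for (int i = 0; i < IN; ++i)       x[i] = 1.0f;

    cl_platform_id plat;  cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);
    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueueWithProperties(ctx, dev, NULL, NULL);

    /* The kernel is compiled at runtime for whatever GPU is present. */
    cl_program prog = clCreateProgramWithSource(ctx, 1, &kSrc, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    cl_kernel kern = clCreateKernel(prog, "dense", NULL);

    cl_mem bW = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof W, W, NULL);
    cl_mem bx = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof x, x, NULL);
    cl_mem by = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, sizeof y, NULL, NULL);

    int inDim = IN;
    clSetKernelArg(kern, 0, sizeof bW, &bW);
    clSetKernelArg(kern, 1, sizeof bx, &bx);
    clSetKernelArg(kern, 2, sizeof by, &by);
    clSetKernelArg(kern, 3, sizeof inDim, &inDim);

    size_t global = OUT;   /* one work-item per output neuron */
    clEnqueueNDRangeKernel(q, kern, 1, NULL, &global, NULL, 0, NULL, NULL);
    clEnqueueReadBuffer(q, by, CL_TRUE, 0, sizeof y, y, 0, NULL, NULL);

    printf("y[0] = %f (expect ~0.256)\n", y[0]);
    return 0;
}
```

Every output neuron is an independent dot product, so the device evaluates the whole layer in one massively parallel dispatch — no fixed-function block required.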

General-purpose GPUs today have evolved well beyond their graphics roots. With programmable pipelines, compute shaders, and increasingly AI-centric designs, GPUs can now accelerate both traditional and emerging workloads, providing a powerful, flexible engine for edge AI.

Specialised processors like NPUs, by contrast, struggle to remain relevant amid rapid change. They are optimised for specific operations, and when the AI world moves on — as it constantly does — those chips can quickly become obsolete. It’s clear that as this new type of software continues to develop, it needs a flexible, general-purpose parallel hardware platform to support it: the GPU.

Why general-purpose wins at the edge

Edge AI needs more than performance. It needs adaptability, reusability, and longevity. General-purpose parallel processors like modern GPUs deliver on all fronts:

  • Flexibility: Can be programmed to run new model types without changing hardware (see the sketch after this list).
  • Scalability: Suitable for a wide range of edge devices, from IoT sensors to smart cameras and autonomous vehicles.
  • Software Ecosystem: Supported by mature, open development tools and standards (e.g., OpenCL, LiteRT, and TVM).
  • Sustainability: Extends product life cycles and reduces the need for constant silicon redesigns.
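
To illustrate the flexibility point, the sketch below compiles two very different toy kernels — a CNN-era convolution and a transformer-era attention-score operator — on the same device, at runtime. Both kernels are stand-ins I wrote for this piece, not operators from any real model, and error handling is again minimal:

```c
/*
 * reprogram.c — the same GPU accepts brand-new workloads at runtime.
 * A hedged sketch: the two kernels stand in for "last year's" and
 * "this year's" model types.
 * Build (Linux): gcc reprogram.c -lOpenCL -o reprogram
 */
#define CL_TARGET_OPENCL_VERSION 300
#include <stdio.h>
#include <CL/cl.h>

/* A CNN-era operator: 1-D convolution. */
static const char *convSrc =
    "__kernel void conv1d(__global const float *in, __global const float *w,\n"
    "                     __global float *out, const int taps)\n"
    "{\n"
    "    int i = get_global_id(0);\n"
    "    float acc = 0.0f;\n"
    "    for (int t = 0; t < taps; ++t) acc += w[t] * in[i + t];\n"
    "    out[i] = acc;\n"
    "}\n";

/* A transformer-era operator: scaled dot-product attention scores. */
static const char *attnSrc =
    "__kernel void scores(__global const float *Q, __global const float *K,\n"
    "                     __global float *S, const int d)\n"
    "{\n"
    "    int i = get_global_id(0), j = get_global_id(1);\n"
    "    int n = get_global_size(1);\n"
    "    float acc = 0.0f;\n"
    "    for (int k = 0; k < d; ++k) acc += Q[i * d + k] * K[j * d + k];\n"
    "    S[i * n + j] = acc / sqrt((float)d);\n"
    "}\n";

/* Compile one kernel source for the device and report the result. */
static void build(cl_context ctx, cl_device_id dev, const char *src,
                  const char *name)
{
    cl_program p = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    cl_int err = clBuildProgram(p, 1, &dev, NULL, NULL, NULL);
    printf("%-10s %s\n", name, err == CL_SUCCESS ? "built OK" : "build failed");
    clReleaseProgram(p);
}

int main(void)
{
    cl_platform_id plat;  cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);
    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);

    /* Same silicon, two different model generations: the GPU simply
     * compiles whatever the software world invents next. */
    build(ctx, dev, convSrc, "conv1d");
    build(ctx, dev, attnSrc, "attention");

    clReleaseContext(ctx);
    return 0;
}
```

Adopting next year’s operator is a software deployment, not a silicon respin — which is exactly the sustainability argument above.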

In short, general-purpose parallel compute — GPUs — is naturally built to evolve with AI.

Looking to the future

Despite growing evidence, the market still tends to associate AI acceleration with NPUs and custom silicon. But just as the graphics industry learned that fixed-function pipelines couldn’t keep up with the pace of gaming innovation, the AI industry is discovering that fixed hardware is no match for fluid software.

It’s time to re-educate the ecosystem. The future of AI at the edge isn’t about narrowly optimised chips. It’s about programmable, adaptable, parallel compute platforms that can grow and scale with the needs of intelligent software.

Makimoto saw it decades ago. Today, we are living out his insight — riding a wave back toward general-purpose flexibility. The GPU is not just catching up; it’s already in the lead.

Dennis Laudick is Vice President of Product Management, Imagination Technologies. Before joining Imagination Technologies, Dennis held various product and marketing leadership roles at Arm within its automotive, AI and GPU divisions across more than 13 years. Prior to that, he worked in senior positions at several leading semiconductor and OEM companies. Imagination launched its E-Series GPU architecture for AI this month.

