New chip boosts AI computing efficiency

Edge computing powered by AI is already ubiquitous in our lives. Devices like drones, smart wearables, and industrial IoT sensors are equipped with AI-enabled chips so that computing can happen at the “edge” of the internet, where the data originates. This enables real-time processing and helps keep data private.

The NeuRRAM chip is not only twice as energy efficient as leading chips, but it is also versatile and delivers results that are just as accurate as conventional digital chips. (Image credit: David Baillot/University of California, San Diego.)

However, AI functionality on these tiny edge devices is limited by the power a battery can supply, so improving energy efficiency is crucial. In today’s artificial intelligence chips, data processing and storage take place in separate units: a computing unit and a memory unit. Frequent data movement between these units consumes most of the power during AI processing, so reducing data movement is key to solving the power problem.

Stanford University engineers have proposed a potential solution: a new resistive random-access memory (RRAM) chip that performs AI processing in the memory itself, eliminating the separation between computing and memory units. Their “compute-in-memory” (CIM) chip, called NeuRRAM, is about the size of a finger and does more work on limited battery power than current chips can.

“Performing these calculations on the chip instead of sending information to and from the cloud could enable faster, safer, cheaper and more scalable AI in the future, and give more people access to the power of AI,” said H.-S Philip Wong, the Willard R. and Inez Kerr Bell Professor in the School of Engineering.

“The problem of moving data is like spending eight hours commuting for a two-hour workday,” added Weier Wan, a recent Stanford graduate who led the project. “With our chip, we are showing a technology to meet this challenge.”

The team presented NeuRRAM in a recent article in the journal Nature. Although in-memory computing has been around for decades, this chip is the first to demonstrate a wide range of AI applications in hardware, rather than through simulation alone.

Putting computing power on the device

To overcome the data movement bottleneck, the researchers implemented compute-in-memory (CIM), a chip architecture that performs AI computation directly in memory rather than only in separate computing units. The memory technology used by NeuRRAM is resistive random-access memory (RRAM). This is a type of non-volatile memory – memory that retains data even after the power is turned off – that has made its way into commercial products. RRAM can store large AI models in a small area and consumes very little power, making it ideal for small, low-power edge devices.
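The core idea of compute-in-memory can be illustrated with a short simulation. The sketch below (plain Python, with purely illustrative weight and input values not taken from the NeuRRAM design) shows how an RRAM crossbar performs a matrix-vector multiply in place: weights sit in the memory cells as conductances, and the physics of the column wires does the summation, so no weights ever travel to a separate compute unit.

```python
# Hedged sketch of the compute-in-memory principle behind RRAM crossbars.
# All values are illustrative; this is not the NeuRRAM circuit itself.

# Neural-network weights are stored as cell conductances G[i][j]
# (arbitrary units). In real hardware these are programmed resistance
# states and persist without power (non-volatile).
G = [[0.2, -0.5, 0.1],
     [0.7, 0.3, -0.2],
     [-0.4, 0.6, 0.5],
     [0.1, -0.1, 0.9]]

# Input activations are applied as voltages V[i] on the crossbar rows.
V = [1.0, 0.5, 0.0, 0.25]

# Ohm's law gives each cell a current I = G * V, and Kirchhoff's current
# law sums the currents of all cells on a column wire. The column readout
# therefore *is* the matrix-vector product G^T V, computed right where the
# weights are stored -- no data shuttling between memory and compute units.
column_currents = [sum(G[i][j] * V[i] for i in range(len(V)))
                   for j in range(len(G[0]))]

print([round(c, 4) for c in column_currents])
```

A conventional digital chip would instead fetch each weight from memory and multiply-accumulate it explicitly; the arithmetic result is the same, but every fetch costs energy, which is the data-movement overhead the article describes.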

By combining computing and memory in one place, the NeuRRAM chip could improve the efficiency and applications of a wide variety of AI-enabled devices, such as smart wearables, industrial sensors, and drones. (Image credit: Nicolle Fuller/Sayo Studio)

Although the concept of CIM chips is well established and the idea of implementing AI computing in RRAM is not new, “this is one of the first instances to integrate a large amount of memory directly on the neural network chip and present all benchmark results through hardware measurements,” said Wong, who is a co-lead author of the Nature paper.

NeuRRAM’s architecture allows the chip to perform in-memory analog calculations at low power and in a compact footprint. It was designed in collaboration with the laboratory of Gert Cauwenberghs at the University of California, San Diego, a pioneer in the design of low-power neuromorphic hardware. The architecture also enables reconfigurability in data flow directions, supports various AI workload mapping strategies, and can work with different types of AI algorithms, all without sacrificing the accuracy of AI calculations.

To show the accuracy of NeuRRAM’s AI capabilities, the team tested its operation on different tasks. They found that it is 99% accurate in handwritten digit recognition on the MNIST dataset and 85.7% accurate in image classification on the CIFAR-10 dataset, achieved 84.7% accuracy in recognizing Google voice commands, and showed a 70% reduction in image reconstruction error on a Bayesian image recovery task.

“Efficiency, versatility and precision are all important aspects for wider adoption of the technology,” Wan said. “But achieving them all at once is not easy. Co-optimizing the full stack from hardware to software is key.”

“Such comprehensive co-design is made possible by an international team of researchers with diverse expertise,” Wong added.

Powering the edge computing of the future

Currently, NeuRRAM is a physical proof of concept, but needs more development before it’s ready to be translated into real edge devices.

But this combination of efficiency, accuracy and ability to perform different tasks showcases the chip’s potential. “Today it might be used to perform simple AI tasks such as keyword detection or human detection, but tomorrow it could enable a completely different user experience. Imagine real-time video analytics combined with voice recognition in a small device,” Wan said. “To achieve this, we must continue to improve the design and scale RRAM to more advanced technology nodes.”

“This work opens several avenues for future research in RRAM device engineering, as well as programming models and neural network design for in-memory computing, to make this technology scalable and usable by software developers,” said Priyanka Raina, assistant professor of electrical engineering and co-author of the paper.

If successful, RRAM compute-in-memory chips like NeuRRAM have nearly limitless potential. They could be embedded in crop fields to perform real-time AI calculations that adjust irrigation systems to current soil conditions. Or they could turn augmented reality goggles from clunky headsets with limited functionality into something closer to Tony Stark’s viewing screen in the Iron Man and Avengers movies (without intergalactic or multiverse threats – we can hope).

If mass-produced, these chips would be cheap enough, adaptable enough and power-efficient enough that they could be used to advance technologies already improving our lives, Wong said, such as in medical devices that enable at-home health monitoring.

They could also be used to tackle global societal challenges: AI-enabled sensors could play a role in monitoring and combating climate change. “By having these kinds of smart electronics that can be placed almost anywhere, you can monitor the changing world and be part of the solution,” Wong said. “These chips could be used to solve all sorts of problems, from climate change to food security.”

Additional co-authors of this work include researchers from the University of California, San Diego (co-lead), Tsinghua University, University of Notre Dame, and University of Pittsburgh. Former Stanford graduate student Sukru Burc Eryilmaz is also a co-author. Wong is a Fellow of Stanford Bio-X and the Wu Tsai Neurosciences Institute, and affiliated with the Precourt Institute for Energy. He is also faculty director of the Stanford Nanofabrication Facility and co-founding faculty director of the Stanford SystemX Alliance – an industry affiliate program at Stanford focused on building systems.

This research was funded by National Science Foundation Expeditions in Computing, SRC JUMP ASCENT Center, Stanford SystemX Alliance, Stanford NMTRI, Beijing Innovation Center for Future Chips, National Natural Science Foundation of China, and Office of Naval Research.

Sherry J. Basler