DeepMind AI One-Ups Mathematicians at a Calculation Crucial to Computing

DeepMind has done it again.

After solving a fundamental challenge in biology – predicting protein structure – and unraveling the math of knot theory, it has set its sights on a fundamental computing process embedded in thousands of everyday applications. From image analysis to weather modeling to probing the inner workings of artificial neural networks, AI could theoretically speed up computations across a range of fields, increasing efficiency while cutting energy consumption and costs.

But perhaps most impressive is how they did it. The record-breaking algorithm, dubbed AlphaTensor, is a spin-off of AlphaZero, which famously beat human players at chess and Go.

“Algorithms have been used in all civilizations around the world to perform fundamental operations for thousands of years,” wrote co-authors Drs. Matej Balog and Alhussein Fawzi at DeepMind. “However, discovering algorithms is very difficult.”

AlphaTensor ushers in a new world where AI designs programs that surpass anything human engineers have devised, while simultaneously improving its own machine “brain”.

“This work pushes into uncharted territory by using AI for an optimization problem that people have worked on for decades… the solutions it finds can be immediately scaled up to improve computational run times,” said Dr. Federico Levi, an editor at Nature, which published the study.

Enter matrix multiplication

The problem AlphaTensor faces is matrix multiplication. If you’re suddenly envisioning rows and columns of green numbers scrolling across your screen, you’re not alone. Basically, a matrix is something like that – a grid of numbers that numerically represents data of your choice. It could be pixels in an image, frequencies in an audio clip, or the appearance and actions of characters in video games.

Matrix multiplication takes two grids of numbers and multiplies one by the other. It’s a calculation often taught in high school, but it’s also essential for computer systems. Here, rows of numbers in one matrix are multiplied with columns in another, and the results combine into a new matrix – one that might, for example, encode a command to zoom or tilt your view of a video game scene. Though these calculations run under the hood, anyone using a phone or computer depends on their results every day.
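For readers who want to see it concretely, here is a minimal Python sketch of the schoolbook method described above (the function name is illustrative, not from the study):

```python
# Schoolbook matrix multiplication: each row of A is paired with each
# column of B, multiplying element by element and summing the products.
def matmul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    assert all(len(row) == m for row in A), "inner dimensions must match"
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]
print(matmul(A, B))  # [[19, 22], [43, 50]]
```

Note that for two n-by-n matrices this performs n·n·n scalar multiplications – the cost that faster algorithms try to beat.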

You can see how the problem can get extremely difficult, extremely fast. Multiplying large matrices demands enormous energy and time: each pair of numbers must be multiplied individually to build the new matrix. As matrices grow, the problem quickly becomes untenable – even more so than predicting the best moves in chess or Go. Some experts estimate there are more ways to solve matrix multiplication than there are atoms in the universe.

In 1969, Volker Strassen, a German mathematician, showed that there are ways to cut corners, slashing the number of multiplications needed for a round of two-by-two matrix multiplication from eight to seven. That might not sound impressive, but Strassen’s method showed it’s possible to beat the long-standing gold standards of operations – that is, algorithms – for matrix multiplication. His approach, Strassen’s algorithm, has reigned as the most efficient approach for over 50 years.
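Strassen’s trick is compact enough to show in full. The sketch below implements the standard seven-product formulas for the two-by-two case – seven multiplications (m1..m7) instead of the naive eight, at the cost of a few extra additions:

```python
# Strassen's algorithm for 2x2 matrices: seven scalar multiplications
# instead of eight, recombined with additions and subtractions.
def strassen_2x2(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Applied recursively to large matrices split into 2x2 blocks, saving one multiplication per round compounds into substantial savings – which is why a single dropped multiplication matters so much.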

What if there were even more efficient methods? “Nobody knows the best algorithm to solve it,” Dr. Francois Le Gall of Nagoya University in Japan, who was not involved in the work, told MIT Technology Review. “It’s one of the biggest open problems in computer science.”

AI tracking algorithms

If human intuition is failing, why not tap into a mechanical mind?

In the new study, the DeepMind team turned matrix multiplication into a game. Like its predecessor AlphaZero, AlphaTensor uses deep reinforcement learning, a machine learning method inspired by the way biological brains learn. Here, an AI agent (often an artificial neural network) interacts with its environment to solve a multi-step problem. If it succeeds, it earns a “reward” – that is, the AI’s network parameters are updated so that it’s more likely to succeed again in the future.
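The reward-driven update loop can be sketched with a toy far simpler than AlphaTensor (this is an illustrative two-armed bandit, not DeepMind’s code): the agent’s value estimates are nudged toward the rewards it receives, so the action that pays off more becomes more likely to be chosen.

```python
import random

# Toy reinforcement learning loop: rewards update the agent's
# "parameters" (here, simple value estimates) after each action.
random.seed(0)
values = [0.0, 0.0]          # the agent's learned estimates
true_payout = [0.3, 0.8]     # hidden reward probabilities of each action
alpha, epsilon = 0.1, 0.1    # learning rate and exploration rate

for _ in range(2000):
    # explore occasionally; otherwise exploit the current best estimate
    if random.random() < epsilon:
        arm = random.randrange(2)
    else:
        arm = values.index(max(values))
    reward = 1.0 if random.random() < true_payout[arm] else 0.0
    values[arm] += alpha * (reward - values[arm])  # the "reward" update

print(values)  # the estimate for action 1 climbs toward its 0.8 payout
```

Deep RL replaces the two numbers in `values` with the millions of parameters of a neural network, but the principle – success reinforces the behavior that produced it – is the same.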

It’s like learning to flip a pancake. Many will initially land on the floor, but your neural networks eventually learn the arm and hand movements for a perfect flip.

AlphaTensor’s training ground is a kind of 3D board game. It’s essentially a one-player puzzle roughly similar to Sudoku. The AI must multiply grids of numbers in as few steps as possible, while choosing from a myriad of allowed moves – over a trillion of them.

These allowed moves were meticulously crafted in AlphaTensor. During a press briefing, co-author Dr. Hussain Fawzi explained, “Formulating the space of algorithmic discovery is very complex… even more difficult, how do we navigate this space.”

In other words, faced with a bewildering array of options, how can we narrow them down to improve our chances of finding the needle in the haystack? And how can we best devise a strategy to reach the needle without digging through the whole haystack?

One trick the team incorporated into AlphaTensor is a method called tree search. Rather than, metaphorically speaking, randomly digging through the haystack, here the AI probes “roads” that could lead to a better outcome. Intermediate learnings then help the AI plan its next move to boost the chances of success. The team also showed AlphaTensor examples of successful games, like teaching a kid opening chess moves. Finally, once the AI discovered valuable moves, the team let it reorder those operations for more tailored learning in search of a better result.
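The “probe the most promising road first” idea can be illustrated with a bare-bones best-first tree search (a deliberately simplified stand-in for AlphaTensor’s far more sophisticated learned search; the toy puzzle and names are invented for illustration):

```python
import heapq

# Best-first tree search: instead of expanding moves at random, always
# expand the partial path whose score looks most promising.
def best_first_search(start, goal, moves, score):
    frontier = [(score(start), start, [])]   # (priority, state, path)
    seen = {start}
    while frontier:
        _, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        for label, nxt in moves(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (score(nxt), nxt, path + [label]))
    return None

# Toy puzzle: reach 13 from 1 using "+1" or "*3" moves; the score steers
# the search toward states closer to the goal.
moves = lambda n: [("+1", n + 1), ("*3", n * 3)]
print(best_first_search(1, 13, moves, score=lambda n: abs(13 - n)))
# ['*3', '*3', '+1', '+1', '+1', '+1']
```

AlphaTensor’s twist is that the scoring function isn’t a fixed heuristic like the one above – it’s learned by the neural network as it plays, which is what lets the search navigate a space of over a trillion possible moves.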


AlphaTensor performed well. In a series of tests, the team challenged the AI to find the most efficient solutions for matrices up to five-by-five – that is, with up to five numbers in each row and column.

The algorithm quickly rediscovered Strassen’s original hack, then surpassed all solutions previously devised by the human mind. Testing the AI with different matrix sizes, the team found more than 70 more efficient solutions. “In fact, AlphaTensor typically discovers thousands of algorithms for each matrix size,” the team said. “It’s mind-boggling.”

In one case, multiplying a five-by-five matrix with a four-by-five one, the AI slashed the previous record of 80 individual multiplications to just 76. It also shone on larger matrices, reducing the number of calculations needed to multiply two eleven-by-eleven matrices from 919 to 896.

With the proof of concept in hand, the team turned to practical use. Computer chips are often designed to optimize different computations — GPUs for graphics, for example, or AI chips for machine learning — and matching an algorithm with the best-suited hardware increases efficiency.

Here, the team used AlphaTensor to find algorithms for two chips popular in machine learning: the NVIDIA V100 GPU and the Google TPU. Altogether, the AI-developed algorithms boosted calculation speed by up to 20 percent.

It’s hard to say whether AI can also speed up smartphones, laptops, or other everyday devices. However, “this development would be very exciting if it could be used in practice,” said Dr Virginia Williams of MIT. “An increase in performance would improve many applications.”

The mind of an AI

Although AlphaTensor broke the last human record for matrix multiplication, the DeepMind team cannot yet explain why.

“It has this incredible intuition playing these games,” DeepMind scientist and co-author Dr. Pushmeet Kohli said during a press briefing.

Evolving better algorithms also doesn’t have to be man versus machine.

Although AlphaTensor is a stepping stone to faster algorithms, even faster algorithms might exist. “Because it must restrict its search to algorithms of a specific form, it might miss other kinds of algorithms that might be more efficient,” Balog and Fawzi wrote.

Perhaps an even more intriguing path would combine human and machine intuition. “It would be nice to know if this new method actually subsumes all the previous ones, or if you can combine them and get something even better,” Williams said. Other experts agree. With a host of algorithms at their disposal, scientists can begin to dissect them for clues as to what made AlphaTensor’s solutions work, paving the way for the next breakthrough.

Image credit: DeepMind

Sherry J. Basler