Strange IndiaStrange India

A replica of a game between 'Go' player Lee Se-Dol and a Google-developed super-computer, in Seoul, Korea, 2016.

DeepMind used the technology behind AlphaZero — its artificial-intelligence system for playing board games — to generate sorting algorithms.Credit: Ed Jones/AFP via Getty

An artificial intelligence (AI) system based on Google DeepMind’s AlphaZero AI created algorithms that, when translated into the standard programming language C++, can sort data up to three times as fast as human-generated versions.

“We were a bit shocked,” said Daniel Mankowitz, a computer scientist at DeepMind who led the work. “We didn’t believe it at first.”

Computer scientists have, for decades, been optimizing how computers sort data to shave off crucial milliseconds in returning search results or alphabetizing contact lists. Now DeepMind, based in London, has vastly improved sorting speeds by applying the technology behind AlphaZero — its artificial-intelligence system for playing the board games chess, Go and shogi — to a game of building sorting algorithms. “This is an exciting result,” said Emma Brunskill, a computer scientist at Stanford University, California.

The system, AlphaDev, is described in a paper in Nature1, and has invented faster algorithms that are already part of two standard C++ coding libraries, so are being used trillions of times per day by programmers around the world.

Starting small

Table of Contents

The researchers first applied AlphaDev to the task of sorting numbers by size. They started small, with algorithms that sorted only 3, 4, or 5 numbers at a time, but these are important because they’re used by algorithms that sort longer lists. AlphaDev operated at the level of assembly instructions: code generated by automated compilers from code that programmers write in C++, before it is translated into the 1s and 0s of machine code.

AlphaDev works similarly to its predecessor, AlphaZero, which combines computer versions of deliberation and intuition to choose moves in a board game2. AlphaDev doesn’t choose moves; instead, it chooses instructions to add to a procedure (in what DeepMind engineers call AssemblyGame).

When using deliberation, at each decision point, AlphaZero considers its possible moves, its possible moves after each of those moves, and so on, in a branching fashion, calculating which moves are most likely to end with a win. But considering all possible branches might take longer than the age of the universe, so it uses something akin to intuition to narrow its search. At each step, the computer program feeds the game state into neural networks — complex, tunable mathematical functions — that highlight the most promising moves. During training, it continually updates the networks based on game outcomes. It also explores moves by not always picking the one that is currently highest-rated.

Rewards offered

AlphaDev can take one of four types of actions that involve comparing values, moving values between locations, or jumping to a different part of the program. After each step, it tries sorting a set of lists and receives a reward for how many items in the lists it sorted correctly. It plays until it sorts all the lists perfectly or reaches a program length limit, then starts a new program from scratch.

The neural networks evaluated and rewarded the programs not only on correctness, but also on speed. Mankowitz’s team trained the system to evaluate speed either on the basis of the number of total instructions or the processing time. Depending on the processor used and the number of values to be sorted, AlphaDev’s best algorithms took between 4% and 71% less time than did human algorithms. But when the algorithms were called multiple times to sort lists of one -quarter of a million values, the cumulative time saving was only 1–2%, because of other code it did not optimize.

The DeepMind team also applied AlphaDev to non-sorting algorithms. Its version of an algorithm used to convert data stored in a particular format into bytes took 67% less time than a standard version. And its hashing algorithm, used in data storage and retrieval, took 30% less time than a standard one.

To see where AlphaDev eked out its gains, the team took a closer look at its algorithms. For sorting, they found two new tactics, which they called the AlphaDev swap move and the AlphaDev copy move. Mankowitz compares them to ‘Move 37’, a surprising move that AlphaGo, a predecessor to AlphaZero, made against the human Go champion Lee Sedol in 2016, at an exhibition match in Seoul. “It’s something that, in hindsight, was actually fundamental to winning the game and influenced how we thought about strategies,” he says.

In terms of the science, “I don’t know that there’s anything particularly deep in there,” says Michael Littman, a computer scientist at Brown University in Providence, Rhode Island, who notes that AlphaZero has already been around for six years. “But the engineering is phenomenal.” He adds that the researchers behind DeepMind are good at fitting the method to new problems. Last year, DeepMind also modified AlphaZero to create AlphaTensor3, which invented faster ways to multiply grids of numbers.

In future, the DeepMind team would like to apply AlphaZero-style algorithms to more kinds of problems, even the design of hardware itself, Mankowitz says. “We really want to be tackling the whole stack.”

Source link


Leave a Reply

Your email address will not be published. Required fields are marked *