Can algorithmic progress in AI be summed up as a "Moore's Law"? OpenAI's work fuels the debate.
The organization, which last year became a "capped-profit" company, has identified three main drivers of artificial intelligence development: data, computational resources, and algorithmic innovation.
It focused on the latter, and in particular on "algorithmic efficiency."
Traditionally, this metric reflects the reduction in computing power needed to achieve a specific capability.
To better fit machine learning, where the difficulty of tasks is harder to evaluate, OpenAI measured the compute needed to reach a constant level of performance. Its analysis was limited to the training phase of the models.
The study focuses mainly on image classification, using the ImageNet database.
The finding: between 2012 and 2020, the computing power required to reach the same level of training performance was halved every 16 months.
This value is based on the gap measured between EfficientNet and AlexNet: the former requires 44 times fewer resources to reach the level the latter achieved at the time.
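As a quick sanity check, the 16-month figure can be recovered from the 44x factor. This is a sketch assuming roughly seven years (84 months) separate AlexNet (2012) from the EfficientNet result; only the 44x reduction comes from the article itself.

```python
import math

# Assumption: ~84 months between AlexNet (2012) and the EfficientNet
# measurement; the 44x compute reduction is the figure from the article.
factor = 44
months = 7 * 12

# A 44x reduction contains log2(44) ≈ 5.46 successive halvings.
halvings = math.log2(factor)

# Average time per halving of the required compute.
halving_period = months / halvings
print(round(halving_period, 1))  # ≈ 15.4 months, i.e. roughly 16
```

The result lands near the 16-month halving period reported in the study, which suggests the headline rate is essentially this exponential fit.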
Still in image classification, OpenAI observed a similar trend with ResNet-50: the required compute halved every 17 months.
The results are comparable for the inference phase: required compute halved every 15 months between AlexNet and ShuffleNet, and every 13 months between ResNet and EfficientNet.
Games and translation
Relying primarily on open-source re-implementations (including in PyTorch), OpenAI extended its analysis to other types of tasks. Among them, translation.
Progress is much faster than for vision-related tasks.
A case in point is the Transformer, which required 61 times fewer resources than Seq2Seq to translate English text into French, based on the WMT14 benchmark.
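The same back-of-the-envelope calculation shows how much faster this is than the vision trend. This sketch assumes Seq2Seq dates from 2014 and the Transformer from 2017, about 36 months apart; only the 61x factor is stated in the article.

```python
import math

# Assumptions: ~36 months between Seq2Seq (2014) and the Transformer
# (2017); the 61x compute reduction is the figure from the article.
factor = 61
months = 3 * 12

# Average time per halving of the required compute.
halving_period = months / math.log2(factor)
print(round(halving_period, 1))  # ≈ 6.1 months, vs ~16 for image classification
```

Under these assumptions, compute requirements for translation halved roughly every six months, well ahead of the 16-month pace measured on ImageNet.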
The trend is similar in the field of games, although the measurements there were taken over shorter intervals.
OpenAI sees several explanations for this progress, including batch normalization, residual connections, and the ability to generalize from small data samples.
Underlying this is a call for AI development stakeholders (researchers, economists, regulators, etc.) to better factor algorithmic progress, in both the short and long term, into their trade-offs.
The demonstration has its limits, OpenAI acknowledges, and in both directions. On the one hand, the analysis did not account for gains from low-precision computing or from optimized GPU kernels.
On the other hand, AlexNet was originally trained for 90 epochs.
Yet 62 are enough for it to reach 99.6% of its final performance.
Main illustration © Natalia Shepeleva / Shutterstock.com