Which is greener: training an AI model or five cars?

The field of artificial intelligence is often compared to the oil industry: once extracted and refined, data, like oil, can be a highly lucrative commodity. Now, however, it is becoming clear that the metaphor extends further. Like fossil fuels, deep learning has a substantial environmental impact. In a new paper, researchers at the University of Massachusetts, Amherst performed a life-cycle assessment of several common large AI models.

They found that the process can emit more than 626,000 pounds (around 284,000 kg) of carbon-dioxide equivalent, nearly five times the lifetime emissions of a typical car (including the manufacture of the car itself).

How the AI models were studied

This is a striking quantitative confirmation of something AI researchers have long suspected.

“While probably many of us have thought of this in an abstract, vague way, the figures really show the magnitude of the problem,” says Carlos Gómez-Rodríguez, a computer scientist at the University of A Coruña in Spain, who was not involved in the study. “Neither I nor other researchers I've discussed them with thought the environmental impact would be that substantial.”

The carbon footprint of natural language processing

The paper specifically examines the training process for natural-language-processing (NLP) models, the subfield of AI concerned with teaching machines to handle human language. Over the past two years the NLP community has reached several notable milestones in machine translation, sentence completion, and other standard benchmark tasks. OpenAI's infamous GPT-2 model, as one example, excelled at writing convincing fake news stories.

But such advances have required training ever larger models on sprawling datasets of sentences scraped from the internet. The approach is computationally expensive and highly energy intensive.

The researchers looked at four models in the field responsible for some of the biggest leaps in performance: the Transformer, ELMo, BERT, and GPT-2. They trained each on a single GPU for up to a day to measure its power draw.

They then took the number of training hours listed in each model's original paper to calculate the total energy consumed over the full training process. That figure was converted into pounds of carbon-dioxide equivalent based on the energy mix of AWS, Amazon's cloud service and the largest cloud provider.
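The estimation procedure described above boils down to simple arithmetic: power draw times training time gives energy, and energy times a grid emissions factor gives CO2 equivalent. The sketch below illustrates that logic only; every number in it (per-GPU power, GPU count, hours, emissions factor) is an illustrative placeholder, not a figure from the study.

```python
# Back-of-the-envelope CO2 estimate for training a model.
# All concrete numbers here are illustrative assumptions,
# not measurements from the University of Massachusetts paper.

def training_co2_kg(avg_power_kw: float, train_hours: float,
                    n_gpus: int, co2_kg_per_kwh: float) -> float:
    """Energy used (kWh) times the grid's emissions factor gives kg CO2e."""
    energy_kwh = avg_power_kw * n_gpus * train_hours
    return energy_kwh * co2_kg_per_kwh

# Example: 0.25 kW per GPU, 8 GPUs, 336 hours (two weeks) of training,
# and 0.45 kg CO2e per kWh as a rough stand-in for a mixed energy grid.
estimate = training_co2_kg(0.25, 336, 8, 0.45)
print(f"{estimate:.0f} kg CO2e")  # 302 kg under these assumptions
```

The point of the exercise is that the estimate scales linearly with each factor, which is why the study's totals grow so quickly as models, and the number of training runs, get larger.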

It turned out that the computational and environmental costs of training grew in proportion to model size, then exploded when additional tuning steps were used to squeeze out the model's final accuracy. In particular, neural architecture search, which tries to optimize a model by incrementally tweaking the network's design through trial and error, carries extraordinarily high costs for little performance gain. Without it, the most expensive model, BERT, had a carbon footprint of roughly 1,400 pounds (635 kg), about the same as a round-trip flight across America.

Moreover, these figures should be treated only as baselines.

“Training a single model is the minimum amount of work you can do,” says Emma Strubell, the paper's lead author. In practice, it is far more likely that AI researchers will develop a new model from scratch, or adapt an existing one, requiring many more rounds of training and tuning.

Overall, the scientists estimate, the process of building and testing a final model worthy of publication required training 4,789 models over six months. In CO2-equivalent terms, that is about 35,000 kg.

The significance of these numbers is enormous, especially given current trends in AI research. Broadly speaking, AI research neglects efficiency: large neural networks have proven useful for a wide range of tasks, and companies with unlimited computing resources will use them to gain a competitive advantage.