title: How can AI practitioners reduce our carbon footprint?
author: James
type: post
date: 2019-06-20T09:18:40+00:00
url: /2019/06/20/how-can-ai-practitioners-reduce-our-carbon-footprint/
featured_image: /wp-content/uploads/2019/06/ash-blaze-burn-266487-825x510.jpg
categories: Work
tags: AI, climate catastrophe, climate change, machine learning, nlp

In recent weeks and months the impending global climate catastrophe has been at the forefront of many people's minds. Thanks to movements like Extinction Rebellion, high-profile environmentalists like Greta Thunberg and David Attenborough, and damning reports from the IPCC, it finally feels like momentum is building behind a significant reduction in carbon emissions. That said, knowing how we can help on an individual level, beyond driving and flying less, still feels very overwhelming.

The Energy Issue

A recent study by Strubell et al. (2019) gave insight into exactly how much energy certain neural architectures require to train. Their findings show that training some of the largest and most complex neural models, and running neural architecture search (in which multiple models are trained and measured against a fitness function to find the most performant model for a given task), consumes huge amounts of energy. Assuming that energy came from fossil-fuel power plants, a fair assumption since most researchers are using cloud providers like AWS and GCP which rely largely on fossil-fuel-generated electricity, the most expensive of these training runs produce more CO2 pollution than a car produces in its lifetime.
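As a rough illustration of how an estimate like this can be put together, the sketch below multiplies average hardware power draw by training time and a data-centre overhead factor (PUE), then converts the resulting energy to CO2 using a grid carbon intensity. The function name `training_footprint` and every number in the example are hypothetical placeholders, not figures from the study; substitute measurements for your own hardware, data centre and grid.

```python
def training_footprint(avg_power_watts, hours, pue, kg_co2_per_kwh):
    """Estimate (energy_kwh, co2_kg) for a single training run.

    avg_power_watts: average combined draw of the training hardware
    hours:           wall-clock training time
    pue:             power usage effectiveness (data-centre overhead, >= 1.0)
    kg_co2_per_kwh:  carbon intensity of the electricity supply
    """
    energy_kwh = avg_power_watts * hours * pue / 1000.0
    co2_kg = energy_kwh * kg_co2_per_kwh
    return energy_kwh, co2_kg


# Hypothetical example: 8 accelerators drawing 250 W each for 24 hours,
# with an assumed PUE of 1.5 and an assumed grid intensity of 0.5 kg CO2/kWh.
energy_kwh, co2_kg = training_footprint(
    avg_power_watts=8 * 250, hours=24, pue=1.5, kg_co2_per_kwh=0.5
)
print(f"{energy_kwh:.1f} kWh, {co2_kg:.1f} kg CO2")
```

The estimate scales linearly with each input, so halving training time or moving to a grid with half the carbon intensity halves the result.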

Predictably, mainstream media misconstrued the findings, and articles proposing abandonment of deep learning as a field started to surface (see Charles Radclyffe's Forbes article, “AI's Dirty Secret”: “if AI really does burn this much electricity, then maybe we should just pull the plug if we're serious about climate change?”).

My biggest objection to this conclusion is that it is based upon the notion that all AI is this power hungry. As I said above, Strubell's study is based on some of the biggest and most complex models in the field today. My intuition is that most data scientists and AI researchers are not training models anywhere near this big, and for many data problems it is not even necessary to use deep learning (as I discuss below).

My second objection to this notion that we should scrap AI is that we would necessarily dismiss any and all potential benefits of continuing to develop models that reduce energy consumption by optimising data centres, logistics routes and even energy grids. In the future, the mass adoption of self-driving tech could save vast amounts of energy by removing erratic human drivers, with their fuel-hungry acceleration and braking behaviours, from the road. No more human drivers? Less need for traffic control measures which force millions of us to slow down and speed up every day, burning large amounts of fuel that wouldn't be needed if we maintained a steady speed. None of this will be possible if we just stop trying to improve deep learning approaches overnight.

The BERT language model, one of Strubell's worst offenders, is, at the time of writing, the state-of-the-art approach for a number of natural language processing tasks. What if BERT-based models powering chatbots and smart speakers could help consumers to make better purchasing decisions and prevent thousands of packages from being shipped and then returned on gas-guzzling lorries, planes and cargo ships?

20 years ago most of us had power hungry CRT monitors and TVs that we've since replaced with more efficient LCD and LED displays. We were using incandescent lightbulbs that use 6x more electricity than a modern bulb and need replacing an order of magnitude more frequently. Our renewable generation technology has come on leaps and bounds, with solar panels becoming significantly cheaper and more efficient over the last 20 years. My point here is that humans are pretty good at improving the energy efficiency of our inventions. I'm sure most readers who frequently sit in their electrically lit living room at 10pm watching a flat screen TV or scrolling on an OLED touch screen on their smartphone are glad that we didn't give up on these technologies because CRT screens and incandescent bulbs were too energy hungry.

What can the AI community do?

There are a number of things that the AI community can do to help reduce its carbon footprint. Some are simple and straightforward, others are a little more involved.

KISS: Keep it simple, stupid!

When you're building ML models, always start with a simple model first. It may be tempting to charge in with a deep learning model immediately, but these models are slow to train, prone to overfitting due to their complexity and, of course, energy hungry. Aside from appeasing the marketing department, there is absolutely no advantage to using a deep learning model before you've even tried Logistic Regression or, whoa, don't go too crazy now, a random decision forest!

Even if you train a few different simple models with different data folds and hyperparameters, you'll probably find it a quicker and less energy hungry starting point. Of course, if simple models don't work, deep learning is a good option.
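To make the "simple model, several folds" idea concrete, here is a minimal, dependency-free sketch of k-fold cross-validation around a very cheap baseline. A nearest-centroid classifier stands in for Logistic Regression purely to keep the example short (in practice you would more likely reach for a library implementation), and the two-cluster data is synthetic. All function names are illustrative.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_centroids(X, y):
    """'Train' the baseline: one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(column) for column in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def cross_val_accuracy(X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        centroids = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(centroids, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return mean(scores)


# Synthetic two-class data: two well separated Gaussian blobs.
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
X += [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

accuracy = cross_val_accuracy(X, y, k=5)
print(f"baseline cross-validated accuracy: {accuracy:.2f}")
```

If a baseline like this already gets close to the accuracy you need, that is a strong hint a much larger model may not be worth its training cost.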

Pre-trained models and transfer learning

This could apply to both simple models (well, kinda) and deep learning models.

It is well known by now that the best way to get near state-of-the-art performance for classification tasks in NLP and computer vision is to take a pre-trained model like BERT or ResNet and “continue” training by updating the last few layers of the neural model with new weights.

Unless you're a multi-national or a top tier research institute with lots of money and data to throw at training, trying to train one of these systems from scratch may be a waste of time and energy anyway (I said may be, not always. If you're working on new state-of-the-art models then I salute you! We should always strive to better ourselves!).

You can also combine the KISS approach with pre-trained weights. You can achieve some really great text classification results by using pre-trained word embeddings like GloVe, word2vec or fastText as input features for a linear classification model like a linear SVM.
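A minimal sketch of that combination: each document is represented as the average of its word vectors, and a linear model is trained on top. Everything here is a toy stand-in. `TOY_EMBEDDINGS` is a hand-made, hypothetical 3-dimensional table used in place of real GloVe, word2vec or fastText vectors, and a plain perceptron is swapped in for the SVM so that the example needs no external libraries.

```python
# Hypothetical 3-d vectors standing in for real pre-trained embeddings.
TOY_EMBEDDINGS = {
    "great": [0.9, 0.1, 0.0],
    "love": [0.8, 0.2, 0.1],
    "awful": [-0.9, 0.1, 0.0],
    "hate": [-0.8, 0.2, 0.1],
    "film": [0.0, 0.9, 0.1],
    "book": [0.0, 0.8, 0.2],
}


def embed(text, embeddings, dim=3):
    """Average the vectors of all known words; zeros if none are known."""
    vectors = [embeddings[tok] for tok in text.lower().split() if tok in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def score(weights, bias, features):
    """Linear decision function w.x + b."""
    return sum(w * v for w, v in zip(weights, features)) + bias


def train_perceptron(features, labels, epochs=10):
    """Train a linear classifier; labels must be +1 or -1."""
    weights, bias = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(features, labels):
            if target * score(weights, bias, x) <= 0:  # misclassified: nudge
                weights = [w + target * v for w, v in zip(weights, x)]
                bias += target
    return weights, bias


def classify(text, weights, bias, embeddings):
    """Return +1 (positive) or -1 (negative) for a piece of text."""
    return 1 if score(weights, bias, embed(text, embeddings)) > 0 else -1


train_texts = ["great film", "love book", "awful film", "hate book"]
train_labels = [1, 1, -1, -1]
train_features = [embed(t, TOY_EMBEDDINGS) for t in train_texts]
weights, bias = train_perceptron(train_features, train_labels)

print(classify("love film", weights, bias, TOY_EMBEDDINGS))
print(classify("awful book", weights, bias, TOY_EMBEDDINGS))
```

The only thing being trained is a handful of weights; the expensive part, learning the embeddings, was done once by somebody else and is simply reused.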

Scale down big data

If you're developing a model and working with a massive dataset, you might consider training on a small but representative subset of the data. You'll need to be very careful about this, especially if your dataset is not well balanced or has very rare features (in NLP this could be words that are important but only occur in a tiny proportion of documents). However, if you know that you're likely to need to change the model 10 more times before you calculate your final performance metrics, it might (but won't always) make sense to train it on 10,000 samples instead of 100,000 samples.
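One way to keep a subset representative is to sample each class separately so that rare classes do not vanish. The sketch below is one possible approach, assuming class labels are the thing you need to preserve; the function name and the `min_per_class` safeguard are illustrative choices, and it does nothing to protect rare features inside a class.

```python
import random
from collections import Counter, defaultdict


def stratified_subsample(labels, fraction, seed=0, min_per_class=1):
    """Return sorted indices of a subset that keeps class proportions.

    Each class contributes round(fraction * class_size) examples, but never
    fewer than min_per_class (or the whole class, if it is smaller than that).
    """
    indices_by_label = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_label[label].append(index)

    rng = random.Random(seed)
    chosen = []
    for indices in indices_by_label.values():
        wanted = max(min_per_class, round(len(indices) * fraction))
        chosen.extend(rng.sample(indices, min(wanted, len(indices))))
    return sorted(chosen)


# Imbalanced toy labels: 990 "common" examples and 10 "rare" ones.
labels = ["common"] * 990 + ["rare"] * 10
subset = stratified_subsample(labels, fraction=0.1)
print(len(subset), Counter(labels[i] for i in subset))
```

A plain random 10% sample of the same data would quite often contain no "rare" examples at all, which is exactly the failure mode to watch for.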

If you're building models that use a gradient descent or evolutionary training approach, then you could also limit the number of epochs during development of your model.
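In code, this usually amounts to exposing the epoch budget as a parameter and stopping early once the loss stops improving. The sketch below fits a straight line with batch gradient descent as a stand-in for a real model; the function names, learning rate and tolerance are illustrative assumptions rather than recommended values.

```python
def mean_squared_error(w, b, xs, ys):
    """Average squared error of the line y = w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys, learning_rate=0.05, max_epochs=20, tolerance=1e-9):
    """Fit y = w*x + b by gradient descent with a hard epoch cap.

    Training stops after max_epochs, or earlier if one epoch improves the
    loss by less than tolerance. Returns (w, b, final_loss, epochs_run).
    """
    w, b = 0.0, 0.0
    n = len(xs)
    loss = mean_squared_error(w, b, xs, ys)
    epochs_run = 0
    for _ in range(max_epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        w -= learning_rate * 2 * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * 2 * sum(errors) / n
        epochs_run += 1
        new_loss = mean_squared_error(w, b, xs, ys)
        improvement = loss - new_loss
        loss = new_loss
        if improvement < tolerance:
            break  # further epochs would burn energy for almost no gain
    return w, b, loss, epochs_run


xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]  # generated from y = 2x + 1

# Small budget while iterating on the model, larger budget for the final run.
dev_run = fit_line(xs, ys, max_epochs=20)
final_run = fit_line(xs, ys, max_epochs=5000)
print("dev epochs:", dev_run[3], "final epochs:", final_run[3])
```

The development run gives a quick, cheap signal about whether a change helps; only the final configuration needs the full budget, and the early-stopping check keeps even that run from using more epochs than it benefits from.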

Give patronage to “green” hosting providers

Big companies are not always the most transparent so this suggestion could be trickier. That said, taking your money to where the ethical hosting is could be a good way to reduce your model's carbon footprint, especially if you are one of the pioneers working on massive models that use a lot of electricity. Hardware is an important consideration too. GPUs have been a key tool in the evolution of deep learning over the last 10 years, but it turns out that TPUs are better suited to deep learning and much less energy hungry too.

Controversial Suggestion: Carbon Reporting in AI and ML Scientific Publications

This one's probably going to be a divisive suggestion, but what if we could get all the big ML academic conferences to require some basic calculation of energy usage with all new model architecture submissions? The idea is to introduce a race to the bottom for AI model power consumption. A model that uses 100x less electricity and achieves near state-of-the-art performance would be much more interesting than one that improves state-of-the-art performance by 0.1%.

I'm well aware that this solution is far from perfect given cloud hosting transparency concerns (see above), and conference organisers would have to think carefully about how to set up peer reviews in a way that avoids always rewarding energy efficiency at the expense of model task performance.

I guess another approach could be an international conference for energy efficient machine learning systems. If there's enough interest in such a conference from the academic community, I'd seriously consider organising one. Also, if one already exists, I'd be interested in participating.

If you'd like to discuss the above, I'm on Twitter @jamesravey.

Conclusion

In closing, I'm really glad that Strubell et al. have brought this issue to the forefront of our minds and that the work has picked up so much attention. Rather than panicking and downing our tools, I think it's important that we remain optimistic about AI and the huge advantages that it can bring, and that we try to be as considerate as possible of environmental factors whenever we develop new approaches.