A team of engineers at AI inference technology company BitEnergy AI reports a method to reduce the energy needs of AI applications by 95%. The group has published a paper describing their new technique on the arXiv preprint server.
As AI applications have gone mainstream, their use has risen dramatically, driving a sharp increase in energy needs and costs. LLMs such as the one behind ChatGPT require a lot of computing power, which in turn means a lot of electricity is needed to run them.
As just one example, ChatGPT now consumes roughly 564 MWh daily, enough to power about 18,000 American homes. As the science continues to advance and such apps grow more popular, critics have suggested that AI applications could consume around 100 TWh annually within a few years, on par with Bitcoin mining operations.
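As a rough sanity check on that comparison (a back-of-the-envelope calculation not in the source, assuming the U.S. EIA's average household consumption of about 10,500 kWh per year, or roughly 29 kWh per day):

$$
\frac{564\ \text{MWh/day}}{18{,}000\ \text{homes}} \approx 31\ \text{kWh per home per day},
$$

which lines up with that average.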
In this new effort, the team at BitEnergy AI claims to have found a way to dramatically reduce the amount of computation required to run AI apps without degrading their performance.
The new technique is simple: instead of using complex floating-point multiplication (FPM), the method relies on integer addition. Applications use FPM to represent extremely large or small numbers and to carry out calculations on them with high precision; it is also the most energy-intensive part of AI number crunching.
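To make the idea concrete, here is a minimal Python sketch of the long-known principle such approaches build on, not BitEnergy AI's actual algorithm (the paper should be consulted for that): because the IEEE-754 bit pattern of a positive float is approximately a scaled, offset base-2 logarithm of its value, adding two bit patterns as integers and subtracting a bias constant approximates their product (Mitchell's approximation). All function names and test values below are illustrative.

```python
import struct

BIAS = 0x3F800000  # IEEE-754 bit pattern of 1.0 as a float32

def float_to_bits(x: float) -> int:
    """Reinterpret a float32 as its 32-bit integer pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(b: int) -> float:
    """Reinterpret a 32-bit integer pattern as a float32."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]

def approx_mul(a: float, b: float) -> float:
    """Approximate a * b (for a, b > 0) with one integer addition.

    The bit pattern of a positive float is roughly a scaled, offset
    log2 of its value, so adding patterns ~ adding logs ~ multiplying.
    """
    return bits_to_float(float_to_bits(a) + float_to_bits(b) - BIAS)

if __name__ == "__main__":
    for a, b in [(3.0, 5.0), (1.7, 2.2), (0.35, 120.0)]:
        approx, exact = approx_mul(a, b), a * b
        print(f"{a} * {b}: exact={exact:.3f}, approx={approx:.3f}, "
              f"error={(approx - exact) / exact:+.1%}")
```

In this basic form the relative error stays within roughly 11% in the worst case; the payoff is that an integer add costs a small fraction of the energy of a floating-point multiply in hardware, which is where the reported savings would come from.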