ChatGPT has triggered an onslaught of artificial intelligence hype. The arrival of OpenAI’s large-language-model-powered (LLM-powered) chatbot forced leading tech companies to follow suit with similar applications as quickly as possible. The race to develop ever more powerful AI models continues: Meta released its LLM, Llama, at the beginning of 2023, and Google introduced its Bard model (since renamed Gemini) that same year. Other providers, such as Anthropic, have also delivered impressive AI applications.
The new LLMs are anything but perfect, however: a lot of time and computing power is needed to train them, and it is usually unclear how they arrive at their results. In fact, current AI models are black boxes: you enter something, and they deliver an output without any accompanying explanation. This makes it difficult to figure out whether a program is making something up (“hallucinating”) or providing a meaningful answer. Most companies try to achieve more reliable results by training the models on even more data or by optimizing them for specific tasks, such as solving mathematical problems.
The basic principle behind these AI models has remained largely untouched, however: the algorithms are usually based on neural networks, which are loosely modeled on the visual cortex of the brain. But a team led by physicist Ziming Liu of the Massachusetts Institute of Technology has now developed an approach that surpasses conventional neural networks in many respects. As the researchers reported in late April in a preprint paper that has not yet been peer-reviewed, so-called Kolmogorov-Arnold networks (KANs), named for a classical mathematical representation theorem, can master a wide range of tasks more efficiently and solve scientific problems better than previous approaches. And probably their biggest advantage is that their results can be reproduced. The experts hope that KANs can be integrated into LLMs to enhance their performance.
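The core design change in the preprint (Liu et al., arXiv:2404.19756) is this: where a conventional network applies fixed activation functions at its nodes, a KAN places a small learnable one-dimensional function on every connection between nodes, following the Kolmogorov-Arnold representation theorem. The sketch below illustrates that idea in plain Python. It is a minimal illustration under stated assumptions, not the authors’ implementation: it substitutes simple cubic polynomials for the B-splines used in the paper, and all names, sizes, and initializations are chosen for this example only.

```python
import numpy as np

# Minimal sketch of a Kolmogorov-Arnold layer. Assumption: each edge
# (input p -> output q) carries its own learnable 1-D function, here a
# cubic polynomial instead of the paper's B-splines. The layer output
# follows the Kolmogorov-Arnold form
#   f(x) = sum_q Phi_q( sum_p phi_{q,p}(x_p) ),
# where one layer supplies the inner functions phi and a second layer
# stacked on top plays the role of the outer functions Phi.
class KANLayer:
    def __init__(self, n_in, n_out, degree=3, rng=None):
        rng = rng or np.random.default_rng(0)
        # One set of polynomial coefficients per (output, input) edge.
        self.coeffs = rng.normal(scale=0.1, size=(n_out, n_in, degree + 1))

    def __call__(self, x):
        # x: input vector of shape (n_in,).
        # Powers of each input: shape (n_in, degree + 1).
        powers = np.stack([x**k for k in range(self.coeffs.shape[-1])], axis=-1)
        # Evaluate every edge's univariate function phi_{q,p}(x_p):
        # result has shape (n_out, n_in).
        edge_values = np.einsum("qpk,pk->qp", self.coeffs, powers)
        # Sum the edge functions over the inputs, as in the theorem.
        return edge_values.sum(axis=1)  # shape (n_out,)

# Two stacked layers give the full inner/outer structure of the theorem.
x = np.array([0.5, -1.2, 0.3])
hidden = KANLayer(3, 7)(x)        # inner functions phi_{q,p}
output = KANLayer(7, 1)(hidden)   # outer functions Phi_q
print(output)
```

In a real system the polynomial (or spline) coefficients would be fit by gradient descent, just as the weights of an ordinary network are; the difference is that what gets learned is the visible shape of each edge’s function, which, in the paper’s framing, is what makes the trained model easier to inspect.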