Demystifying AI: The Probability Theory Behind LLMs Like OpenAI’s ChatGPT

Admin CG, January 23, 2024

When a paradigm shift occurs, it is not always obvious to those affected by it. 

But there is no “eye of the storm” equivalent when it comes to generative artificial intelligence (AI). 

The technology is here. There are already various commercial products available for deployment, and organizations that can effectively leverage it in support of their business goals are likely to outperform their peers that fail to adopt the innovation. 

Still, as with many innovations, uncertainty and institutional inertia reign supreme. That is why understanding how the large language models (LLMs) powering AI work is critical, not just to piercing the black box of the technology’s supposed inscrutability, but also to applying AI tools correctly within an enterprise setting. 

The most important thing to understand about the foundational models powering today’s AI interfaces and giving them their ability to generate responses is the simple fact that LLMs, like Google’s Bard, Anthropic’s Claude, OpenAI’s ChatGPT and others, are just adding one word at a time. 

Underneath the layers of sophisticated algorithmic calculations, that’s all there is to it. 

That’s because at a fundamental level, generative AI models are built to generate reasonable continuations of text by drawing from a ranked list of words, each given different weighted probabilities based on the data set the model was trained on. 
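To make the idea of a ranked, weighted list of words concrete, here is a minimal Python sketch. The prompt, the candidate words and their probabilities are all invented for illustration; a real model computes its probabilities from billions of learned parameters rather than a hand-written table.

```python
import random

# Hypothetical continuations for the prompt
# "The best thing about AI is its ability to ..."
# The probabilities are made up for illustration and sum to 1.
candidates = {
    "learn": 0.45,
    "predict": 0.25,
    "make": 0.15,
    "understand": 0.10,
    "do": 0.05,
}

def most_likely(probs):
    """Return the top-ranked word, i.e. the one with the highest probability."""
    return max(probs, key=probs.get)

def pick_next_word(probs, rng):
    """Draw one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so repeated runs give the same draws
samples = [pick_next_word(candidates, rng) for _ in range(1000)]
print("Top-ranked word:", most_likely(candidates))
print("Share of 'learn' in 1,000 draws:", samples.count("learn") / 1000)
```

Always taking the top-ranked word would give the same answer every time. Drawing by weight, as `pick_next_word` does, is one simple way to show why the same prompt can yield different responses.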

How AI Works
While news of AI that can surpass human intelligence is helping fuel the hype around the technology, the reality is driven far more by math than by myth. 

“It is important for everyone to understand that AI learns from data … at the end of the day [AI] is merely probabilistics and statistics,” Akli Adjaoute, AI pioneer and founder and general partner at venture capital fund Exponion, told PYMNTS in November. 

But where do the probabilities that determine an AI system’s output originate? 

The answer lies within the AI model’s training data. Peeking into the inner workings of an AI model reveals that it is not only the next reasonable word that is being identified, weighted, then generated, but that this process occurs token by token, as AI models break words apart into smaller, more manageable pieces called tokens, which are often fragments of words. 
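The splitting of words into tokens can be sketched in a few lines of Python. This is a simplified illustration, not the tokenizer of any particular model: the vocabulary below is hypothetical and hand-picked, whereas production models learn theirs from training data, and the greedy longest-match rule is just one easy-to-follow way to do the split.

```python
# Hypothetical vocabulary of word pieces, chosen by hand for illustration.
VOCAB = {"token", "iz", "ation", "un", "believ", "able", "s"}

def tokenize(word, vocab=VOCAB):
    """Split a word into the longest known pieces, reading left to right."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking until a known piece is found.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No known piece starts here, so fall back to a single character.
            pieces.append(word[i])
            i += 1
    return pieces

print(tokenize("tokenization"))  # ['token', 'iz', 'ation']
print(tokenize("unbelievable"))  # ['un', 'believ', 'able']
```

A word the vocabulary has never seen still gets handled: it simply falls apart into smaller pieces, down to single characters if necessary.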

That is a big part of why prompt engineering for AI models is an emerging skillset. After all, different prompts produce different outputs based on the probabilities inherent to each reasonable continuation, meaning that to get the best output, you need to have a clear idea of where to point the provided input or query. 

It also means that the data informing the weight given to each probabilistic outcome must be relevant to the query. The more relevant, the better. 
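A toy example shows how both the prompt and the training data shape those probabilities. The sketch below assumes a deliberately simple model that only looks at the last word of the prompt and counts what followed it in training; the two small corpora are invented, and real LLMs consider far more context than one word.

```python
from collections import Counter, defaultdict

def train(corpus):
    """Count which word follows which across the training sentences."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prompt):
    """Turn the counts that follow the prompt's last word into probabilities."""
    last = prompt.lower().split()[-1]
    followers = counts.get(last, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # the model never saw this word during training
    return {word: n / total for word, n in followers.most_common()}

# Two invented training sets: one general, one focused on payments.
general_corpus = [
    "the payment was late",
    "the weather was nice",
    "the payment was small",
]
payments_corpus = [
    "the payment was flagged",
    "the payment was flagged",
    "the payment was declined",
    "the payment was approved",
]

general_model = train(general_corpus)
payments_model = train(payments_corpus)

# Same prompt, different training data, different probabilities.
print(next_word_probs(general_model, "the payment was"))
print(next_word_probs(payments_model, "the payment was"))
# Same training data, different prompt, different continuations.
print(next_word_probs(general_model, "the"))
```

The payments-focused model puts half of its probability on "flagged," a continuation the general model never learned, which is the sense in which relevant data produces more relevant output.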


Making AI Work for You
While PYMNTS Intelligence has found that more than eight in 10 business leaders (84%) believe generative AI will positively impact the workforce, generative AI systems are only as good as the data they’re trained on. That’s why the largest AI players are in an arms race to acquire the best training data sets.

“There’s a long way to go before there’s a futuristic version of AI where machines think and make decisions. … Humans will be around for quite a while,” Tony Wimmer, head of data and analytics at J.P. Morgan Payments, told PYMNTS in March. “And the more that we can write software that has payments data at the heart of it to help humans, the better payments will get.”

That’s why, to train an AI model to perform to the necessary standard, many enterprises are relying on their own internal data to avoid compromising model outputs. By creating vertically specialized LLMs trained for industry use cases, organizations can deploy AI systems that are able to find the signal within the noise and that can be further fine-tuned to business-specific goals with real-time data. 

As Akli Adjaoute told PYMNTS back in November, “if you go into a field where the data is real, particularly in the payments industry, whether it’s credit risk, whether it’s delinquency, whether it’s AML [anti-money laundering], whether it’s fraud prevention, anything that touches payments … AI can bring a lot of benefit.”

