Page: 1 of 1
A benefit of training a text prediction model on text from the internet is that it learns more than just language. It also learns a lot of the other kinds of text that people write and talk about on the internet, things like programming code or text file formats. And so we can use the text predictor to generate traditional, structured data, which we can then use in the software that’s in charge of performing actions.
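To make that a bit more concrete, here is a minimal sketch. It assumes the text predictor was asked to answer in JSON and came back with the string below. The output string, the action name and the `dispatch` helper are all made up for illustration, and no real model or API is called.

```python
import json

# Hypothetical raw text a model might generate when asked to reply in JSON.
model_output = '{"action": "send_email", "to": "sam@example.com", "subject": "Lunch?"}'


def send_email(to: str, subject: str) -> str:
    # Stand-in for real email code; it only reports what it would do.
    return f"would send '{subject}' to {to}"


# The surrounding software, not the model, decides which actions exist.
ACTIONS = {"send_email": send_email}


def dispatch(raw_text: str) -> str:
    """Parse the generated text as JSON and run the matching action."""
    data = json.loads(raw_text)  # raises ValueError if the text is not valid JSON
    name = data.pop("action")
    if name not in ACTIONS:
        raise ValueError(f"unknown action: {name}")
    return ACTIONS[name](**data)


result = dispatch(model_output)
print(result)
```

The point of the sketch is the split in responsibilities: the model only produces text, and ordinary software parses that text and decides whether to act on it.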
Today I want to take a little detour from our descent into the depths of LLMs and how software is built around them. I’d like to go up a little higher again and get a little philosophical, perhaps. As big tech and likely even non-tech Fortune 500 companies are currently trying to get LLMs added to whatever they are doing, there’s an interesting question to be asked. Is anyone building The Killer App for LLMs?
As I explained in my previous post, the software around a language model is really what makes it all work to become an AI “system”. To dig a little deeper into it, let’s look at how software leveraging AI is typically made using a pipeline. Note that some or most parts of such a pipeline are part of existing APIs or frameworks you can use, and probably should, but it’s important to have an idea of how these things are built.
In today’s (January 2024) generative AI landscape, which came on in a flash, there’s not much broader understanding about the architecture of the AI software we use. So I wanted to explain why The Great Text Predictor is just a cog in the AI machinery you use today and still is only completing sentences, and not reading your emails or searching the web. That’s the role of the software “controlling” the neural net.
Previously, we talked about LLMs basically being nifty text generators that predict the next word given a bunch of previous text (“context”). We also found out that, as a by-product, there are clusters of knowledge hidden in these enormous neural nets of statistical word correlations. They are very interesting, but we may not really be able to count on them. But how does fancy word prediction software get to the point of actually having a coherent back-and-forth conversation?
In my blog article The Great Text Predictor I talked about scaling up text prediction from next word based on the previous word, to next word based on previous words in the sentence, all the way to basing it on previous paragraphs and paragraphs. This is what is called “context”. Sometimes, statistically, the LLM has not seen enough data, or it has seen contradictory data, or somehow it has lost or never “figured out” specific correlations. Yet, its calculations spit out the next word - however unlikely that word may appear to us. As people like to say, it can be “confidently wrong”. Or, as the researchers have taught us to call it: hallucinations.
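Here is a toy sketch of the simplest version of that idea: predicting the next word from only the previous word. This is a plain word-pair counter over a made-up corpus, not how an LLM works internally, and the `train` and `predict_next` names are my own. Notice that it answers even for a word it has never seen, which is a crude stand-in for being "confidently wrong".

```python
from collections import Counter, defaultdict

# Tiny made-up training text; a real model sees vastly more.
corpus = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."


def train(text: str) -> dict:
    """Count, for every word, which words followed it in the text."""
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts: dict, prev: str) -> str:
    """Return the most frequent follower of `prev`."""
    followers = counts.get(prev)
    if not followers:
        # Never saw this word: pool every follower count and answer anyway.
        followers = sum(counts.values(), Counter())
    return followers.most_common(1)[0][0]


model = train(corpus)
print(predict_next(model, "the"))    # seen often, reasonable guess
print(predict_next(model, "zebra"))  # never seen, still produces a word
```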
When texting became a more popular sport, we were all still using physical keys. In fact, the keys were the number keys you would dial phone numbers with, and to type letters each number key was assigned 3 or 4 letters. If you pressed a key once, you’d get the first letter. If you pressed the key twice in quick succession you’d get the second letter, and three times gave you the third. As many people my age who used the ever-popular Nokia 3210 (picture below) at the time can attest, there was a certain level of pride in how fast one could spell out words. Ah, those young ‘uns and their texting!
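For anyone who never typed this way, here is a small sketch of the scheme. It assumes the standard keypad letter groups and uses a space to stand in for the short pause between letters. The `decode_multitap` name and the wrap-around on extra presses are my own simplifications, not a description of any particular phone.

```python
# Standard phone keypad letter groups (keys 1 and 0 left out for brevity).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def decode_multitap(presses: str) -> str:
    """Decode key presses; a space marks the pause between letters."""
    letters = []
    for group in presses.split():
        key = group[0]
        if key not in KEYPAD or set(group) != {key}:
            raise ValueError(f"invalid press group: {group!r}")
        options = KEYPAD[key]
        # Simplification: extra presses wrap around to the first letter again.
        letters.append(options[(len(group) - 1) % len(options)])
    return "".join(letters)


word = decode_multitap("66 666 55 444 2")
print(word)
```

Five letters, ten key presses, and that is for a short word.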
One of the things that’s great about Unity is that you can find pretty much whatever information you need online. Docs aren’t too bad, but the community has done forum posts, blogs, YouTubes and all sorts of content over the years. One thing I just haven’t been able to find though, is any information on using unit testing beyond just the basics of getting set up. So, basically, all that was left was trial & error. Here are the questions I answered for myself.
Welcome to my new site. I’ve been wanting to blog more, and to include topics unrelated to Microsoft Dynamics. I wanted a place to put some of the game development stuff I do. And as I’m considering getting into some regular streaming, I want a landing place for anyone checking me out. So, here we are. I started daxmusings.codecrib.com in 2010 on Blogspot, aka Blogger. I attached the custom domain to it at a later time, keeping the daxmusings subdomain. I’ve had stuff on www.codecrib.com on and off, never very interesting. I’ve hosted it in several different ways over the years, most recently as a GitHub Pages site with a custom domain attached.
Blog Post Collections
- The LLM Blogs
- Dynamics 365 (AX7) Dev Resources
- Dynamics AX 2012 Dev Resources
- Dynamics AX 2012 ALM/TFS