The construction of a large language model (LLM) depends on many things: banks of GPUs, vast reams of training data, massive amounts of power, and matrix-manipulation libraries like NumPy. For models with lower requirements, though, it’s possible to do away with all of that, including the software dependencies. As someone who’d already built a […]
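As a rough, purely illustrative sketch of what dropping the NumPy dependency entails (this is not code from the project itself), the core primitive you end up hand-rolling is a plain-Python matrix multiply:

def matmul(a, b):
    # Multiply matrix a (m x n) by matrix b (n x p), both given as lists of lists.
    n, p = len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(a))]

# A 2x3 matrix times a 3x2 matrix:
print(matmul([[1, 2, 3], [4, 5, 6]], [[7, 8], [9, 10], [11, 12]]))
# -> [[58, 64], [139, 154]]

Everything an LLM’s forward pass needs (matrix products, additions, a softmax) can be written this way in pure Python or C; the price is speed, which is why the approach only makes sense for very small models.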
Regardless of what you think of GPT and the associated AI hype, you have to admit that it is probably here to stay, at least in some form. But how, exactly, does it work? Well, MicroGPT will show you a very stripped-down model in your browser. But it isn’t just another chatbot; it exposes all […]
Few people know LLMs (Large Language Models) as thoroughly as [Andrej Karpathy], and luckily for us all he expresses that knowledge in useful open-source projects. His latest is nanochat, which he bills as a way to create “the best ChatGPT $100 can buy”. What is it, exactly? nanochat is a minimal and hackable software project — […]
As technology progresses, we generally expect processing capabilities to scale up. Every year, we get more processing power, faster speeds, more memory, and lower cost. However, we can also use improvements in software to get things running on what might otherwise be considered inadequate hardware. Taking this to the extreme, while large language models (LLMs) […]
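One common flavor of such a software trick, sketched here purely as an illustration (the helper names are made up and this is not necessarily the technique the project uses), is quantizing a model’s weights down to 8-bit integers so they fit in a fraction of the memory:

def quantize_int8(weights):
    # Symmetric 8-bit quantization: map floats to integers in [-127, 127]
    # plus a single float scale factor shared by the whole tensor.
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize_int8(q, scale):
    # Recover approximate float weights; the rounding error is the accuracy cost.
    return [v * scale for v in q]

q, scale = quantize_int8([0.12, -0.5, 0.33, 0.02])
print(dequantize_int8(q, scale))  # roughly [0.118, -0.5, 0.331, 0.020]

Stored as int8 values plus one scale factor, the weights occupy roughly a quarter of the space of 32-bit floats, which is one reason surprisingly capable models can be squeezed onto modest hardware.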