Nield reported on December 20 2023 that you may have already tested out generative AI engines such as ChatGPT and Google Bard. But while it's popular to accesses these tools in the cloud, you can also install them locally on your own computer. There are some real benefits to doing so: It's more private, of course, and you won't get hit by any warnings about the AI being over capacity or unavailable. Also, it's just kind of cool.
Credit: Joshua Woroniecki/Unsplash
To get started, you'll need a program to run the AI, and you'll need a Large Language Model (or LLM) to generate the responses. These LLMs underpin AI text generators. GPT-4 is the latest one powering ChatGPT, and Google has now pushed out Gemini as a new and improved LLM to run behind Google Bard.
If you've never heard the term LLM before, you clearly haven't read our ultimate AI glossary. To fully understand them requires a certain level of scientific and mathematical knowledge, but in basic terms, LLMs are trained on vast amounts of sample data, and they learn to recognize relationships between words and sentences (i.e. which words typically go after each other).
To put it as simply as possible, LLMs are supercharged autocorrect engines. They don't really "know" anything, but they recognize how words should fit together to sound natural and to make sense. At a high enough level, that starts to look like you're talking with a real human being. There's lots more to it than that, but you get the gist.
When it comes to running your own LLMs, you don't need to be a huge company or research organization to access them: There are several available to the public, including one released by Meta called LLaMa; others have been developed by researchers and volunteers. The general idea is that publicly available LLMs will help foster innovation and improve transparency.
For the purposes of this guide, we're going to use LM Studio to show you how to install an LLM locally. It's one of the best options out there for the job (though there are quite a few others). It's free to use, and you can set it up on Windows, macOS, and Linux systems.
How to install a local LLM
The first step is to download LM Studio from the official website, taking note of the minimum system requirements: LLM operation is pretty demanding, so you need a pretty powerful computer to do this. Windows or Linux PCs supporting AVX2 (typically on newer machines) and Apple Silicon Macs with macOS 13.6 or newer will work, and at least 16GB of RAM is recommended. On PCs, at least 6GB of VRAM is recommended too.
When you've got the software up and running, you need to find an LLM to download and use—you're not going to be able to do much without one. Part of the appeal of LM Studio is that it recommends "new and noteworthy" LLMs on the front screen of the application, so if you've got no idea what LLM you want, you can pick one from here.
You'll find LLMs vary by size, by complexity, by data sources, by purpose, and by speed: There's no right or wrong answer for which one to use, but there's plenty of information out there on sites such as Reddit and Hugging Face if you want to do some research. As you might expect, LLMs can run to several gigabytes in size, so you can do some background reading while you wait for one to download.
If you see an LLM you like on the front screen, just click Download. Otherwise, you can run a search or paste a URL in the box at the top. You'll be able to see the size of each LLM so you can estimate download times, as well as the date when it was last updated. It's also possible to filter the results to see the models that have been downloaded the most.
You can install as many LLMs as you like (as long as you have the space), but if there's at least one on your system, they'll show up in the My Models panel. (Click the folder icon on the left to get to it.) From here, you can see information about each model that you've installed, check for updates, and remove models.
To start doing some prompting, open up the AI Chat panel via the speech bubble icon on the left. Choose the model you want to use at the top, then type your prompt into the user message box at the bottom and hit Enter. The sort of output you get back will be familiar if you've used an LLM such as ChatGPT before.
On the right-hand side, you can control various settings related to the LLM, including how longer responses are handled, and how much of the processing work is offloaded to your system's GPU. There's also a box for a "pre-prompt:" You can tell the LLM to always respond in a particular tone or language style, for example.
Click the New chat button on the left if you want to start a fresh conversation, and your previous chats are logged underneath in case you need to get back to them. Whenever a particular answer has finished generating, you're given options to take a screenshot, copy the text, or regenerate a different answer from the same prompt.
That's it! You're up and running with local LLMs. There are all sorts of avenues you can explore in terms of LLM development and prompting if you want to dig deeper, but the basics aren't difficult to grasp, and LM Studio makes the setup process very straightforward even if you're a complete beginner.