Tuesday, September 24, 2024

From large language models to local language models

Last week Nature published a feature on local AI: "Forget ChatGPT: why researchers now run small AIs on their laptops".

This article discusses developments in large language models (LLMs) that have led to a proliferation of language models which can be run locally on your own device, without requiring top-of-the-line hardware. There are four main motivations driving this:

Privacy: Cloud-based LLMs such as ChatGPT offer no guarantee of user privacy, which is a no-go if you want to use them to analyze any kind of proprietary or confidential data. The only way to guarantee privacy is to use a model that doesn't need to communicate with a cloud server in order to run.

Reliability: LLMs are constantly evolving. With commercial providers, there is a tug-of-war between the providers and the users, many of whom explore ways to "jailbreak" a model using finely crafted inputs that escape the hard-coded restrictions on its outputs. Even when the underlying LLM stays the same, the preprocessing applied to a user's input before it is passed to the LLM may change as the provider tries to improve performance or accuracy. This makes commercial LLMs inherently unreliable: a prompt that works today might fail hopelessly the next day. With a local LLM the user is in control and will not be surprised by sudden changes in model behaviour. Note that running an LLM locally does not completely solve this issue, since there is always some randomness in its output.

Reconfigurability: With the advent of efficient LLM fine-tuning methods such as low-rank adaptation (LoRA), users can take an off-the-shelf open-source LLM and augment it with their own specialized or proprietary data to solve problems of interest. For example, for the first-year maths course I'm currently teaching, the course convenor has augmented an LLM with the lecture notes and problem sets, creating a chatbot that is able to answer students' questions about the course and refer them to the relevant parts of the lecture notes. For the students, this combines the ease of use of a chatbot with the reliability of the source materials; a rough sketch of what LoRA fine-tuning looks like in code is below.
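To give a flavour of what this involves, here is a minimal sketch (not the actual setup used for the course) of attaching LoRA adapters to an open-source model using the Hugging Face peft library. The base model name and hyperparameters are placeholders chosen for illustration:

```python
# Minimal LoRA fine-tuning sketch using transformers + peft.
# The model name and hyperparameters below are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-3.2-1B"  # placeholder: any small open-source causal LM
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA injects small trainable low-rank matrices into selected weight
# projections, so only a tiny fraction of parameters is updated during
# fine-tuning on your own data (e.g. lecture notes and problem sets).
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the full model
# From here, train with the usual transformers Trainer on your own corpus.
```

The point of the low-rank update is that the full model weights stay frozen, so the adapter is cheap to train and store, and can be swapped in and out of the base model as needed.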

Cost: For heavy users, cloud-based LLMs are not cheap. Moreover, academics have to choose between paying for access out of their own pocket and wading through their institution's bureaucracy to find a funding source that will cover a subscription. Local LLMs avoid these hassles.

The feature article also lists popular platforms for installing and using local LLMs, both command line-based (for power users) and GUI-based (for ease of use). As a backend, many of these packages rely on llama.cpp for fast LLM execution, which I covered previously here and here.
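As one example of the scripting route, here is a minimal sketch using the llama-cpp-python bindings to llama.cpp. The GGUF file name is a placeholder for whatever quantized model you have downloaded locally, and fixing the seed and temperature also goes some way towards taming the output randomness mentioned under Reliability above:

```python
# Minimal local inference sketch with the llama-cpp-python bindings.
# The model path is a placeholder; point it at any local quantized GGUF file.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf",  # placeholder path
    n_ctx=2048,  # context window size
    seed=42,     # fixed seed for reproducible runs on your machine
)

result = llm(
    "Q: Briefly explain what low-rank adaptation (LoRA) is. A:",
    max_tokens=128,
    temperature=0.0,  # greedy decoding, removing sampling randomness
)
print(result["choices"][0]["text"])
```

Everything here runs on your own hardware: no data leaves the machine, and the model file on disk does not change unless you decide to swap it out.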

It's been a while (more than a year) since I last tinkered with these packages, and clearly there have been quite significant developments in their performance and usability since then!
