Friday, April 21, 2023

Large language models for everyone

ChatGPT's release late last year attracted a surge in interest (and investment) in anticipation of the many monetization opportunities offered by the new and improved large language model. At the time there were no serious competitors: everyone had to use OpenAI's service, which is now pay-to-play.

As I wrote last month, competing models such as LLaMA have been released with downloadable weights, allowing end-users to run them locally (on high-end GPUs, or even on CPUs after quantization).
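To make the quantization point concrete, here is a minimal sketch in plain Python of symmetric 8-bit quantization, the general kind of scheme that shrinks weights enough to fit in CPU RAM. The function names and the toy weight values are my own illustrations, not part of any particular library:

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] plus one shared scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(qweights, scale):
    """Recover approximate floats; each 8-bit code replaces a 32-bit float."""
    return [q * scale for q in qweights]

weights = [0.31, -1.27, 0.05, 0.98]   # toy "weights" for illustration
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error is bounded by half a quantization step.
assert all(abs(w - r) <= scale / 2 for w, r in zip(weights, restored))
```

Real schemes (such as the 4-bit formats used for running LLaMA on laptops) quantize per block of weights rather than per tensor, but the storage trade-off is the same: a quarter of the memory in exchange for a small, bounded rounding error.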

Researchers from Stanford University have released Alpaca, a fine-tuned version of LLaMA, showing that language models can be specialized for particular applications relatively inexpensively, provided one has access to a sufficiently powerful foundation model.

However, LLaMA (and therefore its derivatives) was released under a restrictive license that in principle limits its use to non-commercial research purposes. Nevertheless, students have been free to use leaked copies of the LLaMA weights to write their essays and do their homework.

This week, Stability AI released StableLM, a language model with a similar number of parameters to LLaMA, under a Creative Commons license that allows free reuse, even for commercial purposes.

Barriers towards widespread adoption of large language models are dropping fast!
