Friday, April 21, 2023

Large language models for everyone

ChatGPT's release late last year attracted a surge in interest (and investment) in anticipation of the many monetization opportunities offered by the new and improved large language model. At the time there were no serious competitors: everyone had to use OpenAI's service, which is now pay-to-play.

As I wrote last month, competing models such as LLaMA have been released with downloadable weights, allowing end-users to run them locally (on high-end GPUs, or even on CPUs after quantization).
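To make the quantization point concrete, here is a minimal sketch in plain Python of symmetric 8-bit quantization, the general kind of scheme that shrinks weights enough to fit in CPU RAM. The function names and the toy weight values are my own illustrations, not part of any particular library:

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] plus one shared scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(qweights, scale):
    """Recover approximate floats; each 8-bit code replaces a 32-bit float."""
    return [q * scale for q in qweights]

weights = [0.31, -1.27, 0.05, 0.98]   # toy "weights" for illustration
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error is bounded by half a quantization step.
assert all(abs(w - r) <= scale / 2 for w, r in zip(weights, restored))
```

Real schemes (such as the 4-bit formats used for running LLaMA on laptops) quantize per block of weights rather than per tensor, but the storage trade-off is the same: a quarter of the memory in exchange for a small, bounded rounding error.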

Researchers from Stanford University have released Alpaca, a fine-tuned version of LLaMA, showing that language models can be specialized for particular applications relatively inexpensively, provided one has access to a sufficiently powerful foundation model.

However, LLaMA (and therefore its derivatives) was released under a restrictive license that in principle limits its use to non-commercial research purposes. Nevertheless, students have been free to use leaked copies of the LLaMA weights to write their essays and do their homework.

This week, Stability AI released StableLM, a language model with a similar number of parameters to LLaMA, under a Creative Commons license that allows free reuse, even for commercial purposes.

Barriers towards widespread adoption of large language models are dropping fast!
