Tuesday, September 24, 2024

From large language models to local language models

Last week Nature published a feature on local AI: Forget ChatGPT: why researchers now run small AIs on their laptops

This article discusses developments in large language models (LLMs) leading to the proliferation of smaller models that can be run locally on your own device without requiring top-of-the-line hardware. There are four driving motivations behind this:

Privacy: Cloud-based LLMs such as ChatGPT do not offer any user privacy. This makes them a no-go for analyzing any kind of proprietary or confidential data. The only way to guarantee privacy is to use a model that doesn't need to communicate with some cloud server to run.

Reliability: LLMs are constantly evolving. With commercial providers, there is a tug-of-war between the providers and the users, many of whom explore methods to "jailbreak" a model using finely crafted inputs to escape hard-coded restrictions on the possible outputs. Even when the underlying LLM stays the same, the preprocessing applied to a user's input before querying the LLM might change as the provider aims to improve the model's performance or accuracy. This makes cloud-based LLMs inherently unreliable - a prompt that works today might fail hopelessly the next day. With a local LLM the user is in control and will not be surprised by sudden changes to the model's performance. Note that running an LLM locally does not completely solve this issue, since there is always some randomness to its output.

Reconfigurability: With the advent of efficient LLM fine-tuning methods such as low rank adaptation (LoRA), users can take an off-the-shelf open source LLM and augment it with their own specialized or proprietary data to solve problems of interest. For example, for the first year maths course I'm currently teaching, the course convenor has augmented an LLM with the lecture notes and problem sets, creating a chatbot that is able to answer students' questions about the course and also refer them to the relevant parts of the lecture notes. For the students, this combines the ease of use provided by a chatbot with the reliability of the source materials. (A minimal sketch of what LoRA fine-tuning looks like in code is given after this list.)

Cost: For heavy users cloud-based LLMs are not cheap. Moreover, academics need to choose between paying for access out of their own pocket and wading through their institution's bureaucracy to find some funding source that will cover a subscription. Local LLMs avoid these hassles.
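To make the reconfigurability point a little more concrete, here is a minimal sketch of how LoRA fine-tuning is typically set up using the Hugging Face transformers and peft libraries. The model name, hyperparameters, and comments are illustrative assumptions on my part, not the actual setup used for the course chatbot.

```python
# Minimal LoRA setup sketch (illustrative only; model name and hyperparameters
# are placeholders, not the course chatbot's actual configuration).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"   # any small open-weight causal LM
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

lora_cfg = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling applied to the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only a small fraction of weights are trainable

# The wrapped model can then be fine-tuned on course-specific text (lecture
# notes, problem sets) with the usual transformers training loop; only the
# small adapter weights need to be stored and shared afterwards.
```

The appeal of LoRA is that only the low-rank adapter matrices are trained, so the fine-tuning fits on modest hardware and the resulting adapter files are tiny compared to the base model.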

The feature article also lists popular platforms for installing and using local LLMs, both command line-based (for power users) and GUI-based (for ease of use). As a backend, many of these packages rely on fast execution of LLMs provided by llama.cpp, which I covered previously here and here.
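For reference, querying a local model through one of these backends takes only a few lines. The sketch below uses the llama-cpp-python bindings around llama.cpp; the model path is a placeholder for whatever quantized GGUF file you have downloaded, and the prompt is just an example.

```python
# Minimal local-inference sketch using llama-cpp-python (Python bindings for
# llama.cpp). The model path is a placeholder for a locally downloaded GGUF file.
from llama_cpp import Llama

llm = Llama(model_path="models/some-small-model-q4_k_m.gguf", n_ctx=2048)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Explain the Helmholtz decomposition in one sentence."},
    ],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```

Everything here runs on your own machine, so no data ever leaves it - which is exactly the privacy argument above.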

It's been a while since I tinkered with these packages, but clearly there have been quite significant developments in their performance and usability since I last used them more than a year ago!

Monday, September 16, 2024

From classical to quantum HodgeRank

This is a rather late summary of a cool preprint I saw a few months ago: Quantum HodgeRank: Topology-Based Rank Aggregation on Quantum Computers 

This work is inspired by and builds on quantum subroutines originally developed for efficiently solving high-dimensional problems in topological data analysis (TDA). By constructing a quantum version of the classical HodgeRank algorithm, the authors obtain superpolynomial speedups for ranking higher-order network data.

What is HodgeRank? It was originally proposed in 2011 as a better way of ranking incomplete or skewed datasets, for example based on user ratings or scores.

The basic idea is to apply an analogue of the Helmholtz decomposition (used routinely in electromagnetics) to graph data, enabling one to assign a ranking based on incomplete pairwise preferences. Importantly, HodgeRank outputs not just a raw ranking, but also an estimate of the quality of that ranking, obtained by identifying the local and global cycles present in the preference data. To be specific, the returned optimal ranking is unique and fully consistent if the preference matrix can be written as the gradient of some scalar ranking function. If it cannot, then there are inevitable ambiguities in the preference data due to the existence of global or local cycles.

An example of a local ranking cycle is the following: B is preferred over A, C is preferred over B, and yet A is preferred over C. This leads to the ordering A < B < C < A, thus forming a cycle. It is better to identify cycles such as these and acknowledge that a traditional ranking does not make sense for these items. This is what HodgeRank does! (A small numerical sketch of this example is given below.) User preference data is rarely consistent, so cycles such as these routinely occur in the wild, for example in user rankings of movies on online platforms such as Netflix.
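To illustrate, here is a small numerical sketch (my own, not code from the preprint) of the gradient-fitting step of HodgeRank applied to the three-item cycle above. A scalar score is fit to the pairwise preferences by least squares, and the leftover residual is the part of the data that no consistent ranking can explain.

```python
# Least-squares (gradient) step of HodgeRank on the inconsistent preferences
# described above: B over A, C over B, A over C.
import numpy as np

# Items A, B, C and pairwise preference "flows" y (positive means the second
# item of the edge is preferred over the first).
items = ["A", "B", "C"]
edges = [(0, 1), (1, 2), (2, 0)]   # (A,B), (B,C), (C,A)
y = np.array([1.0, 1.0, 1.0])      # B > A, C > B, A > C: a perfect 3-cycle

# Edge-vertex incidence (graph gradient) matrix: (grad s) on edge (i,j) = s_j - s_i.
G = np.zeros((len(edges), len(items)))
for k, (i, j) in enumerate(edges):
    G[k, i], G[k, j] = -1.0, 1.0

# Fit scalar scores s by least squares: minimize ||G s - y||^2.
s, *_ = np.linalg.lstsq(G, y, rcond=None)
residual = y - G @ s   # the cyclic part that no global ranking can explain

print("scores:", dict(zip(items, np.round(s - s.min(), 3))))
print("cyclic residual:", np.round(residual, 3))  # nonzero: the data contain a cycle
```

For this perfect three-way cycle the fitted scores all come out equal and the residual carries the entire signal - exactly the situation in which HodgeRank flags that no meaningful global ranking exists.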

As a generalization of HodgeRank, Quantum HodgeRank promises the ability to perform ranking tasks on preference data forming higher-order networks, avoiding the exponential scaling with network dimension faced by classical algorithms. The authors of the preprint also argue that Quantum HodgeRank cannot be dequantized (i.e. implemented efficiently using a randomized classical algorithm) in the same manner as quantum TDA algorithms for the Betti number problem. Moreover, while applications of high-dimensional Betti numbers (and even their occurrence in real datasets) remain unclear, HodgeRank addresses a ranking problem with more plausible concrete applications. Thus, this looks like an exciting area to keep an eye on.

It is also interesting to speculate on whether (classical) HodgeRank or HodgeRank-inspired methods can be useful for understanding the behaviour of interacting many-body quantum systems, where it is typically intractable to sample all of the pairwise interaction elements of Hamiltonians as the system size increases, but incomplete or skewed sampling is readily available. Watch this space!

Wednesday, September 11, 2024

Asian Network Mini-School on Quantum Materials 2024

Last week I visited the University of Indonesia to present two lectures on topological photonics at the Asian Network Mini-School on Quantum Materials 2024. This is one of a series of events organized by the ICTP Asian Network and held in South East Asian countries. The school attracted 95 participants from Indonesian universities, the majority being advanced undergraduates or graduate students. Meetings such as these provide valuable opportunities for early career scientists to learn about cutting-edge research areas and build collaborations with others in the region. I was impressed by the level of engagement from the audience - even though I ended my first lecture 20 minutes early, the remaining time was fully occupied by questions! Many thanks to the local organizers for putting together such an enjoyable meeting! Two more schools will be held this year, both in Thailand, on complex condensed matter systems and on magnetism and spectroscopy, with more planned for next year.