Unlocking the Power of In-Context Learning

March 19, 2024

In recent years, the field of artificial intelligence has witnessed groundbreaking advancements. In particular, large language models have achieved immense success thanks to their remarkable capability for in-context learning, enabled by Transformer architectures. A Transformer model can understand and respond based on contextual cues derived from the examples it is given.

What exactly is the in-context learning capability of Transformers? To grasp this concept, let’s consider an example using ChatGPT, a leading AI language model. ChatGPT, pre-trained on vast amounts of text data, can understand the context of new questions and provide relevant responses, even if it has likely never seen the exact question before. For instance, when prompted with:

Hockey: Stick

Football: Ball

Volleyball: Net

Tennis:

ChatGPT responds with:

Tennis: Racket

By recognizing the pattern of matching sports with their associated equipment, ChatGPT correctly deduces that the equipment typically associated with tennis is a racket.
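For readers who want to try this programmatically, here is a minimal sketch of the same few-shot prompt sent through the OpenAI Python client. The model name is an illustrative assumption; any available chat model should exhibit the same pattern completion.

```python
# Minimal sketch of the few-shot prompt above, sent via the OpenAI client.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

# The pattern (sport -> equipment) is conveyed purely through examples.
prompt = "Hockey: Stick\nFootball: Ball\nVolleyball: Net\nTennis:"

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: substitute any chat model you have access to
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # expected completion: "Racket"
```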

Now, imagine harnessing this powerful concept to understand the intricacies of complex medical conditions, such as prediabetes. These conditions are often described by dynamical systems, where inputs at a given time lead to specific outputs over time. In traditional machine learning, unraveling these systems to uncover patterns and insights is an extremely challenging and time-consuming task.

However, a recent article by researchers from IDSIA USI-SUPSI, funded in part by PRAESIIDIUM, poses the following question: “Is it possible to understand the intricacies of a dynamical system not solely from its input/output pattern, but also by observing the behavior of other systems within the same class?” Their study introduces a novel approach to system identification in which a meta-model is trained on diverse synthetic data generated from related systems. Equipped with in-context learning capabilities, this meta-model can predict the behavior of entirely new systems within the same class, eliminating the need to hand-design a specific model for each individual system.

Applying Transformer architectures to vast amounts of simulated data generated from dynamical systems is still in its infancy. Departing from traditional approaches that require extensive manual modeling, AI models trained on diverse datasets of disease dynamics might unlock new avenues for understanding and predicting disease progression, just as they did in natural language processing. For PRAESIIDIUM, we are excited to incorporate this research to deepen our understanding of prediabetes.
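To make the idea concrete, here is a minimal, hypothetical sketch (in PyTorch, not the authors’ code) of in-context system identification: a small Transformer meta-model is meta-trained on many randomly drawn toy first-order linear systems, and then predicts the output of a previously unseen system from its input/output history alone. The system class, architecture, and hyperparameters are all illustrative assumptions.

```python
# Illustrative sketch: a Transformer meta-model learns a *class* of systems
# y[t+1] = a*y[t] + b*u[t], then predicts new systems in-context.
import torch
import torch.nn as nn

SEQ_LEN, BATCH, D_MODEL = 64, 32, 64

def sample_batch(batch=BATCH, seq_len=SEQ_LEN):
    """Simulate a batch of random first-order systems under random inputs."""
    a = torch.rand(batch, 1) * 0.9        # random stable pole per system
    b = torch.rand(batch, 1) * 2 - 1      # random input gain per system
    u = torch.randn(batch, seq_len)       # random excitation signal
    y = torch.zeros(batch, seq_len)
    for t in range(seq_len - 1):
        y[:, t + 1] = a[:, 0] * y[:, t] + b[:, 0] * u[:, t]
    return u, y

class MetaModel(nn.Module):
    """Causal Transformer mapping a history of (u, y) pairs to the next y."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(2, D_MODEL)  # embed (u[t], y[t]) pairs
        layer = nn.TransformerEncoderLayer(
            d_model=D_MODEL, nhead=4, dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(D_MODEL, 1)   # predict y[t+1]

    def forward(self, u, y):
        x = self.embed(torch.stack([u, y], dim=-1))
        # Causal mask: predictions may only depend on past context.
        mask = nn.Transformer.generate_square_subsequent_mask(u.size(1))
        return self.head(self.encoder(x, mask=mask)).squeeze(-1)

model = MetaModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(1000):                    # meta-training over the class
    u, y = sample_batch()
    pred = model(u[:, :-1], y[:, :-1])      # one-step-ahead prediction
    loss = nn.functional.mse_loss(pred, y[:, 1:])
    opt.zero_grad()
    loss.backward()
    opt.step()

# At inference, a *new* system's input/output history serves as context:
u_new, y_new = sample_batch(batch=1)
y_hat = model(u_new[:, :-1], y_new[:, :-1])  # no retraining needed
```

The key design point mirrors the article’s idea: nothing system-specific is fitted at inference time; the observed input/output history plays the same role as the few-shot examples in the ChatGPT prompt above.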

By Marco Forgione, Lea Multerer, Laura Azzimonti et al. (SUPSI, CH)