Speaker: Adrián Aranda Márquez.

Abstract: This presentation provides an in-depth analysis of the paper "Titans: Learning to Memorize at Test Time," which proposes a novel neural architecture designed to enhance long-term contextual learning in sequence modeling. The authors introduce the Titans framework, which combines attention mechanisms with a neural memory module that serves as long-term memory and is updated at test time. This integration addresses the fixed context-window limitation of Transformers, enabling the model to store and retrieve information across very long sequences. The study shows that Titans train efficiently and in parallel, keep inference fast, and significantly outperform Transformers and recent recurrent architectures on tasks such as language modeling, commonsense reasoning, genomics, and time-series forecasting. The results highlight Titans’ scalability and robustness, with superior accuracy maintained even at context windows exceeding two million tokens.
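
To make the "memorize at test time" idea concrete, below is a minimal illustrative sketch, not the paper's actual module (which is a deeper network with richer, gated update rules): it assumes a single linear memory matrix that is written to at inference time by gradient descent on an associative recall loss, with a small weight decay acting as forgetting. The class and parameter names (NeuralMemory, lr, decay) are hypothetical.

```python
import numpy as np

class NeuralMemory:
    """Toy test-time-updated associative memory (illustrative only)."""

    def __init__(self, dim: int, lr: float = 0.05, decay: float = 0.01):
        self.M = np.zeros((dim, dim))  # memory parameters
        self.lr = lr                   # test-time learning rate
        self.decay = decay             # forgetting (weight-decay) rate

    def write(self, k: np.ndarray, v: np.ndarray) -> None:
        # "Surprise" signal: gradient of 0.5 * ||M k - v||^2 with respect to M;
        # a larger recall error produces a larger memory update.
        err = self.M @ k - v
        grad = np.outer(err, k)
        self.M = (1.0 - self.decay) * self.M - self.lr * grad

    def read(self, q: np.ndarray) -> np.ndarray:
        # Retrieve the value currently associated with query q.
        return self.M @ q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mem = NeuralMemory(dim=8)
    k, v = rng.standard_normal(8), rng.standard_normal(8)
    for _ in range(50):                       # repeated test-time writes
        mem.write(k, v)
    print(np.linalg.norm(mem.read(k) - v))    # recall error becomes small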