What does the structure of large language models imply for cognition?
A discussion/debate between two computational neuroscientists
My advisor and I collaborated with Ekkolapto to talk about what the structure, successes, and failings of large language models could mean for cognition more widely. Here is the link to the video. Towards the end we also get into a wider discussion on why this is an exciting time for the field of neuroscience in general. We have had these kinds of discussions many times over beers, and we really appreciate Ekkolapto putting together this video and discussion. (They are putting out a lot of other awesome stuff, by the way; you should check out their YouTube channel.) I’ve also put plain English definitions of some of the more confusing terms used in the video below, along with an outline with timestamps in case you want to jump to a specific portion.
Key Terms
Here are some terms we used that might be confusing to a newcomer and an explanation of what they mean:
Auto-regressive Generation
Wikipedia: Autoregressive Model
This is a type of statistical process where each new piece builds on what came before it. The system takes its own output and feeds it back as input for the next step, creating a continuous flow of generation. Modern AI systems use this approach to create coherent text one piece at a time, similar to how humans construct sentences word by word in natural conversation.
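To make that concrete, here is a minimal sketch of an autoregressive loop in Python. The lookup table standing in for the model is made up purely for illustration; a real LLM replaces it with a neural network over a huge vocabulary, but the "feed the output back in as input" loop is the same.

```python
import random

# Toy "model": given the last token, suggest plausible continuations.
# In a real LLM this table is replaced by a neural network over thousands of tokens.
CONTINUATIONS = {
    "the": ["cat", "dog"],
    "cat": ["sat", "ran"],
    "dog": ["barked", "sat"],
    "sat": ["quietly"],
    "ran": ["away"],
    "barked": ["loudly"],
}

def generate(prompt, steps=4):
    tokens = prompt.split()
    for _ in range(steps):
        options = CONTINUATIONS.get(tokens[-1])
        if not options:
            break
        tokens.append(random.choice(options))  # the output becomes part of the next input
    return " ".join(tokens)

print(generate("the cat"))  # e.g. "the cat sat quietly"
```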
Large Language Models (LLMs)
Wikipedia: Large Language Models
These serve as the backbone of modern artificial intelligence applications. They process and generate human language by learning patterns from vast amounts of text data. ChatGPT represents one of the most well-known examples of these systems in action.
Token
Tokens function as the fundamental building blocks of text in AI systems. These discrete units might represent complete words, parts of words, or even punctuation marks. The sentence "I love AI!" might break down into four distinct tokens: "I," "love," "AI," and "!" The AI processes these tokens sequentially to understand and generate text.
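If you want to see roughly how that splitting happens, here is a naive tokenizer sketch. Real systems use subword schemes like byte-pair encoding and would often split text differently, so treat this as an illustration of the idea rather than of any actual tokenizer.

```python
import re

# Naive tokenizer: words and punctuation marks become separate tokens.
def tokenize(text):
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("I love AI!"))  # ['I', 'love', 'AI', '!'] -> four tokens
```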
Hallucinations (in AI)
Wikipedia: Hallucination (artificial intelligence)
AI hallucinations occur when systems generate convincing but false information. These fabrications emerge when the AI fills gaps in its knowledge with plausible-sounding but incorrect data.
Transformer Models
Wikipedia: Transformer (machine learning model)
This is an AI architecture that excels at processing sequential data by understanding relationships between different elements. The system weighs the importance of various parts of the input simultaneously, rather than processing them in strict order. This architecture enables large language models to grasp complex language patterns and generate coherent responses.
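Here is a toy version of the attention calculation at the heart of that architecture, using made-up numbers for three tokens. It only shows the core "weigh every position against every other position" step, not a full transformer.

```python
import numpy as np

# Minimal sketch of (scaled dot-product) attention: every token computes a
# relevance weight for every other token, so all parts of the input are
# compared simultaneously rather than strictly left to right.
def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])                                 # token-to-token relevance
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ V                                                      # blend tokens by relevance

# Three toy tokens, each represented by a 4-dimensional vector (made-up numbers).
x = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 2.0, 0.0, 2.0],
              [1.0, 1.0, 1.0, 1.0]])
print(attention(x, x, x))  # each row now mixes information from all three tokens
```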
Hyperdimensional Space
Wikipedia: High-dimensional space
While we live in a world with three physical dimensions (height, width, and depth), AI systems use hundreds or thousands of "dimensions" to organize information. Each dimension represents a different characteristic or feature. A children's library might organize books by just three features: reading level, subject, and age group. AI systems organize information using thousands of features simultaneously. When describing a dog, these features might include size, color, fluffiness, behavior, typical locations, related objects, and hundreds of other characteristics. This rich organizational system helps AI understand subtle differences and similarities between concepts. Two dogs might be similar in some ways (four legs, furry) but different in others (size, color), and the hyperdimensional space captures all these relationships.
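A quick sketch of how this plays out in code: concepts become vectors, and the angle between vectors measures how related they are. The four dimensions and their labels below are invented for illustration; real embeddings have hundreds or thousands of learned dimensions with no human-readable names.

```python
import numpy as np

# Made-up feature dimensions: [furry, four-legged, has wheels, barks]
embeddings = {
    "dog": np.array([0.9, 0.9, 0.0, 0.9]),
    "cat": np.array([0.9, 0.9, 0.0, 0.0]),
    "car": np.array([0.0, 0.0, 1.0, 0.0]),
}

def cosine_similarity(a, b):
    # Angle-based similarity: 1.0 means pointing the same way, 0.0 means unrelated.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(embeddings["dog"], embeddings["cat"]))  # high: similar concepts
print(cosine_similarity(embeddings["dog"], embeddings["car"]))  # low: unrelated concepts
```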
Knowledge Graph
Knowledge graphs create structured networks of information by connecting related concepts, facts, and entities. These connections form a web of knowledge that mirrors human understanding of relationships between different pieces of information.
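A toy example of the idea (the entities and relations are just illustrative): facts are stored as subject-relation-object triples, and following shared entities lets you walk the web of connections.

```python
# Tiny knowledge graph: each fact is a (subject, relation, object) triple.
triples = [
    ("Einstein", "developed", "General Relativity"),
    ("Einstein", "born_in", "Ulm"),
    ("Ulm", "located_in", "Germany"),
    ("General Relativity", "describes", "Gravity"),
]

def facts_about(entity):
    """Return every triple that mentions the entity, following its connections."""
    return [t for t in triples if entity in (t[0], t[2])]

print(facts_about("Einstein"))
print(facts_about("Ulm"))
```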
Retrieval-Augmented Generation (RAG)
This hybrid approach combines the creative abilities of AI with factual information retrieval. The system accesses external databases to verify and supplement its responses, leading to more accurate and reliable output. Modern chatbots use RAG to provide answers grounded in verified sources rather than relying solely on their training data.
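Here is a stripped-down sketch of that pattern, with a placeholder retriever and no actual language model call. It is only meant to show the retrieve-then-generate shape, not any particular framework's API.

```python
# Minimal RAG sketch: 1) retrieve relevant passages, 2) hand them to the model as context.
documents = [
    "The Eiffel Tower is 330 metres tall.",
    "The Great Wall of China is over 21,000 km long.",
    "Mount Everest is 8,849 metres high.",
]

def retrieve(question, k=1):
    # Toy relevance score: count shared words. Real systems use vector similarity search.
    def score(doc):
        return len(set(question.lower().split()) & set(doc.lower().split()))
    return sorted(documents, key=score, reverse=True)[:k]

def answer(question):
    context = retrieve(question)[0]
    # A real system would now call an LLM with the retrieved context prepended;
    # here we just show what that grounded prompt would contain.
    return f"Context: {context}\nQuestion: {question}"

print(answer("How tall is the Eiffel Tower?"))
```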
Connectome
A connectome provides a comprehensive map of neural connections within a brain. This intricate diagram reveals how different brain regions communicate and work together. Scientists use connectomes to understand brain function.
Working Memory vs. Long-Term Memory
Wikipedia: Working Memory
Wikipedia: Long-term Memory
Working memory and long-term memory are both theoretical ideas that are fundamental to cognition research. Working memory acts as a temporary mental workspace for immediate tasks, while long-term memory stores information for future retrieval. Cognitive scientists theorize that these two systems work together seamlessly in human cognition. Working memory holds the ingredients while cooking a new recipe, while long-term memory stores cooking techniques learned over years of experience.
Important Timestamps
2:03 Dr. Barenholtz’ main thesis: Could all human thinking and reasoning be similar to how AI language models work - taking in information and generating the next step?
13:54 Clear explanation of what "autoregressive" means
27:42 Discussion about how humans think in images and video, similar to how AI image generation works step by step
41:15 Debate about whether the brain is one unified autoregressive system or an autoregressive and retrieval system working together (Daniel’s view)
52:54 Important differences between how human brains and AI store and access knowledge
1:08:11 The potential value of theoretical brain science, even if immediate medical benefits aren't clear
1:28:16 How theoretical brain research eventually leads to medical treatments through multiple stages of testing
1:34:02 Recent exciting developments in computational brain research that make this a promising time for neuroscience