
Machine Learning Street Talk (MLST)

Technology Podcasts

Welcome! We engage in fascinating discussions with pre-eminent figures in the AI field. Our flagship show covers current affairs in AI, cognitive science, neuroscience and philosophy of mind with in-depth analysis. Our approach is unrivalled in terms of scope and rigour – we believe in intellectual diversity in AI, and we touch on all of the main ideas in the field with the hype surgically removed. MLST is run by Tim Scarfe, Ph.D (https://www.linkedin.com/in/ecsquizor/) and features regular appearances from MIT Doctor of Philosophy Keith Duggar (https://www.linkedin.com/in/dr-keith-duggar/).

Location:

United States

Language:

English


Episodes

Karl Friston - Why Intelligence Can't Get Too Large (Goldilocks principle)

9/10/2025
In this episode, hosts Tim and Keith finally realize their long-held dream of sitting down with their hero, the brilliant neuroscientist Professor Karl Friston. The conversation is a fascinating and mind-bending journey into Professor Friston's life's work, the Free Energy Principle, and what it reveals about life, intelligence, and consciousness itself. **SPONSORS** Gemini CLI is an open-source AI agent that brings the power of Gemini directly into your terminal - https://github.com/google-gemini/gemini-cli --- Take the Prolific human data survey - https://www.prolific.com/humandatasurvey?utm_source=mlst and be the first to see the results and benchmark their practices against the wider community! --- cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy Oct SF conference - https://dagihouse.com/?utm_source=mlst - Joscha Bach keynoting(!) + OAI, Anthropic, NVDA,++ Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst Submit investment deck: https://cyber.fund/contact?utm_source=mlst *** They kick things off by looking back on the 20-year journey of the Free Energy Principle. Professor Friston explains it as a fundamental rule for survival: all living things, from a single cell to a human being, are constantly trying to make sense of the world and reduce unpredictability. It’s this drive to minimize surprise that allows things to exist and maintain their structure. This leads to a bigger question: What does it truly mean to be "intelligent"? The group debates whether intelligence is everywhere, even in a virus or a plant, or if it requires a certain level of complexity. Professor Friston introduces the idea of different "kinds" of things, suggesting that creatures like us, who can model themselves and think about the future, possess a unique and "strange" kind of agency that sets us apart. From intelligence, the discussion naturally flows to the even trickier concept of consciousness. Is it the same as intelligence? Professor Friston argues they are different. He explains that consciousness might emerge from deep, layered self-awareness—not just acting, but understanding that you are the one causing your actions and thinking about your place in the world. They also explore intelligence at different sizes. Is a corporation intelligent? What about the entire planet? Professor Friston suggests there might be a "Goldilocks zone" for intelligence. It doesn't seem to exist at the super-tiny atomic level or at the massive scale of planets and solar systems, but thrives in the complex middle-ground where we live. Finally, they tackle one of the most pressing topics of our time: Can we build a truly conscious AI? Professor Friston shares his doubts about whether our current computers are capable of a feat like that. He suggests that genuine consciousness might require a different kind of "mortal" computation, where the machine's physical body and its "mind" are inseparable, much like in biological creatures. TRANSCRIPT: https://app.rescript.info/public/share/FZkF8BO7HMt9aFfu2_q69WGT_ZbYZ1VVkC6RtU3eeOI TOC: 00:00:00: Introduction & Retrospective on the Free Energy Principle 00:09:34: Strange Particles, Agency, and Consciousness 00:37:45: The Scale of Intelligence: From Viruses to the Biosphere 01:01:35: Modelling, Boundaries, and Practical Application 01:21:12: Conclusion
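As a concrete (and hedged) illustration of the core quantity here, the following toy calculation is ours, not from the episode: for a two-state generative model it computes variational free energy, which upper-bounds surprise (negative log evidence) and touches it exactly when beliefs match the true posterior; this is the sense in which a system can keep its surprise low by minimising free energy.

```python
# Toy numerical sketch (ours, not from the episode) of variational free energy
# for a two-state world: F(q) = E_q[log q(s) - log p(o, s)] = KL(q || p(s|o)) + surprise,
# so F always upper-bounds surprise = -log p(o) and equals it at the exact posterior.
import numpy as np

p_s = np.array([0.7, 0.3])          # prior over hidden states s
p_o_given_s = np.array([0.9, 0.2])  # likelihood of one observation o under each s

joint = p_s * p_o_given_s           # p(o, s)
p_o = joint.sum()                   # model evidence p(o)
surprise = -np.log(p_o)

def free_energy(q):
    return float(np.sum(q * (np.log(q) - np.log(joint))))

q_arbitrary = np.array([0.5, 0.5])  # some belief about the hidden state
q_posterior = joint / p_o           # exact Bayesian posterior p(s|o)

print(free_energy(q_arbitrary), ">=", surprise)   # loose bound
print(free_energy(q_posterior), "==", surprise)   # bound is tight at the posterior
```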

Duration:01:21:39


The Day AI Solves My Puzzles Is The Day I Worry (Prof. Cristopher Moore)

9/4/2025
We are joined by Cristopher Moore, a professor at the Santa Fe Institute with a diverse background in physics, computer science, and machine learning. The conversation begins with Cristopher, who calls himself a "frog," explaining that he prefers to dive deep into specific, concrete problems rather than taking a high-level "bird's-eye view". They explore why current AI models, like transformers, are so surprisingly effective. Cristopher argues it's because the real world isn't random; it's full of rich structures, patterns, and hierarchies that these models can learn to exploit, even if we don't fully understand how.

**SPONSORS**
Take the Prolific human data survey - https://www.prolific.com/humandatasurvey?utm_source=mlst and be the first to see the results and benchmark their practices against the wider community!
---
Cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy.
Oct SF conference - https://dagihouse.com/?utm_source=mlst - Joscha Bach keynoting(!) + OAI, Anthropic, NVDA, ++
Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlst
Submit investment deck: https://cyber.fund/contact?utm_source=mlst
***

Cristopher Moore: https://sites.santafe.edu/~moore/

TOC:
00:00:00 - Introduction
00:02:05 - Meet Cristopher Moore: A Frog in the World of Science
00:05:14 - The Limits of Transformers and Real-World Data
00:11:19 - Intelligence as Creative Problem-Solving
00:23:30 - Grounding, Meaning, and Shared Reality
00:31:09 - The Nature of Creativity and Aesthetics
00:44:31 - Computational Irreducibility and Universality
00:53:06 - Turing Completeness, Recursion, and Intelligence
01:11:26 - The Universe Through a Computational Lens
01:26:45 - Algorithmic Justice and the Need for Transparency

TRANSCRIPT: https://app.rescript.info/public/share/VRe2uQSvKZOm0oIBoDsrNwt46OMCqRnShVnUF3qyoFk

Filmed at DISI (Diverse Intelligences Summer Institute) https://disi.org/

REFS:
The Nature of Computation [Chris Moore] https://nature-of-computation.org/
Birds and Frogs [Freeman Dyson] https://www.ams.org/notices/200902/rtx090200212p.pdf
Replica Theory [Parisi et al] https://arxiv.org/pdf/1409.2722
Janossy pooling [Fabian Fuchs] https://fabianfuchsml.github.io/equilibriumaggregation/
Cracking the Cryptic [YT channel] https://www.youtube.com/c/CrackingTheCryptic
Sudoku Bench [Sakana] https://sakana.ai/sudoku-bench/
Fractured entangled representations, "phylogenetic locking-in" comment [Kumar/Stanley] https://arxiv.org/pdf/2505.11581 (see our shows on this)
The War Against Cliché [Martin Amis] https://www.amazon.com/War-Against-Cliche-Reviews-1971-2000/dp/0375727167
Rule 110 (CA) https://mathworld.wolfram.com/Rule150.html
Universality in Elementary Cellular Automata [Matthew Cook] https://wpmedia.wolfram.com/sites/13/2018/02/15-1-1.pdf
Small Semi-Weakly Universal Turing Machines [Damien Woods] https://tilde.ini.uzh.ch/users/tneary/public_html/WoodsNeary-FI09.pdf
COMPUTING MACHINERY AND INTELLIGENCE [Turing, 1950] https://courses.cs.umbc.edu/471/papers/turing.pdf
Comment on Space Time as a causal set [Moore, 88] https://sites.santafe.edu/~moore/comment.pdf
Recursion Theory on the Reals and Continuous-time Computation [Moore, 96]
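The cellular-automaton references above are easy to make concrete. This is a small sketch of ours (not code from the episode): a one-dimensional elementary CA stepper which, run with rule 110, the rule Matthew Cook proved Turing universal, shows simple local rules producing the kind of complexity behind the computational-irreducibility discussion.

```python
# Minimal elementary cellular automaton (our toy sketch). Rule 110 is the
# famously Turing-universal rule discussed in the universality references.
def step(cells, rule=110):
    table = [(rule >> i) & 1 for i in range(8)]  # rule number -> local update table
    n = len(cells)
    return [table[(cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n]]
            for i in range(n)]

row = [0] * 59 + [1]  # one live cell at the right edge
for _ in range(20):
    print("".join("#" if c else "." for c in row))
    row = step(row)
```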

Duration:01:34:52


Michael Timothy Bennett: Defining Intelligence and AGI Approaches

8/28/2025
Dr. Michael Timothy Bennett is a computer scientist who's deeply interested in understanding artificial intelligence, consciousness, and what it means to be alive. He's known for his provocative paper "What the F*** is Artificial Intelligence", which challenges conventional thinking about AI and intelligence.

**SPONSOR MESSAGES**
***
Prolific: Quality data. From real people. For faster breakthroughs. https://prolific.com/mlst?utm_campaign=98404559-MLST&utm_source=youtube&utm_medium=podcast&utm_content=mb
***

Michael takes us on a journey through some of the biggest questions in AI and consciousness. He starts by exploring what intelligence actually is, settling on the idea that it's about "adaptation with limited resources" (a definition from researcher Pei Wang that he particularly likes). The discussion ranges from technical AI concepts to philosophical questions about consciousness, with Michael offering fresh perspectives that challenge Silicon Valley's "just scale it up" approach to AI. He argues that true intelligence isn't just about having more parameters or data; it's about being able to adapt efficiently, like biological systems do.

TOC:
1. Introduction & Paper Overview [00:01:34]
2. Definitions of Intelligence [00:02:54]
3. Formal Models (AIXI, Active Inference) [00:07:06]
4. Causality, Abstraction & Embodiment [00:10:45]
5. Computational Dualism & Mortal Computation [00:25:51]
6. Modern AI, AGI Progress & Benchmarks [00:31:30]
7. Hybrid AI Approaches [00:35:00]
8. Consciousness & The Hard Problem [00:39:35]
9. The Diverse Intelligences Summer Institute (DISI) [00:53:20]
10. Living Systems & Self-Organization [00:54:17]
11. Closing Thoughts [01:04:24]

Michael's socials:
https://michaeltimothybennett.com/
https://x.com/MiTiBennett

Transcript: https://app.rescript.info/public/share/4jSKbcM77Sf6Zn-Ms4hda7C4krRrMcQt0qwYqiqPTPI

References:
Bennett, M.T. "What the F*** is Artificial Intelligence" https://arxiv.org/abs/2503.23923
Bennett, M.T. "Are Biological Systems More Intelligent Than Artificial Intelligence?" https://arxiv.org/abs/2405.02325
Bennett, M.T. PhD Thesis "How To Build Conscious Machines" https://osf.io/preprints/thesiscommons/wehmg_v1
Legg, S. & Hutter, M. (2007). "Universal Intelligence: A Definition of Machine Intelligence"
Wang, P. "Defining Artificial Intelligence" - on non-axiomatic reasoning systems (NARS)
Chollet, F. (2019). "On the Measure of Intelligence" - introduces the ARC benchmark and developer-aware generalization
Hutter, M. (2005). "Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability"
Chalmers, D. "The Hard Problem of Consciousness"
Descartes, R. - Cartesian dualism and the pineal gland theory (historical context)
Friston, K. - Free Energy Principle and Active Inference framework
Levin, M. - Work on collective intelligence, cancer as information isolation, and "mind blindness"
Hinton, G. (2022). "The Forward-Forward Algorithm" - introduces the mortal computation concept
Alexander Ororbia & Friston - Formal treatment of mortal computation
Sutton, R. "The Bitter Lesson" - on search and learning in AI
Pearl, J. "The Book of Why" - causal inference and reasoning
Alternative AGI Approaches:
Wang, P. - NARS (Non-Axiomatic Reasoning System)
Goertzel, B. - Hyperon system and modular AGI architectures
Benchmarks & Evaluation:
Hendrycks, D. - Humanity's Last Exam benchmark (mentioned re: saturation)

Filmed at: Diverse Intelligences Summer Institute (DISI) https://disi.org/

Duration:01:05:44


Superintelligence Strategy (Dan Hendrycks)

8/13/2025
Deep dive with Dan Hendrycks, a leading AI safety researcher and co-author of the "Superintelligence Strategy" paper with former Google CEO Eric Schmidt and Scale AI CEO Alexandr Wang. *** SPONSOR MESSAGES Gemini CLI is an open-source AI agent that brings the power of Gemini directly into your terminal - https://github.com/google-gemini/gemini-cli Prolific: Quality data. From real people. For faster breakthroughs. https://prolific.com/mlst?utm_campaign=98404559-MLST&utm_source=youtube&utm_medium=podcast&utm_content=script-gen *** Hendrycks argues that society is making a fundamental mistake in how it views artificial intelligence. We often compare AI to transformative but ultimately manageable technologies like electricity or the internet. He contends a far better and more realistic analogy is nuclear technology. Like nuclear power, AI has the potential for immense good, but it is also a dual-use technology that carries the risk of unprecedented catastrophe. The Problem with an AI "Manhattan Project": A popular idea is for the U.S. to launch a "Manhattan Project" for AI—a secret, all-out government race to build a superintelligence before rivals like China. Hendrycks argues this strategy is deeply flawed and dangerous for several reasons: - It wouldn’t be secret. You cannot hide a massive, heat-generating data center from satellite surveillance. - It would be destabilizing. A public race would alarm rivals, causing them to start their own desperate, corner-cutting projects, dramatically increasing global risk. - It’s vulnerable to sabotage. An AI project can be crippled in many ways, from cyberattacks that poison its training data to physical attacks on its power plants. This is what the paper refers to as a "maiming attack." This vulnerability leads to the paper's central concept: Mutual Assured AI Malfunction (MAIM). This is the AI-era version of the nuclear-era's Mutual Assured Destruction (MAD). In this dynamic, any nation that makes an aggressive, destabilizing bid for a world-dominating AI must expect its rivals to sabotage the project to ensure their own survival. This deterrence, Hendrycks argues, is already the default reality we live in. A Better Strategy: The Three Pillars Instead of a reckless race, the paper proposes a more stable, three-part strategy modeled on Cold War principles: - Deterrence: Acknowledge the reality of MAIM. The goal should not be to "win" the race to superintelligence, but to deter anyone from starting such a race in the first place through the credible threat of sabotage. - Nonproliferation: Just as we work to keep fissile materials for nuclear bombs out of the hands of terrorists and rogue states, we must control the key inputs for catastrophic AI. The most critical input is advanced AI chips (GPUs). Hendrycks makes the powerful claim that building cutting-edge GPUs is now more difficult than enriching uranium, making this strategy viable. - Competitiveness: The race between nations like the U.S. and China should not be about who builds superintelligence first. Instead, it should be about who can best use existing AI to build a stronger economy, a more effective military, and more resilient supply chains (for example, by manufacturing more chips domestically). Dan says the stakes are high if we fail to manage this transition: - Erosion of Control - Intelligence Recursion - Worthless Labor Hendrycks maintains that while the risks are existential, the future is not set. 
TOC: 1 Measuring the Beast [00:00:00] 2 Defining the Beast [00:11:34] 3 The Core Strategy [00:38:20] 4 Ideological Battlegrounds [00:53:12] 5 Mechanisms of Control [01:34:45] TRANSCRIPT: https://app.rescript.info/public/share/cOKcz4pWRPjh7BTIgybd7PUr_vChUaY6VQW64No8XMs

Duration:01:45:38


DeepMind Genie 3 [World Exclusive] (Jack Parker-Holder, Shlomi Fruchter)

8/5/2025
This episode features Shlomi Fruchter and Jack Parker-Holder from Google DeepMind, who are unveiling a new AI called Genie 3. The host, Tim Scarfe, describes it as the most mind-blowing technology he has ever seen. We were invited to their offices to conduct the interview (not sponsored). Imagine you could create a video game world just by describing it. That's what Genie 3 does. It's an AI "world model" that learns how the real world works by watching massive amounts of video. Unlike a normal video game engine (like Unreal, or the one behind Doom) that needs to be programmed manually, Genie generates a realistic, interactive 3D world from a simple text prompt.

**SPONSOR MESSAGES**
***
Prolific: Quality data. From real people. For faster breakthroughs. https://prolific.com/mlst?utm_campaign=98404559-MLST&utm_source=youtube&utm_medium=podcast&utm_content=script-gen
***

Here's a breakdown of what makes it so revolutionary:
From Text to a Virtual World: You can type "a drone flying by a beautiful lake" or "a ski slope," and Genie 3 creates that world for you in about three seconds. You can then navigate and interact with it in real time.
It's Consistent: The worlds it creates have a reliable memory. If you look away from an object and then look back, it will still be there, just as it was. The guests explain that this consistency isn't explicitly programmed in; it's a surprising, "emergent" capability of the powerful AI model.
A Huge Leap Forward: The previous version, Genie 2, was a major step, but it wasn't fast enough for real-time interaction and was much lower resolution. Genie 3 is 720p, interactive, and photorealistic, running smoothly for several minutes at a time.
The Killer App - Training Robots: Beyond entertainment, the team sees Genie 3 as a game-changer for training AI. Instead of training a self-driving car or a robot in the real world (which is slow and dangerous), you can create infinite simulations. You can even prompt rare events to happen, like a deer running across the road, to teach an AI how to handle unexpected situations safely.
The Future of Entertainment: this could lead to a "YouTube version 2" or a new form of VR, where users can create and explore endless, interconnected worlds together, like the experience machine from philosophy.

While the technology is still a research prototype and not yet available to the public, it represents a monumental step towards creating true artificial worlds from the ground up.

Jack Parker-Holder [Research Scientist at Google DeepMind in the Open-Endedness Team] https://jparkerholder.github.io/
Shlomi Fruchter [Research Director, Google DeepMind] https://shlomifruchter.github.io/

TOC:
[00:00:00] - Introduction: "The Most Mind-Blowing Technology I've Ever Seen"
[00:02:30] - The Evolution from Genie 1 to Genie 2
[00:04:30] - Enter Genie 3: Photorealistic, Interactive Worlds from Text
[00:07:00] - Promptable World Events & Training Self-Driving Cars
[00:14:21] - Guest Introductions: Shlomi Fruchter & Jack Parker-Holder
[00:15:08] - Core Concepts: What is a "World Model"?
[00:19:30] - The Challenge of Consistency in a Generated World
[00:21:15] - Context: The Neural Network Doom Simulation
[00:25:25] - How Do You Measure the Quality of a World Model?
[00:28:09] - The Vision: Using Genie to Train Advanced Robots
[00:32:21] - Open-Endedness: Human Skill and Prompting Creativity
[00:38:15] - The Future: Is This the Next YouTube or VR?
[00:42:18] - The Next Step: Multi-Agent Simulations
[00:52:51] - Limitations: Thinking, Computation, and the Sim-to-Real Gap
[00:58:07] - Conclusion & The Future of Game Engines

REFS:
World Models [David Ha, Jürgen Schmidhuber] https://arxiv.org/abs/1803.10122
POET https://arxiv.org/abs/1901.01753
The Fractured Entangled Representation Hypothesis [Akarsh Kumar, Jeff Clune, Joel Lehman, Kenneth O. Stanley] https://arxiv.org/pdf/2505.11581

TRANSCRIPT: https://app.rescript.info/public/share/Zk5tZXk6mb06yYOFh6nSja7Lg6_qZkgkuXQ-kl5AJqM
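To pin down what a "world model" means at the interface level, here is a generic interaction loop (our sketch of the concept only; Genie 3 is a closed research prototype, this is not its API, and the functions below are hypothetical stand-ins): a text prompt yields an initial state, and each user action yields the next state plus a rendered frame.

```python
# Hypothetical stand-in for the world-model interface discussed above:
# prompt -> initial state, then (state, action) -> (next state, frame).
def prompt_to_state(prompt):
    return {"prompt": prompt, "t": 0, "history": []}

def step(state, action):
    new_state = {"prompt": state["prompt"], "t": state["t"] + 1,
                 "history": state["history"] + [action]}
    frame = f"frame {new_state['t']} of '{state['prompt']}' after {action}"
    return new_state, frame  # a real world model would render pixels here

state = prompt_to_state("a drone flying by a beautiful lake")
for action in ["forward", "left", "forward"]:
    state, frame = step(state, action)
    print(frame)
# Consistency, in this framing, means earlier content persists in the state
# rather than being re-imagined from scratch every frame.
```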

Duration:00:58:22


Large Language Models and Emergence: A Complex Systems Perspective (Prof. David C. Krakauer)

7/31/2025
Prof. David Krakauer, President of the Santa Fe Institute argues that we are fundamentally confusing knowledge with intelligence, especially when it comes to AI. He defines true intelligence as the ability to do more with less—to solve novel problems with limited information. This is contrasted with current AI models, which he describes as doing less with more; they require astounding amounts of data to perform tasks that don't necessarily demonstrate true understanding or adaptation. He humorously calls this "really shit programming". David challenges the popular notion of "emergence" in Large Language Models (LLMs). He explains that the tech community's definition—seeing a sudden jump in a model's ability to perform a task like three-digit math—is superficial. True emergence, from a complex systems perspective, involves a fundamental change in the system's internal organization, allowing for a new, simpler, and more powerful level of description. He gives the example of moving from tracking individual water molecules to using the elegant laws of fluid dynamics. For LLMs to be truly emergent, we'd need to see them develop new, efficient internal representations, not just get better at memorizing patterns as they scale. Drawing on his background in evolutionary theory, David explains that systems like brains, and later, culture, evolved to process information that changes too quickly for genetic evolution to keep up. He calls culture "evolution at light speed" because it allows us to store our accumulated knowledge externally (in books, tools, etc.) and build upon it without corrupting the original. This leads to his concept of "exbodiment," where we outsource our cognitive load to the world through things like maps, abacuses, or even language itself. We create these external tools, internalize the skills they teach us, improve them, and create a feedback loop that enhances our collective intelligence. However, he ends with a warning. While technology has historically complemented our deficient abilities, modern AI presents a new danger. Because we have an evolutionary drive to conserve energy, we will inevitably outsource our thinking to AI if we can. He fears this is already leading to a "diminution and dilution" of human thought and creativity. Just as our muscles atrophy without use, he argues our brains will too, and we risk becoming mentally dependent on these systems. TOC: [00:00:00] Intelligence: Doing more with less [00:02:10] Why brains evolved: The limits of evolution [00:05:18] Culture as evolution at light speed [00:08:11] True meaning of emergence: "More is Different" [00:10:41] Why LLM capabilities are not true emergence [00:15:10] What real emergence would look like in AI [00:19:24] Symmetry breaking: Physics vs. Life [00:23:30] Two types of emergence: Knowledge In vs. Out [00:26:46] Causality, agency, and coarse-graining [00:32:24] "Exbodiment": Outsourcing thought to objects [00:35:05] Collective intelligence & the boundary of the mind [00:39:45] Mortal vs. Immortal forms of computation [00:42:13] The risk of AI: Atrophy of human thought David Krakauer President and William H. Miller Professor of Complex Systems https://www.santafe.edu/people/profile/david-krakauer REFS: Large Language Models and Emergence: A Complex Systems Perspective David C. Krakauer, John W. Krakauer, Melanie Mitchell https://arxiv.org/abs/2506.11135 Filmed at the Diverse Intelligences Summer Institute: https://disi.org/

Duration:00:49:48


Pushing compute to the limits of physics

7/21/2025
Dr. Maxwell Ramstead grills Guillaume Verdon (AKA “Beff Jezos”) who's the founder of Thermodynamic computing startup Extropic. ***SPONSOR MESSAGE*** Google Gemini 2.5 Flash is a state-of-the-art language model in the Gemini app. Sign up at https://gemini.google.com *** Guillaume shares his unique path – from dreaming about space travel as a kid to becoming a physicist, then working on quantum computing at Google, to developing a radically new form of computing hardware for machine learning. He explains how he hit roadblocks with traditional physics and computing, leading him to start his company – building "thermodynamic computers." These are based on a new design for super-efficient chips that use the natural chaos of electrons (think noise and heat) to power AI tasks, which promises to speed up AND lower the costs of modern probabilistic techniques like sampling. He is driven by the pursuit of building computers that work more like your brain, which (by the way) runs on a banana and a glass of water! Guillaume talks about his alter ego, Beff Jezos, and the "Effective Accelerationism" (e/acc) movement that he initiated. Its objective is to speed up tech progress in order to “grow civilization” (as measured by energy use and innovation), rather than “slowing down out of fear”. Guillaume argues we need to embrace variance, exploration, and optimism to avoid getting stuck or outpaced by competitors like China. He and Maxwell discuss big ideas like merging humans with AI, decentralizing intelligence, and why boundless growth (with smart constraints) is “key to humanity's future”. REFS: 1. John Archibald Wheeler - "It From Bit" Concept 00:04:45 - Foundational work proposing that physical reality emerges from information at the quantum level Learn more: https://cqi.inf.usi.ch/qic/wheeler.pdf 2. AdS/CFT Correspondence (Holographic Principle) 00:05:15 - Theoretical physics duality connecting quantum gravity in Anti-de Sitter space with conformal field theory https://en.wikipedia.org/wiki/Holographic_principle 3. Renormalization Group Theory 00:06:15 - Mathematical framework for analyzing physical systems across different length scales https://www.damtp.cam.ac.uk/user/dbs26/AQFT/Wilsonchap.pdf 4. Maxwell's Demon and Information Theory 00:21:15 - Thought experiment linking information processing to thermodynamics and entropy https://plato.stanford.edu/entries/information-entropy/ 5. Landauer's Principle 00:29:45 - Fundamental limit establishing minimum energy required for information erasure https://en.wikipedia.org/wiki/Landauer%27s_principle 6. Free Energy Principle and Active Inference 01:03:00 - Mathematical framework for understanding self-organizing systems and perception-action loops https://www.nature.com/articles/nrn2787 7. Max Tegmark - Information Bottleneck Principle 01:07:00 - Connections between information theory and renormalization in machine learning https://arxiv.org/abs/1907.07331 8. Fisher's Fundamental Theorem of Natural Selection 01:11:45 - Mathematical relationship between genetic variance and evolutionary fitness https://en.wikipedia.org/wiki/Fisher%27s_fundamental_theorem_of_natural_selection 9. Tensor Networks in Quantum Systems 00:06:45 - Computational framework for simulating many-body quantum systems https://arxiv.org/abs/1912.10049 10. Quantum Neural Networks 00:09:30 - Hybrid quantum-classical models for machine learning applications https://en.wikipedia.org/wiki/Quantum_neural_network 11. 
Energy-Based Models (EBMs) 00:40:00 - Probabilistic framework for unsupervised learning based on energy functions https://www.researchgate.net/publication/200744586_A_tutorial_on_energy-based_learning 12. Markov Chain Monte Carlo (MCMC) 00:20:00 - Sampling algorithm fundamental to modern AI and statistical physics https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo 13. Metropolis-Hastings Algorithm 00:23:00 - Core sampling method for probability...
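Because the reference list leans so heavily on sampling (energy-based models, MCMC, Metropolis-Hastings), here is a minimal Metropolis-Hastings sketch of ours, purely illustrative and unrelated to Extropic's hardware: it samples from an unnormalised energy-based model p(x) proportional to exp(-E(x)), the kind of probabilistic workload thermodynamic chips are intended to accelerate.

```python
# Minimal Metropolis-Hastings sampler (illustrative sketch only) for a
# one-dimensional energy-based model p(x) proportional to exp(-E(x)).
import math, random

def energy(x):
    return (x ** 2 - 1) ** 2  # double-well potential with modes near +1 and -1

def metropolis(steps=10_000, step_size=0.5):
    x, samples = 0.0, []
    for _ in range(steps):
        proposal = x + random.gauss(0.0, step_size)
        # accept with probability min(1, exp(E(x) - E(proposal)))
        if random.random() < math.exp(min(0.0, energy(x) - energy(proposal))):
            x = proposal
        samples.append(x)
    return samples

samples = metropolis()
print(sum(samples) / len(samples))  # roughly 0, by symmetry of the two modes
```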

Duration:01:23:32


The Fractured Entangled Representation Hypothesis (Kenneth Stanley, Akarsh Kumar)

7/5/2025
Are the AI models you use today imposters? Please watch the intro video we did before this: https://www.youtube.com/watch?v=o1q6Hhz0MAg In this episode, hosts Dr. Tim Scarfe and Dr. Keith Duggar are joined by AI researcher Prof. Kenneth Stanley and MIT PhD student Akarsh Kumar to discuss their fascinating paper, "Questioning Representational Optimism in Deep Learning." Imagine you ask two people to draw a perfect skull. One is a brilliant artist who understands anatomy, the other is a machine that just traces the image. Both drawings look identical, but the artist understands what a skull is—they know where the mouth is, how the jaw works, and that it's symmetrical. The machine just has a tangled mess of lines that happens to form the right picture. An AI with an elegant representation has the building blocks to generate truly new ideas. The Path Is the Goal: As Kenneth Stanley puts it, "it matters not just where you get, but how you got there". Two students can ace a math test, but the one who truly understands the concepts—instead of just memorizing formulas—is the one who will go on to make new discoveries.

The show is a mixture of 3 separate recordings: the original Patreon warmup with Tim/Kenneth, the Tim/Keith "Steakhouse" recorded after the main interview, then the main interview with Kenneth/Akarsh/Keith/Tim. Feel free to skip around. We had to edit this in a rush as we are travelling next week, but it's reasonably cleaned up.

TOC:
00:00:00 Intro: Garbage vs. Amazing Representations
00:05:42 How Good Representations Form
00:11:14 Challenging the "Bitter Lesson"
00:18:04 AI Creativity & Representation Types
00:22:13 Steakhouse: Critiques & Alternatives
00:28:30 Steakhouse: Key Concepts & Goldilocks Zone
00:39:42 Steakhouse: A Sober View on AI Risk
00:43:46 Steakhouse: The Paradox of Open-Ended Search
00:47:58 Main Interview: Paper Intro & Core Concepts
00:56:44 Main Interview: Deception and Evolvability
01:36:30 Main Interview: Reinterpreting Evolution
01:56:16 Main Interview: Impostor Intelligence
02:11:15 Main Interview: Recommendations for AI Research

REFS:
Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis [Akarsh Kumar, Jeff Clune, Joel Lehman, Kenneth O. Stanley] https://arxiv.org/pdf/2505.11581
Why Greatness Cannot Be Planned: The Myth of the Objective [Kenneth O. Stanley, Joel Lehman] https://amzn.to/44xLaXK
Original show with Kenneth from 4 years ago: https://www.youtube.com/watch?v=lhYGXYeMq_E
Kenneth Stanley is SVP Open Endedness at Lila Sciences https://x.com/kenneth0stanley
Akarsh Kumar (MIT) https://akarshkumar.com/

AND... Kenneth is HIRING (this is an OPPORTUNITY OF A LIFETIME!)
Research Engineer: https://job-boards.greenhouse.io/lila/jobs/7890007002
Research Scientist: https://job-boards.greenhouse.io/lila/jobs/8012245002

TRANSCRIPT: https://app.rescript.info/public/share/W_T7E1OC2Wj49ccqlIOOztg2MJWaaVbovTeyxcFEQdU

Duration:02:16:22


The Fractured Entangled Representation Hypothesis (Intro)

7/5/2025
What if today's incredible AI is just a brilliant "impostor"? This episode features host Dr. Tim Scarfe in conversation with guests Prof. Kenneth Stanley (ex-OpenAI), Dr. Keith Duggar (MIT), and Akarsh Kumar (MIT). While AI today produces amazing results on the surface, its internal understanding is a complete mess, described as "total spaghetti" [00:00:49]. This is because it's trained with a brute-force method (SGD) that's like building a sandcastle: it looks right from a distance, but has no real structure holding it together [00:01:45].

To explain the difference, Keith Duggar shares a great analogy about his high school physics classes [00:03:18]. One class was about memorizing lots of formulas for specific situations (like the "impostor" AI). The other used calculus to derive the answers from a deeper understanding, which was much easier and more powerful. This is the core difference: one method memorizes, the other truly understands.

The episode then introduces a different, more powerful way to build AI, based on Kenneth Stanley's old experiment, "Picbreeder" [00:04:45]. This method creates AI with a shockingly clean and intuitive internal model of the world. For example, it might develop a model of a skull where it understands the "mouth" as a separate component it can open and close, without ever being explicitly trained on that action [00:06:15]. This deep understanding emerges bottom-up, without massive datasets. The secret is to abandon a fixed goal and embrace "deception" [00:08:42]—the idea that the stepping stones to a great discovery often don't look anything like the final result. Instead of optimizing for a target, the AI is built through an open-ended process of exploring what's "interesting" [00:09:15]. This creates a more flexible and adaptable foundation, a bit like how evolvability wins out in nature [00:10:30].

The show concludes by arguing that this choice matters immensely. The "impostor" path may be hitting a wall, requiring insane amounts of money and energy for progress and failing to deliver true creativity or continual learning [00:13:00]. The ultimate message is a call to not put all our eggs in one basket [00:14:25]. We should explore these open-ended, creative paths to discover a more genuine form of intelligence, which may be found where we least expect it.

REFS:
Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis [Akarsh Kumar, Jeff Clune, Joel Lehman, Kenneth O. Stanley] https://arxiv.org/pdf/2505.11581
Why Greatness Cannot Be Planned: The Myth of the Objective [Kenneth O. Stanley, Joel Lehman] https://amzn.to/44xLaXK
Original show with Kenneth from 4 years ago: https://www.youtube.com/watch?v=lhYGXYeMq_E
Kenneth Stanley is SVP Open Endedness at Lila Sciences https://x.com/kenneth0stanley
Akarsh Kumar (MIT) https://akarshkumar.com/

AND... Kenneth is HIRING (this is an OPPORTUNITY OF A LIFETIME!)
Research Engineer: https://job-boards.greenhouse.io/lila/jobs/7890007002
Research Scientist: https://job-boards.greenhouse.io/lila/jobs/8012245002

Tim's code visualisation of FER based on Akarsh's repo: https://github.com/ecsplendid/fer

TRANSCRIPT: https://app.rescript.info/public/share/YKAZzZ6lwZkjTLRpVJreOOxGhLI8y4m3fAyU8NSavx0

Duration:00:15:45


Three Red Lines We're About to Cross Toward AGI (Daniel Kokotajlo, Gary Marcus, Dan Hendrycks)

6/23/2025
What if the most powerful technology in human history is being built by people who openly admit they don't trust each other? In this explosive 2-hour debate, three AI experts pull back the curtain on the shocking psychology driving the race to Artificial General Intelligence—and why the people building it might be the biggest threat of all. Kokotajlo predicts AGI by 2028 based on compute scaling trends. Marcus argues we haven't solved basic cognitive problems from his 2001 research. The stakes? If Kokotajlo is right and Marcus is wrong about safety progress, humanity may have already lost control. Sponsor messages: ======== Google Gemini: Google Gemini features Veo3, a state-of-the-art AI video generation model in the Gemini app. Sign up at https://gemini.google.com Tufa AI Labs are hiring for ML Engineers and a Chief Scientist in Zurich/SF. They are top of the ARCv2 leaderboard! https://tufalabs.ai/ ======== Guest Powerhouse Gary Marcus - Cognitive scientist, author of "Taming Silicon Valley," and AI's most prominent skeptic who's been warning about the same fundamental problems for 25 years (https://garymarcus.substack.com/) Daniel Kokotajlo - Former OpenAI insider turned whistleblower who reveals the disturbing rationalizations of AI lab leaders in his viral "AI 2027" scenario (https://ai-2027.com/) Dan Hendrycks - Director of the Center for AI Safety who created the benchmarks used to measure AI progress and argues we have only years, not decades, to prevent catastrophe (https://danhendrycks.com/) Transcript: http://app.rescript.info/public/share/tEcx4UkToi-2jwS1cN51CW70A4Eh6QulBRxDILoXOno TOC: Introduction: The AI Arms Race 00:00:04 - The Danger of Automated AI R&D 00:00:43 - The Rationalization: "If we don't, someone else will" 00:01:56 - Sponsor Reads (Tufa AI Labs & Google Gemini) 00:02:55 - Guest Introductions The Philosophical Stakes 00:04:13 - What is the Positive Vision for AGI? 00:07:00 - The Abundance Scenario: Superintelligent Economy 00:09:06 - Differentiating AGI and Superintelligence (ASI) 00:11:41 - Sam Altman: "A Decade in a Month" 00:14:47 - Economic Inequality & The UBI Problem Policy and Red Lines 00:17:13 - The Pause Letter: Stopping vs. Delaying AI 00:20:03 - Defining Three Concrete Red Lines for AI Development 00:25:24 - Racing Towards Red Lines & The Myth of "Durable Advantage" 00:31:15 - Transparency and Public Perception 00:35:16 - The Rationalization Cascade: Why AI Labs Race to "Win" Forecasting AGI: Timelines and Methodologies 00:42:29 - The Case for Short Timelines (Median 2028) 00:47:00 - Scaling Limits: Compute, Data, and Money 00:49:36 - Forecasting Models: Bio-Anchors and Agentic Coding 00:53:15 - The 10^45 FLOP Thought Experiment The Great Debate: Cognitive Gaps vs. Scaling 00:58:41 - Gary Marcus's Counterpoint: The Unsolved Problems of Cognition 01:00:46 - Current AI Can't Play Chess Reliably 01:08:23 - Can Tools and Neurosymbolic AI Fill the Gaps? 01:16:13 - The Multi-Dimensional Nature of Intelligence 01:24:26 - The Benchmark Debate: Data Contamination and Reliability 01:31:15 - The Superhuman Coder Milestone Debate 01:37:45 - The Driverless Car Analogy The Alignment Problem 01:39:45 - Has Any Progress Been Made on Alignment? 01:42:43 - "Fairly Reasonably Scares the Sh*t Out of Me" 01:46:30 - Distinguishing Model vs. Process Alignment Scenarios and Conclusions 01:49:26 - Gary's Alternative Scenario: The Neurosymbolic Shift 01:53:35 - Will AI Become Jeff Dean? 
01:58:41 - Takeoff Speeds and Exceeding Human Intelligence 02:03:19 - Final Disagreements and Closing Remarks REFS: Gary Marcus (2001) - The Algebraic Mind https://mitpress.mit.edu/9780262632683/the-algebraic-mind/ 00:59:00 Gary Marcus & Ernest Davis (2019) - Rebooting AI https://www.penguinrandomhouse.com/books/566677/rebooting-ai-by-gary-marcus-and-ernest-davis/ 01:31:59 Gary Marcus (2024) - Taming...

Duration:02:07:07


How AI Learned to Talk and What It Means - Prof. Christopher Summerfield

6/16/2025
We interview Professor Christopher Summerfield from Oxford University about his new book "These Strange New Minds: How AI Learned to Talk and What It Means". AI learned to understand the world just by reading text - something scientists thought was impossible. You don't need to see a cat to know what one is; you can learn everything from words alone. This is "the most astonishing scientific discovery of the 21st century." People are split: some refuse to call what AI does "thinking" even when it outperforms humans, while others believe if it acts intelligent, it is intelligent. Summerfield takes the middle ground - AI does something genuinely like human reasoning, but that doesn't make it human.

Sponsor messages:
========
Google Gemini: Google Gemini features Veo3, a state-of-the-art AI video generation model in the Gemini app. Sign up at https://gemini.google.com
Tufa AI Labs are hiring for ML Engineers and a Chief Scientist in Zurich/SF. They are top of the ARCv2 leaderboard! https://tufalabs.ai/
========

Prof. Christopher Summerfield https://www.psy.ox.ac.uk/people/christopher-summerfield
These Strange New Minds: How AI Learned to Talk and What It Means https://amzn.to/4e26BVa

Table of Contents:
Introduction & Setup
00:00:00 Superman 3 Metaphor - Humans Absorbed by Machines
00:02:01 Book Introduction & AI Debate Context
00:03:45 Sponsor Segments (Google Gemini, Tufa Labs)
Philosophical Foundations
00:04:48 The Fractured AI Discourse
00:08:21 Ancient Roots: Aristotle vs Plato (Empiricism vs Rationalism)
00:10:14 Historical AI: Symbolic Logic and Its Limits
The Language Revolution
00:12:11 ChatGPT as the Rubicon Moment
00:14:00 The Astonishing Discovery: Learning Reality from Words Alone
00:15:47 Equivalentists vs Exceptionalists Debate
Cognitive Science Perspectives
00:19:12 Functionalism and the Duck Test
00:21:48 Brain-AI Similarities and Computational Principles
00:24:53 Reconciling Chomsky: Evolution vs Learning
00:28:15 Lamarckian AI vs Darwinian Human Learning
The Reality of AI Capabilities
00:30:29 Anthropomorphism and the Clever Hans Effect
00:32:56 The Intentional Stance and Nature of Thinking
00:37:56 Three Major AI Worries: Agency, Personalization, Dynamics
Societal Risks and Complex Systems
00:37:56 AI Agents and Flash Crash Scenarios
00:42:50 Removing Frictions: The Lawfare Example
00:46:15 Gradual Disempowerment Theory
00:49:18 The Faustian Pact of Technology
Human Agency and Control
00:51:18 The Crisis of Authenticity
00:56:22 Psychology of Control vs Reward
01:00:21 Dopamine Hacking and Variable Reinforcement
Future Directions
01:02:27 Evolution as Goal-less Optimization
01:03:31 Open-Endedness and Creative Evolution
01:06:46 Writing, Creativity, and AI-Generated Content
01:08:18 Closing Remarks

REFS:
Academic References (Abbreviated)
Essential Books
"These Strange New Minds" - C. Summerfield [00:02:01] - Main discussion topic
"The Mind is Flat" - N. Chater [00:33:45] - Summerfield's favorite on cognitive illusions
"AI: A Guide for Thinking Humans" - M. Mitchell [00:04:58] - Host's previous favorite
"Principia Mathematica" - Russell & Whitehead [00:11:00] - Logic Theorist reference
"Syntactic Structures" - N. Chomsky (1957) [00:13:30] - Generative grammar foundation
"Why Greatness Cannot Be Planned" - Stanley & Lehman [01:04:00] - Open-ended evolution
Key Papers & Studies
"Gradual Disempowerment" - D. Duvenaud [00:46:45] - AI threat model
"Counterfeit People" - D. Dennett (Atlantic) [00:52:45] - AI societal risks
"Open-Endedness is Essential..." - DeepMind/Rocktäschel/Hughes [01:03:42]
Heider & Simmel (1944) [00:30:45] - Agency attribution to shapes
Whitehall Studies - M. Marmot [00:59:32] - Control and health outcomes
"Clever Hans" - O. Pfungst (1911) [00:31:47] - Animal intelligence illusion
Historical References

Duration:01:08:28


"Blurring Reality" - Chai's Social AI Platform (SPONSORED)

5/26/2025
"Blurring Reality" - Chai's Social AI Platform - sponsored This episode of MLST explores the groundbreaking work of Chai, a social AI platform that quietly built one of the world's largest AI companion ecosystems before ChatGPT's mainstream adoption. With over 10 million active users and just 13 engineers serving 2 trillion tokens per day, Chai discovered the massive appetite for AI companionship through serendipity while searching for product-market fit. CHAI sponsored this show *because they want to hire amazing engineers* -- CAREER OPPORTUNITIES AT CHAI Chai is actively hiring in Palo Alto with competitive compensation ($300K-$800K+ equity) for roles including AI Infrastructure Engineers, Software Engineers, Applied AI Researchers, and more. Fast-track qualification available for candidates with significant product launches, open source contributions, or entrepreneurial success. https://www.chai-research.com/jobs/ The conversation with founder William Beauchamp and engineers Tom Lu and Nischay Dhankhar covers Chai's innovative technical approaches including reinforcement learning from human feedback (RLHF), model blending techniques that combine smaller models to outperform larger ones, and their unique infrastructure challenges running exaflop-class compute. SPONSOR MESSAGES: *** Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers in Zurich and SF. Goto https://tufalabs.ai/ *** Key themes explored include: - The ethics of AI engagement optimization and attention hacking - Content moderation at scale with a lean engineering team - The shift from AI as utility tool to AI as social companion - How users form deep emotional bonds with artificial intelligence - The broader implications of AI becoming a social medium We also examine OpenAI's recent pivot toward companion AI with April's new GPT-4o, suggesting a fundamental shift in how we interact with artificial intelligence - from utility-focused tools to companion-like experiences that blur the lines between human and artificial intimacy. The episode also covers Chai's unconventional approach to hiring only top-tier engineers, their bootstrap funding strategy focused on user revenue over VC funding, and their rapid experimentation culture where one in five experiments succeed. TOC: 00:00:00 - Introduction: Steve Jobs' AI Vision & Chai's Scale 00:04:02 - Chapter 1: Simulators - The Birth of Social AI 00:13:34 - Chapter 2: Engineering at Chai - RLHF & Model Blending 00:21:49 - Chapter 3: Social Impact of GenAI - Ethics & Safety 00:33:55 - Chapter 4: The Lean Machine - 13 Engineers, Millions of Users 00:42:38 - Chapter 5: GPT-4o Becoming a Companion - OpenAI's Pivot 00:50:10 - Chapter 6: What Comes Next - The Future of AI Intimacy TRANSCRIPT: https://www.dropbox.com/scl/fi/yz2ewkzmwz9rbbturfbap/CHAI.pdf?rlkey=uuyk2nfhjzezucwdgntg5ubqb&dl=0

Duration:00:50:59


Google AlphaEvolve - Discovering new science (exclusive interview)

5/14/2025
Today Google DeepMind released AlphaEvolve: a Gemini coding agent for algorithm discovery. It improved on the famous Strassen algorithm for matrix multiplication, a record set 56 years ago. Google has been killing it recently. We had early access to the paper and interviewed the researchers behind the work.

AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms https://deepmind.google/discover/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/
Authors: Alexander Novikov*, Ngân Vũ*, Marvin Eisenberger*, Emilien Dupont*, Po-Sen Huang*, Adam Zsolt Wagner*, Sergey Shirobokov*, Borislav Kozlovskii*, Francisco J. R. Ruiz, Abbas Mehrabian, M. Pawan Kumar, Abigail See, Swarat Chaudhuri, George Holland, Alex Davies, Sebastian Nowozin, Pushmeet Kohli, Matej Balog* (* denotes equal contribution)

SPONSOR MESSAGES:
***
Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/
***

AlphaEvolve works like a very smart, tireless programmer. It uses powerful AI language models (like Gemini) to generate ideas for computer code. Then, it uses an "evolutionary" process – like survival of the fittest for programs. It tries out many different program ideas, automatically tests how well they solve a problem, and then uses the best ones to inspire new, even better programs. Beyond this mathematical breakthrough, AlphaEvolve has already been used to improve real-world systems at Google, such as making their massive data centers run more efficiently and even speeding up the training of the AI models that power AlphaEvolve itself. The discussion also covers how humans work with AlphaEvolve, the challenges of making AI discover things, and the exciting future of AI helping scientists make new discoveries. In short, AlphaEvolve is a powerful new AI tool that can invent new algorithms and solve complex problems, showing how AI can be a creative partner in science and engineering.

Guests:
Matej Balog: https://x.com/matejbalog
Alexander Novikov: https://x.com/SashaVNovikov

REFS:
MAP Elites [Jean-Baptiste Mouret, Jeff Clune] https://arxiv.org/abs/1504.04909
FunSearch [Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Matej Balog, M. Pawan Kumar, Emilien Dupont, Francisco J. R. Ruiz, Jordan S. Ellenberg, Pengming Wang, Omar Fawzi, Pushmeet Kohli & Alhussein Fawzi] https://www.nature.com/articles/s41586-023-06924-6

TOC:
[00:00:00] Introduction: AlphaEvolve's Breakthroughs, DeepMind's Lineage, and Real-World Impact
[00:12:06] Introducing AlphaEvolve: Concept, Evolutionary Algorithms, and Architecture
[00:16:56] Search Challenges: The Halting Problem and Enabling Creative Leaps
[00:23:20] Knowledge Augmentation: Self-Generated Data, Meta-Prompting, and Library Learning
[00:29:08] Matrix Multiplication Breakthrough: From Strassen to AlphaEvolve's 48 Multiplications
[00:39:11] Problem Representation: Direct Solutions, Constructors, and Search Algorithms
[00:46:06] Developer Reflections: Surprising Outcomes and Superiority over Simple LLM Sampling
[00:51:42] Algorithmic Improvement: Hill Climbing, Program Synthesis, and Intelligibility
[01:00:24] Real-World Application: Complex Evaluations and Robotics
[01:05:39] Role of LLMs & Future: Advanced Models, Recursive Self-Improvement, and Human-AI Collaboration
[01:11:22] Resource Considerations: Compute Costs of AlphaEvolve

This is a trial of posting videos on Spotify, thoughts? Email me or chat in our Discord
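The generate-test-select loop described above can be sketched in a few lines. This is a toy stand-in of ours, not AlphaEvolve itself: propose_variant() plays the role of the Gemini-backed code generator and evaluate() plays the automatic test harness.

```python
# Toy evolutionary-search loop in the spirit of the description above
# (a stand-in, not AlphaEvolve: candidates here are just integers).
import random

def evaluate(program):
    """Automatic scoring of a candidate; here, a toy objective with optimum 42."""
    return -abs(program - 42)

def propose_variant(parent):
    """Stand-in for an LLM proposing a modified program from a parent."""
    return parent + random.randint(-5, 5)

def evolve(generations=200, population_size=20):
    population = [random.randint(0, 100) for _ in range(population_size)]
    for _ in range(generations):
        scored = sorted(population, key=evaluate, reverse=True)
        parents = scored[: population_size // 4]  # keep the fittest candidates
        population = parents + [propose_variant(random.choice(parents))
                                for _ in range(population_size - len(parents))]
    return max(population, key=evaluate)

print(evolve())  # converges toward the best-scoring "program"
```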

Duration:01:13:58


Prof. Randall Balestriero - LLMs without pretraining and SSL

4/23/2025
Randall Balestriero joins the show to discuss some counterintuitive findings in AI. He shares research showing that huge language models, even when started from scratch (randomly initialized) without massive pre-training, can learn specific tasks like sentiment analysis surprisingly well, train stably, and avoid severe overfitting, sometimes matching the performance of costly pre-trained models. This raises questions about when giant pre-training efforts are truly worth it. He also talks about how self-supervised learning (where models learn from data structure itself) and traditional supervised learning (using labeled data) are fundamentally similar, allowing researchers to apply decades of supervised learning theory to improve newer self-supervised methods. Finally, Randall touches on fairness in AI models used for Earth data (like climate prediction), revealing that these models can be biased, performing poorly in specific locations like islands or coastlines even if they seem accurate overall, which has important implications for policy decisions based on this data. SPONSOR MESSAGES: *** Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/ *** TRANSCRIPT + SHOWNOTES: https://www.dropbox.com/scl/fi/n7yev71nsjso71jyjz1fy/RANDALLNEURIPS.pdf?rlkey=0dn4injp1sc4ts8njwf3wfmxv&dl=0 TOC: 1. Model Training Efficiency and Scale [00:00:00] 1.1 Training Stability of Large Models on Small Datasets [00:04:09] 1.2 Pre-training vs Random Initialization Performance Comparison [00:07:58] 1.3 Task-Specific Models vs General LLMs Efficiency 2. Learning Paradigms and Data Distribution [00:10:35] 2.1 Fair Language Model Paradox and Token Frequency Issues [00:12:02] 2.2 Pre-training vs Single-task Learning Spectrum [00:16:04] 2.3 Theoretical Equivalence of Supervised and Self-supervised Learning [00:19:40] 2.4 Self-Supervised Learning and Supervised Learning Relationships [00:21:25] 2.5 SSL Objectives and Heavy-tailed Data Distribution Challenges 3. Geographic Representation in ML Systems [00:25:20] 3.1 Geographic Bias in Earth Data Models and Neural Representations [00:28:10] 3.2 Mathematical Limitations and Model Improvements [00:30:24] 3.3 Data Quality and Geographic Bias in ML Datasets REFS: [00:01:40] Research on training large language models from scratch on small datasets, Randall Balestriero et al. https://openreview.net/forum?id=wYGBWOjq1Q [00:10:35] The Fair Language Model Paradox (2024), Andrea Pinto, Tomer Galanti, Randall Balestriero https://arxiv.org/abs/2410.11985 [00:12:20] Muppet: Massive Multi-task Representations with Pre-Finetuning (2021), Armen Aghajanyan et al. https://arxiv.org/abs/2101.11038 [00:14:30] Dissociating language and thought in large language models (2023), Kyle Mahowald et al. https://arxiv.org/abs/2301.06627 [00:16:05] The Birth of Self-Supervised Learning: A Supervised Theory, Randall Balestriero et al. https://openreview.net/forum?id=NhYAjAAdQT [00:21:25] VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning, Adrien Bardes, Jean Ponce, Yann LeCun https://arxiv.org/abs/2105.04906 [00:25:20] No Location Left Behind: Measuring and Improving the Fairness of Implicit Representations for Earth Data (2025), Daniel Cai, Randall Balestriero, et al. 
https://arxiv.org/abs/2502.06831 [00:33:45] Mark Ibrahim et al.'s work on geographic bias in computer vision datasets, Mark Ibrahim https://arxiv.org/pdf/2304.12210
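For the curious, the pretrained-versus-random-initialisation comparison Randall describes can be set up in a few lines with the Hugging Face transformers API; this is a hedged sketch with an illustrative backbone and task, not the paper's exact setup.

```python
# Sketch of the comparison: same architecture, pretrained weights vs random
# initialisation, then identical task fine-tuning (e.g. binary sentiment).
from transformers import AutoConfig, AutoModelForSequenceClassification

name = "bert-base-uncased"  # illustrative backbone, not the paper's exact model

# (a) conventional route: load pretrained weights, then fine-tune on the task
pretrained = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

# (b) the route probed in the paper: same architecture, randomly initialised
config = AutoConfig.from_pretrained(name, num_labels=2)
from_scratch = AutoModelForSequenceClassification.from_config(config)

# Both models would then be trained with the same loop on the same labelled
# data; the finding discussed above is that (b) can train stably and sometimes
# approach (a) on narrow tasks, which questions when pre-training pays off.
```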

Duration:00:34:30


How Machines Learn to Ignore the Noise (Kevin Ellis + Zenna Tavares)

4/8/2025
Prof. Kevin Ellis and Dr. Zenna Tavares talk about making AI smarter, like humans. They want AI to learn from just a little bit of information by actively trying things out, not just by looking at tons of data. They discuss two main ways AI can "think": one way is like following specific rules or steps (like a computer program), and the other is more intuitive, like guessing based on patterns (like modern AI often does). They found combining both methods works well for solving complex puzzles like ARC. A key idea is "compositionality" - building big ideas from small ones, like LEGOs. This is powerful but can also be overwhelming. Another important idea is "abstraction" - understanding things simply, without getting lost in details, and knowing there are different levels of understanding. Ultimately, they believe the best AI will need to explore, experiment, and build models of the world, much like humans do when learning something new. SPONSOR MESSAGES: *** Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/ *** TRANSCRIPT: https://www.dropbox.com/scl/fi/3ngggvhb3tnemw879er5y/BASIS.pdf?rlkey=lr2zbj3317mex1q5l0c2rsk0h&dl=0 Zenna Tavares: http://www.zenna.org/ Kevin Ellis: https://www.cs.cornell.edu/~ellisk/ TOC: 1. Compositionality and Learning Foundations [00:00:00] 1.1 Compositional Search and Learning Challenges [00:03:55] 1.2 Bayesian Learning and World Models [00:12:05] 1.3 Programming Languages and Compositionality Trade-offs [00:15:35] 1.4 Inductive vs Transductive Approaches in AI Systems 2. Neural-Symbolic Program Synthesis [00:27:20] 2.1 Integration of LLMs with Traditional Programming and Meta-Programming [00:30:43] 2.2 Wake-Sleep Learning and DreamCoder Architecture [00:38:26] 2.3 Program Synthesis from Interactions and Hidden State Inference [00:41:36] 2.4 Abstraction Mechanisms and Resource Rationality [00:48:38] 2.5 Inductive Biases and Causal Abstraction in AI Systems 3. Abstract Reasoning Systems [00:52:10] 3.1 Abstract Concepts and Grid-Based Transformations in ARC [00:56:08] 3.2 Induction vs Transduction Approaches in Abstract Reasoning [00:59:12] 3.3 ARC Limitations and Interactive Learning Extensions [01:06:30] 3.4 Wake-Sleep Program Learning and Hybrid Approaches [01:11:37] 3.5 Project MARA and Future Research Directions REFS: [00:00:25] DreamCoder, Kevin Ellis et al. https://arxiv.org/abs/2006.08381 [00:01:10] Mind Your Step, Ryan Liu et al. https://arxiv.org/abs/2410.21333 [00:06:05] Bayesian inference, Griffiths, T. L., Kemp, C., & Tenenbaum, J. B. https://psycnet.apa.org/record/2008-06911-003 [00:13:00] Induction and Transduction, Wen-Ding Li, Zenna Tavares, Yewen Pu, Kevin Ellis https://arxiv.org/abs/2411.02272 [00:23:15] Neurosymbolic AI, Garcez, Artur d'Avila et al. https://arxiv.org/abs/2012.05876 [00:33:50] Induction and Transduction (II), Wen-Ding Li, Kevin Ellis et al. https://arxiv.org/abs/2411.02272 [00:38:35] ARC, François Chollet https://arxiv.org/abs/1911.01547 [00:39:20] Causal Reactive Programs, Ria Das, Joshua B. Tenenbaum, Armando Solar-Lezama, Zenna Tavares http://www.zenna.org/publications/autumn2022.pdf [00:42:50] MuZero, Julian Schrittwieser et al. http://arxiv.org/pdf/1911.08265 [00:43:20] VisualPredicator, Yichao Liang https://arxiv.org/abs/2410.23156 [00:48:55] Bayesian models of cognition, Joshua B. 
Tenenbaum https://mitpress.mit.edu/9780262049412/bayesian-models-of-cognition/ [00:49:30] The Bitter Lesson, Rich Sutton http://www.incompleteideas.net/IncIdeas/BitterLesson.html [01:06:35] Program induction, Kevin Ellis, Wen-Ding Li https://arxiv.org/pdf/2411.02272 [01:06:50] DreamCoder (II), Kevin Ellis et al. https://arxiv.org/abs/2006.08381 [01:11:55] Project MARA, Zenna Tavares, Kevin Ellis https://www.basis.ai/blog/mara/
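To make the LEGO picture of compositionality concrete, here is a toy enumerative program-synthesis sketch of ours (not from the episode): small primitives compose into larger programs until one explains the input-output examples, and the rapidly growing search space is exactly the "overwhelming" side of compositionality mentioned above.

```python
# Toy compositional program search: compose primitives until the composition
# explains the given input-output examples (shortest programs first).
from itertools import product

PRIMITIVES = {
    "inc": lambda x: x + 1,
    "double": lambda x: 2 * x,
    "square": lambda x: x * x,
}

def run(program, x):
    for op in program:
        x = PRIMITIVES[op](x)
    return x

def synthesize(examples, max_depth=3):
    for depth in range(1, max_depth + 1):
        for program in product(PRIMITIVES, repeat=depth):
            if all(run(program, x) == y for x, y in examples):
                return program
    return None

print(synthesize([(1, 4), (3, 8)]))  # -> ('inc', 'double'), i.e. 2 * (x + 1)
```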

Duration:01:16:55


Eiso Kant (CTO poolside) - Superhuman Coding Is Coming!

4/2/2025
Eiso Kant, CTO of poolside AI, discusses the company's approach to building frontier AI foundation models, particularly focused on software development. Their unique strategy is reinforcement learning from code execution feedback which is an important axis for scaling AI capabilities beyond just increasing model size or data volume. Kant predicts human-level AI in knowledge work could be achieved within 18-36 months, outlining poolside's vision to dramatically increase software development productivity and accessibility. SPONSOR MESSAGES: *** Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/ *** Eiso Kant: https://x.com/eisokant https://poolside.ai/ TRANSCRIPT: https://www.dropbox.com/scl/fi/szepl6taqziyqie9wgmk9/poolside.pdf?rlkey=iqar7dcwshyrpeoz0xa76k422&dl=0 TOC: 1. Foundation Models and AI Strategy [00:00:00] 1.1 Foundation Models and Timeline Predictions for AI Development [00:02:55] 1.2 Poolside AI's Corporate History and Strategic Vision [00:06:48] 1.3 Foundation Models vs Enterprise Customization Trade-offs 2. Reinforcement Learning and Model Economics [00:15:42] 2.1 Reinforcement Learning and Code Execution Feedback Approaches [00:22:06] 2.2 Model Economics and Experimental Optimization 3. Enterprise AI Implementation [00:25:20] 3.1 Poolside's Enterprise Deployment Strategy and Infrastructure [00:26:00] 3.2 Enterprise-First Business Model and Market Focus [00:27:05] 3.3 Foundation Models and AGI Development Approach [00:29:24] 3.4 DeepSeek Case Study and Infrastructure Requirements 4. LLM Architecture and Performance [00:30:15] 4.1 Distributed Training and Hardware Architecture Optimization [00:33:01] 4.2 Model Scaling Strategies and Chinchilla Optimality Trade-offs [00:36:04] 4.3 Emergent Reasoning and Model Architecture Comparisons [00:43:26] 4.4 Balancing Creativity and Determinism in AI Models [00:50:01] 4.5 AI-Assisted Software Development Evolution 5. AI Systems Engineering and Scalability [00:58:31] 5.1 Enterprise AI Productivity and Implementation Challenges [00:58:40] 5.2 Low-Code Solutions and Enterprise Hiring Trends [01:01:25] 5.3 Distributed Systems and Engineering Complexity [01:01:50] 5.4 GenAI Architecture and Scalability Patterns [01:01:55] 5.5 Scaling Limitations and Architectural Patterns in AI Code Generation 6. AI Safety and Future Capabilities [01:06:23] 6.1 Semantic Understanding and Language Model Reasoning Approaches [01:12:42] 6.2 Model Interpretability and Safety Considerations in AI Systems [01:16:27] 6.3 AI vs Human Capabilities in Software Development [01:33:45] 6.4 Enterprise Deployment and Security Architecture CORE REFS (see shownotes for URLs/more refs): [00:15:45] Research demonstrating how training on model-generated content leads to distribution collapse in AI models, Ilia Shumailov et al. (Key finding on synthetic data risk) [00:20:05] Foundational paper introducing Word2Vec for computing word vector representations, Tomas Mikolov et al. 
(Seminal NLP technique) [00:22:15] OpenAI O3 model's breakthrough performance on ARC Prize Challenge, OpenAI (Significant AI reasoning benchmark achievement) [00:22:40] Seminal paper proposing a formal definition of intelligence as skill-acquisition efficiency, François Chollet (Influential AI definition/philosophy) [00:30:30] Technical documentation of DeepSeek's V3 model architecture and capabilities, DeepSeek AI (Details on a major new model) [00:34:30] Foundational paper establishing optimal scaling laws for LLM training, Jordan Hoffmann et al. (Key paper on LLM scaling) [00:45:45] Seminal essay arguing that scaling computation consistently trumps human-engineered solutions in AI, Richard S. Sutton (Influential "Bitter Lesson" perspective)
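The key technical idea in this episode, reinforcement learning from code execution feedback, amounts to scoring generated code by actually running it against tests and using the outcome as the reward signal. Below is a minimal, hypothetical Python sketch of such a reward function; the function names, test format, and pass-rate scoring are invented for illustration and are not poolside's implementation (in a real RL pipeline this scalar would drive a policy update rather than a print statement).

```python
# Toy "code execution feedback" reward: run a candidate solution against unit
# tests in a subprocess and return the pass rate. Illustrative only; names,
# test format, and scoring are invented and are not poolside's implementation.
import subprocess
import sys
import tempfile
import textwrap
from typing import List


def execution_reward(candidate_code: str, tests: List[str], timeout_s: float = 5.0) -> float:
    """Reward in [0, 1]: fraction of tests the candidate passes when executed."""
    passed = 0
    for test in tests:
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(candidate_code + "\n\n" + test + "\n")
            path = f.name
        try:
            result = subprocess.run([sys.executable, path], capture_output=True, timeout=timeout_s)
            passed += int(result.returncode == 0)
        except subprocess.TimeoutExpired:
            pass  # non-terminating candidates earn nothing for this test
    return passed / max(len(tests), 1)


if __name__ == "__main__":
    candidate = textwrap.dedent("""
        def add(a, b):
            return a + b
    """)
    tests = ["assert add(1, 2) == 3", "assert add(-1, 1) == 0"]
    print(execution_reward(candidate, tests))  # 1.0: this trivial candidate passes both tests
```

Sandboxing, resource limits, and richer feedback signals (compiler errors, which tests failed) are where real systems would differ most from this toy.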

Duration:01:36:28


The Compendium - Connor Leahy and Gabriel Alfour

3/30/2025
Connor Leahy and Gabriel Alfour, AI researchers from Conjecture and authors of "The Compendium," join us for a critical discussion centered on Artificial Superintelligence (ASI) safety and governance. Drawing from their comprehensive analysis in "The Compendium," they articulate a stark warning about the existential risks inherent in uncontrolled AI development, framing it through the lens of "intelligence domination"—where a sufficiently advanced AI could subordinate humanity, much like humans dominate less intelligent species. SPONSOR MESSAGES: *** Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/ *** TRANSCRIPT + REFS + NOTES: https://www.dropbox.com/scl/fi/p86l75y4o2ii40df5t7no/Compendium.pdf?rlkey=tukczgf3flw133sr9rgss0pnj&dl=0 https://www.thecompendium.ai/ https://en.wikipedia.org/wiki/Connor_Leahy https://www.conjecture.dev/about https://substack.com/@gabecc TOC: 1. AI Intelligence and Safety Fundamentals [00:00:00] 1.1 Understanding Intelligence and AI Capabilities [00:06:20] 1.2 Emergence of Intelligence and Regulatory Challenges [00:10:18] 1.3 Human vs Animal Intelligence Debate [00:18:00] 1.4 AI Regulation and Risk Assessment Approaches [00:26:14] 1.5 Competing AI Development Ideologies 2. Economic and Social Impact [00:29:10] 2.1 Labor Market Disruption and Post-Scarcity Scenarios [00:32:40] 2.2 Institutional Frameworks and Tech Power Dynamics [00:37:40] 2.3 Ethical Frameworks and AI Governance Debates [00:40:52] 2.4 AI Alignment Evolution and Technical Challenges 3. Technical Governance Framework [00:55:07] 3.1 Three Levels of AI Safety: Alignment, Corrigibility, and Boundedness [00:55:30] 3.2 Challenges of AI System Corrigibility and Constitutional Models [00:57:35] 3.3 Limitations of Current Boundedness Approaches [00:59:11] 3.4 Abstract Governance Concepts and Policy Solutions 4. Democratic Implementation and Coordination [00:59:20] 4.1 Governance Design and Measurement Challenges [01:00:10] 4.2 Democratic Institutions and Experimental Governance [01:14:10] 4.3 Political Engagement and AI Safety Advocacy [01:25:30] 4.4 Practical AI Safety Measures and International Coordination CORE REFS: [00:01:45] The Compendium (2023), Leahy et al.
https://pdf.thecompendium.ai/the_compendium.pdf [00:06:50] Geoffrey Hinton Leaves Google, BBC News https://www.bbc.com/news/world-us-canada-65452940 [00:10:00] ARC-AGI, Chollet https://arcprize.org/arc-agi [00:13:25] A Brief History of Intelligence, Bennett https://www.amazon.com/Brief-History-Intelligence-Humans-Breakthroughs/dp/0063286343 [00:25:35] Statement on AI Risk, Center for AI Safety https://www.safe.ai/work/statement-on-ai-risk [00:26:15] Machines of Loving Grace, Amodei https://darioamodei.com/machines-of-loving-grace [00:26:35] The Techno-Optimist Manifesto, Andreessen https://a16z.com/the-techno-optimist-manifesto/ [00:31:55] Techno-Feudalism, Varoufakis https://www.amazon.co.uk/Technofeudalism-Killed-Capitalism-Yanis-Varoufakis/dp/1847927270 [00:42:40] Introducing Superalignment, OpenAI https://openai.com/index/introducing-superalignment/ [00:47:20] Three Laws of Robotics, Asimov https://www.britannica.com/topic/Three-Laws-of-Robotics [00:50:00] Symbolic AI (GOFAI), Haugeland https://en.wikipedia.org/wiki/Symbolic_artificial_intelligence [00:52:30] Intent Alignment, Christiano https://www.alignmentforum.org/posts/HEZgGBZTpT4Bov7nH/mapping-the-conceptual-territory-in-ai-existential-safety [00:55:10] Large Language Model Alignment: A Survey, Jiang et al. http://arxiv.org/pdf/2309.15025 [00:55:40] Constitutional Checks and Balances, Bok https://plato.stanford.edu/entries/montesquieu/

Duration:01:37:10


ARC Prize v2 Launch! (Francois Chollet and Mike Knoop)

3/24/2025
We are joined by Francois Chollet and Mike Knoop to launch the new version of the ARC prize! In version 2, the challenges have been calibrated with humans such that at least 2 humans could solve each task in a reasonable amount of time, but also adversarially selected so that frontier reasoning models can't solve them. The best LLMs today get negligible performance on this challenge. https://arcprize.org/ SPONSOR MESSAGES: *** Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/ *** TRANSCRIPT: https://www.dropbox.com/scl/fi/0v9o8xcpppdwnkntj59oi/ARCv2.pdf?rlkey=luqb6f141976vra6zdtptv5uj&dl=0 TOC: 1. ARC v2 Core Design & Objectives [00:00:00] 1.1 ARC v2 Launch and Benchmark Architecture [00:03:16] 1.2 Test-Time Optimization and AGI Assessment [00:06:24] 1.3 Human-AI Capability Analysis [00:13:02] 1.4 OpenAI o3 Initial Performance Results 2. ARC Technical Evolution [00:17:20] 2.1 ARC-v1 to ARC-v2 Design Improvements [00:21:12] 2.2 Human Validation Methodology [00:26:05] 2.3 Task Design and Gaming Prevention [00:29:11] 2.4 Intelligence Measurement Framework 3. O3 Performance & Future Challenges [00:38:50] 3.1 O3 Comprehensive Performance Analysis [00:43:40] 3.2 System Limitations and Failure Modes [00:49:30] 3.3 Program Synthesis Applications [00:53:00] 3.4 Future Development Roadmap REFS: [00:00:15] On the Measure of Intelligence, François Chollet https://arxiv.org/abs/1911.01547 [00:06:45] ARC Prize Foundation, François Chollet, Mike Knoop https://arcprize.org/ [00:12:50] OpenAI o3 model performance on ARC v1, ARC Prize Team https://arcprize.org/blog/oai-o3-pub-breakthrough [00:18:30] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, Jason Wei et al. https://arxiv.org/abs/2201.11903 [00:21:45] ARC-v2 benchmark tasks, Mike Knoop https://arcprize.org/blog/introducing-arc-agi-public-leaderboard [00:26:05] ARC Prize 2024: Technical Report, Francois Chollet et al. https://arxiv.org/html/2412.04604v2 [00:32:45] ARC Prize 2024 Technical Report, Francois Chollet, Mike Knoop, Gregory Kamradt https://arxiv.org/abs/2412.04604 [00:48:55] The Bitter Lesson, Rich Sutton http://www.incompleteideas.net/IncIdeas/BitterLesson.html [00:53:30] Decoding strategies in neural text generation, Sina Zarrieß https://www.mdpi.com/2078-2489/12/9/355/pdf
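For readers new to the benchmark: ARC tasks are small coloured-grid puzzles distributed as JSON, each with a few demonstration input/output pairs and one or more held-out test inputs, and a submission is scored by exact match on the predicted output grids. The sketch below shows that evaluation loop under the assumption of the public ARC-AGI JSON layout ("train"/"test" lists of "input"/"output" grids); the solve() stub is a placeholder, not anything released by the ARC Prize team.

```python
# Minimal sketch of ARC-style evaluation: exact match on predicted output grids.
# Assumes the public ARC-AGI JSON layout ({"train": [...], "test": [...]}); the
# solve() stub is a placeholder, not an actual competition entry.
import json
from typing import List

Grid = List[List[int]]


def solve(train_pairs: List[dict], test_input: Grid) -> Grid:
    """Placeholder solver: echoes the input grid back unchanged."""
    return test_input


def score_task(path: str) -> float:
    """Fraction of a task's test grids predicted exactly right (no partial credit)."""
    with open(path) as f:
        task = json.load(f)
    correct = 0
    for pair in task["test"]:
        prediction = solve(task["train"], pair["input"])
        correct += int(prediction == pair["output"])
    return correct / len(task["test"])


# Example usage, assuming a local checkout of the public ARC-AGI repository:
# print(score_task("ARC-AGI/data/training/0520fde7.json"))
```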

Duration:00:54:15


Test-Time Adaptation: the key to reasoning with DL (Mohamed Osman)

3/22/2025
Mohamed Osman joins to discuss MindsAI's highest-scoring entry to the ARC challenge 2024 and the paradigm of test-time fine-tuning. They explore how the team, now part of Tufa Labs in Zurich, achieved state-of-the-art results using a combination of pre-training techniques, a unique meta-learning strategy, and an ensemble voting mechanism. Mohamed emphasizes the importance of raw data input and flexibility of the network. SPONSOR MESSAGES: *** Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/ *** TRANSCRIPT + REFS: https://www.dropbox.com/scl/fi/jeavyqidsjzjgjgd7ns7h/MoFInal.pdf?rlkey=cjjmo7rgtenxrr3b46nk6yq2e&dl=0 Mohamed Osman (Tufa Labs) https://x.com/MohamedOsmanML Jack Cole (Tufa Labs) https://x.com/MindsAI_Jack How and why deep learning for ARC paper: https://github.com/MohamedOsman1998/deep-learning-for-arc/blob/main/deep_learning_for_arc.pdf TOC: 1. Abstract Reasoning Foundations [00:00:00] 1.1 Test-Time Fine-Tuning and ARC Challenge Overview [00:10:20] 1.2 Neural Networks vs Programmatic Approaches to Reasoning [00:13:23] 1.3 Code-Based Learning and Meta-Model Architecture [00:20:26] 1.4 Technical Implementation with Long T5 Model 2. ARC Solution Architectures [00:24:10] 2.1 Test-Time Tuning and Voting Methods for ARC Solutions [00:27:54] 2.2 Model Generalization and Function Generation Challenges [00:32:53] 2.3 Input Representation and VLM Limitations [00:36:21] 2.4 Architecture Innovation and Cross-Modal Integration [00:40:05] 2.5 Future of ARC Challenge and Program Synthesis Approaches 3. Advanced Systems Integration [00:43:00] 3.1 DreamCoder Evolution and LLM Integration [00:50:07] 3.2 MindsAI Team Progress and Acquisition by Tufa Labs [00:54:15] 3.3 ARC v2 Development and Performance Scaling [00:58:22] 3.4 Intelligence Benchmarks and Transformer Limitations [01:01:50] 3.5 Neural Architecture Optimization and Processing Distribution REFS: [00:01:32] Original ARC challenge paper, François Chollet https://arxiv.org/abs/1911.01547 [00:06:55] DreamCoder, Kevin Ellis et al. https://arxiv.org/abs/2006.08381 [00:12:50] Deep Learning with Python, François Chollet https://www.amazon.com/Deep-Learning-Python-Francois-Chollet/dp/1617294438 [00:13:35] Influence of pretraining data for reasoning, Laura Ruis https://arxiv.org/abs/2411.12580 [00:17:50] Latent Program Networks, Clement Bonnet https://arxiv.org/html/2411.08706v1 [00:20:50] T5, Colin Raffel et al. https://arxiv.org/abs/1910.10683 [00:30:30] Combining Induction and Transduction for Abstract Reasoning, Wen-Ding Li, Kevin Ellis et al. https://arxiv.org/abs/2411.02272 [00:34:15] Six finger problem, Chen et al. https://openaccess.thecvf.com/content/CVPR2024/papers/Chen_SpatialVLM_Endowing_Vision-Language_Models_with_Spatial_Reasoning_Capabilities_CVPR_2024_paper.pdf [00:38:15] DeepSeek-R1-Distill-Llama, DeepSeek AI https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B [00:40:10] ARC Prize 2024 Technical Report, François Chollet et al.
https://arxiv.org/html/2412.04604v2 [00:45:20] LLM-Guided Compositional Program Synthesis, Wen-Ding Li and Kevin Ellis https://arxiv.org/html/2503.15540 [00:54:25] Abstraction and Reasoning Corpus, François Chollet https://github.com/fchollet/ARC-AGI [00:57:10] O3 breakthrough on ARC-AGI, OpenAI https://arcprize.org/ [00:59:35] ConceptARC Benchmark, Arseny Moskvichev, Melanie Mitchell https://arxiv.org/abs/2305.07141 [01:02:05] Mixtape: Breaking the Softmax Bottleneck Efficiently, Yang, Zhilin and Dai, Zihang and Salakhutdinov, Ruslan and Cohen, William W. http://papers.neurips.cc/paper/9723-mixtape-breaking-the-softmax-bottleneck-efficiently.pdf
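To make the episode's two main ingredients concrete, here is a toy Python sketch of (a) a stand-in for per-task test-time adaptation and (b) ensemble voting over geometric augmentations of the test input. It is illustrative only and not the MindsAI/Tufa Labs pipeline: in their actual approach the predictor is a sequence model (the TOC mentions a Long T5 variant) fine-tuned on each task's demonstration pairs, whereas the stub below simply returns the input unchanged.

```python
# Toy sketch of test-time adaptation for ARC-style tasks: adapt per task, then
# ensemble predictions over rotated views of the test grid by majority vote.
# Illustrative only; names and structure are invented, not the MindsAI pipeline.
import json
from collections import Counter
from typing import Callable, List

Grid = List[List[int]]


def rotate90(grid: Grid) -> Grid:
    """Rotate a grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def test_time_adapt(demo_pairs: List[dict]) -> Callable[[Grid], Grid]:
    """Stand-in for per-task fine-tuning: a real system would take gradient steps
    on the task's demonstration pairs here; this stub just returns the identity map."""
    return lambda grid: grid


def vote(predict: Callable[[Grid], Grid], test_input: Grid) -> Grid:
    """Predict on each of the four rotations, map each prediction back to the
    original frame, and return the majority answer."""
    ballots = []
    view = test_input
    for k in range(4):
        pred = predict(view)
        for _ in range((4 - k) % 4):   # undo the k rotations applied to this view
            pred = rotate90(pred)
        ballots.append(json.dumps(pred))  # hashable key for counting
        view = rotate90(view)
    winner, _ = Counter(ballots).most_common(1)[0]
    return json.loads(winner)


if __name__ == "__main__":
    demos = [{"input": [[1]], "output": [[1]]}]   # pretend demonstration pair
    predictor = test_time_adapt(demos)
    print(vote(predictor, [[1, 0], [0, 2]]))      # [[1, 0], [0, 2]] for the identity stub
```

The voting step is deliberately simple here; real entries typically use more transforms (flips, colour permutations) and weight votes by model confidence.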

Duration:01:03:36


GSM-Symbolic paper - Iman Mirzadeh (Apple)

3/19/2025
Iman Mirzadeh from Apple, who recently published the GSM-Symbolic paper, discusses the crucial distinction between intelligence and achievement in AI systems. He critiques current AI research methodologies, highlighting the limitations of Large Language Models (LLMs) in reasoning and knowledge representation. SPONSOR MESSAGES: *** Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/ *** TRANSCRIPT + RESEARCH: https://www.dropbox.com/scl/fi/mlcjl9cd5p1kem4l0vqd3/IMAN.pdf?rlkey=dqfqb74zr81a5gqr8r6c8isg3&dl=0 TOC: 1. Intelligence vs Achievement in AI Systems [00:00:00] 1.1 Intelligence vs Achievement Metrics in AI Systems [00:03:27] 1.2 AlphaZero and Abstract Understanding in Chess [00:10:10] 1.3 Language Models and Distribution Learning Limitations [00:14:47] 1.4 Research Methodology and Theoretical Frameworks 2. Intelligence Measurement and Learning [00:24:24] 2.1 LLM Capabilities: Interpolation vs True Reasoning [00:29:00] 2.2 Intelligence Definition and Measurement Approaches [00:34:35] 2.3 Learning Capabilities and Agency in AI Systems [00:39:26] 2.4 Abstract Reasoning and Symbol Understanding 3. LLM Performance and Evaluation [00:47:15] 3.1 Scaling Laws and Fundamental Limitations [00:54:33] 3.2 Connectionism vs Symbolism Debate in Neural Networks [00:58:09] 3.3 GSM-Symbolic: Testing Mathematical Reasoning in LLMs [01:08:38] 3.4 Benchmark Evaluation and Model Performance Assessment REFS: [00:01:00] AlphaZero chess AI system, Silver et al. https://arxiv.org/abs/1712.01815 [00:07:10] Game Changer: AlphaZero's Groundbreaking Chess Strategies, Sadler & Regan https://www.amazon.com/Game-Changer-AlphaZeros-Groundbreaking-Strategies/dp/9056918184 [00:11:35] Cross-entropy loss in language modeling, Voita http://lena-voita.github.io/nlp_course/language_modeling.html [00:17:20] GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in LLMs, Mirzadeh et al. https://arxiv.org/abs/2410.05229 [00:21:25] Connectionism and Cognitive Architecture: A Critical Analysis, Fodor & Pylyshyn https://www.sciencedirect.com/science/article/pii/001002779090014B [00:28:55] Brain-to-body mass ratio scaling laws, Sutskever https://www.theverge.com/2024/12/13/24320811/what-ilya-sutskever-sees-openai-model-data-training [00:29:40] On the Measure of Intelligence, Chollet https://arxiv.org/abs/1911.01547 [00:33:30] On definition of intelligence, Gignac et al. https://www.sciencedirect.com/science/article/pii/S0160289624000266 [00:35:30] Defining intelligence, Wang https://cis.temple.edu/~wangp/papers.html [00:37:40] How We Learn: Why Brains Learn Better Than Any Machine... for Now, Dehaene https://www.amazon.com/How-We-Learn-Brains-Machine/dp/0525559884 [00:39:35] Surfaces and Essences: Analogy as the Fuel and Fire of Thinking, Hofstadter and Sander https://www.amazon.com/Surfaces-Essences-Analogy-Fuel-Thinking/dp/0465018475 [00:43:15] Chain-of-thought prompting, Wei et al. https://arxiv.org/abs/2201.11903 [00:47:20] Test-time scaling laws in machine learning, Brown https://podcasts.apple.com/mv/podcast/openais-noam-brown-ilge-akkaya-and-hunter-lightman-on/id1750736528?i=1000671532058 [00:47:50] Scaling Laws for Neural Language Models, Kaplan et al.
https://arxiv.org/abs/2001.08361 [00:55:15] Tensor product variable binding, Smolensky https://www.sciencedirect.com/science/article/abs/pii/000437029090007M [01:08:45] GSM-8K dataset, OpenAI https://huggingface.co/datasets/openai/gsm8k
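The GSM-Symbolic methodology discussed here turns GSM8K-style word problems into templates whose names and numbers are resampled, then asks whether a model's accuracy holds up across the superficially different variants. Below is a toy Python sketch of that idea; the template, value ranges, and the dummy "model" are invented for illustration and are not the paper's released benchmark.

```python
# Toy illustration of the GSM-Symbolic idea: resample names and numbers in a
# templated word problem and measure whether answers stay correct across variants.
# The template, ranges, and the dummy model below are invented for illustration.
import random
import re
from typing import Callable

TEMPLATE = ("{name} picks {a} apples on Monday and {b} apples on Tuesday. "
            "How many apples does {name} have in total?")
NAMES = ["Sophie", "Liam", "Aisha", "Mateo"]


def make_variant(rng: random.Random):
    """Instantiate the template with fresh names/numbers; return (question, answer)."""
    a, b = rng.randint(2, 30), rng.randint(2, 30)
    return TEMPLATE.format(name=rng.choice(NAMES), a=a, b=b), a + b


def accuracy_over_variants(ask_model: Callable[[str], int], n: int = 50, seed: int = 0) -> float:
    """Fraction of resampled variants the model answers correctly."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n):
        question, answer = make_variant(rng)
        correct += int(ask_model(question) == answer)
    return correct / n


if __name__ == "__main__":
    def dummy_model(question: str) -> int:
        # Stand-in for an LLM call: simply adds the two numbers it finds in the text.
        return sum(int(x) for x in re.findall(r"\d+", question))

    print(accuracy_over_variants(dummy_model))  # 1.0: this dummy "model" is robust by construction
```

A brittle model, by contrast, would show accuracy dropping as the names and numbers drift away from the exact instances seen in training, which is the effect the paper measures.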

Duration:01:11:23