
Machine Learning Street Talk (MLST)

Welcome! We engage in fascinating discussions with pre-eminent figures in the AI field. Our flagship show covers current affairs in AI, cognitive science, neuroscience and philosophy of mind with in-depth analysis.

Available episodes

5 of 195
  • Jürgen Schmidhuber on Humans co-existing with AIs
    Jürgen Schmidhuber, the father of generative AI, challenges current AI narratives, arguing that early deep learning work is misattributed and, in his view, actually originated in Ukraine and Japan. He discusses his early work on linear transformers and artificial curiosity, which preceded modern developments, shares his expansive vision of AI colonising space, and explains his groundbreaking 1991 consciousness model. Schmidhuber dismisses fears of human-AI conflict, arguing that superintelligent AI scientists will be fascinated by their own origins and motivated to protect life rather than harm it, while being more interested in other superintelligent AIs and in cosmic expansion than in earthly matters. He offers unique insights into how humans and AI might coexist. This is the long-awaited second, previously unreleased part of the interview we filmed last time.
    SPONSOR MESSAGES:
    ***
    CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. https://centml.ai/pricing/
    Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. Are you interested in working on reasoning, or getting involved in their events? Go to https://tufalabs.ai/
    ***
    Interviewer: Tim Scarfe
    TOC:
    [00:00:00] The Nature and Motivations of AI
    [00:02:08] Influential Inventions: 20th vs. 21st Century
    [00:05:28] Transformer and GPT: A Reflection. The revolutionary impact of modern language models, the 1991 linear transformer, linear vs. quadratic scaling, the fast weight controller, and fast weight matrix memory (a toy sketch of the scaling difference follows this episode entry).
    [00:11:03] Pioneering Contributions to AI and Deep Learning. The invention of the transformer, pre-trained networks, the first GANs, the role of predictive coding, and the emergence of artificial curiosity.
    [00:13:58] AI's Evolution and Achievements. The role of compute, breakthroughs in handwriting recognition and computer vision, the rise of GPU-based CNNs, achieving superhuman results, and Japanese contributions to CNN development.
    [00:15:40] The Hardware Lottery and GPUs. GPUs as a serendipitous advantage for AI, the gaming-AI parallel, and Nvidia's strategic shift towards AI.
    [00:19:58] AI Applications and Societal Impact. AI-powered translation breaking communication barriers, AI in medicine for imaging and disease prediction, and AI's potential for human enhancement and sustainable development.
    [00:23:26] The Path to AGI and Current Limitations. Distinguishing large language models from AGI, challenges in replacing physical-world workers, and AI's difficulty with real-world versus board games.
    [00:25:56] AI and Consciousness. Simulating consciousness through unsupervised learning, chunking and automatizing neural networks, data compression, and self-symbols in predictive world models.
    [00:30:50] The Future of AI and Humanity. The transition from AGIs as tools to AGIs with their own goals, the role of humans in an AGI-dominated world, and the concept of Homo Ludens.
    [00:38:05] The AI Race: Europe, China, and the US. Europe's historical contributions, the current dominance of the US and East Asia, and the role of venture capital and industrial policy.
    [00:50:32] Addressing AI Existential Risk. The obsession with AI existential risk, commercial pressure for friendly AIs, AI vs. hydrogen bombs, and the long-term future of AI.
    [00:58:00] The Fermi Paradox and Extraterrestrial Intelligence. Expanding AI bubbles as an explanation for the Fermi paradox, dark matter and encrypted civilizations, and Earth as the first to spawn an AI bubble.
    [01:02:08] The Diversity of AI and AI Ecologies. The unrealism of a monolithic superintelligence, diverse AIs with varying goals, and intense competition and collaboration in AI ecologies.
    [01:12:21] Final Thoughts and Closing Remarks
    REFERENCES: See the pinned comment on YouTube: https://youtu.be/fZYUqICYCAk
    --------  
    1:12:50
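
A toy sketch of the linear- vs. quadratic-scaling point from the [00:05:28] chapter. This NumPy sketch illustrates the general idea rather than Schmidhuber's original 1991 formulation: softmax attention materialises a T x T score matrix, while fast-weight (linear) attention summarises keys and values in a single matrix updated one token at a time. The feature map phi and all sizes are assumptions chosen for clarity.

```python
import numpy as np

def quadratic_attention(Q, K, V):
    """Standard softmax attention: the T x T score matrix makes the
    cost grow quadratically with sequence length T."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # (T, T)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)             # row-wise softmax
    return w @ V                                   # (T, d_v)

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Linear (fast-weight) attention: keys and values are summarised in
    a d_k x d_v 'fast weight matrix' updated one token at a time, so the
    cost grows linearly with T. phi is an assumed positive feature map."""
    W = np.zeros((K.shape[-1], V.shape[-1]))       # fast weight matrix memory
    z = np.zeros(K.shape[-1])                      # running normaliser
    out = []
    for q, k, v in zip(Q, K, V):
        fk = phi(k)
        W += np.outer(fk, v)                       # write: outer-product update
        z += fk
        fq = phi(q)
        out.append((fq @ W) / (fq @ z + 1e-9))     # read: query the memory
    return np.array(out)

# Toy check with T = 8 tokens of width 4: both kernels map (8, 4) -> (8, 4),
# but only the quadratic one ever builds an 8 x 8 score matrix.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(8, 4)) for _ in range(3))
print(quadratic_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)
```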
  • Yoshua Bengio - Designing out Agency for Safe AI
    Professor Yoshua Bengio is a pioneer in deep learning and a Turing Award winner. Bengio talks about AI safety, why goal-seeking "agentic" AIs might be dangerous, and his vision for building powerful AI tools without giving them agency. Topics include reward-tampering risks, instrumental convergence, global AI governance, and how non-agent AIs could revolutionize science and medicine while reducing existential threats. Perfect for anyone curious about advanced AI risks and how to manage them responsibly.
    SPONSOR MESSAGES:
    ***
    CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. https://centml.ai/pricing/
    Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. Are you interested in working on reasoning, or getting involved in their events? They are hosting an event in Zurich on January 9th with the ARChitects, join if you can. Go to https://tufalabs.ai/
    ***
    Interviewer: Tim Scarfe
    Yoshua Bengio:
    https://x.com/Yoshua_Bengio
    https://scholar.google.com/citations?user=kukA0LcAAAAJ&hl=en
    https://yoshuabengio.org/
    https://en.wikipedia.org/wiki/Yoshua_Bengio
    TOC:
    1. AI Safety Fundamentals
    [00:00:00] 1.1 AI Safety Risks and International Cooperation
    [00:03:20] 1.2 Fundamental Principles vs Scaling in AI Development
    [00:11:25] 1.3 System 1/2 Thinking and AI Reasoning Capabilities
    [00:15:15] 1.4 Reward Tampering and AI Agency Risks
    [00:25:17] 1.5 Alignment Challenges and Instrumental Convergence
    2. AI Architecture and Safety Design
    [00:33:10] 2.1 Instrumental Goals and AI Safety Fundamentals
    [00:35:02] 2.2 Separating Intelligence from Goals in AI Systems
    [00:40:40] 2.3 Non-Agent AI as Scientific Tools
    [00:44:25] 2.4 Oracle AI Systems and Mathematical Safety Frameworks
    3. Global Governance and Security
    [00:49:50] 3.1 International AI Competition and Hardware Governance
    [00:51:58] 3.2 Military and Security Implications of AI Development
    [00:56:07] 3.3 Personal Evolution of AI Safety Perspectives
    [01:00:25] 3.4 AI Development Scaling and Global Governance Challenges
    [01:12:10] 3.5 AI Regulation and Corporate Oversight
    4. Technical Innovations
    [01:23:00] 4.1 Evolution of Neural Architectures: From RNNs to Transformers
    [01:26:02] 4.2 GFlowNets and Symbolic Computation
    [01:30:47] 4.3 Neural Dynamics and Consciousness
    [01:34:38] 4.4 AI Creativity and Scientific Discovery
    SHOWNOTES (transcript, references, best clips etc.): https://www.dropbox.com/scl/fi/ajucigli8n90fbxv9h94x/BENGIO_SHOW.pdf?rlkey=38hi2m19sylnr8orb76b85wkw&dl=0
    CORE REFS (full list in shownotes and pinned comment):
    [00:00:15] Bengio et al., "AI Risk" Statement: https://www.safe.ai/work/statement-on-ai-risk
    [00:23:10] Bengio on reward tampering and AI safety (Harvard Data Science Review): https://hdsr.mitpress.mit.edu/pub/w974bwb0
    [00:40:45] Munk Debate on AI existential risk, featuring Bengio: https://munkdebates.com/debates/artificial-intelligence
    [00:44:30] "Can a Bayesian Oracle Prevent Harm from an Agent?" (Bengio et al.), on oracle-to-agent safety: https://arxiv.org/abs/2408.05284
    [00:51:20] Bengio (2024) memo on hardware-based AI governance verification: https://yoshuabengio.org/wp-content/uploads/2024/08/FlexHEG-Memo_August-2024.pdf
    [01:12:55] Bengio's involvement in the EU AI Act code of practice: https://digital-strategy.ec.europa.eu/en/news/meet-chairs-leading-development-first-general-purpose-ai-code-practice
    [01:27:05] Complexity-based compositionality theory (Elmoznino, Jiralerspong, Bengio, Lajoie): https://arxiv.org/abs/2410.14817
    [01:29:00] GFlowNet Foundations (Bengio et al.) for probabilistic inference: https://arxiv.org/pdf/2111.09266
    [01:32:10] Discrete attractor states in neural systems (Nam, Elmoznino, Bengio, Lajoie): https://arxiv.org/pdf/2302.06403
    --------  
    1:41:53
  • François Chollet - ARC reflections - NeurIPS 2024
    François Chollet discusses the outcomes of the 2024 ARC-AGI (Abstraction and Reasoning Corpus) Prize competition, in which accuracy rose from 33% to 55.5% on a private evaluation set.
    SPONSOR MESSAGES:
    ***
    CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. https://centml.ai/pricing/
    Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. Are you interested in working on reasoning, or getting involved in their events? They are hosting an event in Zurich on January 9th with the ARChitects, join if you can. Go to https://tufalabs.ai/
    ***
    Read about the recent o3 result on ARC (Chollet knew about it at the time of the interview but wasn't allowed to say so): https://arcprize.org/blog/oai-o3-pub-breakthrough
    TOC:
    1. Introduction and Opening
    [00:00:00] 1.1 Deep Learning vs. Symbolic Reasoning: François's Long-Standing Hybrid View
    [00:00:48] 1.2 "Why Do They Call You a Symbolist?" – Addressing Misconceptions
    [00:01:31] 1.3 Defining Reasoning
    3. ARC Competition 2024 Results and Evolution
    [00:07:26] 3.1 ARC Prize 2024: Reflecting on the Narrative Shift Toward System 2
    [00:10:29] 3.2 Comparing Private Leaderboard vs. Public Leaderboard Solutions
    [00:13:17] 3.3 Two Winning Approaches: Deep Learning–Guided Program Synthesis and Test-Time Training
    4. Transduction vs. Induction in ARC
    [00:16:04] 4.1 Test-Time Training, Overfitting Concerns, and Developer-Aware Generalization
    [00:19:35] 4.2 Gradient Descent Adaptation vs. Discrete Program Search
    5. ARC-2 Development and Future Directions
    [00:23:51] 5.1 Ensemble Methods, Benchmark Flaws, and the Need for ARC-2
    [00:25:35] 5.2 Human-Level Performance Metrics and Private Test Sets
    [00:29:44] 5.3 Task Diversity, Redundancy Issues, and Expanded Evaluation Methodology
    6. Program Synthesis Approaches
    [00:30:18] 6.1 Induction vs. Transduction
    [00:32:11] 6.2 Challenges of Writing Algorithms for Perceptual vs. Algorithmic Tasks
    [00:34:23] 6.3 Combining Induction and Transduction
    [00:37:05] 6.4 Multi-View Insight and Overfitting Regulation
    7. Latent Space and Graph-Based Synthesis
    [00:38:17] 7.1 Clément Bonnet's Latent Program Search Approach
    [00:40:10] 7.2 Decoding to Symbolic Form and Local Discrete Search
    [00:41:15] 7.3 Graph of Operators vs. Token-by-Token Code Generation
    [00:45:50] 7.4 Iterative Program Graph Modifications and Reusable Functions
    8. Compute Efficiency and Lifelong Learning
    [00:48:05] 8.1 Symbolic Process for Architecture Generation
    [00:50:33] 8.2 Logarithmic Relationship of Compute and Accuracy (a toy illustration of such a curve follows this episode entry)
    [00:52:20] 8.3 Learning New Building Blocks for Future Tasks
    9. AI Reasoning and Future Development
    [00:53:15] 9.1 Consciousness as a Self-Consistency Mechanism in Iterative Reasoning
    [00:56:30] 9.2 Reconciling Symbolic and Connectionist Views
    [01:00:13] 9.3 System 2 Reasoning - Awareness and Consistency
    [01:03:05] 9.4 Novel Problem Solving, Abstraction, and Reusability
    10. Program Synthesis and Research Lab
    [01:05:53] 10.1 François Leaving Google to Focus on Program Synthesis
    [01:09:55] 10.2 Democratizing Programming and Natural Language Instruction
    11. Frontier Models and O1 Architecture
    [01:14:38] 11.1 Search-Based Chain of Thought vs. Standard Forward Pass
    [01:16:55] 11.2 o1's Natural Language Program Generation and Test-Time Compute Scaling
    [01:19:35] 11.3 Logarithmic Gains with Deeper Search
    12. ARC Evaluation and Human Intelligence
    [01:22:55] 12.1 LLMs as Guessing Machines and Agent Reliability Issues
    [01:25:02] 12.2 ARC-2 Human Testing and Correlation with g-Factor
    [01:26:16] 12.3 Closing Remarks and Future Directions
    SHOWNOTES PDF: https://www.dropbox.com/scl/fi/ujaai0ewpdnsosc5mc30k/CholletNeurips.pdf?rlkey=s68dp432vefpj2z0dp5wmzqz6&st=hazphyx5&dl=0
    --------  
    1:26:46
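
A hypothetical illustration of the "Logarithmic Relationship of Compute and Accuracy" chapter (8.2): if each doubling of test-time search compute buys a roughly constant accuracy increment, accuracy behaves like a + b * log2(compute). The coefficients below are invented for the example and are not fitted to ARC results.

```python
import math

# Assumed coefficients, purely for illustration: 30% accuracy at one
# unit of compute, plus four percentage points per doubling.
base_acc, gain_per_doubling = 0.30, 0.04

for compute in (1, 2, 4, 8, 16, 32, 64):
    acc = base_acc + gain_per_doubling * math.log2(compute)
    print(f"compute = {compute:3d} units -> predicted accuracy = {acc:.2f}")
```

Note the diminishing returns this implies: going from 32 to 64 units buys the same four points as going from 1 to 2, at thirty-two times the cost.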
  • Jeff Clune - Agent AI Needs Darwin
    AI professor Jeff Clune ruminates on open-ended evolutionary algorithms: systems designed to generate novel and interesting outcomes forever. Drawing inspiration from nature's boundless creativity, Clune and his collaborators aim to build "Darwin Complete" search spaces, in which any computable environment can be simulated. By harnessing the power of large language models and reinforcement learning, these AI agents continuously develop new skills, explore uncharted domains, and even cooperate with one another on complex tasks.
    A central theme throughout Clune's work is "interestingness": an elusive quality that nudges AI agents toward genuinely original discoveries. Rather than relying on narrowly defined metrics, which often fail due to Goodhart's Law, Clune employs language models as proxies for human judgment. In doing so, he ensures that "interesting" always reflects authentic novelty, opening the door to unending innovation (a toy sketch of such a novelty loop follows this episode entry).
    Yet with these extraordinary possibilities come equally significant risks. Clune says we need AI safety measures, particularly as the technology matures into powerful, open-ended forms. Potential pitfalls include agents inadvertently causing harm or malicious actors subverting AI's capabilities for destructive ends. To mitigate this, Clune advocates prudent governance involving democratic coalitions, regulation of cutting-edge models, and global alignment protocols.
    SPONSOR MESSAGES:
    ***
    CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. https://centml.ai/pricing/
    Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on reasoning and AGI. Are you interested in working on reasoning, or getting involved in their events? They are hosting an event in Zurich on January 9th with the ARChitects, join if you can. Go to https://tufalabs.ai/
    ***
    Jeff Clune:
    https://x.com/jeffclune
    http://jeffclune.com/
    Interviewer: Tim Scarfe
    TOC:
    1. Introduction
    [00:00:00] 1.1 Overview and Opening Thoughts
    2. Sponsorship
    [00:03:00] 2.1 TufaAI Labs and CentML
    3. Evolutionary AI Foundations
    [00:04:12] 3.1 Open-Ended Algorithm Development and Abstraction Approaches
    [00:07:56] 3.2 Novel Intelligence Forms and Serendipitous Discovery
    [00:11:46] 3.3 Frontier Models and the 'Interestingness' Problem
    [00:30:36] 3.4 Darwin Complete Systems and Evolutionary Search Spaces
    4. System Architecture and Learning
    [00:37:35] 4.1 Code Generation vs Neural Networks Comparison
    [00:41:04] 4.2 Thought Cloning and Behavioral Learning Systems
    [00:47:00] 4.3 Language Emergence in AI Systems
    [00:50:23] 4.4 AI Interpretability and Safety Monitoring Techniques
    5. AI Safety and Governance
    [00:53:56] 5.1 Language Model Consistency and Belief Systems
    [00:57:00] 5.2 AI Safety Challenges and Alignment Limitations
    [01:02:07] 5.3 Open Source AI Development and Value Alignment
    [01:08:19] 5.4 Global AI Governance and Development Control
    6. Advanced AI Systems and Evolution
    [01:16:55] 6.1 Agent Systems and Performance Evaluation
    [01:22:45] 6.2 Continuous Learning Challenges and In-Context Solutions
    [01:26:46] 6.3 Evolution Algorithms and Environment Generation
    [01:35:36] 6.4 Evolutionary Biology Insights and Experiments
    [01:48:08] 6.5 Personal Journey from Philosophy to AI Research
    SHOWNOTES: We craft detailed show notes for each episode, with a high-quality transcript, references, and the best parts bolded. https://www.dropbox.com/scl/fi/fz43pdoc5wq5jh7vsnujl/JEFFCLUNE.pdf?rlkey=uu0e70ix9zo6g5xn6amykffpm&st=k2scxteu&dl=0
    --------  
    2:00:13
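
A minimal novelty-search-style loop in the spirit of the "interestingness" discussion above. In the systems Clune describes, a language model judges whether a candidate is interestingly new; the one-dimensional genome and the distance-based stand-in judge below are assumptions made purely for illustration.

```python
import random

def interestingness(candidate, archive):
    """Stand-in judge: distance to the nearest archived artifact.
    Clune's actual proposal uses a language model as a proxy for human
    judgments of novelty; this heuristic merely plays that role here."""
    return min(abs(candidate - kept) for kept in archive)

def mutate(parent):
    """Toy variation operator on a one-dimensional 'genome'."""
    return parent + random.gauss(0.0, 1.0)

random.seed(0)
archive = [0.0]                                  # everything kept so far
for _ in range(1000):
    child = mutate(random.choice(archive))       # no fixed objective:
    if interestingness(child, archive) > 0.5:    # keep whatever is novel
        archive.append(child)                    # (threshold is assumed)
print(f"kept {len(archive)} artifacts spanning "
      f"{min(archive):.1f} to {max(archive):.1f}")
```

The design point the sketch tries to capture: there is no fitness function to optimise, only a growing archive of things judged novel, which is what keeps the process open-ended.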
  • Neel Nanda - Mechanistic Interpretability (Sparse Autoencoders)
    Neel Nanda, a senior research scientist at Google DeepMind, leads their mechanistic interpretability team. In this extensive interview, he discusses his work on understanding how neural networks function internally. At just 25 years old, Nanda has quickly become a prominent voice in AI research after completing his pure mathematics degree at Cambridge in 2020.
    Nanda reckons that machine learning is unique because we create neural networks that can perform impressive tasks (like complex reasoning and software engineering) without understanding how they work internally. He compares this to having computer programs that can do things no human programmer knows how to write. His work focuses on "mechanistic interpretability": attempting to uncover and understand the internal structures and algorithms that emerge within these networks.
    SPONSOR MESSAGES:
    ***
    CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. https://centml.ai/pricing/
    Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on ARC and AGI; they just acquired MindsAI, the current winners of the ARC challenge. Are you interested in working on ARC, or getting involved in their events? Go to https://tufalabs.ai/
    ***
    SHOWNOTES, TRANSCRIPT, ALL REFERENCES (DON'T MISS!): https://www.dropbox.com/scl/fi/36dvtfl3v3p56hbi30im7/NeelShow.pdf?rlkey=pq8t7lyv2z60knlifyy17jdtx&st=kiutudhc&dl=0
    We riff on:
    * How neural networks develop meaningful internal representations beyond simple pattern matching
    * The effectiveness of chain-of-thought prompting and why it improves model performance
    * The importance of hands-on coding over extensive paper reading for new researchers
    * His journey from Cambridge to working with Chris Olah at Anthropic and eventually Google DeepMind
    * The role of mechanistic interpretability in AI safety
    NEEL NANDA:
    https://www.neelnanda.io/
    https://scholar.google.com/citations?user=GLnX3MkAAAAJ&hl=en
    https://x.com/NeelNanda5
    Interviewer: Tim Scarfe
    TOC:
    1. Part 1: Introduction
    [00:00:00] 1.1 Introduction and Core Concepts Overview
    2. Part 2: Outside Interview
    [00:06:45] 2.1 Mechanistic Interpretability Foundations
    3. Part 3: Main Interview
    [00:32:52] 3.1 Mechanistic Interpretability
    4. Neural Architecture and Circuits
    [01:00:31] 4.1 Biological Evolution Parallels
    [01:04:03] 4.2 Universal Circuit Patterns and Induction Heads
    [01:11:07] 4.3 Entity Detection and Knowledge Boundaries
    [01:14:26] 4.4 Mechanistic Interpretability and Activation Patching
    5. Model Behavior Analysis
    [01:30:00] 5.1 Golden Gate Claude Experiment and Feature Amplification
    [01:33:27] 5.2 Model Personas and RLHF Behavior Modification
    [01:36:28] 5.3 Steering Vectors and Linear Representations
    [01:40:00] 5.4 Hallucinations and Model Uncertainty
    6. Sparse Autoencoder Architecture
    [01:44:54] 6.1 Architecture and Mathematical Foundations (a minimal SAE sketch follows this episode entry)
    [02:22:03] 6.2 Core Challenges and Solutions
    [02:32:04] 6.3 Advanced Activation Functions and Top-k Implementations
    [02:34:41] 6.4 Research Applications in Transformer Circuit Analysis
    7. Feature Learning and Scaling
    [02:48:02] 7.1 Autoencoder Feature Learning and Width Parameters
    [03:02:46] 7.2 Scaling Laws and Training Stability
    [03:11:00] 7.3 Feature Identification and Bias Correction
    [03:19:52] 7.4 Training Dynamics Analysis Methods
    8. Engineering Implementation
    [03:23:48] 8.1 Scale and Infrastructure Requirements
    [03:25:20] 8.2 Computational Requirements and Storage
    [03:35:22] 8.3 Chain-of-Thought Reasoning Implementation
    [03:37:15] 8.4 Latent Structure Inference in Language Models
    --------  
    3:42:36
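
A minimal sketch of the sparse autoencoder setup from section 6 of the TOC: reconstruct activation vectors from an overcomplete, non-negative feature dictionary under an L1 sparsity penalty. The sizes, hyperparameters, and synthetic training data are illustrative assumptions, not DeepMind's actual configuration.

```python
import numpy as np

class SparseAutoencoder:
    """Tiny NumPy SAE: an overcomplete ReLU dictionary trained to
    reconstruct activations with mostly-zero feature coefficients."""

    def __init__(self, d_model=16, d_hidden=64, l1_coeff=1e-3, lr=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        self.W_enc = rng.normal(0.0, 0.1, (d_model, d_hidden))
        self.W_dec = rng.normal(0.0, 0.1, (d_hidden, d_model))
        self.b_enc = np.zeros(d_hidden)
        self.b_dec = np.zeros(d_model)
        self.l1, self.lr = l1_coeff, lr

    def encode(self, x):
        # ReLU keeps features non-negative; the L1 penalty drives most
        # of them to exactly zero, which is the 'sparse' part.
        return np.maximum(x @ self.W_enc + self.b_enc, 0.0)

    def decode(self, f):
        return f @ self.W_dec + self.b_dec

    def train_step(self, x):
        B, d = x.shape
        f = self.encode(x)
        x_hat = self.decode(f)
        err = x_hat - x
        loss = (err ** 2).mean() + self.l1 * np.abs(f).sum(axis=1).mean()
        # Hand-derived gradients for this tiny model.
        g_xhat = 2.0 * err / (B * d)
        g_f = (g_xhat @ self.W_dec.T + (self.l1 / B) * np.sign(f)) * (f > 0)
        self.W_dec -= self.lr * f.T @ g_xhat
        self.b_dec -= self.lr * g_xhat.sum(axis=0)
        self.W_enc -= self.lr * x.T @ g_f
        self.b_enc -= self.lr * g_f.sum(axis=0)
        return loss

# Toy usage: synthetic 'activations' built from 8 hidden directions, each
# active ~20% of the time, standing in for a transformer's residual stream.
rng = np.random.default_rng(1)
directions = rng.normal(size=(8, 16))
sae = SparseAutoencoder()
for step in range(2000):
    codes = rng.exponential(1.0, (32, 8)) * (rng.random((32, 8)) < 0.2)
    loss = sae.train_step(codes @ directions)
print(f"final loss: {loss:.4f}")
```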


About Machine Learning Street Talk (MLST)

Welcome! We engage in fascinating discussions with pre-eminent figures in the AI field. Our flagship show covers current affairs in AI, cognitive science, neuroscience and philosophy of mind with in-depth analysis. Our approach is unrivalled in terms of scope and rigour – we believe in intellectual diversity in AI, and we touch on all of the main ideas in the field with the hype surgically removed. MLST is run by Tim Scarfe, Ph.D. (https://www.linkedin.com/in/ecsquizor/) and features regular appearances from Keith Duggar, who holds a Doctor of Philosophy from MIT (https://www.linkedin.com/in/dr-keith-duggar/).