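AI is moving from perception toward cognition and embodiment, and multimodal world models and safety are becoming key to achieving AGI.
Week 10
In progress...
Week 11
In progress...
Week 12
**Multimodal World Models and Embodied Intelligence**
Week 13
**Multimodal World Models and Embodied Intelligence** **Analysis:** This week's frontier AI content shows several interrelated key trends that together point to one main direction for future AI development: building systems that can understand the complex world and...
Week 14
Embodied Intelligence and Multimodal Fusion
Future Directions
💫 This timeline traces how AI development has evolved. Each node marks one week's core direction; following the line shows how AI has moved from earlier directions to the present one, and where it may head next.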

2026/4/9
Lars Brownworth is a historian, teacher, podcaster, and author specializing in Viking history, medieval Europe, and the Byzantine Empire. https://lexfridman.com/sponsors/ep495-sc Transcript: https://lexfridman.com/lars-brownworth-transcript CONTACT LEX: Feedback – give feedback to Lex: https://lexfridman.com/survey AMA – submit questions, videos or call-in: https://lexfridman.com/ama Hiring – join our team: https://lexfridman.com/hiring Other – other ways to get in touch: https://lexfridman.com/contact EPISODE LINKS: https://larsbrownworth.com/ https://www.amazon.com/Sea-Wolves-History-Vikings/dp/1909979120 https://amzn.to/4sHY0xw https://12byzantinerulers.com/ https://apple.co/4sgSxNi SPONSORS: Larridin: Measure AI adoption in your business. https://larridin.com BetterHelp: Online therapy and counseling. https://betterhelp.com/lex LMNT: Zero-sugar electrolyte drink mix. https://drinkLMNT.com/lex Fin: AI agent for customer service. https://fin.ai/lex Shopify: Sell stuff online. https://shopify.com/lex Perplexity: AI-powered answer engine. https://perplexity.ai/ OUTLINE: PODCAST LINKS: https://lexfridman.com/podcast https://apple.co/2lwqZIr https://spoti.fi/2nEwCF8 https://lexfridman.com/feed/podcast/ https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4 https://www.youtube.com/lexclips

2026/3/23
Jensen Huang is the co-founder and CEO of NVIDIA, the world’s most valuable company and the engine powering the AI computing revolution. https://lexfridman.com/sponsors/ep494-sc Transcript: https://lexfridman.com/jensen-huang-transcript EPISODE LINKS: https://nvidia.com https://x.com/nvidia https://x.com/NVIDIAAI https://youtube.com/@nvidia https://www.instagram.com/nvidia/ https://www.linkedin.com/company/nvidia/ https://www.facebook.com/NVIDIA/ https://github.com/NVIDIA https://developer.nvidia.com/nemotron SPONSORS: Perplexity: AI-powered answer engine. https://perplexity.ai/ Shopify: Sell stuff online. https://shopify.com/lex LMNT: Zero-sugar electrolyte drink mix. https://drinkLMNT.com/lex Fin: AI agent for customer service. https://fin.ai/lex Quo: Phone system (calls, texts, contacts) for businesses. https://quo.com/lex

2026/3/11
Jeff Kaplan is a legendary Blizzard game designer of World of Warcraft and Overwatch, now preparing to launch a new game, The Legend of California, from his new studio Kintsugiyama – available to wishlist on Steam today, with alpha later in March. https://lexfridman.com/sponsors/ep493-sc EPISODE LINKS: https://store.steampowered.com/app/2550530/The_Legend_of_California https://www.kintsugiyama.com/ SPONSORS: Fin: AI agent for customer service. https://fin.ai/lex Blitzy: AI agent for large enterprise codebases. https://blitzy.com/lex BetterHelp: Online therapy and counseling. https://betterhelp.com/lex Shopify: Sell stuff online. https://shopify.com/lex CodeRabbit: AI-powered code reviews. https://coderabbit.ai/lex Perplexity: AI-powered answer engine. https://perplexity.ai/

2026/3/1
Rick Beato is a music educator, interviewer, producer, songwriter, and a true multi-instrument musician, playing guitar, bass, cello & piano. His incredible YouTube channel celebrates great musicians & musical ideas, and helps millions of people fall in love with great music all over again. https://lexfridman.com/sponsors/ep492-sc Transcript: https://lexfridman.com/rick-beato-transcript EPISODE LINKS: https://youtube.com/RickBeato https://x.com/rickbeato https://instagram.com/rickbeato1 https://rickbeato.com https://beatoeartraining.com https://beatobook.com SPONSORS: UPLIFT Desk: Standing desks and office ergonomics. https://upliftdesk.com/lex BetterHelp: Online therapy and counseling. https://betterhelp.com/lex LMNT: Zero-sugar electrolyte drink mix. https://drinkLMNT.com/lex Fin: AI agent for customer service. https://fin.ai/lex Shopify: Sell stuff online. https://shopify.com/lex Perplexity: AI-powered answer engine. https://perplexity.ai/

2026/2/12
Peter Steinberger is the creator of OpenClaw, an open-source AI agent framework that’s the fastest-growing project in GitHub history. https://lexfridman.com/sponsors/ep491-sc Transcript: https://lexfridman.com/peter-steinberger-transcript EPISODE LINKS: https://x.com/steipete https://github.com/steipete https://steipete.com https://www.linkedin.com/in/steipete https://openclaw.ai https://github.com/openclaw/openclaw https://discord.gg/openclaw SPONSORS: Perplexity: AI-powered answer engine. https://perplexity.ai/ Quo: Phone system (calls, texts, contacts) for businesses. https://quo.com/lex CodeRabbit: AI-powered code reviews. https://coderabbit.ai/lex Fin: AI agent for customer service. https://fin.ai/lex Blitzy: AI agent for large enterprise codebases. https://blitzy.com/lex Shopify: Sell stuff online. https://shopify.com/lex LMNT: Zero-sugar electrolyte drink mix. https://drinkLMNT.com/lex OUTLINE: (00:00) – Introduction (03:51) – Sponsors, Comments, and Reflections (15:29) – OpenClaw origin story (18:48) – Mind-blowing moment (28:15) – Why OpenClaw went viral (32:12) – Self-modifying AI agent (36:57) – Name-change drama (54:07) – Moltbook saga (1:02:26) – OpenClaw security concerns (1:11:07) – How to code with AI agents (1:42:02) – Programming setup (1:48:45) – GPT Codex 5.3 vs Claude Opus 4.6 (1:57:52) – Best AI agent for programming (2:19:52) – Life story and career advice (2:23:49) – Money and happiness (2:27:41) – Acquisition offers from OpenAI and Meta (2:44:51) – How OpenClaw works (2:56:09) – AI slop (3:02:13) – AI agents will replace 80% of apps (3:10:50) – Will AI replace programmers? (3:22:50) – Future of OpenClaw community
📖 Retrieval-augmented generation: giving AI access to external knowledge bases
Mirage, the maker of video editing app Captions, has raised $75 million in growth financing from General Catalyst's Customer Value Fund (CVF)....
📰 Industry news: the latest AI development trends
Agile Robots will incorporate Google DeepMind's robotics foundation models into its bots while collecting data for the AI research lab....
📰 Industry news: the latest AI development trends
London's Air Street Capital has raised a large Fund III with eyes locked on backing early-stage European and North American AI companies....
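Three trends stood out in AI this week: accelerating multimodal integration, growing attention to AI safety, and a thriving agent ecosystem. Multimodal technology was the focus. Mirage, maker of the AI video editing app Captions, raised $75 million in growth financing, and OpenAI highlighted the safety work behind Sora 2 and the Sora app. Academia kept pace, with several papers exploring deeper multimodal integration. As AI capabilities grow, safety concerns are becoming more prominent: OpenAI uses chain-of-thought monitoring to detect and mitigate alignment risks in its internal coding agents. Agents were also widely discussed this week and are moving into practical implementations. Trending GitHub projects such as MoneyPrinterV2, TradingAgents, and MiroFish are all built around agents.
Source: TechCrunch AI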
Mirage, the maker of video editing app Captions, has raised $75 million in growth financing from General Catalyst's Customer Value Fund (CVF)....
Source: TechCrunch AI
Agile Robots will incorporate Google DeepMind's robotics foundation models into its bots while collecting data for the AI research lab....
Source: TechCrunch AI
London's Air Street Capital has raised a large Fund III with eyes locked on backing early-stage European and North American AI companies....
Source: OpenAI Blog
To address the novel safety challenges posed by a state-of-the-art video model as well as a new social creation platform, we've built Sora 2 and the Sora app with safety at the foundation. Our approac...
Source: OpenAI Blog
How OpenAI uses chain-of-thought monitoring to study misalignment in internal coding agents—analyzing real-world deployments to detect risks and strengthen AI safety safeguards....
Source: OpenAI Blog
Accelerates Codex growth to power the next generation of Python developer tools...
TradingAgents: Multi-Agents LLM Financial Trading Framework
A Simple and Universal Swarm Intelligence Engine, Predicting Anything.
Unsloth Studio is a web UI for training and running open models like Qwen, DeepSeek, gpt-oss and Gemma locally.
An open-source SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of tasks that could take minutes to hours.
Authors: Umair Nawaz, Ahmed Heakl, Ufaq Khan, Abdelrahman Shaker, Salman Khan, Fahad Shahbaz Khan
Diffusion Transformers (DiTs) power high-fidelity video world models but remain computationally expensive due to sequential denoising and costly spatio-temporal attention. Training-free feature cachin...
Authors: Shivam Duggal, Xingjian Bai, Zongze Wu, Richard Zhang, Eli Shechtman, Antonio Torralba, Phillip Isola, William T. Freeman
Latent diffusion models (LDMs) enable high-fidelity synthesis by operating in learned latent spaces. However, training state-of-the-art LDMs requires complex staging: a tokenizer must be trained first...
Authors: Ziyi Wang, Xinshun Wang, Shuang Chen, Yang Cong, Mengyuan Liu
We present UniMotion, to our knowledge the first unified framework for simultaneous understanding and generation of human motion, natural language, and RGB images within a single architecture. Existin...
Authors: Haichao Zhang, Yijiang Li, Shwai He, Tushar Nagarajan, Mingfei Chen, Jianglin Lu, Ang Li, Yun Fu
Recent progress in latent world models (e.g., V-JEPA2) has shown promising capability in forecasting future world states from video observations. Nevertheless, dense prediction from a short observatio...
Authors: Haoyu Zhen, Xiaolong Li, Yilin Zhao, Han Zhang, Sifei Liu, Kaichun Mo, Chuang Gan, Subhashree Radhakrishnan
Large Language Models (LLMs) and Vision Language Models (VLMs) have shown impressive reasoning abilities, yet they struggle with spatial understanding and layout consistency when performing fine-grain...
LeCun emphasizes the importance of world models and self-supervised learning for achieving AGI, moving beyond pure scaling of language models.
Karpathy discusses optimization techniques for LLM training, focusing on efficiency and scalability improvements.
Growing consensus in the AI community about the importance of safety and alignment as core research areas.
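Divergent thinking for researchers already at the frontier, covering research intersections, contrarian views, technical bottlenecks, and community trends.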
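Projects such as WorldCache combine video understanding with world models, aiming to give embodied agents real-time environment prediction.
OpenAI's chain-of-thought monitoring brings safety monitoring into the reasoning process of its internal coding agents, enabling finer-grained misalignment detection.
The partnership between Google DeepMind and Agile Robots shows how data collection and model integration can drive the evolution of embodied agents.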
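The industry is currently focused on large-model capability, but in specific domains such as robot control and real-time inference, the efficiency of small models may prove decisive.
OpenAI built safety mechanisms into the design of Sora rather than adding them after the fact; this kind of constraint-driven design could give rise to new architectural paradigms.
Google DeepMind's partnership with a robotics company underscores the value of data collection, which may become a stronger competitive moat than model innovation.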
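Video world models still struggle to deliver low-latency, real-time inference, which limits their use in robot control.
Chain-of-thought monitoring is effective, but its decision process remains a black box that is hard for humans to understand and intervene in.
Current robot models perform well in the specific environments they were trained in, but their ability to generalize across environments remains limited.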
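Community discussion is gradually shifting from "how smart is the model" to "what can agents do", reflecting a growing focus on practical use.
Prominent figures such as Yann LeCun and Andrej Karpathy frequently discuss AGI safety and alignment on X, suggesting that the topic has become an industry consensus.
Open-source projects such as Deer-Flow and MoneyPrinterV2 have drawn considerable attention, suggesting strong demand for agent frameworks in the developer community.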