
This episode dives into OpenAI's promising new model, Strawberry, which could revolutionize interactions in ChatGPT. We explore the financial envy Nvidia employees inspire in their Google and Meta counterparts due to lucrative stock options. Google’s new Pipe SQL syntax aims to simplify data querying, while concerns about research accessibility are raised. Finally, we discuss BaichuanSEED and Dolphin models, which highlight advancements in extensive data collection and energy-efficient processing, paving the way for enhanced AI capabilities.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:40 OpenAI Races to Launch Strawberry
03:07 Google, Meta workers envy Nvidia staffers’ fat paychecks: ‘Bought a 100K car … all cash’
05:01 Google's New Pipe SQL Syntax
06:12 Fake sponsor
07:47 BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline
09:20 Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models
11:09 Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
12:50 Outro
Aug 29, 2024
14 min

OpenAI's 'Strawberry' AI tackles complex math and programming with enhanced reasoning, while Cerebras claims to have launched the fastest AI inference, enabling real-time applications at competitive prices. The GenCA model revolutionizes avatar creation with photo-realistic, controllable 3D avatars, and the "Build-A-Scene" paper introduces interactive 3D layout control for text-to-image generation, enhancing creative fields with dynamic object manipulation.
Contact: [email protected]
Timestamps:
00:34 Introduction
02:02 OpenAI Shows ‘Strawberry’ AI to the Feds and Uses It to Develop ‘Orion’
03:23 Cerebras Launches the World’s Fastest AI Inference
05:07 Diffusion Models Are Real-Time Game Engines
06:15 Fake sponsor
08:06 The Mamba in the Llama: Distilling and Accelerating Hybrid Models
09:42 GenCA: A Text-conditioned Generative Model for Realistic and Drivable Codec Avatars
11:16 Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image Generation
13:04 Outro
Aug 28, 2024
14 min

Grok-2's advancements in speed and accuracy position it as a leading AI model, particularly in math and coding. OpenAI's backing of California's AI bill highlights the critical need for transparency in synthetic content, especially during an election year. The episode features groundbreaking research on the SwiftBrush diffusion model and K-Sort Arena for generative model evaluation. Additionally, the LlamaDuo pipeline offers a practical solution for migrating from cloud-based LLMs to local models, tackling privacy and operational challenges.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:55 grok-2 is Faster and Better
03:32 OpenAI supports California AI bill requiring 'watermarking' of synthetic content
04:53 Fake sponsor
06:45 SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher
08:10 SWE-bench-java: A GitHub Issue Resolving Benchmark for Java
09:40 K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences
11:24 LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs
13:26 Outro
Aug 27, 2024
14 min

This episode dives into Salesforce's innovative AI sales agents that automate tasks but risk losing human touch, NVIDIA's compact yet powerful language model that promises efficiency, groundbreaking research showing how optimized computation can enhance model performance, and insights into compound inference systems revealing the delicate balance in maximizing language model effectiveness.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:49 Salesforce's New Sales AI Agents
03:09 Lightweight Champ: NVIDIA Releases Small Language Model With State-of-the-Art Accuracy
04:52 avante.nvim
05:56 Fake sponsor
07:45 Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
09:22 Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
11:15 Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems
13:10 Outro
Aug 26, 2024
14 min

This episode dives deep into the future of coding, challenging the belief that AI will render developers obsolete. It highlights Meta's stock surge, attributing it to Zuckerberg's compelling AI narrative that captivates investors. The discussion also covers groundbreaking research like Transfusion, which merges text and image processing, and the innovative approach of automated design for intelligent agents. Lastly, it emphasizes the xGen-MM framework's commitment to safety in AI, showcasing the critical need to mitigate harmful behaviors in advanced models.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:28 Amazon cloud chief: Devs may stop coding when AI takes over
02:53 Meta Shares Are Flying High as Zuckerberg Sells His AI Vision
04:34 I've Built My First Successful Side Project, and I Hate It
05:41 Fake sponsor
07:35 Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
09:16 Automated Design of Agentic Systems
10:56 xGen-MM (BLIP-3): A Family of Open Large Multimodal Models
12:44 Outro
Aug 23, 2024
13 min

OpenAI's SearchGPT is launching with limited access for only 10,000 users, raising questions about trust and the potential risks of generative search products. A comprehensive analysis challenges the belief that Vision Transformers are inefficient, suggesting they can handle higher resolutions effectively. The introduction of Automated Design of Agentic Systems (ADAS) could revolutionize how intelligent agents are created, outperforming traditional hand-designed models. The xGen-MM framework aims to enhance multimodal AI capabilities while prioritizing safety measures to mitigate harmful behaviors.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:43 OpenAI is fresh out of SearchGPT
02:50 From ChatGPT to Gemini: how AI is rewriting the internet
04:32 On the speed of ViTs and CNNs
05:49 Fake sponsor
07:49 JPEG-LM: LLMs as Image Generators with Canonical Codec Representations
09:34 Automated Design of Agentic Systems
11:12 xGen-MM (BLIP-3): A Family of Open Large Multimodal Models
13:01 Outro
Aug 19, 2024
14 min

This episode dives into the Grok-2 Beta Release, highlighting its advanced reasoning capabilities and competitive edge. We explore Apple’s ambitious plans for a $1,000 tabletop robotic home device, set to transform smart home technology. The introduction of ChemVLM marks a breakthrough in chemistry research, effectively integrating chemical images and text. Lastly, InfinityMATH presents a scalable dataset that enhances language models' mathematical reasoning, showcasing impressive performance improvements.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:37 Grok-2 Beta Release
02:58 Apple Aiming to Launch Tabletop Robotic Home Device as Soon as 2026 With Pricing Around $1,000
04:29 Gemlite: Towards Building Custom Low-Bit Fused CUDA Kernels
05:34 Fake sponsor
07:16 Seeing and Understanding: Bridging Vision with Chemical Knowledge Via ChemVLM
08:55 Generative Photomontage
10:26 InfinityMATH: A Scalable Instruction Tuning Dataset in Programmatic Mathematical Reasoning
12:22 Outro
Aug 15, 2024
13 min

This episode dives into Gemini Live's interactive AI capabilities, OpenAI's improved coding benchmark for reliable evaluations, LongWriter's breakthrough in generating ultra-long outputs, and SlotLifter's advancements in 3D object-centric learning. Each topic highlights significant innovations and their implications in the AI landscape.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:48 Gemini makes your mobile device a powerful AI assistant
03:08 New OpenAI Coding Benchmark
04:52 Things I learned from teaching
05:59 Fake sponsor
07:38 Imagen 3
09:05 LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
10:46 SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields
12:22 Outro
Aug 14, 2024
13 min

Google Meet's new AI note-taking feature could change meeting dynamics, while Trump’s claims about Kamala Harris reveal the political implications of AI. The exploration of AI's role in scientific research raises ethical concerns, and cutting-edge papers on ControlNeXt, rStar, and FruitNeRF showcase advancements in image generation, reasoning capabilities, and fruit counting accuracy.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:43 Google Meet call will soon be able to take notes for you
02:56 Trump falsely claims Kamala Harris ‘AI’d’ her rally crowd size
04:23 The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
05:35 Fake sponsor
07:15 ControlNeXt: Powerful and Efficient Control for Image and Video Generation
08:47 Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers
10:41 FruitNeRF: A Unified Neural Radiance Field based Fruit Counting Framework
12:41 Outro
Aug 13, 2024
13 min

OpenAI's mysterious "Strawberry" AI model is causing a buzz in the tech world, with rumors of advanced reasoning capabilities.
Meta is trying to improve its AI assistants by enlisting the help of celebrities like Awkwafina to give them a more relatable and entertaining vibe.
Google DeepMind's research on building a robot capable of playing table tennis at a human level is a remarkable exploration of robotics and sports.
UC Berkeley and Google DeepMind's paper on optimally scaling LLM test-time compute and Harbin Institute of Technology's research on building a general-purpose AI agent capable of completing long-horizon tasks are both groundbreaking developments in the field of AI.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:35 Sam Altman teases project Strawberry
03:06 Meta courts celebs like Awkwafina to voice AI assistants ahead of Meta Connect
04:58 Achieving Human Level Competitive Robot Table Tennis
06:11 Fake sponsor
08:15 Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
09:55 Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks
11:41 UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling
13:30 Outro
Aug 12, 2024
15 min
