Latent Space: The AI Engineer Podcast

swyx + Alessio

Technology, Business, Entrepreneurship

Latest episode

Available episodes

5 of 126

⚡️GPT 4.1: The New OpenAI Workhorse
We’ll keep this brief because we’re on a tight turnaround: GPT 4.1, previously known as the Quasar and Optimus models, is now live as the natural update for 4o/4o-mini (and the research preview of GPT 4.5). Though it is a general-purpose model family, the headline features are:

* Coding abilities (o1-level SWEBench and SWELancer, but only OK on Aider)
* Instruction Following (with a very notable prompting guide)
* Long Context up to 1m tokens (with new MRCR and Graphwalk benchmarks)
* Vision (simply o1 level)
* Cheaper Pricing (cheaper than 4o, greatly improved prompt caching savings)

We caught up with returning guest Michelle Pokrass and Josh McGrath to get more detail on each!

Chapters
* 00:00:00 Introduction and Guest Welcome
* 00:00:57 GPT 4.1 Launch Overview
* 00:01:54 Developer Feedback and Model Names
* 00:02:53 Model Naming and Starry Themes
* 00:03:49 Confusion Over GPT 4.1 vs 4.5
* 00:04:47 Distillation and Model Improvements
* 00:05:45 Omnimodel Architecture and Future Plans
* 00:06:43 Core Capabilities of GPT 4.1
* 00:07:40 Training Techniques and Long Context
* 00:08:37 Challenges in Long Context Reasoning
* 00:09:34 Context Utilization in Models
* 00:10:31 Graph Walks and Model Evaluation
* 00:11:31 Real Life Applications of Graph Tasks
* 00:12:30 Multi-Hop Reasoning Benchmarks
* 00:13:30 Agentic Workflows and Backtracking
* 00:14:28 Graph Traversals for Agent Planning
* 00:15:24 Context Usage in API and Memory Systems
* 00:16:21 Model Performance in Long Context Tasks
* 00:17:17 Instruction Following and Real World Data
* 00:18:12 Challenges in Grading Instructions
* 00:19:09 Instruction Following Techniques
* 00:20:09 Prompting Techniques and Model Responses
* 00:21:05 Agentic Workflows and Model Persistence
* 00:22:01 Balancing Persistence and User Control
* 00:22:56 Evaluations on Model Edits and Persistence
* 00:23:55 XML vs JSON in Prompting
* 00:24:50 Instruction Placement in Context
* 00:25:49 Optimizing for Prompt Caching
* 00:26:49 Chain of Thought and Reasoning Models
* 00:27:46 Choosing the Right Model for Your Task
* 00:28:46 Coding Capabilities of GPT 4.1
* 00:29:41 Model Performance in Coding Tasks
* 00:30:39 Understanding Coding Model Differences
* 00:31:36 Using Smaller Models for Coding
* 00:32:33 Future of Coding in OpenAI
* 00:33:28 Internal Use and Success Stories
* 00:34:26 Vision and Multi-Modal Capabilities
* 00:35:25 Screen vs Embodied Vision
* 00:36:22 Vision Benchmarks and Model Improvements
* 00:37:19 Model Deprecation and GPU Usage
* 00:38:13 Fine-Tuning and Preference Steering
* 00:39:12 Upcoming Reasoning Models
* 00:40:10 Creative Writing and Model Humor
* 00:41:07 Feedback and Developer Community
* 00:42:03 Pricing and Blended Model Costs
* 00:44:02 Conclusion and Wrap-Up
--------
41:52
SF Compute: Commoditizing Compute
Evan Conrad, co-founder of SF Compute, joined us to talk about how they started as an AI lab that avoided bankruptcy by selling GPU clusters, why CoreWeave financials look like a real estate business, and how GPUs are turning into a commodities market.

Chapters:
* 00:00:05 - Introductions
* 00:00:12 - Introduction of guest Evan Conrad from SF Compute
* 00:00:12 - CoreWeave Business Model Discussion
* 00:05:37 - CoreWeave as a Real Estate Business
* 00:08:59 - Interest Rate Risk and GPU Market Strategy Framework
* 00:16:33 - Why Together and DigitalOcean will lose money on their clusters
* 00:20:37 - SF Compute's AI Lab Origins
* 00:25:49 - Utilization Rates and Benefits of SF Compute Market Model
* 00:30:00 - H100 GPU Glut, Supply Chain Issues, and Future Demand Forecast
* 00:34:00 - P2P GPU networks
* 00:36:50 - Customer stories
* 00:38:23 - VC-Provided GPU Clusters and Credit Risk Arbitrage
* 00:41:58 - Market Pricing Dynamics and Preemptible GPU Pricing Model
* 00:48:00 - Future Plans for Financialization?
* 00:52:59 - Cluster auditing and quality control
* 00:58:00 - Futures Contracts for GPUs
* 01:01:20 - Branding and Aesthetic Choices Behind SF Compute
* 01:06:30 - Lessons from Previous Startups
* 01:09:07 - Hiring at SF Compute

Chapters
* 00:00:00 Introduction and Background
* 00:00:58 Analysis of GPU Business Models
* 00:01:53 Challenges with GPU Pricing
* 00:02:48 Revenue and Scaling with GPUs
* 00:03:46 Customer Sensitivity to GPU Pricing
* 00:04:44 CoreWeave's Business Strategy
* 00:05:41 CoreWeave's Market Perception
* 00:06:40 Hyperscalers and GPU Market Dynamics
* 00:07:37 Financial Strategies for GPU Sales
* 00:08:35 Interest Rates and GPU Market Risks
* 00:09:30 Optimal GPU Contract Strategies
* 00:10:27 Risks in GPU Market Contracts
* 00:11:25 Price Sensitivity and Market Competition
* 00:12:21 Market Dynamics and GPU Contracts
* 00:13:18 Hyperscalers and GPU Market Strategies
* 00:14:15 Nvidia and Market Competition
* 00:15:12 Microsoft's Role in GPU Market
* 00:16:10 Challenges in GPU Market Dynamics
* 00:17:07 Economic Realities of the GPU Market
* 00:18:03 Real Estate Model for GPU Clouds
* 00:18:59 Price Sensitivity and Chip Design
* 00:19:55 SF Compute's Beginnings and Challenges
* 00:20:54 Navigating the GPU Market
* 00:21:54 Pivoting to a GPU Cloud Provider
* 00:22:53 Building a GPU Market
* 00:23:52 SF Compute as a GPU Marketplace
* 00:24:49 Market Liquidity and GPU Pricing
* 00:25:47 Utilization Rates in GPU Markets
* 00:26:44 Brokerage and Market Flexibility
* 00:27:42 H100 Glut and Market Cycles
* 00:28:40 Supply Chain Challenges and GPU Glut
* 00:29:35 Future Predictions for the GPU Market
* 00:30:33 Speculations on Test Time Inference
* 00:31:29 Market Demand and Test Time Inference
* 00:32:26 Open Source vs. Closed AI Demand
* 00:33:24 Future of Inference Demand
* 00:34:24 Peer-to-Peer GPU Markets
* 00:35:17 Decentralized GPU Market Skepticism
* 00:36:15 Redesigning Architectures for New Markets
* 00:37:14 Supporting Grad Students and Startups
* 00:38:11 Successful Startups Using SF Compute
* 00:39:11 VCs and GPU Infrastructure
* 00:40:09 VCs as GPU Credit Transformators
* 00:41:06 Market Timing and GPU Infrastructure
* 00:42:02 Understanding GPU Pricing Dynamics
* 00:43:01 Market Pricing and Preemptible Compute
* 00:43:55 Price Volatility and Market Optimization
* 00:44:52 Customizing Compute Contracts
* 00:45:50 Creating Flexible Compute Guarantees
* 00:46:45 Financialization of GPU Markets
* 00:47:44 Building a Spot Market for GPUs
* 00:48:40 Auditing and Standardizing Clusters
* 00:49:40 Ensuring Cluster Reliability
* 00:50:36 Active Monitoring and Refunds
* 00:51:33 Automating Customer Refunds
* 00:52:33 Challenges in Cluster Maintenance
* 00:53:29 Remote Cluster Management
* 00:54:29 Standardizing Compute Contracts
* 00:55:28 Unified Infrastructure for Clusters
* 00:56:24 Creating a Commodity Market for GPUs
* 00:57:22 Futures Market and Risk Management
* 00:58:18 Reducing Risk with GPU Futures
* 00:59:14 Stabilizing the GPU Market
* 01:00:10 SF Compute's Anti-Hype Approach
* 01:01:07 Calm Branding and Expectations
* 01:02:07 Promoting San Francisco's Beauty
* 01:03:03 Design Philosophy at SF Compute
* 01:04:02 Artistic Influence on Branding
* 01:05:00 Past Projects and Burnout
* 01:05:59 Challenges in Building an Email Client
* 01:06:57 Persistence and Iteration in Startups
* 01:07:57 Email Market Challenges
* 01:08:53 SF Compute Job Opportunities
* 01:09:53 Hiring for Systems Engineering
* 01:10:50 Financial Systems Engineering Role
* 01:11:50 Conclusion and Farewell
--------
1:12:01
The Creators of Model Context Protocol
Today’s guests, David Soria Parra and Justin Spahr-Summers, are the creators of Anthropic’s Model Context Protocol (MCP). When we first wrote Why MCP Won, we had no idea how quickly it was about to win. In the past 4 weeks, OpenAI and now Google have announced MCP support, effectively confirming our prediction that MCP was the presumptive winner of the agent standard wars. MCP has now overtaken OpenAPI, the incumbent option and most direct alternative, in GitHub stars (3 months ahead of conservative trendline).

For protocol and history nerds, we also asked David and Justin to tell the origin story of MCP, which we leave to the reader to enjoy (you can also skim the transcripts, or the changelogs of a certain favored IDE). It’s incredible the impact that individual engineers solving their own problems can have on an entire industry.

Timestamps
* 00:00 Introduction and Guest Welcome
* 00:37 What is MCP?
* 02:00 The Origin Story of MCP
* 05:18 Development Challenges and Solutions
* 08:06 Technical Details and Inspirations
* 29:45 MCP vs OpenAPI
* 32:48 Building MCP Servers
* 40:39 Exploring Model Independence in LLMs
* 41:36 Building Richer Systems with MCP
* 43:13 Understanding Agents in MCP
* 45:45 Nesting and Tool Confusion in MCP
* 49:11 Client Control and Tool Invocation
* 52:08 Authorization and Trust in MCP Servers
* 01:01:34 Future Roadmap and Stateless Servers
* 01:10:07 Open Source Governance and Community Involvement
* 01:18:12 Wishlist and Closing Remarks
--------
1:19:56
Unsupervised Learning x Latent Space Crossover Special
Unsupervised Learning is a podcast that interviews the sharpest minds in AI about what’s real today, what will be real in the future and what it means for businesses and the world, helping builders, researchers and founders deconstruct and understand the biggest breakthroughs. Top guests: Noam Shazeer, Bob McGrew, Noam Brown, Dylan Patel, Percy Liang, David Luan

https://www.latent.space/p/unsupervised-learning

Timestamps
* 00:00 Introduction and Excitement for Collaboration
* 00:27 Reflecting on Surprises in AI Over the Past Year
* 01:44 Open Source Models and Their Adoption
* 06:01 The Rise of GPT Wrappers
* 06:55 AI Builders and Low-Code Platforms
* 09:35 Overhyped and Underhyped AI Trends
* 22:17 Product Market Fit in AI
* 28:23 Google's Current Momentum
* 28:33 Customer Support and AI
* 29:54 AI's Impact on Cost and Growth
* 31:05 Voice AI and Scheduling
* 32:59 Emerging AI Applications
* 34:12 Education and AI
* 36:34 Defensibility in AI Applications
* 40:10 Infrastructure and AI
* 47:08 Challenges and Future of AI
* 52:15 Quick Fire Round and Closing Remarks

Chapters
* 00:00:00 Introduction and Collab Excitement
* 00:00:58 Open Source and Model Adoption
* 00:01:58 Enterprise Use of Open Source Models
* 00:02:57 The Competitive Edge of Closed Source Models
* 00:03:56 DeepSeek and Open Source Model Releases
* 00:04:54 Market Narrative and DeepSeek Impact
* 00:05:53 AI Engineering and GPT Wrappers
* 00:06:53 AI Builders and Low-Code Platforms
* 00:07:50 Innovating Beyond Existing Paradigms
* 00:08:50 Apple and AI Product Development
* 00:09:48 Overhyped and Underhyped AI Trends
* 00:10:46 Frameworks and Protocols in AI Development
* 00:11:45 Emerging Opportunities in AI
* 00:12:44 Stateful AI and Memory Innovation
* 00:13:44 Challenges with Memory in AI Agents
* 00:14:44 The Future of Model Training Companies
* 00:15:44 Specialized Use Cases for AI Models
* 00:16:44 Vertical Models vs General Purpose Models
* 00:17:42 General Purpose vs Domain-Specific Models
* 00:18:42 Reflections on Model Companies
* 00:19:39 Model Companies Entering Product Space
* 00:20:38 Competition in AI Model and Product Sectors
* 00:21:35 Coding Agents and Market Dynamics
* 00:22:35 Defensibility in AI Applications
* 00:23:35 Investing in Underappreciated AI Ventures
* 00:24:32 Analyzing Market Fit in AI
* 00:25:31 AI Applications with Product Market Fit
* 00:26:31 OpenAI's Impact on the Market
* 00:27:31 Google and OpenAI Competition
* 00:28:31 Exploring Google's Advancements
* 00:29:29 Customer Support and AI Applications
* 00:30:27 The Future of AI in Customer Support
* 00:31:26 Cost-Cutting vs Growth in AI
* 00:32:23 Voice AI and Real-World Applications
* 00:33:23 Scaling AI Applications for Demand
* 00:34:22 Summarization and Conversational AI
* 00:35:20 Future AI Use Cases and Market Fit
* 00:36:20 AI Education and Model Capabilities
* 00:37:17 Reforming Education with AI
* 00:38:15 Defensibility in AI Apps
* 00:39:13 Network Effects and AI
* 00:40:12 AI Brand and Market Positioning
* 00:41:11 AI Application Defensibility
* 00:42:09 LLM OS and AI Infrastructure
* 00:43:06 Security and AI Application
* 00:44:06 OpenAI's Role in AI Infrastructure
* 00:45:02 The Balance of AI Applications and Infrastructure
* 00:46:02 Capital Efficiency in AI Infrastructure
* 00:47:01 Challenges in AI DevOps and Infrastructure
* 00:47:59 AI SRE and Monitoring
* 00:48:59 Scaling AI and Hardware Challenges
* 00:49:58 Reliability and Compute in AI
* 00:50:57 Nvidia's Dominance and AI Hardware
* 00:51:57 Emerging Competition in AI Silicon
* 00:52:54 Agent Authentication Challenges
* 00:53:53 Dream Podcast Guests
* 00:54:51 Favorite News Sources and Startups
* 00:55:50 The Value of In-Person Conversations
* 00:56:50 Private vs Public AI Discourse
* 00:57:48 Latent Space and Podcasting
* 00:58:46 Conclusion and Final Thoughts
--------
The Agent Network — Dharmesh Shah
If you’re in SF: Join us for the Claude Plays Pokemon hackathon this Sunday!

If you’re not: Fill out the 2025 State of AI Eng survey for $250 in Amazon cards!

We are SO excited to share our conversation with Dharmesh Shah, co-founder of HubSpot and creator of Agent.ai.

A particularly compelling concept we discussed is the idea of "hybrid teams": the next evolution in workplace organization, where human workers collaborate with AI agents as team members. Just as we previously saw hybrid teams emerge in terms of full-time vs. contract workers, or in-office vs. remote workers, Dharmesh predicts that the next frontier will be teams composed of both human and AI members. This raises interesting questions about team dynamics, trust, and how to effectively delegate tasks between human and AI team members.

The discussion of business models in AI reveals an important distinction between Work as a Service (WaaS) and Results as a Service (RaaS), something Dharmesh has written extensively about. While RaaS has gained popularity, particularly in customer support applications where outcomes are easily measurable, Dharmesh argues that this model may be over-indexed. Not all AI applications have clearly definable outcomes or consistent economic value per transaction, making WaaS more appropriate in many cases. This insight is particularly relevant for businesses considering how to monetize AI capabilities.

The technical challenges of implementing effective agent systems are also explored, particularly around memory and authentication. Shah emphasizes the importance of cross-agent memory sharing and the need for more granular control over data access. He envisions a future where users can selectively share parts of their data with different agents, similar to how OAuth works but with much finer control. This points to significant opportunities in developing infrastructure for secure and efficient agent-to-agent communication and data sharing.

Other highlights from our conversation
* The Evolution of AI-Powered Agents – Exploring how AI agents have evolved from simple chatbots to sophisticated multi-agent systems, and the role of MCPs in enabling that.
* Hybrid Digital Teams and the Future of Work – How AI agents are becoming teammates rather than just tools, and what this means for business operations and knowledge work.
* Memory in AI Agents – The importance of persistent memory in AI systems and how shared memory across agents could enhance collaboration and efficiency.
* Business Models for AI Agents – Exploring the shift from software as a service (SaaS) to work as a service (WaaS) and results as a service (RaaS), and what this means for monetization.
* The Role of Standards Like MCP – Why MCP has been widely adopted and how it enables agent collaboration, tool use, and discovery.
* The Future of AI Code Generation and Software Engineering – How AI-assisted coding is changing the role of software engineers and what skills will matter most in the future.
* Domain Investing and Efficient Markets – Dharmesh’s approach to domain investing and how inefficiencies in digital asset markets create business opportunities.
* The Philosophy of Saying No – Lessons from "Sorry, You Must Pass" and how prioritization leads to greater productivity and focus.

Timestamps
* 00:00 Introduction and Guest Welcome
* 02:29 Dharmesh Shah's Journey into AI
* 05:22 Defining AI Agents
* 06:45 The Evolution and Future of AI Agents
* 13:53 Graph Theory and Knowledge Representation
* 20:02 Engineering Practices and Overengineering
* 25:57 The Role of Junior Engineers in the AI Era
* 28:20 Multi-Agent Systems and MCP Standards
* 35:55 LinkedIn's Legal Battles and Data Scraping
* 37:32 The Future of AI and Hybrid Teams
* 39:19 Building Agent.ai: A Professional Network for Agents
* 40:43 Challenges and Innovations in Agent.ai
* 45:02 The Evolution of UI in AI Systems
* 01:00:25 Business Models: Work as a Service vs. Results as a Service
* 01:09:17 The Future Value of Engineers
* 01:09:51 Exploring the Role of Agents
* 01:10:28 The Importance of Memory in AI
* 01:11:02 Challenges and Opportunities in AI Memory
* 01:12:41 Selective Memory and Privacy Concerns
* 01:13:27 The Evolution of AI Tools and Platforms
* 01:18:23 Domain Names and AI Projects
* 01:32:08 Balancing Work and Personal Life
* 01:35:52 Final Thoughts and Reflections

Transcript

Alessio [00:00:04]: Hey everyone, welcome back to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.

swyx [00:00:12]: Hello, and today we're super excited to have Dharmesh Shah to join us. I guess your relevant title here is founder of Agent.ai.

Dharmesh [00:00:20]: Yeah, that's true for this. Yeah, creator of Agent.ai and co-founder of HubSpot.

swyx [00:00:25]: Co-founder of HubSpot, which I followed for many years, I think 18 years now, gonna be 19 soon. And you caught, you know, people can catch up on your HubSpot story elsewhere. I should also thank Shaan Puri, who I've chatted with back and forth, who's been, I guess, getting me in touch with your people. But also, I think like, just giving us a lot of context, because obviously, My First Million joined you guys, and they've been chatting with you guys a lot. So for the business side, we can talk about that, but I kind of wanted to engage your CTO, agent, engineer side of things. So how did you get agent religion?

Dharmesh [00:01:00]: Let's see. So I've been working, I'll take like a half step back, a decade or so ago, even though actually more than that. So even before HubSpot, the company I was contemplating, that I had a name for, was called Ingenisoft. And the idea behind Ingenisoft was a natural language interface to business software. Now realize this is 20 years ago, so that was a hard thing to do.
But the actual use case that I had in mind was, you know, we had data sitting in business systems like a CRM or something like that. And my kind of, what I thought was clever at the time: oh, what if we used email as the kind of interface to get to business software? And the motivation for using email is that it automatically works when you're offline. So imagine I'm getting on a plane or I'm on a plane. There was no internet on planes back then. It's like, oh, I'm going through business cards from an event I went to. I can just type things into an email just to have them all in the backlog. When it reconnects, it sends those emails to a processor that basically kind of parses effectively the commands and updates the software, sends you the file, whatever it is. And there was a handful of commands. I was a little bit ahead of the times in terms of what was actually possible. And I reattempted this natural language thing with a product called ChatSpot that I did back 20...

swyx [00:02:12]: Yeah, this is your first post-ChatGPT project.

Dharmesh [00:02:14]: I saw it come out. Yeah. And so I've always been kind of fascinated by this natural language interface to software. Because, you know, as software developers, myself included, we've always said, oh, we build intuitive, easy-to-use applications. And it's not intuitive at all, right? Because what we're doing is... We're taking the mental model that's in our head of what we're trying to accomplish with said piece of software and translating that into a series of touches and swipes and clicks and things like that. And there's nothing natural or intuitive about it. And so natural language interfaces, for the first time, you know, whatever the thought is you have in your head and expressed in whatever language that you normally use to talk to yourself in your head, you can just sort of emit that and have software do something. And I thought that was kind of a breakthrough, which it has been. And it's gone.
So that's where I first started getting into the journey. I started because now it actually works, right? So once we got ChatGPT and you can take, even with a few-shot example, convert something into structured text, even back in the GPT-3.5 days, it did a decent job with a few-shot example if you knew what kinds of intents you were going to have. And so that happened. And that ultimately became a HubSpot project. But then agents intrigued me because I'm like, okay, well, that's the next step here. So chat's great. Love Chat UX. But if we want to do something even more meaningful, it felt like the next kind of advancement is not this kind of, I'm chatting with some software in a kind of a synchronous back and forth model, it's that software is going to do things for me in kind of a multi-step way to try and accomplish some goals. So, yeah, that's when I first got started. It's like, okay, what would that look like? Yeah. And I've been obsessed ever since, by the way.

Alessio [00:03:55]: Which goes back to your first experience with it, which is like you're offline. Yeah. And you want to do a task. You don't need to do it right now. You just want to queue it up for somebody to do it for you. Yes. As you think about agents, like, let's start at the easy question, which is like, how do you define an agent? Maybe. You mean the hardest question in the universe? Is that what you mean?

Dharmesh [00:04:12]: You said you have an irritating take. I do have an irritating take. I think, well, some number of people have been irritated, including within my own team. So I have a very broad definition for agents, which is it's AI-powered software that accomplishes a goal. Period. That's it. And what irritates people about it is like, well, that's so broad as to be completely non-useful. And I understand that. I understand the criticism.
But in my mind, if you kind of fast forward months, I guess, in AI years, the implementation of it, and we're already starting to see this, and we'll talk about this, different kinds of agents, right? So I think in addition to having a usable definition, and I like yours, by the way, and we should talk more about that, that you just came out with, the classification of agents actually is also useful, which is, is it autonomous or non-autonomous? Does it have a deterministic workflow? Does it have a non-deterministic workflow? Is it working synchronously? Is it working asynchronously? Then you have the different kind of interaction modes. Is it a chat agent, kind of like a customer support agent would be? You're having this kind of back and forth. Is it a workflow agent that just does a discrete number of steps? So there's all these different flavors of agents. So if I were to draw it in a Venn diagram, I would draw a big circle that says, this is agents, and then I have a bunch of circles, some overlapping, because they're not mutually exclusive. And so I think that's what's interesting, and we're seeing development along a bunch of different paths, right? So if you look at the first implementation of agent frameworks, you look at Baby AGI and AutoGPT, I think it was, not Autogen, that's the Microsoft one. They were way ahead of their time because they assumed this level of reasoning and execution and planning capability that just did not exist, right? So it was an interesting thought experiment, which is what it was. Even the guy that, I'm an investor in Yohei's fund that did Baby AGI. It wasn't ready, but it was a sign of what was to come. And so the question then is, when is it ready? And so lots of people talk about the state of the art when it comes to agents. I'm a pragmatist, so I think of the state of the practical.
It's like, okay, well, what can I actually build that has commercial value or solves actually some discrete problem with some baseline of repeatability or verifiability?

swyx [00:06:22]: There was a lot, and very, very interesting. I'm not irritated by it at all. Okay. As you know, I take a... There's a lot of anthropological view or linguistics view. And in linguistics, you don't want to be prescriptive. You want to be descriptive. Yeah. So you're a goals guy. That's the key word in your thing. And other people have other definitions that might involve like delegated trust or non-deterministic work, LLM in the loop, all that stuff. The other thing I was thinking about, just the comment on Baby AGI, AutoGPT. Yeah. In that piece that you just read, I was able to go through our backlog and just kind of track the winter of agents and then the summer now. Yeah. And it's... We can tell the whole story as an oral history, just following that thread. And it's really just like, I think, I tried to explain the why now, right? Like I had, there's better models, of course. There's better tool use with like, they're just more reliable. Yep. Better tools with MCP and all that stuff. And I'm sure you have opinions on that too. Business model shift, which you like a lot. I just heard you talk about RaaS with the MFM guys. Yep. Cost is dropping a lot. Yep. Inference is getting faster. There's more model diversity. Yep. Yep. I think it's a subtle point. It means that like, you have different models with different perspectives. You don't get stuck in the basin of performance of a single model. Sure. You can just get out of it by just switching models. Yep. Multi-agent research and RL fine tuning. So I just wanted to let you respond to like any of that.

Dharmesh [00:07:44]: Yeah. A couple of things. Connecting the dots on the kind of the definition side of it. So we'll get the irritation out of the way completely. I have one more, even more irritating leap on the agent definition thing.
So here's the way I think about it. By the way, the kind of word agent, I looked it up, like the English dictionary definition. The old school agent, yeah. Is when you have someone or something that does something on your behalf, like a travel agent or a real estate agent acts on your behalf. It's like proxy, which is a nice kind of general definition. So the other direction I'm sort of headed, and it's going to tie back to tool calling and MCP and things like that, is if you, and I'm not a biologist by any stretch of the imagination, but we have these single-celled organisms, right? Like the simplest possible form of what one would call life. But it's still life. It just happens to be single-celled. And then you can combine cells and then cells become specialized over time. And you have much more sophisticated organisms, you know, kind of further down the spectrum. In my mind, at the most fundamental level, you can almost think of having atomic agents. What is the simplest possible thing that's an agent that can still be called an agent? What is the equivalent of a kind of single-celled organism? And the reason I think that's useful is right now we're headed down the road, which I think is very exciting around tool use, right? That says, okay, the LLMs now can be provided a set of tools that it calls to accomplish whatever it needs to accomplish in the kind of furtherance of whatever goal it's trying to get done. And I'm not overly bothered by it, but if you think about it, if you just squint a little bit and say, well, what if everything was an agent? And what if tools were actually just atomic agents? Because then it's turtles all the way down, right? Then it's like, oh, well, all that's really happening with tool use is that we have a network of agents that know about each other through something like an MCP and can kind of decompose a particular problem and say, oh, I'm going to delegate this to this set of agents.
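The "tools are just atomic agents" idea above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not anything described in the episode: the names `Agent`, `atomic_agent`, and `Network` are hypothetical, the "tools" are plain string functions standing in for real tool calls, and no LLM or MCP client is involved. It shows one primitive being used both for the leaf tools and for a composite that delegates to them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """The single primitive: a named thing that takes a goal and returns a result.

    Hypothetical interface invented for this sketch.
    """
    name: str
    description: str
    run: Callable[[str], str]


def atomic_agent(name: str, description: str, fn: Callable[[str], str]) -> Agent:
    """Wrap a plain function (a "tool") so it looks like any other agent."""
    return Agent(name=name, description=description, run=fn)


@dataclass
class Network:
    """A registry of agents that can find each other by name."""
    agents: Dict[str, Agent] = field(default_factory=dict)

    def register(self, agent: Agent) -> None:
        self.agents[agent.name] = agent

    def compose(self, name: str, description: str, steps: List[str]) -> Agent:
        """Build a higher-order agent that delegates to `steps` in order,
        passing each agent's output to the next one."""
        def run(goal: str) -> str:
            result = goal
            for step in steps:
                result = self.agents[step].run(result)
            return result

        agent = Agent(name=name, description=description, run=run)
        self.register(agent)
        return agent


# Two "tools" registered as atomic agents, then combined into a composite agent.
net = Network()
net.register(atomic_agent("strip", "trim surrounding whitespace", str.strip))
net.register(atomic_agent("upper", "uppercase the text", str.upper))
shout = net.compose("shout", "clean up, then uppercase", ["strip", "upper"])
print(shout.run("  hello agents  "))  # prints "HELLO AGENTS"
```

Because a composite is registered under its own name, it can itself appear in the `steps` of another composite, which is the "turtles all the way down" property. In a real system the fixed `steps` list would presumably be replaced by a model deciding which agents to delegate to.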
And why do we need to draw this distinction between tools, which are functions most of the time, and an actual agent? And so I'm going to write this irritating LinkedIn post, you know, proposing this. It's like, okay. And I'm not suggesting we should call even functions, you know, call them agents. But there is a certain amount of elegance that happens when you say, oh, we can just reduce it down to one primitive, which is an agent that you can combine in complicated ways to kind of raise the level of abstraction and accomplish higher order goals. Anyway, that's my answer. I'd say that's a success. Thank you for coming to my TED Talk on agent definitions.

Alessio [00:09:54]: How do you define the minimum viable agent? Do you already have a definition for, like, where you draw the line between a cell and an atom? Yeah.

Dharmesh [00:10:02]: So in my mind, it has to, at some level, use AI in order for it to... otherwise, it's just software. It's like, you know, we don't need another word for that. And so that's probably where I draw the line. So then the question, you know, the counterargument would be, well, if that's true, then lots of tools themselves are actually not agents because they're just doing a database call or a REST API call or whatever it is they're doing. And that does not necessarily qualify them, which is a fair counterargument. And I accept that. It's like a good argument. I still like to think about... because we'll talk about multi-agent systems, because I think... so we've accepted, which I think is true, lots of people have said it, and you've hopefully combined some of those clips of really smart people saying this is the year of agents, and I completely agree, it is the year of agents. But then shortly after that, it's going to be the year of multi-agent systems or multi-agent networks. I think that's where it's going to be headed next year. Yeah.

swyx [00:10:54]: OpenAI's already on that. Yeah. My quick philosophical engagement with you on this.
I often think about kind of the other spectrum, the other end of the cell spectrum. So single cell is life, multi-cell is life, and you clump a bunch of cells together in a more complex organism, they become organs, like an eye and a liver or whatever. And then obviously we consider ourselves one life form. There's not like a lot of lives within me. I'm just one life. And now, obviously, I don't think people don't really like to anthropomorphize agents and AI. Yeah. But we are extending our consciousness and our brain and our functionality out into machines. I just saw you wear a Bee. Yeah. Which is, you know, it's nice. I have a Limitless pendant in my pocket.

Dharmesh [00:11:37]: I got one of these boys. Yeah.

swyx [00:11:39]: I'm testing it all out. You know, got to be early adopters. But like, we want to extend our personal memory into these things so that we can be good at the things that we're good at. And, you know, machines are good at it. Machines are there. So like, my definition of life is kind of like going outside of my own body now. I don't know if you've ever had like reflections on that. Like how yours. How our self is like actually being distributed outside of you. Yeah.

Dharmesh [00:12:01]: I don't fancy myself a philosopher. But you went there. So yeah, I did go there. I'm fascinated by kind of graphs and graph theory and networks and have been for a long, long time. And to me, we're sort of all nodes in this kind of larger thing. It just so happens that we're looking at individual kind of life forms as they exist right now. But so the idea is when you put a podcast out there, there's these little kind of nodes you're putting out there of like, you know, conceptual ideas. Once again, you have varying kind of forms of those little nodes that are up there and are connected in varying and sundry ways. And so I just think of myself as being a node in a massive, massive network. And I'm producing more nodes as I put content or ideas.
And, you know, you spend some portion of your life collecting dots, experiences, people, and some portion of your life then connecting dots from the ones that you've collected over time. And I found that really interesting things happen and you really can't know in advance how those dots are necessarily going to connect in the future. And that's, yeah. So that's my philosophical take. That's the, yes, exactly. Coming back.
Alessio [00:13:04]: Yep. Do you like graph as an agent abstraction? That's been one of the hot topics with LangGraph and Pydantic and all that.
Dharmesh [00:13:11]: I do. The thing I'm more interested in, in terms of use of graphs, and there's lots of work happening on that now, is graph data stores as an alternative in terms of knowledge stores and knowledge graphs. Yeah. Because, you know, so I've been in software now 30 plus years, right? So it's not 10,000 hours. It's like 100,000 hours that I've spent doing this stuff. And so I grew up with, so back in the day, you know, I started on mainframes. There was a product called IMS from IBM, which is basically an index database, what we'd call like a key value store today. Then we've had relational databases, right? We have tables and columns and foreign key relationships. We all know that. We have document databases like MongoDB, which is sort of a nested structure keyed by a specific index. We have vector stores, vector embedding databases. And graphs are interesting for a couple of reasons. One is, so it's not classically structured in a relational way. When you say structured database, to most people, they're thinking tables and columns and relational databases and set theory and all that. Graphs still have structure, but it's not the tables and columns structure. And you could wonder, and people have made this case, that they are a better representation of knowledge for LLMs and for AI generally than other things. 
So that's kind of thing number one conceptually, and that might be true, I think is possibly true. And the other thing that I really like about that in the context of, you know, I've been in the context of data stores for RAG is, you know, RAG, you say, oh, I have a million documents, I'm going to build the vector embeddings, I'm going to come back with the top X based on the semantic match, and that's fine. All that's very, very useful. But the reality is something gets lost in the chunking process and the, okay, well, those tend, you know, like, you don't really get the whole picture, so to speak, and maybe not even the right set of dimensions on the kind of broader picture. And it makes intuitive sense to me that if we did capture it properly in a graph form, that maybe that feeding into a RAG pipeline will actually yield better results for some use cases, I don't know, but yeah.Alessio [00:15:03]: And do you feel like at the core of it, there's this difference between imperative and declarative programs? Because if you think about HubSpot, it's like, you know, people and graph kind of goes hand in hand, you know, but I think maybe the software before was more like primary foreign key based relationship, versus now the models can traverse through the graph more easily.Dharmesh [00:15:22]: Yes. So I like that representation. There's something. It's just conceptually elegant about graphs and just from the representation of it, they're much more discoverable, you can kind of see it, there's observability to it, versus kind of embeddings, which you can't really do much with as a human. You know, once they're in there, you can't pull stuff back out. But yeah, I like that kind of idea of it. And the other thing that's kind of, because I love graphs, I've been long obsessed with PageRank from back in the early days. And, you know, one of the kind of simplest algorithms in terms of coming up, you know, with a phone, everyone's been exposed to PageRank. 
And the idea is that, and so I had this other idea for a project, not a company, and I have hundreds of these, called NodeRank, is to be able to take the idea of PageRank and apply it to an arbitrary graph that says, okay, I'm going to define what authority looks like and say, okay, well, that's interesting to me, because then if you say, I'm going to take my knowledge store, and maybe this person that contributed some number of chunks to the graph data store has more authority on this particular use case or prompt that's being submitted than this other one, or maybe this one was more popular, or maybe this one has, whatever it is, there should be a way for us to kind of rank nodes in a graph and sort them in some useful way. Yeah.
swyx [00:16:34]: So I think that's generally useful for anything. I think the problem, like, so even though at my conferences, GraphRAG is super popular and people are getting knowledge graph religion, and I will say it's getting traction in two areas, conversation memory, and then also just RAG in general, like the document data. Yeah. It's like a source. Most ML practitioners would say that knowledge graph is kind of like a dirty word. The graph database people get graph religion, everything's a graph, and then they go really hard into it and then they get a graph that is too complex to navigate. Yes. And so the simple way to put it is like, you, running HubSpot, know the power of graphs, the way that Google has pitched them for many years, but I don't suspect that HubSpot itself uses a knowledge graph. No. Yeah. So when is it over-engineering, basically?
Dharmesh [00:17:26]: It's a great question. I don't know. So the question now, like in AI land, right, is, do we necessarily need to understand? So right now, LLMs for the most part are somewhat black boxes, right? 
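The NodeRank idea described above (PageRank on an arbitrary graph, with a caller-defined notion of authority) can be sketched as a personalized PageRank. This is only an illustration under stated assumptions, not Dharmesh's project: the function name `node_rank`, the example nodes, and the authority weights are all hypothetical.

```python
def node_rank(edges, authority, damping=0.85, iterations=50):
    """Personalized PageRank: `authority` is a caller-defined prior weight per node."""
    nodes = sorted(set(authority) | {n for edge in edges for n in edge})
    total = sum(authority.get(n, 0.0) for n in nodes)
    # Normalize the authority weights into a probability distribution.
    prior = {n: authority.get(n, 0.0) / total for n in nodes}

    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = dict(prior)
    for _ in range(iterations):
        # Teleport step: jump back to nodes in proportion to their authority.
        nxt = {n: (1 - damping) * prior[n] for n in nodes}
        for n in nodes:
            targets = out_links[n]
            if targets:
                share = damping * rank[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling node: hand its rank back according to the prior.
                for t in nodes:
                    nxt[t] += damping * rank[n] * prior[t]
        rank = nxt
    return rank


# Hypothetical knowledge store: two contributors, each linked to one chunk.
edges = [("alice", "chunk1"), ("bob", "chunk2")]
authority = {"alice": 3.0, "bob": 1.0}  # assumed: alice is more authoritative here
ranks = node_rank(edges, authority)
ordered = sorted(ranks, key=ranks.get, reverse=True)
```

Changing the `authority` weights per prompt or use case changes the ordering, which is the "define what authority looks like" part of the idea.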
We sort of understand how the, you know, the algorithm itself works, but we really don't know what's going on in there and how things come out. So if a graph data store is able to produce the outcomes we want, it's like, here's a set of queries I want to be able to submit and then it comes out with useful content. Maybe the underlying data store is as opaque as vector embeddings or something like that, but maybe it's fine. Maybe we don't necessarily need to understand it to get utility out of it. And so maybe if it's messy, that's okay. It's just another form of lossy compression. It's just lossy in a way that we just don't completely understand, because it's going to grow organically. And it's not structured. It's like, ah, we're just gonna throw a bunch of stuff in there. Let the equivalent of the embedding algorithm, whatever they call it in graph land. So the one with the best results wins. I think so. Yeah.
swyx [00:18:26]: Or is this the practical side of me is like, yeah, it's, if it's useful, we don't necessarily
Dharmesh [00:18:30]: need to understand it.
swyx [00:18:30]: I have, I mean, I'm happy to push back as long as you want. It's not practical to evaluate like the 10 different options out there because it takes time. It takes people, it takes, you know, resources, right? Set. That's the first thing. Second thing is your evals are typically on small things and some things only work at scale. Yup. Like graphs. Yup.
Dharmesh [00:18:46]: Yup. That's, yeah, no, that's fair. And I think this is one of the challenges in terms of implementation of graph databases, that the most common approach that I've seen developers do, I've done it myself, is that, oh, I've got a Postgres database or a MySQL or whatever. I can represent a graph with a set of tables with a parent-child thing or whatever. And that sort of gives me the ability, why would I need anything more than that? 
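The "graph in a set of tables" approach mentioned here can be sketched with an edge table and a recursive query. This is a minimal illustration using SQLite from the Python standard library rather than Postgres or MySQL; the table name `edges`, the sample rows, and the `reachable` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("ana", "ben"), ("ben", "cy"), ("cy", "dee"), ("ana", "eve")],
)


def reachable(conn, start, max_hops):
    """Return (node, hops) pairs reachable from `start` within `max_hops` edges."""
    return conn.execute(
        """
        WITH RECURSIVE walk(node, hops) AS (
            SELECT ?, 0
            UNION
            SELECT e.dst, w.hops + 1
            FROM walk AS w
            JOIN edges AS e ON e.src = w.node
            WHERE w.hops < ?
        )
        SELECT node, MIN(hops) FROM walk GROUP BY node ORDER BY MIN(hops), node
        """,
        (start, max_hops),
    ).fetchall()


two_hops = reachable(conn, "ana", 2)
```

Every extra hop is another self-join on the edge table, which gives a feel for why deep traversals over a large graph in this model become hard to do efficiently.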
And the answer is, well, if you don't need anything more than that, you don't need anything more than that. But there's a high chance that you're sort of missing out on the actual value that, uh, the graph representation gives you. Which is the ability to traverse the graph, uh, efficiently in ways that kind of going through the, uh, traversal in a relational database form, even though structurally you have the data, practically you're not gonna be able to pull it out in, in useful ways. Uh, so you wouldn't like represent a social graph, uh, in, in using that kind of relational table model. It just wouldn't scale. It wouldn't work.swyx [00:19:36]: Uh, yeah. Uh, I think we want to move on to MCP. Yeah. But I just want to, like, just engineering advice. Yeah. Uh, obviously you've, you've, you've run, uh, you've, you've had to do a lot of projects and run a lot of teams. Do you have a general rule for over-engineering or, you know, engineering ahead of time? You know, like, because people, we know premature engineering is the root of all evil. Yep. But also sometimes you just have to. Yep. When do you do it? Yes.Dharmesh [00:19:59]: It's a great question. This is, uh, a question as old as time almost, which is what's the right and wrong levels of abstraction. That's effectively what, uh, we're answering when we're trying to do engineering. I tend to be a pragmatist, right? So here's the thing. Um, lots of times doing something the right way. Yeah. It's like a marginal increased cost in those cases. Just do it the right way. And this is what makes a, uh, a great engineer or a good engineer better than, uh, a not so great one. It's like, okay, all things being equal. If it's going to take you, you know, roughly close to constant time anyway, might as well do it the right way. Like, so do things well, then the question is, okay, well, am I building a framework as the reusable library? 
To what degree, uh, what am I anticipating in terms of what's going to need to change in this thing? Uh, you know, along what dimension? And then I think like a business person in some ways, like what's the return on calories, right? So, uh, and you look at, um, energy, the expected value of it's like, okay, here are the five possible things that could happen, uh, try to assign probabilities like, okay, well, if there's a 50% chance that we're going to go down this particular path at some day, like, or one of these five things is going to happen and it costs you 10% more to engineer for that. It's basically, it's something that yields a kind of interest compounding value. Um, as you get closer to the time of, of needing that versus having to take on debt, which is when you under engineer it, you're taking on debt. You're going to have to pay off when you do get to that eventuality where something happens. One thing as a pragmatist, uh, so I would rather under engineer something than over engineer it. If I were going to err on the side of something, and here's the reason is that when you under engineer it, uh, yes, you take on tech debt, uh, but the interest rate is relatively known and payoff is very, very possible, right? Which is, oh, I took a shortcut here as a result of which now this thing that should have taken me a week is now going to take me four weeks. Fine. But if that particular thing that you thought might happen, never actually, you never have that use case transpire or just doesn't, it's like, well, you just save yourself time, right? And that has value because you were able to do other things instead of, uh, kind of slightly over-engineering it away, over-engineering it. But there's no perfect answers in art form in terms of, uh, and yeah, we'll, we'll bring kind of this layers of abstraction back on the code generation conversation, which we'll, uh, I think I have later on, butAlessio [00:22:05]: I was going to ask, we can just jump ahead quickly. 
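The expected-value reasoning above can be written out as simple arithmetic. The 50% probability, the 10% upfront premium, and the one-week-becomes-four-weeks rework come from the conversation; the 10-week base effort is an assumption added only to make the numbers concrete.

```python
def expected_extra_cost(base_weeks, upfront_premium, probability, rework_weeks):
    """Compare paying a premium now against paying rework later, in expected weeks."""
    ahead = base_weeks * upfront_premium   # paid whether or not the need appears
    deferred = probability * rework_weeks  # paid only if the need appears
    return ahead, deferred


def break_even_probability(base_weeks, upfront_premium, rework_weeks):
    """Probability at which engineering ahead and deferring cost the same."""
    return base_weeks * upfront_premium / rework_weeks


# Assumed 10-week project; rework is the 3 extra weeks (4 weeks instead of 1).
ahead, deferred = expected_extra_cost(
    base_weeks=10, upfront_premium=0.10, probability=0.5, rework_weeks=3
)
threshold = break_even_probability(base_weeks=10, upfront_premium=0.10, rework_weeks=3)
```

With these particular numbers, engineering ahead costs 1 week and deferring costs 1.5 weeks in expectation, so the result depends heavily on the assumed probability and rework cost. If refactoring gets cheaper, `rework_weeks` shrinks and the break-even probability rises, which favors deferring.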
Yeah. Like, as you think about vibe coding and all that, how does the percentage of potential usefulness change? I feel like when we over-engineer, a lot of times it's the investment in syntax; it's less about the investment in, like, architecting. Yep. Yeah. How does that change your calculus?
Dharmesh [00:22:22]: A couple of things, right? One is, so, you know, going back to that kind of ROI, or return on calories, kind of calculus or heuristic you think through, it's like, okay, well, what is it going to cost me to put this layer of abstraction above the code that I'm writing now, in anticipating kind of future needs? If the cost of fixing, or of under-engineering right now, will trend towards zero, that says, okay, well, I don't have to get it right right now because even if I get it wrong, I'll run the thing for six hours instead of 60 minutes or whatever. It doesn't really matter, right? Because that's going to trend towards zero, the ability to refactor code. And because not that long from now, we're going to have, you know, large code bases be able to exist as context for a code generation or a code refactoring model. So I think it's going to make the case for under-engineering even stronger. Which is, why take on that cost? You just pay the interest when you get there. Just go on with your life, vibe code it, and come back when you need to. Yeah.
Alessio [00:23:18]: Sometimes I feel like there's no decision-making in some things. Like, today I built an autosave for our internal notes platform and I literally just asked Cursor, can you add autosave? Yeah. I don't know if it's over- or under-engineered. Yep. I just vibe coded it. Yep. 
And I feel like at some point we're going to get to the point where the models kindDharmesh [00:23:36]: of decide where the right line is, but this is where the, like the, in my mind, the danger is, right? So there's two sides to this. One is the cost of kind of development and coding and things like that stuff that, you know, we talk about. But then like in your example, you know, one of the risks that we have is that because adding a feature, uh, like a save or whatever the feature might be to a product as that price tends towards zero, are we going to be less discriminant about what features we add as a result of making more product products more complicated, which has a negative impact on the user and navigate negative impact on the business. Um, and so that's the thing I worry about if it starts to become too easy, are we going to be. Too promiscuous in our, uh, kind of extension, adding product extensions and things like that. It's like, ah, why not add X, Y, Z or whatever back then it was like, oh, we only have so many engineering hours or story points or however you measure things. Uh, that least kept us in check a little bit. Yeah.Alessio [00:24:22]: And then over engineering, you're like, yeah, it's kind of like you're putting that on yourself. Yeah. Like now it's like the models don't understand that if they add too much complexity, it's going to come back to bite them later. Yep. So they just do whatever they want to do. Yeah. And I'm curious where in the workflow that's going to be, where it's like, Hey, this is like the amount of complexity and over-engineering you can do before you got to ask me if we should actually do it versus like do something else.Dharmesh [00:24:45]: So you know, we've already, let's like, we're leaving this, uh, in the code generation world, this kind of compressed, um, cycle time. Right. 
It's like, okay, we went from auto-complete in GitHub Copilot to like, oh, finish this particular thing and hit tab, to, oh, I sort of know your file or whatever, I can write out a full function for you, to now I can hold a bunch of the context in my head. So we can do app generation, which we have now with Lovable and Bolt and Replit Agent and other things. So then the question is, okay, well, where does it naturally go from here? So we're going to generate products. Makes sense. We might be able to generate platforms, as though I want a platform for ERP that does this, whatever. And that includes the APIs, includes the product and the UI, and all the things that make for a platform. There's nothing that says we would stop. Like, okay, can you generate an entire software company someday? Right. With the platform and the monetization and the go-to-market and the whatever. And you know, that's interesting to me in terms of, you know, when you take it to almost ludicrous levels of abstraction.
swyx [00:25:39]: It's like, okay, turn it to 11. You mentioned vibe coding, so I have to, this is a blog post I haven't written, but I'm kind of exploring it. Is the junior engineer dead?
Dharmesh [00:25:49]: I don't think so. I think what will happen is that the junior engineer will be able to, if all they're bringing to the table is the fact that they are a junior engineer, then yes, they're likely dead. But hopefully if they can communicate with carbon-based life forms, they can interact with product, if they're willing to talk to customers, they can take their kind of basic understanding of engineering and how kind of software works. I think that has value. So I have a 14-year-old right now who's taking a Python programming class, and some people ask me, it's like, why is he learning coding? And my answer is, because it's not about the syntax, it's not about the coding. 
What he's learning is like the fundamental thing of like how things work. And there's value in that. I think there's going to be timeless value in systems thinking and abstractions and what that means. And whether functions manifested as math, which he's going to get exposed to regardless, or there are some core primitives to the universe, I think, that the more you understand them, those are what I would kind of think of as like really large dots in your life that will have a higher gravitational pull and value to them that you'll then be able to. So I want him to collect those dots, and he's not resisting. So it's like, okay, while he's still listening to me, I'm going to have him do things that I think will be useful.
swyx [00:26:59]: You know, part of one of the pitches that I evaluated for AI engineer as a term is that maybe the traditional interview path or career path of software engineer goes away, because what's the point of LeetCode? Yeah. And, you know, it actually matters more that you know how to work with AI and to implement the things that you want. Yep.
Dharmesh [00:27:16]: That's one of the like interesting things that's happened with generative AI. You know, you go from machine learning and the models and just that underlying form, which is like true engineering, right? Like the actual, what I call real engineering. I don't think of myself as a real engineer, actually. I'm a developer. But now with generative AI, we call it AI and it's obviously got its roots in machine learning, but it just feels fundamentally different to me. Like you have the vibe. It's like, okay, well, this is just a whole different approach to software development, to so many different things. 
And so I'm wondering now, it's like an AI engineer is like, if you were to draw the Venn diagram, it's interesting because it's the cross between AI things, generative AI and what the tools are capable of, what the models do, and this whole new kind of body of knowledge that we're still building out, it's still very young, intersected with kind of classic engineering, software engineering. Yeah.
swyx [00:28:04]: I just described the overlap as it separates out eventually until it's its own thing, but it's starting out as a software. Yeah.
Alessio [00:28:11]: That makes sense. So to close the vibe coding loop, the other big hype now is MCPs. Obviously, I would say Claude Desktop and Cursor are like the two main drivers of MCP usage. I would say my favorite is the Sentry MCP. I can pull in errors and then you can just put the context in Cursor. How do you think about that abstraction layer? Does it feel almost too magical in a way? Do you think it's like you get enough? Because you don't really see how the server itself is then kind of like repackaging the information for you?
Dharmesh [00:28:41]: I think MCP as a standard is one of the better things that's happened in the world of AI because a standard needed to exist and absent a standard, there was a set of things that just weren't possible. Now, we can argue whether it's the best possible manifestation of a standard or not. Does it do too much? Does it do too little? I get that, but it's just simple enough to both be useful and unobtrusive. It's understandable and adoptable by mere mortals, right? It's not overly complicated. You know, a reasonable engineer can stand up an MCP server relatively easily. The thing that has me excited about it is like, so I'm a big believer in multi-agent systems. And so that's going back to our kind of this idea of an atomic agent. 
So imagine the MCP server, like obviously it calls tools, but the way I think about it, so I'm working on my current passion project, which is agent.ai. And we'll talk more about that in a little bit. I think we should, because I think it's interesting, not to promote the project at all, but there's some interesting ideas in there. One of which is around, we're going to need a mechanism for, if agents are going to collaborate and be able to delegate, there's going to need to be some form of discovery and we're going to need some standard way. It's like, okay, well, I just need to know what this thing over here is capable of. We're going to need a registry, which Anthropic's working on. I'm sure others will and have been doing directories of, and there's going to be a standard around that too. How do you build out a directory of MCP servers? I think that's going to unlock so many things just because, and we're already starting to see it. So I think MCP or something like it is going to be the next major unlock because it allows systems that don't know about each other, don't need to, it's that kind of decoupling of like Sentry and whatever tools someone else was building. And it's not just about, you know, Claude Desktop or things like that. Even on the client side, I think we're going to see very interesting consumers of MCP, MCP clients versus just the chatbot kind of things, like, you know, Claude Desktop and Cursor and things like that. But yeah, I'm very excited about MCP in that general direction.
swyx [00:30:39]: I think the typical cynical developer take, it's like, we have OpenAPI. Yeah. What's the new thing? I don't know if you have a, do you have a quick MCP versus everything else? Yeah.
Dharmesh [00:30:49]: So it's, so I like OpenAPI, right? So just a descriptive thing. It's OpenAPI. OpenAPI. Yes, that's what I meant. So it's basically a self-documenting thing. We can do machine-generated, lots of things from that output. 
It's a structured definition of an API. I get that, love it. But MCPs sort of are kind of use case specific. They're perfect for exactly what we're trying to use them for around LLMs in terms of discovery. It's like, okay, I don't necessarily need to know kind of all this detail. And so right now we have, we'll talk more about like MCP server implementations, but... We will? I think, I don't know. Maybe we won't. At least it's in my head. It's like a back processor. But I do think MCP adds value above OpenAPI. It's, yeah, just because it solves this particular thing. And if we had come to the world, which we have, like, it's like, hey, we already have OpenAPI. It's like, if that were good enough for the universe, the universe would have adopted it already. There's a reason why MCP is taking off, because it marginally adds something that was missing before and doesn't go too far. And so that's why the kind of rate of adoption, you folks have written about this and talked about it. Yeah, why MCP won. Yeah. And it won because the universe decided that this was useful and maybe it gets supplanted by something else. Yeah. And maybe we discover, oh, maybe OpenAPI was good enough the whole time. I doubt that.
swyx [00:32:09]: The meta lesson, this is, I mean, he's an investor in DevTools companies. I work in developer experience and DevRel in DevTools companies. Yep. Everyone wants to own the standard. Yeah. I'm sure you guys have tried to launch your own standards. Actually, is HubSpot known for a standard? You know, obviously inbound marketing. But is there a standard or protocol that you ever tried to push? No.
Dharmesh [00:32:30]: And there's a reason for this. Yeah. Is that? And I don't mean to speak for the people of HubSpot, but I personally. You kind of do. I'm not smart enough. That's not the, like, I think I have a. You're smart. Not enough for that. I'm much better off understanding the standards that are out there. 
And I'm more on the composability side. Let's, like, take the pieces of technology that exist out there, combine them in creative, unique ways. And I like to consume standards. I don't like to, and that's not that I don't like to create them. I just don't think I have either the raw wattage or the credibility. It's like, okay, well, who the heck is Dharmesh, and why should we adopt a standard he created?
swyx [00:33:07]: Yeah, I mean, there are people who don't monetize standards, like OpenTelemetry is a big standard, and Lightstep never capitalized on that.
Dharmesh [00:33:15]: So, okay, so if I were to do a standard, there's two things that have been in my head in the past. One was a very, very basic one around, I don't even have the domain, I have a domain for everything, for open marketing. Because the issue we had, HubSpot grew up in the marketing space. There we go. There was no standard around data formats and things like that. It doesn't go anywhere. But the other one, and I did not mean to go here, but I'm going to go here. It's called OpenGraph. I know the term was already taken, but it hasn't been used for like 15 years now for its original purpose. But what I think should exist in the world is right now, our information, all of us, nodes are in the social graph at Meta or the professional graph at LinkedIn. Both of which are actually relatively closed in actually very annoying ways. Like very, very closed, right? Especially LinkedIn. Especially LinkedIn. I personally believe that if it's my data, and if I would get utility out of it being open, I should be able to make my data open or publish it in whatever forms that I choose, as long as I have control over it as opt-in. So the idea is around OpenGraph that says, here's a standard, here's a way to publish it. I should be able to go to OpenGraph.org slash Dharmesh dot JSON and get it back. And it's like, here's your stuff, right? 
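The opt-in publishing idea can be sketched as a filter over a personal profile before it is serialized to JSON. No such standard exists in the conversation beyond the idea itself, so every field name, the `publish_profile` helper, and the sample values below are hypothetical.

```python
import json


def publish_profile(profile, opted_in):
    """Serialize only the fields the owner has explicitly opted in to publishing."""
    public = {key: value for key, value in profile.items() if key in opted_in}
    return json.dumps(public, indent=2, sort_keys=True)


# Hypothetical profile; nothing is public unless it appears in `opted_in`.
profile = {
    "name": "Example Person",
    "headline": "Builds things",
    "connections": ["node-a", "node-b"],
    "email": "person@example.com",
}
opted_in = {"name", "headline", "connections"}

document = publish_profile(profile, opted_in)
```

The default here is closed: a field is left out unless the owner lists it, which matches the "as long as I have control over it as opt-in" condition.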
And I can choose along the way and people can write to it and I can approve. And there can be an entire system. And if I were to do that, I would do it as a... Like a public benefit, non-profit-y kind of thing, as this is a contribution to society. I wouldn't try to commercialize that. Have you looked at AT Proto? What's that? AT Proto.
swyx [00:34:43]: It's the protocol behind Bluesky. Okay. My good friend, Dan Abramov, who was the face of React for many, many years, now works there. And he actually did a talk that I can send you, which basically kind of tries to articulate what you just said. But he does, he loves doing these like really great analogies, which I think you'll like. Like, you know, a lot of our data is behind a handle, behind a domain. Yep. So he's like, all right, what if we flip that? What if it was like our handle and then the domain? Yep. So, and that's really like your data should belong to you. Yep. And I should not have to wait 30 days for my Twitter data to export. Yep.
Dharmesh [00:35:19]: You should be able to at least be able to automate it or do like, yes, I should be able to plug it into an agentic thing. Yeah. Yes. I think we're... Because so much of our data is... Locked up. I think the trick here isn't that standard. It is getting the normies to care.
swyx [00:35:37]: Yeah. Because normies don't care.
Dharmesh [00:35:38]: That's true. But building on that, normies don't care. So, you know, privacy is a really hot topic and an easy word to use, but it's not a binary thing. Like there are use cases where, and we make these choices all the time, that I will trade, not all privacy, but I will trade some privacy for some productivity gain or some benefit to me that says, oh, I don't care about that particular data being online if it gives me this in return, or I don't mind sharing this information with this company.
Alessio [00:36:02]: If I'm getting, you know, this in return, but that sort of should be my option. 
I think now with computer use, you can actually automate some of the exports. Yes. Like something we've been doing internally is like everybody exports their LinkedIn connections. Yep. And then internally, we kind of merge them together to see how we can connect our companies to customers or things like that.Dharmesh [00:36:21]: And not to pick on LinkedIn, but since we're talking about it, but they feel strongly enough on the, you know, do not take LinkedIn data that they will block even browser use kind of things or whatever. They go to great, great lengths, even to see patterns of usage. And it says, oh, there's no way you could have, you know, gotten that particular thing or whatever without, and it's, so it's, there's...swyx [00:36:42]: Wasn't there a Supreme Court case that they lost? Yeah.Dharmesh [00:36:45]: So the one they lost was around someone that was scraping public data that was on the public internet. And that particular company had not signed any terms of service or whatever. It's like, oh, I'm just taking data that's on, there was no, and so that's why they won. But now, you know, the question is around, can LinkedIn... I think they can. Like, when you use, as a user, you use LinkedIn, you are signing up for their terms of service. And if they say, well, this kind of use of your LinkedIn account that violates our terms of service, they can shut your account down, right? They can. And they, yeah, so, you know, we don't need to make this a discussion. By the way, I love the company, don't get me wrong. I'm an avid user of the product. You know, I've got... Yeah, I mean, you've got over a million followers on LinkedIn, I think. Yeah, I do. And I've known people there for a long, long time, right? And I have lots of respect. And I understand even where the mindset originally came from of this kind of members-first approach to, you know, a privacy-first. I sort of get that. 
But sometimes you sort of have to wonder, it's like, okay, well, that was 15, 20 years ago. There's likely some controlled ways to expose some data on some member's behalf and not just completely be a binary. It's like, no, thou shalt not have the data.swyx [00:37:54]: Well, just pay for sales navigator.Alessio [00:37:57]: Before we move to the next layer of instruction, anything else on MCP you mentioned? Let's move back and then I'll tie it back to MCPs.Dharmesh [00:38:05]: So I think the... Open this with agent. Okay, so I'll start with... Here's my kind of running thesis, is that as AI and agents evolve, which they're doing very, very quickly, we're going to look at them more and more. I don't like to anthropomorphize. We'll talk about why this is not that. Less as just like raw tools and more like teammates. They'll still be software. They should self-disclose as being software. I'm totally cool with that. But I think what's going to happen is that in the same way you might collaborate with a team member on Slack or Teams or whatever you use, you can imagine a series of agents that do specific things just like a team member might do, that you can delegate things to. You can collaborate. You can say, hey, can you take a look at this? Can you proofread that? Can you try this? You can... Whatever it happens to be. So I think it is... I will go so far as to say it's inevitable that we're going to have hybrid teams someday. And what I mean by hybrid teams... So back in the day, hybrid teams were, oh, well, you have some full-time employees and some contractors. Then it was like hybrid teams are some people that are in the office and some that are remote. That's the kind of form of hybrid. The next form of hybrid is like the carbon-based life forms and agents and AI and some form of software. So let's say we temporarily stipulate that I'm right about that over some time horizon that eventually we're going to have these kind of digitally hybrid teams. 
So if that's true, then the question you sort of ask yourself is that then what needs to exist in order for us to get the full value of that new model? It's like, okay, well... You sort of need to... It's like, okay, well, how do I... If I'm building a digital team, like, how do I... Just in the same way, if I'm interviewing for an engineer or a designer or a PM, whatever, it's like, well, that's why we have professional networks, right? It's like, oh, they have a presence on likely LinkedIn. I can go through that semi-structured, structured form, and I can see the experience of whatever, you know, self-disclosed. But, okay, well, agents are going to need that someday. And so I'm like, okay, well, this seems like a thread that's worth pulling on. That says, okay. So I... So agent.ai is out there. And it's LinkedIn for agents. It's LinkedIn for agents. It's a professional network for agents. And the more I pull on that thread, it's like, okay, well, if that's true, like, what happens, right? It's like, oh, well, they have a profile just like anyone else, just like a human would. It's going to be a graph underneath, just like a professional network would be. It's just that... And you can have its, you know, connections and follows, and agents should be able to post. That's maybe how they do release notes. Like, oh, I have this new version. Whatever they decide to post, it should just be able to... Behave as a node on the network of a professional network. As it turns out, the more I think about that and pull on that thread, the more and more things, like, start to make sense to me. So it may be more than just a pure professional network. So my original thought was, okay, well, it's a professional network and agents as they exist out there, which I think there's going to be more and more of, will kind of exist on this network and have the profile. 
But then, and this is always dangerous, I'm like, okay, I want to see a world where thousands of agents are out there in order for the... Because those digital employees, the digital workers don't exist yet in any meaningful way. And so then I'm like, oh, can I make that easier for, like... And so I have, as one does, it's like, oh, I'll build a low-code platform for building agents. How hard could that be, right? Like, very hard, as it turns out. But it's been fun. So now, agent.ai has 1.3 million users. 3,000 people have actually, you know, built some variation of an agent, sometimes just for their own personal productivity. About 1,000 of which have been published. And the reason this comes back to MCP for me, so imagine that and other networks, since I know agent.ai. So right now, we have an MCP server for agent.ai that exposes all the internally built agents that we have that do, like, super useful things. Like, you know, I have access to a Twitter API that I can subsidize the cost. And I can say, you know, if you're looking to build something for social media, these kinds of things, with a single API key, and it's all completely free right now, I'm funding it. That's a useful way for it to work. And then we have a developer to say, oh, I have this idea. I don't have to worry about open AI. I don't have to worry about, now, you know, this particular model is better. It has access to all the models with one key. And we proxy it kind of behind the scenes. And then expose it. So then we get this kind of community effect, right? That says, oh, well, someone else may have built an agent to do X. Like, I have an agent right now that I built for myself to do domain valuation for website domains because I'm obsessed with domains, right? And, like, there's no efficient market for domains. There's no Zillow for domains right now that tells you, oh, here are what houses in your neighborhood sold for. It's like, well, why doesn't that exist? 
We should be able to solve that problem. And, yes, you're still guessing. Fine. There should be some simple heuristic. So I built that. It's like, okay, well, let me go look for past transactions. You say, okay, I'm going to type in agent.ai, agent.com, whatever domain. What's it actually worth? I'm looking at buying it. It can go and say, oh, which is what it does. It's like, I'm going to go look at are there any published domain transactions recently that are similar, either use the same word, same top-level domain, whatever it is. And it comes back with an approximate value, and it comes back with its kind of rationale for why it picked the value and comparable transactions. Oh, by the way, this domain sold for published. Okay. So that agent now, let's say, existed on the web, on agent.ai. Then imagine someone else says, oh, you know, I want to build a brand-building agent for startups and entrepreneurs to come up with names for their startup. Like a common problem, every startup is like, ah, I don't know what to call it. And so they type in five random words that kind of define whatever their startup is. And you can do all manner of things, one of which is like, oh, well, I need to find the domain for it. What are possible choices? Now it's like, okay, well, it would be nice to know if there's an aftermarket price for it, if it's listed for sale. Awesome. Then imagine calling this valuation agent. It's like, okay, well, I want to find where the arbitrage is, where the agent valuation tool says this thing is worth $25,000. It's listed on GoDaddy for $5,000. It's close enough. Let's go do that. Right? And that's a kind of composition use case that in my future state. Thousands of agents on the network, all discoverable through something like MCP. And then you as a developer of agents have access to all these kind of Lego building blocks based on what you're trying to solve. 
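Editor's sketch: the composition use case just described (a naming agent calling a valuation agent to spot underpriced listings) can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions, not how agent.ai implements it: the `Sale` record, the "same word or same top-level domain" comparable rule, the median as the "simple heuristic", the `min_ratio` threshold, and all sample data are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Sale:
    """One published domain transaction (hypothetical record shape)."""
    domain: str
    price_usd: float


def comparable_sales(domain: str, past_sales: list[Sale]) -> list[Sale]:
    """Past sales that share the same word or the same top-level domain."""
    word, _, tld = domain.partition(".")
    return [
        s for s in past_sales
        if s.domain.partition(".")[0] == word or s.domain.partition(".")[2] == tld
    ]


def estimate_value(domain: str, past_sales: list[Sale]) -> tuple[float | None, list[Sale]]:
    """Return (median comparable price, the comparables used as the rationale)."""
    comps = comparable_sales(domain, past_sales)
    if not comps:
        return None, []
    return median(s.price_usd for s in comps), comps


def find_arbitrage(listings: dict[str, float], past_sales: list[Sale],
                   min_ratio: float = 2.0) -> list[str]:
    """Listed domains whose estimated value is at least min_ratio times the asking price."""
    bargains = []
    for domain, asking in listings.items():
        value, _ = estimate_value(domain, past_sales)
        if value is not None and value >= min_ratio * asking:
            bargains.append(domain)
    return bargains


# Made-up data for illustration only.
PAST_SALES = [Sale("agent.com", 30000.0), Sale("agent.io", 20000.0), Sale("widget.ai", 25000.0)]
LISTINGS = {"agent.ai": 5000.0, "zzz.xyz": 100.0}

if __name__ == "__main__":
    print(estimate_value("agent.ai", PAST_SALES)[0])
    print(find_arbitrage(LISTINGS, PAST_SALES))
```

In the scenario from the conversation, `find_arbitrage` plays the role of the naming agent and `estimate_value` the valuation agent it calls; on a real network that call would go over the agent's API or MCP rather than a local function.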
Then you blend in orchestration, which is getting better and better with the reasoning models now. Just describe the problem that you have. Now, the next layer that we're all contending with is that how many tools can you actually give an LLM before the LLM breaks? That number used to be like 15 or 20 before you kind of started to vary dramatically. And so that's the thing I'm thinking about now. It's like, okay, if I want to... If I want to expose 1,000 of these agents to a given LLM, obviously I can't give it all 1,000. Is there some intermediate layer that says, based on your prompt, I'm going to make a best guess at which agents might be able to be helpful for this particular thing? Yeah.
Alessio [00:44:37]: Yeah, like RAG for tools. Yep. I did build the Latent Space Researcher on agent.ai. Okay. Nice. Yeah, that seems like, you know, then there's going to be a Latent Space Scheduler. And then once I schedule a research, you know, and you build all of these things. By the way, my apologies for the user experience. You realize I'm an engineer. It's pretty good.
swyx [00:44:56]: I think it's a normie-friendly thing. Yeah. That's your magic. HubSpot does the same thing.
Alessio [00:45:01]: Yeah, just to like quickly run through it. You can basically create all these different steps. And these steps are like, you know, static versus like variable-driven things. How did you decide between this kind of like low-code-ish versus doing, you know, low-code with code backend versus like not exposing that at all? Any fun design decisions? Yeah. And this is, I think...
Dharmesh [00:45:22]: I think lots of people are likely sitting in exactly my position right now, coming through the choosing between deterministic. Like if you're like in a business or building, you know, some sort of agentic thing, do you decide to do a deterministic thing? Or do you go non-deterministic and just let the LLM handle it, right, with the reasoning models? 
The original idea and the reason I took the low-code stepwise, a very deterministic approach. A, the reasoning models did not exist at that time. That's thing number one. Thing number two is if you can get... If you know in your head... If you know in your head what the actual steps are to accomplish whatever goal, why would you leave that to chance? There's no upside. There's literally no upside. Just tell me, like, what steps do you need executed? So right now what I'm playing with... So one thing we haven't talked about yet, and people don't talk about UI and agents. Right now, the primary interaction model... Or they don't talk enough about it. I know some people have. But it's like, okay, so we're used to the chatbot back and forth. Fine. I get that. But I think we're going to move to a blend of... Some of those things are going to be synchronous as they are now. But some are going to be... Some are going to be async. It's just going to put it in a queue, just like... And this goes back to my... Man, I talk fast. But I have this... I only have one other speed. It's even faster. So imagine it's like if you're working... So back to my, oh, we're going to have these hybrid digital teams. Like, you would not go to a co-worker and say, I'm going to ask you to do this thing, and then sit there and wait for them to go do it. Like, that's not how the world works. So it's nice to be able to just, like, hand something off to someone. It's like, okay, well, maybe I expect a response in an hour or a day or something like that.Dharmesh [00:46:52]: In terms of when things need to happen. So the UI around agents. So if you look at the output of agent.ai agents right now, they are the simplest possible manifestation of a UI, right? That says, oh, we have inputs of, like, four different types. Like, we've got a dropdown, we've got multi-select, all the things. It's like back in HTML, the original HTML 1.0 days, right? 
Like, you're the smallest possible set of primitives for a UI. And it just says, okay, because we need to collect some information from the user, and then we go do steps and do things. And generate some output in HTML or Markdown are the two primary examples. So the thing I've been asking myself, if I keep going down that path. So people ask me, I get requests all the time. It's like, oh, can you make the UI sort of boring? I need to be able to do this, right? And if I keep pulling on that, it's like, okay, well, now I've built an entire UI builder thing. Where does this end? And so I think the right answer, and this is what I'm going to be back coding once I get done here, is around injecting a code generation, UI generation into the agent.ai flow, right? As a builder, you're like, okay, I'm going to describe the thing that I want, much like you would do in a vibe coding world. But instead of generating the entire app, it's going to generate the UI that exists at some point in either that deterministic flow or something like that. It says, oh, here's the thing I'm trying to do. Go generate the UI for me. And I can go through some iterations. And what I think of it as a, so it's like, I'm going to generate the code, generate the code, tweak it, go through this kind of prompt style, like we do with vibe coding now. And at some point, I'm going to be happy with it. And I'm going to hit save. And that's going to become the action in that particular step. It's like a caching of the generated code so that I don't then, like, incur any inference-time costs. It's just the actual code at that point.
Alessio [00:48:29]: Yeah, I invested in a company called E2B, which does code sandbox. And they powered the LMArena WebDev Arena. So it's basically the, just like you do LLMs, like text to text, they do the same for like UI generation. So if you're asking a model, how do you do it? But yeah, I think that's kind of where.
Dharmesh [00:48:45]: That's the thing I'm really fascinated by. 
So the early LLMs, you know, were understandably, but laughably bad at simple arithmetic, right? That's the thing like my wife, Normies would ask us, like, you call this AI, like it can't, my son would be like, it's just stupid. It can't even do like simple arithmetic. And then like we've discovered over time that, and there's a reason for this, right? It's like, it's a large, there's, you know, the word language is in there for a reason in terms of what it's been trained on. It's not meant to do math, but now it's like, okay, well, the fact that it has access to a Python interpreter that I can actually call at runtime, that solves an entire body of problems that it wasn't trained to do. And it's basically a form of delegation. And so the thought that's kind of rattling around in my head is that that's great. So it's, it's like took the arithmetic problem and took it first. Now, like anything that's solvable through a relatively concrete Python program, it's able to do a bunch of things that I couldn't do before. Can we get to the same place with UI? I don't know what the future of UI looks like in an agentic AI world, but maybe let the LLM handle it, but not in the classic sense. Maybe it generates it on the fly, or maybe we go through some iterations and hit cache or something like that. So it's a little bit more predictable. Uh, I don't know, but yeah.
Alessio [00:49:48]: And especially when is the human supposed to intervene? So, especially if you're composing them, most of them should not have a UI because then they're just web hooking to somewhere else. I just want to touch back. I don't know if you have more comments on this.
swyx [00:50:01]: I was just going to ask when you, you said you got, you're going to go back to code. What are you coding with? What's your stack? Yep.
Dharmesh [00:50:06]: Uh, so Python's my language. 
Uh, I'm glad that it won in terms of the AI, uh, languages, lingua franca.
swyx [00:50:12]: It's the second best language for everything.
Dharmesh [00:50:13]: And by the way, there, I think exactly n of one of things that I disagree with Bret Taylor on, uh, when, when he was on, and just generally, I'm a massive Bret Taylor fan, uh, smart. One of my favorite people in tech, like it was like a segment in there. He was talking about like, oh, we need a, a different language than Python or whatever. That is like built for, uh, built for AI and built. It's like, no, Bret, I don't think we do actually, it's just fine. Um, it deals with just fine, just expressive enough. And it's nice to have a language that we can use as a common denominator across both humans and AI. It's, it doesn't slow the AI down enough, but it does make it awfully useful for us to also be able to participate in that kind of future world, uh, that we can still be somewhat useful.
swyx [00:50:53]: I mean, but yeah, so it's, uh, Python, uh, Cursor as my, uh, kind of code gen thing. Yeah. I would also mention that I really like your code generation thing. I have another thesis I haven't written up yet about how generative UI has kind of not fulfilled its full potential. We've seen the Bolts and Lovables and those are great. And then Vercel has a version of generative UI that is basically function calling pre-made components. And there's something in between where you should be able to generate the UI that you want and pin it and stick to it. And that becomes your form or yeah. And so the way I put it is, um, you know, I think that the two form factors of agents that I've seen a lot of product market fit recently has been deep research and the AI builders, like the Bolts, Lovables. I think there's some version of this where you generate the UI, but you sort of generate the Mad Libs fill in the blanks forms, and then you, you, you keep that stable. And the deep research just fills that in. Yeah. 
Yep. And that's it. I like that.
Dharmesh [00:51:49]: Yeah. Um, so I, I, I love those, uh, kind of simple, uh, simple limitations and kind of abstractions, but then if you look at the kind of, I'll say almost like the polar opposite of that. So, so right now, most of the UIs that you and I think about or conceive, or even examples are based on the primitives and the vocabulary that we have for UI right now. It's like, oh, we have text boxes. We have check boxes. We have radio buttons. We have pulldowns. We have nav. We have clicks, touches, swipes, now voice, whatever it is, the set of primitives that exist right now, we will combine them in, uh, in interesting ways, but where I think AI is going to be headed on, I think on the UI front is the same place it's headed on the science front that originally it's like, oh, well, based on the things that we know right now, it'll sort of combine them, but we're like right at the cusp of it being able to do actual novel research. So maybe a future version of AI comes up with a new set of primitives that actually work better for human computer interaction than things that we've done in the past, right? It's like, I don't, I don't think it's, it ended with the, uh, the checkbox, radio button and dropdown list. Right. I think there's life beyond that.
Alessio [00:52:44]: Uh, yeah, I know we're going to move to business models after, but when you talked about hybrid teams, one way we talk to folks about it is like you had offshoring, then onshoring, which is like, you know, move to cheaper place in the country than offshoring. You know, it's like AI shoring. Yep. You're kind of moving some roles. That's the thing people say. Yeah. Shoring. Yeah.
Dharmesh [00:53:01]: That's the first time I've ever heard of that. Yeah. Yeah.
Alessio [00:53:04]: I don't know, man. But I think to me, the most interesting thing about the professional networks is like with people, you have limited availability to evaluate a person. Yeah. 
So you have to use previous signal as kind of like an evaluation thing. With agents, theoretically, you can have kind of like proof of work. Yeah. You know, you can run simulations and like evaluate them in that way. Yep. How do you think about that when running, building agent.ai even? It's like, you know, instead of just choosing one, I could like literally just run across all of them and figure out which one is going to work best.
Dharmesh [00:53:32]: I'm a big believer. So under the covers, when you build, because the primitives are so simple, you have some sort of inputs. We know what the variables are. Every agent that's on agent.ai automatically has a REST API. That's callable in exactly the way you would expect. Automatically shows up in the MCP server, so you're able to invoke it in whatever form you decide to. And so my expectation is that in this future state, whether it's a human hiring an agent to do a particular task or evaluating a set of five agents to do a particular task and picking the best one for their particular use case, we should be able to do that. It's like, I just want to try it, and there should be a policy that the publisher or builder of the agent has that says, okay, well, I'm going to let you call me 50 times, 100 times before you have to pay or something like that. We should have effectively like an audit trail, like, okay, this agent has been called this many times. We also have kind of human ratings and reviews right now, and we have tens of thousands of reviews of the existing agents on agent.ai. Average is like 4.1 out of five stars. And all those things are nice signals to be able to have. But the kind of callable... Verifiable kind of thing, I think, is super useful. Like, if I can just call... Give me an API that says here are five agents and it solves this particular problem for me. If I have like a simple eval, I think that'd be so powerful. I wish I had that for humans, honestly. 
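Editor's sketch: the "here are five agents, run my simple eval, pick the best" idea can be shown with a tiny harness. This is an illustration under stated assumptions, not agent.ai's API: agents are modeled as plain Python callables standing in for REST or MCP calls, the eval is exact-match scoring, and the three toy agents and the test cases are invented.

```python
from __future__ import annotations

from typing import Callable

# An agent is modeled as "text in, text out"; a real one would be a remote call.
Agent = Callable[[str], str]


def score_agent(agent: Agent, cases: list[tuple[str, str]]) -> float:
    """Fraction of eval cases where the agent's answer matches exactly."""
    passed = sum(
        1 for prompt, expected in cases
        if agent(prompt).strip().lower() == expected.lower()
    )
    return passed / len(cases)


def pick_best(agents: dict[str, Agent], cases: list[tuple[str, str]]) -> tuple[str, dict[str, float]]:
    """Score every candidate on the same cases and return (best name, all scores)."""
    scores = {name: score_agent(agent, cases) for name, agent in agents.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical candidates for one task: "return the top-level domain of a hostname".
AGENTS: dict[str, Agent] = {
    "suffix_agent": lambda d: d.rsplit(".", 1)[-1],
    "second_label_agent": lambda d: d.split(".")[1],
    "constant_agent": lambda d: "com",
}

# A simple eval: (input, expected output) pairs drawn from your own use case.
CASES = [("agent.ai", "ai"), ("example.com", "com"), ("shop.example.io", "io")]

if __name__ == "__main__":
    best, scores = pick_best(AGENTS, CASES)
    print(best, scores)
```

The free-trial policy mentioned in the conversation (50 or 100 calls before paying) would sit in front of each agent call; it is left out here to keep the sketch short.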
That'd be so cool.
Alessio [00:54:47]: Yeah, because, I mean, when I was running engineering teams, people would try and come up with these rubrics, you know, when hiring. And it's like, they're not really helpful, but you just kind of need some ground truth. But I feel like now, say you want to hire, yeah, an AI software engineer. Yep. You can literally generate like 15, 20 examples of like your actual issues in your organization, both from a people perspective of like collaboration and like actual code generation. Yep. And just pay for it to run it. Yeah. Like today we do take home projects and we pay people. Sure. Like this should be kind of the same thing. Yeah. It's like, I'll just run you. But I feel like people are not investing in their own evals as much internally.
Dharmesh [00:55:22]: I mean, that's the present company included, right? Everyone talks about evals. Everyone accepts the fact that we should be doing more with evals. I won't say nobody, but almost nobody actually does. That's the... And yeah, it's a topic for a whole other day. I'm not...
swyx [00:55:36]: It's funny, I mean, because obviously HubSpot is famous for launching graders of things. Yes. You'd be perfect for it. Yeah. Somehow. I agree on evals, by the way. I mean, I just force myself to be the human in the loop or, you know, someone I work with and that's okay. But obviously the scalable thing needs to be done. Just a fun fact on, or question on agent.ai, you famously, you've already talked about the chat.com acquisition and all that. Yeah. And that was around the time of custom GPTs and the GPT store launching. Yes. And I definitely feel agent.ai is kind of the GPT store, but not taken seriously. Yeah. Do you feel OpenAI, if like they woke up one day and they were like, agent.ai is the thing, like we should just reinvest in GPT store instead of fear?
Dharmesh [00:56:20]: I think that won't be agent.ai driven. 
It's an inevitability that OpenAI, I don't have any insider information, I'm an investor, but no inside information, is because it makes too much money. It makes too much sense for them not to like, and they, they've taken multiple passes at it, right? They did the plugins back in the day, then the custom GPTs and the GPT store because, you know, being the platform that they are, I think it's inevitable that they will ultimately come up with, and they already have custom, it's going to happen. On the list of things I promised myself I would never do is compete with Sam Altman ever, not intentionally anyway. But here you are. But yeah, here I am.
swyx [00:56:58]: But I'm not really, right? Not really. It's free, so like, whatever. But, you know, at some point, if it's actually valuable.
Dharmesh [00:57:06]: They're solving a much, much bigger problem. I'm like a small, tiny rounding error in the universe. But the reason that compelled me to actually create in the first place, because I knew custom GPTs existed, I did have this rule in my head that don't compete with Sam. He's literally like at the top of my list of people not to compete with. He's so good. But the thing that I needed in terms of for my own personal use, which is how agent.ai got started, because I was building a bunch of what I call solo software. Things for my own personal productivity gain. And I found myself doing more and more kind of LLM-driven stuff because it was better that way. You know, I sort of showed up in those solo projects a bunch. And so the thing I needed was an underlying framework to kind of build these things. And high on the list was I want to be able to straddle models because certain steps in the thing is like, oh, for this particular thing involves writing. So maybe I want to use Claude for this particular thing. Maybe I want to do this even around image generation, different types of whether. It has texture, doesn't have texture, whatever. 
And I want to be able to mix and match. And my sense is that whether it's OpenAI or Anthropic or whatever, they're likely going to have an affinity for their own models, right? Which makes sense for them. But I can sort of be, for my own purposes and for our user base, a little bit of the Switzerland. It's like we don't think there's like one model to rule them all based on your use case. You're going to want to mix and match and maybe even change them out. Maybe even test them back to the kind of eval idea. It's like I have this agentic workflow. And here's the thing that we've been playing with recently. Because we have. We have enough users now where they, like the LLM, and I look at the bills and it's like, oh, I'm spending real money now. And this is just human nature, right? It's not just normies, but it's like, so you have this drop down of all the models that you can say, which model do you want to use in your agent.ai agent? And as it turns out, people pick the largest number. So they will pick 4.5 or whatever, whatever it is, right? It's like it's.
swyx [00:58:55]: Oh my God, you're doing 4.5? Yes.
Dharmesh [00:58:57]: Ouch. Yes. Yeah. But the thing I've promised myself is we will support all of them, regardless of what it costs. And like, once again, I see this as a just a research thing, you know, benefit to humanity and inference costs are going down. At least so I tell myself late at night so I can sleep. So they pick the highest numbered one. And so we have an option in there right now that says, which is the first option. It's like, let the system pick for me. Auto-optimize. Yeah. As it turns out, people don't do that. They just pick the, because they don't trust it yet, which is fine. They shouldn't trust it completely. But one thing we discovered is that if we back channel it, and this is the thing we're testing with, is that, oh, if I can just run the exact same agent that gets run a thousand times, we'll do it on our own internal agents first. 
And if the ratings and reviews, because we're getting human evals all the time on these agents, we can get a dramatic multiple orders of magnitude reduction by going to a lower model with literally like no change in the quality of the output. Right. Which makes sense. Because so many of the things we're doing don't require the most powerful model. And it's actually bad because there is higher latency. It's not just a cost thing. But so anyway, like in that kind of future state, I think we're going to have model routing and a whole body of people working on that problem, too. It's like, help me pick the best model at runtime. Would you buy or build model routing? I buy everything that I can buy. I don't want to build anything if I don't have to.
swyx [01:00:26]: One of the most impressive examples of this, I think, was our Chai AI conversation, which I think about a lot. He views himself explicitly as a marketplace. You are kind of a marketplace, but he has a third angle, which is the model providers, and he lets them compete. And I think that sort of Chai three-way marketplace maybe makes a lot of sense. Like, I don't know why every AI company isn't built that way. It's a good point, actually.
Dharmesh [01:00:48]: Yeah, it makes sense. I have a list of things I'm super passionate about. I'm very passionate about efficient markets or extremely irritated by inefficient markets. And so efficient markets, for the normies listening, are markets that exist where every possible efficient markets are the ones that every transaction that should occur actually does. That's an efficient market that should happen. And so then why do inefficient markets exist? Well, maybe the buyer and seller don't know about each other. Maybe there's not enough of a trust mechanism. There's no way to actually price that or come up with fair market value for fair pricing. And as you kind of knock those dominoes down, the market becomes more and more efficient. 
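Editor's sketch: the model-downgrade test described in this answer (replay the same agent on cheaper models, then compare human ratings) reduces to choosing the cheapest model whose average rating stays close to the baseline. The following is a minimal illustration, not agent.ai's routing logic: the model names, costs, ratings, and the `tolerance` threshold are all made up.

```python
from __future__ import annotations

from statistics import mean


def cheapest_acceptable(
    ratings_by_model: dict[str, list[float]],
    cost_per_run: dict[str, float],
    baseline: str,
    tolerance: float = 0.1,
) -> str:
    """Cheapest model whose mean rating is within `tolerance` stars of the baseline's."""
    floor = mean(ratings_by_model[baseline]) - tolerance
    acceptable = [
        model for model, ratings in ratings_by_model.items()
        if ratings and mean(ratings) >= floor
    ]
    # The baseline always qualifies, so `acceptable` is never empty.
    return min(acceptable, key=lambda model: cost_per_run[model])


# Hypothetical star ratings collected after replaying one agent on three model tiers.
RATINGS = {
    "large": [4, 5, 4, 4],
    "medium": [4, 4, 5, 4],
    "small": [3, 4, 3, 3],
}
# Hypothetical relative cost per run.
COSTS = {"large": 1.0, "medium": 0.1, "small": 0.01}

if __name__ == "__main__":
    print(cheapest_acceptable(RATINGS, COSTS, baseline="large"))
```

With real traffic you would want far more ratings per model than this toy data before trusting the comparison, and latency could be added as a second criterion alongside cost.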
And lots of latent value exists as a result of inefficiency. And whoever removes those inefficiencies. Yeah. And then the market recedes for high value markets makes a lot of money. That's been proven time and time again. This is one of those examples of there's an inefficiency right now because we are either over using over models or whatever. Let's just reduce that to an efficient market. The right model should be matched up with the right use case for the right price. And then we'll... Very interesting. You ever looked into DSPy? I have looked at it. Not deeply enough, though.
swyx [01:01:48]: It's supposed to be, as far as I think, the only evals-first framework. Yep. And if evals are so important. And by the way, the relationship between this and all that is DSPy would also help you optimize your models. Yep. Because you did the evals first. Yep. I wonder why it's not as popular, you know. But I mean, it is growing in traction, I would say. We're keeping an eye on it.
Alessio [01:02:09]: Let's talk about business models. Obviously, you have kind of two, work as a service and results as a service. Yep. I'm curious how you divide the two. Yeah.
Dharmesh [01:02:19]: So work as a service is... So we know about software as a service, right? So I'm licensing software that's delivered to me as a service. That's been around for decades now. So we understand that. But the consumer of that service is generally a human that's doing the actual work, whichever software you're buying. Work as a service is the software is actually doing the work, whatever that work happens to be. And so that's work as a service. So I'll come up with kind of discrete use cases, whether it's kind of classification or legal contract review or whatever the software is actually doing the thing. Results as a service is you're actually charging for the outcome, not actually the work, right? 
That says, okay, instead of saying, I'm going to pay you X amount of dollars to review a legal contract or this amount of time or number of uses or something like that, I'm going to actually pay you for the actual result, which is... So my take on this in the industry or the parts of the industry are super excited about this kind of results as a service or outcomes-based pricing. And I think the reason for that, I think we're over-indexing on it. And the reason we're over-indexing on it is the most popular use case on the kind of agent side right now is like customer support. Well-documented. A lot of the providers that have agents for customer support do it on a number of tickets resolved times X dollars per ticket. And the reason that that makes a lot of sense is that the customer support departments and teams sort of already have a sense for what a ticket costs to resolve through their kind of current way. And so you can come up with an approximation for A, what the kind of economic value is. There's also at least a semi-objective measure for what an acceptable value is. And that's what an acceptable resolution or outcome is, right? Like you can say, oh, well, we measured the net promoter score or CSAT for tickets or whatever. As long as the customers, 90% of the tickets were handled in a way the customer was happy. That's whatever your kind of line is. As long as the AI is able to kind of replicate that same SLA, it's like, okay, well, it's the same. They're fungible, one versus the other. I think the reason we're over-indexed, though, is that there are not that many use cases that have those two dimensions to them that are objectively measurable. And that there's a known economic value that's constant. Like, customer support tickets, because they're handled by humans, make sense. And humans have a discrete cost. 
And especially in retail, which is where this originally got started in B2C companies that have a high volume of customer support tickets that they're distributing across, a ticket is roughly worth the same because it takes the same amount of time for most humans to do that kind of level one, tier one support. But in other things, the value per outcome can vary dramatically, literally by orders of magnitude, in terms of what the thing is actually worth. That's kind of thing number one. Thing number two is, how do you objectively evaluate that? How do you measure? So let's say you're going to do a logo creator as a service based on results, right? And that's a completely opposite subjective thing or whatever. And so, okay, well, it may take me 100 iterations. It may take me five iterations. The quality of the output is actually not completely under my control. It's not up to the software. It could be you have weird taste or you didn't describe what you're looking for enough or whatever. It's like it was just not a solvable problem. Design kind of qualitative, subjective disciplines deal with this all the time. How do you make for a happy customer? There's a reason why they have, oh, we'll go through five iterations. But our output is we're going to charge you $5,000 or $500 or whatever it is for this logo. But that's hard, right, to kind of do at scale.
swyx [01:05:29]: Just a relatable anecdote. Our podcast, actually, we just got a new logo. And we did 99designs for it. And there are so many designers who are working really hard. But I just didn't know what I wanted. So I was just too bad. You seem great, but you know.
Dharmesh [01:05:48]: That's another example of a market made efficient, right? Yeah. It's like I've been a 99designs user and customer for a dozen plus years now.
swyx [01:05:55]: It's fantastic. Yeah. So many designers, like this doesn't cost that much for them to do. It's worth a lot to us. We can't design for s**t. Totally. Yeah. 
Yep.Dharmesh [01:06:04]: By the way, pro tip on 99designs is that on the margin, you're better off kind of committing to paying the designer that you're going to pick a winner. Whether you like it or not doesn't really matter. And that gets higher participation. And you're still going to get a bunch of crap that happens. You get a bunch of noise in it. But the kind of quality outcome is often a function of the number of iterations. And logo design is one of those examples. If you had to choose between 200 logos versus 20 logos, chances are closer that you're going to find something you like. Yeah.swyx [01:06:33]: For those interested, I have a blog post on my reflections on the 99designs thing. And that's one of those. They give an estimate of how many designs you get. Yep. And I think that the modifier for like, we will pay you, we'll pay somebody and maybe it's you, is like 30 to 60. But actually it's 200. Yep. So it's underpriced. Yep.Alessio [01:06:51]: Yep. Do you think some markets are just fundamentally going to move to more results-driven business models? Probably.Dharmesh [01:06:59]: And I don't understand enough markets well enough to know. But if we had to kind of sort or rank them, there's likely some dimension along which we could sort that. It's like, oh, these kinds of businesses, is there an objective measure of kind of truth or the outcome? Is there a way to kind of price it in terms of the low variance or variability on the value? If those things are true, whatever industries that is true in, customer support is an example, but there's likely lots of other examples where those two things are true. But then the thing I wonder, though, is that from the customer's perspective, would they rather actually pay for work as a service versus an actual, it's like maybe the way they think about it is that's sort of my arbitrage opportunity. Like I can get work done for X, but the value is actually Y. 
Why would I want that delta to be squozed out by the kind of provider of the software if I have a choice? I don't know. Oh, I mean, okay.swyx [01:07:51]: Attribution. There's 18 things that go into them. You're one of them. So it's hard to tell. Yes, it is. By the way, have you seen, obviously you're in this industry, not exactly HubSpot's exact part of the market, but what have you seen in attribution that is interesting? Because that directly ties into work as a service versus results. Yeah.Dharmesh [01:08:12]: Not enough because we are so, as a world, as an industry, just pick your thing. So behind. Yeah. This is why I think Web3 in the way that it was meant to be done is going to make a comeback because fundamental principles of that makes sense. I think what happened in that world was kind of a bunch of crypto bros and grifters and NFT stuff or whatever that was loosely related. There was no actual, but the idea of a blockchain, of a trackable thing, of being able to fractionalize digital assets, attribution, having an audit log, a published thing that's verifiable. All those primitives make sense, right? And maybe there's a limited, but it's not zero, set of use cases where the kind of what we would now call like the inference cost or the overhead, the tax for storing data on the blockchain. And there's certainly a tax to it. It doesn't make sense for all things, but it makes sense for some things for sure. But we just don't have attribution in any meaningful way, I don't think. Isn't it sad that it's so important and no answer? I know. It partly comes down to incentives. Yeah. So people that actually have the data or parts of the data from which attribution could be calculated or derived don't really have the incentives to make that data available. So even something as simple like on the PPC side, right, on the Google search thing, that's sort of my world or has been. 
We have less data now than we did back in the day in terms of like click-throughs and things like that. Before, Google would actually send you: here are the keywords people typed. And years ago, they even took that away. So it's hard to kind of really connect the dots back on things. And we're seeing that across. It's not just PPC, but just all sorts of things. They took that away from the Search Console. What's that? The Search Console has that. Yes. They took that away. Search Console has that. But your website, if you go to Google Analytics, you can connect it back to the Google Search Console. I see. Yeah. Yes. Okay.
swyx [01:10:00]: All right. Yeah. Well, it's a known thing. You don't have to make it a rant about Google.
Alessio [01:10:06]: What about software engineering? Do you think it will stay as like a work as a service? Or do you think? I think most companies hire a lot of engineers, but they don't really know what to do with them or like they don't really use them productively. Yeah. And I think now they're kind of hitting this like, you know, crisis where it's like, okay, I don't know what I will price an agent because I don't really know what my people are doing anyway. Yeah. Like, how do you think that changes?
Dharmesh [01:10:27]: I think, so I'm actually bullish on engineers in terms of their kind of long-term economic value. Not despite all the movements in codegen and all the things that we're already seeing, but because of it. Because what's going to happen as a result of AI, and people have talked about this in even other disciplines, we're going to be able to solve many more problems. The semi-math guy in me is like, okay, so we always say, oh, well, now agents are going to be doing code or whatever. And so there's going to be a million software engineers, you know, virtual digital software engineers out there. And so the value per engineer is going to go down because I, as an engineer, am just in that same mix.
What they don't recognize is that it's not just about the denominator, there's a numerator as well, which is what's the total economic value that's possible. And I would argue that's growing faster than the kind of denominator is, that the actual economic value that's possible as a result of software and what engineers can produce, you know, with the tools that they will have at hand. So I think the value of an engineer actually goes up. They're going to have the power tools, they're going to be able to solve a larger base of problems that are going to need to be solved. Yeah.
Alessio [01:11:29]: It feels to me like it'll stay as like work as a service. You're paying for work. I don't think there's like a way to do that.
Dharmesh [01:11:34]: And there will be a set of engineers that, and we see this all the time, you know, like in the media industry, you have people that are kind of writers, but then you have freelancers that, you know, write articles or write however they manifest their kind of creative talent. And both make sense, right? There's like the work for hire. There's also the kind of outcome based or like I produce this thing. And maybe some of those engineers actually produce agents. So they put it in a marketplace like agent.ai someday, and that's how they make their millions. Yeah.
Alessio [01:11:58]: Any other thoughts just on agents? We got a lot of like misc things that we want to talk to you about. Miscellaneous.
Dharmesh [01:12:03]: I think we covered a lot of territory. So I'm excited about agents. My kind of message to the world would be, don't be scared. I know it's scary. Easy for me to say as a techno-optimist, but learn it. Even if you're a normie, even if you're not an engineer, if you're not an AI person, you'll think of yourself as an AI person. Use the tools. I don't care what role you have right now, where you are in the workforce.
It will be useful to you and start to get to know agents, use them, build them.swyx [01:12:29]: And I think my message for engineers is always like, there's more to go. Like we're still in the early days of figuring out what an agent's stack looks like. Yeah. And I want to push people towards agents with memory. Yeah. Agents with planning.Dharmesh [01:12:43]: Oh, we have to talk about memory. We got to talk about memory. Let's go. Let's do it. Because I think that's the next, in my mind, the next frontier is actual long-term memory, both for agents and then for agentic networks and a trustable, verifiable, I won't say privacy first, but privacy oriented way. I have an issue with the term privacy first, because a lot of times we say privacy first, when we don't really mean that. Privacy first means I value that above all things. It doesn't matter what we're talking about. And that's just not true, not for any human. Anything that wants to be used. So memory is an interesting thing, right? So the thing I'm working on right now, lots of things in play in agent.ai is around implementation of memory. And there are great projects out there, mem0 being one of them. But the thing that's interesting for me, right, is, and so we see this in ChatGPT and other things right now, where it does have the notion of a longer term memory. You can pull things back into context as needed. The thing I'm fascinated by is cross-agent memory. So if I'm an agent builder right now, it's like, okay, here are the things that I sort of know or I learned from the user in terms of pulling out the, I'll call them knowledge nuggets, for lack of a better term. And that's great. But then when the next agent builder comes out and it's the same user, shouldn't all the things that agent one learned about me, if it's going to be useful for agent two, as long as I opt into it, it's like, yeah, I don't care those things. 
In fact, I would find it awfully annoying to tell agent two and agent n and agent n plus one, all the same things I've already told it, because it should know, like the system should know. And this is part of the reason why I'm like a believer in these kind of networks of agents and shared state is that that user utility gets created as a result of having shared memory. Not just we should solve the memory problem for an independent agent, but then we should also be able to share that context, share that memory across the system. And that's part of the value prop for agent.ai is like, okay, when you're building, it's like, so we've got, you know, whatever million users and we're going to have growing memory about all of them. So instead of you going off on your own thing and building an agent out as this kind of disconnected node in the universe or whatever, here's the value for building on the network or on the platform, ours or someone else's, because there's more user value that gets created. It's more utility.
Alessio [01:14:59]: How do you think about auth for that? Because part of memory is like selective memory. So take like scheduling. Yep. I want you to have access. If I have another scheduling agent, you should be able to access the events you're a part of. Yep. And like what times I have available, but it shouldn't tell you about other events on my calendar. Like what's that like?
Dharmesh [01:15:15]: I have so many thoughts on this. This is like the opportunity out there, like solving these kind of fundamental, like this is going to need to exist, right? So right now the closest approximation we have is OAuth, OAuth 2.0, right? And everyone has, it's like, okay, approve. And it's a very, very coarse set of scopes, right? Like based on the provider of the auth server, be it Google, whoever it is, HubSpot, it doesn't matter. It's like, oh, I pick a set of scopes and they could have defined the scopes to be super granular. Fine. But it's sort of up to them.
But that is going to move so slowly, right? So for instance, the use case I have right now, like I use email for everything. I use it as a, like an event and data bus for my life, right? And what I mean by that, like literally, it's like, anything that I do, if there's a way to kind of get that into email, because I know it's an open protocol, right? It's like, okay, I will be able to get to that data in useful ways. And this is before. So I have 3 million emails that I've built a vector store off of that has solved my own personal use cases. So I'll give you the example, but obviously I'm not going to build all my own software for everything. But if a startup comes along and says, Dharmesh, can you make your email inbox available in exchange for these things? I'm like, hell no. Like that's the, literally my kind of like everything, like my life is in here, right? So you need to share subsets. Yes. And so I think there's a, and maybe this is not the actual implementation, but imagine if someone said, okay, I have a trusted intermediary, trust however defined, that says, okay, I'm going to OAuth into this thing. And it gets to control that. I can say in natural language, I only want to pass email to this provider where the label is one of X or that's within the last thing and no more than 50 emails in a day or whatever. So I don't have them dumping the entire 3 million backlog, whatever controls I want to put on it. It's unlikely that all the OAuth server side right now, the Googles, even the big ones, small ones, doesn't really matter, are going to do that. But this is an opportunity for someone and they're going to need to get to some scale, build some level of trust that says, okay, I'm going to hand over the keys to this intermediary. Yeah. But then it opens up a bunch of utility because it gives me control, more fine-grained control.
swyx [01:17:15]: Yeah. I'd say LangChain has an interesting one.
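The intermediary idea described above (only pass email with certain labels, only recent mail, no more than N emails in a day) can be sketched as a plain filter. This is a hedged illustration, not an existing product or OAuth feature: `Email`, `SharePolicy`, and `emails_to_share` are hypothetical names, the daily cap is applied per calendar day received as a simplifying assumption, and turning a natural-language instruction into a `SharePolicy` is left out entirely.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Email:
    label: str
    received: datetime
    subject: str


@dataclass
class SharePolicy:
    allowed_labels: set   # "where the label is one of X"
    max_age: timedelta    # "within the last ..." window
    max_per_day: int      # "no more than 50 emails in a day"


def emails_to_share(emails, policy, now):
    """Return only the emails the policy permits, newest first."""
    cutoff = now - policy.max_age
    eligible = sorted(
        (e for e in emails
         if e.label in policy.allowed_labels and e.received >= cutoff),
        key=lambda e: e.received,
        reverse=True,
    )
    shared, per_day = [], {}
    for e in eligible:
        day = e.received.date()
        # Enforce the daily cap so a provider cannot pull the whole backlog.
        if per_day.get(day, 0) < policy.max_per_day:
            per_day[day] = per_day.get(day, 0) + 1
            shared.append(e)
    return shared


now = datetime(2025, 3, 1, 12, 0)
inbox = [
    Email("receipts", datetime(2025, 2, 28, 10, 0), "a"),
    Email("receipts", datetime(2025, 2, 28, 11, 0), "b"),
    Email("receipts", datetime(2025, 2, 28, 12, 0), "c"),
    Email("receipts", datetime(2025, 1, 1, 9, 0), "old"),
    Email("personal", datetime(2025, 2, 28, 9, 0), "private"),
]
policy = SharePolicy({"receipts"}, timedelta(days=7), max_per_day=2)
shared = emails_to_share(inbox, policy, now)
```

The point of the design is that the filter runs on the user's side of the OAuth grant, so the downstream provider never sees mail outside the policy regardless of how coarse the upstream scopes are.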
There are a bunch of people who have tried to crack AI email. Every single one of them who has tried has pivoted away. Yep. And I'm waiting for Superhuman to do it. Yep. I don't know why they haven't, but you know, at some point.
Alessio [01:17:29]: They have some cool AI stuff. Yeah. Yeah. I think the pace needs to increase, but I think this goes back to like open graph. Yeah. Right. Which is like, I think Google is not incentivized to build better scopes. Nope. And like, they're just not going to do it. Nope. So.
Dharmesh [01:17:42]: We can't even get like, we haven't been able to get semantic search out of Google for like, still. Not totally. Yeah. Just now they made the announcement this week. What do you mean? Semantic search? In Gmail. Oh, I see. Yeah. So, okay. So they have all the, they have my 3 million emails. Why don't they have a vector store where I can just like basic. Yeah. Yeah.
Dharmesh [01:18:01]: In real time.
swyx [01:18:03]: Like, I don't think my email is that big a deal, but. Yeah. My standard thing on memory is, it sounds like you are using mem0. I am. There's also MemGPT, now Letta, which gave a workshop at my conference. There's Zep, which uses a graph database, just kind of open source, kind of interesting. Yep. And LangMem from LangGraph, which I would highlight. Also, like it's really interesting, this developing philosophy that people seem to be agreeing on, on a hierarchy of memories. Mm-hmm. Mm-hmm. So, from memory to episodic memory to, I think it's just overall sort of background processing. Like, we have independently reinvented that AI should sleep. Yep. To do the deep REM processing of memories. Yep. It's kind of interesting. Yep.
Dharmesh [01:18:43]: Yeah, that is. It's the other, I mean, just on the notion of memory and hierarchies. So, you know, I talked about the memory we're working on right now is at the user level and it's cross agent, right? Yeah.
But the other kind of one step up would be, so once again, going back to this kind of hybrid digital teams. Yeah. Is that you can imagine to say, oh, well, my team has this kind of shared team. I don't want to share with the world or this set of agents across this group of people. I want to have shared state like we would have in a Slack channel or something like that. That should sort of exist as an option, right? Yeah. And the platforms should provide that.
swyx [01:19:15]: And the Bee folks, I should also mention, have mentioned that they're working on that as well. Okay. So, imagine being able to share, you know, selective conversations with people. Like, that's nice. Yeah. Yeah. Limitless has, I guess, voice-based shielding. I don't think they have the action. I'm an investor in that too.
Dharmesh [01:19:32]: Oh, really? Okay. Trying to think about all the things I've said. Investor in OpenAI, Perplexity, LangGraph, CrewAI, Limitless, a bunch of them. So, if I've said anything, by the way, I have no insider knowledge. I have no... I'm not trying to plug or pitch or anything like that. No, no, no.
swyx [01:19:48]: I think it's understood. We're often... Like, you know, if you have skin in the game, you've probably invested or me or me not... I'm not an investor in Bee, but I'm just a friend. And I think you should be able to speak freely of your opinions regardless. Okay, we have some miscellaneous questions that may be zooming out from agent.ai. First of all, you mentioned this and I have to ask, you have so many AI projects you'll never get to. What's one or two that you want other people to work on?
Dharmesh [01:20:15]: Oh, wow.
swyx [01:20:16]: Drop some from your list.
Dharmesh [01:20:18]: Other people to work on. Because you'll never get to it. Yeah, what I need to do is I've had this thought before. So I have this is like maybe like pick one a week or something like that and give the domain away. Like I have people submit their one pager or something like that.
It's like, if you can convince me that you have at least enough of an idea, enough like willingness to kind of commit to actually doing something. It's the ones that you keep mentioning, but you haven't gotten to it for whatever reason. Yep, yep. Traffic, like some of them, I don't have the underlying business model. We're going to have to come back to this, maybe do a follow-up episode. I don't, like they're just not jumping to mine. You don't need the business model, just... Yeah, so I own Scout.ai. I think that's an interesting... By the way, pretty much all of them, there was an idea at the time. It's like it was one of those late night, it's like, oh, I could do this. Is the domain available? And I'll go grab it. I'm trying to think what else I have on the AI space. I have a lot of like non-profit domain names as well for like non-profit like OpenGraph. I'm not sure why things are not jumping to my head. Yeah, I have agent.com, which obviously is tied to agent.ai.swyx [01:21:24]: Oh, that's going to be big. That's going to be big. Oh my God. That's going to be like a 30, $50 million.Dharmesh [01:21:29]: It's going to be big. It's going to be, I think, end up being bigger than chat.com, which was 15.swyx [01:21:38]: Yeah, it's more work oriented. Yep. That's interesting.Alessio [01:21:41]: Yeah, do you want to talk about the chat.com thing? I would love just the backstories. Like, did you just call up Sam one day and be like, I got the domain? Yeah. Did they? Can I get back to you?Dharmesh [01:21:52]: No, I'll give you, it's a good story. Back in the original ChatGPT days, the first thought I had in my head, which lots of people had in their head, is that OpenAI is going to build a platform and ChatGPT is actually just a demo app to show off the thing. And there's been precedence for tech companies that have had, you know, demo apps to kind of help normies understand the underlying technology. And even after the kind of boost or whatever. 
So my original thought was, well, someone should actually create like an actual real-world product. And so I'm like, and that product should be called chat.com because GPT is not a consumer friendly thing at all. Like that's an acronym, not pretty, it doesn't roll off the tongue. And so like, I'll build ChatGPT because that was just a demo app back then. So I, you know, got chat.com. And then as it turns out, ChatGPT is like a real product. And I was at an event here in San Francisco that Sam spoke at where he launched plugins. I think it was the announcement at that time. Yeah. And that's the thing is like, I had sort of suspected, it's like, okay, things sort of be like, there's no way. There's no way that OpenAI is going to launch plugins for ChatGPT if they were not thinking of it as an actual platform. So it's not just about the GPT APIs. This is like a real thing. I'm like, crap. Like this violates the first rule of Dharmesh, which is don't compete with Sam. I knew when I bought the domain that there was competition for the domain. There were other companies looking to buy it. I don't know who they were. I had suspicions. So I bought it and then I'm like, okay, well, I'll reach out to Sam. I was like, hey, Sam, I happen to have got, I don't know. I don't know if he was or wasn't kind of in the running or trying to acquire it or not, but I have chat.com. I don't, not looking to make a profit or whatever. If you want it, you will obviously do something much better, bigger with it. I don't want to be in the compete with Sam game effectively is what I said. And so they did want it.
swyx [01:23:38]: And yeah, we struck a deal. Looks like it's been a very good deal if the valuations are, you know, to be, to be real. Yeah. Who knows? Who knows?
Alessio [01:23:48]: It's one of those weird things. Like, yeah. Yeah. The agent.ai domain evaluator said that latent.space is worth between 5 and 15K. Okay.
swyx [01:23:55]: So does that feel right?
Well, it's missed the, it's missing this one.Dharmesh [01:24:00]: Does not incorporate the transactional data. I have not published that one yet. Uh, that's because it's also operationally very intensive, uh, that other one. But anyway, we, we actually had it donated by a listener, so I don't know what the real cost is, but it's missing that it's linked to an influencer by way of AI, which I've offered. I'm an investor in, in, yes, I bought that. Uh, and I've told him that like, whenever you're ready, you let me know, I'll sell it to you at cost. Uh, yeah.swyx [01:24:25]: So, yeah, I mean, that, that is some value add since you may buy a lot of domains.Dharmesh [01:24:29]: What, what are your favorite, uh, domain buying tips apart from have a really good domain broker, which I assume you have, uh, no, I actually don't, uh, I do, I do my own deals. Um, I have a, like a very cards face up approach to life. Um, so there's, so, you know, some people would tell you, it's like, oh, well, if someone, they know that it's, you're behind the transaction. Yeah. So, you know, the price is going to go up, sure, but it's still like willing seller or willing buyer or whatever. It doesn't mean I'm going to have to necessarily pay that price. Uh, it's like, okay. But the upside to it, uh, cause I always, you know, reach out as myself when I'm, when there's a domain out there. Um, and they can look you up. They can look me up. But then I also come off as like legit, like, okay, well, there's very few people are not going to return my email. When I say I'm interested in a domain that they may have for sale, um, or had not considered selling, but you know, would you consider selling? Uh, so yeah. And some of the, like, uh. So I own some of my favorites. I still own prompt.com, by the way, that, that could be a big one. Um, but I owned, and this is one, uh, I don't regret it. I went into a good, I owned a playground.com. 
And so the original idea behind playground.com was at the time, uh, OpenAI had their, uh, playground where you can play around with the models and things like that. Right. It's like, okay, well, there should be a platform neutral thing. There should be a playground across all the LLMs. Then you can, and there are obviously products and, uh, startups that, that do that now. And so that was my original thing. It's like, oh, there should be playground.com and you can go test out all the models and play around with them just like you can with, uh, with OpenAI's, uh, GPT stuff. And then, uh, so Suhail was out there with, uh, with Playground, uh, the company, uh, and I think he reached out, he might've reached out to me over, over Twitter or something like that. So we knew of, of each other. I'd never, I've still never, never met him. And he asked me whether I would consider, and that was a tough one because I'm like, I actually have the business idea already in my head. I think it's a great idea. I think it's a great domain name. Uh, and it's like a really simple English word that has like relevance and a whole new context now. But once again, uh, I took, uh, took equity. So it's like, uh, look on the bright side. That's like, so domains get me into deals that I would likely never have been able to get into other ways. So, yeah.
Alessio [01:26:35]: Yeah. We should securitize your GoDaddy account and just make it a fund. It's a fund.
Dharmesh [01:26:41]: It's basically a fund. Yeah. Um, and by the way, and so back to the kind of, uh, I hope you don't use GoDaddy by the way. I invested, uh, I don't know if it's public yet, um, but in a company that's going to treat domains as a fractionalizable, uh, tradable asset, because that's the kind of the original NFT in a way, right?
It's like, okay, well, and then if you can make both fractionalizing, but also just to transfer, like right now, it's so painful when you buy a domain, you go through an escrow service and there's just all of this. It's like, I just want like instantaneous, like charge me in Bitcoin or credit card, whatever it is. And then I should show up and I should be able to reroute the DNS. Like that should be minutes, not weeks or days. Um, anyway, so.Alessio [01:27:19]: Yeah, that's what ENS on Ethereum is basically the same, but it should bring that for normies. Yeah, exactly. They should bring it. Yeah. The ICANN and all of that is, uh, as its own, its own thing.swyx [01:27:30]: I have a question on, on just, uh, you know, you keep bringing up your Sam Altman rule. One of my favorite, favorite, favorite, my first millions of all time was actually without you there, but talking about you. Okay. Cause, uh, Sean was describing you as a fierce nerd, which I'm sure you, you, you were there. Uh, um, and, uh, I think Sam also is a fierce nerd and, and he is, uh, uh, I was, I was listening to this Jessica Livingston podcast where what she had him on and described him as a formidable person. I think you're also very formidable and I just wonder what makes you formidable. What makes you a fierce nerd? What, what keeps you this driven? Yeah.Dharmesh [01:28:09]: Sam's fiercer and nerdier just for the record. Um, but I think part of it is just like the strength of my conviction, I guess. Like I'm, I'm willing to. Work harder and grind it out, uh, more than people that are smarter than me. And I'm only slightly stupider than people that are willing to work harder than me. Right. Like I'm just the right mix of, uh, the kind of grinded it, kind of work at it, stick to it for extended periods of time. If I think I'm right, I will latch out, latch on and not let go until I can either like prove to myself that it's not. 
Um, so even like the natural language thing, it's like, you know, it took 20 years, but eventually I got to a point where, uh, the world caught up and it became possible. Uh, but yeah, I think. And part of it is, uh, I think this is partly, I think what makes me like, I'm a nice guy. Uh, sometimes they're the most dangerous kind, right? It's like, okay, well, I, I don't make enemies or whatever, but so my advice would be my, this is my take on competition. I don't think of it as like war. I think of it as, uh, their opponents. All right. And this is, it's not worried up. It's like, it's, it's a game, right? And you can use whatever analogy I happen to play a fair amount of chess. I'm a student of the game. That's partly, I think what, uh, makes me. Effective, uh, I'm solving for the long-term, uh, so I'm kind of hard to deter. So for those of you out there looking to kind of compete with HubSpot, uh, no, uh, I'm going to be here 18 years. I'm going to be here for another 18 years. So, but not that you shouldn't do it. It's a big market.swyx [01:29:34]: Uh, I'm not trying to sway anyone, but yeah, I think like something I struggled with, with this conviction, you said you pursue things to conviction, but like you start out not knowing anything. Yeah. And so how do you develop a conviction when there's. You, you find it along the way, or you, you stumble along the way, then you lose conviction and then you stop working on it, you know, like, how do you keep going?Dharmesh [01:29:57]: The way I've sort of approached it is that, um, so I don't generally tend to have conviction around a solution or a product. I have conviction around a problem, uh, that says this is an actual real problem that needs to be solved. And I may have an idea for how to be solved, uh, you know, right now, and that I may get dissuaded. It's like, ah, I'm not smart enough. Technology's not good enough. Whatever the constraints are, but it's the problem I have conviction around. 
It's like, oh, that problem still hasn't gone away. Uh, so like I sort of filed away in the back of my brain and I'll revisit it's like, okay, well, you know, the kind of board changes, uh, and then it changes really fast now with AI, like things that weren't possible before are now possible. So you kind of go back to your roster of things that you believe or believed and say, maybe now, uh, now is the time maybe then it wasn't the time, uh, but I'm a big believer in kind of attaching yourself. Passionately, uh, with conviction to problems that matter, um, that, and there are some that are just too highfalutin for me that I'm not going to ever be able to kind of take on. I have the humility to recognize that. Yeah.swyx [01:30:59]: I feel like I need a, um, updated founder's version of a serenity prayer. Like give me the confidence to like do what I think I I'm capable of, but like not to overestimate myself, you know? Uh, you know, anyway, uh, when you say board changes, how do you keep up on AI? A lot of YouTube, as it turns out. Yeah, a lot. Um, okay. Fireship. I don't know what fireship is. It's a current meme right now. Whenever OpenAI drops something, you know, they love this, like live streams of, of stuff from on the OpenAI channel. The top comment is always, I will wait for the fireship video because fireship just summarizes their thing in five minutes.Dharmesh [01:31:35]: No, I, so my kind of MO, so I, by the way, I keep very weird hours. Uh, so my average go to bedtime, uh, is roughly 2 AM. Oh boy. But I do get average seven, seven and a half hours in. Uh, I don't, I don't use alarm clocks cause I don't, I don't, uh, have meetings, uh, uh, in the morning at all, uh, or try not to at least, uh, so my late night thing is, uh, is I'll watch probably like a couple of hours of YouTube videos off in the background while I'm coding. Um,swyx [01:32:04]: that's how you've seen our talks.Dharmesh [01:32:06]: I have. Yeah, I've seen. Yeah. 
Okay.swyx [01:32:08]: Yep.Dharmesh [01:32:09]: , um, and so I, and there's so much good material out there and the, and the thing I love about kind of YouTube and this, by the way, in terms of like use cases and things that agents that should exist that, uh, don't yet, I would love to, uh, technology exists now to build this is to be able to take a YouTube video of like a talk about, let's say on Latent Space or not, uh, but on the, um, AI engineer event and say, just pull the slides out for me, uh, cause I want to put it into a deck for use or whatever, some form of, uh, kind of distillation or translation into a different, uh, different format. Oh, I see. Cool slides. Got it. Pull the slides out of a video. Um, so I think that's interesting. I have, yeah. So by the way, on the kind of agent.ai thing, like one of the commonly used, uh, actions, uh, primitives that we have is the ability to kind of get a transcript from a video. And that seems like such a trivial thing or whatever, but it's like, like, if you don't know how to do it programmatically or whatever, if you're just a normie, it's like, okay, well I know it's there, but I can copy it and paste it. But like, how do I actually like get to the, the transcript for you and then, uh, getting to the transcript and then being able to encode it and say, I can. Actually. Uh, give you timestamps. So if you have a use case that says, oh, I want to know exactly when this was, I want to create an aggregate video clip. This was the actual original, um, agent that I built for my wife that she wanted to pull multiple clips together without using video editing softwares. Cause she wanted to have this, uh, aggregate thing. Uh, she's on the nonprofit side to like send to a friend.swyx [01:33:27]: Uh, anyway, there are video understanding models that have come out from meta, but the easiest one by far is going to be Gemini. They just launched YouTube support. 
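The "aggregate clip" use case described above, going from a timestamped transcript to the time ranges worth stitching together, can be sketched without any video tooling. This is an illustrative assumption about the data shape, not agent.ai's actual transcript primitive: the `(start, end, text)` tuples and the `find_clip_ranges` name are hypothetical, and fetching the transcript or cutting the video is out of scope.

```python
def find_clip_ranges(segments, keyword, padding=2.0):
    """Return merged (start, end) ranges, in seconds, where `keyword` is spoken.

    segments: iterable of (start_seconds, end_seconds, text) tuples,
    a hypothetical shape for a timestamped transcript.
    """
    keyword = keyword.lower()
    # Pad each matching segment so clips do not start or end abruptly.
    hits = sorted(
        (max(0.0, start - padding), end + padding)
        for start, end, text in segments
        if keyword in text.lower()
    )
    merged = []
    for start, end in hits:
        if merged and start <= merged[-1][1]:
            # Overlapping or touching ranges collapse into one clip.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


segments = [
    (0.0, 4.0, "Welcome to the show"),
    (10.0, 14.0, "Agents need memory"),
    (15.0, 20.0, "Memory across agents"),
    (60.0, 65.0, "Thanks for listening"),
]
clips = find_clip_ranges(segments, "memory", padding=1.0)
```

The resulting ranges could then be handed to whatever editing step does the actual cutting, which is the part a non-programmer would otherwise need video software for.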
Yep.

Dharmesh [01:33:36]: So they're doing good work over there. By the way, the coolest thing AI-wise recently, I'll say in the last week to 10 days, has been the new image model, Gemini Flash Experimental, whatever they call it, because it lets you effectively do editing. So, you know, my son is doing an eighth-grade research project on AI image generation, right? So he's gone deep on Stable Diffusion and the algorithms and things like that. I don't know much about it, but I know enough about Stable Diffusion to know why editing is near impossible: you can't recreate the image, because you can't go back that way. It's going to be a different thing, because it's sort of spinning the roulette wheel another time the next time you try a similar prompt. And so the fact that they were able to pull it off... it's still very much a V1, because, you know, one of my test cases was, take the HubSpot logo and replace the O, which is this kind of sprocket, with a donut. And it will do it, but it won't size it to the degree that it will actually fit into the actual thing. It's like, okay. But yeah, that's where it's headed.

swyx [01:34:36]: Do you know the backstory behind that one? No. Mustafa, who was part of... so they had image generation in Llama 3, and the lawyers didn't approve it. Mustafa quit Meta, joined Gemini, and then shipped it. And it is rumored, and that's all I can say, that they got rid of diffusion. They did autoregressive image generation. And I think it's been interesting, these two worlds colliding, because diffusion was really about images and autoregressive was really about language, and people were wondering how they were going to merge. And on the Midjourney side, David Holz was very much betting on text diffusion being their path forward.
But it seems like the autoregressive paradigm has won, like next token is...

Dharmesh [01:35:17]: Suhail and Playground are doing exceptional work in that domain. I don't know if it's autoregressive, but it's around image editing, not just text-to-image, and actually building a UI, like a Photoshop kind of thing, for actual generation of images versus just doing text. It's fascinating.

swyx [01:35:32]: I just thought diffusion was kind of dead. Like, there wasn't that much; it was just bigger models, higher detail. And now autoregressive comes along and the whole field is open. Yeah. And I think if there was any real threat to Photoshop or Canva, it's this thing. Yeah.

Alessio [01:35:47]: Just to wrap up the conversation, you have a great post called Sorry, Must Pass, which, if I did the math right, you first wrote in 2007, and then you updated it post-COVID. You mentioned you made a lot of changes to your schedule and your life based on the pandemic. How do you make decisions today? Has anything changed? Because you updated this in 2022, and I think now we're, you know, five years removed from COVID and all of that. I'm curious if you made any changes. Yeah.

Dharmesh [01:36:17]: So that post, Sorry, Must Pass... the issue that happened is my schedule and life just got overwhelmed, right? Just too many dots and connections. And I love interacting with new people online. I love ideas. I love startups. But as it turns out, every time you say yes to anything, you are by definition saying no to something else. Despite my best attempts to change the laws of the universe, I have not been able to do that.
So that post was a reaction to that, because what would happen for me would be that when I did say no, I would feel this guilt. It's like, okay, well, whatever happened to me, it's like, oh, can you spend 15 minutes and just review this startup idea or whatever? And sometimes it would be someone who was second-degree removed, like an intro through a friend or something like that. Yeah. And I felt, you know, real guilt. And so this was a very honest, vulnerable, here's what's going on in my life. So this is not a judgment on you at all, whatever your project or whatever thing you're working on, but I have sort of come to this realization that I just can't do it. So I'm sorry, but my default thing right now, and lots of people will disagree with this kind of default position, is that I have to pass. Because, and Derek Sivers said this really well, it's either a hell yes or it's a no, right? And there's going to be a limited number of the hell yeses that I'm going to be able to inject into this. So yeah, of all the blog posts I've ever written, that has been the most useful for me. And I still send it out personally, right? I don't automate my email responses at all yet. I don't do automated social media posts. But yeah, that one's been very... And so I encourage everyone, wherever your line happens to be. I think lots of people have this guilt issue, and that's one of the most unproductive emotions in human psychology. It's like, no good comes from guilt. Not really.
And unless you're a sociopath or something like that, maybe you need... anyway, you don't need more guilt.

swyx [01:38:14]: I would also say, I would just encourage people to blog more, because a lot of times people want to pick your brain and then they ask you the same five questions that everyone else has asked. So if you blogged it, then you can just say, here.

Dharmesh [01:38:26]: So one of the things I'm working on, and there are startups that are working on this as well, but I started before then, is a Dharmesh.ai, right? That just captures... Yeah. And it's interesting. So that's one of the agents on agent.ai, on the underlying platform. Oh, there's a Dharmesh.ai? It's out there. It's Dharmesh.ai. Yeah. Nice. It's purely text-based. No video, no audio right now. But the thing I've found useful is in terms of, how do I give it knowledge? So I have a kind of private email address, because a lot of the interactions that I have, or if I do answer questions... the other thing, by the way, is I don't do any phone calls, like, at all. Even, like, no Zooms. Like, at all. I mean, I'll get on Zooms with teams, but no one-on-one meetings. It just doesn't scale. So I've moved as much as possible to an async world. As long as I can control the schedule, I will take 20 minutes and write a thoughtful response, but I reserve the right, anonymously, with no attribution, to share that either with my model or with the world, you know, through a blog post or something. But it's been useful because, now that I have that email backlog, I can go back and say, okay, I'm going to try to answer this question. Go through the vector store. And it's shockingly good. And I'm still irritated that Gmail doesn't do that out of the box. It's like, they're Google. I think it's gotta be coming now.
It's there. I think they're finally... the giant has been woken up. I think it's gotten faster now.

swyx [01:39:45]: You know, it's one of the biggest giants in the world ever. Yeah. So, yeah. When I first told Alessio, you know, you were one of our dream guests, I never actually expected to book you because of Sorry, Must Pass. So we were just like, ah, let's send an email, and then he'll say no and we'll move on with our day. So I just have to say, yeah, we're very honored.

Dharmesh [01:40:05]: Oh, I'm just thrilled to be here. A huge fan. First-time guest, but yeah. Thank you for all that you do for the community. I speak for a lot of them. You guys taught me a lot of what I think I know. So, yeah.

swyx [01:40:20]: Appreciate it. Yeah. I mean, I am explicitly inspired by HubSpot. Oh, thank you. Inbound marketing, I think, is a stroke of genius, and AI engineering is explicitly modeled after that. So you created your own industry, you know, a subsection of an industry that became a huge thing, because you got the trend right. And that's what AI engineering is supposed to be, if we get it right. How do we screw this up? How do we square what up? How do I screw this up? How do we screw AI engineering up?

Dharmesh [01:40:47]: Oh, you know, yeah, the common failure modes, right? So the original thing that makes inbound marketing work, the kernel of the idea, was to solve for the customer, solve for the audience, solve for the other side. Because the thing that was broken about marketing was that marketing was very self-centered: I have this budget, I'm going to blast you and interrupt your life and interrupt your day, because I want you to buy this thing from me, right? And inbound marketing was the exact opposite.
It's like, use whatever limited budget you have and put something useful in the world that your target customer, whoever it happens to be, will find valuable. Anyway, so the common failure mode is that you lose that. I don't think you will, but it's very, very common, right? It's like, ah, now I'm just going to turn the crank and squeeze it just a little bit more. But the reason, I think, folks like me appreciate that community so much is that you have that genuine want to act. And there's nothing wrong with making money. There's nothing wrong with having sponsors, none of that. But at the core of it, it's like, we want to lift the overall level of awareness for this group of people and create value and create goodness in the world. I think if you hold onto that, over the fullness of time, as the market becomes more efficient, it rewards that generosity. Yeah. That's my kind of fundamental life belief. So I think you guys are doing well. Thank you for your help and support. Yeah. My pleasure. Yeah.

Alessio [01:42:06]: And just to wrap in very Dharmesh fashion, you have a URL for the Sorry Must Pass blog, which is sorrymustpass.org. So yeah, I thought that was a good nugget. Yeah, thanks so much for coming on. Oh, thanks. Thanks for having me.

Get full access to Latent.Space at www.latent.space/subscribe
--------
1:38:24

About Latent Space: The AI Engineer Podcast

The podcast by and for AI Engineers! In 2024, over 2 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0. We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. Striving to give you both the definitive take on the Current Thing down to the first introduction to the tech you'll be using in the next 3 months! We break news and exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al. Full show notes always on https://latent.space
