How AI Is Built

Nicolay Gerold
Real engineers. Real deployments. Zero hype. We interview the top engineers who actually put AI in production. Learn what the best engineers have figured out through years of experience.

Available episodes

5 of 49
  • RAG is two things. Prompt Engineering and Search. Keep it Separate | S2 E28
    John Berryman moved from aerospace engineering to search, then to ML and LLMs. His path: Eventbrite search → GitHub code search → data science → GitHub Copilot. He was drawn to more math and ML throughout his career.

    RAG Explained
    "RAG is not a thing. RAG is two things." It breaks into:
    • Search - finding relevant information
    • Prompt engineering - presenting that information to the model
    These should be treated as separate problems to optimize.

    The Little Red Riding Hood Principle
    When prompting LLMs, stay on the path of what models have seen in training. Use formats, structures, and patterns they recognize from their training data:
    • For code, use docstrings and proper formatting
    • For financial data, use SEC report structures
    • Use Markdown for better formatting
    Models respond better to familiar structures.

    Testing Prompts
    Testing strategies:
    • Start with "vibe testing" - human evaluation of outputs
    • Develop systematic tests based on observed failure patterns
    • Use token probabilities to measure model confidence
    • For few-shot prompts, watch for diminishing returns as examples increase

    Managing Token Limits
    When designing prompts, divide content into:
    • Static elements (boilerplate, instructions)
    • Dynamic elements (user inputs, context)
    Prioritize content by:
    • Must-have information
    • Nice-to-have information
    • Optional if space allows
    Even with larger context windows, efficiency remains important for cost and latency.

    Completion vs. Chat Models
    Chat models are winning despite initial concerns about their constraints:
    • Completion models allow more flexibility in document format
    • Chat models are more reliable and aligned with common use cases
    • Most applications now use chat models, even for completion-like tasks

    Applications: Workflows vs. Assistants
    Two main LLM application patterns:
    • Assistants: Human-in-the-loop interactions where users guide and correct
    • Workflows: Decomposed tasks where LLMs handle well-defined steps with safeguards

    Breaking Down Complex Problems
    Two approaches:
    • Horizontal: Split into sequential steps with clear inputs/outputs
    • Vertical: Divide by case type, with specialized handling for each scenario
    Example: For SOX compliance, break horizontally (understand control, find evidence, extract data, compile report) and vertically (different audit types).

    On Agents
    Agents exist on a spectrum from assistants to workflows, characterized by:
    • Having some autonomy to make decisions
    • Using tools to interact with the environment
    • Usually requiring human oversight

    Best Practices
    For building with LLMs:
    • Start simple: API key + Jupyter notebook
    • Build prototypes and iterate quickly
    • Add evaluation as you scale
    • Keep users in the loop until models prove reliability

    John Berryman: LinkedIn | X (Twitter) | Arcturus Labs | Prompt Engineering for LLMs (Book)
    Nicolay Gerold: LinkedIn | X (Twitter)

    00:00 Introduction to RAG: Retrieval and Generation
    00:19 Optimizing Retrieval Systems
    01:11 Introducing John Berryman
    02:31 John's Journey from Search to Prompt Engineering
    04:05 Understanding RAG: Search and Prompt Engineering
    05:39 The Little Red Riding Hood Principle in Prompt Engineering
    14:14 Balancing Static and Dynamic Elements in Prompts
    25:52 Assistants vs. Workflows: Choosing the Right Approach
    30:15 Defining Agency in AI
    30:35 Spectrum of Assistance and Workflows
    34:35 Breaking Down Problems Horizontally and Vertically
    37:57 SOX Compliance Case Study
    40:56 Integrating LLMs into Existing Applications
    44:37 Favorite Tools and Missing Features
    46:37 Exploring Niche Technologies in AI
    52:52 Key Takeaways and Future Directions
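The static/dynamic split and the must-have/nice-to-have prioritization from the token-limit discussion can be sketched in a few lines. This is a hypothetical illustration, not code from the episode: the whitespace token count is a crude stand-in for a real tokenizer, and all strings and the `build_prompt` helper are invented.

```python
# Hypothetical sketch of priority-based prompt budgeting: static instructions
# are always included, then dynamic context chunks are added in priority order
# until an (approximate) token budget is exhausted.

def count_tokens(text: str) -> int:
    # Crude approximation; a real system would use the model's tokenizer.
    return len(text.split())

def build_prompt(instructions: str, chunks: list[tuple[int, str]], budget: int) -> str:
    """chunks: (priority, text) pairs; lower priority number = more important."""
    parts = [instructions]
    used = count_tokens(instructions)
    for _, text in sorted(chunks, key=lambda c: c[0]):
        cost = count_tokens(text)
        if used + cost > budget:
            continue  # drop lower-priority content that does not fit
        parts.append(text)
        used += cost
    return "\n\n".join(parts)

prompt = build_prompt(
    "Answer using only the context below.",
    [(0, "Must-have: the user's question and key facts."),
     (1, "Nice-to-have: related background."),
     (2, "Optional: tangential details that rarely help.")],
    budget=20,
)
```

With a budget of 20 approximate tokens, the must-have and nice-to-have chunks fit and the optional chunk is dropped, mirroring the three-tier prioritization above.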
    --------  
    1:02:44
  • Graphs aren't just for specialists anymore. They are one import away | S2 E27
    Kuzu is an embedded graph database that implements Cypher as a library. It can be easily integrated into various environments—from scripts and Android apps to serverless platforms. Its design supports both ephemeral, in-memory graphs (ideal for temporary computations) and large-scale persistent graphs where traditional systems struggle with performance and scalability.

    Key Architectural Decisions:
    • Columnar Storage: Kuzu stores node and relationship properties in separate, contiguous columns. This design reduces I/O by allowing queries to scan only the needed columns, unlike row-based systems (e.g., Neo4j) that read full records even when only a subset of properties is required.
    • Efficient Join Indexing with CSR: The join index is maintained in Compressed Sparse Row (CSR) format. By sorting and compressing relationship data, Kuzu ensures that a node's adjacent relationships are stored contiguously, minimizing random I/O and speeding up traversals.
    • Vectorized Query Processing: Instead of processing one tuple at a time, Kuzu processes blocks (vectors) of tuples. This block-based approach reduces function-call overhead and improves cache locality, boosting performance for analytic queries.
    • Factorization and ASP Join: For many-to-many queries that can generate enormous intermediate results, Kuzu uses factorization to represent data compactly. Its ASP join algorithm integrates factorization, sequential scanning, and sideways information passing to avoid unnecessary full scans and materializations.

    Kuzu is optimized for read-heavy, analytic workloads. While batched writes are efficient, the system is less tuned for high-frequency, small transactions.

    Upcoming features include:
    • A WebAssembly (Wasm) version for running in browsers.
    • Enhanced vector and full-text search indices.
    • Built-in graph data science algorithms for tasks like PageRank and centrality analysis.

    Kuzu can be a powerful backend for AI applications in several ways:
    • Knowledge Graphs: Store and query complex relationships between entities to support natural language understanding, semantic search, and reasoning tasks.
    • Graph Data Science: Run built-in graph algorithms (like PageRank, centrality, or community detection) that help uncover patterns and insights, which can improve recommendation systems, fraud detection, and other AI-driven analyses.
    • Retrieval-Augmented Generation (RAG): Integrate with large language models by efficiently retrieving relevant, structured graph data. Kuzu's vector search capabilities and fast query processing make it ideal for augmenting AI responses with contextual information.
    • Graph Embeddings & ML Pipelines: Serve as the foundation for generating graph embeddings, which are used in downstream machine learning tasks—such as clustering, classification, or link prediction—to enhance model performance.

    Semih Salihoğlu: LinkedIn | Kuzu GitHub | Kuzu Docs
    Nicolay Gerold: LinkedIn | X (Twitter)

    00:00 Introduction to Graph Databases
    00:18 Introducing Kuzu: A Modern Graph Database
    01:48 Use Cases and Applications of Kuzu
    03:03 Kuzu's Research Origins and Scalability
    06:18 Columnar Storage vs. Row-Oriented Storage
    10:27 Query Processing Techniques in Kuzu
    22:22 Compressed Sparse Row (CSR) Storage
    27:25 Vectorization in Graph Databases
    31:24 Optimizing Query Processors with Vectorization
    33:25 Common Wisdom in Graph Databases
    35:13 Introducing ASP Join in Kuzu
    35:55 Factorization and Efficient Query Processing
    39:49 Challenges and Solutions in Graph Databases
    45:26 Write Path Optimization in Kuzu
    54:10 Future Developments in Kuzu
    57:51 Key Takeaways and Final Thoughts
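The CSR join index described above can be illustrated in miniature. This is a plain-Python sketch of the storage idea, not Kuzu's actual implementation: all neighbors of a node sit contiguously in one array, so a traversal is a single offset lookup plus a sequential scan.

```python
# Miniature Compressed Sparse Row (CSR) adjacency: the layout idea behind
# Kuzu's join index. (Illustrative pure Python, not Kuzu code.)

edges = [(0, 1), (0, 2), (1, 2), (2, 0), (2, 3)]  # (src, dst) pairs
num_nodes = 4

# Build CSR: sort edges by source so each source's neighbors are contiguous,
# then compute where each source's run starts via a prefix sum.
edges.sort()
neighbors = [dst for _, dst in edges]      # one contiguous neighbor array
offsets = [0] * (num_nodes + 1)
for src, _ in edges:
    offsets[src + 1] += 1
for i in range(num_nodes):
    offsets[i + 1] += offsets[i]           # prefix sum -> start offsets

def adjacent(node: int) -> list[int]:
    # One random access into `offsets`, then a sequential slice —
    # no per-edge random I/O.
    return neighbors[offsets[node]:offsets[node + 1]]

print(adjacent(2))  # [0, 3]
```

The same compaction is what makes a disk-resident traversal cheap: the slice `neighbors[offsets[n]:offsets[n+1]]` maps to one contiguous read rather than a pointer chase.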
    --------  
    1:03:35
  • Knowledge Graphs Won't Fix Bad Data | S2 E26
    Metadata is the foundation of any enterprise knowledge graph. By organizing both technical and business metadata, organizations create a "brain" that supports advanced applications like AI-driven data assistants. The goal is to achieve economies of scale—making data reusable, traceable, and ultimately more valuable.

    Juan Sequeda is a leading expert in enterprise knowledge graphs and metadata management. He has spent years solving the challenges of integrating diverse data sources into coherent, accessible knowledge graphs. As Principal Scientist at data.world, Juan provides concrete strategies for improving data quality, streamlining feature extraction, and enhancing model explainability. If you want to build AI systems on a solid data foundation—one that cuts through the noise and delivers reliable, high-performance insights—you need to listen to Juan's proven methods and real-world examples.

    Terms like ontologies, taxonomies, and knowledge graphs aren't new inventions. Ontologies and taxonomies have been studied for decades—even since ancient Greece. Google popularized "knowledge graphs" in 2012 by building on decades of semantic web research. Despite current buzz, these concepts build on established work.

    Traditionally, data lives in siloed applications—each with its own relational databases, ETL processes, and dashboards. When cross-application queries and consistent definitions become painful, organizations face metadata management challenges. The first step is to integrate technical metadata (table names, columns, code lineage) into a unified knowledge graph. Then, add business metadata by mapping business glossaries and definitions to that technical layer.

    A modern data catalog should:
    • Integrate Multiple Sources: Automatically ingest metadata from databases, ETL tools (e.g., dbt, Fivetran), and BI tools.
    • Bridge Technical and Business Views: Link technical definitions (e.g., table "CUST_123") with business concepts (e.g., "Customer").
    • Enable Reuse and Governance: Support data discovery, impact analysis, and proper governance while facilitating reuse across teams.

    Practical Approaches & Use Cases:
    • Start with a Clear Problem: Whether it's reducing churn, improving operational efficiency, or meeting compliance needs, begin by solving a specific pain point.
    • Iron Thread Method: Follow one query end-to-end—from identifying a business need to tracing it back to source systems—to gradually build and refine the graph.
    • Automation vs. Manual Oversight: Technical metadata extraction is largely automated. For business definitions or text-based entity extraction (e.g., via LLMs), human oversight is key to ensuring accuracy and consistency.

    Technical Considerations:
    • Entity vs. Property: If you need to attach additional details or reuse an element across contexts, model it as an entity (with a unique identifier). Otherwise, keep it as a simple property.
    • Storage Options: The market offers various graph databases—Neo4j, Amazon Neptune, Cosmos DB, TigerGraph, Apache Jena (for RDF), etc. Future trends point toward multimodel systems that allow querying in SQL, Cypher, or SPARQL over the same underlying data.

    Juan Sequeda: LinkedIn | data.world | Semantic Web for the Working Ontologist | Designing and Building Enterprise Knowledge Graphs (before you buy, send Juan a message, he is happy to send you a copy) | Catalog & Cocktails (Juan's podcast)
    Nicolay Gerold: LinkedIn | X (Twitter)

    00:00 Introduction to Knowledge Graphs
    00:45 The Role of Metadata in AI
    01:06 Building Knowledge Graphs: First Steps
    01:42 Interview with Juan Sequeda
    02:04 Understanding Buzzwords: Ontologies, Taxonomies, and More
    05:05 Challenges and Solutions in Data Management
    08:04 Practical Applications of Knowledge Graphs
    15:38 Governance and Data Engineering
    34:42 Setting the Stage for Data-Driven Problem Solving
    34:58 Understanding Consumer Needs and Data Challenges
    35:33 Foundations and Advanced Capabilities in Data Management
    36:01 The Role of AI and Metadata in Data Maturity
    37:56 The Iron Thread Approach to Problem Solving
    40:12 Constructing and Utilizing Knowledge Graphs
    54:38 Trends and Future Directions in Knowledge Graphs
    59:17 Practical Advice for Building Knowledge Graphs
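The bridge between technical and business metadata can be sketched as a tiny in-memory graph. This is a hypothetical illustration using the episode's "CUST_123"/"Customer" example; the node/edge structure and helper functions are invented, not data.world's model.

```python
# Hypothetical mini metadata graph linking technical assets (tables, columns,
# dashboards) to business concepts, supporting two catalog queries discussed
# above: "what does this mean?" and "what breaks if this changes?"

metadata_graph = {
    ("table", "CUST_123"): {"represents": ("concept", "Customer")},
    ("column", "CUST_123.cst_nm"): {"belongs_to": ("table", "CUST_123"),
                                    "represents": ("concept", "Customer Name")},
    ("dashboard", "churn_report"): {"reads_from": ("table", "CUST_123")},
}

def business_meaning(node):
    """Resolve a technical asset to its mapped business concept, if any."""
    return metadata_graph.get(node, {}).get("represents")

def impact_of(table):
    """Impact analysis: which downstream assets read from this table?"""
    return [n for n, edges in metadata_graph.items()
            if edges.get("reads_from") == table]

print(business_meaning(("table", "CUST_123")))  # ('concept', 'Customer')
print(impact_of(("table", "CUST_123")))         # [('dashboard', 'churn_report')]
```

The "iron thread" method amounts to tracing one path through exactly this kind of graph, from a dashboard back to its source table and business definition, before modeling anything else.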
    --------  
    1:10:59
  • Temporal RAG: Embracing Time for Smarter, Reliable Knowledge Graphs | S2 E25
    Daniel Davis is an expert on knowledge graphs. He has a background in risk assessment and complex systems—from aerospace to cybersecurity. Now he is working on "Temporal RAG" in TrustGraph.

    Time is a critical—but often ignored—dimension in data. Whether it's threat intelligence, legal contracts, or API documentation, every data point has a temporal context that affects its reliability and usefulness. To manage this, systems must track when data is created, updated, or deleted, and ideally, preserve versions over time.

    Three Types of Data:
    • Observations: Measurable, verifiable recordings (e.g., "the hat reads 'Sunday Running Club'"). They require supporting evidence and may be updated as new data becomes available.
    • Assertions: Subjective interpretations (e.g., "the hat is greenish"). They involve human judgment and come with confidence levels; they may change over time.
    • Facts: Immutable, verified information that remains constant. Facts are rare in dynamic environments because most data evolves; they serve as the "bedrock" of trust.
    By clearly categorizing data into these buckets, systems can monitor freshness, detect staleness, and better manage dependencies between components (like code and its documentation).

    Integrating Temporal Data into Knowledge Graphs:
    • Challenge: Traditional knowledge graphs and schemas (e.g., schema.org) rarely integrate time beyond basic metadata. Long documents may only provide a single timestamp, leaving the context of internal details untracked.
    • Solution: Attach detailed temporal metadata (such as creation, update, and deletion timestamps) during data ingestion, and use versioning to maintain historical context. This allows systems to assess whether data is current or stale, detect conflicts when updates occur, and employ Bayesian methods to adjust trust metrics as more information accumulates.

    Key Takeaways:
    • Focus on Specialization: Build tools that do one thing well. For example, design a simple yet extensible knowledge graph rather than relying on overly complex ontologies.
    • Integrate Temporal Metadata: Always timestamp data operations and version records. This is key to understanding data freshness and evolution.
    • Adopt Robust Infrastructure: Use scalable, proven technologies to connect specialized modules via APIs. This reduces maintenance overhead compared to systems overloaded with connectors and extra features.
    • Leverage Bayesian Updates: Start with initial trust metrics based on observed data and refine them as new evidence arrives.
    • Mind the Big Picture: Avoid working in isolated silos. Emphasize a holistic system design that maintains in situ context and promotes collaboration across teams.

    Daniel Davis: Cognitive Core | TrustGraph | YouTube | LinkedIn | Discord
    Nicolay Gerold: LinkedIn | X (Twitter)

    00:00 Introduction to Temporal Dimensions in Data
    00:53 Timestamping and Versioning Data
    01:35 Introducing Daniel Davis and Temporal RAG
    01:58 Three Buckets of Data: Observations, Assertions, and Facts
    03:22 Dynamic Data and Data Freshness
    05:14 Challenges in Integrating Time in Knowledge Graphs
    09:41 Defining Observations, Assertions, and Facts
    12:57 The Role of Time in Data Trustworthiness
    46:58 Chasing White Whales in AI
    47:58 The Problem with Feature Overload
    48:43 Connector Maintenance Challenges
    50:02 The Swiss Army Knife Analogy
    51:16 API Meshes and Glue Code
    54:14 The Importance of Software Infrastructure
    01:00:10 The Need for Specialized Tools
    01:13:25 Outro and Future Plans
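Two of the mechanics above, staleness detection from temporal metadata and a Bayesian trust update, can be sketched in a few lines. This is an illustrative approximation (a Beta-distribution update over confirming/contradicting evidence), not TrustGraph's actual implementation; the record fields and thresholds are invented.

```python
# Sketch: timestamped records enable freshness checks, and a simple
# Beta(1+successes, 1+failures) update refines a trust score as evidence
# arrives. Illustrative only — not TrustGraph code.

from datetime import datetime, timedelta, timezone

def is_stale(record: dict, max_age: timedelta, now: datetime) -> bool:
    # Freshness is judged against the last-update timestamp.
    return now - record["updated_at"] > max_age

def update_trust(successes: int, failures: int, confirmed: bool):
    # Bayesian update: the Beta posterior mean is the current trust estimate.
    if confirmed:
        successes += 1
    else:
        failures += 1
    trust = (1 + successes) / (2 + successes + failures)
    return successes, failures, trust

now = datetime(2025, 3, 12, tzinfo=timezone.utc)
obs = {"text": "the hat reads 'Sunday Running Club'",
       "created_at": now - timedelta(days=400),
       "updated_at": now - timedelta(days=400)}
print(is_stale(obs, timedelta(days=180), now))  # True: observation needs re-checking

s, f, trust = 0, 0, 0.5
for evidence in [True, True, False, True]:      # three confirmations, one conflict
    s, f, trust = update_trust(s, f, evidence)
print(round(trust, 2))
```

Note how the observation ages into staleness even though it was once verified: in this scheme, only the rare "facts" bucket would be exempt from the freshness check.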
    --------  
    1:33:44
  • Context is King: How Knowledge Graphs Help LLMs Reason
    Robert Caulk runs Emergent Methods, a research lab building news knowledge graphs. With a Ph.D. in computational mechanics, he spent 12 years creating open-source tools for machine learning and data analysis. His work on projects like Flowdapt (model serving) and FreqAI (adaptive modeling) has earned over 1,000 academic citations.

    His team built AskNews, which he calls "the largest news knowledge graph in production." It's a system that doesn't just collect news - it understands how events, people, and places connect.

    Current AI systems struggle to connect information across sources and domains. Simple vector search misses crucial relationships. But building knowledge graphs at scale brings major technical hurdles around entity extraction, relationship mapping, and query performance.

    Emergent Methods built a hybrid system combining vector search and knowledge graphs:
    • Vector DB (Qdrant) handles initial broad retrieval
    • Custom knowledge graph processes relationships
    • Translation pipeline normalizes multi-language content
    • Entity extraction model identifies key elements
    • Context engineering pipeline structures data for LLMs

    Implementation Details:

    Data Pipeline:
    • All content normalized to English for consistent embeddings
    • Entity names preserved in original language when untranslatable
    • Custom GLiNER-based news model handles entity extraction
    • Retrained every 6 months on fresh data
    • Human review validates entity accuracy

    Entity Management:
    • Base extraction uses the BERT-based GLiNER architecture
    • Trained on diverse data across topics/regions
    • Disambiguation system merges duplicate entities
    • Manual override options for analysts
    • Metadata tracking preserves relationship context

    Knowledge Graph:
    • Selective graph construction from vector results
    • On-demand relationship processing
    • Graph queries via standard Cypher
    • Built for specific use cases vs. general coverage
    • Integration with S3 and other data stores

    System Validation:
    • Custom "Context is King" benchmark suite
    • RAGAS metrics track retrieval accuracy
    • Time-split validation prevents data leakage
    • Manual review of entity extraction
    • Production monitoring of query patterns

    Engineering Insights:

    Key Technical Decisions:
    • English normalization enables consistent embeddings
    • Hybrid vector + graph approach balances speed/depth
    • Selective graph construction keeps costs down
    • Human-in-the-loop validation maintains quality

    Dead Ends Hit:
    • Full multi-language entity system too complex
    • Real-time graph updates not feasible at scale
    • Pure vector or pure graph approaches insufficient

    Top Quotes:

    "At its core, context engineering is about how we feed information to AI. We want clear, focused inputs for better outputs. Think of it like talking to a smart friend - you'd give them the key facts in a way they can use, not dump raw data on them." - Robert

    "Strong metadata paints a high-fidelity picture. If we're trying to understand what's happening in Ukraine, we need to know not just what was said, but who said it, when they said it, and what voice they used to say it. Each piece adds color to the picture." - Robert

    "Clean data beats clever models. You can throw noise at an LLM and get something that looks good, but if you want real accuracy, you need to strip away the clutter first. Every piece of noise pulls the model in a different direction." - Robert

    "Think about how the answer looks in the real world. If you're comparing apartments, you'd want a table. If you're tracking events, you'd want a timeline. Match your data structure to how humans naturally process that kind of information." - Nico

    "Building knowledge graphs isn't about collecting everything - it's about finding the relationships that matter. Most applications don't need a massive graph. They need the right connections for their specific problem." - Robert

    "The quality of your context sets the ceiling for what your AI can do. You can have the best model in the world, but if you feed it noisy, unclear data, you'll get noisy, unclear answers. Garbage in, garbage out still applies." - Robert

    "When handling multiple languages, it's better to normalize everything to one language than to try juggling many. Yes, you lose some nuance, but you gain consistency. And consistency is what makes these systems reliable." - Robert

    "The hard part isn't storing the data - it's making it useful. Anyone can build a database. The trick is structuring information so an AI can actually reason with it. That's where context engineering makes the difference." - Robert

    "Start simple, then add complexity only when you need it. Most teams jump straight to sophisticated solutions when they could get better results by just cleaning their data and thinking carefully about how they structure it." - Nico

    "Every token in your context window is precious. Don't waste them on HTML tags or formatting noise. Save that space for the actual signal - the facts, relationships, and context that help the AI understand what you're asking." - Nico

    Robert Caulk: LinkedIn | Emergent Methods | AskNews
    Nicolay Gerold: LinkedIn | X (Twitter)

    00:00 Introduction to Context Engineering
    00:24 Curating Input Signals
    01:01 Structuring Raw Data
    03:05 Refinement and Iteration
    04:08 Balancing Breadth and Precision
    06:10 Interview Start
    08:02 Challenges in Context Engineering
    20:25 Optimizing Context for LLMs
    45:44 Advanced Cypher Queries and Graphs
    46:43 Enrichment Pipeline Flexibility
    47:16 Combining Graph and Semantic Search
    49:23 Handling Multilingual Entities
    52:57 Disambiguation and Deduplication Challenges
    55:37 Training Models for Diverse Domains
    01:04:43 Dealing with AI-Generated Content
    01:17:32 Future Developments and Final Thoughts
    --------  
    1:33:35


About How AI Is Built

Real engineers. Real deployments. Zero hype. We interview the top engineers who actually put AI in production. Learn what the best engineers have figured out through years of experience. Hosted by Nicolay Gerold, CEO of Aisbach and CTO at Proxdeal and Multiply Content.
