AI News Hub (LIVE)
Public articles: 132 | Collected articles: 180 | Trust: 86 | Refresh: 60 min
Health: Healthy | Source type: Official | Full-text rights: Official full text | Last ingested: 2026-06-26 | ID: databricks-blog | Status: Enabled

Official data and AI platform feed; confirm reuse terms before full body display.

Latest public articles

How Databricks is turning video into searchable, actionable intelligence

Databricks introduces a novel approach that treats video as a data engineering problem, leveraging Vision Language Models, serverless GPUs, and Lakeflow pipelines to automatically detect, truncate, and summarize key video moments. The model-agnostic architecture scales horizontally for real-time analysis in public safety, infrastructure, and urban operations.

  • Databricks treats video analysis as a data engineering problem, using VLMs, serverless GPUs, and Lakeflow pipelines.
  • The pipeline automatically detects, truncates, and summarizes key video moments, enabling natural language queries.

How the English Office for Students leverages Databricks to enhance higher education standards and drive better student outcomes

The Office for Students modernized its analytics environment on Databricks to manage millions of student records and support data-informed higher education regulation. By unifying structured, qualitative, and near-live data on a governed platform with Unity Catalog and AI capabilities, teams accelerated analysis and improved collaboration. The organization dramatically reduced processing times, streamlined complex regulatory workflows and enabled faster, more trusted decision support to improve student outcomes across England.

  • Processing time for a 300-million-record data job reduced from 8 hours to minutes after moving to Databricks.
  • Student segmentation analysis completed in half a day, down from two weeks with two analysts.

What To Look For in a Serverless Database for AI Applications

This buyer's guide covers key criteria for evaluating serverless databases for AI workloads, including compute-storage separation, open standards, scale-to-zero, connection models, and AI-native capabilities.

  • Separation of compute and storage is critical for true serverless architecture.
  • Open standards like PostgreSQL ensure portability and avoid vendor lock-in.

What Is Serverless PostgreSQL?

Serverless PostgreSQL is a fully managed cloud database model that decouples compute and storage for independent scaling. It eliminates manual provisioning and charges only for active usage, making it suitable for bursty workloads but less ideal for always-on, latency-sensitive applications. The article also introduces lakebase architecture, which builds on serverless Postgres to unify transactional and analytical workloads, reducing data duplication and simplifying access for AI and real-time applications.

  • Serverless PostgreSQL decouples compute and storage, enabling automatic scaling and usage-based billing.
  • Compared to traditional Postgres, it reduces operational overhead but introduces cold start latency and connection management challenges.

How Daikin Applied Americas builds consistent data pipelines at scale with Genie Code

Daikin Applied Americas redesigned its data engineering operating model using Databricks Genie Code, implementing a MECE skill framework and medallion architecture to enforce consistency. This AI-assisted approach accelerates pipeline development while maintaining governance and alignment with business concepts.

  • Standardized pipeline development using MECE skills and medallion architecture.
  • Genie Code enables faster iteration and reduces boilerplate.

What if the answer was already in your data?

Kythera Labs is building an AI-native healthcare strategy platform on Databricks that gives any health system access to expert intelligence through AI agents that answer strategic questions in plain language. A Louisiana health system went live in 10 days, achieving 150% more visibility into patient encounters, 22% less leakage, and $3.8M in estimated annualized value.

  • Kythera Labs packages healthcare data expertise into AI agents on Databricks, enabling leaders to ask strategic questions in natural language.
  • The platform processes 339 billion claims to reconstruct patient journeys and deliver trustworthy answers.

Databricks positioned highest in execution and furthest in vision for the second consecutive year in Gartner Magic Quadrant

Databricks has been recognized as a Leader in the 2026 Gartner Magic Quadrant for AI Platforms for Data Science and Machine Learning, positioned highest in Ability to Execute and furthest in Completeness of Vision. This reflects the market shift from model building to deploying agentic applications at scale, emphasizing unified data, AI, and governance.

  • Databricks placed highest in execution and furthest in vision for the second year in a row.
  • Enterprises are rapidly deploying agentic apps, requiring a unified data, AI, and governance platform.

Genesis Workbench: A blueprint for industry AI in life sciences, powered by Databricks and NVIDIA

Genesis Workbench is an open Databricks blueprint integrating NVIDIA's accelerated computing tools for end-to-end drug discovery, providing a no-code interface for bench scientists while maintaining IP security via Unity Catalog.

  • Modular blueprint combining Databricks governance with NVIDIA's BioNeMo and Parabricks for drug discovery.
  • No-code point-and-click UI enables bench scientists to perform genomics and molecular design tasks.

Guide to Agentic Systems and AI Agents

Agentic AI systems are autonomous platforms that perceive, reason, act, and learn to achieve complex goals with minimal human intervention. This guide explains how they differ from generative AI, their core components, orchestration patterns, and enterprise governance considerations.

  • Agentic AI autonomously plans and executes multi-step workflows, unlike single-response generative AI.
  • Core loop: perceive-reason-act-learn, with LLMs as reasoning engines and external tools for execution.

Top 10 AI Business Solutions Driving Company Growth

Companies seeing the highest returns from AI are making deliberate investments tied to specific business outcomes, grounded in clean and governed data. This article outlines 10 proven AI business solutions and the conditions necessary for success.

  • AI value creation comes in three forms: productivity, automation, and business reimagination.
  • Data quality accounts for 75% of AI solution success.

End-to-End RAG Workflow: How Retrieval Augmented Generation Works

Retrieval Augmented Generation (RAG) connects large language models to external knowledge bases through a five-stage pipeline — ingestion, embedding, retrieval, augmentation, and generation — enabling accurate, domain-specific answers without retraining the model. A production RAG workflow requires selecting the right embedding model, configuring vector database indexing and chunking strategies, and implementing hybrid search that combines semantic vector search with keyword fallback to maximize retrieval quality. RAG evaluation must measure retrieval precision and generation faithfulness independently, because strong LLM performance cannot compensate for a weak information retrieval component, and continuous data updates are essential to prevent stale knowledge from degrading response accuracy.

  • RAG uses a five-stage pipeline to connect LLMs with external knowledge, avoiding model retraining.
  • Hybrid search (semantic + keyword) and proper chunking are critical for retrieval quality.
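
The five stages can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation described in the article: the bag-of-words `embed` function stands in for a real embedding model, `generate` is a placeholder for an LLM call, and the blended `alpha` score is one assumed way to combine semantic and keyword signals. All function names and parameters here are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ingest(documents, chunk_size=12):
    """Stage 1, ingestion: split each document into fixed-size word chunks."""
    chunks = []
    for doc in documents:
        words = doc.split()
        for i in range(0, len(words), chunk_size):
            chunks.append(" ".join(words[i:i + chunk_size]))
    return chunks


def build_index(chunks):
    """Stage 2, embedding: store each chunk next to its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, query, k=2, alpha=0.7):
    """Stage 3, retrieval: blend semantic similarity with keyword overlap."""
    q_vec = embed(query)
    q_terms = set(tokenize(query))
    scored = []
    for chunk, vec in index:
        semantic = cosine(q_vec, vec)
        keyword = len(q_terms & set(vec)) / len(q_terms) if q_terms else 0.0
        scored.append((alpha * semantic + (1 - alpha) * keyword, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in scored[:k] if score > 0]


def augment(query, contexts):
    """Stage 4, augmentation: put the retrieved chunks into the prompt."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using only this context:\n{context_block}\n\nQuestion: {query}"


def generate(prompt):
    """Stage 5, generation: placeholder where a real LLM call would go."""
    return f"[LLM response to a {len(prompt)}-character prompt]"


docs = [
    "Delta Lake is an open table format for lakehouses.",
    "Bananas are rich in potassium.",
]
index = build_index(ingest(docs))
contexts = retrieve(index, "What is Delta Lake?", k=1)
print(generate(augment("What is Delta Lake?", contexts)))
```

Because retrieval and generation are separate functions, each can be evaluated on its own, which matches the summary's point that retrieval precision and generation faithfulness should be measured independently.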

What is Vector Search?

Vector search is a search technique that finds results based on meaning and context rather than exact keyword matches. It uses embeddings to identify similar text, images, audio, and other content, solving limitations of keyword-only search. It is used in RAG, enterprise search, recommendations, and anomaly detection. Production systems often combine vector and keyword search for better results, and managed services like Databricks AI Search add reranking, metadata filtering, and governance.

  • Vector search compares embeddings to find meaning-based matches, not just keywords. It excels at synonyms, cross-language, and cross-format search.
  • It works by creating embeddings, building an index, and matching queries using approximate nearest neighbor (ANN) search for scalability.
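
The embed, index, and match steps can be illustrated with a small random-hyperplane LSH index, one common family of approximate nearest neighbor techniques. This is a sketch under stated assumptions: the article does not say which ANN method any product uses, the vectors below are hand-written rather than produced by an embedding model, and the `LSHIndex` class and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity, returning 0.0 for zero-length vectors."""
    norm_a = dot(a, a) ** 0.5
    norm_b = dot(b, b) ** 0.5
    return dot(a, b) / (norm_a * norm_b) if norm_a and norm_b else 0.0


class LSHIndex:
    """Approximate nearest neighbor index using random hyperplanes."""

    def __init__(self, dim, n_planes=4, seed=0):
        rng = random.Random(seed)
        # Each random hyperplane contributes one bit to a vector's signature.
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
        self.buckets = defaultdict(list)

    def _signature(self, vec):
        # Record which side of each hyperplane the vector falls on.
        return tuple(dot(vec, plane) >= 0 for plane in self.planes)

    def add(self, item_id, vec):
        """Index a vector by placing it in the bucket for its signature."""
        self.buckets[self._signature(vec)].append((item_id, vec))

    def search(self, query, k=3):
        """Rank only the vectors that share the query's bucket."""
        candidates = self.buckets.get(self._signature(query), [])
        ranked = sorted(candidates, key=lambda iv: cosine(query, iv[1]), reverse=True)
        return [item_id for item_id, _ in ranked[:k]]


index = LSHIndex(dim=3)
index.add("a", [1.0, 0.0, 0.0])
index.add("b", [0.0, 1.0, 0.0])
index.add("c", [-1.0, 0.0, 0.0])
print(index.search([2.0, 0.0, 0.0], k=1))
```

The search is approximate because a true neighbor that lands in a different bucket is never scored. That trade of recall for speed is what lets ANN scale, and production systems usually add multiple hash tables or graph-based indexes to recover recall.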

Data Lake vs. Cloud Data Warehouse: A Practical Guide for Data Scientists

This article compares data lakes and cloud data warehouses, highlighting that data lakes store raw, multi-format data for machine learning and advanced analytics, while cloud data warehouses optimize high-concurrency SQL performance for structured reporting. Data lakehouses, built on open table formats like Delta Lake, unify both advantages and are projected to become the dominant architecture for enterprise analytics.

  • Data lakes store raw data at low cost, support all data types, and are ideal for machine learning.
  • Cloud data warehouses provide fast SQL queries and high concurrency but only for structured data.

Data scientists: Powering the future of AI and analytics

Data scientists sit at the intersection of analytics, machine learning and AI, turning raw data into predictive models, experiments and recommendations that guide business decisions. This article explores the evolution of the role, core skills needed, challenges faced, and how modern platforms accelerate the path from exploration to deployment.

  • Data scientists transform raw data into predictive models, experiments, and recommendations that drive business outcomes.
  • The role has expanded to include large language models, generative AI, and production deployment.

How Stagwell Built Privacy-Safe ID Matching on Databricks

Stagwell developed a privacy-safe identity matching solution using Databricks Clean Rooms and Marketplace Apps. Brands can install the app in their own environment, run identity matching against Stagwell's Identity Spine without exposing raw data, and activate audiences through its Agentic Targeting System. The approach reduces deployment from months to minutes and ensures compliance.

  • Brands face challenges in matching first-party data with identity graphs securely.
  • Databricks Marketplace Apps enable plug-and-play deployment in the brand's own workspace.

What is artificial intelligence (AI)?

Artificial intelligence (AI) is a branch of computer science that enables machines to perform tasks requiring human intelligence. This article covers how AI works, main types, real-world examples, limitations, and history.

  • AI learns patterns from data to make predictions or decisions, rather than being explicitly programmed.
  • Modern AI falls mainly into the 'limited memory' category, powering chatbots and recommendation engines.

Data Engineering for AI: A Practical Guide for Data Professionals

Data engineering is the foundational backbone of artificial intelligence systems. This guide for data professionals covers the complete lifecycle of data engineering for AI, from ingestion and architecture to feature engineering, generative AI integration, compliance, and career development.

  • Data engineering for AI shifts focus from traditional BI to managing large-scale, unstructured, and real-time data pipelines for ML and generative AI.
  • Automation, observability, and unified data architecture are core competencies for production-grade AI.

Data Warehouse Types: A Complete Guide to Architectures and Use Cases

A data warehouse is a centralized repository that stores structured historical data from multiple sources, optimized for complex queries and business intelligence. This guide covers the three primary types: Enterprise Data Warehouses (EDW), Data Marts, and Operational Data Stores (ODS), along with modern cloud, hybrid, and lakehouse architectures, helping you choose the right data warehouse for your needs.

  • The three primary data warehouse types are EDW, Data Mart, and ODS.
  • EDW serves as a single source of truth across the enterprise, but traditional EDWs face scalability challenges.

Payment Fraud Detection: How Banks and Businesses Stop Fraudulent Transactions

Payment fraud detection combines rule-based systems, machine learning, and real-time monitoring to block unauthorized transactions. Learn about major fraud types, detection techniques, and prevention strategies.

  • Major fraud types include credit card fraud, account takeover, card testing, friendly fraud, and authorized push payment scams.
  • Detection uses behavioral analytics, device fingerprinting, and real-time risk scoring.

What is an AI agent harness?

An AI agent harness is the software infrastructure that wraps around a large language model (LLM) and enables it to act on tasks, not just respond to prompts. This article explains the core components—tools, memory, sandboxes, and guardrails—and how they enable reliable action through a reason-act-observe loop. It covers eight building blocks, common failure modes, and why harness design is critical for enterprise AI strategy.

  • An AI agent harness turns model reasoning into reliable action, providing tools, memory, execution environments, and guardrails.
  • Harness design directly impacts agent performance; strong context management, orchestration, and verification matter as much as the underlying model.
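
A reason-act-observe loop with tools, memory, and a simple guardrail might look like the following. This is a minimal sketch, not the architecture from the article: `Harness`, `scripted_policy`, and the action dictionary format are hypothetical names chosen for illustration, and the scripted policy stands in for an LLM deciding the next step.

```python
from dataclasses import dataclass, field


@dataclass
class Harness:
    """Wraps a policy (the 'model') with tools, memory, and guardrails."""
    tools: dict
    allowed_tools: set
    max_steps: int = 5
    memory: list = field(default_factory=list)

    def run(self, policy, task):
        for _ in range(self.max_steps):
            # Reason: the policy picks the next action from the task and memory.
            action = policy(task, self.memory)
            if action["type"] == "finish":
                return action["answer"]
            name = action["tool"]
            # Guardrail: refuse any tool that is unknown or not on the allow-list.
            if name not in self.allowed_tools or name not in self.tools:
                observation = f"error: tool '{name}' is not permitted"
            else:
                # Act: execute the tool with the arguments the policy chose.
                observation = self.tools[name](*action["args"])
            # Observe: record the result so the next reasoning step can use it.
            self.memory.append((name, observation))
        # Guardrail: a step limit keeps a confused policy from looping forever.
        return "stopped: step limit reached"


def scripted_policy(task, memory):
    """Stand-in for an LLM: call 'add' once, then finish with the result."""
    if not memory:
        return {"type": "tool", "tool": "add", "args": (2, 3)}
    return {"type": "finish", "answer": f"{task} -> {memory[-1][1]}"}


harness = Harness(tools={"add": lambda a, b: a + b}, allowed_tools={"add"})
print(harness.run(scripted_policy, "sum"))
```

Keeping the allow-list and step limit in the harness rather than in the policy means those checks apply no matter which model sits behind `policy`, which is one reading of why harness design matters as much as the underlying model.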

Databricks and NVIDIA: Building for the Agentic Era

Databricks and NVIDIA are expanding their collaboration to deliver an end-to-end AI platform that accelerates model training, inference, and agentic AI development on governed enterprise data. New capabilities include multinode training in AI Runtime, GPU support in Databricks Free Edition, Model Serving enhancements, and support for NVIDIA technologies such as NVIDIA Agent Toolkit. Customers can use NVIDIA’s industry-specific AI frameworks directly within Databricks to accelerate use cases across healthcare, life sciences, supply chain, robotics, digital twins, and document intelligence.

  • Databricks and NVIDIA expand partnership for end-to-end AI platform covering training, inference, and agentic development.
  • New features: multinode training in AI Runtime, Free Edition GPU support, enhanced Model Serving, and NVIDIA Agent Toolkit integration.

The Partner Well-Architected Framework: What's New and What's Next

The Partner Well-Architected Framework (PWAF) provides AI-ready architecture guidance, technical standards, and best practices for partners building on Databricks. Since its February launch, new additions include the AI Partner Dev Kit, expanded architecture patterns, and the open-source Firefly reference application, helping partners accelerate development and adopt proven design patterns. PWAF continues to evolve with the Databricks platform and AI market, enabling partners to build differentiated products, measure adoption impact, and unlock growth opportunities.

  • PWAF offers AI-ready guidance for three partner architectures: Built-On, Connected, and Data Collaboration.
  • The new AI Partner Dev Kit includes 15+ tested skills for coding agents to use.

What’s coming next to Free Edition

Databricks is expanding its Free Edition with five new products—Genie Code, serverless GPUs, Lakebase, Agent Bricks, and Lakeflow Designer—to provide a complete toolkit for building data and AI projects. Over 500,000 users have already used Free Edition, and the new additions enable end-to-end development from data engineering to AI agents.

  • Over 500,000 users have used Databricks Free Edition since its launch 12 months ago.
  • Five new products are being added: Genie Code, serverless GPUs, Lakebase, Agent Bricks, and Lakeflow Designer.

Becoming the most comprehensive data & AI ecosystem on earth

Databricks expands its partner ecosystem with new Marketplace, Apps, OpenSharing, and Genie Agent capabilities that help partners build, distribute, share, and monetize solutions on the platform.

  • Marketplace Commit Drawdown allows partners to access customers' pre-committed spend to accelerate deals.
  • Partners can now list Databricks Apps on Marketplace, reaching over 20,000 customers.

Design Beautiful Dashboards in AI/BI

Learn how to design professional, on-brand dashboards in Databricks AI/BI using themes, layout strategies, font selection, color theory, and accessible visualization palettes.

  • Customize fonts, colors, and visualization palettes with dashboard themes for brand consistency.
  • Structure layouts using grid and scanning patterns (F or Z) based on audience needs.

What’s new in Genie Code at Data + AI Summit 2026

Genie Code is Databricks' specialized agent for data and ML work. Over the past year, Genie products have grown over 10x and are used by 90% of customers. New updates include a full-page command center for managing complex multi-threaded work, agentic ML workflows with native integration into MLflow, Model Serving, and compute awareness, scheduled tasks for autonomous work, and Genie ZeroOps for production operations.

  • New full-page command center for managing complex multi-threaded work with thread status, review points, and quick access to instructions, skills, and connectors.
  • Genie Code expands to ML workflows with agentic development, using Genie Ontology to learn team patterns, and integrates natively with MLflow, Model Serving, and compute awareness.

What’s new in Databricks Platform security and compliance at Data + AI Summit 2026

At Data + AI Summit 2026, Databricks announced new security and compliance capabilities including Automatic Identity Management (AIM) for Entra ID and Okta, Context-Based Ingress, Private Network Gateway, expanded Private Link for Lakebase, and new compliance certifications such as HITRUST, ISMAP, and upcoming FedRAMP High on Azure Commercial.

  • AIM for Entra ID is now GA on AWS and GCP; AIM for Okta is in Public Preview.
  • Context-Based Ingress enables zero-trust access policies for AI experiences.

Building an open ecosystem for AI governance with Unity AI Gateway

Databricks announces the Unity AI Gateway partner ecosystem at Data + AI Summit 2026, integrating security, identity, and governance vendors to help enterprises monitor, secure, and govern AI interactions at runtime.

  • Unity AI Gateway extends Unity Catalog governance to runtime interactions between models, agents, MCP servers, and tools.
  • New partners include Alice, CrowdStrike, Cyera, and others for real-time AI security and guardrails.

What’s New in the AI Platform: Agents for ML Engineering, Our Deep Learning Platform, and New Capabilities for Real-Time ML

Databricks announces new AI platform capabilities at Data + AI Summit 2026: Genie Code for ML (a coding agent integrated with the ML stack), AI Runtime (a public preview of serverless GPU training), and enhanced real-time ML support (low-latency, high-QPS Feature Store and Model Serving). These features accelerate the path from experimentation to production.

  • Genie Code for ML: coding agent that integrates with Databricks ML components for faster feature engineering, training, deployment, and monitoring.
  • AI Runtime (Public Preview): serverless GPU training environment for research-grade deep learning and fine-tuning without infrastructure management.

Introducing the Agentic CDP: A New Species of CDP for a New Era of Agents

The article introduces the Agentic CDP, a new type of Customer Data Platform built for the era of AI agents. It argues that traditional CDPs are obsolete because buyers now use agents that make decisions in milliseconds, demanding speed, hyper-personalization, and richer context (Golden Context). The Agentic CDP is embedded in the data foundation, powers Infinity Campaigns (always-on, AI-driven personalization), and is designed from the ground up for agents and humans to work together. Databricks' CustomerLake is presented as an implementation.

  • Traditional CDPs fail to meet the speed, personalization, and context demands of AI agent-driven buying.
  • Agentic CDP introduces Golden Context (live customer, business, and decision signals) and Infinity Campaigns (autonomous, real-time personalization).
