What's new in Databricks - May 2025
A dedicated newsletter for the Data and AI Summit is coming soon.
May 2025 Release Highlights
Databricks Apps is GA
Metric views are in Public Preview
Data Engineering
File events on external locations
You can now enable file events on external locations that are defined in Unity Catalog. This makes file arrival triggers in jobs and file notifications in Auto Loader more scalable and efficient. Learn more
Views in ETL pipelines
The CREATE VIEW SQL command is now available in ETL pipelines. You can create a dynamic view of your data. Learn more
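A minimal sketch of a view in a pipeline, assuming hypothetical table and column names:

```sql
-- Hypothetical names: expose a cleaned subset of a pipeline table so
-- downstream steps can reuse it without duplicating the filter logic.
CREATE VIEW cleaned_orders AS
SELECT order_id, customer_id, amount
FROM raw_orders
WHERE amount IS NOT NULL;
```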
Pipelines and workflows enhancements
Tables created by DLT in Unity Catalog ETL pipelines can now be moved from one pipeline to another.
Previously, repaired tasks were unblocked once their direct dependencies completed. Now, repaired tasks wait for all transitive dependencies. For example, in a graph A → B → C, repairing A and C blocks C until A finishes, even though C's direct dependency B has already completed.
Databricks Asset Bundles in the workspace
Collaborating on Databricks Asset Bundles with other users in your organization is now easier with bundles in the workspace, which allows workspace users to edit, commit, test, and deploy bundle updates through the UI. Learn more
Other updates
Type widening is now supported for streaming from Delta tables (a sketch follows this list).
You can now use the IDENTIFIER clause with CREATE CATALOG, DROP CATALOG, COMMENT ON CATALOG, and ALTER CATALOG (detailed in the Data Warehousing section below).
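As referenced above, a minimal sketch of enabling type widening on a Delta table; table and column names are hypothetical:

```sql
-- Enable type widening on an existing Delta table (hypothetical names).
ALTER TABLE main.sales.events
  SET TBLPROPERTIES ('delta.enableTypeWidening' = 'true');

-- Widen an INT column to BIGINT; streaming reads from the table can
-- follow the widened type instead of failing on the schema change.
ALTER TABLE main.sales.events
  ALTER COLUMN user_id TYPE BIGINT;
```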
Governance and data sharing
System tables update
You can now use the system.access.workspaces_latest table to monitor the latest state of all active workspaces in your account.
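A quick sketch of querying it:

```sql
-- List the latest state of all active workspaces in the account.
SELECT *
FROM system.access.workspaces_latest;
```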
The system.lakeflow schema, which contains system tables related to jobs, is now enabled by default in all Unity Catalog workspaces.
The system.lakeflow.pipelines table is a slowly changing dimension table (SCD2) that tracks all pipelines created in your Databricks account.
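A minimal sketch of reading only the current row per pipeline; the change_time and created_by column names are assumptions based on the other system.lakeflow tables, so verify against the actual schema:

```sql
-- One row per pipeline: keep only the most recent change record
-- (SCD2 tables carry one row per change, not per pipeline).
SELECT pipeline_id, name, created_by, change_time
FROM system.lakeflow.pipelines
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY pipeline_id ORDER BY change_time DESC
) = 1;
```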
The lineage system tables (system.access.column_lineage and system.access.table_lineage) have been updated to better log entity information:
The entity_metadata column replaces entity_type, entity_run_id, and entity_id, which have been deprecated.
The record_id column is a new primary key for the lineage record.
The event_id column logs an identifier for the lineage event, which can be shared by multiple rows if they were generated by the same event.
The statement_id column logs the statement ID of the query that generated the lineage event. It is a foreign key that can be joined with the system.query.history table.
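A minimal sketch of the statement_id join described above (the selected columns are illustrative):

```sql
-- Tie table-lineage events back to the queries that produced them.
SELECT
  l.record_id,
  l.event_id,
  l.entity_metadata,
  q.statement_text
FROM system.access.table_lineage AS l
JOIN system.query.history AS q
  ON l.statement_id = q.statement_id
LIMIT 10;
```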
Clean Rooms enhancements
Databricks Clean Rooms now supports:
Up to 10 collaborators for more complex, multi-party data projects.
A new notebook approval workflow that enhances security and compliance by allowing designated runners and requiring explicit approval before execution.
Auto-approval options for trusted partners.
Difference views for easy review and auditing.
Platform
Predictive Optimization is now enabled for all Databricks accounts
Query Snippets are now available everywhere
Query snippets are segments of queries that you can share and trigger using autocomplete. You can now create query snippets through the View menu in the new SQL editor, and also in the notebook and file editors. Learn more
Jobs and pipelines share a single unified view
Automatically provision users with JIT
You can now enable just-in-time (JIT) provisioning to automatically create new user accounts during first-time authentication. When a user logs in to Databricks for the first time using single sign-on (SSO), Databricks checks whether the user already has an account. If not, Databricks instantly provisions a new user account using details from the identity provider. Learn more
Workflows scope support in the Databricks GitHub app for authoring GitHub Actions
Databricks made a change where you may receive an email requesting the Read and write access to Workflows scope for the Databricks GitHub app. This change makes the scope of the Databricks GitHub app consistent with the required scope of other supported authentication methods and allows users to commit GitHub Actions from Databricks Git folders using the Databricks GitHub app for authorization.
If you are the owner of a Databricks account where the Databricks GitHub app is installed and configured to support OAuth, you may receive the following notification in an email titled "Databricks is requesting updated permissions" from GitHub. Accept the new permissions in order to enable committing GitHub Actions from Databricks Git folders with the Databricks GitHub app.
SQL Authoring improvements
Filters applied to result tables now also affect visualizations, enabling interactive exploration without modifying the underlying query or dataset. Learn more
In a SQL notebook, you can now create a new query from a filtered results table or visualization. Learn more
You can hover over * in a SELECT * query to expand the columns in the queried table.
Custom SQL formatting settings are now available in the new SQL editor and the notebook editor. Click View > Developer Settings > SQL Format. Learn more
Improved UI for managing notebook dashboards
Databricks MFA is GA
Account admins can now enable multi-factor authentication (MFA) directly in Databricks, allowing them to either recommend or require MFA for users at login. Learn more
Dashboards, alerts and queries are supported as workspace files
Dashboards, alerts, and queries are now supported as workspace files, which means you can programmatically interact with these Databricks objects like any other file, from anywhere the workspace filesystem is available.
Databricks Apps is now GA
GenAI & ML
Llama 4 Maverick is available in EU regions
Llama 4 Maverick can now be used in EU regions where Foundation Model API pay-per-token is supported.
Learn more
Why it matters: This expands access to cutting-edge language models for European customers, supporting data residency and compliance needs while enabling advanced generative AI applications.
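A minimal sketch of calling a pay-per-token endpoint from SQL via ai_query; the endpoint name is an assumption, so check the Serving page in your workspace:

```sql
-- Invoke a Foundation Model API endpoint from SQL (endpoint name assumed).
SELECT ai_query(
  'databricks-llama-4-maverick',
  'Summarize this month''s Databricks release highlights in one sentence.'
) AS response;
```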
Llama 4 Maverick supports provisioned throughput workloads (Public Preview)
Llama 4 Maverick is now supported on Foundation Model APIs provisioned throughput workloads, available in Public Preview.
See Provisioned throughput Foundation Model APIs
See Preview limitations
Why it matters: Provisioned throughput enables consistent, high-performance access to Llama 4 Maverick for production workloads, removing latency variability typical in on-demand usage.
Mosaic AI Model Serving expands to new regions
Mosaic AI Model Serving is now available in ap-northeast-2, eu-west-2, and sa-east-1.
See Model serving feature availability
Why it matters: More regional availability helps global organizations reduce inference latency and comply with local data regulations.
Provisioned throughput workloads support compliance security standards
Provisioned throughput Foundation Model APIs now support HIPAA, PCI-DSS, and UK Cyber Essentials Plus compliance standards.
See compliance availability
Why it matters: Teams working with sensitive or regulated data can now use high-performance AI workloads while meeting critical compliance requirements.
Model Serving CPU and GPU workloads support compliance standards
Model Serving workloads (CPU and GPU) now support HIPAA, PCI-DSS, and UK Cyber Essentials Plus compliance via the compliance security profile.
See compliance availability
Why it matters: Ensures secure and compliant deployment of AI services across regulated industries, enabling safer AI adoption in healthcare, finance, and government sectors.
Databricks Asset Bundles now available in the workspace (Public Preview)
Collaborate more easily on Databricks Asset Bundles through the UI by editing, committing, testing, and deploying bundle updates directly from the workspace.
See Collaborate on bundles in the workspace
Why it matters: Makes it easier for cross-functional teams to manage code and configuration changes in production workflows without relying on local CLI setups.
Claude Sonnet 4 and Claude Opus 4 now on Mosaic AI Model Serving
The Anthropic Claude Sonnet 4 and Claude Opus 4 models are now available as Databricks-hosted foundation models via Mosaic AI Model Serving in US regions (pay-per-token).
See Foundation Model APIs
Why it matters: These models offer strong reasoning and summarization capabilities, giving users more choices for deploying conversational AI and enterprise LLM solutions.
Agent Bricks: Knowledge Assistant now in Beta
Agent Bricks offers a no-code tool for creating domain-specific Q&A chatbots. The Knowledge Assistant supports fine-tuning your assistant based on feedback from subject matter experts.
See how to use Knowledge Assistant
Why it matters: Lowers the barrier to building specialized AI assistants, enabling business teams to deploy effective AI solutions without needing deep ML expertise.
Featured videos
📺 Automate Cost Reporting with Genie + MCP in Databricks
What it covers:
A hands-on demo of using Genie and Anthropic's Model Context Protocol (MCP) to automate the generation of Databricks cost reports, complete with executive summaries and usage breakdowns via a conversational interface.
📺 Building a Multi-Genie AI Agent Architecture on Databricks
What it covers:
This video follows Aarni's journey as he builds a multi-genie AI architecture on Databricks. It features a “Genie of Genies” system that routes user queries to domain-specific LLM agents using orchestration, permissions, logging, and session memory—all deployed using Databricks Apps and service principals. Includes a live demo and open-source code walkthrough.
📺 KPMG + Databricks: GenAI for Scalable Financial Audits
What it covers:
KPMG and Databricks showcase how they use GenAI and Unity Catalog to automate audit workflows, including intelligent document parsing, entity extraction, and secure access controls across financial datasets.
AI/BI
AI/BI dashboards
Unity Catalog metric views are now in Public Preview: Metric views provide a centralized way to define and manage consistent, reusable, and governed core business metrics. They abstract complex business logic into a centralized definition, enabling organizations to define key performance indicators once and use them consistently across reporting tools like dashboards, Genie spaces, and alerts.
Global filters panel: Dashboard authors can now add filters to a global filters panel, which applies across all dashboard pages.
Relative date and time preview improvement: Previews for relative date and time controls are now hidden to avoid confusion caused by time zone differences.
Dashboard PDF snapshots improvement: Dashboard PDF snapshots now wrap the content of the dashboard and no longer include white space.
Genie in dashboards query update: Genie in dashboards will now only query from dashboard datasets, instead of the underlying tables within dashboard datasets.
Custom calculation support: Added support for custom calculations using the isnull and isnotnull functions and the null operator.
AI/BI dashboard schedule support: AI/BI dashboard schedules now support dashboards without embedded credentials.
Number formatting support: Number formatting in table and pivot table cells from notebooks, SQL editor, and legacy dashboards can now be carried over to AI/BI dashboards.
Select and add multiple tables and views: Users can now select and add multiple tables and views at once from the Add data dialog.
AI-assisted point maps in dashboards: You can now use natural-language prompts to generate point map charts in dashboards.
Deploy dashboard tasks using Databricks Asset Bundles: You can now use dashboard tasks within a job resource when deploying bundles.
AI/BI Genie
Popular query suggestions in Genie: Genie now suggests popular queries as example SQL for tables newly added to the space. Only popular queries that the author has access to are suggested. See Create a Genie space.
Increased accuracy with example values: Genie now collects example values when you add tables to a space, helping it better understand the data in your tables and generate more accurate responses. See Edit column metadata.
Edit column descriptions in Genie: You can now edit table and column descriptions at the Genie-space level. This allows you to fine-tune metadata descriptions within a specific Genie space. See Edit column metadata.
Column synonyms support: You can now add column synonyms to help Genie interpret your data, making column descriptions easier to understand. See Edit column metadata.
Hide unnecessary columns: You can now hide unnecessary columns from Genie to simplify the data model, without needing to create new UC assets.
SQL function de-emphasis: The option to add SQL functions has been de-emphasized, making adding example SQL the default action.
Increased value dictionary support: Each value dictionary in Genie now supports 1024 values per column, up from 255. Click Refresh value dictionary on your existing value dictionaries to increase cardinality support in those columns. See Use value dictionaries to improve Genie's accuracy.
Point maps support in Genie: Point maps are now supported in Genie.
Data Warehousing
IDENTIFIER support now available in Databricks SQL for catalog operations
You can now use the IDENTIFIER clause when performing the following catalog operations:
CREATE CATALOG
DROP CATALOG
COMMENT ON CATALOG
ALTER CATALOG
This new syntax allows you to dynamically specify catalog names using parameters defined for these operations, enabling more flexible and reusable SQL workflows. As an example of the syntax, consider CREATE CATALOG IDENTIFIER(:param), where param is a parameter provided to specify a catalog name.
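A minimal sketch of the pattern, where :catalog_name is a hypothetical parameter bound at run time:

```sql
-- Create, comment on, and drop a catalog whose name is passed as a parameter.
CREATE CATALOG IDENTIFIER(:catalog_name);
COMMENT ON CATALOG IDENTIFIER(:catalog_name) IS 'Provisioned via parameterized SQL';
DROP CATALOG IDENTIFIER(:catalog_name);
```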
New alert version in Beta
A new version of alerts is now in Beta. This version simplifies creating and managing alerts by consolidating query setup, conditions, schedules, and notification destinations into a single interface. You can still use legacy alerts alongside the new version.
Metric Views are in Public Preview
Unity Catalog metric views provide a centralized way to define and manage consistent, reusable, and governed core business metrics. They abstract complex business logic into a centralized definition, enabling organizations to define key performance indicators once and use them consistently across reporting tools like dashboards, Genie spaces, and alerts. Use a SQL warehouse running on the Preview channel (2025.16) or other compute resource running Databricks Runtime 16.4 or above to work with metric views.
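A minimal sketch of defining and querying a metric view, assuming the YAML-based definition format; the source table, dimension, and measure names are hypothetical:

```sql
-- Define a governed metric view over a fact table (names hypothetical).
CREATE VIEW main.sales.orders_metrics
WITH METRICS
LANGUAGE YAML
AS $$
version: 0.1
source: main.sales.orders
dimensions:
  - name: order_date
    expr: order_date
measures:
  - name: total_revenue
    expr: SUM(amount)
$$;

-- Query the measure with MEASURE(); the aggregation logic stays centralized
-- in the view definition rather than being repeated in each report.
SELECT order_date, MEASURE(total_revenue) AS revenue
FROM main.sales.orders_metrics
GROUP BY order_date;
```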
New queries in Drafts folder: New queries are now created by default in the Drafts folder. When saved or renamed, they automatically move out of Drafts.
Query snippets support: You can now create and reuse query snippets: predefined segments of SQL, such as JOIN or CASE expressions, with support for autocomplete and dynamic insertion points. Create snippets by choosing View > Query Snippets.
Audit log events: Audit log events are now emitted for actions taken in the new SQL editor.
Filters impact on visualizations: Filters applied to result tables now also affect visualizations, enabling interactive exploration without modifying the SQL query.