What's new in Databricks - February 2025
February 2025 Release Highlights
Automatic liquid clustering is now in Public Preview
Serverless Compute can now use instance profiles for data access
Data Engineering
You can now write from pipelines to external services with Delta Live Tables sinks
The Delta Live Tables sink API is in Public Preview. With Delta Live Tables sinks, you can write data transformed by your pipeline to targets such as event streaming services like Apache Kafka or Azure Event Hubs, as well as external tables managed by Unity Catalog or the Hive metastore. Documentation
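A sink definition might look like the following sketch, assuming the preview dlt Python API (create_sink and append_flow); the sink name, topic, broker address, and source table are illustrative placeholders:

```python
import dlt

# Define a Kafka sink as a write target for the pipeline (preview API).
# Option keys and values below are illustrative placeholders.
dlt.create_sink(
    name="kafka_sink",
    format="kafka",
    options={
        "kafka.bootstrap.servers": "broker:9092",
        "topic": "orders_enriched",
    },
)

# An append flow routes rows from a streaming source into the sink.
@dlt.append_flow(name="orders_to_kafka", target="kafka_sink")
def orders_to_kafka():
    # Kafka sinks expect a `value` column (string or binary).
    return (
        spark.readStream.table("orders_silver")
        .selectExpr("to_json(struct(*)) AS value")
    )
```

This runs only inside a Delta Live Tables pipeline, where the dlt module and the spark session are provided by the runtime.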
Standard access mode compute now supports more Scala streaming functions
Standard access mode compute now supports the Scala streaming function DataStreamWriter.foreach on Databricks Runtime 16.1 and above. On Databricks Runtime 16.2 and above, the functions DataStreamWriter.foreachBatch and KeyValueGroupedDataset.flatMapGroupsWithState are also supported.
Automatic liquid clustering is now in Public Preview
You can now enable automatic liquid clustering on Unity Catalog managed tables. Automatic liquid clustering intelligently selects clustering keys to optimize data layout for your queries. Documentation
New functions allowed in Delta Lake generated columns (DBR 16.2+)
On Databricks Runtime 16.2 and above, you can use the timestampdiff and timestampadd functions in Delta Lake generated column expressions. Documentation
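As an illustration, a generated column computed with timestampadd might be declared like this (a sketch; the table and column names are hypothetical, and spark is the Databricks session):

```python
# Sketch: a Delta table whose generated column uses timestampadd (DBR 16.2+).
spark.sql("""
    CREATE TABLE events (
      event_time TIMESTAMP,
      -- Materialize an SLA deadline one hour after the event time.
      sla_deadline TIMESTAMP GENERATED ALWAYS AS (
        timestampadd(HOUR, 1, event_time)
      )
    ) USING DELTA
""")
```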
Support for SQL pipeline syntax
SQL pipeline syntax structures a standard query as a sequence of composable steps chained with the |> operator, rather than a single monolithic statement.
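A pipeline query reads top to bottom, with each |> step transforming the previous result. A sketch (table and column names are hypothetical):

```python
# Sketch of SQL pipeline syntax: filter, aggregate, sort, and limit
# expressed as a chain of |> steps instead of nested clauses.
spark.sql("""
    FROM orders
    |> WHERE order_date >= '2025-01-01'
    |> AGGREGATE SUM(amount) AS total GROUP BY customer_id
    |> ORDER BY total DESC
    |> LIMIT 10
""").show()
```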
Trailing blank insensitive collations
Support is added for trailing blank insensitive collations, extending the collation support added in Databricks Runtime 16.1. For example, these collations treat 'Youssef' and 'Youssef ' as equal.
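The comparison semantics amount to ignoring trailing blanks before comparing. A minimal plain-Python illustration of that equivalence (not the Databricks collation API itself):

```python
def trailing_blank_insensitive_eq(a: str, b: str) -> bool:
    """Compare two strings while ignoring trailing blanks, mirroring
    the semantics of a trailing-blank-insensitive collation."""
    return a.rstrip(" ") == b.rstrip(" ")

print(trailing_blank_insensitive_eq("Youssef", "Youssef "))  # True
print(trailing_blank_insensitive_eq("Youssef", " Youssef"))  # False: leading blanks still count
```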
Governance
Unity Catalog-governed access to external cloud services using service credentials is now GA
Service credentials enable simple and secure authentication with your cloud tenant’s services from Databricks. Service credentials support Scala and Python SDKs. Documentation
Delta Sharing behavior change
Shares created using the SQL command ALTER SHARE <share> ADD TABLE <table> now have history sharing (WITH HISTORY) enabled by default.
Preview files in volumes
Volumes now display previews for common file formats in Catalog Explorer, including images, text files, JSON, YAML, and CSV.
Platform
You can now download results as Excel in notebooks connected to SQL warehouses
Serverless compute can now use instance profiles for data access
Notebooks are supported as workspace files
You can now programmatically interact with notebooks from anywhere the workspace filesystem is available, including writing, reading, and deleting notebooks like any other file. Documentation.
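Because notebooks behave like ordinary files under the workspace filesystem, standard file APIs apply. A sketch (the path is a placeholder, and the /Workspace mount is only available on Databricks compute):

```python
import os

# Hypothetical notebook path under the workspace filesystem mount.
nb_path = "/Workspace/Users/someone@example.com/my_notebook.py"
copy_path = nb_path.replace(".py", "_copy.py")

# Read a notebook's source like any other file.
with open(nb_path) as f:
    source = f.read()

# Write a modified copy alongside it.
with open(copy_path, "w") as f:
    f.write(source)

# Delete the copy when done.
os.remove(copy_path)
```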
OAuth secrets for service principals now have a configurable lifetime
Newly created OAuth secrets default to a maximum lifetime of two years, whereas previously, they did not expire.
GenAI & ML
Connect AI agent tools to external services (Public Preview)
You can connect AI agent tools to external applications such as Slack, Google Calendar, or any service with an API using HTTP requests. Set up authentication to the external service using a bearer token, OAuth 2.0 machine-to-machine, or OAuth 2.0 user-to-machine shared credentials. Requirements: your workspace must be Unity Catalog enabled, you must have network connectivity from a Databricks compute resource to the external service, you must use compute with dedicated access mode (formerly single user access mode) on Databricks Runtime 15.4 or above, and you must have a pro or serverless SQL warehouse. Documentation
Updates to model serving billing records
To improve cost observability, model serving billing records are now logged every five minutes rather than at one-hour intervals.
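With finer-grained records, you can inspect recent serving cost at five-minute resolution. A hedged sketch that assumes the system.billing.usage system table and its billing_origin_product column; verify the table schema and the MODEL_SERVING value against your workspace's system tables before relying on it:

```python
# Sketch: list recent model serving usage records, now emitted
# every five minutes. Table and column names are assumptions.
spark.sql("""
    SELECT usage_start_time, usage_end_time, sku_name, usage_quantity
    FROM system.billing.usage
    WHERE billing_origin_product = 'MODEL_SERVING'
      AND usage_start_time >= current_timestamp() - INTERVAL 1 DAY
    ORDER BY usage_start_time DESC
""").show()
```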
MLflow Tracing is GA
You can track inputs, outputs, and other metadata associated with each step of a model or agent request.
Tracing lets you pinpoint the source of bugs and unexpected behavior, compare performance across models or agents, and build new datasets to improve quality.
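Instrumentation is typically a decorator or context manager. A minimal sketch using MLflow's mlflow.trace decorator and mlflow.start_span; the application functions themselves are hypothetical:

```python
import mlflow

@mlflow.trace
def retrieve_docs(query: str) -> list:
    # Inputs and outputs of this step are captured on the trace.
    return [f"doc about {query}"]

@mlflow.trace
def answer(query: str) -> str:
    docs = retrieve_docs(query)
    # A manual span records metadata for a sub-step.
    with mlflow.start_span(name="generate") as span:
        span.set_inputs({"docs": docs})
        response = f"Answer based on {len(docs)} document(s)"
        span.set_outputs({"response": response})
    return response

answer("liquid clustering")
```

Each call produces a trace whose steps, inputs, and outputs are browsable in the MLflow UI.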
AI/BI
AI/BI dashboards
Quickly navigate to the most popular dashboards: Dashboard thumbnails are now shown for all dashboards published with embedded credentials. The dashboards listing page attempts to show thumbnails for the four most popular dashboards you can access. Dashboards you don't have access to do not appear on the listing page.
Pivot tables support more cells: Pivot tables now accommodate up to 1,000 rows and 1,000 columns, up from the previous limit of 100 rows and 100 columns.
Edit box plot display names: You can now edit the Y-axis display names in box plots, enabling a more customized presentation.
Multiple Y fields for generated charts: Visualizations generated using the Databricks Assistant now support multiple Y fields.
ColorBy performance optimization: Rendering is now optimized for charts with a very large number of groupings. This optimization prevents performance issues and crashes.
Customize sort order and label angles: Control the sorting order of data on the axis and adjust the angle of labels in visualizations. See Format axis settings.
Custom column widths for tables: All column types in table visualizations now support custom widths. Drag the handle at the top of a column to adjust its size.
Enhanced value display in stacked bar and pie charts: Stacked bar charts and pie charts now display raw values and percentages together.
Clone dashboard pages: You can now duplicate dashboard pages. See Clone a page.
Updated timezone handling: Visualizations now use the timezone from the dataset or compute resource instead of the browser settings. If a widget includes two columns with different time zones, the second is formatted to match the first.
AI/BI Genie
Edit parameters in a response: You can now edit the parameter values used to generate a response to a trusted question. See Review a response.
View data sources: Genie now displays the tables used as source data for each response.
Avoid unnecessary wait times: You can now cancel a SQL query execution during the Waiting for warehouse state to avoid unnecessary wait times.
Improved reasoning about generated SQL: Genie’s model for translating text into SQL now uses Chain-of-Thought reasoning to break down questions into manageable steps: first, identifying useful columns; next, planning the SQL generation; and finally, combining the parts into a single SQL query. This upgrade results in more robust and accurate SQL translations. You should see improvements in Genie’s ability to pick precise filter conditions and improved reasoning on nuanced questions.
Sharing a Genie space now sends an email notification to the recipient. See Share a Genie space.