What's new in Databricks - November 2024

Dec 18, 2024

Novembre 2024 Release Highlights

Predictive Optimization is now enabled for all the new Databricks accounts
Fine grained access control on Single user compute is now GA

Unity Catalog

Service Credentials

Service credentials enable a secure authentication with your cloud tenant’s services using IAM roles and UC. It lets you govern access from Databricks to external cloud services like Glue or AWS Secret Manager. Learn more

Predictive Optimization is now enabled for all the new Databricks accounts

Previously you had to enable it through the admin account. Predictive Optimization can run Optimize, Vacuum and Analyze automatically

New table available in System tables

System.compute.warehouse table record SQL warehouse configuration

Cross View Sharing

Databricks recipients can now query shared views using any Databricks compute resource. Previously, if a recipient’s Databricks account was different from the provider’s account, recipients could only query a shared view using a serverless SQL warehouse

Data Engineering

Fine grained access control on Single user compute is now GA

In Databricks Runtime 16.0 and above, fine-grained access control on single user compute is generally available. In workspaces enabled for serverless compute, if a query is run on supported compute such as single user compute and the query accesses any of the following objects, the compute resource passes the query to the serverless compute to run data filtering:

Views defined over tables on which the user does not have the SELECT privilege.
Dynamic views.
Tables with row filters or column masks applied.
Materialized views and Streaming tables.

Create liquid clustered tables during streaming writes

You can now use ClusterBy to enable liquid clustering when creating new tables with Structured Streaming writes. Learn more

Optimize Full

You can optimize all records in a table that uses liquid clustering including data that has been previously clustered. Learn more

Avro updates

You can now use the recursiveFieldMaxDepth option with the from_avro and the avro data source. This option sets the maximum depth for schema recursion on the avro data source. Learn more

Databricks now supports Avro Schema reference with Confluent Schema registery. Learn more

Query history and query profile support for Delta Live tables

You have now improved tools for monitoring query performance in DLT. You can use the Compute filter on the query history page to show only queries processed using Delta Live Tables compute.

Platform

Enhanced experience in the notebooks

When you hover over the name of a Unity Catalog table in a notebook cell, the tooltip now includes the owner, last updated, size, description and a link to view the table in the catalog explorer

Expanded support for Python debugger in notebooks

The built in Python debugger in Databricks notebooks now is supported on Serverless compute.

Improved web terminal experience

Support for configuration files: You can now set persistent configurations for your Databricks web terminal with .bashrc configuration files.
Command cycle history: You can now cycle through commands executed in previous sessions with the “Up” and “Down” arrow keys. This makes it easy to recall and reuse previous commands without retyping them.
Shared Cluster Support: Our improved web terminal is available on shared clusters in Databricks Runtime 15.1 and above.

AI/BI Dashboards

Dashboard and Genie integration

Start a chat with Genie from published dashboards in the same interface, which automatically creates a AI/BI Genie space based on data in dashboards

General Updates

Dashboard parameters now support multiple selections & date ranges.

Dashboard dataset editor now supports multi-statement queries

Dashboard viewers can now open datasets in a new table from the canvas of a published dataset, making it much easier to view the data behind the visuals

Embedding

Dashboard embedding is now generally available, so you can now embed an AI/BI dashboard in an external website or application

AI/BI Genie

General Updates

Genie space authors can now add markdown descriptions to provide users with guidance on using their space.

Query results can now be downloaded up to 1GB in a csv file.

Genie is now turned on by default in the workspace toggle

More tables can now be added via the Add Data button in the Data tab

Databricks SQL

Streaming Tables and Materialized Views

Streaming tables now support time travel to allow you to query previous table versions based on timestamp or table versions

Schedules for streaming tables and materialized Views created via Databricks SQL now use human-readable syntax instead of CRON scheduling.

General Updates

In DBR 14.3 and above, running a SQL query in notebook generates an implicit DataFrame (_sqldf) that can be reused downstream, which is an addition to the _sqldf variable in Python cells.

GenAI & ML

Mosaic AI Model Training has been rebranded to Foundation Model Fine-tuning and AutoML. → Documentation
Foundation Model APIs pay-per-token workloads are now supported in all regions where Mosaic AI Model Serving is available. If your workspace is in a Model Serving region but not in a U.S. or EU region, your workspace must be enabled for cross-Geo data processing. → Documentation
Breaking change: Hosted RStudio is end-of-life in Databricks Runtime 16.0 and above. → Documentation
How to: Serving Vision Language Models on Databricks | Mlflow Extensions → Demo
How to: Using GenAI and Traditional ML for Anomaly & Outlier Detection → Demo
How to: Optimise RAG applications with semantic caching on Databricks → Demo

What's new in Databricks - November 2024

Unity Catalog

Data Engineering

Platform

AI/BI Dashboards

AI/BI Genie

Databricks SQL

GenAI & ML

Discussion about this post