What's new in Databricks for February 2024
Databricks Learning Festival
Databricks Learning Festival (Virtual) is back. The event starts on 29 February in celebration of Digital Learning Day and to help you leap into learning! 💡🧠 📚🎓
Access free, self-paced, role-based training
Earn a 50%-off Databricks certification voucher
Available from 29 February 2024 to 13 March 2024
Below are the learning plans offered:
Data Engineer Learning Plan - Data Engineering with Databricks
Data Engineer Learning Plan - Advanced Data Engineering with Databricks
Data Analyst Learning Plan - Data Analysis with Databricks SQL
Machine Learning Practitioner Learning Plan - Scalable Machine Learning with Apache Spark
Machine Learning Practitioner Learning Plan - Machine Learning in Production
Generative AI Engineering Pathway - Generative AI Engineering with Databricks
Data Engineering on Databricks
File triggers in Databricks Workflows are GA
File arrival triggers are now generally available on all cloud providers. With this release, you can trigger a Databricks job when new files arrive in a Unity Catalog volume, in addition to the existing support for Unity Catalog external locations. Learn more
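As a rough illustration, a file arrival trigger can be declared in the job settings sent to the Jobs API. The sketch below only builds that fragment of the request. The volume path is hypothetical, and the field names follow the Jobs API 2.1 `trigger.file_arrival` schema as we understand it, so check them against the current API reference before relying on them.

```python
import json


def file_arrival_trigger(volume_path: str, min_seconds_between_runs: int = 60) -> dict:
    """Build the `trigger` fragment of a job definition.

    Assumption: field names follow the Jobs API 2.1 schema
    (`trigger.file_arrival.url`, `min_time_between_triggers_seconds`).
    """
    if not volume_path.startswith("/Volumes/"):
        raise ValueError("expected a Unity Catalog volume path under /Volumes/")
    return {
        "trigger": {
            "pause_status": "UNPAUSED",
            "file_arrival": {
                "url": volume_path,
                "min_time_between_triggers_seconds": min_seconds_between_runs,
            },
        }
    }


# Hypothetical volume path, for illustration only.
settings = file_arrival_trigger("/Volumes/main/landing/incoming")
print(json.dumps(settings, indent=2))
```

The resulting dictionary would be merged into the rest of the job definition (tasks, compute, and so on) before the job is created or updated.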
Ingest XML from the UI
The “add data” UI now supports XML file upload and ingestion from cloud object storage.
Governance & Unity Catalog
Volumes are now GA and are getting better!
You can use the UI to create Unity Catalog managed tables from data stored in UC volumes. Individual files can easily be deleted from the UI. Learn more
Full-page, AI-powered semantic search
Databricks search has been enhanced with DatabricksIQ, offering a smarter, AI-driven search experience. The upgraded full-page search provides enriched metadata for objects and expanded filters for refining search results.
AI, LLM & Data Science
LLM over SQL
It’s now super easy to invoke an existing Databricks Model Serving endpoint from SQL, then parse and return its response. Because these endpoints can be backed by any model (LLM or traditional ML), tasks such as forecasting, data generation, extraction, and sentiment analysis become super simple. Learn more
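Here is a minimal sketch of what such a call could look like, assuming the feature described above is the `ai_query()` SQL function. The helper only assembles the SQL text. The endpoint, table, and column names are hypothetical, and in a notebook you would pass the resulting string to `spark.sql(...)`.

```python
def ai_query_sql(endpoint: str, table: str, text_column: str, instruction: str) -> str:
    """Return a SQL statement that calls ai_query() once per row.

    Assumption: ai_query(endpoint_name, request) is available on the
    compute that runs the statement.
    """
    for value in (endpoint, instruction):
        # Keep the sketch simple: refuse characters that would need escaping.
        if "'" in value or "\\" in value:
            raise ValueError("quotes and backslashes are not supported in this sketch")
    prompt = f"CONCAT('{instruction} ', {text_column})"
    return (
        f"SELECT {text_column}, "
        f"ai_query('{endpoint}', {prompt}) AS response "
        f"FROM {table}"
    )


# Hypothetical endpoint, table, and column names.
query = ai_query_sql(
    endpoint="my-llm-endpoint",
    table="main.support.tickets",
    text_column="ticket_text",
    instruction="Classify the sentiment of this ticket as positive or negative:",
)
print(query)
# In a Databricks notebook: spark.sql(query).show()
```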
New built-in LLM AI functions
New AI functions invoke state-of-the-art generative AI models from the Databricks Foundation Model APIs to perform various tasks.
SQL & Warehouse update
Lakeview Dashboards
Lakeview Dashboards offer a new dashboarding experience, optimized for ease of use, broad distribution, governance and security.
Watch this exclusive discussion with Miranda Luna, Product Manager of the Lakeview dashboarding experience, and Matthieu Lamairesse, Databricks SQL expert, to better understand the roadmap.
Run SQL Notebook jobs on a SQL Warehouse
You can now schedule and execute SQL notebook jobs using a SQL warehouse as the compute resource. Learn more
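To make this concrete, the sketch below assembles a job definition in which a notebook task points at a SQL warehouse instead of a cluster. The notebook path, warehouse ID, and job name are hypothetical. The field names (`notebook_task.warehouse_id`, `schedule.quartz_cron_expression`) follow the Jobs API 2.1 schema as we understand it and should be verified against the API reference.

```python
import json


def sql_notebook_job(name: str, notebook_path: str, warehouse_id: str, cron: str) -> dict:
    """Build a job definition that runs a SQL notebook on a SQL warehouse."""
    return {
        "name": name,
        "schedule": {
            "quartz_cron_expression": cron,
            "timezone_id": "UTC",
        },
        "tasks": [
            {
                "task_key": "run_sql_notebook",
                "notebook_task": {
                    "notebook_path": notebook_path,
                    # Assumption: the warehouse is referenced by its ID.
                    "warehouse_id": warehouse_id,
                },
            }
        ],
    }


# Hypothetical values; the cron expression means "every day at 06:00".
job = sql_notebook_job(
    name="daily-sales-report",
    notebook_path="/Workspace/Users/someone@example.com/sales_report",
    warehouse_id="1234567890abcdef",
    cron="0 0 6 * * ?",
)
print(json.dumps(job, indent=2))
```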
Workspace file support for dbt and SQL file tasks is GA
Using workspace files, you can run dbt Core projects as a task in Databricks Workflows. Learn more
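As a hedged sketch, a dbt task that reads its project from workspace files might be described like this in a job definition. The project directory and warehouse ID are hypothetical, the `dbt_task` field names are our reading of the Jobs API 2.1 schema, and the compute and library settings a real dbt task needs are left out for brevity.

```python
import json
from typing import List, Optional


def dbt_workspace_task(
    project_directory: str,
    warehouse_id: str,
    commands: Optional[List[str]] = None,
) -> dict:
    """Build a task entry that runs a dbt project stored in workspace files."""
    return {
        "task_key": "run_dbt_project",
        "dbt_task": {
            # Assumption: "WORKSPACE" selects workspace files rather than a Git source.
            "source": "WORKSPACE",
            "project_directory": project_directory,
            "commands": commands or ["dbt deps", "dbt run"],
            "warehouse_id": warehouse_id,
        },
    }


# Hypothetical project location and warehouse ID.
task = dbt_workspace_task(
    project_directory="/Workspace/Shared/analytics/dbt_project",
    warehouse_id="1234567890abcdef",
)
print(json.dumps(task, indent=2))
```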
Delta Sharing
Delta Sharing supports tables that use deletion vectors
Delta Sharing now supports sharing tables that use deletion vectors. Deletion vectors allow faster update and delete operations, and will soon be enabled by default on all your tables. Learn more
Notebooks for monitoring and managing Delta Sharing egress costs are now available
In Databricks Marketplace, the listing Delta Sharing Egress Pipeline includes two notebooks that you can clone and use to monitor egress usage patterns and costs associated with Delta Sharing.
Both of these notebooks create and execute a Delta Live Tables pipeline:
IP Ranges Mapping Pipeline notebook
Egress Cost Analysis Pipeline notebook
When you run these notebooks as a Delta Live Tables template, they will automatically generate a detailed cost report. Logs are joined with cloud provider IP range tables and Delta Sharing system tables to generate egress bytes transferred, attributed by share and recipient.
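To give a feel for the kind of attribution the report performs, here is a small, self-contained sketch in plain Python. It is not the logic of the Marketplace notebooks: the log rows, IP ranges, and region names are made-up sample data, and the real pipeline works on system tables with Delta Live Tables rather than in-memory lists.

```python
import ipaddress
from collections import defaultdict

# Hypothetical IP range table: (network, region label).
IP_RANGES = [
    (ipaddress.ip_network("203.0.113.0/24"), "region-a"),
    (ipaddress.ip_network("198.51.100.0/24"), "region-b"),
]

# Hypothetical access log rows.
ACCESS_LOGS = [
    {"share": "sales_share", "recipient": "acme", "client_ip": "203.0.113.10", "bytes": 1_000},
    {"share": "sales_share", "recipient": "acme", "client_ip": "203.0.113.99", "bytes": 2_500},
    {"share": "ops_share", "recipient": "globex", "client_ip": "198.51.100.7", "bytes": 4_000},
    {"share": "ops_share", "recipient": "globex", "client_ip": "192.0.2.1", "bytes": 500},
]


def region_for(ip: str) -> str:
    """Map a client IP to a region label, or 'unknown' if no range matches."""
    address = ipaddress.ip_address(ip)
    for network, region in IP_RANGES:
        if address in network:
            return region
    return "unknown"


def egress_by_share_and_recipient(logs) -> dict:
    """Sum bytes transferred per (share, recipient, destination region)."""
    totals = defaultdict(int)
    for row in logs:
        key = (row["share"], row["recipient"], region_for(row["client_ip"]))
        totals[key] += row["bytes"]
    return dict(totals)


report = egress_by_share_and_recipient(ACCESS_LOGS)
for (share, recipient, region), total_bytes in sorted(report.items()):
    print(f"{share:12} {recipient:8} {region:9} {total_bytes:>6} bytes")
```

Multiplying each total by a per-region egress price would turn the byte counts into a cost estimate.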
Support for Cloudflare R2 storage to avoid cross-region egress fees
You can now use Cloudflare R2 as cloud storage for data registered in Unity Catalog. Cloudflare R2 is intended primarily for Delta Sharing use cases in which you want to avoid the data egress fees charged by cloud providers when data crosses regions. Learn more
Notebook & DevX
Notebook 2.0
Databricks Notebooks are getting better every day! Jason Messer, Product Manager at Databricks, shows us the new notebook experience.
In a nutshell…
On AWS, you can now configure firewalls on your resources to allow access from the Databricks serverless compute plane. Learn more
You can now use all functions in the spark.catalog API in both Python and Scala on compute configured with shared access mode.
Databricks Runtime 14.3 LTS is GA
Databricks Connect is GA for Scala
Databricks Git server proxy no longer requires CAN_ATTACH_TO permissions
You can now request access when opening a link to a Lakeview dashboard that you do not have permission to view.
Lakeview dashboard filters now have explicit All and None options. Authors can choose to hide the All option in single select filters.
You can now set minimum and maximum values for axes on Lakeview dashboard charts.