DAIS recap - Data Warehousing - Part 4
Data & AI Summit 2024 is a wrap! With 16k in-person attendees, 40k virtual attendees, and vibrant parties, it was the most exciting edition so far.
Did you miss the keynotes & all the announcements? Don’t worry, we’re giving you a full recap.
In a nutshell
AI/BI Dashboards allow end users to create dashboards with a low-code experience, either via a drag-and-drop canvas or through natural language via an AI-powered text-to-viz capability.
AI/BI Genie is a conversational experience for business users to interrogate their data through natural language and follow up with visualizations. You can tune and improve responses to produce accurate and reproducible answers.
AI functions allow you to invoke Gen AI models to perform specific tasks such as sentiment analysis, classification, summarization, translation, etc. on columns in structured tables using SQL functions. You can wrap any custom model serving endpoint in a SQL function.
Databricks AI/BI
Databricks AI/BI is Databricks' business intelligence product. It is built on a compound AI system whose interacting components understand data and comments, generate complex SQL queries, and produce visualizations, so that business users can carry out self-service analytics. There are two complementary products:
AI/BI Dashboards (formerly Lakeview dashboards) is GA. AI/BI Dashboards allow end users to create dashboards with a low-code experience, either via a drag-and-drop canvas or through natural language via an AI-powered text-to-viz capability. These dashboards offer a wide set of visualization capabilities, including cross-filtering, as well as sharing and exporting options.
AI/BI Genie is now in Public Preview. It’s a conversational experience for business users to interrogate their data through natural language and follow up with visualizations with a single click. Genie uses a set of capabilities to allow teams to tune and improve responses so that it produces trustworthy & accurate answers that are reproducible:
Setting up the space with pre-set questions to get business users started
Asking follow-up questions for clarification when it does not have sufficient information, and using the responses to guide its prompts
Unity Catalog table and column comments to enrich Genie’s knowledge base so that it understands an organization’s terminology and business jargon
Instructions saved as text that guide the LLM’s behavior and are reflected in the prompt
Sample SQL statements that users can save to teach the model to answer specific questions
Certified answers that can be saved to answer specific questions so that results are always reproducible with an indicator to show end users that the answers are trustworthy
Logging end users’ questions and answers for quality monitoring
Learn more about the AI/BI experience with Miranda Luna, Sr Manager Product Management, and Chao Cai, Sr Director of Engineering.
Data warehousing performance improvements
Data warehousing performance improvements: watch the Data + AI Summit talk with a demo
Liquid Clustering optimizes data layouts for performance based on cluster keys that you specify, and those keys can evolve over time without rewriting existing data. Automatic Liquid Clustering goes a step further: it uses AI to learn from the query patterns of your workloads, then automatically selects and applies the cluster keys that will bring optimal performance.
Auto Statistics uses a model to identify the columns whose statistics are most valuable and collects those statistics automatically on writes, so that the query optimizer can generate an optimal plan and improve Spark execution. This removes the need for users to run ANALYZE explicitly to generate query optimizer statistics.
Predictive I/O 2.0 is an improvement to Predictive I/O, which uses deep learning to improve query access patterns by estimating the next matching row, reducing the amount of data scanned. Version 2.0 has larger AI models powered by Mosaic that work with a larger set of feature vectors and can therefore accelerate many more workloads.
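To make concrete what these features take off your plate, here is a minimal sketch of the statements you would otherwise issue by hand. It only builds SQL strings in Python; running them requires a Databricks SQL warehouse or cluster. The table name and cluster keys are hypothetical, and the `CLUSTER BY AUTO` form is an assumption about how automatic key selection is enabled, so check the Databricks documentation for your runtime before relying on it.

```python
from typing import Dict, List


def liquid_clustering_statements(table: str, keys: List[str]) -> Dict[str, str]:
    """Build the SQL statements related to clustering and statistics.

    `table` and `keys` are placeholders supplied by the caller; nothing is
    executed here, the function only returns SQL text.
    """
    cols = ", ".join(keys)
    return {
        # Liquid Clustering with explicitly chosen keys (keys can be changed later).
        "manual_keys": f"ALTER TABLE {table} CLUSTER BY ({cols})",
        # Assumed syntax for letting Databricks choose the keys automatically.
        "automatic_keys": f"ALTER TABLE {table} CLUSTER BY AUTO",
        # The explicit statistics step that Auto Statistics is meant to replace.
        "manual_stats": f"ANALYZE TABLE {table} COMPUTE STATISTICS FOR ALL COLUMNS",
    }


# Hypothetical table and keys, for illustration only.
statements = liquid_clustering_statements(
    "main.sales.orders", ["order_date", "customer_id"]
)
for name, sql in statements.items():
    print(f"{name}: {sql}")
```

In a notebook, each string could be passed to `spark.sql(...)`. With the automatic features enabled, the first and third statements should no longer be something you need to maintain yourself.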
AI Functions
AI Functions allow you to invoke Gen AI models from SQL to perform specific tasks such as sentiment analysis, classification, summarization, extraction, and translation on columns in structured tables. AI Query extends this capability beyond the pre-canned functions by allowing you to wrap any custom model serving endpoint in a SQL function.
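As an illustration, the sketch below assembles a single query that applies several AI Functions to one text column. It is a minimal example, not an official recipe: the table `main.demo.reviews`, the column `review_text`, the classification labels, and the endpoint name `my-endpoint` are all hypothetical, and the function signatures reflect our reading of the Databricks SQL documentation, so verify them against your workspace. The Python code only builds the SQL string; executing it requires a Databricks SQL warehouse.

```python
def build_review_enrichment_sql(table: str, text_col: str, endpoint: str) -> str:
    """Return a Databricks SQL query that applies AI Functions to `text_col`.

    All three arguments are caller-supplied placeholders; no query is run here.
    """
    return f"""
SELECT
  {text_col},
  ai_analyze_sentiment({text_col}) AS sentiment,
  ai_classify({text_col}, ARRAY('billing', 'shipping', 'quality')) AS topic,
  ai_summarize({text_col}, 20) AS summary,
  ai_translate({text_col}, 'en') AS english_text,
  ai_query('{endpoint}', {text_col}) AS custom_model_output
FROM {table}
""".strip()


# Hypothetical names, for illustration only.
review_sql = build_review_enrichment_sql(
    "main.demo.reviews", "review_text", "my-endpoint"
)
print(review_sql)
```

The first four functions are the pre-canned, task-specific ones, while `ai_query` is the generic entry point for a custom model serving endpoint.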
SQL analysts can now query Mosaic Vector Search in SQL
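A query against a vector search index might look like the sketch below. The index name and question are hypothetical, and the named arguments of the `vector_search` table function (`index`, `query`, `num_results`) are an assumption based on the documentation available to us; argument names may differ between releases, so check the current reference. As before, the Python code only builds the SQL text and does no escaping, so it should only be used with trusted literals.

```python
def build_vector_search_sql(index_name: str, question: str, num_results: int = 5) -> str:
    """Return a SQL query that searches a vector search index.

    `index_name` and `question` are inserted as plain string literals with no
    escaping, which is acceptable for a demo but not for untrusted input.
    """
    return (
        "SELECT * FROM vector_search("
        f"index => '{index_name}', "
        f"query => '{question}', "
        f"num_results => {num_results})"
    )


# Hypothetical index and question, for illustration only.
search_sql = build_vector_search_sql(
    "main.demo.docs_index", "how do I reset my password", 3
)
print(search_sql)
```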
Follow us on LinkedIn: Quentin & Youssef & Lara & Maria & Beatrice