Simplifying Change Data Capture with Databricks Delta

Posted in Apache Spark, CDC, Change Data Capture, Company Blog, Databricks Delta, Education, Engineering Blog, Product

A common use case that we run into at Databricks is customers looking to perform change data capture (CDC) from one or many sources into a set of Databricks Delta tables. These sources may be on-premises or in the cloud, operational transactional stores, or data warehouses. The common glue that binds them all is […]

Building a Real-Time Attribution Pipeline with Databricks Delta

Posted in Adhoc Analysis, Advertising Analytics, Apache Spark, bi, Company Blog, Databricks Delta, Ecosystem, Education, Engineering Blog, Kinesis, Machine Learning, Platform, Product, Spark Streaming, Streaming, Structured Streaming, Tableau

In digital advertising, one of the most important things to be able to deliver to clients is information about how their advertising spend drove results. The more quickly we can provide this, the better. To tie conversions or engagements to the impressions served in an advertising campaign, companies must perform […]

Processing Petabytes of Data in Seconds with Databricks Delta

Posted in Apache Spark, Databricks Delta, Engineering Blog, Machine Learning, Spark SQL, Streaming, Structured Streaming, Unified Analytics Platform

Databricks Delta is a unified data management system that brings data reliability and fast analytics to cloud data lakes. In this blog post, we take a peek under the hood to examine what makes Databricks Delta capable of sifting through petabytes of data within seconds. In particular, we discuss Data Skipping and ZORDER Clustering. […]

Simplify Streaming Stock Data Analysis Using Databricks Delta

Posted in Apache Spark, Data Lakes, Data Warehousing, Databricks Delta, Ecosystem, Education, financial, Machine Learning, Platform, Product, Stock Prices, Streaming

Traditionally, real-time analysis of stock data was a complicated endeavor due to the complexities of maintaining a streaming system while ensuring transactional consistency across legacy and streaming data. Databricks Delta helps solve many of the pain points of building a streaming system to analyze stock data in real time. In the following diagram, we provide […]