Apparate: Managing Libraries in Databricks with CI/CD

Posted in Apache Spark, apparate, CI/CD, continuous delivery, continuous integration, Continuous Processing, Customers, Education, Partners, Product

This is a guest blog from Hanna Torrence, Data Scientist at ShopRunner. Introduction: As leveraging data becomes a more vital component of organizations’ tech stacks, it becomes increasingly important for data teams to make use of software engineering best practices. The Databricks platform provides excellent tools for exploratory Apache Spark workflows in notebooks as well as […]

Open Sourcing Databricks Integration Tools at Edmunds

Posted in Apache Spark, Company Blog, Customers, Engineering Blog, Platform

This is a guest post from Shaun Elliott, Data Engineering Tech Lead, and Sam Shuster, Staff Engineer at Edmunds. What is Databricks and How is it Useful for Edmunds? Databricks is a cloud-based, fully managed, big data and analytics processing platform that leverages Apache Spark™ and the JVM. The big selling point of the Databricks Unified […]

Introducing Flint: A time-series library for Apache Spark

Posted in Apache Spark, Company Blog, Customers, Education, Engineering Blog, Flint, python, Spark SQL, Time Series

This is a joint guest community blog by Li Jin at Two Sigma and Kevin Rasmussen at Databricks; they share how to use Flint with Apache Spark. Introduction: The volume of data that data scientists face these days increases relentlessly, and we now find that a traditional, single-machine solution is no longer adequate to the demands […]

Announcing Databricks Runtime 4.2!

Posted in Announcements, Apache Spark, Company Blog, Customers, Databricks, Delta, Engineering Blog, Platform, Product, Runtime, Streaming

We’re excited to announce Databricks Runtime 4.2, powered by Apache Spark™. Version 4.2 includes updated Spark internals, new features, and major performance upgrades to Databricks Delta, as well as general quality improvements to the platform. We are moving quickly toward the Databricks Delta general availability (GA) release and we recommend you upgrade to Databricks Runtime […]

Viacom’s Journey to Improving Viewer Experiences with Real-time Analytics at Scale

Posted in Company Blog, Customers

With over 4 billion subscribers, Viacom is focused on delivering amazing viewing experiences to its global audiences. Core to this strategy is ensuring that petabytes of streaming content are delivered flawlessly through web, mobile and streaming applications. This is critically important during popular live events like the MTV Video Music Awards. Streaming this much video can […]