Efficient Databricks Deployment Automation with Terraform

Posted in CI/CD, cloud automation, Company Blog, Customers, Ecosystem, Education, Engineering Blog, Platform

Managing cloud infrastructure and provisioning resources can be a headache that DevOps engineers are all too familiar with. Even the most capable cloud admins can get bogged down with managing a bewildering number of interconnected cloud resources – including data streams, storage, compute power, and analytics tools. Take, for example, the following scenario: a customer […]
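
The full post walks through Terraform configurations; as a rough, hypothetical illustration of the kind of provisioning call such automation ultimately drives, here is a Python sketch against the Databricks Clusters REST API. The workspace URL, token variable, and node settings are placeholders, not details from the post.

```python
# Hypothetical sketch: the kind of cluster-provisioning call that Terraform-based
# automation ultimately drives. Workspace URL, token variable, and node settings
# are placeholders; adapt them to your environment.
import os
import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = os.environ["DATABRICKS_TOKEN"]  # personal access token, assumed to be set

cluster_spec = {
    "cluster_name": "etl-cluster",        # illustrative name
    "spark_version": "4.2.x-scala2.11",   # use a runtime version available in your workspace
    "node_type_id": "i3.xlarge",          # example AWS node type
    "num_workers": 2,
    "autotermination_minutes": 30,
}

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=cluster_spec,
)
resp.raise_for_status()
print("Created cluster:", resp.json()["cluster_id"])
```

The value of driving this through Terraform, as the post describes, is wrapping calls like this in declarative, version-controlled resources so the same environment can be recreated or torn down on demand.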

Tangible Impacts of AI on the Business

Posted in Announcements, Company Blog, Customers

With 2019 in full swing, the excitement for data- and AI-driven innovation continues. Over the past few years, we’ve seen leading innovators – like Riot Games, Regeneron and Shell – become early adopters of the latest machine learning and AI technologies, building and deploying AI applications into production. But with great promise come great challenges, as […]

The Importance Of Creating A Single View Of The Customer With Data

Posted in Analytics, Big Data, customer data, Customers, data marketing, Data Storage, IT, it data, Marketing, SmartData Collective Exclusive

To thrive in the current economic environment, businesses need to provide exceptional customer service, which means rapidly understanding and reacting to customer shopping behaviors. To interpret and react to those behaviors properly, businesses need a complete, single view of their customers. What does that mean? A Single […]

Apparate: Managing Libraries in Databricks with CI/CD

Posted in Apache Spark, apparate, CI/CD, continuous delivery, continuous integration, Continuous Processing, Customers, Education, Partners, Product

This is a guest blog from Hanna Torrence, Data Scientist at ShopRunner. As leveraging data becomes a more vital component of organizations’ tech stacks, it becomes increasingly important for data teams to adopt software engineering best practices. The Databricks platform provides excellent tools for exploratory Apache Spark workflows in notebooks as well as […]
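
The post introduces apparate, ShopRunner’s open-source tool for shipping library builds to Databricks from a CI/CD pipeline. As a hedged illustration of what such a pipeline step does under the hood (not apparate’s actual code), the sketch below uploads a built wheel to DBFS and attaches it to a cluster via the Databricks REST API; the host, paths, and cluster ID are placeholders.

```python
# Illustrative only: roughly what a CI step that ships a library to Databricks does.
# This is not apparate's implementation; host, paths, and cluster ID are placeholders.
import base64
import os

import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

def upload_and_attach(local_path: str, dbfs_path: str, cluster_id: str) -> None:
    # 1. Push the built artifact to DBFS (the single-shot put endpoint is meant
    #    for small files; larger artifacts need the streaming DBFS calls).
    with open(local_path, "rb") as f:
        contents = base64.b64encode(f.read()).decode("utf-8")
    requests.post(
        f"{HOST}/api/2.0/dbfs/put",
        headers=HEADERS,
        json={"path": dbfs_path, "contents": contents, "overwrite": True},
    ).raise_for_status()

    # 2. Ask the target cluster to install the uploaded wheel as a library.
    requests.post(
        f"{HOST}/api/2.0/libraries/install",
        headers=HEADERS,
        json={"cluster_id": cluster_id, "libraries": [{"whl": f"dbfs:{dbfs_path}"}]},
    ).raise_for_status()

# Example CI invocation (all arguments are hypothetical):
# upload_and_attach("dist/mylib-1.2.3-py3-none-any.whl",
#                   "/libraries/mylib-1.2.3-py3-none-any.whl",
#                   "0123-456789-abc123")
```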

Open Sourcing Databricks Integration Tools at Edmunds

Posted in Apache Spark, Company Blog, Customers, Engineering Blog, Platform

This is a guest post from Shaun Elliott, Data Engineering Tech Lead, and Sam Shuster, Staff Engineer, at Edmunds. What is Databricks, and how is it useful for Edmunds? Databricks is a cloud-based, fully managed big data and analytics processing platform that leverages Apache Spark™ and the JVM. The big selling point of the Databricks Unified […]

Introducing Flint: A time-series library for Apache Spark

Posted in Apache Spark, Company Blog, Customers, Education, Engineering Blog, Flint, python, Spark SQL, Time Series

This is a joint guest community blog by Li Jin at Two Sigma and Kevin Rasmussen at Databricks; they share how to use Flint with Apache Spark. The volume of data that data scientists face these days increases relentlessly, and we now find that a traditional, single-machine solution is no longer adequate to the demands […]
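
As a taste of the workflow the post describes, here is a minimal sketch using the ts-flint Python package’s FlintContext to wrap a Spark DataFrame as a time-series DataFrame and run a summarizer. The data and column names are invented, it assumes a Databricks notebook where `spark` and `sqlContext` are predefined, and exact signatures should be checked against the Flint documentation.

```python
# Minimal sketch of the ts-flint usage pattern described in the post; data and
# column names are invented, and exact signatures should be checked against the
# Flint documentation. Assumes a Databricks notebook where `spark` and
# `sqlContext` are predefined.
from ts.flint import FlintContext, summarizers

flint_context = FlintContext(sqlContext)

# An ordinary Spark DataFrame with a timestamp column named 'time'.
prices = spark.createDataFrame(
    [("2018-01-01", 101.0), ("2018-01-02", 102.5), ("2018-01-03", 99.8)],
    ["time", "price"],
).selectExpr("cast(time as timestamp) as time", "price")

# Wrap it as a Flint time-series DataFrame and run a time-series aware summarizer.
ts_prices = flint_context.read.dataframe(prices)
ts_prices.summarize(summarizers.mean("price")).show()
```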

Announcing Databricks Runtime 4.2!

Posted in Announcements, Apache Spark, Company Blog, Customers, Databricks, Delta, Engineering Blog, Platform, Product, Runtime, Streaming

We’re excited to announce Databricks Runtime 4.2, powered by Apache Spark™. Version 4.2 includes updated Spark internals, new features, and major performance upgrades to Databricks Delta, as well as general quality improvements to the platform. We are moving quickly toward the Databricks Delta general availability (GA) release, and we recommend you upgrade to Databricks Runtime […]
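
For readers new to Databricks Delta, here is a minimal PySpark sketch of the write/read pattern on a Databricks Runtime 4.x cluster; the table path is a placeholder and `spark` is the notebook’s predefined SparkSession.

```python
# Quick sketch of the Databricks Delta write/read pattern on a Databricks
# Runtime 4.x cluster. The table path is a placeholder, and `spark` is the
# notebook's predefined SparkSession.
events = spark.range(0, 1000).withColumnRenamed("id", "event_id")

# Write the DataFrame out as a Delta table (transactional storage layer).
events.write.format("delta").mode("overwrite").save("/delta/events")  # placeholder path

# Read it back; batch and streaming reads both point at the same path.
print(spark.read.format("delta").load("/delta/events").count())
```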