Data Engineering Light: Expanding Choice for Production Workloads

Posted in Announcements, Company Blog, Data Engineering, Data Engineering Light, Unified Analytics Platform

Adding a Lower Tier, Lower Price Option: Since its inception, a guiding principle for Databricks has been the unification of data science and engineering, helping to bring together silos and raise the level of collaboration and productivity among analytics professionals. And our customers have been reaping the benefits of this to solve some of […]

New Databricks Delta Features Simplify Data Pipelines

Posted in Announcements, Company Blog, Data Engineering, Data Lakes, Databricks Delta, Product, Time Travel, Unified Analytics Engine

Continued Innovation and Expanded Availability for the Next-gen Unified Analytics Engine: Databricks Delta, the next-generation unified analytics engine built on top of Apache Spark and aimed at helping data engineers build robust production data pipelines at scale, is continuing to make strides. Already a powerful approach to building data pipelines, new capabilities and performance […]

Analyze Games from European Soccer Leagues with Apache Spark and Databricks

Posted in ad hoc analysis, Apache Spark, Data Engineering, Data Visualization, Education, Engineering Blog, ETL, Machine Learning, Platform, Product, Unified Analytics Platform

Try this notebook series in Databricks. Introduction: The global sports market is huge, comprising players, teams, leagues, fan clubs, sponsors, etc., all of which interact in myriad ways, generating an enormous amount of data. Some of that data is used internally to help make better decisions, and there are a number of […]

Introducing Databricks Optimized Auto-Scaling

Posted in Announcements, Auto-scaling, autoscaling, Company Blog, Data Engineering, Databricks, Engineering Blog, Product

Databricks is thrilled to announce our new optimized auto-scaling feature. The new Apache Spark™-aware resource manager leverages Spark shuffle and executor statistics to resize a cluster intelligently, improving resource utilization. When we tested long-running big data workloads, we observed cloud cost savings of up to 30%. What’s the problem with current state-of-the-art auto-scaling approaches? Today, […]