Announcing Databricks Runtime 5.4 – The Databricks Blog

Posted in Announcements, Company Blog, Databricks Connect, Library Utilities, Product, Runtime, Runtime 5.4

Databricks is pleased to announce the release of Databricks Runtime 5.4. This release includes Apache Spark 2.4.3 along with several important improvements and bug fixes. We recommend all users upgrade to take advantage of this new runtime release. This blog post gives a brief overview of some of the new high-value features that […]

Simplifying Streaming Stock Analysis using Delta Lake and Apache Spark: On-Demand Webinar and FAQ Now Available!

Posted in ACID Transactions, Apache Spark, Company Blog, Delta Lake, Education, Engineering Blog, Financial Services, Product, Streaming, Structured Streaming, Time Travel, Unified Batch and Streaming Sync

On June 13th, we hosted a live webinar — Simplifying Streaming Stock Analysis using Delta Lake and Apache Spark — with Junta Nakai, Industry Leader – Financial Services at Databricks, John O’Dwyer, Solution Architect at Databricks, and Denny Lee, Technical Product Marketing Manager at Databricks. This is the first webinar in a series of financial […]

Databricks Connect: Bringing the capabilities of hosted Apache Spark™ to applications and microservices

Posted in Announcements, CoLab, Company Blog, Connect, Databricks Connect, Eclipse, Intellij, jupyter, Platform, Product, PyCharm, RStudio, Zeppelin

In this blog post we introduce Databricks Connect, a new library that allows you to leverage native Apache Spark APIs from any notebook, IDE, or custom application. Over the last several years, many custom application connectors have been written for Apache Spark. This includes tools like spark-submit, REST job servers, notebook gateways, and so […]

Announcing the MLflow 1.0 Release

Posted in Announcements, Company Blog, Data Science, Ecosystem, Engineering Blog, Lifecycle, Machine Learning, MLflow, Model Management, Product

MLflow is an open source platform to help manage the complete machine learning lifecycle. With MLflow, data scientists can track and share experiments locally (on a laptop) or remotely (in the cloud), package and share models across frameworks, and deploy models virtually anywhere. Today we are excited to announce the release of MLflow 1.0. Since […]

Enhanced Hyperparameter Tuning and Optimized AWS Storage with Databricks Runtime 5.4 ML

Posted in Announcements, AutoML, Company Blog, Data Science, Databricks Runtime 5.4 ML, Deep Learning, Ecosystem, Engineering Blog, Hyperopt, Hyperparameter Tuning, Machine Learning, MLflow, MLlib, Platform, Product

We are excited to announce the release of Databricks Runtime 5.4 ML (Azure | AWS). This release includes two Public Preview features to improve data science productivity, optimized storage in AWS for developing distributed applications, and a number of Python library upgrades. To get started, you simply select the Databricks Runtime 5.4 ML from the […]

Introducing Databricks Runtime 5.4 with Conda (Beta)

Posted in Announcements, Company Blog, Data Science, Databricks Runtime, Deep Learning, Ecosystem, Engineering Blog, Machine Learning, Product

We are excited to introduce a new runtime: Databricks Runtime 5.4 with Conda (Beta). This runtime uses Conda to manage Python libraries and environments. Many of our Python users prefer to manage their Python environments and libraries with Conda, which is quickly emerging as a standard. Conda takes a holistic approach to package management by […]

Spark + AI Summit 2019 Product Announcements and Recap. Watch the keynote recordings today!

Posted in Announcements, Apache Spark, Company Blog, Delta Lake, Events, Koalas, MLflow, Product, Spark + AI Summit

Spark + AI Summit 2019, the world’s largest data and machine learning conference for the Apache Spark™ Community, brought nearly 5000 data scientists, engineers, and business leaders to San Francisco’s Moscone Center to find out what’s coming next. Watch the keynote recordings today and learn more about the latest product announcements for Apache Spark, MLflow, […]

Announcing General Availability of Managed MLflow on Databricks

Posted in Announcements, Company Blog, Ecosystem, Engineering Blog, Machine Learning, Managed MLflow, MLflow, Platform, Product

MLflow is an open source platform to help manage the complete machine learning lifecycle. With MLflow, data scientists can track and share experiments locally or in the cloud, package and share models across frameworks, and deploy models virtually anywhere. Today at the Spark + AI Summit, we announced the General […]

A Guide to MLflow Talks at Spark + AI Summit 2019

Posted in Company Blog, Events, Machine Learning, MLflow, Open Source, Product, Spark + AI Summit

In less than a year, MLflow has reached almost 500K monthly downloads, and gathered over 80 code contributors and 40 contributing organizations, confirming the need for an open source approach to help standardize the machine learning lifecycle across tools, teams, and processes. We are thrilled to host some of our key contributors and customers next […]

Managing the Complete Machine Learning Lifecycle: On-Demand Webinar now available!

Posted in Company Blog, Data Science, Ecosystem, Education, Machine Learning, Managed MLflow, MLflow, Model Management, Open Source, Product, Webinar

On March 7th, our team hosted a live webinar, Managing the Complete Machine Learning Lifecycle, with Andy Konwinski, Co-Founder and VP of Product at Databricks. In this webinar, we walked through how MLflow, an open source framework for the complete machine learning lifecycle, helps address challenges around experiment tracking, reproducible projects, and model deployment. Specifically, […]