MLflow v0.8.1 Features Faster Experiment UI and Enhanced Python Model

Posted in Apache Spark, Data Science, Ecosystem, Engineering Blog, Machine Learning, Machine Learning Life Cycle, MLflow, Model Management, Platform, Spark UDF

MLflow v0.8.1 was released this week. It introduces several UI enhancements, including faster load times for thousands of runs and improved responsiveness when navigating runs with many metrics and parameters. Additionally, it expands support for evaluating Python models as Apache Spark UDFs and automatically captures model dependencies as Conda environments. […]

Introducing Databricks Runtime 5.0 for Machine Learning

Posted in Announcements, Company Blog, Databricks Runtime 5.0 ML, Deep Learning, Ecosystem, Engineering Blog, Machine Learning, Platform

Six months ago we introduced the Databricks Runtime for Machine Learning with the goal of making machine learning performant and easy on the Databricks Unified Analytics Platform. The Databricks Runtime for ML comes pre-packaged with many ML frameworks and enables distributed training and inference. Today we are excited to release the second iteration including Conda […]

Applying your Convolutional Neural Network: On-Demand Webinar and FAQ Now Available!

Posted in Deep Learning, Ecosystem, Engineering Blog, Keras, Machine Learning, Neural Networks, Platform, TensorFlow

On October 25th, we hosted a live webinar, Applying your Convolutional Neural Network, with Denny Lee, Technical Product Marketing Manager at Databricks. This is the third webinar of a free deep learning fundamentals series from Databricks. In this webinar, we dived deeper into Convolutional Neural Networks (CNNs), a particular type of neural […]

Open Sourcing Databricks Integration Tools at Edmunds

Posted in Apache Spark, Company Blog, Customers, Engineering Blog, Platform

This is a guest post from Shaun Elliott, Data Engineering Tech Lead, and Sam Shuster, Staff Engineer at Edmunds. What is Databricks and How is it Useful for Edmunds? Databricks is a cloud-based, fully managed, big data and analytics processing platform that leverages Apache Spark™ and the JVM. The big selling point of the Databricks Unified […]

Introducing Apache Spark 2.4

Posted in Apache Spark, Apache Spark 2.4, Databricks Runtime 5.0, Ecosystem, Engineering Blog, Machine Learning, Pandas UDF, Platform, SparkSQL, Streaming, Structured Streaming, Unified Analytics Platform

We are excited to announce the availability of Apache Spark 2.4 on Databricks as part of the Databricks Runtime 5.0. We want to thank the Apache Spark community for all their valuable contributions to the Spark 2.4 release. Continuing with the objectives to make Spark faster, easier, and smarter, Spark 2.4 extends its scheduler to […]

Democratizing Cloud Infrastructure with Terraform and Jenkins

Posted in Ecosystem, Engineering Blog, Infrastructure, Monitoring, Platform, Provisioning, Unified Analytics Platform

This blog post is part of our series of internal engineering blogs on the Databricks platform, infrastructure management, integration, tooling, monitoring, and provisioning. This summer at Databricks I designed and implemented a service for coordinating and deploying cloud provider infrastructure resources that significantly improved the velocity of operations on our self-managed cloud platform. The service […]

Training your Neural Network: On-Demand Webinar and FAQ Now Available!

Posted in Deep Learning, Ecosystem, Engineering Blog, Keras, Machine Learning, Neural Networks, Platform, TensorFlow

On October 9th, we hosted a live webinar, Training your Neural Network, on Data Science Central with Denny Lee, Technical Product Marketing Manager at Databricks. This is the second webinar of a free deep learning fundamentals series from Databricks. In this webinar, we covered the principles for training your neural network including […]

Introduction to Neural Networks: On-Demand Webinar and FAQ Now Available!

Posted in Deep Learning, Ecosystem, Keras, Machine Learning, Neural Networks, Platform, TensorFlow

On September 27th, we hosted a live webinar, Introduction to Neural Networks, with Denny Lee, Technical Product Marketing Manager at Databricks. This is the first webinar of a free deep learning fundamentals series from Databricks. In this webinar, we covered the fundamentals of deep learning to better understand what gives neural networks […]

How to Use MLflow To Reproduce Results and Retrain Saved Keras ML Models

Posted in Apache Spark, Engineering Blog, Keras, Machine Learning, MLflow, Model Management, Platform, TensorFlow, Unified Analytics Platform

In part 2 of our series of MLflow blogs, we demonstrated how to use MLflow to track experiment results for a Keras network model using binary classification. We classified reviews from an IMDB dataset as positive or negative, and we created one baseline model and two experiments. For each model, we tracked its respective training […]

New Features in MLflow v0.6.0

Posted in Data Science, Engineering Blog, Machine Learning, MLflow, Model Management, Platform, Spark ML

Today, we’re excited to announce MLflow v0.6.0, released early in the week with new features. It is now available on PyPI and Maven, and the docs are updated. You can install the latest release with pip install mlflow as described in the MLflow quickstart guide. MLflow v0.6.0 introduces a number of major features: A Java client API, available […]