Applying your Convolutional Neural Network: On-Demand Webinar and FAQ Now Available!

Posted in Deep Learning, Ecosystem, Engineering Blog, Keras, Machine Learning, Neural Networks, Platform, TensorFlow

Try this notebook in Databricks. On October 25th, we hosted a live webinar, "Applying your Convolutional Neural Network," with Denny Lee, Technical Product Marketing Manager at Databricks. This is the third webinar of a free deep learning fundamentals series from Databricks. In this webinar, we dived deeper into Convolutional Neural Networks (CNNs), a particular type of neural […]

Introducing Apache Spark 2.4

Posted in Apache Spark, Apache Spark 2.4, Databricks Runtime 5.0, Ecosystem, Engineering Blog, Machine Learning, Pandas UDF, Platform, SparkSQL, Streaming, Structured Streaming, Unified Analytics Platform

We are excited to announce the availability of Apache Spark 2.4 on Databricks as part of the Databricks Runtime 5.0. We want to thank the Apache Spark community for all their valuable contributions to the Spark 2.4 release. Continuing with the objectives to make Spark faster, easier, and smarter, Spark 2.4 extends its scheduler to […]

Democratizing Cloud Infrastructure with Terraform and Jenkins

Posted in Ecosystem, Engineering Blog, Infrastructure, Monitoring, Platform, Provisioning, Unified Analytics Platform

This blog post is part of our series of internal engineering blogs on the Databricks platform, infrastructure management, integration, tooling, monitoring, and provisioning. This summer at Databricks, I designed and implemented a service for coordinating and deploying cloud provider infrastructure resources that significantly improved the velocity of operations on our self-managed cloud platform. The service […]

Training your Neural Network: On-Demand Webinar and FAQ Now Available!

Posted in Deep Learning, Ecosystem, Engineering Blog, Keras, Machine Learning, Neural Networks, Platform, TensorFlow

Try this notebook in Databricks. On October 9th, we hosted a live webinar, "Training your Neural Network," on Data Science Central with Denny Lee, Technical Product Marketing Manager at Databricks. This is the second webinar of a free deep learning fundamentals series from Databricks. In this webinar, we covered the principles for training your neural network including […]

MLflow v0.7.0 Features New R API by RStudio

Posted in Announcements, Apache Spark, Company Blog, Deep Learning, Ecosystem, Education, Engineering Blog, GPyOpt, Hyperopt, Java, Keras, Machine Learning, MLflow, multistep workflow, Partners, python, R, RStudio

Today, we’re excited to announce MLflow v0.7.0, released with new features, including a new MLflow R client API contributed by RStudio. A testament to MLflow’s design goal of an open platform with adoption in the community, RStudio’s contribution extends the MLflow platform to a larger R community of data scientists who use RStudio and R […]

Introduction to Neural Networks: On-Demand Webinar and FAQ Now Available!

Posted in Deep Learning, Ecosystem, Keras, Machine Learning, Neural Networks, Platform, TensorFlow

Try this notebook in Databricks. On September 27th, we hosted a live webinar, "Introduction to Neural Networks," with Denny Lee, Technical Product Marketing Manager at Databricks. This is the first webinar of a free deep learning fundamentals series from Databricks. In this webinar, we covered the fundamentals of deep learning to better understand what gives neural networks […]

What’s New for Apache Spark on Kubernetes in the Upcoming Apache Spark 2.4 Release

Posted in Apache Spark, Ecosystem, Engineering Blog, Kubernetes

This is a community blog from Yinan Li, a software engineer at Google, working in the Kubernetes Engine team. He is part of the group of companies that have contributed to Kubernetes support in the upcoming Apache Spark 2.4. Since the Kubernetes cluster scheduler backend was initially introduced in Apache Spark 2.3, the community has […]

MLflow On-Demand Webinar and FAQ Now Available!

Posted in Data Science, Deep Learning, Ecosystem, Engineering Blog, Machine Learning, MLflow, Model Management, Platform, Product, Unified Analytics Platform

On August 30th, our team hosted a live webinar—Introducing MLflow: Infrastructure for a complete Machine Learning lifecycle—with Matei Zaharia, Co-Founder and Chief Technologist at Databricks. In this webinar, we walked you through MLflow, a new open source project from Databricks that aims to design an open ML platform where organizations can use any ML library […]

Building a Real-Time Attribution Pipeline with Databricks Delta

Posted in Adhoc Analysis, Advertising Analytics, Apache Spark, bi, Company Blog, Databricks Delta, Ecosystem, Education, Engineering Blog, Kinesis, Machine Learning, Platform, Product, Spark Streaming, Streaming, Structured Streaming, Tableau

Try this notebook in Databricks. In digital advertising, one of the most important things to be able to deliver to clients is information about how their advertising spend drove results. The more quickly we can provide this, the better. To tie conversions or engagements to the impressions served in an advertising campaign, companies must perform […]

Loan Risk Analysis with XGBoost and Databricks Runtime for Machine Learning

Posted in Apache Spark, Company Blog, data pipeline, Data Visualization, Ecosystem, Education, Engineering Blog, financial, Machine Learning, MLlib, Platform, Product, XGBoost

Try this notebook series in Databricks. For companies that make money off of interest on loans held by their customers, it’s always about increasing the bottom line. Being able to assess the risk of loan applications can save a lender the cost of holding too many risky assets. It is the data scientist’s job to […]