Featured MLflow Talks at Spark + AI Summit 2019 Europe

Posted in Announcements, Company Blog, Machine Learning, MLflow, SparkAISummit

We are thrilled to see how well MLflow has been welcomed by the community since we launched it last summer. Now with over 800K monthly downloads, 130 code contributors, and dozens of contributing organizations including RStudio and Microsoft, it is one of the fastest-growing open source projects in the field of machine learning, confirming […]

Apache Spark Tutorials at 2019 Spark + AI Summit

Posted in Announcements, Apache Spark, Company Blog, Delta Lake, Events, MLflow, Spark SQL, Structured Streaming

You might have heard the famous saying, “Software is eating the world.” But if software is eating the world, you may ask, where does software come from? Naturally, developers! Some software developers argue that “developers are eating the world.” A research report by Stripe indicates that “developers have the ability to raise global […]

2019 Spark + AI Summit Europe Keynote Agenda

Posted in Announcements, Artificial Intelligence, Company Blog, Events, Keynotes, Spark + AI Summit, Spark Summit, Summit, Thought Leadership

Spark + AI Summit is the premier global event for the data and machine learning community to discuss the latest advances in open-source technologies such as Apache Spark™, Delta Lake, MLflow, Koalas and TensorFlow as well as best practices for deploying AI in the real world. In addition to over 100 exciting breakout sessions, this […]

AutoML on Databricks: Augmenting Data Science from Data Prep to Operationalization

Posted in Announcements, AutoML, Company Blog, Data Science, Data Science and Machine Learning, Databricks Labs, Engineering Blog, Hyperopt, Hyperparameter Tuning, Machine Learning, MLflow, Model Search, Product

Thousands of data science jobs are going unfilled today as global demand for the talent greatly outstrips supply. Every day, businesses pay the price of the data scientist shortage in missed opportunities and slow innovation. For organizations to realize the full potential of machine learning, data teams have to build hundreds of predictive models a […]

New cost savings option for Azure Databricks with DBU pre-purchase

Posted in Announcements, Company Blog

The rapid adoption of Azure Databricks through our strategic partnership with Microsoft has been remarkable, and it’s proven to be a compelling service for our customers’ big data, analytics and machine learning initiatives. To further help our customers save costs and improve budgeting for Azure Databricks, we are pleased to share a new pricing option […]

Announcing Databricks Runtime 5.5 and Runtime 5.5 for Machine Learning

Posted in Announcements, Apache Spark, Machine Learning

Databricks is pleased to announce the release of Databricks Runtime 5.5. This release includes Apache Spark 2.4.3 along with several important improvements and bug fixes as noted in the latest release notes [Azure|AWS]. We recommend all users upgrade to take advantage of this new runtime release. This blog post gives a brief overview of some […]

Getting Data Ready for Data Science: On-Demand Webinar and Q&A Now Available

Posted in Announcements, Company Blog

On June 25th, our team hosted a live webinar, “Getting Data Ready for Data Science,” with Prakash Chockalingam, Product Manager at Databricks. Successful data science relies on solid data engineering to furnish reliable data. Data lakes are a key element of modern data architectures. Although data lakes afford significant flexibility, they also face […]

Scaling Genomic Workflows with Spark SQL BGEN and VCF Readers

Posted in Announcements, Apache Spark, BGEN, Ecosystem, Engineering Blog, Genomics, HLS, Spark SQL, VCF

In the past decade, the amount of available genomic data has exploded as the price of genome sequencing has dropped. Researchers are now able to scan for associations between genetic variation and diseases across cohorts of hundreds of thousands of individuals from projects such as the UK Biobank. These analyses will lead to a deeper […]

Announcing Databricks Runtime 5.4

Posted in Announcements, Company Blog, Databricks Connect, Library Utilities, Product, Runtime, Runtime 5.4

Databricks is pleased to announce the release of Databricks Runtime 5.4. This release includes Apache Spark 2.4.3 along with several important improvements and bug fixes. We recommend all users upgrade to take advantage of this new runtime release. This blog post gives a brief overview of some of the new high-value features that […]

Databricks Connect: Bringing the capabilities of hosted Apache Spark™ to applications and microservices

Posted in Announcements, CoLab, Company Blog, Connect, Databricks Connect, Eclipse, Intellij, jupyter, Platform, Product, PyCharm, RStudio, Zeppelin

In this blog post we introduce Databricks Connect, a new library that allows you to leverage native Apache Spark APIs from any notebook, IDE, or custom application. Overview: Over the last several years, many custom application connectors have been written for Apache Spark. This includes tools like spark-submit, REST job servers, notebook gateways, and so […]