A Guide to Training Sessions at Spark + AI Summit, Europe

Posted in Apache Spark, Company Blog, Data and ML Industry Use Case, Data and ML Research, Data Engineering, Data Science, Data Science and Machine Learning, data-science, Delta Lake, Education, Events, Keras, MLflow, Productionizing Machine Learning, PyTorch, Spark + AI Summit, Spark SQL, Structured Streaming, TensorFlow, training

Education and the pursuit of knowledge are lifelong journeys that are never complete: there is always something new to learn, a new professional certification to add to your credentials, a knowledge gap to fill. Training at Spark + AI Summit, Europe is not only about becoming an Apache Spark expert. Nor is it only about being […]

Working Abroad with Databricks – My Amsterdam Move

Posted in Company Blog, Culture, dataengineers, datascientists, jobsabroad, technologyjobs, workingabroaddatabricks

While we are proud of our Berkeley roots, Databricks now calls many cities around the world home. In addition to offices in London, Singapore, New York and our headquarters in San Francisco, we have one of our major engineering hubs in the fast-growing European Development Center, Amsterdam. Databricks offers the exciting opportunity to relocate […]

Featured MLflow Talks at Spark + AI Summit 2019 Europe

Posted in Announcements, Company Blog, Machine Learning, MLflow, SparkAISummit

We are thrilled to see how well MLflow has been welcomed by the community since we launched it last summer. With over 800K monthly downloads, 130 code contributors and dozens of contributing organizations, including RStudio and Microsoft, it is one of the fastest-growing open source projects in the field of Machine Learning, confirming […]

Diving Into Delta Lake: Schema Enforcement & Evolution

Posted in Apache Spark, Company Blog, Data Engineering, Delta Lake, Developer, Ecosystem, Education, Engineering Blog, Schema Enforcement, Schema Evolution

Try this notebook series in Databricks. Think back to when you were in high school – so fresh and full of ideas. Since then, undoubtedly, both the world and the way you see things have changed in many ways, as you have gained new experiences. Data, like our experiences, is always evolving and accumulating. To […]

Engineering Population-Scale Genome-Wide Association Studies with Apache Spark, Delta Lake, and MLflow

Posted in AI, Apache Spark, Company Blog, Customers, Data and ML Industry Use Case, Data Engineering, Data Science and Machine Learning, Delta Lake, Education, Engineering Blog, genome sequencing, GWAS, Managed MLflow, MLflow

Try this notebook series in Databricks. The advent of genome-wide association studies (GWAS) in the late 2000s enabled scientists to begin to understand the causes of complex diseases such as diabetes and Crohn’s disease at their most fundamental level. However, academic bioinformatics tools to perform GWAS have not kept pace with the growth of genomic […]

Productionizing Machine Learning: From Deployment to Drift Detection

Posted in AI, Company Blog, Data and ML Industry Use Case, Data Science and Machine Learning, Machine Learning, Machine Learning Life Cycle, MLflow, Model Drift, Product, Tutorials

Try this notebook to reproduce the steps outlined below and watch our on-demand webinar to learn more. In much of the literature and in many blogs, a machine learning workflow starts with data prep and ends with deploying a model to production. But in reality, that’s just the beginning of the lifecycle of a machine learning model. As they say, […]

A Guide to AI, Data Science, Machine Learning and Deep Learning Talks at Spark + AI Summit Europe 2019

Posted in AI, Artificial Intelligence, Company Blog, Data and ML Industry Use Case, Data Science, Data Science and Machine Learning, Deep Learning, Events, Machine Learning, MLflow, Spark + AI Summit Europe

The Spark + AI Summit Europe is just around the corner, and it’s a great opportunity for data scientists and Machine Learning (ML) practitioners to get up to speed on the latest tools and innovations in the field! Below is a selection of talks on ML best practices for productionizing ML at scale, real-life use […]

Monitoring patient medical device data with ML + Delta Lake, Keras, and MLflow

Posted in Company Blog, Data and ML Industry Use Case

On August 20th, our team hosted a live webinar—Automated Monitoring of Medical Device Data with Data Science—with Frank Austin Nothaft, PhD, Technical Director of Healthcare and Life Sciences, and Michael Ortega, Senior Industry and Solutions Marketing Manager. By applying machine learning to medical device data, healthcare organizations can automate patient monitoring, reduce repair costs with […]

Using AutoML Toolkit to Automate Loan Default Predictions

Posted in AI, AutoML, Company Blog, Data Science and Machine Learning, Developer, Education, Engineering Blog, Machine Learning, MLflow, XGBoost

Download the following notebooks and try the AutoML Toolkit today: Evaluating Risk for Loan Approvals using XGBoost (0.90) | Using AutoML Toolkit to Simplify Loan Risk Analysis XGBoost Model Optimization. In a previous blog and notebook, Loan Risk Analysis with XGBoost, we explored the different stages of how to build a Machine Learning model to improve […]

Apache Spark Tutorials at 2019 Spark + AI Summit

Posted in Announcements, Apache Spark, Company Blog, Delta Lake, Events, MLflow, Spark SQL, Structured Streaming

You might have heard the famous saying, “Software is eating the world.” But if software is eating the world, you may ask, where does software come from? Naturally, developers! Some software developers argue that “developers are eating the world.” A research report by Stripe indicates that “developers have the ability to raise global […]