5 ways data science can help you work smarter, not harder

Posted in data, Data Science, sentiment analysis, trends

Today, the internet bleeds into almost every facet of everyday life — it empowers our productivity, enhances our entertainment, and enables our communication. As it does these things, of course, it generates a vast quantity of data: rich, complex, wide-reaching data on everything from the money we spend to the websites we visit. Data science […]

Hyperparameter Tuning with MLflow, Apache Spark MLlib and Hyperopt

Posted in Apache Spark, AutoML, Data Science, Databricks Runtime 5.4 ML, Deep Learning, Ecosystem, Engineering Blog, Hyperopt, Hyperparameter Tuning, Machine Learning, MLflow, MLlib

Hyperparameter tuning is a common technique for optimizing machine learning models by adjusting hyperparameters, the configurations that are not learned during model training. Tuning these configurations can dramatically improve model performance. However, hyperparameter tuning can be computationally expensive, slow, and unintuitive even for experts. Databricks Runtime 5.4 and 5.4 ML (Azure | AWS) introduce new […]

Announcing the MLflow 1.0 Release

Posted in Announcements, Company Blog, Data Science, Ecosystem, Engineering Blog, Lifecycle, Machine Learning, MLflow, Model Management, Product

MLflow is an open source platform to help manage the complete machine learning lifecycle. With MLflow, data scientists can track and share experiments locally (on a laptop) or remotely (in the cloud), package and share models across frameworks, and deploy models virtually anywhere. Today we are excited to announce the release of MLflow 1.0. Since […]

Enhanced Hyperparameter Tuning and Optimized AWS Storage with Databricks Runtime 5.4 ML

Posted in Announcements, AutoML, Company Blog, Data Science, Databricks Runtime 5.4 ML, Deep Learning, Ecosystem, Engineering Blog, Hyperopt, Hyperparameter Tuning, Machine Learning, MLflow, MLlib, Platform, Product

We are excited to announce the release of Databricks Runtime 5.4 ML (Azure | AWS). This release includes two Public Preview features to improve data science productivity, optimized storage in AWS for developing distributed applications, and a number of Python library upgrades. To get started, you simply select the Databricks Runtime 5.4 ML from the […]

Introducing Databricks Runtime 5.4 with Conda (Beta)

Posted in Announcements, Company Blog, Data Science, Databricks Runtime, Deep Learning, Ecosystem, Engineering Blog, Machine Learning, Product

We are excited to introduce a new runtime: Databricks Runtime 5.4 with Conda (Beta). This runtime uses Conda to manage Python libraries and environments. Many of our Python users prefer to manage their Python environments and libraries with Conda, which is quickly emerging as a standard. Conda takes a holistic approach to package management by […]

6 Data And Analytics Trends To Prepare For In 2020

Posted in #GDPR, Analytics, analytics trends, Big Data, Business Intelligence, Cloud Computing, data analytics trends, Data Science, data trends, Machine Learning, machine learning skills, Predictive Analytics, SmartData Collective Exclusive

We’re well past the point of realization that big data and advanced analytics solutions are valuable — just about everyone knows this by now. In fact, there’s no escaping the increasing reliance on such technologies. Big data alone has become a modern staple of nearly every industry from retail to manufacturing, and for good reason. […]

Koalas: Easy Transition from pandas to Apache Spark

Posted in Announcements, Apache Spark, Company Blog, Data Science, Ecosystem, Education, Engineering Blog, Machine Learning, Open Source, Pandas, python

Today at Spark + AI Summit, we announced Koalas, a new open source project that augments PySpark’s DataFrame API to make it compatible with pandas. Python data science has exploded over the past few years and pandas has emerged as the lynchpin of the ecosystem. When data scientists get their hands on a data set, […]

Experts Reveal Data Science Behind Five Popular Android Apps

Posted in android, android apps, app creation, app design, apps, Big Data, Data Science, phone apps, SmartData Collective Exclusive

Big data is playing a massive role in the formation of new technologies. New developments in data science have contributed to the release of a number of popular Android apps on the market. To the average Android user, big data is an invisible factor. However, it is the foundation of almost every app on their […]

Managing the Complete Machine Learning Lifecycle: On-Demand Webinar now available!

Posted in Company Blog, Data Science, Ecosystem, Education, Machine Learning, Managed MLflow, MLflow, Model Management, Open Source, Product, Webinar

On March 7th, our team hosted a live webinar, Managing the Complete Machine Learning Lifecycle, with Andy Konwinski, Co-Founder and VP of Product at Databricks. In this webinar, we walked you through how MLflow, an open source framework for the complete machine learning lifecycle, helps solve challenges around experiment tracking, reproducible projects, and model deployment. Specifically, […]