What’s New in 1.0
and Beyond
Matei Zaharia, Clemens Mewald, Richard Zang
Outline
MLflow Intro (Matei Zaharia)
Overview of Components (Clemens Mewald)
MLflow 1.0 & Roadmap (Clemens Mewald)
Demo (Richard Zang)
ML Lifecycle Challenges
[Diagram: the ML pipeline and the concerns that surround it]
Pipeline: Raw Data → ETL → Featurize → Train → Score/Serve (Batch + Realtime) → Monitor (Alert, Debug)
Feedback loops: Retrain, Update Features, Production Logs
Surrounding concerns: Delta; Tuning; Model Mgmt; Deploy; AutoML, Hyper-p. search; Experiment Tracking; Remote Cloud Execution; Project Mgmt (scale teams); Model Exchange; Data Drift; Model Drift; Orchestration (Airflow); A/B Testing; CI/CD/Jenkins push to prod; Feature Repository; Lifecycle mgmt.
Zoo of Ecosystem Frameworks
Collaboration, Scale, Governance
An open source platform for the machine learning lifecycle
What is MLflow?
An open source, extensible framework to manage the complete ML Lifecycle.
MLflow Community Growth
600k 100+ 40
Comparison: Apache Spark took 3 years to get to 100 contributors,
and has 1.2M downloads/month on PyPI
Community Growth in Context
● Time till 100 contributors: MLflow = 1 year, Spark = 3 years
● 600,000 monthly downloads on PyPI
Package         Downloads Last Month
mlflow               600,000
h2o                   45,000
sagemaker             50,000
pyspark              729,000
scikit-learn       7,743,000
Some Users & Contributors
Supported Integrations: June ’18
Supported Integrations: June ’19
What Does the 1.0 Release Mean?
API stability of the original components
• Safe to build apps and integrations around them long term
Time to start adding some new features!
MLflow Components
Tracking: Record and query experiments: code, data, config, results
Projects: Packaging format for reproducible runs on any platform
Models: General model format that supports diverse deployment tools
mlflow.org github.com/mlflow twitter.com/MLflow databricks.com/mlflow
MLflow Tracking
Notebooks, Local Apps, Cloud Jobs → (Python or REST API) → Tracking Server → UI, API
Key Concepts in Tracking
Parameters: key-value inputs to your code
Metrics: numeric values (can update over time)
Artifacts: arbitrary files, including models
Source: what code ran?
MLflow Projects
Project Spec: Code, Data, Config
Local Execution
Remote Execution
Example MLflow Project
my_project/
├── MLproject
├── conda.yaml
├── main.py
└── model.py
    ...
Contents of MLproject:
conda_env: conda.yaml
entry_points:
  main:
    parameters:
      training_data: path
      lambda: {type: float, default: 0.1}
    command: python main.py {training_data} {lambda}
$ mlflow run git://<my_project>
mlflow.run("git://<my_project>", ...)
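To make the parameter declarations in the MLproject file concrete, here is a minimal sketch of how declared defaults and user-supplied values could be merged and substituted into the command template. It is an illustration only, not MLflow's implementation; `render_command` and the `train.csv` path are hypothetical names.

```python
# Illustrative sketch only: not MLflow's actual implementation.
# The dict mirrors the "main" entry point of the example MLproject file.
entry_point = {
    "parameters": {
        "training_data": {"type": "path"},
        "lambda": {"type": "float", "default": 0.1},
    },
    "command": "python main.py {training_data} {lambda}",
}


def render_command(entry_point, user_params):
    """Fill the command template, using declared defaults where needed."""
    values = {}
    for name, spec in entry_point["parameters"].items():
        if name in user_params:
            values[name] = user_params[name]
        elif "default" in spec:
            values[name] = spec["default"]
        else:
            raise ValueError(f"missing required parameter: {name}")
    # format_map accepts any string key, so "lambda" works here even
    # though it is a reserved word in Python source code.
    return entry_point["command"].format_map(values)


# "train.csv" is a made-up path for the example.
print(render_command(entry_point, {"training_data": "train.csv"}))
```

On the command line, overrides of this kind are passed to `mlflow run` with `-P name=value`.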
MLflow Models
Model Format: Flavor 1, Flavor 2 (simple model flavors usable by many tools)
Run Sources → Model Format → Inference Code, Batch & Stream Scoring, Cloud Serving Tools
Example MLflow Model
my_model/
├── MLmodel
└── estimator/
    ├── saved_model.pb
    └── variables/
        ...
Contents of MLmodel:
run_id: 769915006efd4c4bbd662461
time_created: 2018-06-28T12:34
flavors:
  tensorflow:         # usable by tools that understand the TensorFlow model format
    saved_model_dir: estimator
    signature_def_key: predict
  python_function:    # usable by any tool that can run Python (Docker, Spark, etc.)
    loader_module: mlflow.tensorflow
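The idea of one model exposing several flavors can be sketched as a simple lookup: a tool walks its own list of supported flavors and takes the first one the model provides. This is an assumption-laden illustration, not MLflow code; `pick_flavor` is a hypothetical helper and the dict stands in for the parsed MLmodel file.

```python
# Illustrative sketch only: a plain dict standing in for the parsed MLmodel file.
mlmodel = {
    "run_id": "769915006efd4c4bbd662461",
    "time_created": "2018-06-28T12:34",
    "flavors": {
        "tensorflow": {
            "saved_model_dir": "estimator",
            "signature_def_key": "predict",
        },
        "python_function": {"loader_module": "mlflow.tensorflow"},
    },
}


def pick_flavor(mlmodel, supported):
    """Return (name, config) for the first flavor in `supported`
    (a preference-ordered list) that the model provides."""
    for name in supported:
        if name in mlmodel["flavors"]:
            return name, mlmodel["flavors"][name]
    raise LookupError(f"model offers none of: {supported}")


# A TensorFlow-aware tool prefers the native flavor.
print(pick_flavor(mlmodel, ["tensorflow", "python_function"])[0])
# A generic Python tool only understands python_function.
print(pick_flavor(mlmodel, ["python_function"])[0])
```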
What’s new with MLflow
Selected New Features in MLflow 1.0
• Support for logging metrics per user-defined step
• Improved search
• HDFS support for artifacts
• ONNX Model Flavor [experimental]
• Deploying an MLflow Model as a Docker Image [experimental]
Support for logging metrics per user-defined step
Metrics logged at the end of a run, e.g.:
● Overall accuracy
● Overall AUC
● Overall loss
Metrics logged while training, e.g.:
● Accuracy per minibatch
● AUC per minibatch
● Loss per minibatch
Currently visualized by logging order:
Support for logging metrics per user-defined step
New step argument for log_metric
● Define the x coordinate for the metric
● Define ordering and scale of the horizontal axis in visualizations
log_metric(key, value, step=None)
log_metric("exp", 1, step=10)
log_metric("exp", 2, step=1000)
log_metric("exp", 4, step=10000)
log_metric("exp", 8, step=100000)
log_metric("exp", 16, step=1000000)
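The effect of an explicit step can be sketched with a toy in-memory history: each point carries its own x coordinate, so a plot can order points by step rather than by the order in which they were logged. This is not MLflow's storage or API; `record_point` and `series` are hypothetical helpers.

```python
# Illustrative sketch only: a toy in-memory metric history, not MLflow's storage.
from collections import defaultdict

history = defaultdict(list)  # metric key -> list of (step, value) points


def record_point(key, value, step):
    """Store one point together with its user-defined step."""
    history[key].append((step, value))


def series(key):
    """Points sorted by step, i.e. by their x coordinate."""
    return sorted(history[key])


# Same values as on the slide, deliberately logged out of order.
record_point("exp", 4, step=10000)
record_point("exp", 1, step=10)
record_point("exp", 2, step=1000)

print(series("exp"))
```

Sorting by step recovers the intended curve even though the points arrived in a different order.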
Improved Search
Search API supports a simplified version of the SQL WHERE clause, e.g.:
params.model = "LogisticRegression" and metrics.error <= 0.05
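Python API Example
from mlflow.tracking import MlflowClient
from mlflow.entities import ViewType

client = MlflowClient()
# Collect every experiment ID, then filter runs across all of them.
all_experiments = [exp.experiment_id
                   for exp in client.list_experiments()]
runs = client.search_runs(
    all_experiments,
    "params.model='LogisticRegression'"
    " and metrics.error<=0.05",
    ViewType.ALL)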
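To show what such a filter string means, here is a toy evaluator that applies it to plain dictionaries. It is a sketch under strong assumptions: it only handles comparison clauses joined by "and", it is not MLflow's parser, and the run records are made up.

```python
# Illustrative sketch only: a toy evaluator for the filter syntax,
# not MLflow's parser. It handles only clauses joined by "and".
import operator
import re

OPS = {
    "<=": operator.le, ">=": operator.ge, "!=": operator.ne,
    "=": operator.eq, "<": operator.lt, ">": operator.gt,
}
CLAUSE = re.compile(r"^\s*(params|metrics)\.(\w+)\s*(<=|>=|!=|=|<|>)\s*(.+?)\s*$")


def matches(run, filter_string):
    """Return True if `run` satisfies every clause of `filter_string`."""
    for clause in re.split(r"\s+and\s+", filter_string, flags=re.IGNORECASE):
        m = CLAUSE.match(clause)
        if m is None:
            raise ValueError(f"cannot parse clause: {clause!r}")
        kind, key, op, raw = m.groups()
        if key not in run[kind]:
            return False
        if kind == "metrics":
            expected, actual = float(raw), run[kind][key]
        else:
            # Parameter values are compared as strings; strip the quotes.
            expected, actual = raw.strip("'\""), str(run[kind][key])
        if not OPS[op](actual, expected):
            return False
    return True


# Hypothetical run records for the example.
runs = [
    {"params": {"model": "LogisticRegression"}, "metrics": {"error": 0.03}},
    {"params": {"model": "LogisticRegression"}, "metrics": {"error": 0.20}},
    {"params": {"model": "RandomForest"}, "metrics": {"error": 0.01}},
]
query = 'params.model = "LogisticRegression" and metrics.error <= 0.05'
print([r for r in runs if matches(r, query)])
```

Only the first record satisfies both clauses, which mirrors the intent of the query on the slide.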
HDFS Support for Artifacts
mlflow.log_artifact(local_path, artifact_path=None)
Supported Artifact Stores
● AWS S3
● Azure Blob Store
● Google Cloud Storage
● HDFS
● DBFS
● NFS
● FTP
● SFTP
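Artifact locations are typically given as URIs, so the choice of store can be sketched as a lookup on the URI scheme. The scheme names in the table and the example URIs are assumptions based on common URI conventions, not taken from the slides, and `store_for` is a hypothetical helper.

```python
# Illustrative sketch only: maps an artifact URI to the kind of store it names.
from urllib.parse import urlparse

# Assumed scheme-to-store table for this example.
STORES = {
    "s3": "AWS S3",
    "wasbs": "Azure Blob Store",
    "gs": "Google Cloud Storage",
    "hdfs": "HDFS",
    "dbfs": "DBFS",
    "ftp": "FTP",
    "sftp": "SFTP",
}


def store_for(artifact_uri):
    """Name the store type implied by the URI scheme."""
    scheme = urlparse(artifact_uri).scheme
    if scheme in ("", "file"):
        # A plain path: local disk or a mounted share such as NFS.
        return "local or NFS-mounted path"
    if scheme not in STORES:
        raise ValueError(f"unsupported artifact scheme: {scheme}")
    return STORES[scheme]


# Made-up URIs for the example.
for uri in ["hdfs://namenode:8020/mlflow", "s3://my-bucket/mlflow", "/mnt/nfs/mlruns"]:
    print(uri, "->", store_for(uri))
```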
ONNX Model Flavor
[Experimental]
ONNX models export both
• ONNX native format
• Pyfunc
mlflow.onnx.load_model(model_uri)
mlflow.onnx.log_model(onnx_model, artifact_path, conda_env=None)
mlflow.onnx.save_model(onnx_model, path, conda_env=None,
mlflow_model=<mlflow.models.Model object>)
Supported Model Flavors
Scikit, TensorFlow, MLlib, H2O, PyTorch, Keras, MLeap, Python Function, R Function, ONNX
Docker Build
[Experimental]
$ mlflow models build-docker -m "runs:/some-run-uuid/my-model" -n "my-image-name"
$ docker run -p 5001:8080 "my-image-name"
Builds a Docker image whose default entrypoint serves the
specified MLflow model at port 8080 within the container.
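Once the container is running, the model can be scored over HTTP. The sketch below only builds the request; it does not send it. The `/invocations` path and the pandas split-orientation JSON payload are assumptions about the model server of this era (the accepted payload format has varied across MLflow versions), and the column names and values are made up.

```python
# Illustrative sketch only: builds (but does not send) a scoring request.
import json
from urllib.request import Request

# Host port 5001 is mapped to container port 8080 by the docker run command.
URL = "http://localhost:5001/invocations"

# Hypothetical input: column names and values are made up.
payload = {"columns": ["x1", "x2"], "data": [[0.5, 1.5]]}

request = Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json; format=pandas-split"},
    method="POST",
)
print(request.get_method(), request.full_url)

# To send it once the container is up:
#   from urllib.request import urlopen
#   print(urlopen(request).read().decode("utf-8"))
```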
beyond 1.0
What users want to see next
What’s coming soon
• New component: Model Registry
• Version-controlled registry of models
• Model lifecycle management
• Model monitoring
• Auto-logging from common frameworks
• Parallel coordinates plot
• Kubernetes remote run
• Delta Lake integration (Delta.io) for Data Versioning
• And more...
Demo
Thank You

MLFlow 1.0 Meetup
