An MLflow Model is a standard format for packaging machine learning models that can be used in a variety of downstream tools---for example, real-time serving through a REST API or batch inference on Apache Spark. The format defines a convention that lets you save a model in different "flavors" that can be understood by different downstream tools.
Each MLflow Model is a directory containing arbitrary files, together with an MLmodel
file in the root of the directory that can define multiple flavors that the model can be viewed
in.
Flavors are the key concept that makes MLflow Models powerful: they are a convention that deployment
tools can use to understand the model, which makes it possible to write tools that work with models
from any ML library without having to integrate each tool with each library. MLflow defines
several "standard" flavors that all of its built-in deployment tools support, such as a "Python
function" flavor that describes how to run the model as a Python function. However, libraries can
also define and use other flavors. For example, MLflow's :py:mod:`mlflow.sklearn` library allows
loading models back as a scikit-learn Pipeline object for use in code that is aware of
scikit-learn, or as a generic Python function for use in tools that just need to apply the model
(for example, the mlflow sagemaker tool for deploying models to Amazon SageMaker).
All of the flavors that a particular model supports are defined in its MLmodel file in YAML
format. For example, :py:mod:`mlflow.sklearn` outputs models as follows:
# Directory written by mlflow.sklearn.save_model(model, "my_model")
my_model/
├── MLmodel
└── model.pkl
And its MLmodel file describes two flavors:
time_created: 2018-05-25T17:28:53.35
flavors:
  sklearn:
    sklearn_version: 0.19.1
    pickled_model: model.pkl
  python_function:
    loader_module: mlflow.sklearn
This model can then be used with any tool that supports either the sklearn or
python_function model flavor. For example, the mlflow sklearn command can serve a
model with the sklearn flavor:
mlflow sklearn serve my_model
In addition, the mlflow sagemaker command-line tool can package and deploy models to AWS
SageMaker as long as they support the python_function flavor:
mlflow sagemaker deploy -m my_model [other options]
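The flavor lookup that such tools perform can be sketched in a few lines of plain Python. The dictionary below mirrors the flavors section of the MLmodel file shown above as a YAML loader might return it; `pick_flavor` is a hypothetical helper written for this illustration, not part of MLflow.

```python
# Flavors section of the example MLmodel file, in parsed form.
mlmodel = {
    "flavors": {
        "sklearn": {"sklearn_version": "0.19.1", "pickled_model": "model.pkl"},
        "python_function": {"loader_module": "mlflow.sklearn"},
    },
}


def pick_flavor(mlmodel, supported):
    """Return (name, config) for the first flavor in `supported` that the model defines.

    Hypothetical helper: a tool lists the flavors it understands, in order of preference.
    """
    flavors = mlmodel.get("flavors", {})
    for name in supported:
        if name in flavors:
            return name, flavors[name]
    raise ValueError("Model supports none of: {}".format(", ".join(supported)))


# A generic deployment tool only needs the python_function flavor.
name, conf = pick_flavor(mlmodel, ["python_function"])
print(name, conf)
```

A tool that understands only an unrelated flavor would get a `ValueError` here, which is the situation the convention is meant to make explicit.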
Apart from a flavors field listing the model flavors, the MLmodel YAML format can contain the following fields:
You can save and load MLflow Models in multiple ways. First, MLflow includes integrations with several common libraries. For example, :py:mod:`mlflow.sklearn` contains :py:func:`save_model <mlflow.sklearn.save_model>`, :py:func:`log_model <mlflow.sklearn.log_model>`, and :py:func:`load_model <mlflow.sklearn.load_model>` functions for scikit-learn models. Second, you can use the :py:class:`mlflow.models.Model` class to create and write models. This class has four key functions:
MLflow provides several standard flavors that might be useful in your applications. Specifically, many of its deployment tools support these flavors, so you can export your own model in one of these flavors to benefit from all these tools:
Python Function (python_function)
The python_function model flavor defines a generic filesystem format for Python models and provides utilities
for saving and loading models to and from this format. The format is self-contained in the sense
that it includes all the information necessary to load and use a model. Dependencies
are stored either directly with the model or referenced via Conda environment.
Many MLflow Model persistence modules, such as :mod:`mlflow.sklearn`, :mod:`mlflow.keras`,
and :mod:`mlflow.pytorch`, produce models with the python_function (pyfunc) flavor. This
means that they adhere to the :ref:`python_function filesystem format <pyfunc-filesystem-format>`
and can be interpreted as generic Python classes that implement the specified
:ref:`inference API <pyfunc-inference-api>`. Therefore, any tool that operates on these pyfunc
classes can operate on any MLflow Model containing the pyfunc flavor, regardless of which
persistence module or framework was used to produce the model. This interoperability is very
powerful because it allows any Python model to be productionized in a variety of environments.
The convention for python_function models is to have a predict method or function with the following
signature:
predict(model_input: pandas.DataFrame) -> [numpy.ndarray | pandas.Series | pandas.DataFrame]
Other MLflow components expect python_function models to follow this convention.
The python_function :ref:`model format <pyfunc-filesystem-format>` is defined as a directory
structure containing all required data, code, and configuration.
The :py:mod:`mlflow.pyfunc` module defines functions for saving and loading MLflow Models with the
python_function flavor. This module also includes utilities for creating custom Python models.
For more information, see the :ref:`custom Python models documentation <custom-python-models>`
and the :mod:`mlflow.pyfunc` documentation.
H2O (h2o)
The h2o model flavor enables logging and loading H2O models.
The :py:mod:`mlflow.h2o` module defines :py:func:`save_model() <mlflow.h2o.save_model>` and
:py:func:`log_model() <mlflow.h2o.log_model>` methods for saving H2O models in MLflow Model format.
These methods produce MLflow Models with the python_function flavor, allowing you to load them
as generic Python functions for inference via :py:func:`mlflow.pyfunc.load_pyfunc()`. When you load
MLflow Models with the h2o flavor using :py:func:`load_pyfunc() <mlflow.pyfunc.load_pyfunc>`,
the h2o.init() method is called, so a compatible version of h2o must be installed in the loader's
environment. You can customize the arguments given to h2o.init() by modifying the
init entry of the persisted H2O model's YAML configuration file: model.h2o/h2o.yaml.
Finally, you can use the :py:func:`mlflow.h2o.load_model()` method to load MLflow Models with the
h2o flavor as H2O model objects.
For more information, see :py:mod:`mlflow.h2o`.
Keras (keras)
The keras model flavor enables logging and loading Keras models. The :py:mod:`mlflow.keras`
module defines :py:func:`save_model() <mlflow.keras.save_model>` and
:py:func:`log_model() <mlflow.keras.log_model>` functions that you can use to save Keras models
in MLflow Model format. These functions serialize Keras models as HDF5 files using the Keras
library's built-in model persistence functions. MLflow Models produced by these functions
also contain the python_function flavor, allowing them to be interpreted as generic
Python functions for inference via :py:func:`mlflow.pyfunc.load_pyfunc()`. Finally, you can use
the :py:func:`mlflow.keras.load_model()` function to load MLflow Models with the
keras flavor as Keras Model objects.
For more information, see :py:mod:`mlflow.keras`.
MLeap (mleap)
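The mleap model flavor supports saving Spark models in MLflow format using the
MLeap persistence mechanism. MLeap is an inference-optimized format and execution engine for
Spark models that does not depend on SparkContext
to evaluate inputs.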
You can save Spark models in MLflow format with the mleap flavor by specifying the
sample_input argument of the :py:func:`mlflow.spark.save_model()` or
:py:func:`mlflow.spark.log_model()` method (recommended). The :py:mod:`mlflow.mleap` module also
defines :py:func:`save_model() <mlflow.mleap.save_model>` and
:py:func:`log_model() <mlflow.mleap.log_model>` methods for saving MLeap models in MLflow format,
but these methods do not include the python_function flavor in the models they produce.
A companion module for loading MLflow Models with the MLeap flavor is available in the
mlflow/java package.
For more information, see :py:mod:`mlflow.spark`, :py:mod:`mlflow.mleap`, and the
MLeap documentation.
PyTorch (pytorch)
The pytorch model flavor enables logging and loading PyTorch models.
The :py:mod:`mlflow.pytorch` module defines utilities for saving and loading MLflow Models with the
pytorch flavor. You can use the :py:func:`mlflow.pytorch.save_model()` and
:py:func:`mlflow.pytorch.log_model()` methods to save PyTorch models in MLflow format; both of these
functions use the torch.save() method to serialize PyTorch models. Additionally, you can use the
:py:func:`mlflow.pytorch.load_model()`
method to load MLflow Models with the pytorch flavor as PyTorch model objects. Finally, models
produced by :py:func:`mlflow.pytorch.save_model()` and :py:func:`mlflow.pytorch.log_model()` contain
the python_function flavor, allowing you to load them as generic Python functions for inference
via :py:func:`mlflow.pyfunc.load_pyfunc()`.
For more information, see :py:mod:`mlflow.pytorch`.
Scikit-learn (sklearn)
The sklearn model flavor provides an easy-to-use interface for saving and loading scikit-learn
models. The :py:mod:`mlflow.sklearn` module defines
:py:func:`save_model() <mlflow.sklearn.save_model>` and
:py:func:`log_model() <mlflow.sklearn.log_model>` functions that save scikit-learn models in
MLflow format, using either Python's pickle module (Pickle) or CloudPickle for model serialization.
These functions produce MLflow Models with the python_function flavor, allowing them to
be loaded as generic Python functions for inference via :py:func:`mlflow.pyfunc.load_pyfunc()`.
Finally, you can use the :py:func:`mlflow.sklearn.load_model()` method to load MLflow Models with
the sklearn flavor as scikit-learn model objects.
For more information, see :py:mod:`mlflow.sklearn`.
Spark MLlib (spark)
The spark model flavor enables exporting Spark MLlib models as MLflow Models.
The :py:mod:`mlflow.spark` module defines :py:func:`save_model() <mlflow.spark.save_model>` and
:py:func:`log_model() <mlflow.spark.log_model>` methods that save Spark MLlib pipelines in MLflow
model format. MLflow Models produced by these functions contain the python_function flavor,
allowing you to load them as generic Python functions via :py:func:`mlflow.pyfunc.load_pyfunc()`.
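When a model with the spark flavor is loaded as a Python function via
:py:func:`load_pyfunc() <mlflow.spark.load_pyfunc>`, a new
SparkContext is created for model inference; additionally, the function converts all pandas
DataFrame inputs to Spark DataFrames before scoring. This initialization overhead and format
translation latency is not ideal for high-performance use cases, but it enables you to deploy any
MLlib PipelineModel to any production environment supported by MLflow
(SageMaker, AzureML, etc).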
Finally, the :py:func:`mlflow.spark.load_model()` method is used to load MLflow Models with
the spark flavor as Spark MLlib pipelines.
For more information, see :py:mod:`mlflow.spark`.
TensorFlow (tensorflow)
The tensorflow model flavor allows serialized TensorFlow models in SavedModel format to be
logged in MLflow format via the :py:func:`mlflow.tensorflow.save_model()` and
:py:func:`mlflow.tensorflow.log_model()` methods. These methods also add the python_function
flavor to the MLflow Models that they produce, allowing the models to be interpreted as generic
Python functions for inference via :py:func:`mlflow.pyfunc.load_pyfunc()`. Finally, you can use the
:py:func:`mlflow.tensorflow.load_model()` method to load MLflow Models with the tensorflow
flavor as TensorFlow graphs.
For more information, see :py:mod:`mlflow.tensorflow`.
While MLflow's built-in model persistence utilities are convenient for packaging models from various popular ML libraries in MLflow Model format, they do not cover every use case. For example, you may want to use a model from an ML library that is not explicitly supported by MLflow's built-in flavors. Alternatively, you may want to package custom inference code and data to create an MLflow Model. Fortunately, MLflow provides two solutions that can be used to accomplish these tasks: :ref:`custom-python-models` and :ref:`custom-flavors`.
The :py:mod:`mlflow.pyfunc` module provides :py:func:`save_model() <mlflow.pyfunc.save_model>` and
:py:func:`log_model() <mlflow.pyfunc.log_model>` utilities for creating MLflow Models with the
python_function flavor that contain user-specified code and artifact (file) dependencies.
These artifact dependencies may include serialized models produced by any Python ML library.
Because these custom models contain the python_function flavor, they can be deployed
to any of MLflow's supported production environments, such as SageMaker, AzureML, or local
REST endpoints.
The following examples demonstrate how you can use the :py:mod:`mlflow.pyfunc` module to create
custom Python models. For additional information about model customization with MLflow's
python_function utilities, see the
:ref:`python_function custom models documentation <pyfunc-create-custom>`.
This example defines a class for a custom model that adds a specified numeric value, n, to all
columns of a Pandas DataFrame input. Then, it uses the :py:mod:`mlflow.pyfunc` APIs to save an
instance of this model with n = 5 in MLflow Model format. Finally, it loads the model in
python_function format and uses it to evaluate a sample input.
import mlflow.pyfunc

# Define the model class
class AddN(mlflow.pyfunc.PythonModel):

    def __init__(self, n):
        self.n = n

    def predict(self, context, model_input):
        return model_input.apply(lambda column: column + self.n)

# Construct and save the model
model_path = "add_n_model"
add5_model = AddN(n=5)
mlflow.pyfunc.save_model(dst_path=model_path, python_model=add5_model)

# Load the model in `python_function` format
loaded_model = mlflow.pyfunc.load_pyfunc(model_path)

# Evaluate the model
import pandas as pd
model_input = pd.DataFrame([range(10)])
model_output = loaded_model.predict(model_input)
assert model_output.equals(pd.DataFrame([range(5, 15)]))
This example begins by training and saving a gradient boosted tree model using the XGBoost
library. Next, it defines a wrapper class around the XGBoost model that conforms to MLflow's
python_function :ref:`inference API <pyfunc-inference-api>`. Then, it uses the wrapper class and
the saved XGBoost model to construct an MLflow Model that performs inference using the gradient
boosted tree. Finally, it loads the MLflow Model in python_function format and uses it to
evaluate test data.
# Load training and test datasets
import xgboost as xgb
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
x = iris.data[:, 2:]
y = iris.target
x_train, x_test, y_train, _ = train_test_split(x, y, test_size=0.2, random_state=42)
dtrain = xgb.DMatrix(x_train, label=y_train)

# Train and save an XGBoost model
xgb_model = xgb.train(params={'max_depth': 10}, dtrain=dtrain, num_boost_round=10)
xgb_model_path = "xgb_model.pth"
xgb_model.save_model(xgb_model_path)

# Create an `artifacts` dictionary that assigns a unique name to the saved XGBoost model file.
# This dictionary will be passed to `mlflow.pyfunc.save_model`, which will copy the model file
# into the new MLflow Model's directory.
artifacts = {
    "xgb_model": xgb_model_path
}

# Define the model class
import mlflow.pyfunc
class XGBWrapper(mlflow.pyfunc.PythonModel):

    def load_context(self, context):
        import xgboost as xgb
        self.xgb_model = xgb.Booster()
        self.xgb_model.load_model(context.artifacts["xgb_model"])

    def predict(self, context, model_input):
        input_matrix = xgb.DMatrix(model_input.values)
        return self.xgb_model.predict(input_matrix)

# Create a Conda environment for the new MLflow Model that contains the XGBoost library
# as a dependency, as well as the required CloudPickle library
import cloudpickle
conda_env = {
    'channels': ['defaults'],
    'dependencies': [
        'xgboost={}'.format(xgb.__version__),
        'cloudpickle={}'.format(cloudpickle.__version__),
    ],
    'name': 'xgb_env'
}

# Save the MLflow Model
mlflow_pyfunc_model_path = "xgb_mlflow_pyfunc"
mlflow.pyfunc.save_model(
    dst_path=mlflow_pyfunc_model_path, python_model=XGBWrapper(), artifacts=artifacts,
    conda_env=conda_env)

# Load the model in `python_function` format
loaded_model = mlflow.pyfunc.load_pyfunc(mlflow_pyfunc_model_path)

# Evaluate the model
import pandas as pd
test_predictions = loaded_model.predict(pd.DataFrame(x_test))
print(test_predictions)
You can also create custom MLflow Models by writing a custom flavor.
As discussed in the :ref:`model-api` and :ref:`model-storage-format` sections, an MLflow Model
is defined by a directory of files that contains an MLmodel configuration file. This MLmodel
file describes various model attributes, including the flavors in which the model can be
interpreted. The MLmodel file contains an entry for each flavor name; each entry is
a YAML-formatted collection of flavor-specific attributes.
To create a new flavor to support a custom model, you define the set of flavor-specific attributes
to include in the MLmodel configuration file, as well as the code that can interpret the
contents of the model directory and the flavor's attributes.
As an example, let's examine the :py:mod:`mlflow.pytorch` module corresponding to MLflow's
pytorch flavor. In the :py:func:`mlflow.pytorch.save_model()` method, a PyTorch model is saved
to a specified output directory. Additionally, :py:func:`mlflow.pytorch.save_model()` leverages the
:py:func:`mlflow.models.Model.add_flavor()` and :py:func:`mlflow.models.Model.save()` functions to
produce an MLmodel configuration containing the pytorch flavor. The resulting configuration
has several flavor-specific attributes, such as pytorch_version, which denotes the version of the
PyTorch library that was used to train the model. To interpret model directories produced by
:py:func:`save_model() <mlflow.pytorch.save_model>`, the :py:mod:`mlflow.pytorch` module also
defines a :py:func:`load_model() <mlflow.pytorch.load_model>` method.
:py:func:`mlflow.pytorch.load_model()` reads the MLmodel configuration from a specified
model directory and uses the configuration attributes of the pytorch flavor to load
and return a PyTorch model from its serialized representation.
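The save/load pairing described above can be sketched without any ML library. The example below is a minimal illustration under stated assumptions, not MLflow's implementation: the flavor name `linear_json`, its `data` and `format_version` attributes, and both functions are hypothetical. To stay within the standard library it writes the MLmodel configuration as JSON, which YAML parsers generally accept; real code would use :py:func:`mlflow.models.Model.add_flavor()` and :py:func:`mlflow.models.Model.save()` instead.

```python
import json
import os
import tempfile

FLAVOR_NAME = "linear_json"  # hypothetical custom flavor name


def save_model(weights, path):
    """Write the model data plus an MLmodel file describing the custom flavor."""
    os.makedirs(path)
    data_file = "weights.json"
    with open(os.path.join(path, data_file), "w") as f:
        json.dump(weights, f)
    # Flavor-specific attributes: where the data lives and which layout version it uses.
    mlmodel = {"flavors": {FLAVOR_NAME: {"data": data_file, "format_version": 1}}}
    with open(os.path.join(path, "MLmodel"), "w") as f:
        json.dump(mlmodel, f, indent=2)


def load_model(path):
    """Read the MLmodel file and use the flavor's attributes to restore the model."""
    with open(os.path.join(path, "MLmodel")) as f:
        conf = json.load(f)["flavors"][FLAVOR_NAME]
    with open(os.path.join(path, conf["data"])) as f:
        return json.load(f)


model_dir = os.path.join(tempfile.mkdtemp(), "my_model")
save_model({"slope": 2.0, "intercept": 1.0}, model_dir)
restored = load_model(model_dir)
print(restored)
```

The loader depends only on the attributes recorded under its own flavor entry, so other flavors can be added to the same MLmodel file without affecting it.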
MLflow provides tools for deploying models on a local machine and to several production environments. Not all deployment methods are available for all model flavors. Deployment is supported for the Python Function format and all compatible formats.
Deploy a python_function model as a local REST API endpoint
MLflow can deploy models locally as local REST API endpoints or to directly score CSV files. This functionality is a convenient way of testing models before deploying to a remote model server. You deploy the Python Function flavor locally using the CLI interface to the :py:mod:`mlflow.pyfunc` module. The local REST API server accepts the following data formats as inputs:
* JSON-serialized pandas DataFrames in the split orientation. For example,
  data = pandas_df.to_json(orient='split'). This format is specified using a Content-Type
  request header value of application/json or application/json; format=pandas-split.
* JSON-serialized pandas DataFrames in the records orientation. We do not recommend using
  this format because it is not guaranteed to preserve column ordering. This format is
  specified using a Content-Type request header value of
  application/json; format=pandas-records.
* CSV-serialized pandas DataFrames. For example, data = pandas_df.to_csv(). This format is
  specified using a Content-Type request header value of text/csv.

For more information about serializing pandas DataFrames, see the pandas.DataFrame.to_json documentation.

Commands
For more info, see:
mlflow pyfunc --help
mlflow pyfunc serve --help
mlflow pyfunc predict --help
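The difference between the two JSON input formats accepted by the local server can be shown with the standard library alone. The sketch below builds a split-oriented and a records-oriented payload by hand, using `columns` and `data` keys for the former, and prepares a POST request without sending it. The URL (host, port, and `/invocations` path) and the sample column names are assumptions made for illustration.

```python
import json
import urllib.request

columns = ["alcohol", "pH"]  # illustrative column names
rows = [[8.8, 3.0], [9.5, 3.2]]

# split orientation: column names appear once, in order, alongside the row data.
split_payload = json.dumps({"columns": columns, "data": rows})

# records orientation: one JSON object per row. JSON objects are unordered by
# specification, which is why column ordering is not guaranteed to survive.
records_payload = json.dumps([dict(zip(columns, row)) for row in rows])

# Assumed address of a locally served model; adjust to your deployment.
url = "http://127.0.0.1:5000/invocations"

request = urllib.request.Request(
    url,
    data=split_payload.encode("utf-8"),
    headers={"Content-Type": "application/json; format=pandas-split"},
    method="POST",
)

# urllib.request.urlopen(request) would send the request; it is omitted here
# because no server is assumed to be running.
print(split_payload)
print(records_payload)
```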
Deploy a python_function model on Microsoft Azure ML
The :py:mod:`mlflow.azureml` module can package python_function models into Azure ML container images.
These images can be deployed to Azure Kubernetes Service (AKS) and the Azure Container Instances (ACI)
platform for real-time serving. The resulting Azure ML ContainerImage contains a web server that
accepts the following data formats as input:
* JSON-serialized pandas DataFrames in the split orientation. For example, data = pandas_df.to_json(orient='split'). This format is specified using a Content-Type request header value of application/json.

Example workflow using the MLflow CLI
mlflow azureml build-image -w <workspace-name> -m <model-path> -d "Wine regression model 1"
az ml service create aci -n <deployment-name> --image-id <image-name>:<image-version>
# After the image deployment completes, requests can be posted via HTTP to the new ACI
# webservice's scoring URI. The following example posts a sample input from the wine dataset
# used in the MLflow ElasticNet example:
# https://github.com/mlflow/mlflow/tree/master/examples/sklearn_elasticnet_wine
scoring_uri=$(az ml service show --name <deployment-name> -v | jq -r ".scoringUri")
# `sample_input` is a JSON-serialized pandas DataFrame with the `split` orientation
sample_input='
{
    "columns": [
        "alcohol",
        "chlorides",
        "citric acid",
        "density",
        "fixed acidity",
        "free sulfur dioxide",
        "pH",
        "residual sugar",
        "sulphates",
        "total sulfur dioxide",
        "volatile acidity"
    ],
    "data": [
        [8.8, 0.045, 0.36, 1.001, 7, 45, 3, 20.7, 0.45, 170, 0.27]
    ]
}'
# Note the space before each trailing backslash: without it, the next option
# would be appended directly to the preceding argument.
echo "$sample_input" | curl -s -X POST "$scoring_uri" \
    -H 'Cache-Control: no-cache' \
    -H 'Content-Type: application/json' \
    -d @-
For more info, see:
mlflow azureml --help
mlflow azureml build-image --help
Deploy a python_function model on Amazon SageMaker
The :py:mod:`mlflow.sagemaker` module can deploy python_function models locally in a Docker
container with a SageMaker-compatible environment and remotely on SageMaker.
To deploy remotely to SageMaker, you need to set up your environment and user accounts.
To export a custom model to SageMaker, you need an MLflow-compatible Docker image to be available on Amazon ECR.
MLflow provides a default Docker image definition; however, it is up to you to build the image and upload it to ECR.
MLflow includes the utility function build_and_push_container to perform this step. Once built and uploaded, you can use the MLflow container for all MLflow Models. Model webservers deployed using the :py:mod:`mlflow.sagemaker`
module accept the following data formats as input, depending on the deployment flavor:
* python_function: For this deployment flavor, the endpoint accepts the same formats
  as the pyfunc server. These formats are described in the
  :ref:`pyfunc deployment documentation <pyfunc_deployment>`.
* mleap: For this deployment flavor, the endpoint accepts only
  JSON-serialized pandas DataFrames in the split orientation. For example,
  data = pandas_df.to_json(orient='split'). This format is specified using a Content-Type
  request header value of application/json.

Example workflow using the MLflow CLI
mlflow sagemaker build-and-push-container - build the container (only needs to be called once)
mlflow sagemaker run-local -m <path-to-model> - test the model locally
mlflow sagemaker deploy <parameters> - deploy the model remotely
For more info, see:
mlflow sagemaker --help
mlflow sagemaker build-and-push-container --help
mlflow sagemaker run-local --help
mlflow sagemaker deploy --help
Export a python_function model as an Apache Spark UDF
You can output a python_function model as an Apache Spark UDF, which can be uploaded to a
Spark cluster and used to score the model.
Example
pyfunc_udf = mlflow.pyfunc.spark_udf(<path-to-model>)
df = spark_df.withColumn("prediction", pyfunc_udf(<features>))
The resulting UDF is based on Spark's Pandas UDF and is currently limited to producing either a single
value or an array of values of the same type per observation. By default, the UDF returns the first
numeric column as a double. You can control what result is returned by supplying the result_type
argument. The following values are supported:
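* 'int' or IntegerType: The leftmost integer that can fit in an int32 result is returned,
  or an exception is raised if there is none.
* 'long' or LongType: The leftmost long integer that can fit in an int64 result is returned,
  or an exception is raised if there is none.
* ArrayType (IntegerType | LongType): All integer columns that can fit into the requested
  size are returned.
* 'float' or FloatType: The leftmost numeric result cast to float32 is returned,
  or an exception is raised if there is no numeric column.
* 'double' or DoubleType: The leftmost numeric result cast to double is returned,
  or an exception is raised if there is no numeric column.
* ArrayType (FloatType | DoubleType): All numeric columns cast to the requested type are
  returned. An exception is raised if there are no numeric columns.
* 'string' or StringType: The leftmost column converted to string is returned.
* ArrayType (StringType): All columns converted to string are returned.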
from pyspark.sql.types import ArrayType, FloatType
pyfunc_udf = mlflow.pyfunc.spark_udf(<path-to-model>, result_type=ArrayType(FloatType()))
# The prediction column will contain all the numeric columns returned by the model as floats
df = spark_df.withColumn("prediction", pyfunc_udf(<features>))