Projects#
An MLflow project is a format for packaging data science code in a reusable and reproducible way. The MLflow Projects component includes an API and command-line tools for running projects, which also integrate with the Tracking component to automatically record the parameters and git commit of your source code for reproducibility. This document describes how to run MLflow projects on Oracle Cloud Infrastructure.
Data Science Jobs#
The examples in this section show how to run MLflow projects on OCI Data Science jobs within the different runtimes supported by the service. All examples were taken from the MLflow official GitHub repository.
Prerequisites
Based on the General Machine Learning for CPUs on Python 3.8 (generalml_p38_cpu_v1) environment, create and publish a custom conda environment with additional libraries:
mlflow
oci-mlflow
Data Science Config#
The OCI Data Science config is a JSON file that contains the authentication information as well as the path to the job template YAML file.
{
  "oci_job_template_path": "{work_dir}/oci-datascience-template.yaml",
  "oci_auth": "api_key",
  "oci_config_path": "~/.oci/config",
  "oci_profile": "DEFAULT"
}
The {work_dir} placeholder indicates that the YAML template is located inside the project directory. It is automatically replaced with the absolute path to the project. When the YAML template cannot be placed in the project folder, an absolute or relative path can be used instead.
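The placeholder substitution described above can be pictured with a few lines of Python. This is only an illustrative sketch, not the backend's actual code: the function name resolve_template_path is hypothetical, and it assumes the substitution is a plain string replacement with the project's absolute path.

```python
import os


def resolve_template_path(template_path: str, project_dir: str) -> str:
    """Replace the {work_dir} placeholder with the absolute project path.

    Hypothetical helper for illustration; paths without the placeholder
    are returned unchanged (absolute or relative, as written).
    """
    work_dir = os.path.abspath(project_dir)
    return template_path.replace("{work_dir}", work_dir)
```

A path such as "{work_dir}/oci-datascience-template.yaml" then resolves relative to the project folder, while a path without the placeholder is passed through as-is.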
Supported authentication types:
API Key-Based Authentication - api_key
Resource Principal Authentication - resource_principal
Instance Principal Authentication - instance_principal
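As a rough sketch, the config file can also be assembled programmatically. The helper below is hypothetical (build_backend_config is not part of oci-mlflow), and it assumes that oci_config_path and oci_profile are only relevant for api_key authentication; check the backend documentation before relying on that.

```python
import json

# Authentication types listed in this document.
SUPPORTED_AUTH = ("api_key", "resource_principal", "instance_principal")


def build_backend_config(
    template_path: str = "{work_dir}/oci-datascience-template.yaml",
    auth: str = "api_key",
    config_path: str = "~/.oci/config",
    profile: str = "DEFAULT",
) -> dict:
    """Build the backend config dictionary (hypothetical helper)."""
    if auth not in SUPPORTED_AUTH:
        raise ValueError(f"Unsupported auth type: {auth!r}")
    config = {"oci_job_template_path": template_path, "oci_auth": auth}
    # Assumption: the OCI config file and profile only matter for API keys.
    if auth == "api_key":
        config["oci_config_path"] = config_path
        config["oci_profile"] = profile
    return config


# Serialize the result; write this string to oci-datascience-config.json.
config_json = json.dumps(build_backend_config(), indent=2)
```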
Data Science Job Template#
The template file describes the infrastructure on which a Data Science job should run, together with the runtime information. It is divided into two main sections: infrastructure and runtime. The template can also be generated using the ads opctl init command. More details can be found in the ADS documentation.
Data Science Job Infrastructure#
The Data Science job infrastructure allows specifying the configuration of the job instance. It includes such information as:
Compartment ID
Project ID
Subnet ID
Compute Shape
Block Storage Size
Log Group ID
Log ID
More details about job infrastructure can be found in the ADS documentation.
infrastructure:
  kind: infrastructure
  spec:
    blockStorageSize: 50
    subnetId: ocid1.subnet.oc1.iad..<unique_ID>
    compartmentId: ocid1.compartment.oc1..<unique_ID>
    projectId: ocid1.datascienceproject.oc1.iad..<unique_ID>
    logGroupId: ocid1.loggroup.oc1.iad..<unique_ID>
    logId: ocid1.log.oc1.iad..<unique_ID>
    shapeConfigDetails:
      memoryInGBs: 20
      ocpus: 2
    shapeName: VM.Standard.E3.Flex
  type: dataScienceJob
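Because the template ships with <unique_ID> style placeholders, it can help to check that none are left before submitting a job. The following is a minimal sketch under the assumption that the spec section has already been loaded into a Python dictionary; find_unfilled is a hypothetical helper, not part of ADS or oci-mlflow.

```python
import re

# Matches angle-bracket placeholders such as <unique_ID>.
PLACEHOLDER = re.compile(r"<[^<>]+>")


def find_unfilled(spec: dict, prefix: str = "") -> list:
    """Return dotted key paths whose string values still hold a placeholder."""
    problems = []
    for key, value in spec.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested sections such as shapeConfigDetails.
            problems.extend(find_unfilled(value, prefix=f"{path}."))
        elif isinstance(value, str) and PLACEHOLDER.search(value):
            problems.append(path)
    return problems
```

An empty result means every OCID in the infrastructure section has been filled in.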
Data Science Job Runtime#
The runtime of a job defines the source code of your workload, environment variables, CLI arguments, and other configuration for the environment that runs the workload. You will not work with the runtimes directly, but you have to specify a YAML definition of the runtime to run an MLflow project on a Data Science job.
Depending on the source code, different runtime types are available for defining a Data Science job:
PythonRuntime for Python code stored locally, in OCI Object Storage, or in another remote location supported by fsspec.
NotebookRuntime for a single Jupyter notebook stored locally, in OCI Object Storage, or in another remote location supported by fsspec.
ContainerRuntime for container images.
PythonRuntime:
runtime:
  kind: runtime
  spec:
    args: []
    conda:
      type: published
      uri: <oci://bucket@namespace/prefix>
    env:
    - name: http_proxy
      value: <http://ip:port>
    entrypoint: "{Entry point script. For the MLflow will be replaced with the CMD}"
    scriptPathURI: "{Path to the script. For the MLflow will be replaced with path to the project}"
  type: python
NotebookRuntime:
runtime:
  kind: runtime
  spec:
    args: []
    conda:
      type: published
      uri: <oci://bucket@namespace/prefix>
    env:
    - name: http_proxy
      value: <http://ip:port>
    entrypoint: "{Entry point notebook. For MLflow, it will be replaced with the CMD}"
    source: "{Path to the source code directory. For MLflow, it will be replaced with path to the project}"
    notebookEncoding: utf-8
  type: notebook
ContainerRuntime:
runtime:
  kind: runtime
  spec:
    image: <iad.ocir.io/namespace/image_name:version>
    cmd: "{Container CMD. For MLflow, it will be replaced with the Project CMD}"
    entrypoint:
    - bash
    - --login
    - -c
  type: container
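The choice among the three runtime types can be summarized in a small helper. This is a sketch for orientation only: runtime_type is a hypothetical function, and the rule it encodes (a container image wins, otherwise the entry point's file extension decides) is an assumption drawn from the descriptions above rather than documented backend behavior.

```python
from pathlib import PurePosixPath


def runtime_type(entry_point: str, image: str = "") -> str:
    """Pick the value for the runtime "type" field (hypothetical helper)."""
    if image:
        # A container image implies ContainerRuntime.
        return "container"
    suffix = PurePosixPath(entry_point).suffix.lower()
    if suffix == ".ipynb":
        return "notebook"
    if suffix == ".py":
        return "python"
    raise ValueError(f"Cannot infer a runtime type for {entry_point!r}")
```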
Running MLflow project within PythonRuntime#
This example demonstrates an MLflow project that trains a linear regression model on the UC Irvine Wine Quality Dataset. To run this example on a Data Science job, the custom conda environment needs to be prepared and published to the Object Storage bucket. The project can be run from source or by using a Git URI.
To run the project from source, pull the sklearn_elasticnet_wine project from the GitHub repository. If you want to run the project with a Git URI, create a sklearn_elasticnet_wine folder.
Prepare and publish a custom conda environment. The libraries listed below need to be installed in your custom conda environment. This step can be skipped if you already prepared the custom conda environment following the Prerequisites section at the beginning of this document.
scikit-learn
mlflow
pandas
oci-mlflow
Prepare an oci-datascience-config.json file containing the authentication information and the path to the job configuration YAML file.
{
  "oci_auth": "api_key",
  "oci_job_template_path": "oci-datascience-template.yaml"
}
Copy the oci-datascience-config.json file to the sklearn_elasticnet_wine folder.
Prepare an oci-datascience-template.yaml job configuration file.
kind: job
name: "{Job name. For the MLflow will be replaced with the Project name}"
spec:
  infrastructure:
    kind: infrastructure
    spec:
      blockStorageSize: 50
      subnetId: ocid1.subnet.oc1.iad..<unique_ID>
      compartmentId: ocid1.compartment.oc1..<unique_ID>
      projectId: ocid1.datascienceproject.oc1.iad..<unique_ID>
      logGroupId: ocid1.loggroup.oc1.iad..<unique_ID>
      logId: ocid1.log.oc1.iad..<unique_ID>
      shapeConfigDetails:
        memoryInGBs: 20
        ocpus: 2
      shapeName: VM.Standard.E3.Flex
    type: dataScienceJob
  runtime:
    kind: runtime
    spec:
      args: []
      conda:
        type: published
        uri: <oci://bucket@namespace/prefix>
      entrypoint: "{Entry point script. For the MLflow will be replaced with the CMD}"
      scriptPathURI: "{Path to the script. For the MLflow will be replaced with path to the project}"
    type: python
Copy the oci-datascience-template.yaml file to the sklearn_elasticnet_wine folder.
Run the project from the source
Using the CLI:
cd ~/sklearn_elasticnet_wine
export MLFLOW_TRACKING_URI=<tracking_uri>
mlflow run . --experiment-name My_Experiment --backend oci-datascience --backend-config ./oci-datascience-config.json
Using the Python API:
import mlflow

mlflow.set_tracking_uri("<tracking_uri>")
mlflow.run(
    ".",
    parameters={"alpha": 0.7, "l1-ratio": 0.06},
    experiment_name="My_Experiment",
    backend="oci-datascience",
    backend_config="oci-datascience-config.json",
)
Run the project with a Git URI
Using the CLI:
cd ~/sklearn_elasticnet_wine
export MLFLOW_TRACKING_URI=<tracking_uri>
mlflow run https://github.com/mlflow/mlflow#examples/sklearn_elasticnet_wine --experiment-name My_Experiment --backend oci-datascience --backend-config ./oci-datascience-config.json
Using the Python API:
import mlflow

mlflow.set_tracking_uri("<tracking_uri>")
mlflow.run(
    "https://github.com/mlflow/mlflow#examples/sklearn_elasticnet_wine",
    experiment_name="My_Experiment",
    backend="oci-datascience",
    backend_config="oci-datascience-config.json",
)
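When the CLI call needs to be scripted, the mlflow run command can be assembled as an argument list. The sketch below only builds the list and does not start a run; build_mlflow_run_command is a hypothetical helper, and it uses the standard mlflow run options --experiment-name, --backend, --backend-config, and -P for parameters.

```python
def build_mlflow_run_command(
    uri: str,
    experiment_name: str,
    backend_config: str,
    parameters: dict = None,
    backend: str = "oci-datascience",
) -> list:
    """Assemble the `mlflow run` argument list (hypothetical helper)."""
    cmd = [
        "mlflow", "run", uri,
        "--experiment-name", experiment_name,
        "--backend", backend,
        "--backend-config", backend_config,
    ]
    # Each project parameter is passed as -P name=value.
    for name, value in (parameters or {}).items():
        cmd += ["-P", f"{name}={value}"]
    return cmd
```

The resulting list can be handed to subprocess.run(cmd, check=True) once MLFLOW_TRACKING_URI is set in the environment.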
Running MLflow project within NotebookRuntime#
This example demonstrates an MLflow project that trains a linear regression model on the UC Irvine Wine Quality Dataset. To run this example on a Data Science job, the custom conda environment needs to be prepared and published to the Object Storage bucket.
Download the sklearn_elasticnet_wine project from the GitHub repository.
Prepare and publish a custom conda environment. The libraries listed below need to be installed in your custom conda environment.
scikit-learn
mlflow
pandas
oci-mlflow
Prepare an oci-datascience-config.json file containing the authentication information and the path to the job configuration YAML file.
{
  "oci_auth": "api_key",
  "oci_job_template_path": "{work_dir}/oci-datascience-template.yaml"
}
Copy the oci-datascience-config.json file to the sklearn_elasticnet_wine folder.
Prepare an oci-datascience-template.yaml job configuration file.
kind: job
name: "{Job name. For the MLflow will be replaced with the Project name}"
spec:
  infrastructure:
    kind: infrastructure
    spec:
      blockStorageSize: 50
      subnetId: ocid1.subnet.oc1.iad..<unique_ID>
      compartmentId: ocid1.compartment.oc1..<unique_ID>
      projectId: ocid1.datascienceproject.oc1.iad..<unique_ID>
      logGroupId: ocid1.loggroup.oc1.iad..<unique_ID>
      logId: ocid1.log.oc1.iad..<unique_ID>
      shapeConfigDetails:
        memoryInGBs: 20
        ocpus: 2
      shapeName: VM.Standard.E3.Flex
    type: dataScienceJob
  runtime:
    kind: runtime
    spec:
      args: []
      conda:
        type: published
        uri: <oci://bucket@namespace/prefix>
      entrypoint: "{Entry point notebook. For MLflow, it will be replaced with the CMD}"
      source: "{Path to the source code directory. For MLflow, it will be replaced with path to the project}"
      notebookEncoding: utf-8
    type: notebook
Copy the oci-datascience-template.yaml file to the sklearn_elasticnet_wine folder.
Update the MLproject file with the content provided below.
name: tutorial
entry_points:
  main:
    command: "train.ipynb"
Run the project
Using the CLI:
cd ~/sklearn_elasticnet_wine
export MLFLOW_TRACKING_URI=<tracking_uri>
mlflow run . --experiment-name My_Experiment --backend oci-datascience --backend-config ./oci-datascience-config.json
Using the Python API:
import mlflow

mlflow.set_tracking_uri("<tracking_uri>")
mlflow.run(
    ".",
    experiment_name="My_Experiment",
    backend="oci-datascience",
    backend_config="oci-datascience-config.json",
)
Running MLflow project within ContainerRuntime#
This example demonstrates an MLflow project that trains a linear regression model on the UC Irvine Wine Quality Dataset. As a first step, download the docker example from the MLflow official GitHub repository and go through the README.rst document provided within the project. The project uses a Docker image to capture the dependencies needed to run the training code. Running a project in a Docker environment (as opposed to conda) allows for capturing non-Python dependencies, e.g. Java libraries. Once all steps from the README.rst are completed and the project runs in the local environment, follow the steps below to run the project on OCI Data Science jobs.
Download the docker project from the GitHub repository and place the code in the sklearn_elasticnet_wine folder.
Prepare a Docker image following the steps from the README.rst. Add the oci-mlflow library to the Dockerfile.
FROM python:3.8
RUN pip install mlflow \
    && pip install oci \
    && pip install oracle-ads \
    && pip install numpy \
    && pip install scipy \
    && pip install pandas \
    && pip install scikit-learn \
    && pip install cloudpickle \
    && pip install oci-mlflow
Build and publish the image to the OCI container registry
docker tag mlflow-docker-example:<your_tag> <registry_path>/mlflow-docker-example:latest && \
docker push <registry_path>/mlflow-docker-example:latest
Prepare an oci-datascience-config.json file containing the authentication information and the path to the job configuration YAML file.
{
  "oci_auth": "api_key",
  "oci_job_template_path": "{work_dir}/oci-datascience-template.yaml"
}
Copy the oci-datascience-config.json file to the sklearn_elasticnet_wine folder.
Prepare an oci-datascience-template.yaml job configuration file.
kind: job
spec:
  name: "{Job name. For the MLflow will be replaced with the Project name}"
  infrastructure:
    kind: infrastructure
    spec:
      blockStorageSize: 50
      subnetId: ocid1.subnet.oc1.iad..<unique_ID>
      compartmentId: ocid1.compartment.oc1..<unique_ID>
      projectId: ocid1.datascienceproject.oc1.iad..<unique_ID>
      logGroupId: ocid1.loggroup.oc1.iad..<unique_ID>
      logId: ocid1.log.oc1.iad..<unique_ID>
      shapeName: VM.Standard.E3.Flex
      shapeConfigDetails:
        memoryInGBs: 20
        ocpus: 2
    type: dataScienceJob
  runtime:
    type: container
    kind: runtime
    spec:
      image: <iad.ocir.io/realm/container:tag>
      cmd: "{Container CMD. For the MLflow will be replaced with the Project CMD}"
      entrypoint:
      - bash
      - --login
      - -c
Copy the oci-datascience-template.yaml file to the sklearn_elasticnet_wine folder.
Run the project
Using the CLI:
cd ~/sklearn_elasticnet_wine
export MLFLOW_TRACKING_URI=<tracking_uri>
mlflow run . --experiment-name My_Experiment --backend oci-datascience --backend-config ./oci-datascience-config.json
Using the Python API:
import mlflow

mlflow.set_tracking_uri("<tracking_uri>")
mlflow.run(
    ".",
    experiment_name="My_Experiment",
    parameters={"alpha": 0.7},
    backend="oci-datascience",
    backend_config="oci-datascience-config.json",
)
Data Flow Applications#
The examples demonstrated in this section show how to run MLflow projects on a Data Flow remote Spark cluster. All examples were taken from the MLflow official repository.
Prerequisites
Based on the PySpark 3.2 and Data Flow (pyspark32_p38_cpu_v2) environment, create and publish a custom conda environment with additional libraries:
mlflow
oci-mlflow
Running MLflow project within DataflowRuntime#
This example demonstrates an MLflow project that trains a logistic regression model on the Iris dataset. To run this example on the Data Flow cluster, the custom conda environment needs to be prepared and published to the Object Storage bucket.
Download the pyspark_ml_autologging project from the GitHub repository.
Prepare an oci-datascience-config.json file containing the authentication information and the path to the job configuration YAML file.
{
  "oci_auth": "api_key",
  "oci_job_template_path": "{work_dir}/oci-datascience-template.yaml"
}
Copy the oci-datascience-config.json file to the pyspark_ml_autologging folder.
Prepare an oci-datascience-template.yaml job configuration file. The template can be generated using the ads opctl init command. More details can be found in the ADS documentation.
kind: job
name: "{DataFlow application name. For the MLflow will be replaced with the Project name}"
spec:
  infrastructure:
    kind: infrastructure
    spec:
      compartmentId: ocid1.compartment.oc1..<unique_ID>
      driverShape: VM.Standard.E4.Flex
      driverShapeConfig:
        memory_in_gbs: 32
        ocpus: 2
      executorShape: VM.Standard.E4.Flex
      executorShapeConfig:
        memory_in_gbs: 32
        ocpus: 2
      language: PYTHON
      logsBucketUri: <oci://bucket@namespace>
      numExecutors: 1
      sparkVersion: 3.2.1
      privateEndpointId: ocid1.dataflowprivateendpoint.oc1.iad..<unique_ID>
    type: dataFlow
  runtime:
    kind: runtime
    spec:
      configuration:
        spark.driverEnv.MLFLOW_TRACKING_URI: <http://FQDN-address-of-the-container-instance:5000>
      conda:
        type: published
        uri: <oci://bucket@namespace/prefix>
      condaAuthType: resource_principal
      scriptBucket: <oci://bucket@namespace/prefix>
      scriptPathURI: "{Path to the executable script. For the MLflow will be replaced with the CMD}"
      overwrite: True
    type: dataFlow
The template also specifies a Private Endpoint (privateEndpointId), which allows the Data Flow cluster to reach the tracking server URI when the tracking server is deployed in a private network. The private endpoint is not required when the tracking server has a public IP address. More details about the Private Endpoint can be found in the official documentation. The template also specifies a spark.driverEnv.MLFLOW_TRACKING_URI property, which is only required when using a private endpoint and should point to the FQDN of the container instance.
Copy the oci-datascience-template.yaml file to the pyspark_ml_autologging folder.
Create an MLproject file in the pyspark_ml_autologging folder.
name: mlflow-project-dataflow-application
entry_points:
  main:
    command: "logistic_regression.py"
Run the example project
Using the CLI:
cd ~/pyspark_ml_autologging
export MLFLOW_TRACKING_URI=<tracking_uri>
mlflow run . --experiment-name My_Experiment --backend oci-datascience --backend-config ./oci-datascience-config.json
Using the Python API:
import mlflow

mlflow.set_tracking_uri("<tracking_uri>")
mlflow.run(
    ".",
    experiment_name="My_Experiment",
    backend="oci-datascience",
    backend_config="oci-datascience-config.json",
)
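The private endpoint rule for the Data Flow template can be expressed as a small function that returns only the template fragments affected by it. This is a sketch under stated assumptions: dataflow_tracking_settings is a hypothetical helper, and the two keys it sets (privateEndpointId and spark.driverEnv.MLFLOW_TRACKING_URI) are the ones named in this section.

```python
from typing import Optional


def dataflow_tracking_settings(
    tracking_uri: str, private_endpoint_id: Optional[str] = None
) -> dict:
    """Return the template fragments that depend on the private endpoint."""
    settings = {"infrastructure": {}, "runtime": {}}
    if private_endpoint_id:
        # Tracking server in a private network: set both the endpoint and
        # the driver environment variable with the server's FQDN-based URI.
        settings["infrastructure"]["privateEndpointId"] = private_endpoint_id
        settings["runtime"]["configuration"] = {
            "spark.driverEnv.MLFLOW_TRACKING_URI": tracking_uri
        }
    return settings
```

With a publicly reachable tracking server, both fragments stay empty and the corresponding keys can be left out of the template.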