42.41. Data Science in the cloud: The “Azure ML SDK” way#

42.41.1. Introduction#

In this notebook, we will learn how to use the Azure ML SDK to train, deploy, and consume a model through Azure ML.

Prerequisites:

  1. You created an Azure ML workspace.

  2. You loaded the Heart Failure dataset into Azure ML.

  3. You uploaded this notebook into Azure ML Studio.

The next steps are:

  1. Create an Experiment in an existing Workspace.

  2. Create a Compute cluster.

  3. Load the dataset.

  4. Configure AutoML using AutoMLConfig.

  5. Run the AutoML experiment.

  6. Explore the results and get the best model.

  7. Register the best model.

  8. Deploy the best model.

  9. Consume the endpoint.

42.41.2. Azure Machine Learning SDK-specific imports#

from azureml.core import Workspace, Experiment
from azureml.core.compute import AmlCompute
from azureml.train.automl import AutoMLConfig
from azureml.widgets import RunDetails
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

42.41.3. Initialize Workspace#

Initialize a workspace object from persisted configuration. Make sure the config file is present at .\config.json

ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\n')

42.41.4. Create an Azure ML experiment#

Let’s create an experiment named ‘aml-experiment’ in the workspace we just initialized.

experiment_name = 'aml-experiment'
experiment = Experiment(ws, experiment_name)
experiment

42.41.5. Create a compute cluster#

You will need to create a compute target for your AutoML run.

from azureml.exceptions import ComputeTargetException

aml_name = "heart-f-cluster"
try:
    # Reuse the cluster if it already exists in the workspace
    aml_compute = AmlCompute(ws, aml_name)
    print('Found existing AML compute context.')
except ComputeTargetException:
    print('Creating new AML compute context.')
    aml_config = AmlCompute.provisioning_configuration(vm_size="Standard_D2_v2", min_nodes=1, max_nodes=3)
    aml_compute = AmlCompute.create(ws, name=aml_name, provisioning_configuration=aml_config)
    aml_compute.wait_for_completion(show_output=True)

cts = ws.compute_targets
compute_target = cts[aml_name]

42.41.6. Data#

Make sure you have uploaded the dataset to Azure ML and that the key below matches the name you gave the dataset.

key = 'heart-failure-records'
dataset = ws.datasets[key]
df = dataset.to_pandas_dataframe()
df.describe()
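Before launching AutoML it is worth sanity-checking the dataframe: the label column must be present, binary, and free of missing values. The sketch below illustrates this on a few fabricated rows standing in for the Heart Failure data (in the notebook, `df` comes from `dataset.to_pandas_dataframe()` instead):

```python
import pandas as pd

# Illustrative stand-in for the Heart Failure dataframe (fabricated rows)
df = pd.DataFrame({
    'age': [60, 75, 50],
    'ejection_fraction': [38, 20, 45],
    'DEATH_EVENT': [0, 1, 0],
})

# Basic sanity checks before launching AutoML
assert 'DEATH_EVENT' in df.columns            # label column must be present
assert df['DEATH_EVENT'].isin([0, 1]).all()   # binary classification target
assert not df.isnull().any().any()            # no missing values

print(df.shape)  # → (3, 3)
```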

42.41.7. AutoML configuration#

automl_settings = {
    "experiment_timeout_minutes": 20,
    "max_concurrent_iterations": 3,
    "primary_metric": 'AUC_weighted'
}

automl_config = AutoMLConfig(compute_target=compute_target,
                             task="classification",
                             training_data=dataset,
                             label_column_name="DEATH_EVENT",
                             enable_early_stopping=True,
                             featurization='auto',
                             debug_log="automl_errors.log",
                             **automl_settings)
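The `AUC_weighted` primary metric ranks candidate models by the area under the ROC curve, averaged across classes weighted by their support; for a binary label such as `DEATH_EVENT` it reduces to the ordinary AUC. To build intuition for what AutoML is optimizing, here is a minimal pure-Python sketch of binary AUC via the pairwise (Mann-Whitney) formulation; this is an illustration only, not part of the Azure ML SDK:

```python
def auc_binary(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Three of four (pos, neg) pairs are ranked correctly
print(auc_binary([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # → 0.75
```

A perfect model ranks every positive above every negative and scores 1.0; random scoring hovers around 0.5.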

42.41.8. AutoML run#

remote_run = experiment.submit(automl_config)
RunDetails(remote_run).show()

42.41.9. Save the best model#

import os

# Block until the remote run finishes, then retrieve the best child run and model
remote_run.wait_for_completion(show_output=False)
best_run, fitted_model = remote_run.get_output()
best_run.get_properties()

model_name = best_run.properties['model_name']

# Download the autogenerated scoring script for deployment
os.makedirs('inference', exist_ok=True)
script_file_name = 'inference/score.py'
best_run.download_file('outputs/scoring_file_v_1_0_0.py', script_file_name)

description = "aml heart failure project sdk"
model = best_run.register_model(model_name=model_name,
                                description=description,
                                tags=None)

42.41.10. Deploy the best model#

Run the following code to deploy the best model. You can see the state of the deployment in the Azure ML portal. This step can take a few minutes.

inference_config = InferenceConfig(entry_script=script_file_name, environment=best_run.get_environment())

aciconfig = AciWebservice.deploy_configuration(cpu_cores = 1,
                                               memory_gb = 1,
                                               tags = {'type': "automl-heart-failure-prediction"},
                                               description = 'Sample service for AutoML Heart Failure Prediction')

aci_service_name = 'automl-hf-sdk'
aci_service = Model.deploy(ws, aci_service_name, [model], inference_config, aciconfig)
aci_service.wait_for_deployment(True)
print(aci_service.state)

42.41.11. Consume the endpoint#

You can adjust the values in the following input sample.

data = {
    "data":
    [
        {
            'age': "60",
            'anaemia': "false",
            'creatinine_phosphokinase': "500",
            'diabetes': "false",
            'ejection_fraction': "38",
            'high_blood_pressure': "false",
            'platelets': "260000",
            'serum_creatinine': "1.40",
            'serum_sodium': "137",
            'sex': "false",
            'smoking': "false",
            'time': "130",
        },
    ],
}

import json

test_sample = str.encode(json.dumps(data))
response = aci_service.run(input_data=test_sample)
response
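The service expects a UTF-8 encoded JSON body, and it replies with a JSON string that you parse back into Python. The round trip can be checked locally without a live endpoint; the abbreviated payload and the `{"result": ...}` response shape below are illustrative assumptions, not captured output:

```python
import json

# Abbreviated sample payload (illustrative; the real one carries all 12 features)
data = {"data": [{'age': "60", 'ejection_fraction': "38"}]}

# Same bytes encoding as sent to the service
test_sample = str.encode(json.dumps(data))
assert isinstance(test_sample, bytes)

# Hypothetical response shape: parsing it yields a list of predictions
reply = '{"result": [false]}'
print(json.loads(reply)['result'])  # → [False]
```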

42.41.12. Acknowledgments#

Thanks to Microsoft for creating the open-source course Data Science for Beginners, which inspires most of the content in this chapter.