{
 "cells": [
  {
   "cell_type": "markdown",
   "source": [
    "# Data Science in the Cloud: The \"Azure ML SDK\" way\n",
    "\n",
    "## Introduction\n",
    "\n",
    "In this notebook, we will learn how to use the Azure ML SDK to train, deploy and consume a model through Azure ML.\n",
    "\n",
    "Prerequisites:\n",
    "1. You have created an Azure ML workspace.\n",
    "2. You have loaded the [Heart Failure dataset](https://www.kaggle.com/andrewmvd/heart-failure-clinical-data) into Azure ML.\n",
    "3. You have uploaded this notebook to Azure ML Studio.\n",
    "\n",
    "The next steps are:\n",
    "\n",
    "1. Create an Experiment in an existing Workspace.\n",
    "2. Create a Compute cluster.\n",
    "3. Load the dataset.\n",
    "4. Configure AutoML using AutoMLConfig.\n",
    "5. Run the AutoML experiment.\n",
    "6. Explore the results and select the best model.\n",
    "7. Register the best model.\n",
    "8. Deploy the best model.\n",
    "9. Consume the endpoint.\n",
    "\n",
    "## Azure Machine Learning SDK-specific imports\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "from azureml.core import Workspace, Experiment\n",
    "from azureml.core.compute import AmlCompute\n",
    "from azureml.train.automl import AutoMLConfig\n",
    "from azureml.widgets import RunDetails\n",
    "from azureml.core.model import InferenceConfig, Model\n",
    "from azureml.core.webservice import AciWebservice"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Initialize the Workspace\n",
    "Initialize a workspace object from the saved configuration. Make sure the configuration file is present at `.\\config.json`.\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "ws = Workspace.from_config()\n",
    "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\\n')"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Create an Azure ML Experiment\n",
    "\n",
    "Let's create an experiment named 'aml-experiment' in the workspace we just initialized.\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "experiment_name = 'aml-experiment'\n",
    "experiment = Experiment(ws, experiment_name)\n",
    "experiment"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Create a Compute Cluster\n",
    "You need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/concept-azure-machine-learning-architecture#compute-target) for your AutoML run.\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "from azureml.core.compute_target import ComputeTargetException\n",
    "\n",
    "aml_name = \"heart-f-cluster\"\n",
    "try:\n",
    "    # Reuse the cluster if it already exists in the workspace.\n",
    "    aml_compute = AmlCompute(ws, aml_name)\n",
    "    print('Found existing AML compute context.')\n",
    "except ComputeTargetException:\n",
    "    # Otherwise provision a new cluster that scales between 1 and 3 nodes.\n",
    "    print('Creating new AML compute context.')\n",
    "    aml_config = AmlCompute.provisioning_configuration(vm_size=\"Standard_D2_v2\", min_nodes=1, max_nodes=3)\n",
    "    aml_compute = AmlCompute.create(ws, name=aml_name, provisioning_configuration=aml_config)\n",
    "    aml_compute.wait_for_completion(show_output=True)\n",
    "\n",
    "cts = ws.compute_targets\n",
    "compute_target = cts[aml_name]"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Data\n",
    "Make sure you have uploaded the dataset to Azure ML and that the value of `key` matches the name of the dataset.\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "key = 'heart-failure-records'\n",
    "dataset = ws.datasets[key]\n",
    "df = dataset.to_pandas_dataframe()\n",
    "df.describe()"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "## AutoML Configuration\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "automl_settings = {\n",
    "    \"experiment_timeout_minutes\": 20,\n",
    "    \"max_concurrent_iterations\": 3,\n",
    "    \"primary_metric\": 'AUC_weighted'\n",
    "}\n",
    "\n",
    "automl_config = AutoMLConfig(compute_target=compute_target,\n",
    "                             task=\"classification\",\n",
    "                             training_data=dataset,\n",
    "                             label_column_name=\"DEATH_EVENT\",\n",
    "                             enable_early_stopping=True,\n",
    "                             featurization='auto',\n",
    "                             debug_log=\"automl_errors.log\",\n",
    "                             **automl_settings\n",
    "                             )"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "## AutoML Run\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "remote_run = experiment.submit(automl_config)"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "RunDetails(remote_run).show()"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "best_run, fitted_model = remote_run.get_output()"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "best_run.get_properties()"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "model_name = best_run.properties['model_name']\n",
    "script_file_name = 'inference/score.py'\n",
    "# Download the scoring script produced by the best run; it is used as the entry script at deployment.\n",
    "best_run.download_file('outputs/scoring_file_v_1_0_0.py', script_file_name)\n",
    "description = \"aml heart failure project sdk\"\n",
    "model = best_run.register_model(model_name=model_name,\n",
    "                                description=description,\n",
    "                                tags=None)"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Deploy the Best Model\n",
    "\n",
    "Run the following code to deploy the best model. You can follow the state of the deployment in the Azure ML portal. This step can take a few minutes.\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "inference_config = InferenceConfig(entry_script=script_file_name, environment=best_run.get_environment())\n",
    "\n",
    "aciconfig = AciWebservice.deploy_configuration(cpu_cores=1,\n",
    "                                               memory_gb=1,\n",
    "                                               tags={'type': \"automl-heart-failure-prediction\"},\n",
    "                                               description='Sample service for AutoML Heart Failure Prediction')\n",
    "\n",
    "aci_service_name = 'automl-hf-sdk'\n",
    "aci_service = Model.deploy(ws, aci_service_name, [model], inference_config, aciconfig)\n",
    "aci_service.wait_for_deployment(True)\n",
    "print(aci_service.state)"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Consume the Endpoint\n",
    "You can add more inputs to the following input sample.\n"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "import json\n",
    "\n",
    "data = {\n",
    "    \"data\":\n",
    "    [\n",
    "        {\n",
    "            'age': \"60\",\n",
    "            'anaemia': \"false\",\n",
    "            'creatinine_phosphokinase': \"500\",\n",
    "            'diabetes': \"false\",\n",
    "            'ejection_fraction': \"38\",\n",
    "            'high_blood_pressure': \"false\",\n",
    "            'platelets': \"260000\",\n",
    "            'serum_creatinine': \"1.40\",\n",
    "            'serum_sodium': \"137\",\n",
    "            'sex': \"false\",\n",
    "            'smoking': \"false\",\n",
    "            'time': \"130\",\n",
    "        },\n",
    "    ],\n",
    "}\n",
    "\n",
    "# Serialize the sample to JSON and encode it as bytes for the web service.\n",
    "test_sample = str.encode(json.dumps(data))"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "response = aci_service.run(input_data=test_sample)\n",
    "response"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n---\n\n**Disclaimer**: \nThis document has been translated using the AI translation service [Co-op Translator](https://github.com/Azure/co-op-translator). While we strive for accuracy, please be aware that automated translations may contain errors or inaccuracies. The original document in its native language should be considered the authoritative source. For critical information, professional human translation is recommended. We are not liable for any misunderstandings or misinterpretations arising from the use of this translation.\n"
   ]
  }
],
 "metadata": {
  "orig_nbformat": 4,
  "language_info": {
   "name": "python"
  },
  "coopTranslator": {
   "original_hash": "af42669556d5dc19fc4cc3866f7d2597",
   "translation_date": "2025-09-02T05:37:58+00:00",
   "source_file": "5-Data-Science-In-Cloud/19-Azure/notebook.ipynb",
   "language_code": "nl"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}