{ "cells": [ { "cell_type": "markdown", "source": [ "# Data Science in the Cloud: The \"Azure ML SDK\" way\n", "\n", "## Introduction\n", "\n", "In this notebook, we will learn how to use the Azure ML SDK to train, deploy, and consume a model through Azure ML.\n", "\n", "Prerequisites:\n", "1. You have created an Azure ML workspace.\n", "2. You have loaded the [Heart Failure dataset](https://www.kaggle.com/andrewmvd/heart-failure-clinical-data) into Azure ML.\n", "3. You have uploaded this notebook to Azure ML Studio.\n", "\n", "The next steps are:\n", "\n", "1. Create an experiment in an existing workspace.\n", "2. Create a compute cluster.\n", "3. Load the dataset.\n", "4. Configure AutoML using AutoMLConfig.\n", "5. Run the AutoML experiment.\n", "6. Explore the results and get the best model.\n", "7. Register the best model.\n", "8. Deploy the best model.\n", "9. Consume the endpoint.\n", "\n", "## Azure Machine Learning SDK-specific imports\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "import json  # needed later to serialize the test request\n", "\n", "from azureml.core import Workspace, Experiment\n", "from azureml.core.compute import AmlCompute\n", "from azureml.train.automl import AutoMLConfig\n", "from azureml.widgets import RunDetails\n", "from azureml.core.model import InferenceConfig, Model\n", "from azureml.core.webservice import AciWebservice" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## Initialize the workspace\n", "Initialize a workspace object from the saved configuration. Make sure the configuration file is present at .\\config.json\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "ws = Workspace.from_config()\n", "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## Create an Azure ML experiment\n", "\n", "Let's create an experiment named 'aml-experiment' in the workspace we just initialized.\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "experiment_name = 'aml-experiment'\n", "experiment = Experiment(ws, experiment_name)\n", "experiment" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## Create a compute cluster\n", "You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/concept-azure-machine-learning-architecture#compute-target) for your AutoML run.\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "aml_name = 
\"heart-f-cluster\"\n", "try:\n", " aml_compute = AmlCompute(ws, aml_name)\n", " print('Found existing AML compute context.')\n", "except:\n", " print('Creating new AML compute context.')\n", " aml_config = AmlCompute.provisioning_configuration(vm_size = \"Standard_D2_v2\", min_nodes=1, max_nodes=3)\n", " aml_compute = AmlCompute.create(ws, name = aml_name, provisioning_configuration = aml_config)\n", " aml_compute.wait_for_completion(show_output = True)\n", "\n", "cts = ws.compute_targets\n", "compute_target = cts[aml_name]" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## 數據\n", "請確保你已將數據集上載到 Azure ML，並且鍵的名稱與數據集的名稱相同。\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "key = 'heart-failure-records'\n", "dataset = ws.datasets[key]\n", "df = dataset.to_pandas_dataframe()\n", "df.describe()" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## 自動機器學習配置\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "automl_settings = {\n", " \"experiment_timeout_minutes\": 20,\n", " \"max_concurrent_iterations\": 3,\n", " \"primary_metric\" : 'AUC_weighted'\n", "}\n", "\n", "automl_config = AutoMLConfig(compute_target=compute_target,\n", " task = \"classification\",\n", " training_data=dataset,\n", " label_column_name=\"DEATH_EVENT\",\n", " enable_early_stopping= True,\n", " featurization= 'auto',\n", " debug_log = \"automl_errors.log\",\n", " **automl_settings\n", " )" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## 自動機器學習運行\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "remote_run = experiment.submit(automl_config)" ], "outputs": [], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "RunDetails(remote_run).show()" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [], "metadata": {} }, { "cell_type": "code", "execution_count": null, 
"source": [ "best_run, fitted_model = remote_run.get_output()" ], "outputs": [], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "best_run.get_properties()" ], "outputs": [], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "model_name = best_run.properties['model_name']\n", "script_file_name = 'inference/score.py'\n", "best_run.download_file('outputs/scoring_file_v_1_0_0.py', 'inference/score.py')\n", "description = \"aml heart failure project sdk\"\n", "model = best_run.register_model(model_name = model_name,\n", " description = description,\n", " tags = None)" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## 部署最佳模型\n", "\n", "執行以下程式碼以部署最佳模型。你可以在 Azure ML 入口網站中查看部署的狀態。此步驟可能需要幾分鐘時間。\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "inference_config = InferenceConfig(entry_script=script_file_name, environment=best_run.get_environment())\n", "\n", "aciconfig = AciWebservice.deploy_configuration(cpu_cores = 1,\n", " memory_gb = 1,\n", " tags = {'type': \"automl-heart-failure-prediction\"},\n", " description = 'Sample service for AutoML Heart Failure Prediction')\n", "\n", "aci_service_name = 'automl-hf-sdk'\n", "aci_service = Model.deploy(ws, aci_service_name, [model], inference_config, aciconfig)\n", "aci_service.wait_for_deployment(True)\n", "print(aci_service.state)" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## 使用端點\n", "你可以為以下的輸入範例添加輸入內容。\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "data = {\n", " \"data\":\n", " [\n", " {\n", " 'age': \"60\",\n", " 'anaemia': \"false\",\n", " 'creatinine_phosphokinase': \"500\",\n", " 'diabetes': \"false\",\n", " 'ejection_fraction': \"38\",\n", " 'high_blood_pressure': \"false\",\n", " 'platelets': \"260000\",\n", " 'serum_creatinine': \"1.40\",\n", " 'serum_sodium': \"137\",\n", " 'sex': \"false\",\n", " 'smoking': \"false\",\n", " 
'time': \"130\",\n", " },\n", " ],\n", "}\n", "\n", "test_sample = str.encode(json.dumps(data))" ], "outputs": [], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "response = aci_service.run(input_data=test_sample)\n", "response" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n---\n\n**Disclaimer**: \nThis document has been translated using the AI translation service [Co-op Translator](https://github.com/Azure/co-op-translator). While we strive for accuracy, please be aware that automated translations may contain errors or inaccuracies. The document in its original language should be considered the authoritative source. For critical information, professional human translation is recommended. We are not liable for any misunderstandings or misinterpretations arising from the use of this translation.\n" ] } ], "metadata": { "orig_nbformat": 4, "language_info": { "name": "python" }, "coopTranslator": { "original_hash": "af42669556d5dc19fc4cc3866f7d2597", "translation_date": "2025-09-02T05:44:56+00:00", "source_file": "5-Data-Science-In-Cloud/19-Azure/notebook.ipynb", "language_code": "hk" } }, "nbformat": 4, "nbformat_minor": 2 }