{ "cells": [ { "cell_type": "markdown", "source": [ "# 클라우드에서 데이터 과학: \"Azure ML SDK\" 방식\n", "\n", "## 소개\n", "\n", "이 노트북에서는 Azure ML SDK를 사용하여 모델을 학습, 배포 및 소비하는 방법을 배워보겠습니다.\n", "\n", "사전 준비 사항:\n", "1. Azure ML 워크스페이스를 생성했습니다.\n", "2. [Heart Failure 데이터셋](https://www.kaggle.com/andrewmvd/heart-failure-clinical-data)을 Azure ML에 로드했습니다.\n", "3. 이 노트북을 Azure ML Studio에 업로드했습니다.\n", "\n", "다음 단계는 다음과 같습니다:\n", "\n", "1. 기존 워크스페이스에서 실험을 생성합니다.\n", "2. 컴퓨팅 클러스터를 생성합니다.\n", "3. 데이터셋을 로드합니다.\n", "4. AutoMLConfig를 사용하여 AutoML을 구성합니다.\n", "5. AutoML 실험을 실행합니다.\n", "6. 결과를 탐색하고 최적의 모델을 확인합니다.\n", "7. 최적의 모델을 등록합니다.\n", "8. 최적의 모델을 배포합니다.\n", "9. 엔드포인트를 소비합니다.\n", "\n", "## Azure Machine Learning SDK 관련 임포트\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "from azureml.core import Workspace, Experiment\n", "from azureml.core.compute import AmlCompute\n", "from azureml.train.automl import AutoMLConfig\n", "from azureml.widgets import RunDetails\n", "from azureml.core.model import InferenceConfig, Model\n", "from azureml.core.webservice import AciWebservice" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## 작업 공간 초기화 \n", "저장된 구성에서 작업 공간 객체를 초기화합니다. 구성 파일이 .\\config.json에 있는지 확인하세요. \n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "ws = Workspace.from_config()\n", "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## Azure ML 실험 생성\n", "\n", "방금 초기화한 작업 영역에서 'aml-experiment'라는 이름의 실험을 만들어봅시다.\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "experiment_name = 'aml-experiment'\n", "experiment = Experiment(ws, experiment_name)\n", "experiment" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## 컴퓨트 클러스터 생성\n", "AutoML 실행을 위해 [컴퓨트 대상](https://docs.microsoft.com/azure/machine-learning/concept-azure-machine-learning-architecture#compute-target)을 생성해야 합니다.\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "aml_name = \"heart-f-cluster\"\n", "try:\n", " aml_compute = AmlCompute(ws, aml_name)\n", " print('Found existing AML compute context.')\n", "except:\n", " print('Creating new AML compute context.')\n", " aml_config = AmlCompute.provisioning_configuration(vm_size = \"Standard_D2_v2\", min_nodes=1, max_nodes=3)\n", " aml_compute = AmlCompute.create(ws, name = aml_name, provisioning_configuration = aml_config)\n", " aml_compute.wait_for_completion(show_output = True)\n", "\n", "cts = ws.compute_targets\n", "compute_target = cts[aml_name]" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## 데이터\n", "데이터셋을 Azure ML에 업로드했으며, 키가 데이터셋과 동일한 이름인지 확인하세요.\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "key = 'heart-failure-records'\n", "dataset = ws.datasets[key]\n", "df = dataset.to_pandas_dataframe()\n", "df.describe()" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## AutoML 구성\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "automl_settings = {\n", " \"experiment_timeout_minutes\": 20,\n", " \"max_concurrent_iterations\": 3,\n", " \"primary_metric\" : 'AUC_weighted'\n", "}\n", "\n", "automl_config = AutoMLConfig(compute_target=compute_target,\n", " task = \"classification\",\n", " training_data=dataset,\n", " label_column_name=\"DEATH_EVENT\",\n", " enable_early_stopping= True,\n", " featurization= 'auto',\n", " debug_log = \"automl_errors.log\",\n", " **automl_settings\n", " )" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## AutoML 실행\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "remote_run = experiment.submit(automl_config)" ], "outputs": [], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "RunDetails(remote_run).show()" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "best_run, fitted_model = remote_run.get_output()" ], "outputs": [], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "best_run.get_properties()" ], "outputs": [], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "model_name = best_run.properties['model_name']\n", "script_file_name = 'inference/score.py'\n", "best_run.download_file('outputs/scoring_file_v_1_0_0.py', 'inference/score.py')\n", "description = \"aml heart failure project sdk\"\n", "model = best_run.register_model(model_name = model_name,\n", " description = description,\n", " tags = None)" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## 최적의 모델 배포하기\n", "\n", "다음 코드를 실행하여 최적의 모델을 배포하세요. 배포 상태는 Azure ML 포털에서 확인할 수 있습니다. 이 단계는 몇 분 정도 소요될 수 있습니다.\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "inference_config = InferenceConfig(entry_script=script_file_name, environment=best_run.get_environment())\n", "\n", "aciconfig = AciWebservice.deploy_configuration(cpu_cores = 1,\n", " memory_gb = 1,\n", " tags = {'type': \"automl-heart-failure-prediction\"},\n", " description = 'Sample service for AutoML Heart Failure Prediction')\n", "\n", "aci_service_name = 'automl-hf-sdk'\n", "aci_service = Model.deploy(ws, aci_service_name, [model], inference_config, aciconfig)\n", "aci_service.wait_for_deployment(True)\n", "print(aci_service.state)" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## 엔드포인트 사용하기\n", "다음 입력 샘플에 입력값을 추가할 수 있습니다.\n" ], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "data = {\n", " \"data\":\n", " [\n", " {\n", " 'age': \"60\",\n", " 'anaemia': \"false\",\n", " 'creatinine_phosphokinase': \"500\",\n", " 'diabetes': \"false\",\n", " 'ejection_fraction': \"38\",\n", " 'high_blood_pressure': \"false\",\n", " 'platelets': \"260000\",\n", " 'serum_creatinine': \"1.40\",\n", " 'serum_sodium': \"137\",\n", " 'sex': \"false\",\n", " 'smoking': \"false\",\n", " 'time': \"130\",\n", " },\n", " ],\n", "}\n", "\n", "test_sample = str.encode(json.dumps(data))" ], "outputs": [], "metadata": {} }, { "cell_type": "code", "execution_count": null, "source": [ "response = aci_service.run(input_data=test_sample)\n", "response" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n---\n\n**면책 조항**: \n이 문서는 AI 번역 서비스 [Co-op Translator](https://github.com/Azure/co-op-translator)를 사용하여 번역되었습니다. 정확성을 위해 최선을 다하고 있으나, 자동 번역에는 오류나 부정확성이 포함될 수 있습니다. 원본 문서의 원어 버전을 신뢰할 수 있는 권위 있는 자료로 간주해야 합니다. 중요한 정보의 경우, 전문적인 인간 번역을 권장합니다. 이 번역 사용으로 인해 발생하는 오해나 잘못된 해석에 대해 책임을 지지 않습니다.\n" ] } ], "metadata": { "orig_nbformat": 4, "language_info": { "name": "python" }, "coopTranslator": { "original_hash": "af42669556d5dc19fc4cc3866f7d2597", "translation_date": "2025-09-01T20:09:08+00:00", "source_file": "5-Data-Science-In-Cloud/19-Azure/notebook.ipynb", "language_code": "ko" } }, "nbformat": 4, "nbformat_minor": 2 }