Last reviewed: May 2026
Build the Azure services on the DP-100 exam with plain Terraform, one block at a time, each tied back to an exam domain. The same code works on OpenTofu.
By the end of this lab you'll have provisioned, with plain Terraform, the Azure Machine Learning workspace control plane — the workspace itself, the three required dependencies (Storage Account, Key Vault, Application Insights), and an Azure ML compute cluster scaled to zero-when-idle so the lab doesn't bleed money. This is the DP-100 reference workspace setup; training jobs and model deployments plug into it.
Drop the snippets into a single main.tf, run terraform init, then terraform apply step-by-step.
Prerequisites: Terraform >= 1.5 or OpenTofu >= 1.6, and an authenticated Azure CLI session (az login).
Control plane idles at near-$0:
The DP-100 cost trap is leaving the compute cluster's min_node_count above 0: even one idle node runs $200+/month. We set min_node_count = 0 and scale_down_nodes_after_idle_duration = "PT15M" (scale down after 15 idle minutes). Verify before running. Destroy when done.
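The $200+/month figure can be reproduced with a little arithmetic. The sketch below is illustrative only: the local and output names are hypothetical, the hourly rate is the $0.30 this lab quotes for Standard_DS3_v2, and 730 hours per month is an assumed average.

```hcl
# Illustrative cost arithmetic; names are hypothetical and not used elsewhere.
locals {
  ds3_v2_hourly_usd = 0.30 # rate quoted in this lab; check current pricing
  hours_per_month   = 730  # assumed average hours in a month
  idle_nodes        = 1    # what min_node_count = 1 would keep running
}

output "idle_cost_per_month_usd" {
  # Roughly 219 USD per month for one always-on node at the assumed rate.
  value = local.ds3_v2_hourly_usd * local.hours_per_month * local.idle_nodes
}
```

Setting idle_nodes to 0 in this sketch mirrors what min_node_count = 0 does for the real cluster: the product drops to zero.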
Standard Azure opener. Azure ML workspaces are regional — pick a region with broad GPU SKU availability if you're going past the lab (eastus, westus, westeurope are the safe choices).
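If you would rather not hard-code the region, one option is an input variable with a validation rule. This is a sketch under assumptions: the variable name location is hypothetical, and the allowed list simply repeats the three regions named above rather than any official availability list.

```hcl
# Hypothetical variable; the allowed list mirrors the regions suggested above.
variable "location" {
  type        = string
  default     = "eastus"
  description = "Azure region for the resource group and everything in it."

  validation {
    condition     = contains(["eastus", "westus", "westeurope"], var.location)
    error_message = "Pick one of eastus, westus, or westeurope."
  }
}
```

With this in place, the resource group's location argument would read var.location instead of the literal "eastus".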
terraform {
  required_version = ">= 1.5"

  required_providers {
    azurerm = { source = "hashicorp/azurerm", version = "~> 4.0" }
    random  = { source = "hashicorp/random", version = "~> 3.6" }
  }
}

provider "azurerm" {
  # azurerm 4.x needs a subscription ID: set subscription_id here
  # or export ARM_SUBSCRIPTION_ID before running terraform.
  features {
    key_vault {
      # Purge the vault on destroy instead of leaving it soft-deleted.
      purge_soft_delete_on_destroy = true
    }
  }
}

# Short random suffix to keep globally unique names from colliding.
resource "random_id" "suffix" {
  byte_length = 3
}

data "azurerm_client_config" "current" {}

locals {
  tags = {
    Project   = "certlabpro-dp-100"
    ManagedBy = "terraform"
  }
}

resource "azurerm_resource_group" "main" {
  name     = "certlabpro-dp-100-rg"
  location = "eastus"
  tags     = local.tags
}

Azure ML workspaces require three pre-existing resources to attach to: a Storage Account (for datasets, models, logs), a Key Vault (for credentials), and an Application Insights instance (for run telemetry). DP-100 tests this triplet repeatedly: the answer to "why can't I create the workspace?" is almost always that one of these is missing.
The storage account here enforces HTTPS-only traffic, a TLS 1.2 minimum, and no public blob access; the Key Vault uses RBAC authorization (the modern default).
resource "azurerm_storage_account" "ml" {
  name                            = "dp100ml${random_id.suffix.hex}"
  resource_group_name             = azurerm_resource_group.main.name
  location                        = azurerm_resource_group.main.location
  account_tier                    = "Standard"
  account_replication_type        = "LRS"
  account_kind                    = "StorageV2"
  https_traffic_only_enabled      = true
  min_tls_version                 = "TLS1_2"
  allow_nested_items_to_be_public = false
  tags                            = local.tags
}

resource "azurerm_key_vault" "ml" {
  name                       = "kv-dp100-${random_id.suffix.hex}"
  resource_group_name        = azurerm_resource_group.main.name
  location                   = azurerm_resource_group.main.location
  tenant_id                  = data.azurerm_client_config.current.tenant_id
  sku_name                   = "standard"
  enable_rbac_authorization  = true
  soft_delete_retention_days = 7
  tags                       = local.tags
}

resource "azurerm_application_insights" "ml" {
  name                = "appi-dp100-${random_id.suffix.hex}"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  application_type    = "web"
  tags                = local.tags
}

The workspace ties the three dependencies together and gets a system-assigned managed identity that downstream compute targets, datasets, and endpoints will use to read from the dependencies. DP-100's Manage Azure resources for ML domain tests this exact shape: workspace + identity + role assignments.
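The role-assignment part of that shape can be sketched as follows. This is an illustration, not something the lab requires: Azure ML normally sets up the access its own identity needs when the workspace is created, the resource label workspace_blob is hypothetical, and the role shown is just one common choice. It refers to the workspace resource defined in this section; Terraform resolves references regardless of where blocks sit in the file.

```hcl
# Illustrative only: grants the workspace's system-assigned identity
# blob data access on the lab's storage account.
resource "azurerm_role_assignment" "workspace_blob" {
  scope                = azurerm_storage_account.ml.id
  role_definition_name = "Storage Blob Data Contributor"
  principal_id         = azurerm_machine_learning_workspace.main.identity[0].principal_id
}
```

The same pattern (scope, role, principal) applies when granting a user or a compute identity access to the Key Vault, which matters here because the vault uses RBAC authorization rather than access policies.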
We set public_network_access_enabled = true to keep the lab simple; production workspaces typically use private endpoints (DP-100 Design and prepare a machine-learning solution domain tests private-link variants).
resource "azurerm_machine_learning_workspace" "main" {
  name                          = "mlw-dp100-${random_id.suffix.hex}"
  resource_group_name           = azurerm_resource_group.main.name
  location                      = azurerm_resource_group.main.location
  application_insights_id       = azurerm_application_insights.ml.id
  key_vault_id                  = azurerm_key_vault.ml.id
  storage_account_id            = azurerm_storage_account.ml.id
  public_network_access_enabled = true

  identity {
    type = "SystemAssigned"
  }

  tags = local.tags
}

Training jobs need compute. Azure ML compute clusters are managed VM pools that scale based on job queue depth. min_node_count = 0 is the essential DP-100 cost-optimization setting for lab/dev workspaces: when no jobs are queued, the cluster scales to zero nodes and bills $0 (only the cluster's metadata remains).
Standard_DS3_v2 (4 vCPU, 14 GB RAM, $0.30/hour) is the typical lab default — large enough to run sklearn or small PyTorch training jobs, small enough to be cheap. Production training clusters use GPU SKUs (Standard_NC6s_v3 family).
The scale_down_nodes_after_idle_duration = "PT15M" (ISO 8601 duration for 15 minutes) is the recurring DP-100 cost question: setting this too long leaves expensive nodes running; too short causes thrash. 15 minutes is a middle-ground lab value rather than a service default; the Azure ML CLI v2 compute schema, for example, defaults to 120 seconds.
resource "azurerm_machine_learning_compute_cluster" "main" {
  name                          = "cpu-cluster"
  location                      = azurerm_resource_group.main.location
  vm_priority                   = "Dedicated"
  vm_size                       = "Standard_DS3_v2"
  machine_learning_workspace_id = azurerm_machine_learning_workspace.main.id

  scale_settings {
    min_node_count                       = 0
    max_node_count                       = 2
    scale_down_nodes_after_idle_duration = "PT15M"
  }

  identity {
    type = "SystemAssigned"
  }

  tags = local.tags
}

terraform destroy tears down everything. Key notes:
purge_soft_delete_on_destroy = true in the provider features makes Key Vault destroy actually purge. Workspace soft-delete is configurable in the portal but terraform destroy works regardless.

DP-100 covers more ML-on-Azure surfaces than this lab can fit: Compute Instances (single-user IDE VMs, very expensive idle), Online Endpoints (managed real-time inference), Batch Endpoints (managed batch inference), AutoML jobs, Designer (visual pipeline editor), MLflow tracking integration, ParallelRunStep, model registry promotion workflows, and the entire data-asset / data-store catalog.
We stick to the workspace control plane because it's the substrate every DP-100 pattern attaches to. Endpoints attach to the workspace. Jobs run on the compute cluster. Models register against the workspace's MLflow tracking. Datasets land in the storage account.
For the surfaces above, see the Browse and Editorial sections of this cert page. The DP-100 Train ML models and Deploy and operationalize ML solutions domains are best learned by running jobs against this workspace — the lab gives you the substrate; the Python SDK does the actual training and deployment.
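To hand the workspace over to the Python SDK, it can help to expose the identifiers a client usually asks for. The outputs below are a minimal sketch; the output names are hypothetical, and the assumption is that your SDK client is constructed from a subscription ID, a resource group name, and a workspace name.

```hcl
# Hypothetical output names; values come from resources defined in this lab.
output "subscription_id" {
  value = data.azurerm_client_config.current.subscription_id
}

output "resource_group_name" {
  value = azurerm_resource_group.main.name
}

output "workspace_name" {
  value = azurerm_machine_learning_workspace.main.name
}
```

After terraform apply, terraform output prints these values so they can be copied into a training script or notebook.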