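Last reviewed: May 2026
Build the Google Cloud services on the PCDE exam with native Terraform, one code block at a time, each tied to an exam domain. The same code runs on OpenTofu.
By the end of this lab, you will have provisioned a PCDE-shaped CI/CD + observability stack using plain Terraform: an Artifact Registry repository for built images, a Cloud Build trigger watching a stub GitHub source, a Cloud Deploy delivery pipeline with two Cloud Run targets (staging + prod), and a Cloud Monitoring SLO + alert policy against the prod target. Five modules in total, covering the commit → build → deploy → observe loop that PCDE tests.
Put the snippets into a single main.tf file, then run terraform init and terraform apply step by step.
Terraform >= 1.5 or OpenTofu >= 1.6. Replace your-project-id (and, optionally, github-owner / github-repo in step 3). Free or near-free at lab scale:
Cloud Run with minimum instances at 0 (the default; the field is min_instance_count in the v2 resource): $0 when idle. Roughly $0 per month at lab workloads.
Enable the Cloud Build, Cloud Deploy, Cloud Run, Artifact Registry, and Cloud Monitoring APIs.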
terraform {
  required_version = ">= 1.5"
  required_providers {
    google = { source = "hashicorp/google", version = "~> 6.0" }
  }
}

provider "google" {
  project = "your-project-id" # REPLACE
  region  = "us-central1"
}

locals {
  labels = {
    project    = "certlabpro-pcde"
    managed_by = "terraform"
  }
}

resource "google_project_service" "cloudbuild" {
  service            = "cloudbuild.googleapis.com"
  disable_on_destroy = false
}

resource "google_project_service" "clouddeploy" {
  service            = "clouddeploy.googleapis.com"
  disable_on_destroy = false
}

resource "google_project_service" "run" {
  service            = "run.googleapis.com"
  disable_on_destroy = false
}

resource "google_project_service" "artifactregistry" {
  service            = "artifactregistry.googleapis.com"
  disable_on_destroy = false
}

resource "google_project_service" "monitoring" {
  service            = "monitoring.googleapis.com"
  disable_on_destroy = false
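}
The canonical PCDE CI/CD shape: Cloud Build emits images to Artifact Registry; Cloud Deploy promotes the same image SHA across environments. The repository is the single source of truth for every environment: immutable images, mutable tags.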
resource "google_artifact_registry_repository" "images" {
  repository_id = "certlabpro-pcde-images"
  location      = "us-central1"
  format        = "DOCKER"
  labels        = local.labels
  depends_on    = [google_project_service.artifactregistry]
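}
A Cloud Build trigger is PCDE's commit-to-build primitive. We define one trigger against the main branch of a GitHub repository: every push runs the cloudbuild.yaml at the repository root, which does a docker build + docker push to the Artifact Registry repository from step 2.
Replace github-owner and github-repo with your actual repository. The trigger will provision, but it will not fire until the GitHub Cloud Build app is installed on the repository (a one-time manual console step). If terraform apply rejects the trigger because the repository is not connected yet, complete that console step first and re-apply.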
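For readers who want to see the build itself in Terraform, here is a minimal sketch of the same docker build + docker push expressed as inline build steps instead of a cloudbuild.yaml file. It is an alternative to the file-based trigger, not a companion to it (a trigger takes either filename or build). The resource name main_push_inline, the local image_base, and the image name app are assumptions made for illustration; $SHORT_SHA is a Cloud Build substitution, not a Terraform variable.

```hcl
locals {
  # Path of the step 2 repository, e.g. us-central1-docker.pkg.dev/<project>/<repo>.
  image_base = "${google_artifact_registry_repository.images.location}-docker.pkg.dev/${data.google_project.current.project_id}/${google_artifact_registry_repository.images.repository_id}"
}

# Hypothetical variant of the trigger with the build steps inlined.
resource "google_cloudbuild_trigger" "main_push_inline" {
  name        = "certlabpro-pcde-main-push-inline"
  description = "Build on push to main (inline steps)"

  github {
    owner = "github-owner" # REPLACE
    name  = "github-repo"  # REPLACE
    push {
      branch = "^main$"
    }
  }

  build {
    # docker build, tagged with the short commit SHA supplied by Cloud Build.
    step {
      name = "gcr.io/cloud-builders/docker"
      args = ["build", "-t", "${local.image_base}/app:$SHORT_SHA", "."]
    }
    # docker push of the same tag to Artifact Registry.
    step {
      name = "gcr.io/cloud-builders/docker"
      args = ["push", "${local.image_base}/app:$SHORT_SHA"]
    }
  }

  depends_on = [google_project_service.cloudbuild]
}
```

Tagging with the commit SHA keeps each built image traceable to one commit, which fits the promote-the-same-image approach described for Cloud Deploy.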
resource "google_cloudbuild_trigger" "main_push" {
  name        = "certlabpro-pcde-main-push"
  description = "Build on push to main"
  filename    = "cloudbuild.yaml"

  github {
    owner = "github-owner" # REPLACE
    name  = "github-repo"  # REPLACE
    push {
      branch = "^main$"
    }
  }

  depends_on = [google_project_service.cloudbuild]
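}
Cloud Deploy is GCP's managed CD service: it promotes a release through an ordered sequence of stages (typically dev → staging → prod), each backed by a target (a Cloud Run service, a GKE cluster, or an Anthos cluster). The PCDE exam tests this release-once, promote-many shape extensively.
We define two Cloud Run target services (one for staging, one for prod, both scaling to zero) plus a delivery pipeline that chains them. Actual releases would be triggered with gcloud deploy releases create after a successful Cloud Build run; this lab provisions the infrastructure but does not trigger a release.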
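Since the lab stops short of creating a release, the following output is one possible way to surface the command you would run by hand once everything is applied. It is only a sketch: the output name example_release_command, the release name rel-001, the image name app, and the TAG placeholder are all hypothetical, and the command would typically be run from a directory that holds a Skaffold configuration for the service.

```hcl
# Hypothetical helper: prints a gcloud command string; it does not create a release.
output "example_release_command" {
  description = "Sketch of a command that creates a release on the delivery pipeline"
  value = join(" ", [
    "gcloud deploy releases create rel-001",
    "--project=${data.google_project.current.project_id}",
    "--region=${google_clouddeploy_delivery_pipeline.main.location}",
    "--delivery-pipeline=${google_clouddeploy_delivery_pipeline.main.name}",
    # Replace TAG with the tag or digest that Cloud Build pushed.
    "--images=app=us-central1-docker.pkg.dev/${data.google_project.current.project_id}/${google_artifact_registry_repository.images.repository_id}/app:TAG",
  ])
}
```

After terraform apply, terraform output example_release_command shows the assembled string. Promotion from staging to prod would then wait for manual approval, because the prod target sets require_approval = true.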
resource "google_cloud_run_v2_service" "staging" {
  name     = "certlabpro-pcde-staging"
  location = "us-central1"

  # google provider 6.x defaults this to true, which would block terraform destroy.
  deletion_protection = false

  template {
    scaling { max_instance_count = 5 }
    containers {
      image = "us-docker.pkg.dev/cloudrun/container/hello"
    }
  }

  labels     = local.labels
  depends_on = [google_project_service.run]
}

resource "google_cloud_run_v2_service" "prod" {
  name     = "certlabpro-pcde-prod"
  location = "us-central1"

  # google provider 6.x defaults this to true, which would block terraform destroy.
  deletion_protection = false

  template {
    scaling { max_instance_count = 10 }
    containers {
      image = "us-docker.pkg.dev/cloudrun/container/hello"
    }
  }

  labels     = local.labels
  depends_on = [google_project_service.run]
}
resource "google_clouddeploy_target" "staging" {
  name     = "staging"
  location = "us-central1"

  run {
    location = "projects/${data.google_project.current.project_id}/locations/us-central1"
  }

  depends_on = [google_project_service.clouddeploy]
}

resource "google_clouddeploy_target" "prod" {
  name     = "prod"
  location = "us-central1"

  run {
    location = "projects/${data.google_project.current.project_id}/locations/us-central1"
  }

  require_approval = true # promotion to prod needs manual approval
  depends_on       = [google_project_service.clouddeploy]
}

data "google_project" "current" {}

resource "google_clouddeploy_delivery_pipeline" "main" {
  name     = "certlabpro-pcde-pipeline"
  location = "us-central1"

  serial_pipeline {
    stages {
      target_id = google_clouddeploy_target.staging.name
    }
    stages {
      target_id = google_clouddeploy_target.prod.name
    }
  }

  depends_on = [google_project_service.clouddeploy]
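}
PCDE's SLO / SLI / error budget vocabulary is the load-bearing shape of observability: every PCDE exam question tagged site reliability tests it. We define one service level objective: "99% of HTTP requests to the prod service should return 2xx over a 28-day rolling window." A Cloud Monitoring alert then fires when the burn rate indicates the budget is being consumed too fast.
With these five modules (Artifact Registry, Cloud Build trigger, two Cloud Run targets, Cloud Deploy pipeline, prod SLO + alert), PCDE's commit → build → deploy → observe loop is provisioned end to end.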
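The burn-rate threshold is easier to reason about with the arithmetic written out. The sketch below recomputes it in Terraform locals, assuming the usual definition that a burn rate of 1 spends exactly the whole error budget over the SLO window. All local and output names here are hypothetical and nothing else in the lab depends on them.

```hcl
locals {
  slo_goal       = 0.99
  window_hours   = 28 * 24            # 672 hours in the 28-day rolling window
  error_budget   = 1 - local.slo_goal # about 0.01, i.e. 1% of requests may fail
  fast_burn_rate = 14.4               # threshold used by the alert policy

  # Fraction of the window's budget spent in one hour at the fast burn rate:
  # 14.4 * 1 / 672 ≈ 0.0214 (about 2.1%).
  budget_spent_in_1h = local.fast_burn_rate * 1 / local.window_hours

  # Burn rate that would spend exactly 2% of a 28-day budget in one hour:
  # 0.02 * 672 / 1 = 13.44 (the 14.4 figure corresponds to a 30-day window).
  burn_rate_for_2pct = 0.02 * local.window_hours / 1
}

output "burn_rate_math" {
  description = "Worked numbers behind the fast-burn threshold"
  value = {
    error_budget       = local.error_budget
    budget_spent_in_1h = local.budget_spent_in_1h
    burn_rate_for_2pct = local.burn_rate_for_2pct
  }
}
```

Either threshold is a reasonable fast-burn trigger. The point is that the threshold, the alert lookback, and the SLO window are tied together by one formula: burn rate = budget fraction × window length / lookback length.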
resource "google_monitoring_service" "prod" {
  service_id   = "certlabpro-pcde-prod-svc"
  display_name = "PCDE prod Cloud Run service"

  basic_service {
    service_type = "CLOUD_RUN"
    service_labels = {
      service_name = google_cloud_run_v2_service.prod.name
      location     = google_cloud_run_v2_service.prod.location
    }
  }

  depends_on = [google_project_service.monitoring]
}

resource "google_monitoring_slo" "prod_availability" {
  service             = google_monitoring_service.prod.service_id
  slo_id              = "certlabpro-pcde-prod-availability"
  display_name        = "99% requests return 2xx (28-day rolling)"
  goal                = 0.99
  rolling_period_days = 28

  basic_sli {
    availability {
      enabled = true
    }
  }
}

resource "google_monitoring_alert_policy" "prod_budget_burn" {
  display_name = "PCDE prod - fast budget burn"
  combiner     = "OR"

  conditions {
    display_name = "1-hour burn rate > 14.4 (fast burn)"
    condition_threshold {
      # Burn rate of the SLO's error budget, measured over a 1-hour lookback.
      filter     = "select_slo_burn_rate(\"${google_monitoring_slo.prod_availability.name}\", \"3600s\")"
      duration   = "0s"
      comparison = "COMPARISON_GT"
      # 14.4 spends 2% of a 30-day budget in 1 hour (about 2.1% of this 28-day window).
      threshold_value = 14.4
    }
  }
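}
terraform destroy tears down everything. The Cloud Deploy pipeline + targets destroy cleanly (no in-flight release blocks the destroy). The two Cloud Run services destroy, provided deletion_protection = false is set on them (google provider 6.x defaults it to true, which blocks deletion); with minimum instances at 0 they incur no idle cost during the lab anyway. The Cloud Build trigger detaches. The SLO + alert + monitoring service destroy cleanly.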
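PCDE covers a lot that this lab cannot fit: Cloud Logging in depth (log routing, log sinks, log buckets, Log Analytics ↔ BigQuery integration), Cloud Trace + Cloud Profiler + Cloud Debugger (application performance), Error Reporting, Cloud Monitoring dashboards + custom metrics + uptime checks, Cloud Build private worker pools, Cloud Build approval rules, Cloud Deploy canary + blue-green deployment strategies, Cloud Deploy custom render/verify steps, Skaffold-based Cloud Deploy render pipelines, Binary Authorization for image attestation enforcement, Container Threat Detection, Web App and API Protection (WAAP) / Cloud Armor, Anthos Config Management + Policy Controller, GKE Backup, and the whole body of Google SRE Book / SRE Workbook practice that the PCDE exam references often.
We stick to the Artifact Registry + Cloud Build + Cloud Deploy + Cloud Run + SLO/alert primitives because they are PCDE's canonical CI/CD/operations loop. Logging / Trace / Profiler / Debugger / Error Reporting all attach to the same prod target service. Binary Authorization guards Cloud Deploy's promotion step. The Cloud Monitoring SLO is the load-bearing reliability primitive: get that shape right, then layer on more telemetry surfaces as the architecture matures.
For conceptual coverage of the services, see the Browse, Handbook, and Editorial sections of this certification page.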