
Add secrets support and instruction for budget notifications (#444)

* Move to secrets

* Add billing actions
Grigory Movsesyan 2023-05-26 12:15:44 +02:00 committed by GitHub
parent a29b5c4357
commit ba66080689
GPG key ID: 4AEE18F83AFDEB23
8 changed files with 189 additions and 74 deletions


@ -1,53 +1,94 @@
#ToDo:
Move secrets to gcp secrets
Format readme in md
Cloud build for terraform
[?]Wait for kubernetes plugin to finish apply
Add readme if the budget is exceeded
Be aware that the actions you execute on your GCP project will generate costs.
#Permissions
# Permissions
TODO
#1st run (bootstrap)
# Step 1: Bootstrap
Copy `variables.tfvars` from `variables.tfvars_example`
Replace the placeholders for `project-id` and `billing-account` in `variables.tfvars`
Insert secret values in the `variables.tfvars` file, or enter them at runtime when running `terraform plan` or `terraform apply`.
Initialise terraform
### Initialise terraform with local backend
Comment out everything in the `backend.tf` file to use local state for the first run, because the bucket for storing the state has not been created yet.
```terraform init```
```
terraform -chdir=terraform init
```
Create the state bucket
```terraform apply -var-file=variables.tfvars -target="google_storage_bucket.terraform_state"```
### Create the state bucket
```
terraform -chdir=terraform apply -var-file=variables.tfvars -target="google_storage_bucket.terraform_state"
```
To skip the confirmation prompt, use the `--auto-approve` flag.
##Move the state to the bucket.
## Migrate the state to the bucket.
Uncomment everything in the `backend.tf` file to use remote state with the newly created bucket.
```export PROJECT_ID="<PROJECT_ID>"```
```terraform init -backend-config="bucket=terraform-state-${PROJECT_ID}" -backend-config="prefix=terraform/state"```
```
export PROJECT_ID="<PROJECT_ID>"
```
```
terraform -chdir=terraform init -backend-config="bucket=terraform-state-${PROJECT_ID}" -backend-config="prefix=terraform/state"
```
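The bucket name passed to `-backend-config` has to match the name Terraform gives the bucket, which is `terraform-state-<project id>`. A minimal sketch of a helper that derives it and refuses an empty project id is shown here; the function name `state_bucket_name` is hypothetical and not part of the repository.

```shell
# Hypothetical helper: derive the state bucket name from the project id.
# The "terraform-state-" prefix mirrors the bucket name used in the Terraform config.
state_bucket_name() {
  if [ -z "${1:-}" ]; then
    echo "project id is required" >&2
    return 1
  fi
  printf 'terraform-state-%s\n' "$1"
}

# Possible usage (not executed here):
# terraform -chdir=terraform init \
#   -backend-config="bucket=$(state_bucket_name "${PROJECT_ID}")" \
#   -backend-config="prefix=terraform/state"
```

Failing early on an empty `PROJECT_ID` avoids initialising the backend against a bucket literally named `terraform-state-`.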
## Create the secrets.
```
terraform -chdir=terraform apply -var-file=variables.tfvars -target="google_secret_manager_secret.secret"
```
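After this apply, each secret should be addressable under the version name that the Cloud Build configuration in this change refers to. The sketch below only prints those names for a given project so they can be compared against Secret Manager; the function name `secret_version_names` is an assumption made for illustration.

```shell
# Hypothetical helper: print the Secret Manager version names for the six
# secrets defined in this change. It does not call any Google Cloud API.
secret_version_names() {
  project_id="$1"
  for name in buildkite-api-token-readonly buildkite-agent-token \
              conduit-api-token git-id-rsa id-rsa-pub git-known-hosts; do
    printf 'projects/%s/secrets/%s/versions/latest\n' "$project_id" "$name"
  done
}

# Possible follow-up check per secret (not executed here):
# gcloud secrets describe buildkite-agent-token --project="${PROJECT_ID}"
```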
Create the cluster. Due to the problem described [here](https://github.com/hashicorp/terraform-provider-kubernetes/issues/1775) terraform kubernetes provider requires kubernetes cluster to be created first. So to create the cluster without applying kubernetes resources we will do the apply in 2 runs using the `-target` flag.
```terraform apply -var-file=variables.tfvars -target="google_container_cluster.llvm_premerge_checks_cluster"```
## Create the cluster.
Due to the problem described [here](https://github.com/hashicorp/terraform-provider-kubernetes/issues/1775), the Terraform Kubernetes provider requires the Kubernetes cluster to exist first. To create the cluster without applying Kubernetes resources, we do the apply in 2 runs using the `-target` flag.
```
terraform -chdir=terraform apply -var-file=variables.tfvars -target="google_container_cluster.llvm_premerge_checks_cluster"
```
##Creating worker images
## Build the builders
To deploy build workers you need the worker docker images in your project.
TODO cloud build SA permissions
###Linux worker image
### Linux worker image
Execute cloud build to build Linux worker:
```gcloud builds submit --config=containers/buildkite-premerge-debian/cloudbuild.yaml containers/buildkite-premerge-debian/ --project=${PROJECT_ID}```
```
gcloud builds submit --config=containers/buildkite-premerge-debian/cloudbuild.yaml containers/buildkite-premerge-debian/ --project=${PROJECT_ID}
```
###Windows worker image
### Windows worker image
Build the Windows cloud builder by following the steps described [here](https://github.com/GoogleCloudPlatform/cloud-builders-community/tree/master/windows-builder).
Execute cloud build to build Windows worker:
```gcloud builds submit --config=containers/buildkite-premerge-windows/cloudbuild.yaml containers/buildkite-premerge-windows/ --project=${PROJECT_ID}```
```
gcloud builds submit --config=containers/buildkite-premerge-windows/cloudbuild.yaml containers/buildkite-premerge-windows/ --project=${PROJECT_ID}
```
##Create the rest of the gcp resources including workers in kubernetes pods
```terraform apply -var-file="variables.tfvars"```
## Create the rest of the gcp resources including workers in kubernetes pods
```
terraform -chdir=terraform apply -var-file="variables.tfvars"
```
#Budget
TODO
## Terraform cloud build automation
Manual trigger:
```
gcloud builds submit --config=terraform/cloudbuild.yaml terraform --project=${PROJECT_ID} --substitutions=_GIT_REPO=${GIT_REPO}
```
Automatic trigger:
```
<TODO>
```
# Budget
Budget alerts are set on a monthly basis. Notification emails are triggered at 50%, 90% and 100% of your budget.
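As a worked example of those thresholds, the sketch below computes the spend at which each notification would fire for a given monthly budget. The function name `budget_thresholds` is hypothetical; the value 25000 is the default of the `billing-budget` variable in this change.

```shell
# Hypothetical helper: print the spend level for each alert threshold.
# Integer arithmetic only, so pass the budget as a whole number.
budget_thresholds() {
  budget="$1"
  for pct in 50 90 100; do
    echo "${pct}% -> $(( budget * pct / 100 ))"
  done
}

budget_thresholds 25000
# 50% -> 12500
# 90% -> 22500
# 100% -> 25000
```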
## Actions to reduce project costs
Adjust the GKE cluster nodes: set the node count to 1 per build platform (Linux and Windows). The node count is controlled from the `variables.tfvars` file using the `linux-agents-count` and `windows-agents-count` parameters. If these parameters are not set in `variables.tfvars`, the defaults configured in the `variables.tf` file apply.
## Break glass procedure
For emergency reconfiguration of the Kubernetes build nodes, use the following set of gcloud commands. It is recommended to execute them from the [Cloud Shell](https://cloud.google.com/shell/docs/using-cloud-shell) for simplicity. These commands assume you have the necessary permissions and authentication to access and modify the GKE cluster.
```
export PROJECT_ID=<your project id>
export ZONE="europe-west3-c"
gcloud container clusters get-credentials ${PROJECT_ID}-cluster --zone ${ZONE} --project ${PROJECT_ID}
#gcloud container clusters update llvm-premerge-checks-cluster --node-pool linux-agents --zone ${ZONE} --project ${PROJECT_ID} --no-enable-autoscaling
gcloud container clusters resize llvm-premerge-checks-cluster --node-pool linux-agents --num-nodes 1 --zone ${ZONE} --project ${PROJECT_ID}
#gcloud container clusters update llvm-premerge-checks-cluster --node-pool windows-agents --zone ${ZONE} --project ${PROJECT_ID} --no-enable-autoscaling
gcloud container clusters resize llvm-premerge-checks-cluster --node-pool windows-agents --num-nodes 1 --zone ${ZONE} --project ${PROJECT_ID}
```
These commands resize the Linux and Windows agent node pools to a single node each. Adjust the node count as needed. Note that changing your GKE cluster configuration or scaling down nodes may impact the availability and performance of the pipeline.
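To review the resize commands before running them, a dry-run helper along these lines may be useful. It only prints the two `gcloud container clusters resize` commands from the procedure above for a chosen node count; the function name `resize_commands` is hypothetical.

```shell
# Hypothetical dry-run helper: print (do not execute) the resize command
# for each agent node pool, using the cluster name from the procedure above.
resize_commands() {
  project_id="$1"
  zone="$2"
  nodes="$3"
  for pool in linux-agents windows-agents; do
    echo "gcloud container clusters resize llvm-premerge-checks-cluster --node-pool ${pool} --num-nodes ${nodes} --zone ${zone} --project ${project_id}"
  done
}

# Example: show the commands for one node per pool.
resize_commands "my-project" "europe-west3-c" 1
```

Printing first and executing second keeps an emergency change reviewable; the output can be pasted into the Cloud Shell once it looks right.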


@ -1,10 +1,8 @@
#todo fix billing alert creation
data "google_billing_account" "account" {
billing_account = var.billing-account
}
#todo do not create billing if option in variables is off
resource "google_billing_budget" "budget" {
billing_account = data.google_billing_account.account.id
billing_account = data.google_project.current_project.billing_account
display_name = "budget"
amount {
specified_amount {
@ -12,10 +10,12 @@ resource "google_billing_budget" "budget" {
units = var.billing-budget
}
}
budget_filter {
projects = ["projects/${var.project-id}"]
projects = ["projects/${data.google_project.current_project.number}"]
credit_types_treatment = "EXCLUDE_ALL_CREDITS"
calendar_period = "MONTH"
#services = ["services/24E6-581D-38E5"] # Bigquery
}
@ -29,11 +29,12 @@ resource "google_billing_budget" "budget" {
threshold_percent = 1.0
}
#TODO add if not empty billing admins var. Else use default admins
all_updates_rule {
monitoring_notification_channels = [
for k, v in google_monitoring_notification_channel.notification_channel : google_monitoring_notification_channel.notification_channel[k].id
]
disable_default_iam_recipients = true
disable_default_iam_recipients = length(var.billing-admins) < 1 ? false : true
}
}

terraform/cloudbuild.yaml Normal file

@ -0,0 +1,57 @@
steps:
- name: gcr.io/cloud-builders/git
args:
- '-c'
- 'git clone ${_GIT_REPO} repo --depth 1'
entrypoint: bash
- name: hashicorp/terraform
args:
- init
- '-backend-config=bucket=${_TF_BACKEND_BUCKET}'
- '-backend-config=prefix=${_TF_BACKEND_PREFIX}'
dir: repo/terraform
- name: hashicorp/terraform
args:
- plan
- '-var=project-id=${PROJECT_ID}'
- '-var=buildkite-api-token-readonly=$$BUILDKITE_API_TOKEN_READONLY'
- '-var=buildkite-agent-token=$$BUILDKITE_AGENT_TOKEN'
- '-var=conduit-api-token=$$CONDUIT_API_TOKEN'
- '-var=git-id-rsa=$$GIT_ID_RSA'
- '-var=id-rsa-pub=$$ID_RSA_PUB'
- '-var=git-known-hosts=$$GIT_KNOWN_HOSTS'
- '-out=/workspace/tfplan-${BUILD_ID}'
secretEnv:
- 'BUILDKITE_API_TOKEN_READONLY'
- 'BUILDKITE_AGENT_TOKEN'
- 'CONDUIT_API_TOKEN'
- 'GIT_ID_RSA'
- 'ID_RSA_PUB'
- 'GIT_KNOWN_HOSTS'
dir: repo/terraform
# - name: hashicorp/terraform
# args:
# - apply
# - '-auto-approve'
# - /workspace/tfplan-${BUILD_ID}
# dir: repo/terraform
substitutions:
_GIT_REPO: $(body.project.git_http_url)
_TF_BACKEND_BUCKET: 'terraform-state-${PROJECT_ID}'
_TF_BACKEND_PREFIX: terraform/state
availableSecrets:
secretManager:
- versionName: 'projects/${PROJECT_ID}/secrets/buildkite-api-token-readonly/versions/latest'
env: 'BUILDKITE_API_TOKEN_READONLY'
- versionName: 'projects/${PROJECT_ID}/secrets/buildkite-agent-token/versions/latest'
env: 'BUILDKITE_AGENT_TOKEN'
- versionName: 'projects/${PROJECT_ID}/secrets/conduit-api-token/versions/latest'
env: 'CONDUIT_API_TOKEN'
- versionName: 'projects/${PROJECT_ID}/secrets/git-id-rsa/versions/latest'
env: 'GIT_ID_RSA'
- versionName: 'projects/${PROJECT_ID}/secrets/id-rsa-pub/versions/latest'
env: 'ID_RSA_PUB'
- versionName: 'projects/${PROJECT_ID}/secrets/git-known-hosts/versions/latest'
env: 'GIT_KNOWN_HOSTS'
options:
dynamic_substitutions: true


@ -58,7 +58,7 @@ resource "google_container_cluster" "llvm_premerge_checks_cluster" {
cluster_secondary_range_name = "pods"
services_secondary_range_name = "services"
}
depends_on = [google_project_service.compute_api, google_project_service.container_api]
depends_on = [google_project_service.google_api]
}
resource "google_container_node_pool" "linux_agents_nodepool" {
@ -79,8 +79,9 @@ resource "google_container_node_pool" "linux_agents_nodepool" {
}
autoscaling {
min_node_count = 0
max_node_count = var.linux-agents-count
min_node_count = 0
max_node_count = var.linux-agents-count
location_policy = "BALANCED"
}
}
@ -102,8 +103,9 @@ resource "google_container_node_pool" "windows_agents_nodepool" {
}
autoscaling {
min_node_count = 0
max_node_count = var.windows-agents-count
min_node_count = 0
max_node_count = var.windows-agents-count
location_policy = "BALANCED"
}
}


@ -5,9 +5,19 @@ data "google_project" "current_project" {
}
locals {
cloud_build_sa_roles = ["roles/storage.objectAdmin", "roles/compute.instanceAdmin", "roles/compute.securityAdmin"]
cloud_build_sa_roles = ["roles/editor", "roles/storage.objectAdmin", "roles/secretmanager.secretAccessor","roles/secretmanager.viewer","roles/resourcemanager.projectIamAdmin"]
enabled_apis = [
"secretmanager.googleapis.com",
"billingbudgets.googleapis.com",
"cloudbuild.googleapis.com",
"compute.googleapis.com",
"container.googleapis.com",
"cloudresourcemanager.googleapis.com",
"cloudbilling.googleapis.com"
]
}
#todo create separate sa for cloud build
# data "google_iam_policy" "cloud_build_sa" {
# binding {
# role = "roles/iam.serviceAccountUser"
@ -24,41 +34,23 @@ locals {
# }
resource "google_project_iam_member" "cloudbuild_sa_roles" {
project = var.project-id
project = var.project-id
for_each = toset(local.cloud_build_sa_roles)
role = each.value
role = each.value
member = "serviceAccount:${data.google_project.current_project.number}@cloudbuild.gserviceaccount.com"
}
resource "google_project_service" "cloudbuild_api" {
service = "cloudbuild.googleapis.com"
}
resource "google_project_service" "compute_api" {
service = "compute.googleapis.com"
}
resource "google_project_service" "container_api" {
service = "container.googleapis.com"
}
resource "google_project_service" "cloudresourcemanager_api" {
service = "cloudresourcemanager.googleapis.com"
}
resource "google_project_service" "cloudbilling_api" {
service = "cloudbilling.googleapis.com"
}
resource "google_project_service" "billingbudgets_api" {
service = "billingbudgets.googleapis.com"
resource "google_project_service" "google_api" {
for_each = toset(local.enabled_apis)
service = each.value
}
resource "google_storage_bucket" "terraform_state" {
name = "terraform-state-${var.project-id}"
uniform_bucket_level_access = true
location = "EU"
depends_on = [google_project_service.google_api]
}
resource "google_compute_network" "vpc_network" {

terraform/secrets.tf Normal file

@ -0,0 +1,26 @@
locals {
secrets = {
"buildkite-api-token-readonly": var.buildkite-api-token-readonly,
"buildkite-agent-token": var.buildkite-agent-token,
"conduit-api-token": var.conduit-api-token,
"git-id-rsa": var.git-id-rsa,
"id-rsa-pub": var.id-rsa-pub,
"git-known-hosts": var.git-known-hosts
}
}
resource "google_secret_manager_secret" "secret" {
for_each = local.secrets
secret_id = each.key
replication {
automatic = true
}
depends_on = [google_project_service.google_api]
}
resource "google_secret_manager_secret_version" "secret_version" {
for_each = local.secrets
secret = google_secret_manager_secret.secret[each.key].id
secret_data = each.value
}


@ -2,16 +2,14 @@ variable "project-id" {
type = string
}
variable "billing-account" {
type = string
}
variable "billing-budget" {
type = number
type = number
default = "25000"
}
variable "billing-admins" {
type = map(any)
type = map(any)
default = {}
}
variable "region" {


@ -1,7 +1,5 @@
project-id = ""
billing-account = ""
billing-budget = 25000
billing-admins = {"test": "test@test.com"}
#billing-admins = {"test": "test@test.com"}
#linux-agents-machine-type = "e2-standard-8"
#linux-agents-count = 1