Template Kubernetes manifests with dynamic data using Gomplate functions
TL;DR:
- The TargetGroupBinding custom resource of the AWS Load Balancer Controller requires a target group ARN to be specified
- A target group ARN ends with a random ID that uniquely identifies the target group, for example
arn:aws:elasticloadbalancing:eu-west-1:<account-id>:targetgroup/<target-group-name>/ba7a3694de41e946
- To deploy multiple TargetGroupBindings, the user is forced to copy and paste target group ARNs from AWS
- Gomplate functions can use a target group name to ARN mapping fetched from AWS to template Kubernetes resources in git
- GitHub Actions can automatically prepare a PR when new manifests are supplied or a target group is recreated on AWS
The story
The solution in this story grew out of testing and tweaking the AWS Load Balancer Controller. I struggled to make it manage only the resources I wanted it to control, i.e. only register targets. That is not a trivial task.
The default (and original) behavior of the AWS Load Balancer Controller is to create and manage an AWS ALB (Application Load Balancer) or AWS NLB (Network Load Balancer), which covers:
- creating/deleting LB instance
- creating listeners and certificates
- attaching security groups
- creating target groups
- registering pod IPs to target group
The configuration of a particular load balancer can be tweaked with annotations on the Ingress or Service object.
Most often, once a load balancer is successfully provisioned and a DNS record points to it to serve traffic for a particular domain, the LB shouldn't be modified or (God please save us from that) deleted. Because of that, I realized I wanted the AWS ALB/NLB to be created and managed by Terraform, not by the AWS Load Balancer Controller.
And here comes the first challenge.
First challenge
First of all, I restricted the AWS IAM permissions of the AWS Load Balancer Controller so that it cannot create, delete, or modify ALBs/NLBs, listeners, or security groups. It can only create listener rules and target groups. To make the controller use an existing ALB/NLB, a few requirements must be satisfied (not directly mentioned in the docs):
- tags on the load balancer must match the ones that the AWS Load Balancer Controller sets when it creates a new instance
- load balancer settings must match as well (like protocols and SSL policies)
- security groups (and the tags on them) and security group rules must match the ones that the AWS Load Balancer Controller would create
- and more that I no longer remember (there was definitely something about certificates)
Any mismatch in the above settings (like a tag missing on a security group) will cause the controller to raise errors and stop reconciling Ingress objects into listener rules.
On top of that, when the controller owns the load balancer, provisioning has to happen in stages:
Create infrastructure - VPC, EKS (by Terraform) -> Create AWS Load Balancer (by AWS Load Balancer Controller) -> Create Route53 entry mapping domain to Load Balancer (by Terraform)
It cannot be created in one step. ExternalDNS is another solution to that, but I'm not entirely convinced it is the best approach.
And then I discovered the TargetGroupBinding custom resource.
TargetGroupBinding
Quoting:
TargetGroupBinding is a custom resource (CR) that can expose your pods using an existing ALB TargetGroup or NLB TargetGroup.
This resource allows the controller to work in a different mode: it only registers and deregisters pod IPs in an existing target group. It shifts the approach a little in terms of rules, though: they must be configured elsewhere (not by Ingress objects), e.g. in Terraform.
To sum up, Terraform will create:
- Load Balancer (ALB/NLB)
- Security Groups
- Listener
- Listener Rules
- Certificates
- Target Groups
And the AWS Load Balancer Controller will register pod IPs in the specified target groups.
A TargetGroupBinding resource requires a few things to be specified:
- service name
- service port
- target group ARN
apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: my-tgb
spec:
  serviceRef:
    name: awesome-service # route traffic to the awesome-service
    port: 80
  targetGroupARN: <arn-to-targetGroup>
  targetType: ip
  ipAddressType: ipv4
Target group ARN looks like this: arn:aws:elasticloadbalancing:eu-west-1:123456789:targetgroup/my-tgb/54add415a4352341
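To make the problem concrete, the short bash sketch below splits the example ARN into its parts using plain parameter expansion. The variable names are only illustrative. The name is something you choose yourself, while the trailing ID is assigned by AWS and cannot be derived from the name, which is why the full ARN has to be looked up.

```shell
#!/usr/bin/env bash
# Split a target group ARN into its name and its AWS-assigned ID.
# The ARN is the example value used in this post.
arn="arn:aws:elasticloadbalancing:eu-west-1:123456789:targetgroup/my-tgb/54add415a4352341"

resource="${arn##*:}"      # everything after the last ':'  -> targetgroup/my-tgb/<id>
tg_id="${resource##*/}"    # everything after the last '/'  -> <id>
tg_name="${resource%/*}"   # drop the trailing '/<id>'      -> targetgroup/my-tgb
tg_name="${tg_name#*/}"    # drop the leading 'targetgroup/' -> my-tgb

echo "name=${tg_name} id=${tg_id}"
```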
Now comes another challenge: how to handle the creation of this resource for hundreds of services and multiple environments (and thus AWS accounts) in a GitOps way (using, for example, ArgoCD) without manually copying the ARN for each service?
Solution
Pretty simple:
- TargetGroupBinding objects are created by Helm charts (they can be created standalone too)
- targetGroupARN has to be specified in the Helm chart values to create the object
- the targetGroupARN value will be templated by Gomplate in a PR using GitHub Actions
Gomplate
A script for Gomplate that uses the target-groups.json file as a source to look up Target Group ARNs:
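A minimal sketch of what such a lookup script could look like, assuming target-groups.json is a flat JSON map of target group name to ARN (the shape produced by the aws/jq command in this post). The function name tg_arn and the demo file are hypothetical and only serve the illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: tg_arn NAME [FILE]
# Prints the ARN stored under NAME in a {"<name>": "<arn>"} JSON map.
tg_arn() {
  local name="$1" file="${2:-target-groups.json}"
  # `//` falls through to error() when the key is missing (null), so a typo
  # in a target group name aborts with a non-zero status instead of
  # printing "null" into the rendered values file.
  jq -r --arg name "$name" \
    '.[$name] // error("unknown target group: " + $name)' "$file"
}

# Demo against a throwaway map, so it runs without AWS access.
demo_map="$(mktemp)"
cat > "$demo_map" <<'EOF'
{"my-tgb": "arn:aws:elasticloadbalancing:eu-west-1:123456789:targetgroup/my-tgb/54add415a4352341"}
EOF
tg_arn my-tgb "$demo_map"
```

Saved as an executable script whose last line is `tg_arn "$@"`, a helper like this could be exposed to templates as a Gomplate plugin, since plugins are external commands whose standard output becomes the function result. Failing on unknown names is a deliberate choice here: a broken ARN in git is harder to spot than a failed templating run.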
Configuring Gomplate to use the script as a plugin:
With that in place, it is sufficient to run the following script:
# Dump all target groups of the AWS account into a JSON file as a mapping
# <target-group-name>: <target-group-arn>
aws elbv2 describe-target-groups \
  --query 'TargetGroups[*].[TargetGroupName, TargetGroupArn]' --output json | \
  jq 'map({(.[0]): .[1]}) | add' > target-groups.json
# Template `values.yaml.gotmpl` files into `values.yaml`
# --output-map is itself a template; .in is the path of the input file
gomplate \
  --input-dir values \
  --include 'values.yaml.gotmpl' \
  --output-map='values/{{ .in | strings.ReplaceAll ".yaml.gotmpl" ".yaml" }}'
Thus the following file:
will be templated into:
targetGroupARN: "arn:aws:elasticloadbalancing:eu-west-1:123456789:targetgroup/my-tgb/54add415a4352341"
GitHub Actions
Now let's automate that to happen every time the values directory is modified. If no changes are detected, a PR won't be created.
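The "no changes, no PR" check can be sketched in plain shell, independently of any particular GitHub Action. This is only an illustration under the assumption that templating runs inside a git checkout and writes into the values directory; the function name values_changed is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: values_changed [DIR]
# Succeeds (exit 0) when DIR contains modified or untracked files compared
# to the last commit, and fails (exit 1) when the working tree is clean.
values_changed() {
  local dir="${1:-values}"
  # --porcelain prints one line per changed path and nothing when clean.
  [ -n "$(git status --porcelain -- "$dir")" ]
}
```

A workflow step could then run the templating and continue to the PR step only when `values_changed values` succeeds.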
First, let's specify a file with the AWS account ID mapping, which GitHub Actions will use to assume the correct role.
With that ready (and AWS OIDC already configured), let's define the GitHub Actions workflow:
Extras
Updating TargetGroupBinding objects
When using ArgoCD to manage TargetGroupBinding objects, remember to place the argocd.argoproj.io/sync-options: Replace=true annotation on the object. If the service name or any other field changes, the entire object must be recreated. Without the annotation, in-place updates to that object will fail.
apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: target-group-name
  annotations:
    argocd.argoproj.io/sync-options: Replace=true
If you don't set the vpcId field in a TargetGroupBinding, the AWS Load Balancer Controller will look it up in AWS and set it for you, but this breaks GitOps integrations like ArgoCD, which will see a diff between the desired manifest and the live manifest.
Example values file: