Senior DevOps Engineer

Full-Time Opportunity

Ownership
Architecture
Reliability
Observability
Security
Compliance
Communication
Documentation
Autonomy
Transparency
Collaboration
Simplicity
Troubleshooting
Scalability
Resilience

Overview

Our client is looking for a Senior DevOps Engineer to help lead a major infrastructure transformation from on-premises roots to a cloud-native, Kubernetes-first architecture on AWS. This role owns the cloud platform layer end-to-end, spanning infrastructure, CI/CD, observability, security, and developer enablement in a live production financial environment where uptime, auditability, and security are critical. The successful candidate will join a small, high-autonomy engineering team, shape platform engineering conventions and tooling choices from the ground up, and simplify complex systems for the engineering teams that rely on them.

Responsibilities

Design and maintain Terraform and Terragrunt modules for multi-account AWS environments
Manage EKS clusters, Karpenter node provisioning, networking, and IAM
Drive infrastructure toward immutable, declarative patterns
Build and operate the observability stack across metrics, logs, traces, dashboards, and alerting
Define SLOs and support SLO-based alerting tied to DORA metrics
Drive incident response and contribute to HA/DR architecture for a regulated financial platform
Own CI/CD pipelines using GitLab runners, ArgoCD, and GitOps patterns
Build golden pipelines and self-service tooling that reduce friction between code merge and production
Implement policy-as-code, container scanning, supply chain security, and secrets management
Partner with security teams on audit readiness and compliance controls
Architect and deploy production EKS clusters with Karpenter for intelligent, cost-efficient node scaling
Build a Terragrunt-driven multi-account AWS landing zone with SCPs, Control Tower, and least-privilege IAM
Design GitOps deployment pipelines using ArgoCD with ApplicationSets for multi-environment promotion
Stand up a full observability stack using metrics, logs, traces, and alerting
Evaluate and implement a service mesh such as Istio or Linkerd for mTLS, traffic management, and canary deployments
Introduce supply chain security practices including image signing with Cosign, SBOM generation, and admission policies
Build an internal developer platform with self-service namespaces, templated pipelines, and environment provisioning
Contribute to architecture decision records, blameless post-mortems, internal documentation, and broader platform engineering standards
Help shape conventions, tooling choices, golden paths, and internal engineering culture from the ground up

Experience

3+ years of production Kubernetes experience
Strong hands-on experience with EKS; GKE or AKS experience is also relevant
Deep understanding of Kubernetes networking, RBAC, autoscaling, and production troubleshooting
Proven Terraform experience at scale, including module authoring, remote state management, and provider version pinning
Experience with Terragrunt and multi-account cloud patterns
Deep AWS experience across EKS, IAM, VPC architecture, Organizations, SCPs, and a broad range of AWS services
Experience designing secure, fast, developer-trusted CI/CD pipelines using GitHub Actions, GitLab CI, or equivalent
Strong security-first delivery experience, including least-privilege design, container scanning, secrets management, and shift-left practices
Strong communication skills, including the ability to explain architectural decisions clearly and write operational runbooks
GitOps experience with ArgoCD or Flux, including ApplicationSets, progressive delivery, or Argo Rollouts
Service mesh knowledge, including Istio or Linkerd, mTLS, traffic shaping, and blue-green or canary deployment patterns
Deep observability experience across OpenTelemetry, Prometheus at scale, Grafana Loki, and SLO-based alerting
Experience with policy-as-code and admission control frameworks such as OPA, Gatekeeper, or Kyverno
Working familiarity with managed databases, especially Aurora PostgreSQL, including connection pooling, replication, and performance monitoring
Familiarity working alongside systems backed by Aurora and Redis
Platform engineering experience across Backstage, internal tooling, golden paths, and developer self-service portals
FinOps awareness, including cost tagging, Karpenter consolidation policies, Savings Plans, and Cost Anomaly Detection

Qualifications

AWS certifications related to cloud architecture, security, or DevOps are advantageous

Tools & Technologies

AWS
EKS
Kubernetes
Terraform
Terragrunt
Docker
GitLab
GitHub
ArgoCD
GitOps
ApplicationSets
Karpenter
IAM
VPC
Organizations
SCPs
Control Tower
Prometheus
Grafana
OpenTelemetry
Loki
Elastic
CloudWatch
OPA
Gatekeeper
Kyverno
Trivy
SOPS
GuardDuty
Security Hub
Secrets Manager
Istio
Linkerd
Backstage
Cosign
SBOM
Aurora
PostgreSQL
Redis
Python
Bash
Go
HCL
OIDC
RBAC
SLOs
DORA
Flux

Location

Cape Town

Expected Salary

120 000 ZAR per month

Work Policy

Hybrid

Team

Engineering

Industry

Software Development

Interview Process

Initial Screening: Conducted by Oneo.
30-Minute Call: Focused on company culture and a technical deep dive.
Technical Interview: To assess technical skills.
Panel Interview: Deeper dive into technical & practical knowledge.
Meet the local team.
