Infrastructure as Code (IaC) and GitOps: The Foundation of Modern Platforms
In the previous post, we explored internal developer portals as the interface layer that provides developers with unified access to platform capabilities. But what powers those capabilities? How do platforms actually provision infrastructure, deploy applications, and maintain consistency across environments?
The answer lies in two foundational patterns: Infrastructure as Code (IaC) and GitOps. These aren’t just technical implementations - they represent fundamental shifts in how we manage infrastructure and application deployment. Understanding these patterns is essential for both building platforms and working effectively with them.
Infrastructure as Code: The Foundation
Infrastructure as Code is the practice of defining infrastructure through machine-readable files rather than manual processes or interactive configuration. But this dry definition misses the profound philosophical shift IaC represents.
The Core Philosophy
IaC applies software engineering principles to infrastructure management. The insight: infrastructure should be treated like application code.
Version Controlled - Every infrastructure change is tracked in Git. You can see who changed what, when, and why. You have complete history and can roll back to any previous state. Infrastructure changes become auditable and reversible.
Reviewed - Infrastructure changes go through pull requests with code review. Multiple people examine changes before they reach production. The review process itself becomes documentation of decisions and rationale.
Tested - You can validate infrastructure definitions before applying them to production systems. Unit tests check individual modules. Integration tests create real infrastructure in test environments. Policy tests ensure compliance requirements are met.
Reproducible - The same definition creates identical infrastructure every time. No more “it works on my machine” for infrastructure. You can confidently recreate entire environments from definitions. Disaster recovery becomes: run the IaC.
Self-Documenting - The code itself is documentation that’s always up-to-date. Want to know how production is configured? Read the Terraform files. They can’t be out of sync with reality because they are the source of truth.
This shift from imperative (run these commands in this order) to declarative (here’s what the end state should look like) is transformative.
Declarative vs Imperative: A Critical Distinction
This distinction is fundamental to understanding how IaC works:
Imperative approaches specify the steps to achieve the desired state. You’re telling the system HOW to build what you want:
# Create VPC
aws ec2 create-vpc --cidr-block 10.0.0.0/16

# Wait for VPC to be available
aws ec2 wait vpc-available --vpc-ids vpc-123

# Create subnet
aws ec2 create-subnet --vpc-id vpc-123 --cidr-block 10.0.1.0/24

# Create route table
aws ec2 create-route-table --vpc-id vpc-123

# Associate route table with subnet
aws ec2 associate-route-table --subnet-id subnet-456 --route-table-id rtb-789
Problems with this approach:
- If something already exists, you get errors or duplicates
- You need complex logic to handle existing resources
- No automatic cleanup of resources you no longer need
- Difficult to understand current state by reading commands
- Running the script twice produces different results than running it once
Declarative approaches specify the desired end state. You’re telling the system WHAT you want:
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"

  tags = {
    Name = "main-vpc"
  }
}

resource "aws_subnet" "main" {
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.1.0/24"

  tags = {
    Name = "main-subnet"
  }
}

resource "aws_route_table" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "main-route-table"
  }
}

resource "aws_route_table_association" "main" {
  subnet_id      = aws_subnet.main.id
  route_table_id = aws_route_table.main.id
}
Benefits of declarative:
- The tool figures out HOW to achieve the desired state
- If you run this twice, it sees everything exists and does nothing
- If you change the CIDR block and run it again, the tool knows what to update
- The definition is readable - you can understand the infrastructure by reading the code
- Adding or removing resources is straightforward
Most modern IaC tools are declarative: Terraform/OpenTofu, CloudFormation, Pulumi (supports both), Kubernetes manifests, Crossplane. This declarative approach enables powerful patterns like drift detection and automatic remediation.
State Management: The Core Challenge
Declarative IaC requires understanding current infrastructure state to calculate what changes to make. This is the state problem.
Consider: You have Terraform defining 100 resources. You run terraform apply and it creates everything. Now you change one resource and run terraform apply again. How does Terraform know:
- Which resources already exist?
- Which resource you changed?
- What needs to be updated versus left alone?
- What resources should be deleted because they’re no longer in your definitions?
The answer is the state file. Terraform maintains a record of everything it previously created. When you run terraform apply, it:
- Reads your desired state (the .tf files)
- Reads the last known state (the state file)
- Queries actual infrastructure to see if reality matches the state file
- Calculates a diff (what needs to change)
- Applies the changes
- Updates the state file with the new state
State management introduces several challenges:
Lost State - If you lose the state file, Terraform doesn’t know what it created. Running terraform apply again might try to recreate everything (causing conflicts) or think nothing exists (orphaning resources that cost money).
Corrupted State - If state gets out of sync with reality (someone manually changed infrastructure, or a crash during apply), you have drift. Terraform’s view doesn’t match reality, leading to unexpected behavior.
Concurrent Modifications - Two people run terraform apply simultaneously on the same state. They both read the same initial state, calculate different changes, and try to update. Results in conflicts, lost updates, or corrupted state.
Sensitive Data - State files contain everything about your infrastructure, including sensitive values like database passwords and API keys. They need to be secured but also accessible to automation and the team.
Solutions to state management challenges:
Remote State - Store state in a shared location (S3, Azure Blob Storage, Terraform Cloud, Google Cloud Storage) instead of local files. Everyone and all automation access the same state. Enables collaboration and eliminates “works on my machine” for infrastructure.
State Locking - When someone starts a terraform apply, they acquire a lock; others wait until it is released. This prevents concurrent modifications. Locking is typically backed by DynamoDB (for S3 backends), Consul, or a cloud-native locking mechanism.
State Encryption - Encrypt state at rest and in transit. Sensitive values are protected. Most remote backends provide this automatically.
Separate State Per Environment - Don’t share state between dev, staging, and production. Each environment has its own state file. Limits blast radius of mistakes and allows environments to evolve independently.
Drift Detection - Regularly compare state to actual infrastructure. Alert when they diverge. Some tools (Terraform Cloud, Spacelift) can auto-remediate drift, reverting manual changes.
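To make drift detection concrete, here is a minimal sketch of a scheduled check using ordinary CI and Terraform's -detailed-exitcode flag (exit code 2 means the plan found differences between desired and actual state). The workflow name, schedule, and directory layout are illustrative, and the runner is assumed to already have read access to the cloud account and the remote state backend:
name: drift-detection
on:
  schedule:
    - cron: "0 6 * * *"              # Compare desired and actual state once a day
jobs:
  detect-drift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - name: Check for drift
        working-directory: environments/production
        run: |
          terraform init -input=false
          terraform plan -detailed-exitcode -input=false
          # Exit code 0 = no drift, 2 = drift detected (fails the job and alerts the team)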
How Platforms Use IaC
Platforms don’t just “use Terraform” - they structure IaC in specific ways to enable self-service, enforce standards, and reduce cognitive load on developers.
Module Libraries: Encoding Best Practices
Instead of every team writing their own infrastructure code, platforms provide reusable, tested modules that encode organizational best practices:
# Without modules: Teams write their own S3 configuration every time
resource "aws_s3_bucket" "data" {
  bucket = "my-data-bucket"

  # ... 30 lines of configuration for:
  # - Encryption settings
  # - Versioning
  # - Lifecycle policies
  # - Access logging
  # - Public access blocks
  # - CORS rules
  # - Bucket policies
  # - Cost allocation tags
  # - Compliance tags
}

# With platform modules: Standards are built in
module "data_bucket" {
  source = "github.com/company/terraform-modules//s3-bucket"

  name        = "my-data-bucket"
  environment = "production"

  # Everything else is handled by the module:
  # - Encryption enabled by default
  # - Versioning enabled
  # - Access logging configured
  # - Public access blocked
  # - Cost tags applied
  # - Compliance controls enforced
}
The module handles complexity and ensures consistency. Teams get:
- Encryption by default (compliance requirement)
- Proper access logging (security requirement)
- Versioning for disaster recovery
- Lifecycle policies to control costs
- Appropriate tags for cost allocation
- Security configurations that pass audits
When security requirements change, the platform team updates the module once. All teams using the module get the improvement automatically (when they update their module version).
Composition Patterns: Higher-Level Abstractions
Platforms combine multiple modules into higher-level capabilities:
# Platform provides a "web service" module that creates everything needed
module "web_service" {
  source = "github.com/company/platform-modules//web-service"

  name        = "payment-api"
  environment = "production"
  runtime     = "python"

  # The module creates:
  # - ECS task definition with proper resource limits
  # - Application Load Balancer with SSL termination
  # - Auto-scaling policies based on CPU and request count
  # - CloudWatch dashboards with key metrics
  # - Log groups with proper retention
  # - IAM roles following least-privilege principles
  # - Security groups with minimal required access
  # - Route53 records for DNS
  # - Service discovery configuration
  # All properly integrated and following organizational standards
}
One module invocation creates a complete, production-ready service environment. The team doesn’t need to understand ECS, ALB, IAM policies, CloudWatch, or security group rules. The platform has encoded that knowledge in the module.
Policy as Code: Automated Enforcement
Platforms enforce security and compliance requirements through automated policy checks:
# Using Pulumi's policy framework
def s3_bucket_encrypted(args, report_violation):
    if args.resource_type == "aws:s3/bucket:Bucket":
        if not args.props.get("serverSideEncryptionConfiguration"):
            report_violation("S3 buckets must have encryption enabled")

def rds_backup_retention(args, report_violation):
    if args.resource_type == "aws:rds/instance:Instance":
        retention = args.props.get("backupRetentionPeriod", 0)
        if retention < 7:
            report_violation("RDS instances must retain backups for at least 7 days")
Or using Open Policy Agent with Terraform:
# Evaluated against the JSON output of terraform plan
package terraform.policies

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_s3_bucket"
  not resource.change.after.server_side_encryption_configuration
  msg := sprintf("S3 bucket %v must have encryption enabled", [resource.address])
}

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_db_instance"
  resource.change.after.publicly_accessible == true
  msg := sprintf("RDS instance %v cannot be publicly accessible", [resource.address])
}
These policies run during terraform plan. Violations block the change from being applied. Security and compliance requirements are enforced automatically through code, not through documentation that people might ignore or forget.
Testing Infrastructure Code
Just like application code, infrastructure code can and should be tested:
# Using pytest with Pulumi (WebService is the platform's component resource under test)
import pulumi

@pulumi.runtime.test
def test_web_service_creates_load_balancer():
    # Create infrastructure in test mode (doesn't actually provision)
    service = WebService("test-service", environment="test")

    # Assert load balancer was created with correct configuration
    def check_load_balancer(args):
        assert args is not None
        assert args.internal == False
        assert args.enable_deletion_protection == True

    return pulumi.Output.all(service.load_balancer).apply(check_load_balancer)
Or using Terratest with Terraform to test against real infrastructure:
package test

import (
    "fmt"
    "testing"
    "time"

    http_helper "github.com/gruntwork-io/terratest/modules/http-helper"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestWebService(t *testing.T) {
    opts := terraform.Options{
        TerraformDir: "../modules/web-service",
        Vars: map[string]interface{}{
            "name":        "test-service",
            "environment": "test",
        },
    }

    // Cleanup after test
    defer terraform.Destroy(t, &opts)

    // Create actual infrastructure
    terraform.InitAndApply(t, &opts)

    // Verify outputs
    loadBalancerDNS := terraform.Output(t, &opts, "load_balancer_dns")
    assert.NotEmpty(t, loadBalancerDNS)

    // Test the actual infrastructure
    http_helper.HttpGetWithRetry(
        t,
        fmt.Sprintf("https://%s/health", loadBalancerDNS),
        nil,
        200,
        "OK",
        30,
        3*time.Second,
    )
}
Testing catches bugs before they reach production. You can test modules in isolation, validate outputs, even test against real infrastructure in ephemeral environments that are created and destroyed as part of the test suite.
IaC Tools Landscape
Different tools serve different needs:
Terraform/OpenTofu - The most widely used. Cloud-agnostic through providers (AWS, GCP, Azure, Kubernetes, and hundreds more). Large ecosystem of modules. Uses HCL (HashiCorp Configuration Language). OpenTofu is the open-source fork created after HashiCorp’s license change to BSL.
Pulumi - Infrastructure as code using real programming languages (Python, TypeScript, Go, C#, Java). Get full language features: loops, conditionals, functions, classes, testing frameworks. Good for complex logic and dynamic infrastructure.
CloudFormation - AWS-native. Deeply integrated with AWS services, often supporting new features before other tools. JSON or YAML. Free to use. Limited to AWS.
AWS CDK - Write infrastructure in programming languages; the CDK synthesizes it into CloudFormation. Higher-level constructs (“create a load-balanced Fargate service”) compile down to CloudFormation templates. Gets AWS feature support quickly since it uses CloudFormation underneath.
Crossplane - Kubernetes-native infrastructure management. Define infrastructure as Kubernetes custom resources. Infrastructure becomes part of your k8s cluster state, managed by controllers.
Ansible - Can be used for IaC but is more imperative. Better suited for configuration management than infrastructure provisioning.
The choice depends on your constraints: cloud providers, team skills, existing tooling, governance requirements, and complexity needs.
GitOps: The Workflow Pattern
Now let’s examine how changes flow through the system. GitOps is a workflow pattern that makes Git the single source of truth for both application and infrastructure state.
Core Principles
GitOps is built on four fundamental principles:
1. Declarative - Everything is described declaratively. You specify desired state, not procedures to achieve it. Kubernetes manifests, Terraform files, configuration files - all declarative.
2. Versioned and Immutable - All desired state is stored in Git. Every change is a commit. You have complete audit trail and can roll back to any previous state. Git becomes your infrastructure’s time machine.
3. Pulled Automatically - Software agents automatically pull desired state from Git and apply it to target environments. No manual kubectl apply or terraform apply. Changes happen automatically when Git changes.
4. Continuously Reconciled - Agents constantly compare actual state to desired state in Git. If they diverge (drift), the agent automatically remediates. Someone manually changes a deployment? It gets reverted to match Git within minutes.
How GitOps Works
The typical workflow:
1. Developer Makes a Change - Could be application code, could be infrastructure definition, could be configuration. Commits to Git and opens a pull request.
2. CI Runs - Automated tests, linting, security scans, policy checks. All validation happens before merge. The PR shows exactly what will change.
3. Review and Merge - Team reviews the change. After approval, PR merges to main branch. The merge is the deployment trigger.
4. GitOps Agent Detects Change - ArgoCD or FluxCD watches the Git repository and sees the new commit on its next poll (default intervals range from about a minute to a few minutes; webhooks can trigger detection almost immediately).
5. Agent Applies Change - Pulls the new manifests/definitions, compares to current state, calculates what needs to change, applies the changes to the cluster or infrastructure.
6. Agent Reports Status - Updates sync status in Git (via commit status or PR comments). Updates the developer portal with deployment state. Sends notifications if configured.
7. Continuous Reconciliation - Agent periodically re-syncs (default: every 3 minutes) to catch any drift. If someone manually changes something, it gets reverted to match Git.
GitOps for Applications
The most common GitOps use case is deploying applications to Kubernetes.
Repository Structure:
my-app/
├── app/                     # Application source code
│   ├── src/
│   └── Dockerfile
├── manifests/               # Kubernetes manifests
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   └── configmap.yaml
└── .github/
    └── workflows/
        └── ci.yaml          # Build and update manifests
GitOps Flow:
- Developer changes application code
- CI builds new container image, tags it (e.g., v1.2.3 or sha-abc123)
- CI updates manifests/deployment.yaml with the new image tag
- CI commits the manifest change back to Git
- ArgoCD sees the manifest change
- ArgoCD applies updated deployment to cluster
- Kubernetes rolls out new version
- Developer sees deployment status in portal
The key insight: developers never run kubectl apply. All changes flow through Git. Git is the source of truth for what should be running.
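As a rough illustration of the middle steps in that flow, the ci.yaml workflow from the repository layout might contain something like this. The registry, image name, and bot identity are placeholders, registry authentication is omitted, and many teams use a dedicated image updater or a separate manifest repository instead of sed:
name: ci
on:
  push:
    branches: [main]
jobs:
  build-and-bump:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Build and push an image tagged with the commit SHA
      - run: |
          docker build -t registry.example.com/my-app:${GITHUB_SHA} app/
          docker push registry.example.com/my-app:${GITHUB_SHA}

      # Point the manifest at the new tag and commit it back to Git;
      # ArgoCD sees the commit and rolls it out - no kubectl involved
      - run: |
          sed -i "s|image: registry.example.com/my-app:.*|image: registry.example.com/my-app:${GITHUB_SHA}|" manifests/deployment.yaml
          git config user.name "ci-bot"
          git config user.email "ci-bot@example.com"
          git commit -am "Deploy my-app ${GITHUB_SHA}"
          git push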
ArgoCD is the most popular GitOps tool for Kubernetes. You define an Application resource:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payment-api
  namespace: argocd
spec:
  project: default

  source:
    repoURL: https://github.com/company/payment-api
    targetRevision: main
    path: manifests

  destination:
    server: https://kubernetes.default.svc
    namespace: production

  syncPolicy:
    automated:
      prune: true      # Delete resources not in Git
      selfHeal: true   # Revert manual changes
    syncOptions:
      - CreateNamespace=true
ArgoCD watches the manifests/ directory in the repository. Any changes are automatically applied to the production namespace. If someone manually changes a deployment with kubectl, ArgoCD reverts it within 3 minutes (default reconciliation interval).
FluxCD is another popular option, more focused on GitOps primitives and extensibility. Uses Kubernetes-native resources and is lighter-weight.
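For comparison, a minimal Flux setup for the same application might look roughly like this - a sketch using Flux's GitRepository and Kustomization resources, with illustrative names and intervals:
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: payment-api
  namespace: flux-system
spec:
  interval: 1m                  # How often to poll the repository for new commits
  url: https://github.com/company/payment-api
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: payment-api
  namespace: flux-system
spec:
  interval: 10m                 # How often to reconcile the cluster against Git
  sourceRef:
    kind: GitRepository
    name: payment-api
  path: ./manifests
  prune: true                   # Delete resources that were removed from Git
  targetNamespace: production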
GitOps for Infrastructure
GitOps isn’t limited to Kubernetes - you can apply it to infrastructure:
Approach 1: Terraform + GitOps
Repository structure:
infrastructure/
├── modules/                 # Reusable Terraform modules
│   ├── vpc/
│   ├── rds/
│   └── eks/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   └── terraform.tfvars
│   ├── staging/
│   │   ├── main.tf
│   │   └── terraform.tfvars
│   └── production/
│       ├── main.tf
│       └── terraform.tfvars
└── .github/
    └── workflows/
        └── terraform.yaml
Workflow:
- Changes to infrastructure code are committed to Git
- CI runs terraform plan on pull requests and shows what would change
- After merge, CI (or Atlantis, or a Terraform controller) runs terraform apply
- Infrastructure changes are applied automatically
- State is stored remotely and locked during operations
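The terraform.yaml workflow referenced above might look roughly like this - a sketch assuming GitHub Actions, pre-configured cloud credentials, and a remote state backend; a real pipeline would also post the plan output to the pull request and run policy checks:
name: terraform
on:
  pull_request:
    paths: ["environments/**"]
  push:
    branches: [main]
jobs:
  terraform:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: environments/production
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3

      - run: terraform init -input=false                   # Remote, locked state backend
      - run: terraform plan -input=false                    # On pull requests: show what would change
        if: github.event_name == 'pull_request'
      - run: terraform apply -input=false -auto-approve     # After merge: apply automatically
        if: github.event_name == 'push'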
Approach 2: Crossplane (Kubernetes-Native)
Crossplane turns infrastructure into Kubernetes resources. Infrastructure definitions are Custom Resource Definitions (CRDs) in your cluster:
apiVersion: database.aws.crossplane.io/v1beta1
kind: RDSInstance
metadata:
  name: payment-db
  namespace: production
spec:
  forProvider:
    region: us-west-2
    dbInstanceClass: db.t3.medium
    engine: postgres
    engineVersion: "15"
    masterUsername: postgres
    allocatedStorage: 100
    storageEncrypted: true
    backupRetentionPeriod: 7
  writeConnectionSecretToRef:
    name: payment-db-connection
    namespace: production
This definition lives in Git. ArgoCD applies it to your Kubernetes cluster. Crossplane watches for RDSInstance resources and provisions actual RDS instances in AWS. Fully GitOps-native - infrastructure is just another Kubernetes resource.
When you need to change the database (increase storage, change instance class), you update the YAML in Git, commit, and Crossplane handles the update. The Kubernetes reconciliation model applies to infrastructure.
The Reconciliation Pattern
This is the magic of GitOps: continuous reconciliation.
Traditional Approach:
- Make a change
- Apply it manually or through automation
- Hope nothing changes it later
- Reality drifts from definitions over time
- Drift accumulates until the next big refactor/rebuild
GitOps Approach:
- Define desired state in Git
- Agent continuously ensures actual state matches desired state
- Manual changes are automatically reverted
- Configuration drift is detected and corrected automatically
- System self-heals
This pattern is borrowed from Kubernetes itself. The Kubernetes control plane constantly reconciles:
- You want 3 replicas of a pod
- A node dies, taking a pod with it
- ReplicaSet controller notices: 2 replicas != 3 desired
- Controller creates replacement pod
- System self-heals back to desired state
GitOps applies this reconciliation pattern to everything, not just pod replicas. Applications, infrastructure, configuration, policies - all continuously reconciled against Git.
How GitOps Connects to Platforms
Platforms use GitOps as the mechanism for applying changes safely and reliably:
Self-Service Infrastructure Provisioning:
- Developer uses portal to request a database
- Portal creates a Git commit with database definition (Terraform or Crossplane resource)
- Portal opens pull request automatically (or commits directly if review isn’t required)
- Team reviews (if policy requires) and merges
- GitOps agent sees the merge, applies the change
- Database is provisioned
- Portal queries status and shows the new database in service catalog
Application Deployment:
- Developer merges code to main branch
- CI builds container image, runs tests
- CI updates manifest repository with new image tag
- GitOps agent sees manifest change, deploys new version
- Portal shows deployment progress from ArgoCD status
Configuration Changes:
- Developer needs to change environment variable
- Updates the ConfigMap or deployment manifest in Git
- Creates pull request
- After review and merge, GitOps applies it
- No manual kubectl commands needed
Rollbacks:
- New deployment causes issues
- Team reverts the Git commit that changed the image tag
- GitOps automatically rolls back to previous version
- Or, team uses ArgoCD UI to roll back to previous revision
- ArgoCD updates Git to reflect the rollback
The platform orchestrates these workflows. Git provides the audit trail and source of truth. GitOps agents provide the execution and reconciliation.
GitOps Best Practices
Separate Application and Configuration Repositories - Application code lives in one repo (with CI/CD that builds images). Kubernetes manifests live in another repo (watched by GitOps). This separation provides:
- Different permissions (developers can change code, platform team controls production manifests)
- Cleaner audit trail for production changes
- CI can update manifests without triggering application builds
- Different review processes for application code vs. deployment configuration
Environment Promotion Patterns:
Several approaches work:
Branch per Environment - dev branch for development, staging for staging, main for production. Changes flow through branches (merge dev to staging, staging to main).
Directory per Environment - environments/dev/, environments/staging/, environments/production/. Each directory has manifests for that environment. Promotion is copying changes between directories.
Repository per Environment - Separate repos for each environment. Highest isolation, clearest audit trail, but more operational overhead.
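As one concrete (and hedged) illustration of the directory-per-environment approach, many teams use Kustomize overlays: a shared base plus a small kustomization.yaml per environment, where promotion amounts to copying an image tag or patch from one directory to the next. The layout and values below are illustrative:
# environments/production/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: production
resources:
  - ../../base                  # Shared Deployment, Service, Ingress definitions
patches:
  - path: replica-count.yaml    # Production-specific overrides
images:
  - name: payment-api
    newTag: v1.2.3              # Promotion = updating this tag, environment by environment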
Progressive Delivery Integration:
GitOps tools integrate with progressive delivery:
- Argo Rollouts for advanced deployment strategies (canary, blue-green with automatic analysis)
- Flagger for automatic canary deployments with metric-based promotion/rollback
- Define rollout strategies in Git
- GitOps tool manages the progressive rollout automatically
- Automatic rollback if metrics degrade during canary
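As a hedged sketch of what “define rollout strategies in Git” can look like with Argo Rollouts (values are illustrative; a real setup would attach an AnalysisTemplate for automatic metric checks):
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: payment-api
spec:
  replicas: 5
  selector:
    matchLabels:
      app: payment-api
  template:
    metadata:
      labels:
        app: payment-api
    spec:
      containers:
        - name: payment-api
          image: registry.example.com/payment-api:v1.2.3
  strategy:
    canary:
      steps:
        - setWeight: 10             # Shift 10% of traffic to the new version
        - pause: {duration: 5m}     # Wait while metrics accumulate
        - setWeight: 50
        - pause: {duration: 5m}     # Then promote fully (or roll back if analysis fails)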
Policy Enforcement:
Policies can be stored in Git and applied alongside resources:
- OPA policies as ConfigMaps
- Kyverno policies as Kubernetes resources
- Gatekeeper constraints
- GitOps applies policies when it applies other resources
- Policies prevent invalid resources from being created
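For example, a Kyverno policy stored in Git alongside the other manifests might enforce ownership labels; this is an illustrative sketch, not a drop-in policy:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label
spec:
  validationFailureAction: Enforce    # Reject resources that fail validation
  rules:
    - name: check-team-label
      match:
        any:
          - resources:
              kinds: ["Deployment"]
      validate:
        message: "Deployments must carry a team label for ownership and cost allocation."
        pattern:
          metadata:
            labels:
              team: "?*"              # Any non-empty value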
Bringing IaC and GitOps Together
These patterns reinforce each other:
- IaC defines WHAT to create - the desired infrastructure state
- GitOps defines HOW changes flow - the workflow and reconciliation pattern
- Developer Portal provides WHERE developers interact - the unified interface
Complete Platform Flow
Example: “Deploy a new service with database”
1. Portal Interaction - Developer fills out form in portal: service name, runtime (Python), needs database (PostgreSQL).
2. Template Execution - Portal executes template that:
- Generates application skeleton code
- Generates Kubernetes manifests (using templates)
- Generates Terraform for database (using platform modules)
- Creates Git repositories for application and manifests
- Commits everything to Git
3. CI Pipeline - Triggered by initial commit:
- Builds application container image
- Runs tests and security scans
- Pushes image to registry
- Updates manifest repo with image tag
4. Infrastructure Provisioning - GitOps controller (Terraform Cloud, Atlantis, or Crossplane):
- Sees database definition in Git
- Runs Terraform to provision RDS instance
- Creates secrets with connection information
- Reports status back to portal
5. Application Deployment - ArgoCD:
- Sees new manifests in Git
- Creates Kubernetes resources (Deployment, Service, Ingress)
- Deployment pulls container image and starts pods
- Service is accessible via Ingress
- Reports sync status
6. Portal Visibility - Developer sees in portal:
- Service registered in catalog
- Link to Git repository
- CI build status
- Deployment status from ArgoCD
- Infrastructure status from Terraform
- Running pods from Kubernetes
- Metrics and logs
The platform orchestrates this entire workflow. IaC provides the execution engine for infrastructure. GitOps provides the workflow for safely applying changes. The portal provides the interface and orchestration.
Key Takeaways
For Engineering Leaders:
- IaC and GitOps are not just technical implementations; they enable audit trails, reproducibility, and safety at scale
- Investment in these foundations pays dividends through reduced incidents, faster recovery, and easier compliance
- These patterns work together - IaC without GitOps lacks workflow; GitOps without IaC lacks execution
- The combination enables true self-service while maintaining governance
For Platform Engineers:
- IaC should be structured in modules that encode organizational best practices, not just tool wrappers
- State management is critical - invest in remote state, locking, and encryption from day one
- GitOps reconciliation provides self-healing infrastructure that reverts drift automatically
- Testing infrastructure code is as important as testing application code
For Both:
- Declarative beats imperative for infrastructure - desired state is easier to reason about than procedures
- Git as single source of truth provides audit trail, rollback capability, and review process
- The reconciliation pattern (continuously comparing desired vs. actual state) is powerful beyond Kubernetes
- These patterns enable the self-service capabilities that developers access through portals
Looking Ahead
We’ve now explored the complete stack: from platform fundamentals through architecture and abstraction philosophy, to the interface layer (developer portals) and the foundational patterns (IaC and GitOps) that power platform capabilities.
In future posts, we’ll dive deeper into specific platform capabilities - deployment and release management, observability and monitoring, data platforms - exploring how they’re built using these foundations. We’ll also examine organizational structures, team topologies, and the cultural aspects of building successful platform teams.
The technical foundations are essential, but platform engineering succeeds or fails based on how well the platform serves its users. The tools and patterns we’ve covered enable the platform; organizational design and culture determine whether it’s adopted and valued.
This is the sixth post in a series exploring platform engineering in depth. Previous posts covered platform fundamentals, SRE relationships, platform architecture, building useful abstractions, and internal developer portals.