Terraform + Terragrunt + Ansible: A Hands-On Learning Journey

Started by joomlamz, 25 May 2026, 13:00


Hello, webmastersmz.com community! Let's dive into infrastructure automation with Terraform, Terragrunt, and Ansible. These tools are fundamental for any DevOps team that wants to simplify and standardize how it manages development, test, and production environments.

**Terraform** is an infrastructure-as-code tool for creating, managing, and provisioning infrastructure (originally open source; since 2023 HashiCorp has distributed it under the Business Source License). You describe the infrastructure you want in configuration files, then run commands that create and manage it in the cloud or on local servers. This reduces complexity and makes IT resources easier to manage.

**Terragrunt** is a tool built on top of Terraform that adds a layer for managing configuration and state. It lets you manage multiple Terraform configurations and define the remote state settings in one central place, which is especially useful in complex setups with many teams and projects.

**Ansible** is a configuration automation tool for managing and configuring servers and applications. You write playbooks that define the configuration tasks, and Ansible runs them against multiple servers in parallel. This shortens setup time and makes production environments more consistent.

Together, Terraform, Terragrunt, and Ansible cover both halves of the job: provisioning the infrastructure and configuring what runs on it. With them you can build development and production environments that are consistent and scalable, and manage the infrastructure efficiently and securely.

Now I invite you to discuss the advantages and disadvantages of each tool and how they can be integrated into your projects. Security and access management also deserve attention in any infrastructure automation setup. What is your experience with these tools? How have they helped or hindered your work?


I recently got interview feedback that changed how I approach learning:

"You've used these tools, but the technical depth wasn't there."

Instead of just reading documentation, I decided to build a real multi-environment infrastructure setup from scratch — dev, staging, and prod — using Terraform, Terragrunt, and Ansible. This post is a walkthrough of what I built, why each decision was made, and what I actually learned along the way.



The Problem with Single-Environment Thinking


Up until this point, my Terraform workflow looked like this:

write main.tf → terraform apply → done

That works fine for a single environment. But in a real company, code never goes directly to production. There's always a pipeline:


Dev — developers experiment here, things can break, no real users


Staging — production mirror, QA tests here before release


Prod — real users, real traffic, every mistake costs something

When you try to scale your single main.tf to three environments, three problems appear immediately.

Problem 1: Code duplication. You copy main.tf into environments/dev, environments/staging, and environments/prod. Now you have three identical files. When you add a new resource to dev, you have to manually copy it to the other two. Forget once — your environments silently drift apart.

Problem 2: State file collisions. Terraform records the current state of your infrastructure in a file called terraform.tfstate. If all three environments write to the same S3 path, a dev apply can overwrite the prod state. After that, Terraform either loses track of the prod resources or plans to modify and destroy them as if they belonged to dev.

Problem 3: No access control. Without IAM isolation, any engineer with AWS credentials can accidentally run terragrunt apply in the wrong environment.

These are the three problems this lab is designed to solve.
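
Problem 1 is easy to demonstrate. The sketch below assumes the copy-paste layout described above (a root folder with dev/main.tf, staging/main.tf, and prod/main.tf); the function name check_env_drift is hypothetical and only there to illustrate how silently the copies can diverge.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which copied main.tf files differ from the dev copy.
# Assumes the duplicated layout <root>/<env>/main.tf described in Problem 1.
check_env_drift() {
  local root="$1"
  local reference="$root/dev/main.tf"
  local env drifted=0
  for env in staging prod; do
    # cmp -s is silent and returns non-zero if the files differ or one is missing
    if ! cmp -s "$reference" "$root/$env/main.tf"; then
      echo "drift: $env/main.tf differs from dev/main.tf"
      drifted=1
    fi
  done
  return "$drifted"
}
```

Calling check_env_drift environments from the repository root exits non-zero as soon as one copy is out of sync. Having to write a check like this at all is a sign the layout is working against you, which is the motivation for the single _base/main.tf used in the rest of this lab.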



Project Architecture


Here's the full directory structure we're building:

terraform-ansible/
├── _base
│   ├── main.tf        # single Terraform entry point, used by all environments
│   └── modules
│       ├── ec2
│       │   ├── main.tf
│       │   ├── outputs.tf
│       │   └── variables.tf
│       ├── sg
│       │   ├── main.tf
│       │   ├── outputs.tf
│       │   └── variables.tf
│       └── vpc
│           ├── main.tf
│           ├── outputs.tf
│           └── variables.tf
├── ansible
│   ├── ansible.cfg
│   ├── group_vars
│   │   ├── env_dev.yml
│   │   ├── env_prod.yml
│   │   └── env_staging.yml
│   ├── inventory
│   │   └── aws_ec2.yml   # dynamic inventory — AWS tag based
│   ├── playbooks
│   │   └── provision.yml
│   └── roles
│       ├── common
│       │   └── tasks
│       │       └── main.yml
│       └── webserver
│           ├── handlers
│           │   └── main.yml
│           └── tasks
│               └── main.yml
└── live
    ├── dev
    │   └── terragrunt.hcl # dev-specific values
    ├── prod
    │   └── terragrunt.hcl # prod-specific values
    ├── staging
    │   └── terragrunt.hcl # staging-specific values
    └── terragrunt.hcl     # root config: S3 backend, state locking

The flow looks like this:

terragrunt apply (live/dev)

├── reads live/terragrunt.hcl        → generates backend.tf automatically
├── reads live/dev/terragrunt.hcl    → gets environment-specific inputs
├── runs _base/main.tf               → provisions VPC, SG, EC2
└── triggers null_resource           → runs Ansible playbook automatically



Step 1: Terraform Modules — Reusable Infrastructure Components


Modules are Terraform's way of packaging reusable infrastructure. Instead of writing the same VPC configuration in every environment, you write it once as a module and call it with different parameters.

Each module follows the same three-file pattern:


variables.tf — what inputs the module accepts


main.tf — what resources it creates


outputs.tf — what values it exposes to the caller

Here's the EC2 module as an example:

modules/ec2/variables.tf

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
}

variable "environment" {
  type = string
}

variable "subnet_id" {
  type = string
}

variable "sg_id" {
  type = string
}

variable "key_name" {
  description = "SSH key pair name"
  type        = string
}

modules/ec2/main.tf

data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-*-x86_64"]
  }
}

resource "aws_instance" "main" {
  ami                    = data.aws_ami.amazon_linux.id
  instance_type          = var.instance_type
  subnet_id              = var.subnet_id
  vpc_security_group_ids = [var.sg_id]
  key_name               = var.key_name

  tags = {
    Name        = "${var.environment}-server"
    Environment = var.environment
    ManagedBy   = "terraform"
    Project     = "terraform-lab"
  }
}

modules/ec2/outputs.tf

output "instance_id" {
  value = aws_instance.main.id
}

output "public_ip" {
  value = aws_instance.main.public_ip
}

The VPC and Security Group modules follow the same pattern. The key insight: modules are just functions. They take inputs, create resources, and return outputs.



Step 2: _base/main.tf — The Single Entry Point


All three environments use this exact file. It calls the modules and accepts all variable values from outside — from Terragrunt:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

variable "environment"   { type = string }
variable "vpc_cidr"      { type = string }
variable "instance_type" { type = string }

# A single-line block may hold only one argument, so these two are multi-line
variable "key_name" {
  type    = string
  default = "terraform-lab-key"
}

variable "region" {
  type    = string
  default = "eu-central-1"
}

# Uses the region variable declared above
provider "aws" {
  region = var.region
}

# Module paths are relative to this file: the modules live in _base/modules
module "vpc" {
  source      = "./modules/vpc"
  vpc_cidr    = var.vpc_cidr
  environment = var.environment
}

module "sg" {
  source      = "./modules/sg"
  vpc_id      = module.vpc.vpc_id
  environment = var.environment
}

module "ec2" {
  source        = "./modules/ec2"
  instance_type = var.instance_type
  environment   = var.environment
  subnet_id     = module.vpc.subnet_id
  sg_id         = module.sg.sg_id
  key_name      = var.key_name
}

resource "null_resource" "ansible_provision" {
  depends_on = [module.ec2]

  # Re-run the provisioner whenever the instance is replaced
  triggers = {
    instance_id = module.ec2.instance_id
  }

  provisioner "local-exec" {
    # /path/to/ansible is a placeholder for the ansible/ directory of this repo
    command = <<-EOT
      echo "Waiting for instance to be ready..."
      sleep 30
      cd /path/to/ansible && \
      ansible-playbook playbooks/provision.yml -e "target_env=${var.environment}"
    EOT
  }
}

output "instance_id" { value = module.ec2.instance_id }
output "public_ip"   { value = module.ec2.public_ip }
output "vpc_id"      { value = module.vpc.vpc_id }

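Notice that _base/main.tf hardcodes nothing environment-specific: no instance type, no CIDR block, no environment name. Those values all come from outside (key_name and region only carry defaults that an environment can override). This is what makes the file reusable across environments.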



Step 3: Terragrunt — Solving the Multi-Environment Problem


Terragrunt is a thin wrapper around Terraform. It doesn't replace Terraform — it just removes the need to duplicate main.tf across environments by injecting environment-specific values at runtime.

Think of _base/main.tf as a function. Terragrunt calls that function with different arguments for each environment.



Root config


live/terragrunt.hcl is written once and inherited by all environments:

locals {
  # get_terragrunt_dir() returns the directory of the terragrunt.hcl being run
  # (the child config, e.g. live/dev), even though it is evaluated in this root file.
  # basename() keeps only the last path segment: "dev", "staging", or "prod",
  # so env is set from the folder name with no hardcoding.
  env = basename(get_terragrunt_dir())
}

remote_state {
  backend = "s3"
  config = {
    bucket         = "your-tfstate-bucket"
    key            = "${local.env}/terraform.tfstate"
    region         = "eu-central-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
  generate = {
    # backend.tf is generated automatically before every apply;
    # you never write it manually
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
}

The key field is the critical part. When you run from live/dev, local.env becomes "dev", so the state is saved to dev/terraform.tfstate. From live/prod, it goes to prod/terraform.tfstate. State isolation is automatic.
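
To make the mapping concrete, here is a small shell sketch that mimics what basename(get_terragrunt_dir()) and the key template do together. The function state_key_for is hypothetical; it only mirrors the string logic and does not call Terragrunt.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring key = "${local.env}/terraform.tfstate":
# the state key is derived from the last path segment of the environment folder.
state_key_for() {
  local env
  env="$(basename "$1")"
  printf '%s/terraform.tfstate\n' "$env"
}

for dir in live/dev live/staging live/prod; do
  echo "$dir -> $(state_key_for "$dir")"
done
# live/dev -> dev/terraform.tfstate
# live/staging -> staging/terraform.tfstate
# live/prod -> prod/terraform.tfstate
```

Because the key is computed from the folder name, two environments can only collide if they share a folder name, which the directory layout already rules out.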



Per-environment config


Each environment only contains what's different — the input values:

live/dev/terragrunt.hcl

include "root" {
  # inherits everything from live/terragrunt.hcl
  path = find_in_parent_folders()
}

terraform {
  # points to the shared main.tf
  source = "../../_base"
}

inputs = {
  environment   = "dev"
  vpc_cidr      = "10.0.0.0/16"
  instance_type = "t3.micro"
  key_name      = "terraform-lab-key"
}

live/prod/terragrunt.hcl

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "../../_base"   # same main.tf
}

inputs = {
  environment   = "prod"
  vpc_cidr      = "10.2.0.0/16"
  instance_type = "t3.medium"   # only the values differ
  key_name      = "terraform-lab-key"
}

To deploy:

# Deploy only dev
cd live/dev && terragrunt apply

# Plan all environments at once
cd live && terragrunt run-all plan

# Apply all environments at once
cd live && terragrunt run-all apply
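
Applying everything at once skips the dev, staging, prod ordering described earlier. One alternative is a small wrapper that walks the environments in order and stops at the first failure. This is a sketch rather than part of the original lab: the function name promote is hypothetical, and it assumes the live/<env> layout shown above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run the given command in dev, then staging, then prod,
# stopping at the first environment where it fails.
promote() {
  local live_dir="$1"
  shift
  local env
  for env in dev staging prod; do
    echo "==> $env"
    # The subshell keeps the cd from leaking into the caller's shell
    ( cd "$live_dir/$env" && "$@" ) || {
      echo "stopped at $env" >&2
      return 1
    }
  done
}
```

Usage would look like promote live terragrunt apply. Because the command is passed as arguments, the same wrapper works for plan or for a dry run with any harmless command.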



Step 4: Ansible — Post-Provisioning Configuration


Terraform answers the question: "Does this EC2 instance exist?"

Ansible answers the question: "Is nginx installed on that instance and configured correctly?"

These are two different problems. Terraform manages infrastructure state. Ansible manages configuration state. You need both.



Dynamic inventory


Instead of hardcoding IP addresses, Ansible discovers instances by their AWS tags:

ansible/inventory/aws_ec2.yml

plugin: amazon.aws.aws_ec2

regions:
  - eu-central-1

filters:
  tag:ManagedBy:
    - terraform
  instance-state-name:
    - running

keyed_groups:
  - key: tags.Environment
    prefix: env
    separator: "_"

hostnames:
  - tag:Name
  - public-ip-address

compose:
  ansible_host: public_ip_address
  # "environment" is a reserved Ansible keyword, so the tag is exposed under another name
  ec2_environment: tags.Environment

Any running instance tagged with ManagedBy: terraform is automatically discovered. Instances are grouped by their Environment tag — so dev instances land in the env_dev group, prod in env_prod, and so on. Even if the IP address changes after a destroy/apply cycle, the inventory stays correct.



Roles


ansible/roles/common/tasks/main.yml — runs on every instance:

---
- name: Update all packages
  ansible.builtin.dnf:
    name: "*"
    state: latest

- name: Install base tools
  ansible.builtin.dnf:
    name: [git, htop, vim, wget]
    state: present

- name: Create deploy user
  ansible.builtin.user:
    name: deploy
    shell: /bin/bash
    groups: wheel
    append: true

- name: Grant deploy user sudo access
  ansible.builtin.copy:
    dest: /etc/sudoers.d/deploy
    content: "deploy ALL=(ALL) NOPASSWD:ALL\n"
    mode: "0440"
    # refuse to install a sudoers file that does not parse
    validate: /usr/sbin/visudo -cf %s

# the timezone module ships in the community.general collection, not ansible.builtin
- name: Set timezone
  community.general.timezone:
    name: Europe/Istanbul

ansible/roles/webserver/tasks/main.yml — installs and configures nginx:

---
- name: Install nginx
  ansible.builtin.dnf:
    name: nginx
    state: present

- name: Start and enable nginx
  ansible.builtin.systemd:
    name: nginx
    state: started
    enabled: true
    daemon_reload: true

- name: Create environment-specific index.html
  ansible.builtin.copy:
    dest: /usr/share/nginx/html/index.html
    content: |
      <h1>{{ app_environment }} environment</h1>
      <p>Instance: {{ ansible_facts['hostname'] }}</p>
      <p>IP: {{ ansible_facts['default_ipv4']['address'] }}</p>
    mode: "0644"
  notify: nginx restart



Playbook


---
- name: Instance provisioning
  hosts: "env_{{ target_env }}"
  become: true
  vars:
    app_environment: "{{ tags.Environment }}"

  roles:
    - common
    - webserver

Run against a specific environment:

# Only dev
ansible-playbook playbooks/provision.yml -e "target_env=dev"

# Only prod
ansible-playbook playbooks/provision.yml -e "target_env=prod"



Idempotency test


One of Ansible's core properties is idempotency — running the same playbook twice should produce the same result. The second run should show changed=0:

# First run
ansible-playbook playbooks/provision.yml -e "target_env=dev"
# → ok=10  changed=8  failed=0

# Second run — nothing changes
ansible-playbook playbooks/provision.yml -e "target_env=dev"
# → ok=10  changed=0  failed=0

changed=0 on the second run confirms idempotency is working.
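
Checking the recap by eye works for one host, but it is easy to automate. The sketch below reads ansible-playbook output on stdin and fails if any recap line reports a change, a failure, or an unreachable host. The function name assert_idempotent is hypothetical, and the pattern assumes the usual "ok=N changed=N ..." recap format.

```shell
#!/usr/bin/env bash
# Hypothetical check: fail unless every PLAY RECAP line shows
# changed=0, failed=0 and unreachable=0.
assert_idempotent() {
  local recap
  # Keep only the per-host recap lines from the playbook output on stdin
  recap="$(grep -E 'ok=[0-9]+ +changed=[0-9]+' || true)"
  if [ -z "$recap" ]; then
    echo "no PLAY RECAP lines found" >&2
    return 2
  fi
  if echo "$recap" | grep -Eq '(changed|failed|unreachable)=[1-9]'; then
    echo "not idempotent:" >&2
    echo "$recap" >&2
    return 1
  fi
}
```

Piping the second run into it, as in ansible-playbook playbooks/provision.yml -e "target_env=dev" | assert_idempotent, turns the manual check into an exit code that a CI job can act on.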



Step 5: Connecting Everything — One Command to Rule Them All


With null_resource in _base/main.tf, running terragrunt apply automatically triggers Ansible after the EC2 instance is ready:

terragrunt apply

VPC created

Security Group created

EC2 instance running

null_resource triggers (depends_on = [module.ec2])

sleep 30 (wait for SSH to be ready)

ansible-playbook runs automatically

nginx installed, configured, running

From a single command, you get a fully provisioned and configured server.
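
The fixed sleep 30 is a guess: sometimes too long, sometimes too short. One alternative is to poll the SSH port until it accepts connections. The sketch below is an assumption-laden illustration, not what the lab uses: wait_for_port is a hypothetical name, it relies on bash's /dev/tcp redirection and the coreutils timeout command, and an open port only shows that sshd is listening, not that cloud-init has finished.

```shell
#!/usr/bin/env bash
# Hypothetical replacement for "sleep 30": poll a TCP port until it accepts
# connections or until max_wait seconds have passed.
wait_for_port() {
  local host="$1" port="$2" max_wait="${3:-120}"
  local deadline=$(( $(date +%s) + max_wait ))
  while [ "$(date +%s)" -lt "$deadline" ]; do
    # bash opens a TCP connection when redirecting to /dev/tcp/<host>/<port>;
    # timeout caps each attempt so a filtered port cannot hang the loop
    if timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
      return 0
    fi
    sleep 2
  done
  return 1
}
```

Inside the local-exec command this could replace the sleep with something like wait_for_port <public_ip> 22 120 before calling ansible-playbook. Ansible's own ansible.builtin.wait_for_connection module is another option if you prefer to keep the wait inside the playbook.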



Step 6: Proving It Works — IAM Isolation & Drift Testing




IAM isolation


A dev engineer should not be able to touch prod state files. We enforce this with IAM policies:

{
  "Effect": "Allow",
  "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
  "Resource": "arn:aws:s3:::your-tfstate-bucket/dev/*"
}

The dev IAM user can only read/write to dev/* in S3. Attempting to write to prod/*:

AWS_ACCESS_KEY_ID=dev-key AWS_SECRET_ACCESS_KEY=dev-secret \
aws s3 cp test.txt s3://your-tfstate-bucket/prod/test.txt

# An error occurred (AccessDenied) when calling the PutObject operation

Human error blocked at the policy level.
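
The statement above is a fragment; a policy you can attach needs the Version and Statement wrapper around it. The sketch below renders one per environment. The function name render_state_policy is hypothetical, and the second statement (s3:ListBucket limited to the environment's prefix) is my assumption about what the S3 backend typically needs in addition to object access. Permissions for the DynamoDB lock table are not covered here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a complete IAM policy document that limits
# state access to one environment's prefix in the state bucket.
render_state_policy() {
  local bucket="$1" env="$2"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::${bucket}/${env}/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::${bucket}",
      "Condition": { "StringLike": { "s3:prefix": ["${env}/*"] } }
    }
  ]
}
EOF
}

render_state_policy your-tfstate-bucket dev
```

Generating the document from the environment name keeps the three policies structurally identical, so the only difference between the dev and prod policy is the prefix.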



Drift test


Add a new tag (here, Project) to the tags block in modules/ec2/main.tf:

tags = {
  Name        = "${var.environment}-server"
  Environment = var.environment
  ManagedBy   = "terraform"
  Project     = "terraform-lab"    # new tag
}

Run run-all plan to see the change propagated to all three environments simultaneously:

cd live && terragrunt run-all plan

# Plan: 0 to add, 1 to change, 0 to destroy  (dev)
# Plan: 0 to add, 1 to change, 0 to destroy  (staging)
# Plan: 0 to add, 1 to change, 0 to destroy  (prod)

One file changed. Three environments updated. No manual copying, no risk of forgetting one.
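
Since one change now fans out to every environment, it is worth scanning the plan summaries before applying. The sketch below flags any summary line that would destroy resources. plan_has_destroy is a hypothetical name, and the pattern assumes the "Plan: N to add, N to change, N to destroy" summary format shown above.

```shell
#!/usr/bin/env bash
# Hypothetical guard: read plan output on stdin and succeed (exit 0)
# only if at least one summary line would destroy something.
plan_has_destroy() {
  grep -Eq 'Plan: [0-9]+ to add, [0-9]+ to change, [1-9][0-9]* to destroy'
}
```

A typical use would be terragrunt run-all plan 2>&1 | tee plan.log | plan_has_destroy && echo "destroys pending, review plan.log". The pattern is not anchored to the start of the line, so it still matches if the output lines carry a prefix.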



Key Takeaways


After building this from scratch, here's what actually clicked for me:

Terraform and Ansible solve different problems. Terraform manages infrastructure state — "does this resource exist in AWS?" Ansible manages configuration state — "is nginx installed and running on that server?" You need both because provisioning a server and configuring it are fundamentally different concerns.

Terragrunt's value isn't magic; it's discipline. The single _base/main.tf enforces consistency. Staging and prod can differ only in the inputs you pass them, because there is only one source of truth for the resources themselves. Drift between the environments' Terraform code becomes structurally impossible rather than just unlikely. Manual changes made outside Terraform can still drift, which is what plan is for.

IAM policy is the last line of defense. Engineers make mistakes. The cd live/prod && terragrunt apply accident will happen eventually. When it does, the question is whether your IAM policy stops it before it reaches your infrastructure.

Idempotency is a property you verify, not assume. Running the playbook twice and checking for changed=0 isn't just a test — it's how you know your automation is actually reliable.

All code from this lab is available on GitHub. If you spot something that could be done better, I'd genuinely love to hear it in the comments.

