Actions & PR hygiene

Actions are where Terrantula stops being a read-only catalog and starts doing things — but always within the rules, and always through a pull request. This page is about designing Actions that are correct and PRs that humans can review and CI can apply without surprises.

The non-negotiable that shapes everything here:

Actions open pull requests; they never run terraform apply

An Action validates parameters, evaluates conditions, enforces constraints, and runs placement, all before the PR opens. Then its trigger commits files and opens a PR (or fires a run against a runner you already operate). Your CI applies on merge. Terrantula never executes Terraform. Design every Action as "produce a reviewable change," not "make the change."

The full field reference is in the Action reference; the substrate-specific trigger details are in the Triggers reference.

Scope the Action to the verb

Do match associatedWith.scope to the verb: collection for "create a new one," instance for "modify this existing one."

associatedWith:
  entityType: Tenant
  scope: collection     # OnboardTenant — make a new tenant

Don't make one mega-Action that both creates and mutates depending on a parameter.

Because scope decides where the Action surfaces: collection Actions appear at the type level ("onboard a new tenant"), instance Actions attach to each existing entity ("suspend this tenant"). One Action per verb keeps the operator's choices unambiguous and keeps each Action's PR small and single-purpose.
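For contrast with the collection-scoped example above, here is a minimal sketch of the instance-scoped counterpart. It uses only fields shown on this page; the name SuspendTenant in the comments comes from the examples here, and the rest of the Action (operation, trigger) is omitted.

```yaml
# Fragment of a hypothetical SuspendTenant Action.
associatedWith:
  entityType: Tenant
  scope: instance       # SuspendTenant: attaches to each existing tenant
conditions:
  - field: entity.state
    operator: eq
    value: active       # offered only while the tenant is active
```

Kept as two separate Actions, OnboardTenant surfaces at the type level and SuspendTenant surfaces on each tenant, so neither needs a "mode" parameter.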

Gate availability with conditions

Do declare conditions so an Action only offers itself when it makes sense:

conditions:
  - field: entity.state
    operator: eq
    value: active        # only suspend a tenant that's currently active

Don't rely on the operator "knowing not to" run an Action in the wrong state.

Because conditions are evaluated before the Action is offered — a SuspendTenant on an already-suspended tenant simply isn't available. This moves a class of mistakes out of the runtime and into the model, and it documents the precondition where the next person will read it.

Declare the lifecycle at every phase

Do set onTrigger, onSuccess, and onFailure on every operation:

operation:
  type: create-entity
  entityType: Tenant
  onTrigger: provisioning   # state while the PR is open
  onSuccess: active         # state once the apply lands
  onFailure: failed         # state if the run fails

Don't leave failure state implicit or reuse the success state for failure.

Because the three phases are what make the lifecycle explicit and recoverable. onTrigger is the "in flight, PR open" state; onSuccess is reached only when the change actually applies; onFailure parks the entity somewhere a human can see it went wrong. Without a distinct onFailure, a failed apply leaves the graph claiming success — the read-only projection stops being honest.

Let placement choose the target — don't hardcode it

Do invoke a cell via recommendations and interpolate the result:

recommendations:
  - name: target-cluster
    title: Target cluster
    cell: prod-clusters
    sortBy: tenant-count
    order: asc
    required: true
# ... then reference {{ recommendations.target-cluster.* }} downstream

Don't make the operator type a cluster ID into a parameter.

Because placement is the fleet decision Terraform was never going to make. A recommendations block ranks the cell's members by a metric and proposes the best target, and the cell's constraints reject the run if the fleet is full — before the PR opens. A hand-typed cluster ID skips capacity enforcement and re-introduces the placement-by-gut you adopted Terrantula to kill.

Reference secrets — never inline a credential

Do reference secrets by name in trigger auth:

trigger:
  type: pull-request
  auth:
    type: token
    token: "{{ secrets.github-token }}"

Don't paste a token literal into the Action YAML.

Because a secret is encrypted at rest and never appears in your catalog, version control, or logs. The catalog is config you commit; a literal token there is a credential in your git history forever. Set the value out-of-band with terrantula secrets set-value after applying.


PR hygiene — make the change reviewable and safely appliable

A merged Action PR runs through your CI and applies to your infrastructure. The reviewer is a human; the applier is your pipeline. Both deserve a clean change.

Write titles and bodies a reviewer can scan

Do interpolate the meaningful facts into the PR title:

title: "Onboard {{ parameters.customer_id }} → {{ recommendations.target-cluster.name }}"

Don't open every PR titled "Terrantula change."

Because a reviewer reading a PR list should see what changed and where without opening it: which tenant, which cluster, which environment. Put the run reference in the body ({{ run.id }} is safe to embed) so the PR ties back to the ActionRun audit trail.
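A minimal sketch of a title and body together. The body field name is an assumption (check the Triggers reference for the real key); the interpolations are the ones used elsewhere on this page.

```yaml
title: "Onboard {{ parameters.customer_id }} → {{ recommendations.target-cluster.name }}"
# `body` is an assumed field name, not confirmed by this page.
body: |
  Onboards tenant {{ parameters.customer_id }}
  onto cluster {{ recommendations.target-cluster.name }}.

  ActionRun: {{ run.id }}
```

Putting the run ID on its own line makes it easy to search for when tracing a PR back to its ActionRun.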

Keep the file change minimal and idempotent

Do commit the smallest diff that expresses the change. For a monolithic tfvars/yaml you don't want to refactor, use a patch. For a fleet that keeps one file per tenant, write that file with operation: replace:

files:
  - path: "tenants/{{ parameters.customer_id }}.tfvars.json"
    operation: replace
    content: |
      { "customer_id": "{{ parameters.customer_id }}", "region": "{{ recommendations.target-cluster.properties.region }}" }

Don't rewrite a whole shared file to add one tenant.

Because a minimal diff is a reviewable diff — the reviewer sees exactly the tenant being added, not a thousand-line reformat. patch mode (json-merge, json-array-append, yaml-key) lets a cattle workflow extend a monolithic file without refactoring to one-file-per-tenant and preserves surrounding comments and formatting, so the diff stays surgical.
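As a rough sketch of the patch alternative: the mode name json-array-append comes from the list above, but the mode and target keys, the shared file path, and the shape of content are all assumptions made for illustration. Check the Triggers reference for the actual patch schema.

```yaml
files:
  - path: "tenants.auto.tfvars.json"   # hypothetical shared file
    operation: patch
    mode: json-array-append            # assumed key name for the patch mode
    target: tenants                    # assumed: the array that receives the new element
    content: |
      { "customer_id": "{{ parameters.customer_id }}", "region": "{{ recommendations.target-cluster.properties.region }}" }
```

The intent is that the resulting diff adds one array element and leaves every other line of the shared file untouched.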

Route review with labels and reviewers

Do set labels, reviewers, or teamReviewers so the PR lands in front of the right people.

Because an Action PR is a real change to production infrastructure; it should follow the same review path as any other PR to that repo. Routing it automatically means the cattle workflow doesn't bypass your governance — it feeds into it.
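A minimal sketch of routing fields on a pull-request trigger. The three keys are the ones named above; the list-of-strings shape and every value shown are assumptions.

```yaml
# Fragment of a pull-request trigger. Value shapes are assumed to be lists of strings.
labels:
  - terrantula             # hypothetical label
  - tenant-onboarding      # hypothetical label
reviewers:
  - alice                  # hypothetical username
teamReviewers:
  - platform-team          # hypothetical team slug
```

If the repo already uses code-owner rules, labels alone may be enough; the explicit reviewers are for cases where the Action targets a path those rules don't cover.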

Wire auto-completion so runs don't hang

Do set webhookSecret (and configure the repo webhook) when you want the run to complete automatically on merge:

webhookSecret: "{{ secrets.github-webhook-secret }}"

Don't ship a pull-request trigger expecting auto-completion without the webhook in place.

Because auto-completion on merge requires both the repo webhook and the server's webhook secret. Without them the ActionRun stays in the running state until something posts to the callback URL, and the entity never transitions to active. If you don't wire the webhook, plan to complete runs explicitly.

Apply through your existing CI — don't add a Terrantula-only apply path

Do let the merged PR run through the same CI that applies every other change to that repo. For multi-stack flows that need a post-merge kick, use postMergeDispatch to fire your existing apply workflow.

Don't stand up a separate "Terrantula apply" pipeline, and never look for a way to have Terrantula apply directly.

Because the whole point is that Terrantula sits on top of the runner you already operate. Your CI, your runner, your review process stay in charge; Terrantula adds the structure on top. A parallel apply path is a second source of truth waiting to drift.

One Action, many substrates

Do keep the operation, recommendations, and conditions substrate-agnostic and let the trigger block be the only thing that knows about your runner. Swapping type: pull-request for type: terraform-cloud, atlantis, or atmos-workflow should change only the trigger.

Because the trigger is the only part that knows your runner — and substrate order is Terraform first, Atmos as a peer, OpenTofu second. Keeping the placement and lifecycle logic above the trigger means the same Action runs on whatever you happen to run, and migrating substrates is a trigger edit, not a rewrite. Use envOverrides to point dev/staging/prod at different repos or workspaces from one definition.
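To illustrate "a trigger edit, not a rewrite," here is a sketch of the same Action with two alternative trigger blocks. The type values come from the list above; that terraform-cloud accepts the same auth shape, and the secret name tfc-token, are assumptions. Substrate-specific fields are omitted.

```yaml
# Variant A: open a pull request (as shown earlier on this page)
trigger:
  type: pull-request
  auth:
    type: token
    token: "{{ secrets.github-token }}"
---
# Variant B: same Action, different runner. Only this block changes.
trigger:
  type: terraform-cloud
  auth:                                # assumed to mirror the pull-request auth shape
    type: token
    token: "{{ secrets.tfc-token }}"   # hypothetical secret name
```

The operation, recommendations, and conditions blocks are deliberately absent from both variants, because they should be identical.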

Track every run — it's your audit trail

Do treat the ActionRun record as the source of truth for who fired what, with what parameters, and how it ended.

Because every Action firing is recorded as an ActionRun. That's the audit trail that replaces the Slack thread and the spreadsheet — don't reinvent it elsewhere.

Common mistakes (the anti-patterns)

  • A trigger that applies. There is no terraform apply in any Action. If you find yourself wanting one, the answer is postMergeDispatch into your CI, not direct apply.
  • Hardcoded placement. A cluster ID in a parameter skips capacity enforcement. Use recommendations.
  • A literal token in the YAML. Credentials live in secrets, not in committed config.
  • Giant rewrite diffs. Reformatting a shared file to add one tenant makes the PR unreviewable. Use patch or one-file-per-tenant.
  • Auto-completion with no webhook. The run hangs in the running state and the entity never goes active.

Next: Multi-tenancy & self-hosting. Run the cattle wedge for many teams, on your own infrastructure.