Your entity types and cells are the foundation everything else sits on. Get them right and relationships, the cascade, and Actions fall out naturally. Get them wrong and you'll be reshaping a graph that real Terraform state is already projecting into — a much more expensive fix.
This page is opinionated about where to draw the lines. The full field reference lives in EntityType and Cell.
Do define an entity type for the thing you have many of and manage as a population — Tenant, TenantCluster, CustomerStack, EphemeralEnv.
Don't create an entity type for a one-off pet — the shared CI account, the single data warehouse, the bastion host.
Because Terrantula's value is making a population legible: counting, placing, capping, tearing down in order. A type with exactly one instance carries all the modeling overhead and none of the payoff. If you only ever have one, it's a pet — leave it in plain Terraform and review it line-by-line. See the cattle mindset for the dividing line.
The single most consequential modeling decision is how coarse or fine an entity type is.
Do make the entity type the thing you place, count, and govern as a unit. If you ask "how many of these are active?" or "which one should the next thing land on?", that's an entity.
Don't model every Terraform resource as its own entity type. A tenant is one entity; the dozen AWS resources its module creates are not twelve entities.
Because entities are a projection of Terraform state, not a mirror of it. The graph should answer fleet questions, not re-render your terraform state list. Aim for the level at which placement and capacity decisions happen.
If a field would have the same value across the whole population, it's probably a constant in your TF module, not an entity property. If it varies per instance and you'd query or filter on it (region, plan_tier, customer_id), it's a property. If it describes the link between two entities (a per-tenant namespace), it belongs on a relationship, not the entity.
Do declare every property with an explicit type, mark the ones every instance must have as required: true, and use enum for closed sets like plan_tier: [basic, premium].
Don't leave a load-bearing field optional "to be safe", and don't model a closed set as a free-text string.
Because required + enum is validation Terrantula enforces at apply and at Action trigger time — a typo'd plan_tier fails loudly instead of silently provisioning the wrong shape. Optional fields that are really required just defer the error to runtime.
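To make the failure mode concrete, here is a minimal Python sketch of what required plus enum validation buys you. It is not Terrantula's implementation or schema; `PropertySpec`, `validate`, and the sample property list are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class PropertySpec:
    """Hypothetical stand-in for one property declaration on an entity type."""
    name: str
    type: type
    required: bool = False
    enum: Optional[tuple] = None  # closed set of allowed values, if any


def validate(specs: list, values: dict) -> list:
    """Return every violation found; an empty list means the instance is valid."""
    errors = []
    for spec in specs:
        if spec.name not in values:
            if spec.required:
                errors.append(f"{spec.name}: required property is missing")
            continue
        value: Any = values[spec.name]
        if not isinstance(value, spec.type):
            errors.append(f"{spec.name}: expected {spec.type.__name__}")
        elif spec.enum is not None and value not in spec.enum:
            errors.append(f"{spec.name}: {value!r} is not one of {list(spec.enum)}")
    return errors


# Assumed property set for a Tenant, based on the examples in the text.
TENANT_PROPERTIES = [
    PropertySpec("customer_id", str, required=True),
    PropertySpec("region", str, required=True),
    PropertySpec("plan_tier", str, required=True, enum=("basic", "premium")),
]

ok = validate(TENANT_PROPERTIES, {"customer_id": "c-1", "region": "us-east", "plan_tier": "basic"})
typo = validate(TENANT_PROPERTIES, {"customer_id": "c-2", "region": "us-east", "plan_tier": "premuim"})
missing = validate(TENANT_PROPERTIES, {"customer_id": "c-3", "plan_tier": "basic"})
```

Here `ok` is empty, `typo` reports the misspelled tier, and `missing` reports the absent region. If `region` were declared optional, the third instance would pass validation and fail later, somewhere less obvious.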
Keep the property set lean. A property earns its place if you query, filter, place, or interpolate on it. Descriptive trivia that no Action or metric reads is noise the graph has to carry forever.
Do enumerate the real lifecycle an instance moves through, and set initialState to where a fresh instance begins.
Don't collapse the lifecycle to active/inactive, and don't invent states no Action ever transitions into.
Because states are what let Terrantula answer "how many tenants are provisioning right now?", and they are what conditions gate on (for example, SuspendTenant should only run against an active tenant). A lifecycle that's too flat can't express the guards you'll want; states nobody uses are dead weight that confuses the graph view. active and failed are always available implicitly, so list only the rest. Order them in the sequence instances actually travel.
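The two jobs states do, counting and guarding, can be sketched in a few lines of Python. This is a conceptual model, not Terrantula's API: the `suspended` and `deprovisioning` states, the `ResumeTenant` Action, and every function name below are assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed lifecycle, listed in the order instances travel.
# Only "provisioning", "active", and "failed" are named in the text above.
STATES = ("provisioning", "active", "suspended", "deprovisioning", "failed")
INITIAL_STATE = "provisioning"

# Action name -> (state the entity must be in, state it moves to).
ACTION_GUARDS = {
    "SuspendTenant": ("active", "suspended"),
    "ResumeTenant": ("suspended", "active"),  # hypothetical Action
}


@dataclass
class Tenant:
    name: str
    state: str = INITIAL_STATE


def trigger(action: str, tenant: Tenant) -> None:
    """Apply an Action only if the tenant is in the state the guard requires."""
    required, target = ACTION_GUARDS[action]
    if tenant.state != required:
        raise ValueError(
            f"{action} requires state {required!r}, but {tenant.name} is {tenant.state!r}"
        )
    tenant.state = target


def count_by_state(tenants: list) -> Counter:
    """Answer fleet questions such as 'how many are provisioning right now?'."""
    return Counter(t.state for t in tenants)


fleet = [Tenant("acme"), Tenant("globex", state="active")]
trigger("SuspendTenant", fleet[1])                 # allowed: globex was active
in_flight = count_by_state(fleet)["provisioning"]  # acme is still provisioning
```

With only active/inactive there would be no way to tell a tenant that is still provisioning from one that was suspended on purpose, and the guard on `SuspendTenant` could not be expressed.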
Do define load and capacity metrics as derived, meaning computed from the graph.
Don't add a writable tenant_count property that an Action bumps by hand.
Because a derived metric tracks reality automatically as relationships are created and removed — it can't drift out of sync with the fleet. A hand-maintained counter is a pet masquerading as a metric: the first failed Action or out-of-band change desyncs it, and every placement decision after that is wrong. Metrics are the input to placement and constraints, so a stale one is worse than none.
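The drift described above is easy to reproduce. The sketch below models a derived count next to a hand-bumped counter; the `Graph` class, the `onboard` function, and the simulated failure are hypothetical and exist only to show the mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    # Set of (tenant, cluster) pairs standing in for runs_on relationships.
    runs_on: set = field(default_factory=set)

    def tenant_count(self, cluster: str) -> int:
        """Derived metric: recomputed from relationships on every read."""
        return sum(1 for _, c in self.runs_on if c == cluster)


graph = Graph()
manual_counter = {"cluster-a": 0}  # the anti-pattern: a writable count


def onboard(tenant: str, cluster: str, fail_before_counter: bool = False) -> None:
    """Create the relationship, then bump the hand-maintained counter."""
    graph.runs_on.add((tenant, cluster))
    if fail_before_counter:
        # Simulates an Action that dies halfway through.
        raise RuntimeError("action failed before the counter was bumped")
    manual_counter[cluster] += 1


onboard("t1", "cluster-a")
try:
    onboard("t2", "cluster-a", fail_before_counter=True)
except RuntimeError:
    pass

derived = graph.tenant_count("cluster-a")  # 2: reflects both relationships
manual = manual_counter["cluster-a"]       # 1: already out of sync
```

After a single failed Action the two numbers disagree, and any placement decision reading the manual counter now believes the cluster has more room than it does.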
Do express hard limits as constraints against a metric, at the layer where they belong.
Don't leave "we cap at fifty per cluster" in a Confluence page and trust everyone to check it.
Because deprovisioning rot and capacity-as-tribal-knowledge are the exact failures Terrantula exists to kill. A constraint is enforced before the placement that would breach it — the fifty-first onboard is rejected, not discovered on the bill. A runbook is enforced by hope.
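"Enforced before the placement" means the check runs ahead of the mutation, so a rejected request leaves the graph untouched. A minimal sketch, assuming a cap of fifty tenants per cluster as in the example above; `ConstraintViolation` and `place` are hypothetical names, not Terrantula's API.

```python
class ConstraintViolation(Exception):
    """Raised when a placement would breach a declared limit."""


MAX_TENANTS_PER_CLUSTER = 50  # the "fifty per cluster" cap from the text


def place(tenant: str, cluster: str, runs_on: dict) -> None:
    """Check the constraint first; only mutate the graph if it holds."""
    members = runs_on.setdefault(cluster, set())
    if len(members) + 1 > MAX_TENANTS_PER_CLUSTER:
        raise ConstraintViolation(
            f"{cluster} already hosts {len(members)} tenants "
            f"(max {MAX_TENANTS_PER_CLUSTER})"
        )
    members.add(tenant)


runs_on = {}
for i in range(50):
    place(f"tenant-{i}", "cluster-a", runs_on)

try:
    place("tenant-50", "cluster-a", runs_on)  # the fifty-first onboard
    rejected = False
except ConstraintViolation:
    rejected = True
```

The fifty-first call raises and the cluster still holds exactly fifty tenants, which is the difference between a constraint and a runbook entry.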
A cell groups entities of one type, ranks them for placement, and caps the group. The boundary question is: what set of candidates does "where does the next one go?" choose among?
Do draw a cell around a homogeneous, interchangeable pool — prod-clusters-us-east, prod-clusters-eu-west — where any member is a valid home for the next tenant.
Don't put dev and prod clusters in one cell, or mix us-east and eu-west if a tenant's region_preference means it can only land in one.
Because the cell is the candidate set for placement. If members aren't interchangeable, least-loaded will happily place a US tenant on an EU cluster. Split the cell along the axis that constrains placement — usually region, environment, or tier — so every member really is a valid target.
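The misplacement is mechanical, not a bug in the policy. The sketch below runs the same least-loaded choice over a mixed cell and over a region-scoped cell; the cluster names, loads, and the `least_loaded` helper are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str
    region: str
    tenant_count: int


def least_loaded(cell: list) -> Cluster:
    """Pick the member with the fewest tenants; the cell is the candidate set."""
    return min(cell, key=lambda c: c.tenant_count)


clusters = [
    Cluster("use-1", "us-east", 40),
    Cluster("use-2", "us-east", 35),
    Cluster("euw-1", "eu-west", 5),
]

# One cell for everything: the policy has no idea region matters.
mixed_cell = clusters

# One cell per region: every member is a valid home for a tenant of that region.
cells_by_region = {}
for cluster in clusters:
    cells_by_region.setdefault(cluster.region, []).append(cluster)

wrong = least_loaded(mixed_cell)                  # euw-1, for a US tenant
right = least_loaded(cells_by_region["us-east"])  # use-2
```

The policy did its job both times. Only the second cell gave it a candidate set in which every answer was acceptable.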
| Policy | Use when |
|---|---|
| least-loaded (default) | You want to fill clusters evenly toward capacity; the usual cattle default. |
| round-robin | You want even distribution regardless of current load. |
| random | Placement is genuinely arbitrary and you want no hot spot. |
Do default to least-loaded and only change it for a specific reason.
Because least-loaded packs the fleet sensibly and surfaces capacity pressure where it actually is. The others are for narrower needs; reach for them deliberately.
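To see how the policies diverge, the following sketch simulates six placements onto a cell that starts unevenly loaded. The policy functions are simple textbook versions written for this example and are not taken from Terrantula; tie-breaking and seeding behavior in particular are assumptions.

```python
import itertools
import random


def least_loaded(load: dict) -> str:
    """Choose the member with the lowest current load (first one wins ties)."""
    return min(load, key=load.get)


def make_round_robin(members: list):
    """Cycle through members in order, ignoring how loaded they are."""
    cycle = itertools.cycle(members)
    return lambda load: next(cycle)


def make_random(seed=None):
    """Choose uniformly at random among members."""
    rng = random.Random(seed)
    return lambda load: rng.choice(list(load))


def simulate(policy, load: dict, placements: int) -> dict:
    """Apply a policy repeatedly and return the resulting load per member."""
    load = dict(load)  # work on a copy
    for _ in range(placements):
        load[policy(load)] += 1
    return load


start = {"a": 4, "b": 0, "c": 1}

after_least_loaded = simulate(least_loaded, start, 6)
# {'a': 4, 'b': 4, 'c': 3}: the gap closes

after_round_robin = simulate(make_round_robin(list(start)), start, 6)
# {'a': 6, 'b': 2, 'c': 3}: the original imbalance is preserved

after_random = simulate(make_random(seed=0), start, 6)
```

Least-loaded steers new work toward the emptiest members, while round-robin hands out equal shares and leaves the starting imbalance in place. That is why it makes sense only when current load is not what you are trying to balance.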
Do cap the individual instance on the entity type and the fleet on the cell.
Don't rely on one layer to do both jobs.
Because the per-instance constraint (max: 50 per cluster) and the cell aggregate (sum max: 500 across the fleet) answer different questions — "is this cluster full?" versus "is the fleet full?". You usually want both: a fleet of clusters each well under its own ceiling can still hit a global budget, and vice versa.
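The two limits can be modeled as two independent checks. This sketch uses the numbers from the paragraph above (50 per cluster, 500 across the cell); `can_place` and the sample fleets are hypothetical.

```python
PER_CLUSTER_MAX = 50  # entity-type constraint: is this cluster full?
CELL_SUM_MAX = 500    # cell aggregate: is the fleet full?


def can_place(cluster: str, load: dict) -> tuple:
    """Check both layers; return (allowed, reason)."""
    if load[cluster] + 1 > PER_CLUSTER_MAX:
        return False, "cluster is full"
    if sum(load.values()) + 1 > CELL_SUM_MAX:
        return False, "fleet is full"
    return True, "ok"


# Twenty clusters at 25 tenants each: every cluster is half empty,
# but the fleet total is exactly 500.
roomy_clusters = {f"cluster-{i}": 25 for i in range(20)}
fleet_check = can_place("cluster-0", roomy_clusters)  # (False, 'fleet is full')

# Two clusters, fleet total of 60: the budget is fine, one cluster is not.
small_fleet = {"cluster-a": 50, "cluster-b": 10}
full_cluster_check = can_place("cluster-a", small_fleet)   # (False, 'cluster is full')
other_cluster_check = can_place("cluster-b", small_fleet)  # (True, 'ok')
```

Neither check subsumes the other: the first scenario passes the per-cluster test on every member and still has to be rejected, and the second is well under the fleet budget while one member is at its ceiling.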
Do add members to a cell explicitly (you choose which clusters belong) until you have a clear, stable property to derive on.
Don't reach for derived membership before you need it.
Because explicit membership is predictable — you can see exactly what's in the cell — and it's how most fleets stay. Derived membership (computed from a property value) is powerful when cluster count grows or churns, but it's an optimization to adopt once the rule is obvious, not a starting point.
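The trade-off shows up the moment a new cluster joins the fleet. In this sketch, explicit membership is a hand-written set of names and derived membership is a property match; the `Cluster` fields and the `derived_members` helper are assumptions for illustration, not Terrantula's syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str
    region: str
    environment: str


def derived_members(fleet: list, **match) -> set:
    """Membership computed from property values, e.g. region='us-east'."""
    return {
        c.name
        for c in fleet
        if all(getattr(c, key) == value for key, value in match.items())
    }


fleet = [
    Cluster("use-1", "us-east", "prod"),
    Cluster("use-2", "us-east", "prod"),
    Cluster("use-dev", "us-east", "dev"),
]

# Explicit: you can read exactly what is in the cell.
explicit = {"use-1", "use-2"}

# Derived: the rule decides, so it has to be the right rule.
derived = derived_members(fleet, region="us-east", environment="prod")

# A new prod cluster appears.
fleet.append(Cluster("use-3", "us-east", "prod"))
derived_after = derived_members(fleet, region="us-east", environment="prod")
```

After the append, `derived_after` includes `use-3` automatically while `explicit` does not change until someone edits it. A looser rule that matched on region alone would also have pulled `use-dev` into a prod cell, which is why derived membership is worth adopting only once the rule is obvious.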
A kind enum. If kind: [tenant, cluster, database] lives inside one type, you've smeared three populations together. Split into three types, because they have different lifecycles, metrics, and Actions.
A cluster_id string property on Tenant is a relationship pretending to be a property; you lose cardinality enforcement and the cascade. Model it as a runs_on relationship.
Next: Relationship & cascade design, which connects the entities so teardown happens in the right order.