§ 3.04 · NCA
status: in progress · phase 8.6 → 9 · last update 2026-04-17

NCA Ecology — emergence under learned rules

Rules → emergence, with AI as part of the rule itself. An open-ended exploration: no target shape is declared up front; what the kernel can become is the question, and the surprises along the branches are the answer.

// 0 · why constraint geometry

A neural cellular automaton is a tiny update rule (≈8K parameters) applied locally to every grid cell, run for many steps, trained end-to-end through the rollout. The interesting design surface is the loss. The kernel learns whatever physics the loss asks it to preserve.
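To make "tiny update rule applied locally" concrete, here is a minimal forward-pass sketch, not the project's actual kernel. The channel count (16), hidden width (128), Sobel perception filters and periodic boundary are all assumptions, picked so the parameter count lands near the ≈8K quoted above. Training through the rollout needs an autodiff framework and is left out.

```python
import numpy as np

# Assumed sizes, chosen only so the parameter count lands near 8K.
CHANNELS, HIDDEN = 16, 128

def perceive(state):
    """Stack the state with its Sobel-x and Sobel-y responses on a periodic grid."""
    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float) / 8.0
    features = [state]
    for kernel in (sobel_x, sobel_x.T):
        response = np.zeros_like(state)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                weight = kernel[dy + 1, dx + 1]
                if weight != 0.0:
                    # Rolling by (-dy, -dx) reads the neighbour at offset (dy, dx).
                    response += weight * np.roll(state, (-dy, -dx), axis=(0, 1))
        features.append(response)
    return np.concatenate(features, axis=-1)  # shape (H, W, 3 * CHANNELS)

def init_params(rng):
    w1 = rng.normal(0.0, 0.1, size=(3 * CHANNELS, HIDDEN))
    b1 = np.zeros(HIDDEN)
    w2 = np.zeros((HIDDEN, CHANNELS))  # zero output layer: untrained rule is the identity
    return w1, b1, w2

def nca_step(state, params):
    """One local update: perceive, per-cell MLP with ReLU, residual add."""
    w1, b1, w2 = params
    hidden = np.maximum(perceive(state) @ w1 + b1, 0.0)
    return state + hidden @ w2

def rollout(state, params, steps):
    for _ in range(steps):
        state = nca_step(state, params)
    return state

params = init_params(np.random.default_rng(0))
n_params = sum(p.size for p in params)  # 48*128 + 128 + 128*16 = 8320
```

The same weights are shared by every cell, which is why the rule stays this small regardless of grid size.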

What makes this path different from a Houdini PDE solver is that the loss can target quantities that Euler–Lagrange machinery cannot express: trajectory statistics, cross-scale information, learned perceptual losses. SGD can reach them; closed-form variational physics cannot.

What follows reads as a lab notebook. Each card is one direction the constraint geometry was pushed in, kept whether or not it converged. The page is messy on purpose; a tidied-up version would lose the side-trips, and the side-trips are where the surprises live: a trivial solution that turns out visually rich (§ 1.04), a transient window of structure the loss curve never noticed (§ 1.03), an attractor that only appears at one specific resolution (§ 4.03). The pattern that emerges across the cards is the actual finding. The same small loss family produces dunes, mycelium, vertical wires, labyrinths, coffee-oil holes, and tree bark, each at its own bend in the design space.

constraint geometry
────────────────────────────────────────────────────────────────

   classical PDE                    this project
   ──────────────                   ────────────────────
   write the equation               write the constraints
   integrate                        SGD searches a rule
                                    that satisfies them all

   reaches                          reaches
   = equations humans have          = any autodiff-friendly
     written down                     function, including
                                      PDE-unwritable ones:
                                       · trajectory statistics
                                       · cross-scale information
                                       · topological invariants
                                       · learned perceptual losses
§ 1 · early loss design

Texture field

Pre-architectural-conservation runs. Loss-as-physics: pick three terms, watch what the kernel does. Most outputs are dunes and fingerprints; a handful of side-runs find something stranger.

§ 1.01 · conservation_physics_005 · stable · uninteresting

Dune baseline — three-term loss

First loss that produced a non-trivial steady state. Three terms: soft mass conservation, multi-scale spatial heterogeneity (encourage structure), anti-stasis (penalise dead frames).

L = λ₁·(ΣΔM)² − λ₂·Var(M; scales 2,4,8) − λ₃·‖Mₜ − Mₜ₋₁‖
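A sketch of how the three terms could be evaluated on one pair of consecutive frames. The readings here are assumptions, since the formula leaves them open: ΣΔM is taken as the change in total mass, Var(M; scales 2,4,8) as the variance of the block-averaged field averaged over the three block sizes, and the last norm as a plain L2 norm.

```python
import numpy as np

def block_mean(field, size):
    """Average-pool a 2D field over non-overlapping size x size blocks."""
    h, w = field.shape
    return field.reshape(h // size, size, w // size, size).mean(axis=(1, 3))

def three_term_loss(m_prev, m_next, lam=(1.0, 1.0, 1.0), scales=(2, 4, 8)):
    lam1, lam2, lam3 = lam
    # Soft mass conservation: penalise any change in total mass.
    mass_drift = (m_next.sum() - m_prev.sum()) ** 2
    # Multi-scale heterogeneity (assumed form): variance after block averaging.
    heterogeneity = np.mean([block_mean(m_next, s).var() for s in scales])
    # Anti-stasis: reward frame-to-frame change.
    motion = np.linalg.norm(m_next - m_prev)
    return lam1 * mass_drift - lam2 * heterogeneity - lam3 * motion
```

A uniform, frozen field scores exactly zero; any structure or motion that keeps total mass fixed pushes the loss negative, which is the direction the optimiser follows.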

Result: fingerprint/dune at every λ ratio. Sets the ceiling of what residual-ΔM kernels with this loss family can reach. Used as the baseline that every later card compares against.

frames: 005 · 2k, 005 · 5k, 005 · 10k, 003 · 3k, 003 · 5k, 002, 006

§ 1.02 · advection_diffusion_020 · stable · clean

Diagonal fingerprint — naming the channels

Channels are given physical identity (M = mass, V = velocity). The conservation term becomes an advection–diffusion tension pair: linear advection wants to grow structure, quadratic Dirichlet caps it. Small ∇M is advection-dominated, large ∇M is Dirichlet-dominated.

L_phys = λ_adv·‖∇·(M·V)‖ + λ_dir·‖∇M‖² ; λ_adv = 1, λ_dir = 0.1
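The crossover between the two regimes follows from how each term scales with amplitude. The sketch below assumes central differences on a periodic grid, an RMS norm for the advection term and a mean-squared gradient for the Dirichlet term; none of these discretisation choices come from the run itself.

```python
import numpy as np

def ddx(f):
    """Central difference along x (axis 1), periodic boundary."""
    return 0.5 * (np.roll(f, -1, axis=1) - np.roll(f, 1, axis=1))

def ddy(f):
    """Central difference along y (axis 0), periodic boundary."""
    return 0.5 * (np.roll(f, -1, axis=0) - np.roll(f, 1, axis=0))

def physics_terms(m, vx, vy):
    div_flux = ddx(m * vx) + ddy(m * vy)            # divergence of the mass flux M*V
    advection = np.sqrt(np.mean(div_flux ** 2))     # un-squared norm: linear in amplitude
    dirichlet = np.mean(ddx(m) ** 2 + ddy(m) ** 2)  # squared norm: quadratic in amplitude
    return advection, dirichlet

def physics_loss(m, vx, vy, lam_adv=1.0, lam_dir=0.1):
    advection, dirichlet = physics_terms(m, vx, vy)
    return lam_adv * advection + lam_dir * dirichlet

# Doubling the amplitude of M doubles the advection term but quadruples the Dirichlet term.
x = np.arange(64)
wave = np.tile(np.sin(2 * np.pi * x / 64), (64, 1))
vx, vy = np.ones_like(wave), np.zeros_like(wave)
adv_1, dir_1 = physics_terms(wave, vx, vy)
adv_2, dir_2 = physics_terms(2.0 * wave, vx, vy)
```

With a fixed velocity field the ratio of the two terms grows linearly with the size of ∇M, so the quadratic term always takes over above some gradient magnitude.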

First time the kernel learned a non-trivial physics over named channels. Pattern is cleaner than the dune baseline because the velocity field picks a direction; the diagonal hatching is stable across the entire training window.

frames: 020 · 2k, 020 · 4k, 020 · 6k, 020 · 8k, 020 · 10k, 018, 022, 016, 024, 026

§ 1.03 · conservation_physics_008 · transient · interesting in window

Tents → vertical texture → collapse

Same loss as 1.01 but trained 2× longer (20k steps instead of 10k). What the loss curve hides: the kernel transiently invents two structures before pool collapse erases everything.

Steps 700-3000: triangle/tent structures (early dune-formation phase, never noticed at 10k cutoff). Steps 8400-11200: a different attractor with vertical hatching that conservation_005 never reached. Past step 14000 every loss term decays to zero and the field goes uniform. Pool diversity collapses faster than the rule can find something stable. The interesting visual structure exists only in the window between two failures.

frames: 0.7k · tent, 1k · tent, 2k, 3k, 5k, 7k, 8.4k · vertical, 9k · vertical, 10k, 11.2k, 14k · decay, 20k · dead

§ 1.04 · dissipative_physics_004 · trivial static solution · visually best in this section

Mycelium mesh — degenerate but rich

Adds a radial source/sink injection at every NCA step. After step 1000 the kernel learns a precise static compensation function (delta ≈ −source + sink); conservation and anti-stasis both decay to zero and only heterogeneity is still gradient-active.

Technically a degenerate solution. The kernel doesn't actually do dynamics. But because the source/sink field is spatially non-uniform, the static compensation it learns is also non-uniform, and heterogeneity decorates that scaffold with two visually distinct attractors: bright blob clouds (mode A) and dark mycelium-mesh patches (mode B). Same training, alternating per pool state. The most-photographed run of this phase, even though the rule is doing the wrong thing for the right reasons.

frames: 004 · 0.5k, 004 · 1k, 004 · 2k · A, 004 · 3k · B, 004 · 4k, 004 · 5k, 004 · 6k, 004 · 7k, 004 · 8k · mesh, 004 · 9k, 004 · 10k, 003 · 5k, 002 · 5k
rollout · 256²

§ 1.05 · physical_channels_009 · failure · visually striking

Radial dots — mass over-concentration

An attempt at a stronger spatial-heterogeneity term inside the physical-channels architecture. The loss gets eaten by point-like attractors: mass over-concentrates into bright dots and short radial streaks on a near-empty background, and every gradient is shouting in the same direction.

Counted as a failure because the heterogeneity is fake (it's variance from the empty background, not from real structure). Kept here because the radial-dot phase by itself is a useful visual that the rest of the project hasn't reproduced any other way.

frames: 009 · 1k, 009 · 3k, 009 · 5k, 009 · 7k, 009 · 10k, 007, 008, 012, 014

§ 1.06 · source_sink_baseline_012 · trivial · pass-through

Field-chasing — the kernel memorises the input

Source/sink without the strict-conservation guard from 1.04. The cheapest way to satisfy the loss is to copy the input field back out, and the kernel converges exactly there. A bright disk that traces the radial source.

Mid-training (steps 6000-8000) it briefly explored half-dune patches in the high-flux regions. Those frames are the only thing worth keeping from this run.

frames: 012 · 1k, 012 · 3k, 012 · 5k, 012 · 6k · patches, 012 · 7k · patches, 012 · 8k, 012 · 10k · trivial, 010, 008

§ 2 · phase 8.5

Free-energy loss

Replace the tension-pair with an explicit Ginzburg-Landau free energy. Buys scale invariance: the same 64² checkpoint runs unchanged at 1024². The visual quality stays mid until the next section adds multi-scale perception.

§ 2.01 · phase85_gl · partial · scale-invariant lines

Ginzburg-Landau labyrinths

Different loss family. Replace the tension pair with an explicit free-energy functional and a width-setting Dirichlet term:

F = U(M) − T·H(M) + γ·M²(1−M)² + (κ/2)·‖∇M‖²

The double-well γ-term creates two phases; the Dirichlet κ-term sets the interface width. Sweep over dw / gr / T. Most variants look mid; a few of them (dw=20, gr=10) lock into stable labyrinths with multi-pixel walls. Bonus: the same 64² checkpoint runs unchanged at 1024², because line widths are scale-invariant.
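The way the two terms trade off to fix an interface width can be checked numerically. This sketch covers only the γ and κ terms of F; U(M) and the −T·H(M) entropy term are not specified in this card and are left out. The tanh stripe profile, the forward-difference gradient and the values γ = 1, κ = 8 are illustrative assumptions, not the sweep's settings.

```python
import numpy as np

def gl_terms(m, gamma, kappa):
    """Mean double-well and Dirichlet energy densities on a periodic grid."""
    double_well = gamma * np.mean(m ** 2 * (1.0 - m) ** 2)
    gx = np.roll(m, -1, axis=1) - m  # forward differences
    gy = np.roll(m, -1, axis=0) - m
    dirichlet = 0.5 * kappa * np.mean(gx ** 2 + gy ** 2)
    return double_well, dirichlet

def stripe(size, width):
    """A periodic stripe bounded by two tanh interfaces of the given width (pixels)."""
    x = np.arange(size, dtype=float)
    left = np.tanh((x - size / 4) / width)
    right = np.tanh((x - 3 * size / 4) / width)
    return np.tile(0.5 * (left - right), (size, 1))

def interface_energy(width, gamma=1.0, kappa=8.0, size=256):
    return sum(gl_terms(stripe(size, width), gamma, kappa))

# A wide interface pays the double-well cost, a sharp one pays the gradient cost.
energies = {w: interface_energy(w) for w in (1.0, 4.0, 16.0)}
```

For this profile family the continuum minimum sits at width √(2κ/γ), which is 4 pixels for the values above: the width depends only on the ratio of the two weights, not on the grid size.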

panels: dw20 gr10, dw20 gr5, dw10 gr5, dw50 gr25, dw only 10, dw only 20, dir only 0.5, dir only 2
GL phase-field sweep · red = single-term controls, blue = combined

§ 2.02 · energy_landscape_diagnostic · design note

What each loss term does

Six-panel reference plot from when the GL loss was being put together. Top row: the bare free energy, the M²(1−M)² double-well barrier, and the logistic reaction term (which is in the dynamics, not the loss). Bottom row: how adding the double-well term builds wells, how Dirichlet sets the interface width, and the two terms combined giving a finite-width interface energy. Useful as a sidebar; not visually special.

energy-landscape diagnostic · matplotlib

§ 3 · phase 8.6

Multi-scale perception — coffee-oil

DyNCA-style perception pyramid (DoG at scales 1, 2, 4, 8, 16) applied to the GL loss. Worm width scales with the field. The best-looking attractor in the project came out of this section. Most of the surrounding ablations confirm how narrow the good window is.

§ 3.01 · phase86_coffee_oil · best attractor of the project

Coffee-oil — multi-scale perception

Add a DyNCA-style perception pyramid: the same DoG kernel applied at scales 1, 2, 4, 8, 16, summed before the MLP. Same loss family as 2.01, same kernel size; the worm width scales from ~10px at 64² baseline to ~30-60px at 256² pyramid.
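A rough sketch of a difference-of-Gaussians pyramid summed across scales. The FFT-based periodic blur and the 1.6 ratio between the two Gaussians in each band are assumptions for illustration; the project's actual perception filters may differ.

```python
import numpy as np

def gaussian_blur(field, sigma):
    """Periodic Gaussian blur via the FFT, using the Gaussian's transfer function."""
    h, w = field.shape
    fy = np.fft.fftfreq(h)[:, None]   # cycles per pixel along y
    fx = np.fft.rfftfreq(w)[None, :]  # cycles per pixel along x
    transfer = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.fft.irfft2(np.fft.rfft2(field) * transfer, s=field.shape)

def dog_pyramid(field, scales=(1, 2, 4, 8, 16), ratio=1.6):
    """Band-pass responses at each scale, plus their sum (the perception input)."""
    bands = [gaussian_blur(field, s) - gaussian_blur(field, ratio * s) for s in scales]
    return np.sum(bands, axis=0), bands

field = np.random.default_rng(0).random((64, 64))
summed, bands = dog_pyramid(field)
```

Each band has zero mean, so the summed response carries only structure, at widths from roughly one pixel up to the coarsest scale, and nothing about the average mass level.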

The phase-separation indicator f₀+f₁ climbs from 0.20 at the single-scale baseline to 0.65 at 256² pyramid. Currently the most cinematic attractor in the project.

panels: pilot 64², g128 pyramid, g256 pyramid, no-grad (control), low-grad 0.25
resolution scan · 64² → 256² with DoG pyramid

§ 3.02 · phase86_coffee_oil ablation · control sweep

GL-strength ablation — accidentally tree bark

Same 64² pilot but with the gradient term scaled down (0.25× and 0×). The full GL pilot gives round coffee-oil holes. Cutting gradient weight by 4× gives fewer, longitudinal holes. Removing it entirely gives jagged tree-bark fragments. Scale-transferring up to 128² breaks down. The bottom row is the M-S (mean–subtract) reference for comparison, which gives the egg-cell look.

GL-strength ablation · grid 64² + scale-transfer at grid 128²

§ 3.03 · phase86_stage_a · convergent · narrow design surface

Pyramid scale × σ_blur sweep

Holding the coffee-oil loss fixed, sweep the perception pyramid's scale set and σ_blur. All seven variants land in roughly the same attractor (fine intricate worms), with minor differences in line density. Useful negative result: once the loss is right, the attractor is not very sensitive to the pyramid shape.

panels: a0 · 1,2,4,8 | a1 · +16 | a2 · +32 | a3 · skip 8 | b1 · σ 4 | b2 · σ 8 | b3 · σ 16
Stage A pyramid scan · warm-start from ng128_sb2_dw6_gr0.1

§ 3.04 · phase86_nograd_multiscale · failure family · useful as control

No-gradient multi-scale zoo

Drop the Dirichlet gradient term while keeping the multi-scale perception. Without the gradient term, the loss stops being a free-energy minimiser and reduces to a multi-scale heterogeneity maximiser. A whole zoo of attractors emerges, none with the cinematic line quality of 3.01. Worth keeping for the variety: the σ_blur and double-well-strength axes are the only things the rule can hold onto, and they bend the texture in oddly specific ways.

panels: sb2 dw4, sb2 dw6, sb3 dw4, sb3 dw6, sb4 dw4, sb4 dw6, ng128 sb2 dw6, ng128 sb3 dw4, ng256 sb2 dw6, ng256 sb3 dw4
no-gradient multiscale summary · σ_blur × dw_strength

§ 4 · phase 8.6

Environment as source/sink

The frontier of the project as of April 2026. Inject an fBM environment field at every NCA step and ask whether the rule conditions its trajectory on it. Inference-only injection fails; training-time injection works and gives the first conditioned-trajectory evidence.

§ 4.01 · phase86_env_init · negative result

Inference-only injection — the rule erases it

Train without environment, then inject a fractional-Brownian-motion field at inference (R = relu(R + α·fBM) once per step). The hope: the trained kernel uses R as a passive source/sink and the output texture conditions on it.
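A minimal sketch of the injection step relu(R + α·fBM). The environment field here is built by spectral synthesis (power-law amplitudes with random phases), a common approximation to fractional Brownian motion; the Hurst exponent of 0.5, the unit-variance normalisation and the helper names are assumptions rather than the project's generator.

```python
import numpy as np

def fbm_field(size, hurst=0.5, seed=0):
    """Approximate 2D fBM via spectral synthesis: amplitude ~ f^-(hurst + 1)."""
    rng = np.random.default_rng(seed)
    fy = np.fft.fftfreq(size)[:, None]
    fx = np.fft.rfftfreq(size)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    radius[0, 0] = 1.0                # placeholder to avoid dividing by zero at DC
    amplitude = radius ** (-(hurst + 1.0))
    amplitude[0, 0] = 0.0             # drop the DC component: zero-mean field
    noise = rng.normal(size=amplitude.shape) + 1j * rng.normal(size=amplitude.shape)
    field = np.fft.irfft2(noise * amplitude, s=(size, size))
    return field / field.std()        # unit variance, so alpha sets the injection scale

def inject(r, env, alpha):
    """One injection step: add the scaled environment, clamp at zero."""
    return np.maximum(r + alpha * env, 0.0)

env = fbm_field(64, seed=1)
r = np.full((64, 64), 0.001)
r_next = inject(r, env, alpha=0.003)
```

Because the field is zero-mean, the injection acts as a source where fBM is positive and as a sink where it is negative; the relu only bites where the sink would drive R below zero.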

The kernel erases the injected field within ~45 steps. The end state is indistinguishable from the training distribution. Heterogeneity has to enter at training time. Stage A is a global attractor.

frames: init · α·fBM, final · erased, alt seed

§ 4.02 · phase86_env_train_64 · first conditioned-trajectory evidence

Training under environment — four α, four attractors

Now inject fBM at every NCA step during training. Four warm-started checkpoints diverge as α scales 0.001 → 0.008. The first literal evidence that the same learned rule traces different trajectories under different ambient fields.

R ← relu(R + α·fBM) ; α ∈ {0.001, 0.003, 0.005, 0.008}

panels: α = 0.001, α = 0.003, α = 0.005, α = 0.008
α sweep · M field (left) ↔ environment R (right)

§ 4.03 · phase86_env_train_256 · frontier · best of phase 8.6

256² beauty shot — vertical wires

Same recipe at 256². α = 0.003 lands on a dense vertical-wire attractor that the 64² runs only hinted at. Same kernel, same training loss, same family. The wider field gives the rule room to lay out finer line structure.

frames: α = 0.003 · final, α = 0.002 · sibling, α = 0.003 · 100-frame rollout

§ 5 · side-trips

Other directions tried

Loose ends and dead-ends kept here for completeness: points in design space that were explored, found narrow, and not pursued into the main line.

§ 5.01 · flow_lenia_phase_a2 · architectural conservation · narrow attractor

Flow-Lenia translation-invariant capsule

Conservation moved from a soft penalty into the architecture (reintegration tracking transport). Same tension-pair philosophy, different substrate. Result: a translation-invariant spatially-localised pattern. Top 5% of pixels hold 100% of mass; cross-position correlation 0.99+ inside the training distribution.

A clean point in design space, but everything about this attractor is the same shape; there's nothing to vary. Listed here for completeness. Every later card uses soft conservation instead.
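The "top 5% of pixels hold 100% of mass" figure is a concentration statistic. One way such a number could be computed is sketched below; the function name and the rounding of the pixel count are assumptions, not the project's diagnostic code.

```python
import numpy as np

def top_fraction_mass(m, fraction=0.05):
    """Share of total mass held by the brightest `fraction` of pixels."""
    flat = np.sort(m.ravel())[::-1]              # brightest first
    k = max(1, int(round(fraction * flat.size)))
    total = flat.sum()
    return float(flat[:k].sum() / total) if total > 0 else 0.0

# A compact capsule on an empty background versus a uniform field.
capsule = np.zeros((20, 20))
capsule[8:12, 8:12] = 1.0                        # 16 of 400 pixels carry all the mass
capsule_share = top_fraction_mass(capsule)
uniform_share = top_fraction_mass(np.ones((20, 20)))
```

A fully localised pattern saturates this statistic at 1.0, while a uniform field returns the fraction itself, which is why a saturated value leaves nothing to vary.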

frames: a2 · 2k, a2 · 5k, a2 · 10k, a · 5k, a · 10k, b · 5k, b2 · 5k

// 6 · open problems
// 7 · around this — reading list & rabbit-holes

The territory this project sits inside. A casual reading list of things I've read, watched, or kept coming back to. Mix of canonical papers, Wikipedia explainers, author talks, and YouTube popularisations.

Project code lives at github.com/Boning1011/Rule-Emergence-AI. Working notes inside are first-person and stay there.

sections: this project's direct lineage · classical CA / attractors · non-equilibrium thermodynamics & pattern formation · adjacent ideas · talks, videos, related projects