NARC Inspector

NARC puzzles are designed so that neither the grid sequence nor the narrative clue is sufficient on its own — but together they uniquely determine the answer. The experiments below probe different dimensions of this narrative–visual interaction. Masking tests whether models can reconstruct hidden grids. Ordering tests whether narratives help recover temporal structure. Odd-one-out tests whether narratives help models recognize which grids belong together. Stances tests how the same visual pattern responds to different narrative framings.


Ordering experiment: All grids are shown unmasked, in shuffled order, and the model must recover the correct chronological sequence. Each puzzle is tested under two conditions: grids only, and grids + narrative. Agreement with the true order is measured by Kendall's τ (−1 = exactly reversed, 0 = no correlation with the true order, +1 = perfect). Narrative lift = τ with narrative minus τ without, so it ranges from −2 to +2.

τ = Kendall's tau correlation    Lift = τ(grids+narrative) − τ(grids only)    Positive lift = narrative helps ordering
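As a rough illustration of the metric defined above, the following Python sketch computes Kendall's τ between a predicted ordering and the true ordering, and the resulting narrative lift. It assumes the model's answer is a permutation of the grid labels with no ties (in which case tau-a and tau-b coincide). The function names `kendall_tau` and `narrative_lift` are illustrative and are not taken from the NARC codebase.

```python
from itertools import combinations


def kendall_tau(predicted, truth):
    """Kendall's tau between two orderings of the same items (no ties assumed)."""
    if len(predicted) < 2 or sorted(predicted) != sorted(truth):
        raise ValueError("orderings must contain the same items, at least two")
    # Position of each item in the true chronological order.
    true_rank = {item: i for i, item in enumerate(truth)}
    ranks = [true_rank[item] for item in predicted]
    concordant = discordant = 0
    for i, j in combinations(range(len(ranks)), 2):
        if ranks[i] < ranks[j]:
            concordant += 1  # this pair is in the correct relative order
        else:
            discordant += 1  # this pair is swapped
    return (concordant - discordant) / (concordant + discordant)


def narrative_lift(tau_with_narrative, tau_grids_only):
    """Lift = tau(grids + narrative) - tau(grids only)."""
    return tau_with_narrative - tau_grids_only


truth = ["g0", "g1", "g2", "g3"]
tau_grids_only = kendall_tau(["g3", "g2", "g1", "g0"], truth)  # fully reversed
tau_with_story = kendall_tau(["g0", "g1", "g2", "g3"], truth)  # fully correct
print(narrative_lift(tau_with_story, tau_grids_only))
```

A model that reverses the sequence without the narrative and orders it perfectly with the narrative gets the maximum lift of +2.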

The Lullaby (narc_prism_011): narrative lift +2.000 τ. Narrative dramatically improves sequence recovery.
The Dial (narc_ai_003): narrative lift +1.500 τ. Narrative dramatically improves sequence recovery.
The Merge (narc_ai_008): narrative lift +1.500 τ. Narrative dramatically improves sequence recovery.
gpt-oss-120b avg τ: 0.665
gpt-oss-20b avg τ: 0.46
nemotron-3-super avg τ: 0.499
qwen3.5-122b avg τ: 0.668
Puzzles tested: 70
narc_prism_011 The Lullaby AI lift: +2.000
The child's mind is a patchwork of the day's colors. As the parent begins to sing, the red thoughts fade first -- the argument at school dissolves into darkness. 'Hush now,' the parent whispers, and next the orange flickers of the evening TV go dark. In the third verse, the parent sings about the garden, and oddly the yellow dandelions the child picked that afternoon vanish from his mind, but the blue sky and the green grass stubbornly persist -- those were the happiest memories. It takes one more verse before the blue drains away, leaving only a few green embers of the grass where he played. By the last verse, even those fade, and the child sleeps in perfect darkness.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -1.000 1.000 +2.000
gpt-oss-20b -1.000 1.000 +2.000
nemotron-3-super -1.000 1.000 +2.000
qwen3.5-122b -1.000 1.000 +2.000
narc_ai_003 The Dial AI lift: +1.500
Dr. Liang turned the temperature dial slowly toward zero and watched the display change. At high τ, the softmax probabilities were nearly uniform — every row's values barely distinguishable. As τ dropped, the diagonal elements grew brighter, pulling probability mass toward themselves while the margins dimmed. She knew what was coming: as τ→0, each row would collapse entirely onto its diagonal element, converging to a one-hot encoding of the argmax. The pre-softmax logits had their maximum along the main diagonal. She reached for the dial again.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -0.333 1.000 +1.333
gpt-oss-20b 0.000 0.667 +0.667
nemotron-3-super -1.000 1.000 +2.000
qwen3.5-122b -1.000 1.000 +2.000
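The temperature behaviour this narrative describes can be sketched in a few lines of Python. This is only an illustration of temperature-scaled softmax on a hypothetical row of logits; the function name `softmax_with_temperature` and the example values are assumptions, not the puzzle's actual grids.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max before exponentiating to avoid overflow
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical row whose largest logit sits on the diagonal (index 0).
row = [3.0, 1.0, 1.0]
for temperature in (100.0, 1.0, 0.01):
    probs = softmax_with_temperature(row, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At high temperature the three probabilities are almost equal; as the temperature approaches zero the row approaches a one-hot vector on the argmax.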
narc_ai_008 The Merge AI lift: +1.500
Yusuf had been staring at the token stream for hours when the pattern clicked. Byte-pair encoding merges the most frequent adjacent pair at each step, replacing both with a single new token. Step 1 merged the pair (1,2) at positions 0-1 and 4-5 into token 7 (orange). Step 2 merged (3,4) at positions 1-2 and 3-4 in the step-1 output into token 8 (azure). Step 3 would merge (7,8) at positions 0-1 and 1-2 in the step-2 output into token 9 (maroon) — the whole sequence compressing to just two symbols. He leaned back and exhaled.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -1.000 1.000 +2.000
gpt-oss-20b 0.000 0.000 +0.000
nemotron-3-super -1.000 1.000 +2.000
qwen3.5-122b -1.000 1.000 +2.000
narc_gap_202 The Pale Blue Dot Revisited AI lift: +1.500
We zoom out from a scene in four stages. Each grid is half the size of the previous: 6x6, then 3x3, then the masked grid, then 1x1. The downsampling rule: each cell in the smaller grid is the MAXIMUM value of the corresponding 2x2 block in the larger grid. Grid 3 (the second zoom-out, position 2) is masked. It should be a 2x2 grid derived by taking max of each 2x2 block from grid at position 1 (the 3x3 grid, padded with 0 in the missing row/column to make it even). Since 3x3 can't evenly divide, we pad it to 4x4 with zeros, then take 2x2 blocks.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -1.000 1.000 +2.000
gpt-oss-20b -1.000 0.000 +1.000
nemotron-3-super -1.000 0.000 +1.000
qwen3.5-122b -1.000 1.000 +2.000
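The downsampling rule in this narrative is 2x2 max pooling with zero padding for odd-sized grids. A minimal Python sketch follows; it assumes the padding row and column are appended at the bottom and right, which the narrative implies ("the missing row/column") but does not state outright. The function name `max_pool_2x2` and the sample grid are illustrative.

```python
def max_pool_2x2(grid):
    """Halve a grid by taking the max of each 2x2 block, zero-padding odd sizes."""
    rows, cols = len(grid), len(grid[0])
    padded_rows = rows + rows % 2  # round up to the next even number
    padded_cols = cols + cols % 2
    padded = [
        [grid[r][c] if r < rows and c < cols else 0 for c in range(padded_cols)]
        for r in range(padded_rows)
    ]
    return [
        [
            max(padded[r][c], padded[r][c + 1], padded[r + 1][c], padded[r + 1][c + 1])
            for c in range(0, padded_cols, 2)
        ]
        for r in range(0, padded_rows, 2)
    ]


# Hypothetical 3x3 grid: padded to 4x4 with zeros, then pooled to 2x2.
grid_3x3 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(max_pool_2x2(grid_3x3))
```

Applying the function repeatedly reproduces the size sequence in the narrative: 6x6, 3x3, 2x2, 1x1.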
narc_011 The Heist AI lift: +1.434
Three gems sit in a vault: ruby left, emerald center, sapphire right. The plan: move each gem one cell toward the door at the top, one per night. But on the third night, the inside man swaps the emerald for a fake — a grey stone — before it advances.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -0.333 0.867 +1.200
gpt-oss-20b -0.867 0.600 +1.467
nemotron-3-super -0.467 0.867 +1.334
qwen3.5-122b -0.867 0.867 +1.734
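The per-puzzle "AI lift" figure appears to be the plain mean of the four per-model lifts. This is an inference from the numbers shown, not something the page states. A minimal check on the values in the table above, with the helper name `mean_lift` chosen for illustration:

```python
def mean_lift(per_model_lifts):
    """Average the per-model lifts for one puzzle."""
    return sum(per_model_lifts) / len(per_model_lifts)


# Per-model lifts for narc_011 (The Heist), copied from the table above.
heist_lifts = [1.200, 1.467, 1.334, 1.734]
print(f"{mean_lift(heist_lifts):+.3f}")  # prints +1.434
```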
narc_prism_001 The Captain's Log AI lift: +1.350
Day 1: I loaded the hold with crates of fish (blue) along the top row and barrels of wine (maroon) along the bottom. Day 2: A merchant in port traded me sacks of grain (yellow) for the fish -- I replaced the entire top row with grain. Day 3: Storm at sea! The grain got soaked and spoiled. I tossed it overboard and left that row empty (black). Day 4: We docked at the spice island. I filled the empty top row with saffron (orange) and also replaced the center cell of each remaining row with saffron, since the merchant there insisted on a 'column of gold' down the middle. Day 5: Arriving home, I unloaded everything except the saffron column -- the spice merchant at the home port wanted proof of origin, so I kept those three center cells orange and cleared the rest to black.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -0.600 1.000 +1.600
gpt-oss-20b 0.000 1.000 +1.000
nemotron-3-super -0.600 1.000 +1.600
qwen3.5-122b -0.200 1.000 +1.200
narc_prism_005 The Understudies AI lift: +1.300
Opening night at the Imperial Theatre. The lead performs in red while the understudy waits offstage in magenta. The grey curtain borders frame the stage. Between scenes, the two rehearse together, their colors mixing in the spotlight. But at the start of Act Two, the lead collapses and is carried off. The understudy steps into the role completely — no trace of red remains on the stage. She fills the same two-by-two center blocking, alone. For the curtain call, the stage lights dim to just the top half while the lower curtain drops to black.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -0.800 1.000 +1.800
gpt-oss-20b -0.600 -0.600 +0.000
nemotron-3-super -0.600 1.000 +1.600
qwen3.5-122b -0.800 1.000 +1.800
narc_ai_007 The Dead Path AI lift: +1.100
Backpropagation through the ReLU at layer 3 zeros the gradient for all neurons whose forward-pass pre-activation was non-positive. In this layer, the pre-activations in the top two rows were negative, so their gradient contributions are exactly zero. The bottom three rows retain their gradient magnitudes from the previous layer unchanged.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -0.733 0.867 +1.600
gpt-oss-20b 0.333 0.333 +0.000
nemotron-3-super -0.867 0.867 +1.734
qwen3.5-122b -0.733 0.333 +1.066
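The gradient rule in this narrative is the standard ReLU backward pass: the upstream gradient passes through where the forward pre-activation was positive and is zeroed elsewhere. The Python sketch below uses hypothetical 5x2 values in which the top two rows have negative pre-activations; the function name `relu_backward` and the numbers are assumptions for illustration.

```python
def relu_backward(upstream_grad, pre_activation):
    """Pass the gradient through where pre-activation > 0, zero it elsewhere."""
    return [
        [g if z > 0 else 0 for g, z in zip(grad_row, z_row)]
        for grad_row, z_row in zip(upstream_grad, pre_activation)
    ]


# Hypothetical layer: top two rows had negative pre-activations.
pre_activation = [[-1, -2], [-3, -1], [2, 1], [4, 3], [1, 5]]
upstream_grad = [[7, 7], [6, 6], [5, 4], [3, 2], [1, 9]]
print(relu_backward(upstream_grad, pre_activation))
```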
narc_043 Halfway There AI lift: +0.982
Sixteen steps from the wall became eight, then four, then two. At the fifth moment, one step remains but cannot be taken. The arrow flies forever through halves of halves, and the distance left is always more than nothing.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b -0.071 0.929 +1.000
nemotron-3-super -1.000 -0.071 +0.929
qwen3.5-122b -1.000 1.000 +2.000
narc_focal_008 The Ugly Duckling AI lift: +0.950
A pond scene. A row of ducks (yellow=7) swim in formation. One duckling (grey=5) trails behind, visually distinct. Over time the other ducks move away from the grey duckling, isolating it. In the masked grid, the transformation occurs: the grey duckling becomes a swan (azure=8), now standing apart from the flock with a new value. The yellow ducks remain unchanged.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.800 1.000 +0.200
gpt-oss-20b 0.200 1.000 +0.800
nemotron-3-super 0.200 1.000 +0.800
qwen3.5-122b -1.000 1.000 +2.000
narc_focal_003 Little Red Riding Hood AI lift: +0.834
A cottage sits in the woods. Grandmother (magenta=6) rests in bed at the top-right. The wolf (grey=5) approaches the cottage from the left while Little Red Riding Hood (red=1) is still far away on the path. In the masked grid, the wolf has entered the cottage and replaced grandmother in the bed — grandmother's cell now shows the wolf's value, while grandmother has vanished. Red Riding Hood continues approaching, unaware.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -1.000 1.000 +2.000
gpt-oss-20b -1.000 -0.333 +0.667
nemotron-3-super -1.000 -0.333 +0.667
qwen3.5-122b 1.000 1.000 +0.000
narc_prism_003 The Confession AI lift: +0.833
I need to confess what I did to my garden. Week 1: every plot was thriving -- all green (color 3). Week 2 is when I made my first mistake. I was supposed to water the whole garden, but I only watered the bottom two rows (rows 2-3). The top two rows (rows 0-1) wilted to yellow (color 4). I also left my lawnmower parked on the cell at row 2, column 0 -- it crushed that one plant, turning it red (color 2). Everything else in the bottom two rows stayed green. Week 3: I panicked. I overwatered everything. The yellow cells in the top drowned and died -- they turned red. The green cells in the bottom rows all wilted to yellow from overwatering. And that one red cell from the lawnmower? Dead beyond recovery -- it went black (color 0). Week 4: I replanted the whole garden. I managed to restore most of it to green, except the entire top row (row 0) where the soil was too damaged -- those stayed red.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.000 1.000 +1.000
gpt-oss-20b 0.000 1.000 +1.000
nemotron-3-super 0.000 1.000 +1.000
qwen3.5-122b 0.667 1.000 +0.333
narc_prism_004 The Negotiation AI lift: +0.800
Two nations dispute a 5x5 territory. Nation A (blue, color 1) holds the west; Nation B (red, color 2) holds the east. Black cells are unclaimed. Round 1: A controls column 0, B controls column 4; the middle three columns are disputed (black). Round 2: Each nation claims one more column inward -- A takes column 1, B takes column 3. Column 2 remains disputed. Round 3: Nation A's envoy proposes a bold concession: A will surrender its claim on the BOTTOM TWO ROWS (rows 3-4) of columns 0-1, making them black again, in exchange for the entire disputed column 2 turning blue. B's delegate accepts, because B cares more about the southern corridor. So in round 3: column 2 goes fully blue, but A's territory in the bottom two rows of columns 0-1 is vacated to black. B's territory is unchanged. Round 4: Final treaty. B fills the vacated southern cells (rows 3-4 of columns 0-1) with red, claiming the southern corridor. Column 2 remains blue. Everything else stays the same. Round 5: A neutral peacekeeping zone is declared along the entire column 2 border -- all cells in column 2 change from blue to grey (color 5) to mark demilitarized territory.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.200 1.000 +0.800
gpt-oss-20b -0.200 1.000 +1.200
nemotron-3-super -0.200 1.000 +1.200
qwen3.5-122b 1.000 1.000 +0.000
narc_focal_027 Plate Collision AI lift: +0.750
Two tectonic plates drift toward each other at centimeters per year. Neither plate perceives the collision. But at the boundary, rock buckles upward, forming mountains that neither plate intended to create.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.667 1.000 +0.333
gpt-oss-20b 0.667 1.000 +0.333
nemotron-3-super 0.667 1.000 +0.333
qwen3.5-122b -1.000 1.000 +2.000
narc_sp_018 Spectrum 18 AI lift: +0.750
A 1D cellular automaton with 12 cells and wrap-around boundaries. The update rule is: new[i] = left XOR (center OR right), where left=cell[i-1], center=cell[i], right=cell[i+1], with indices wrapping. Grid 0 is generation 0, Grid 1 is generation 1, Grid 2 is generation 2, Grid 3 (masked) is generation 3.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.667 1.000 +0.333
gpt-oss-20b -0.333 1.000 +1.333
nemotron-3-super 0.667 1.000 +0.333
qwen3.5-122b 0.000 1.000 +1.000
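The update rule quoted in this narrative, new[i] = left XOR (center OR right) with wrap-around, can be sketched directly in Python. The starting row below (a single live cell in 12) is a hypothetical generation 0, not the puzzle's actual grid, and the function name `ca_step` is illustrative.

```python
def ca_step(cells):
    """One generation of new[i] = left XOR (center OR right), indices wrapping."""
    n = len(cells)
    return [cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n]) for i in range(n)]


# Hypothetical generation 0: 12 cells with a single 1 at index 5.
generation = [0] * 12
generation[5] = 1
for step in range(4):
    print(step, generation)
    generation = ca_step(generation)
```

With binary cells this rule is the same Boolean function as the elementary automaton usually called Rule 30.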
narc_prism_002 Dear Committee AI lift: +0.667
Dear Selection Committee, I write to report on our bacterial culture experiment (grant #BIO-2847). Week 1: We seeded the dish with four colonies (green, color 3) at the four corners. Week 2: Each colony expanded one cell along each adjacent edge, forming a complete green border ring around the dish; the 2x2 interior remained sterile (black). Week 3 (the key result): We applied our novel antibiotic to the TOP HALF of the dish only (rows 0 and 1). Every green cell in the top half turned red (color 2) to indicate cell death, while interior cells that were already empty stayed empty. Meanwhile, the two surviving bottom colonies each expanded one cell further inward, filling the entire bottom half (rows 2 and 3) with green. Week 4: All red cells were cleared to black, and the bottom half remained fully green -- no further growth was possible since the bottom was saturated.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 0.000 1.000 +1.000
nemotron-3-super 0.000 1.000 +1.000
qwen3.5-122b 0.333 1.000 +0.667
narc_prism_010 The Cartographer's Dilemma AI lift: +0.666
I've been commissioned to map the valley, but my parchment is small -- a five-by-five grid -- and I can't fit everything. My first draft marked only the river running down column 2 (blue) and the forest covering the right side (green in columns 3-4). My client complained: where are the roads? So in draft two I added the east-west trade road across the middle row (yellow), cutting through both river and forest. But then the mayor objected -- the road obscured the new stone bridge, and the buildings along the north edge weren't shown. For my third draft, I kept the river and trade road exactly as before, but I replaced the forest in the top two rows with buildings (grey) to show the town quarter, while the forest remained in the bottom two rows. I also marked the bridge where the road crosses the river by changing that single cell to azure. In my final draft, the mayor asked me to remove the buildings entirely and restore the forest, but keep the bridge and road, saying 'let the travelers find the town themselves.'
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.667 1.000 +0.333
gpt-oss-20b 0.000 1.000 +1.000
nemotron-3-super 0.000 1.000 +1.000
qwen3.5-122b 0.667 1.000 +0.333
narc_040 Behind the Curtain AI lift: +0.584
What would the wise choose if they did not know which side they would wake upon? Behind the grey curtain, each portion was made the same — each row a mirror of its neighbor. Fairness is the logic of uncertainty.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -0.667 1.000 +1.667
gpt-oss-20b -0.667 0.000 +0.667
nemotron-3-super 0.000 0.333 +0.333
qwen3.5-122b -0.667 -1.000 -0.333
narc_sp_047 The Tree Through Seasons AI lift: +0.584
Spring, summer, autumn, winter. The old oak tree follows the same cycle every year. In spring it buds, in summer it's full, in autumn the leaves fall, and in winter...
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -0.667 0.000 +0.667
gpt-oss-20b 0.333 0.333 +0.000
nemotron-3-super 0.000 1.000 +1.000
qwen3.5-122b 0.000 0.667 +0.667
narc_focal_026 The Water Cycle AI lift: +0.550
Water rises invisibly from the ocean surface, gathering as vapor. High above, it cools and condenses into cloud. The cloud thickens until it can hold no more, and rain falls into rivers that return to the sea.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -0.200 0.600 +0.800
gpt-oss-20b 0.000 0.600 +0.600
nemotron-3-super -0.200 0.600 +0.800
qwen3.5-122b 0.600 0.600 +0.000
narc_003 The Flood AI lift: +0.500
The Patel family watched the river from their porch each morning. It rose one level every day, swallowing the low fields first, then the road. By day three the children couldn't walk to school. But on day 4 the Army Corps arrived — sandbags and sheet metal, a makeshift levee that shoved the water back to where it had been on day 2. The next morning the river was climbing again, indifferent.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -0.800 0.400 +1.200
gpt-oss-20b -0.800 -0.800 +0.000
nemotron-3-super -0.800 0.200 +1.000
qwen3.5-122b 0.400 0.200 -0.200
narc_gap_102 Stride Counter AI lift: +0.500
Each cell counts by its own stride — the stride equals the cell's value in the first grid. Formally, cell (r,c) at time t has value (first_grid[r][c] * t) mod 10. Grid 3 (position 2) is masked. The first grid is t=0, so its values are the strides themselves (stride * 0 = 0 wouldn't work, so we define t=1 for the first grid, t=2 for second, etc.).
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.200 0.600 +0.400
gpt-oss-20b 0.400 0.600 +0.200
nemotron-3-super 0.600 1.000 +0.400
qwen3.5-122b 0.000 1.000 +1.000
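The stride rule in this narrative reduces to one line of arithmetic per cell. The sketch below follows the narrative's final convention that the first grid is t=1; the sample first grid and the function name `stride_grid` are hypothetical.

```python
def stride_grid(first_grid, t):
    """Grid at time t: each cell is (its value in the first grid * t) mod 10."""
    return [[(stride * t) % 10 for stride in row] for row in first_grid]


# Hypothetical first grid (t = 1); the masked grid in the narrative is t = 3.
first_grid = [[1, 2], [3, 4]]
for t in (1, 2, 3):
    print(t, stride_grid(first_grid, t))
```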
narc_prism_006 The Hive AI lift: +0.500
Dr. Okonkwo studies a honeybee colony through two lenses. At the colony scale, she maps resource allocation across the hive: orange cells are honey reserves, yellow marks royal jelly stores, grey is empty comb. At the individual scale, she tracks each bee by role: workers in orange, the queen in yellow, drones in grey. Her observations alternate between scales — first colony, then colony again, then individual, then colony. In her third observation (the individual-scale one), she notes the queen has settled near the top-left of the hive at row two, column two, ringed tightly by workers. The drones idle in a block at the lower-right corner, a three-by-three cluster.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.667 1.000 +0.333
gpt-oss-20b 0.667 1.000 +0.333
nemotron-3-super -0.667 1.000 +1.667
qwen3.5-122b 0.333 0.000 -0.333
narc_sp_048 Stacking Stones AI lift: +0.500
A cairn grows by the trail. Each hiker places a new stone on top of the pile. Stones balance on top of each other, never floating in air.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -1.000 1.000 +2.000
gpt-oss-20b 1.000 1.000 +0.000
nemotron-3-super 1.000 1.000 +0.000
qwen3.5-122b 1.000 1.000 +0.000
narc_gap_203 The Five Stages AI lift: +0.467
Six grids represent emotional states encoded as values. Stage 1 (Denial): all cells are 0 — refusing to see. Stage 2 (Anger): every cell becomes 9 — maximum intensity. Stage 3 (Bargaining, masked): a checkerboard of 0 and 9 — alternating between acceptance and rejection, where cell (r,c) is 9 if (r+c) is even, 0 if odd. Stage 4 (Depression): all cells drop to 1 — barely present. Stage 5 (Acceptance): each cell becomes its row index — a calm gradient acknowledging position. Stage 6 (Growth): each cell is row + col — expanding outward.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.067 1.000 +0.933
gpt-oss-20b 0.600 1.000 +0.400
nemotron-3-super 0.733 1.000 +0.267
qwen3.5-122b 0.733 1.000 +0.267
narc_sp_039 Unwrapping the Gift AI lift: +0.417
It's Christmas morning. The present is beautifully wrapped. Tiny hands tear at the paper. What's left when all the wrapping is gone?
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.000 -1.000 -1.000
gpt-oss-20b 0.000 1.000 +1.000
nemotron-3-super -1.000 0.000 +1.000
qwen3.5-122b 0.333 1.000 +0.667
narc_sp_086 Spectrum 86 AI lift: +0.417
A dominant seventh chord resolves to a tonic triad following classical voice-leading rules. The tritone (interval of 6 semitones between the 3rd and 7th of V7) resolves inward: the leading tone rises by semitone, the seventh falls by semitone. The root falls a fifth, the fifth falls a step. Pitches encoded as MIDI-like numbers. Given V7 and its resolution in two keys, predict the resolution in a third key.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.000 0.667 +0.667
gpt-oss-20b -0.333 0.000 +0.333
nemotron-3-super -0.333 -0.333 +0.000
qwen3.5-122b 0.000 0.667 +0.667
narc_prism_009 The Standing Ovation AI lift: +0.350
The concert hall has five rows of six seats. Before the show, everyone sits in darkness. When the soloist opens with a folk melody, the front row stands (yellow) and the second row claps seated (orange). As the piece shifts to a jazz improvisation, the standing spreads one row back -- but something unexpected happens: two audience members in the middle of the second row are so moved they begin weeping and sit back down (black), overcome with emotion. The clapping row still advances. By the third passage -- a return to the folk theme -- those two weeping listeners stand again, rejoining the ovation, and the wave continues its steady advance: three full rows standing, the fourth clapping, the back row still seated. The finale brings everyone to their feet.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 0.000 1.000 +1.000
nemotron-3-super -0.400 0.000 +0.400
qwen3.5-122b 1.000 1.000 +0.000
narc_gap_105 The Shifting Rows AI lift: +0.334
Each grid has 5 rows and 3 columns. Row 0 is always [1, 2, 3]. Each subsequent row is a circular left-shift of row 0. In Grid 1, row k is shifted left by k positions. In Grid 2, row k is shifted left by 2k positions. In general, Grid n shifts row k left by (n*k) mod 3 positions. Grid 3 (position 2) is masked, where the shift for row k is (3*k) mod 3 = 0, meaning every row equals [1, 2, 3].
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.333 1.000 +0.667
gpt-oss-20b 0.333 0.333 +0.000
nemotron-3-super 0.333 0.333 +0.000
qwen3.5-122b 0.333 1.000 +0.667
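The shift rule in this narrative can be generated directly, which also shows why the masked grid collapses to identical rows. This is a minimal Python sketch; the names `BASE_ROW` and `shifted_grid` are illustrative.

```python
BASE_ROW = [1, 2, 3]


def shifted_grid(n, num_rows=5):
    """Grid n: row k is BASE_ROW circularly shifted left by (n * k) mod 3."""
    grid = []
    for k in range(num_rows):
        shift = (n * k) % len(BASE_ROW)
        grid.append(BASE_ROW[shift:] + BASE_ROW[:shift])
    return grid


for n in (1, 2, 3):
    print(n, shifted_grid(n))
```

For Grid 3 every shift is a multiple of 3, so each row equals [1, 2, 3], as the narrative states.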
narc_029 The Pioneers AI lift: +0.286
The woody pioneers claim the borders first, sheltering the meadow within. They are not the tallest, not the final form — but without their ring of shelter, nothing taller could take root.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -0.048 1.000 +1.048
gpt-oss-20b -0.333 -0.333 +0.000
nemotron-3-super 0.524 0.619 +0.095
qwen3.5-122b 1.000 1.000 +0.000
narc_006 The Hillside AI lift: +0.267
A shepherd boy tends his flock on a hillside. Twice he raises a false alarm, and both times the villagers rush to help only to find no danger. The third time, a real wolf appears — but having been fooled twice, no one comes. The cycle repeats identically both times the boy lies: the same alarm, the same full response, the same disappointment.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.600 0.333 -0.267
gpt-oss-20b 0.333 0.333 +0.000
nemotron-3-super 0.333 1.000 +0.667
qwen3.5-122b 0.333 1.000 +0.667
narc_sp_059 Curtain Call AI lift: +0.250
The play is over. The curtain falls from top to bottom, slowly covering the stage. First the top is hidden, then the middle, until the whole stage is curtained off.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -1.000 1.000 +2.000
gpt-oss-20b 0.000 -1.000 -1.000
nemotron-3-super 1.000 1.000 +0.000
qwen3.5-122b 1.000 1.000 +0.000
narc_sp_068 Spectrum 68 AI lift: +0.250
A single 1 marches diagonally down-right across the grid, one step at a time. All other cells are 0. Given three frames, predict the fourth.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 1.000 +0.000
nemotron-3-super 1.000 1.000 +0.000
qwen3.5-122b 0.000 1.000 +1.000
narc_sp_075 Spectrum 75 AI lift: +0.250
Each cell at position (i,j) in Grid N equals the sum of the cell at (i,j) in Grid N-1 and Grid N-2, following the Fibonacci recurrence. Given Grids 0, 1, and 2, predict Grid 3.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 0.000 0.333 +0.333
nemotron-3-super 0.333 1.000 +0.667
qwen3.5-122b 1.000 1.000 +0.000
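The cell-wise Fibonacci recurrence in this narrative is an element-wise sum of the two preceding grids. The sketch below uses hypothetical starting grids and applies no modulo, since the narrative does not mention one; the function name `next_grid` is illustrative.

```python
def next_grid(prev, prev_prev):
    """Grid N = Grid N-1 + Grid N-2, cell by cell."""
    return [
        [a + b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(prev, prev_prev)
    ]


# Hypothetical Grids 0 and 1; Grids 2 and 3 follow from the recurrence.
grid0 = [[1, 0], [2, 1]]
grid1 = [[1, 1], [0, 3]]
grid2 = next_grid(grid1, grid0)
grid3 = next_grid(grid2, grid1)
print(grid2, grid3)
```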
narc_004 The Mirror AI lift: +0.200
Anya's daughter liked to push her chalk mark one step to the right each time they passed the long wall on the way to school. On the morning it finally crossed the center column, the girl stopped and frowned — the shape had flipped, as if the wall were a mirror. 'It does that,' Anya said. 'Everything on the other side comes out backwards.' The shape kept sliding rightward, one step each frame, horizontally reversed from then on.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 1.000 +0.000
nemotron-3-super 1.000 -0.200 -1.200
qwen3.5-122b -1.000 1.000 +2.000
narc_gap_204 Jazz Solo AI lift: +0.200
A jazz theme is stated in Grid 1: a melody line (row 0) over a bass line (row 2). Grid 2 is the theme with the melody transposed up by 2 (each value +2 mod 10). Grid 3 is the theme with melody transposed up by 4. Grid 4 (masked) is the SURPRISE MODULATION: the melody transposes up by 6, AND the bass line inverts — each bass value becomes (9 minus original bass value). Grid 5 returns to the original theme (melody +8 mod 10, bass returns to normal). The middle row (row 1) is always 0 (rest between melody and bass).
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -0.400 0.000 +0.400
gpt-oss-20b -0.400 -0.400 +0.000
nemotron-3-super -0.400 0.000 +0.400
qwen3.5-122b 1.000 1.000 +0.000
narc_prism_007 The Thaw AI lift: +0.200
First the dripping. You hear it before you see it — a tick-tick-tick from the eaves of the frozen lake. The thaw creeps northward from the southern shore, one row each day. Where azure ice once locked the surface, blue meltwater pools. The east bank is sheltered by the ridge, and where that sheltered water has stood for just one full day, green shoots push through. The exposed west bank takes longer — two full days of standing water before anything grows. Each day the same rhythm: ice yields one more row to water at the melt front, and the oldest water greens according to which bank it sits on.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b -0.200 1.000 +1.200
nemotron-3-super 1.000 0.600 -0.400
qwen3.5-122b 1.000 1.000 +0.000
narc_ai_014 The Z-Score AI lift: +0.167
Apply batch normalization along the sample axis (rows). For each feature column in Grid 1, compute the z-score: z = (x - column_mean) / column_std (population std). Then map to [0,9] with mean 5 using the formula: output = round(z * 2 + 5), clamped to [0,9]. Each column independently normalized.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.333 0.333 +0.000
gpt-oss-20b 0.333 0.333 +0.000
nemotron-3-super 0.333 0.333 +0.000
qwen3.5-122b -0.667 0.000 +0.667
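The normalization formula in this narrative can be sketched in plain Python. Two details are assumptions because the narrative leaves them open: a column with zero standard deviation is mapped to z = 0, and rounding uses Python's built-in `round`, which rounds exact halves to the nearest even integer. The function name `zscore_grid` and the sample grid are illustrative.

```python
import math


def zscore_grid(grid):
    """Per-column z-score (population std), mapped via round(z * 2 + 5), clamped to [0, 9]."""
    n_rows = len(grid)
    out_columns = []
    for column in zip(*grid):  # iterate over feature columns
        mean = sum(column) / n_rows
        std = math.sqrt(sum((x - mean) ** 2 for x in column) / n_rows)
        out_column = []
        for x in column:
            z = (x - mean) / std if std > 0 else 0.0  # assumption: constant column gives z = 0
            out_column.append(min(9, max(0, round(z * 2 + 5))))
        out_columns.append(out_column)
    return [list(row) for row in zip(*out_columns)]  # back to row-major layout


# Hypothetical 3x2 grid: one varying column, one constant column.
print(zscore_grid([[1, 4], [2, 4], [3, 4]]))
```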
narc_sp_067 Spectrum 67 AI lift: +0.167
A shape is being filled from the top-left corner. Each step, the fill expands one cell in each cardinal direction from all currently filled cells (marked 1). Walls (marked 9) block the fill. 0 is empty space. Show the next fill step.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 0.333 1.000 +0.667
nemotron-3-super 1.000 1.000 +0.000
qwen3.5-122b 1.000 1.000 +0.000
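One step of the fill described in this narrative can be sketched as follows. The code reads from the current grid and writes to a copy, so that cells filled during a step do not spread again within the same step. The sample grid and the function name `fill_step` are hypothetical.

```python
def fill_step(grid):
    """Expand filled cells (1) by one cell in each cardinal direction; walls (9) block."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]  # copy so this step only sees the old state
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and grid[nr][nc] == 0:  # only empty cells get filled
                    new_grid[nr][nc] = 1
    return new_grid


# Hypothetical 3x3 grid: fill starts at the top-left, one wall in the center.
grid = [[1, 0, 0], [0, 9, 0], [0, 0, 0]]
grid = fill_step(grid)
print(grid)
```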
narc_sp_071 Spectrum 71 AI lift: +0.167
A 1D cellular automaton evolves according to Rule 30: each cell's next state depends on its current state and its two neighbors. The rule is 00011110 in binary (right neighbor, self, left neighbor -> new state). Rows show successive generations. Given the first three generations, predict the fourth.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 0.000 -1.000
nemotron-3-super -0.667 1.000 +1.667
qwen3.5-122b 1.000 1.000 +0.000
narc_ai_005 The Shortcut AI lift: +0.150
At layer 3, the forward pass adds a skip connection from the input embedding (layer 0). The output is the element-wise sum of the current activation and the layer-0 activation, reduced modulo 10. All other layers apply a constant additive bias of +2 per element (mod 10).
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.800 1.000 +0.200
gpt-oss-20b 0.800 0.800 +0.000
nemotron-3-super 0.800 1.000 +0.200
qwen3.5-122b 0.800 1.000 +0.200
narc_gap_104 The Fractal Reversal AI lift: +0.150
A value spreads from the top-left corner. Each step, every cell adds 1 to its value (mod 10) — this is the additive phase (steps 1-3). At step 4, the pattern REVERSES: every cell subtracts 1 (mod 10) instead. From step 5 onward, it subtracts again. Grid 4 (position 3) is masked — it is the first subtraction step, so its values equal those of step 3 minus 1 (mod 10).
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -0.600 0.067 +0.667
gpt-oss-20b -0.067 0.067 +0.134
nemotron-3-super 0.000 -0.600 -0.600
qwen3.5-122b -0.333 0.067 +0.400
narc_sp_036 Sunset Over the Lake AI lift: +0.111
Watch the sky change as the sun dips below the horizon. First bright daylight, then golden hour, then the warm glow of dusk, and finally... night falls.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.000 0.000 +0.000
nemotron-3-super 0.000 0.000 +0.000
qwen3.5-122b -0.333 0.000 +0.333
narc_focal_009 Goldilocks and the Three Bears AI lift: +0.083
The bears' cottage is shown as a 4x5 grid. Three bears (brown=9) leave the cottage — their cells go to 0. Then Goldilocks (yellow=7) enters and disturbs items: she sits in chairs (changes some furniture cells from 8 to 0), eats porridge (changes 3 to 0), and sleeps in a bed. In the masked grid, the bears return home to find Goldilocks asleep — all three bears reappear at the entrance while Goldilocks (7) is in the bed position.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b -0.333 -0.667 -0.334
gpt-oss-20b 0.667 0.667 +0.000
nemotron-3-super 0.667 0.667 +0.000
qwen3.5-122b -0.333 0.333 +0.666
narc_sp_084 Spectrum 84 AI lift: +0.083
A cell's chromosome count changes through the cell cycle. Each grid shows a cell at a different phase. Values represent: row 0 = chromosome copies (diploid=2, tetraploid=4), row 1 = nuclear envelope (1=intact, 0=dissolved), row 2 = spindle fibers (0=absent, 1=forming, 2=attached, 3=pulling), row 3 = chromatid alignment (0=scattered, 1=condensing, 2=aligned at plate, 3=separating, 4=at poles). Interphase -> Prophase -> Metaphase -> Anaphase -> Telophase.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.667 0.333 -0.334
gpt-oss-20b 0.000 0.000 +0.000
nemotron-3-super 0.000 1.000 +1.000
qwen3.5-122b 0.000 -0.333 -0.333
narc_gap_303 The Archaeological Dig AI lift: +0.050
An archaeological site (6 rows x 5 cols) is being excavated layer by layer from the bottom. Dirt is value 0, exposed artifacts are value 2, and bedrock is value 1. Each day, one more row of dirt is cleared from the bottom up, revealing bedrock (1) underneath. Two pillars (value 2) stand at fixed positions (row 2, cols 1 and 3). On Day 4 (masked), a cave-in occurs! The excavation depth resets back to what it was on Day 2 — only the bottom two rows show bedrock. The pillars remain visible. After the cave-in, digging resumes normally on Day 5.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 0.400 0.400 +0.000
gpt-oss-20b -0.200 0.400 +0.600
nemotron-3-super 0.600 0.400 -0.200
qwen3.5-122b 0.600 0.400 -0.200
narc_001 The Lighthouse AI lift: +0.000
The keeper climbs the tower stairs each night at dusk. She winds the mechanism and watches the beam sweep clockwise from due north, advancing one position each hour. The fog is thick tonight — she can hear the sea but not see it, and the light is all that stands between the ships and the rocks.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 1.000 +0.000
nemotron-3-super 1.000 1.000 +0.000
qwen3.5-122b 1.000 1.000 +0.000
narc_041 Grey Cogs AI lift: +0.000
The machine runs smoothly because each part turns as it is told — grey cogs in grey rows, each clicking into red as the order passes through. All but one small cog that refuses its rotation, holding its grey against the tide in the very center of the third row.
Model    τ (grids only)    τ (grids + narrative)    Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 1.000 +0.000
nemotron-3-super 1.000 1.000 +0.000
qwen3.5-122b 1.000 1.000 +0.000
narc_gap_302 The Corporate Merger AI lift: +0.000
A 6x6 office floor is divided into four quadrants (top-left, top-right, bottom-left, bottom-right), each representing a department. In Q1, Department A (value 3) moves in — top-left quadrant fills. In Q2 (masked), Department B (value 5) arrives in the top-right quadrant, BUT Department B is on probation — its top-right cells fill with 5 while everything else stays as before. However, a hiring freeze (represented by 0) prevents the bottom half from being staffed yet. In Q3, Department C (value 7) fills bottom-right. In Q4, Department D (value 1) fills bottom-left, completing the organization.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 1.000 +0.000
nemotron-3-super 1.000 1.000 +0.000
qwen3.5-122b 1.000 1.000 +0.000
narc_gap_304 The Comet's Tail AI lift: +0.000
A comet (value 2) streaks across a 3x6 sky (value 0), moving one column right per frame. The comet is an L-shaped cluster: a vertical bar in one column plus a horizontal extension one row below. When the comet crosses the midpoint of the sky (between columns 2 and 3), solar wind reverses its tail — the L-shape flips horizontally. Before the midpoint, the horizontal part extends to the right; after, it extends to the left. Frame 4 (masked) is the first frame after crossing, so the L-shape has flipped.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 -0.200 -1.200
nemotron-3-super -0.200 1.000 +1.200
qwen3.5-122b 1.000 1.000 +0.000
narc_prism_008 The Vigil AI lift: +0.000
She sits at the window, watching the dark road. Every night, a red light drifts closer — always approaching the center of the grid, moving one step diagonally down and to the right. She cannot tell what it is. The dread thickens. In the corner of her eye, a second presence: a maroon glow at the bottom-right, which appears only on odd-numbered nights — the first night, the third, never the second or fourth. Whatever approaches, the sentinel keeps its own rhythm.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 1.000 +0.000
nemotron-3-super 1.000 1.000 +0.000
qwen3.5-122b 1.000 1.000 +0.000
narc_sp_038 Filling the Glass AI lift: +0.000
Water pours from the tap into a glass. It fills from the bottom up, like water always does.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 1.000 +0.000
nemotron-3-super 1.000 1.000 +0.000
qwen3.5-122b 1.000 1.000 +0.000
narc_sp_043 Happy Ending AI lift: +0.000
Once upon a time, the kingdom was covered in darkness. A hero set forth. She battled the dragon. She won. And the kingdom? It was bathed in light once more.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 1.000 +0.000
nemotron-3-super 1.000 1.000 +0.000
qwen3.5-122b 1.000 1.000 +0.000
narc_002 The Quadrants AI lift: -0.083
A garden fills each quadrant clockwise through the seasons. But a severe drought strikes in summer, turning that season's quadrant grey instead of its usual vibrant color.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 1.000 +0.000
nemotron-3-super 1.000 0.667 -0.333
qwen3.5-122b 1.000 1.000 +0.000
narc_gap_101 The Manhattan Flip AI lift: -0.083
Each grid is produced from the previous one by flipping every cell whose Manhattan distance from the top-left corner (row 0, col 0) is odd. Flipping means: if the value is below 5, add 5; if 5 or above, subtract 5. The first grid is given. Grid 3 is masked.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 0.667 -0.333
nemotron-3-super 1.000 0.667 -0.333
qwen3.5-122b 0.667 1.000 +0.333
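The flip rule in the Manhattan Flip narrative can be sketched as follows. The starting grid is hypothetical (the puzzle's actual grids are not reproduced here), and the sketch assumes every cell value lies in 0..9.

```python
def manhattan_flip(grid):
    """Return a new grid with every odd-distance cell shifted by +5 or -5."""
    out = []
    for r, row in enumerate(grid):
        new_row = []
        for c, value in enumerate(row):
            # Manhattan distance from the top-left corner (0, 0) is r + c.
            if (r + c) % 2 == 1:
                value = value + 5 if value < 5 else value - 5
            new_row.append(value)
        out.append(new_row)
    return out

first = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]   # hypothetical starting grid, not from the puzzle
second = manhattan_flip(first)
third = manhattan_flip(second)
print(second)          # [[1, 7, 3], [9, 5, 1], [7, 3, 9]]
print(third == first)  # True
```

Under the rule as stated, applying the flip twice restores the original grid, so the sequence alternates between two states.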
narc_gap_305 The Immune Response AI lift: -0.083
A 4x4 grid of healthy cells (value 3, on even-parity positions where (r+c) is even) and empty space (value 0, on odd-parity positions). A virus infects one cell per round, turning it from 3 to 2 (infected). After a cell has been infected for exactly two rounds, the immune system activates and converts it to value 4 (antibody). Round 2: cell (0,2) is infected. Round 3: cell (2,2) is infected. Round 4 (masked): cell (1,1) is newly infected, and the cell infected in Round 2 — (0,2) — has been sick for two rounds and becomes an antibody (value 4).
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 1.000 +0.000
nemotron-3-super 1.000 0.667 -0.333
qwen3.5-122b 1.000 1.000 +0.000
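The infection and antibody schedule in the Immune Response narrative can be simulated with a small sketch. It assumes that "infected for exactly two rounds" means the conversion happens two rounds after infection, that antibodies persist afterwards, and that no cell is infected in Round 1. The names `INFECTIONS` and `grid_at` are illustrative.

```python
# Round -> newly infected cell, as listed in the narrative
INFECTIONS = {2: (0, 2), 3: (2, 2), 4: (1, 1)}

def grid_at(round_number, size=4):
    """State of the grid at the end of the given round."""
    # Healthy cells (3) sit where r + c is even; everything else is empty (0).
    grid = [[3 if (r + c) % 2 == 0 else 0 for c in range(size)]
            for r in range(size)]
    for infected_round, (r, c) in INFECTIONS.items():
        if infected_round > round_number:
            continue  # this cell has not been infected yet
        age = round_number - infected_round
        # Assumption: a cell turns into an antibody (4) once age reaches 2
        # and stays that way; before that it is shown as infected (2).
        grid[r][c] = 4 if age >= 2 else 2
    return grid

for row in grid_at(4):
    print(row)
# [3, 0, 4, 0]
# [0, 2, 0, 3]
# [3, 0, 2, 0]
# [0, 3, 0, 3]
```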
narc_sp_085 Spectrum 85 AI lift: -0.083
The circle of fifths encoded as semitones from C: C=0, G=7, D=2, A=9, E=4, B=11, F#=6, Db=1, Ab=8, Eb=3, Bb=10, F=5. Each step moves one position clockwise around the circle (up a fifth = +7 mod 12). Given three chords in a sequence moving around the circle, predict the fourth.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 0.000 0.000 +0.000
gpt-oss-20b 0.000 0.000 +0.000
nemotron-3-super 0.000 0.000 +0.000
qwen3.5-122b 1.000 0.667 -0.333
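The semitone encoding in the Spectrum 85 narrative follows directly from the +7 mod 12 rule. A short sketch that regenerates the listed values (the names `NOTE_NAMES` and `circle_of_fifths` are illustrative):

```python
# Pitch-class names indexed by semitones above C, using the spellings from the narrative
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]

def circle_of_fifths(start=0, steps=12):
    """Pitch classes visited by repeatedly moving up a fifth (+7 mod 12)."""
    return [(start + 7 * k) % 12 for k in range(steps)]

sequence = circle_of_fifths()
print(sequence)
# [0, 7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5]
print([NOTE_NAMES[p] for p in sequence])
# ['C', 'G', 'D', 'A', 'E', 'B', 'F#', 'Db', 'Ab', 'Eb', 'Bb', 'F']
```

Because 7 and 12 share no common factor, the walk visits all twelve pitch classes before returning to C.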
narc_ai_012 The Torus AI lift: -0.167
The system evolves as a discrete dynamical system on a 3x3 toroidal grid. The transition function at each timestep is: cell(i,j)(t+1) = (cell(i,j)(t) + cell(i-1,j)(t) + cell(i,j-1)(t)) mod 10, where indices wrap toroidally (i.e., row -1 wraps to row 2, column -1 wraps to column 2). Compute the state at timestep 4 given the preceding states.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b -0.067 -0.067 +0.000
gpt-oss-20b 0.067 -0.067 -0.134
nemotron-3-super 0.067 -0.067 -0.134
qwen3.5-122b 0.333 -0.067 -0.400
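The transition function in the Torus narrative can be written as a short sketch. The initial state below is hypothetical and chosen only to make the wrap-around visible. It is not the puzzle's actual grid.

```python
def torus_step(grid, modulus=10):
    """One timestep: each cell adds its upper and left neighbours, mod 10."""
    n_rows, n_cols = len(grid), len(grid[0])
    return [
        [
            (grid[i][j]
             + grid[(i - 1) % n_rows][j]     # cell above; row -1 wraps to the last row
             + grid[i][(j - 1) % n_cols])    # cell to the left; col -1 wraps to the last col
            % modulus
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]

state = [[1, 0, 0],
         [0, 0, 0],
         [0, 0, 0]]   # hypothetical initial state, not from the puzzle
print(torus_step(state))  # [[1, 1, 0], [1, 0, 0], [0, 0, 0]]
```

Each cell depends only on itself and on the cells above and to its left, so a single non-zero value spreads down and to the right, wrapping around the edges over later steps.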
narc_focal_025 Metamorphosis AI lift: -0.222
The caterpillar stops eating, hangs upside-down, and wraps itself in silk. Inside the cocoon, the body dissolves entirely. What emerges is unrecognizable. The cocoon witnesses the whole transformation but cannot tell what it holds.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 0.333 0.000 -0.333
nemotron-3-super 0.333 0.000 -0.333
narc_sp_078 Spectrum 78 AI lift: -0.250
Gravity pulls all non-zero values downward in each column. At each step, any non-zero cell with a 0 below it swaps with that 0 (falls one row). All columns update simultaneously. Given three snapshots, predict the next.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 1.000 +0.000
nemotron-3-super 1.000 1.000 +0.000
qwen3.5-122b 1.000 0.000 -1.000
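The gravity rule in the Spectrum 78 narrative leaves one detail open: what happens to stacked non-zero cells within a single step. The sketch below assumes a strictly simultaneous reading, in which every decision is made from the snapshot before the step, so a cell resting on another non-zero cell waits until the one below it has moved. The example grid is hypothetical.

```python
def gravity_step(grid):
    """One simultaneous step: a non-zero cell falls one row if the cell below is 0."""
    n_rows, n_cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for c in range(n_cols):
        for r in range(n_rows - 1):
            # Read only the old snapshot (assumed meaning of "simultaneously").
            if grid[r][c] != 0 and grid[r + 1][c] == 0:
                new[r + 1][c] = grid[r][c]
                new[r][c] = 0
    return new

snapshot = [[1, 2],
            [1, 0],
            [0, 0]]   # hypothetical grid, not from the puzzle
step_1 = gravity_step(snapshot)
step_2 = gravity_step(step_1)
print(step_1)  # [[1, 0], [0, 2], [1, 0]]
print(step_2)  # [[0, 0], [1, 0], [1, 2]]
```

Under a sequential bottom-up reading the stacked pair in the first column would fall together in one step instead, so the two interpretations yield different intermediate snapshots.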
narc_ai_013 The Narrow Pass AI lift: -0.300
The latent representation at the bottleneck is obtained by taking the mean of each column in the encoder output (Grid 2, the 3x3 grid), rounded to the nearest integer. This produces a 1x3 latent vector that captures the compressed representation of the input.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 0.000 0.400 +0.400
gpt-oss-20b -0.600 -0.600 +0.000
nemotron-3-super -0.200 -1.000 -0.800
qwen3.5-122b -0.200 -1.000 -0.800
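The bottleneck computation in the Narrow Pass narrative is a column-wise mean followed by rounding. A minimal sketch, using a hypothetical 3x3 encoder output in place of the puzzle's Grid 2 (the name `bottleneck` is illustrative):

```python
def bottleneck(encoder_output):
    """Column-wise mean of a grid, rounded to the nearest integer (1 x n_cols vector)."""
    n_rows = len(encoder_output)
    columns = zip(*encoder_output)  # transpose rows into columns
    # Python's round() sends exact .5 ties to the nearest even integer;
    # the narrative does not say how ties should be handled.
    return [round(sum(col) / n_rows) for col in columns]

grid_2 = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 10]]   # hypothetical encoder output, not from the puzzle
print(bottleneck(grid_2))  # [4, 5, 6]
```

For a three-row grid of integers every column mean is a multiple of 1/3, so exact .5 ties cannot occur and the choice of tie rule does not matter here.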
narc_focal_034 The Spider's Web AI lift: -0.300
The spider sits at the web's edge, feeling vibrations through silk threads. The fly sees nothing until it strikes the invisible strands. The web transmits the struggle to the spider, who reads the vibrations like a map.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 0.200 -0.200 -0.400
gpt-oss-20b 1.000 0.200 -0.800
nemotron-3-super 0.200 -0.200 -0.400
qwen3.5-122b 0.600 1.000 +0.400
narc_sp_070 Spectrum 70 AI lift: -0.333
The librarian worked left to right through the stacks, one shelf at a time. Each step she sorted one more column ascending from top to bottom, leaving the rest untouched. Column 0 was done, column 1 was done. The next shelf — column 2 — waited, its books still in their original jumble.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 0.333 0.000 -0.333
nemotron-3-super 0.333 -0.667 -1.000
qwen3.5-122b 1.000 1.000 +0.000
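The librarian's procedure in the Spectrum 70 narrative amounts to sorting one more column per step, left to right. A sketch under that reading, with a hypothetical starting grid (the names `sort_first_columns` and `shelves` are illustrative):

```python
def sort_first_columns(grid, k):
    """Return a copy of grid with columns 0..k-1 sorted ascending, top to bottom."""
    n_rows = len(grid)
    out = [row[:] for row in grid]
    for c in range(k):
        column = sorted(grid[r][c] for r in range(n_rows))
        for r in range(n_rows):
            out[r][c] = column[r]
    return out

shelves = [[3, 9, 5],
           [1, 4, 8],
           [2, 7, 6]]   # hypothetical starting grid, not from the puzzle
print(sort_first_columns(shelves, 2))  # [[1, 4, 5], [2, 7, 8], [3, 9, 6]]
print(sort_first_columns(shelves, 3))  # [[1, 4, 5], [2, 7, 6], [3, 9, 8]]
```

With `k = 2` the first two columns are sorted and column 2 keeps its original order, which corresponds to the state the narrative describes just before the next shelf is sorted.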
narc_005 The Last Piece AI lift: -0.334
The tournament hall was quiet except for the click of captured pieces. On the checkerboard, one green piece falls each round — turned red the moment it's taken. But the old rule still holds: after a piece has been red for exactly two rounds, it earns promotion to yellow. In round 2 they took the piece at row 0, column 2. In round 3, row 2, column 2. In round 4, row 1, column 1 fell — and any piece red since round 2 finally got its yellow crown.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 0.333 -0.667
gpt-oss-20b 1.000 1.000 +0.000
nemotron-3-super 1.000 0.333 -0.667
qwen3.5-122b 1.000 1.000 +0.000
narc_015 The Summit AI lift: -0.350
The summit is not the end. What strains upward through sweat and will must learn that gravity has no memory and no mercy. Again.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b -0.333 -0.733 -0.400
gpt-oss-20b -0.333 -0.200 +0.133
nemotron-3-super -0.333 -0.733 -0.400
qwen3.5-122b 0.000 -0.733 -0.733
narc_focal_047 Stage Fright AI lift: -0.417
A performer (red=1) stands on stage under a spotlight (yellow=4). The audience (blue=6) fills rows below. The performer freezes: red dims to 0 in the spotlight. The spotlight stays. Then the performer recovers, red returns brighter, and the audience reacts with scattered 6s rising (applause).
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 0.333 -0.667 -1.000
gpt-oss-20b -0.667 -0.667 +0.000
nemotron-3-super -0.667 -0.667 +0.000
qwen3.5-122b 0.000 -0.667 -0.667
narc_008 The Tower AI lift: -0.428
They built it one floor each day, and on the fifth day set a golden crown upon its peak. Pride goeth before the fall — the next morning the storms began, and each day took a floor away.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 0.238 -0.333 -0.571
gpt-oss-20b 0.143 0.143 +0.000
nemotron-3-super 0.238 -0.333 -0.571
qwen3.5-122b 0.238 -0.333 -0.571
narc_gap_301 The Electron Orbital AI lift: -0.550
An electron (value 1) orbits a nucleus at the center of a 5x5 atom. It occupies energy shells at increasing distance from the center. Starting at the top (row 0, col 2), it moves clockwise through discrete positions on each shell boundary. At each time step it advances one position clockwise along the perimeter of its current shell. Position 2 (masked) corresponds to the electron at the 'East' position: row 2, col 4.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 0.000 -1.000
nemotron-3-super 1.000 0.400 -0.600
qwen3.5-122b 1.000 0.400 -0.600
narc_focal_054 The Mentor AI lift: -0.583
A teacher (blue=6) stands near a student who starts dim (red=1, single cell). The teacher passes knowledge: blue cells appear near the student, and the student's red presence grows each frame. By the end, the student has grown into a larger red shape while the teacher steps back, their work done.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 -1.000 -2.000
gpt-oss-20b -0.667 1.000 +1.667
nemotron-3-super -1.000 -1.000 +0.000
qwen3.5-122b 1.000 -1.000 -2.000
narc_prism_012 The Inheritance AI lift: -0.650
Grandmother's house has twenty rooms in a four-by-five grid. The three siblings divide it over four rounds, with a final agreement. Round one: the eldest (blue) claims the entire top row for the attic views. Everything else is unclaimed (black). Round two: the middle child (red) claims the bottom row, wanting the garden access. The youngest (green) claims only the center cell of the grid -- a single room she loves for its skylight. Round three: the middle child expands upward, taking the two leftmost cells of the second row. The eldest reaches down, claiming the two rightmost cells of the second row. The youngest expands from her center room, claiming the cells directly above and below it in the third row, giving her a vertical strip of three rooms in column 2. Two cells on the left of the third row are designated shared space (yellow) by mutual agreement. Round four: the remaining unclaimed cells all become shared (yellow). Everyone's personal rooms stay as they are.
Model Grids Only τ + Narrative τ Lift
gpt-oss-120b 1.000 1.000 +0.000
gpt-oss-20b 1.000 -0.400 -1.400
nemotron-3-super 0.800 -0.400 -1.200
qwen3.5-122b 1.000 1.000 +0.000