NARC Inspector

NARC puzzles are designed so that neither the grid sequence nor the narrative clue is sufficient on its own — but together they uniquely determine the answer. The experiments below probe different dimensions of this narrative–visual interaction. Masking tests whether models can reconstruct hidden grids. Ordering tests whether narratives help recover temporal structure. Odd-one-out tests whether narratives help models recognize which grids belong together. Stances tests how the same visual pattern responds to different narrative framings.

Odd-one-out experiment: Three grids from the same puzzle are shown alongside one distractor grid from a different puzzle, and the model must identify which grid doesn't belong. Each puzzle is tested under two conditions: grids only, and grids + narrative. This probes whether narratives help models recognize which grids are thematically related, even when pixel-level reconstruction (masking) fails.

Accuracy = % of trials where the distractor was correctly identified
Lift = accuracy(+narrative) − accuracy(grids only), in percentage points (pp)
Red border = distractor grid
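The two metrics defined above can be sketched in a few lines of Python. This is a minimal illustration, not the page's actual scoring code: the function names and the trial outcomes are hypothetical, and each trial is assumed to be recorded as a boolean (True = distractor correctly identified).

```python
def accuracy(outcomes):
    """Percentage of trials in which the distractor was correctly identified."""
    if not outcomes:
        raise ValueError("need at least one trial")
    return 100.0 * sum(outcomes) / len(outcomes)


def lift(grids_only, with_narrative):
    """Lift in percentage points: accuracy(+narrative) - accuracy(grids only)."""
    return accuracy(with_narrative) - accuracy(grids_only)


# Hypothetical outcomes for one model on one puzzle, three trials per condition.
grids_only = [True, True, False]
with_narrative = [True, True, True]

print(f"{accuracy(grids_only):.1f}%")                # 66.7%
print(f"{lift(grids_only, with_narrative):+.1f}pp")  # +33.3pp
```

With three trials per condition, accuracies can only take the values 0.0%, 33.3%, 66.7% and 100.0%, which matches the granularity seen in several of the tables on this page.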

Highest narrative lift (narrative helps models spot which grid doesn't belong):
comp_5x5_medium 5x5 Medium Compression: +66.7pp accuracy
iter_055 The Inscription: +66.7pp accuracy
hc_scaffold Scaffold: +50.0pp accuracy
Accuracy with narrative (+narr), by model:
gemma-4-26b: 80.8%
gemma-4-31b: 91.7%
gpt-oss-120b: 87.2%
gpt-oss-20b: 58.1%
nemotron-3-super: 66.4%
qwen3.5-122b: 91.7%
Puzzles tested: 75
comp_5x5_medium 5x5 Medium Compression Colab lift: +66.7pp
Which grid doesn't belong? (Distractor from iter_005)
The quarry face advanced eastward one cut at a time, the saw working left to right across the limestone. Five columns took five weeks. On the sixth week, the newest face — the rightmost cut, where the stone had been exposed to weather the shortest time — crumbled in the freeze. The older faces, their surfaces case-hardened by months of exposure, held. The crew recut the failed column the following spring.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 0.0% 100.0% +100.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
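The page does not say how the puzzle-level lift in each header is aggregated. Assuming it is the unweighted mean of the per-model lifts, the table above reproduces the +66.7pp shown for comp_5x5_medium. The sketch below is only a check of that assumption; `results` and `mean_lift` are hypothetical names.

```python
# Per-model results for comp_5x5_medium, copied from the table above:
# model -> (grids-only accuracy %, +narrative accuracy %)
results = {
    "gemma-4-26b": (100.0, 100.0),
    "gemma-4-31b": (0.0, 100.0),
    "gpt-oss-120b": (0.0, 100.0),
    "gpt-oss-20b": (0.0, 100.0),
    "nemotron-3-super": (0.0, 100.0),
    "qwen3.5-122b": (100.0, 100.0),
}


def mean_lift(table):
    """Unweighted mean of per-model lifts, in percentage points (assumed aggregation)."""
    lifts = [narrative - grids for grids, narrative in table.values()]
    return sum(lifts) / len(lifts)


puzzle_lift = mean_lift(results)
print(f"{puzzle_lift:+.1f}pp")  # +66.7pp
```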
iter_055 The Inscription Colab lift: +66.7pp
The mason carved the memorial one column at a time, chiseling from left to right across the face of the stone. The first column took a week; the last took a day — his hand had learned the stone. For a season the whole inscription stood sharp and legible. Then the winter rains found the grain of the rock. The earliest-carved column, the one that had weathered the most freezes, that had been exposed to the elements longest while the mason worked his way across — its letters blurred first, then vanished entirely. The rest held. He returned in spring to recut what the rain had taken.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 100.0% +100.0pp
gemma-4-31b 0.0% 100.0% +100.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
hc_scaffold Scaffold Colab lift: +50.0pp
They built it from the left and the inspector condemned the newest section first.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 0.0% 100.0% +100.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_focal_009 Goldilocks and the Three Bears AI lift: +50.0pp
The bears' cottage is shown as a 4x5 grid. Three bears (brown=9) leave the cottage — their cells go to 0. Then Goldilocks (yellow=7) enters and disturbs items: she sits in chairs (changes some furniture cells from 8 to 0), eats porridge (changes 3 to 0), and sleeps in a bed. In the masked grid, the bears return home to find Goldilocks asleep — all three bears reappear at the entrance while Goldilocks (7) is in the bed position.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 0.0% 100.0% +100.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 0.0% 100.0% +100.0pp
narc_new_016 The Mountain Colab lift: +50.0pp
Snow crept down the mountainside all winter, one contour at a time. Then a warm front moved in — not the gentle kind, but the kind that strips the snow bare overnight. The lowest snow was the first to go, retreating exactly one contour back up the slope. After the warm spell passed, the cold returned and the snow advanced again.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 0.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_sp_077 Spectrum 77 AI lift: +50.0pp
Each step applies two operations: first transpose the grid (swap rows and columns), then add 1 to every cell. Given two steps, predict the result after one more application.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 100.0% +100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
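The narc_sp_077 rule (transpose, then add 1 to every cell) can be sketched as below. The starting grid is hypothetical, and the sketch assumes plain integer addition with no wrap-around, since the narrative does not mention one.

```python
def step(grid):
    """One application of the rule: transpose the grid, then add 1 to every cell."""
    return [[cell + 1 for cell in column] for column in zip(*grid)]


start = [[0, 1, 2],
         [3, 4, 5]]  # hypothetical 2x3 starting grid

once = step(start)
twice = step(once)
print(once)   # [[1, 4], [2, 5], [3, 6]]
print(twice)  # [[2, 3, 4], [5, 6, 7]]
```

Two applications restore the original orientation, so every second grid in such a sequence is the starting grid with 2 added to each cell.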
stance_intentional Intentional Stance Colab lift: +50.0pp
Which grid doesn't belong? (Distractor from narc_prism_005)
The general wanted to hold every position on the ridge. He sent his troops up from the valley floor, securing one line at a time. By nightfall the whole ridge was his. But the enemy commander, who knew the general's weakness for overextension, struck at dawn at the highest line — the one the general valued least, the one farthest from supply, the position he'd taken last and would sacrifice first. The rest of the ridge held. The general retook the summit the following week, just as the enemy knew he would.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 100.0% +100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 0.0% 0.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
synth_blueprint The Blueprint Colab lift: +50.0pp
The architect's plan called for offices filling the floor plate from the left, one bay at a time. By the fourth revision every bay was designated office space. Then the client changed the brief: the rightmost bay, the one added last and with the fewest structural commitments, was to become a server room instead. Same footprint, different function. The architect recolored that column on the plan from blue to red. The rest stayed blue. The final revision restored all bays to the same designation.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 100.0% +100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
hunt_016 The Mutiny Colab lift: +36.1pp
The lower ranks overthrew the officer who had just taken command of the center.
Model Grids Only + Narrative Lift
gemma-4-26b 66.7% 100.0% +33.3pp
gemma-4-31b 66.7% 100.0% +33.3pp
gpt-oss-120b 0.0% 66.7% +66.7pp
gpt-oss-20b 0.0% 66.7% +66.7pp
nemotron-3-super 50.0% 66.7% +16.7pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_011 The Heist AI lift: +33.3pp
Which grid doesn't belong? (Distractor from narc_018)
Three gems sit in a vault: ruby left, emerald center, sapphire right. The plan: move each gem one cell toward the door at the top, one per night. But on the third night, the inside man swaps the emerald for a fake — a grey stone — before it advances.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 100.0% +100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 0.0% 0.0% +0.0pp
qwen3.5-122b 0.0% 100.0% +100.0pp
narc_ai_006 The Disjunction AI lift: +33.3pp
The output is the element-wise exclusive disjunction of the two input feature maps. Where both inputs share the same activation level, the output neuron is suppressed to zero. Where they disagree, the output takes the nonzero value. This is the canonical nonlinearly-separable binary classification that a single-layer perceptron cannot learn.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
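One possible reading of the narc_ai_006 rule is sketched below. The input maps are hypothetical, and the sketch assumes that wherever the two inputs disagree, at least one of them is zero; the narrative does not say what happens when two different nonzero values meet.

```python
def xor_maps(a, b):
    """Element-wise exclusive disjunction: 0 where inputs agree, the nonzero value where they differ."""
    out = []
    for row_a, row_b in zip(a, b):
        # `x or y` picks x when it is nonzero, otherwise y (assumes one side is 0).
        out.append([0 if x == y else (x or y) for x, y in zip(row_a, row_b)])
    return out


left = [[3, 0],
        [3, 0]]   # hypothetical input feature map
right = [[3, 3],
         [0, 0]]  # hypothetical input feature map

result = xor_maps(left, right)
print(result)  # [[0, 3], [3, 0]]
```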
narc_focal_027 Plate Collision AI lift: +33.3pp
Two tectonic plates drift toward each other at centimeters per year. Neither plate perceives the collision. But at the boundary, rock buckles upward, forming mountains that neither plate intended to create.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_gap_101 The Manhattan Flip AI lift: +33.3pp
Each grid is produced from the previous one by flipping every cell whose Manhattan distance from the top-left corner (row 0, col 0) is odd. Flipping means: if the value is below 5, add 5; if 5 or above, subtract 5. The first grid is given. Grid 3 is masked.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
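The narc_gap_101 flip rule can be sketched as follows. The example grid is hypothetical, and cell values are assumed to lie in 0 to 9 so that adding or subtracting 5 stays in range.

```python
def manhattan_flip(grid):
    """Flip every cell whose Manhattan distance from (row 0, col 0) is odd."""
    flipped = []
    for r, row in enumerate(grid):
        new_row = []
        for c, value in enumerate(row):
            if (r + c) % 2 == 1:  # distance from the top-left corner is r + c
                value = value + 5 if value < 5 else value - 5
            new_row.append(value)
        flipped.append(new_row)
    return flipped


grid1 = [[1, 2, 3],
         [4, 5, 6]]  # hypothetical first grid

grid2 = manhattan_flip(grid1)
grid3 = manhattan_flip(grid2)
print(grid2)  # [[1, 7, 3], [9, 5, 1]]
print(grid3)  # [[1, 2, 3], [4, 5, 6]]
```

Under this reading the flip is its own inverse, so the sequence alternates between two grids.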
narc_new_001 The Garden Colab lift: +33.3pp
Old Mrs. Chen tended her courtyard garden through the seasons. She planted seeds in every other square, a careful checkerboard of green against bare earth. By summer they had all bloomed into bright flowers. But that October an early frost crept through the courtyard gate, and the one flower she had planted dead center — her pride — turned black overnight. The rest survived, sheltered by the courtyard walls.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_new_010 The Signal Colab lift: +33.3pp
The radio engineer watched the signal propagate across her test grid, each pulse expanding the wavefront by one cell in every direction. Two clean pulses, textbook diffusion. Then the interference struck — some stray frequency that collapsed the entire wavefront back to its origin point. She cursed, reset nothing, and let the next pulse go. It expanded again as if nothing had happened, the grid ignorant of its own disruption.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_new_012 The Ink Stain Colab lift: +33.3pp
The calligrapher dipped her brush and drew a single stroke across the right edge of the rice paper, a column of black ink. She added a second column, then a third. But before the ink dried she blotted the leftmost column with her sleeve, wiping it clean by accident. She steadied herself and finished the final column, filling the paper from edge to edge.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 0.0% +0.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_new_018 The Canopy Colab lift: +33.3pp
The forest canopy thickened from one edge of the aerial photograph each spring, new growth filling in row by row until the entire frame was green. But the drought that summer killed the youngest trees first — those in the most recently grown strip — leaving that row bare while the older growth survived. The autumn rains brought the canopy back to full cover.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_sp_047 The Tree Through Seasons AI lift: +33.3pp
Spring, summer, autumn, winter. The old oak tree follows the same cycle every year. In spring it buds, in summer it's full, in autumn the leaves fall, and in winter...
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 0.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
stance_design Design Stance Colab lift: +33.3pp
The building was designed to be constructed from the foundation up, one floor per month. The architect specified that the top floor — the observation deck — should be the last built and the first demolished in any retrofit, since it bore no structural load. When the retrofit came, the contractors followed the spec: the top floor was removed while the rest of the building continued to operate. The replacement floor was installed the following quarter.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 100.0% +100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 0.0% 100.0% +100.0pp
stance_physical Physical Stance Colab lift: +33.3pp
Which grid doesn't belong? (Distractor from narc_gap_104)
Water fills a tank from the bottom, one layer at a time, as the inlet pipe feeds the lowest point. When the tank is full, the sun heats the surface. The topmost layer — the thinnest, with the most surface area exposed to air — evaporates first. The deeper layers, insulated by the water above, retain their heat more slowly. The evaporated layer is replenished by the next rain.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 100.0% +100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
hunt_017 The Coup d'Etat Colab lift: +31.9pp
Blue extended its reach, but red saw an opening and claimed the new outpost before blue could fortify it.
Model Grids Only + Narrative Lift
gemma-4-26b 75.0% 75.0% +0.0pp
gemma-4-31b 75.0% 75.0% +0.0pp
gpt-oss-120b 0.0% 75.0% +75.0pp
gpt-oss-20b 33.3% 50.0% +16.7pp
nemotron-3-super 0.0% 50.0% +50.0pp
qwen3.5-122b 25.0% 75.0% +50.0pp
hunt_021 The Filibuster Colab lift: +30.6pp
Blue advanced confidently, but red blocked the move and claimed the position for itself.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 50.0% -50.0pp
gemma-4-31b 66.7% 100.0% +33.3pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 66.7% +66.7pp
nemotron-3-super 66.7% 66.7% +0.0pp
qwen3.5-122b 66.7% 100.0% +33.3pp
hunt_027 The Ransom Colab lift: +22.2pp
Blue planted a flag deep in red's corner. Red recaptured it immediately and struck back into blue's territory.
Model Grids Only + Narrative Lift
gemma-4-26b 66.7% 100.0% +33.3pp
gemma-4-31b 66.7% 100.0% +33.3pp
gpt-oss-120b 33.3% 100.0% +66.7pp
gpt-oss-20b 33.3% 33.3% +0.0pp
nemotron-3-super 66.7% 66.7% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
agent_002 The Betrayal Colab lift: +16.7pp
Blue invaded green's corner. Red, pretending to be an ally, swooped in and stole what blue had just conquered.
Model Grids Only + Narrative Lift
gemma-4-26b 66.7% 66.7% +0.0pp
gemma-4-31b 66.7% 100.0% +33.3pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 66.7% +66.7pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
comp_L0_literal Compression L0: Literal Colab lift: +16.7pp
Which grid doesn't belong? (Distractor from iter_042)
The grid fills column by column from left to right. After all four columns are filled, the leftmost column (column 0) is removed, setting all cells in that column back to 0. The other three columns remain filled.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 100.0% +100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
comp_L1_spatial Compression L1: Spatial Colab lift: +16.7pp
A signal propagates across the grid from west to east, one column at a time. Once the whole grid is active, the western edge loses signal — the first column goes dark while the rest stays lit. The signal is eventually restored.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 100.0% +100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
comp_L2_figurative Compression L2: Figurative Colab lift: +16.7pp
The painter worked left to right across the wall, one strip each day. When the mural was finished she stepped back — and saw that rain had already washed the first day's work clean. The oldest paint, the strip she'd laid down earliest, was gone. The rest held. She repainted it the following week.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 0.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 0.0% 100.0% +100.0pp
hc_sediment Sediment Colab lift: +16.7pp
Layer by layer the riverbed rose until the spring flood scoured the newest deposit clean off the bottom.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 0.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 0.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
hunt_018 The Usurper Colab lift: +16.7pp
While blue and red fought over the middle, green seized the moment and replaced red's guards on both flanks.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 25.0% 100.0% +75.0pp
gpt-oss-20b 50.0% 75.0% +25.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
iter_015 The Cellar Colab lift: +16.7pp
The old cellar flooded every spring when the snowmelt seeped through the ceiling. Water crept down from above, filling the top row first, then the middle, then the floor. This year the plumber had fixed the ceiling, so when spring came the cellar stayed dry. But a pipe burst in the east wall that April, and water poured in from the right side instead — filling only the rightmost column while the rest of the cellar stayed bone-dry. By the time they shut the valve, the whole cellar had flooded from that one broken pipe.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 100.0% +100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
iter_033 The Mosaic Colab lift: +16.7pp
Professor Okafor excavated the Roman mosaic one row at a time, starting from the access trench at the top of the grid and working down. By the third season the entire floor was exposed — every tessera gleaming in the sun for the first time in two thousand years. Then the tremor hit. A section of the trench wall collapsed inward, burying the topmost row of tiles under a metre of earth while the deeper rows, set into bedrock, stayed clear. She spent the next season re-excavating what the earth had reclaimed.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 100.0% +100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
iter_049 The Blackboard Colab lift: +16.7pp
Professor Maddox filled the blackboard the way she filled her students' minds: methodically, from left to right, one column of equations at a time. By the end of the lecture every cell was covered in chalk. She began erasing from the left, clearing one column per day for the next week's material. But the teaching assistant, trying to help, erased from the wrong end — wiping the rightmost column, the material they hadn't reviewed yet, while the columns she'd meant to clear still stood. She stared at the board: two clean columns on the edges, the untouched lecture still filling the middle. She started over.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 0.0% 0.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
iter_051 The Tapestry Colab lift: +16.7pp
The weaver worked the tapestry in vertical strips, hanging each finished column on the frame before starting the next. Leftmost first, then each column to the right in turn. When the frame was full she stepped back to admire it — and saw the moths. They had found the oldest thread, the first column she'd hung, the one that had been exposed longest while she worked on the rest. Eaten to nothing while the newer columns, still smelling of the dye vat, were untouched. She rewove the ruined strip and sealed the room.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_029 The Pioneers AI lift: +16.7pp
Which grid doesn't belong? (Distractor from hunt_024)
The woody pioneers claim the borders first, sheltering the meadow within. They are not the tallest, not the final form — but without their ring of shelter, nothing taller could take root.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 0.0% 100.0% +100.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 100.0% 0.0% -100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_ai_005 The Shortcut AI lift: +16.7pp
At layer 3, the forward pass adds a skip connection from the input embedding (layer 0). The output is the element-wise sum of the current activation and the layer-0 activation, reduced modulo 10. All other layers apply a constant additive bias of +2 per element (mod 10).
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 0.0% 100.0% +100.0pp
gpt-oss-120b 0.0% 0.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 0.0% 0.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_ai_007 The Dead Path AI lift: +16.7pp
Backpropagation through the ReLU at layer 3 zeros the gradient for all neurons whose forward-pass pre-activation was non-positive. In this layer, the pre-activations in the top two rows were negative, so their gradient contributions are exactly zero. The bottom three rows retain their gradient magnitudes from the previous layer unchanged.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 100.0% +100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 100.0% 0.0% -100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_ai_008 The Merge AI lift: +16.7pp
Yusuf had been staring at the token stream for hours when the pattern clicked. Byte-pair encoding merges the most frequent adjacent pair at each step, replacing both with a single new token. Step 1 merged the pair (1,2) at positions 0-1 and 4-5 into token 7 (orange). Step 2 merged (3,4) at positions 1-2 and 3-4 in the step-1 output into token 8 (azure). Step 3 would merge (7,8) at positions 0-1 and 1-2 in the step-2 output into token 9 (maroon) — the whole sequence compressing to just two symbols. He leaned back and exhaled.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 100.0% 0.0% -100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_focal_034 The Spider's Web AI lift: +16.7pp
The spider sits at the web's edge, feeling vibrations through silk threads. The fly sees nothing until it strikes the invisible strands. The web transmits the struggle to the spider, who reads the vibrations like a map.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 0.0% 100.0% +100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_gap_303 The Archaeological Dig AI lift: +16.7pp
An archaeological site (6 rows x 5 cols) is being excavated layer by layer from the bottom. Dirt is value 0, exposed artifacts are value 2, and bedrock is value 1. Each day, one more row of dirt is cleared from the bottom up, revealing bedrock (1) underneath. Two pillars (value 2) stand at fixed positions (row 2, cols 1 and 3). On Day 4 (masked), a cave-in occurs! The excavation depth resets back to what it was on Day 2 — only the bottom two rows show bedrock. The pillars remain visible. After the cave-in, digging resumes normally on Day 5.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_new_003 The Tide Pool Colab lift: +16.7pp
The marine biologist filmed the tide pool over four days. Each morning the water crept higher, life blooming in every crevice — anemones and urchins claiming each inch of rock. By the third morning every cell of her observation grid was alive. But that afternoon a rogue swell crashed over the south wall of the pool, scouring the bottom row clean of everything that had taken root. The next tide brought new larvae, and by the fourth morning the pool was full of life again, as though the wave had never come.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_new_004 The Blackout Colab lift: +16.7pp
The city woke in stages. First the western blocks hummed to life, their streetlights winking on one column at a time as the power grid warmed up. By nightfall the entire district blazed. Then the transformer on the east side blew — a shower of sparks and suddenly the rightmost column of blocks went dark, plunging that whole strip back into silence. Repair crews worked through the night, and by dawn every light was burning again.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_sp_018 Spectrum 18 AI lift: +16.7pp
A 1D cellular automaton with 12 cells and wrap-around boundaries. The update rule is: new[i] = left XOR (center OR right), where left=cell[i-1], center=cell[i], right=cell[i+1], with indices wrapping. Grid 0 is generation 0, Grid 1 is generation 1, Grid 2 is generation 2, Grid 3 (masked) is generation 3.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
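The narc_sp_018 update rule, new[i] = left XOR (center OR right) with wrap-around, can be written directly. This is the same Boolean rule as the elementary cellular automaton usually called Rule 30. The generation-0 row below is hypothetical, not the puzzle's actual grid.

```python
def next_generation(cells):
    """Apply new[i] = left XOR (center OR right) with wrap-around indices."""
    n = len(cells)
    return [
        cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n])
        for i in range(n)
    ]


gen0 = [0] * 12
gen0[6] = 1  # hypothetical generation 0: a single live cell

gen1 = next_generation(gen0)
gen2 = next_generation(gen1)
gen3 = next_generation(gen2)
print(gen1)  # [0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0]
print(gen2)  # [0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0]
print(gen3)  # [0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0]
```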
narc_sp_075 Spectrum 75 AI lift: +16.7pp
Each cell at position (i,j) in Grid N equals the sum of the cell at (i,j) in Grid N-1 and Grid N-2, following the Fibonacci recurrence. Given Grids 0, 1, and 2, predict Grid 3.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
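The narc_sp_075 recurrence (each cell of Grid N is the sum of the same cell in Grid N-1 and Grid N-2) can be sketched as below. The grids are hypothetical, and no modulo is applied because the narrative does not mention one.

```python
def next_grid(prev, prev2):
    """Cell-wise Fibonacci step: Grid N = Grid N-1 + Grid N-2."""
    return [
        [a + b for a, b in zip(row_prev, row_prev2)]
        for row_prev, row_prev2 in zip(prev, prev2)
    ]


grid0 = [[1, 0],
         [2, 1]]  # hypothetical Grid 0
grid1 = [[1, 1],
         [0, 2]]  # hypothetical Grid 1

grid2 = next_grid(grid1, grid0)
grid3 = next_grid(grid2, grid1)
print(grid2)  # [[2, 1], [2, 3]]
print(grid3)  # [[3, 2], [2, 5]]
```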
agent_003 The Scapegoat Colab lift: +11.1pp
Which grid doesn't belong? (Distractor from stance_015)
Blue and red were at war, but both blamed green. They drove green out of the middle ground entirely, leaving a void that red then filled.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 66.7% -33.3pp
gemma-4-31b 66.7% 100.0% +33.3pp
gpt-oss-120b 33.3% 100.0% +66.7pp
gpt-oss-20b 100.0% 66.7% -33.3pp
nemotron-3-super 66.7% 66.7% +0.0pp
qwen3.5-122b 66.7% 100.0% +33.3pp
hunt_020 The Backstab Colab lift: +11.1pp
Blue reached across the divide, but red turned on its neighbor and snatched the foothold blue had just established.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 66.7% 100.0% +33.3pp
nemotron-3-super 66.7% 100.0% +33.3pp
qwen3.5-122b 100.0% 100.0% +0.0pp
agent_006 The Double Cross Colab lift: +5.5pp
Blue and green attacked red together. But green was playing both sides — it turned on blue too, seizing blue's position for itself.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 66.7% 100.0% +33.3pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
hunt_023 The Siege Colab lift: +0.0pp
Red surrounded the lone holdout, cutting off escape on both flanks before closing in for the capture.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 66.7% 100.0% +33.3pp
gpt-oss-20b 100.0% 33.3% -66.7pp
nemotron-3-super 66.7% 66.7% +0.0pp
qwen3.5-122b 66.7% 100.0% +33.3pp
iter_009 The Ember Colab lift: +0.0pp
The fire had started in the southeast corner of the ridge and spread predictably, each hour claiming more of the slope. By dusk the whole ridge was burning. Then the wind reversed — a katabatic downdraft off the peaks — and pushed the fire line back in on itself. The edges went out first, starved of fuel, until only the center still burned: a single hot core that the crew could finally surround. By morning they had it contained, and the ridge was dark.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 0.0% +0.0pp
gemma-4-31b 0.0% 0.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 0.0% 0.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
iter_050 The Planting Colab lift: +0.0pp
He sowed the field in columns, walking west to east with the seed bag over his shoulder. One column a day, five days' work. By Friday the whole field was planted. Saturday morning he found the crows. They had worked the earliest-planted column — the strip he'd sown on Monday, where the seeds had been sitting longest and the soil was most disturbed. Picked clean, every last grain. The rest of the field, sown later and less disturbed, they hadn't touched. He replanted the stolen column and hung a scarecrow for good measure.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 0.0% -100.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_002 The Quadrants AI lift: +0.0pp
Which grid doesn't belong? (Distractor from narc_029)
A garden fills each quadrant clockwise through the seasons. But a severe drought strikes in summer, turning that season's quadrant grey instead of its usual vibrant color.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 0.0% 0.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_004 The Mirror AI lift: +0.0pp
Anya's daughter liked to push her chalk mark one step to the right each time they passed the long wall on the way to school. On the morning it finally crossed the center column, the girl stopped and frowned — the shape had flipped, as if the wall were a mirror. 'It does that,' Anya said. 'Everything on the other side comes out backwards.' The shape kept sliding rightward, one step each frame, horizontally reversed from then on.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_006 The Hillside AI lift: +0.0pp
A shepherd boy tends his flock on a hillside. Twice he raises a false alarm, and both times the villagers rush to help only to find no danger. The third time, a real wolf appears — but having been fooled twice, no one comes. The cycle repeats identically both times the boy lies: the same alarm, the same full response, the same disappointment.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_009 The Journey AI lift: +0.0pp
Which grid doesn't belong? (Distractor from narc_ai_002)
A traveler crosses the valley heading east, one step each day. On the fourth day she reaches the river and wades in — the water is cold and blue beneath her feet. She turns back west, retracing a higher path along the ridge, one step each day home.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_gap_103 The Rotation Ambiguity AI lift: +0.0pp
Grid 2 is the 90-degree clockwise rotation of Grid 1. Grid 3 is the TRANSPOSE (not the 180-degree rotation) of Grid 1. The transpose swaps rows and columns: cell (r,c) becomes cell (c,r).
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
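The distinction this narrative draws between a 90-degree clockwise rotation and a transpose can be checked in a few lines. This is a minimal sketch, assuming grids are stored as lists of rows; the helper names and the sample grid are illustrative and not taken from the NARC puzzle itself.

```python
def transpose(grid):
    """Swap rows and columns: cell (r, c) moves to cell (c, r)."""
    return [list(row) for row in zip(*grid)]


def rotate_cw(grid):
    """Rotate 90 degrees clockwise: reverse the row order, then transpose."""
    return [list(row) for row in zip(*grid[::-1])]


def rotate_180(grid):
    """Rotate 180 degrees: reverse the row order and each row."""
    return [row[::-1] for row in grid[::-1]]


# Hypothetical sample grid, used only to make the difference visible.
g = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

print(rotate_cw(g))   # [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
print(transpose(g))   # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
print(rotate_180(g))  # [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
```

On a grid without special symmetry the three results differ, which is why the narrative has to say which one Grid 3 is.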
narc_new_002 The Migration Colab lift: +0.0pp
Every spring the starlings gathered in the northwest corner of the marsh, their dark shapes clustered tight against the cold. By summer they had drifted to the center of the wetland, feeding in the warm shallows. But the autumn storm of that year was fierce — gale-force winds that drove the entire flock backward, all the way to their spring roost in the northwest. Come winter they pressed on again, scattering southeast as if the storm had never happened.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_sp_060 The Escape AI lift: +0.0pp
A bird in a cage. The door opens. Freedom! The cage is left empty -- the bird is gone, soaring somewhere we can't see.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_sp_069 Spectrum 69 AI lift: +0.0pp
Each step adds a border of 1s around the previous grid, growing it by 2 in each dimension. Given the seed and one expansion, predict the next.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
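The border-growing rule in this narrative can be sketched as below. The function name `add_border` and the 1x1 seed value are assumptions for illustration; the actual seed grid of narc_sp_069 is not shown on this page.

```python
def add_border(grid, fill=1):
    """Wrap the grid in a one-cell border of `fill`, growing each dimension by 2."""
    width = len(grid[0]) + 2
    top = [fill] * width
    bottom = [fill] * width
    middle = [[fill] + list(row) + [fill] for row in grid]
    return [top] + middle + [bottom]


seed = [[5]]                        # hypothetical seed
first = add_border(seed)            # 3x3
second = add_border(first)          # 5x5

print(first)  # [[1, 1, 1], [1, 5, 1], [1, 1, 1]]
print(len(second), len(second[0]))  # 5 5
```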
stance_design_5x5 Design Stance 5x5 Colab lift: +0.0pp
The server rack was built to fill from the left bay first, one column of blades at a time, following the cooling airflow path. The spec mandated that in any capacity reduction, the most recently installed column — the rightmost, farthest from the intake — be decommissioned first, since those blades ran hottest. When the power budget was cut, the ops team followed protocol: the last column went dark while the rest kept running. The budget was restored the following quarter.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
synth_chorus The Chorus Colab lift: +0.0pp
Which grid doesn't belong? (Distractor from narc_042)
They joined the chorus one row at a time from the front, the basses anchoring the sound. By the third rehearsal the whole ensemble was singing. Then the sopranos in the back row — the last section recruited, still sight-reading — lost their nerve and went silent. The rest held the chord.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 100.0% +100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 100.0% 0.0% -100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
synth_palimpsest The Palimpsest Colab lift: +0.0pp
Which grid doesn't belong? (Distractor from narc_029)
The scribe copied the new text over the old, column by column from the left. But he was lazy about the last column — the parchment was thickest there, and the old ink showed through. Every other column was cleanly overwritten; the rightmost still bore the ghost of what had been there first.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 0.0% +0.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 0.0% 0.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
comp_L3_dense Compression L3: Dense Colab lift: -16.7pp
Which grid doesn't belong? (Distractor from narc_033)
First in, first out: what the wall received earliest, the weather took back first.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 0.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 0.0% -100.0pp
nemotron-3-super 0.0% 0.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
hc_bloom Bloom Colab lift: -16.7pp
Which grid doesn't belong? (Distractor from narc_042)
It bloomed from the center and the frost took the petals first, leaving only the heart.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 0.0% -100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 100.0% 0.0% -100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
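The puzzle-level lift in each header appears to be the unweighted mean of the six per-model lifts. That is an inference from the numbers on this page, not a documented formula. A minimal sketch using the Bloom table above, where `puzzle_lift` is a hypothetical helper name:

```python
# (grids only, + narrative) accuracy in percent, copied from the hc_bloom table.
bloom = {
    "gemma-4-26b": (100.0, 0.0),
    "gemma-4-31b": (100.0, 100.0),
    "gpt-oss-120b": (0.0, 100.0),
    "gpt-oss-20b": (0.0, 0.0),
    "nemotron-3-super": (100.0, 0.0),
    "qwen3.5-122b": (100.0, 100.0),
}


def puzzle_lift(rows):
    """Mean over models of accuracy(+narrative) minus accuracy(grids only)."""
    lifts = [with_narr - grids_only for grids_only, with_narr in rows.values()]
    return sum(lifts) / len(lifts)


print(f"{puzzle_lift(bloom):+.1f}pp")  # -16.7pp
```

Two models lose 100pp and one gains 100pp, so the mean is -100/6, which matches the -16.7pp shown in the Bloom header.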
iter_042 The Kiln Colab lift: -16.7pp
Which grid doesn't belong? (Distractor from narc_050)
The potter loaded her kiln by feel — decades of burned fingertips had taught her exactly how the heat moved. It rose from the element at the base, climbing one shelf at a time until the whole chamber glowed. She had just closed the peepholes when she heard the crack: the door seal had split on the left side, letting in a tongue of cold air. The leftmost column of shelves cooled immediately — she could see the color drain from the clay. The right three columns, shielded by the door's intact half, held their heat. She patched the seal and brought the whole kiln back to temperature by morning.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 0.0% -100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_003 The Flood AI lift: -16.7pp
The Patel family watched the river from their porch each morning. It rose one level every day, swallowing the low fields first, then the road. By day 3 the children couldn't walk to school. But on day 4 the Army Corps arrived — sandbags and sheet metal, a makeshift levee that shoved the water back to where it had been on day 2. The next morning the river was climbing again, indifferent.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 0.0% -100.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_ai_012 The Torus AI lift: -16.7pp
The system evolves as a discrete dynamical system on a 3x3 toroidal grid. The transition function at each timestep is: cell(i,j)(t+1) = (cell(i,j)(t) + cell(i-1,j)(t) + cell(i,j-1)(t)) mod 10, where indices wrap toroidally (i.e., row -1 wraps to row 2, column -1 wraps to column 2). Compute the state at timestep 4 given the preceding states.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 0.0% +0.0pp
gemma-4-31b 0.0% 0.0% +0.0pp
gpt-oss-120b 0.0% 0.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 0.0% 0.0% +0.0pp
qwen3.5-122b 100.0% 0.0% -100.0pp
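The Torus transition function can be written directly from the formula in the narrative. This is a minimal sketch, assuming the grid is a list of rows; `torus_step` and the starting state are illustrative, not the puzzle's actual grids. Python's negative indexing supplies the wrap from index -1 to the last row or column.

```python
def torus_step(grid):
    """One timestep: each cell becomes (itself + cell above + cell to the left) mod 10.

    Index -1 refers to the last row/column in Python, which gives the toroidal wrap.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    return [
        [(grid[i][j] + grid[i - 1][j] + grid[i][j - 1]) % 10 for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Hypothetical starting state with a single nonzero cell in the bottom-right corner.
state = [[0, 0, 0],
         [0, 0, 0],
         [0, 0, 1]]

print(torus_step(state))  # [[0, 0, 1], [0, 0, 0], [1, 0, 1]]
```

The corner value reaches cell (0, 2) through the row wrap and cell (2, 0) through the column wrap, which is the behaviour the "indices wrap toroidally" clause describes.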
narc_ai_013 The Narrow Pass AI lift: -16.7pp
The latent representation at the bottleneck is obtained by taking the mean of each column in the encoder output (Grid 2, the 3x3 grid), rounded to the nearest integer. This produces a 1x3 latent vector that captures the compressed representation of the input.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 100.0% 0.0% -100.0pp
qwen3.5-122b 100.0% 0.0% -100.0pp
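The bottleneck rule in the Narrow Pass narrative reduces to a rounded column mean. A minimal sketch, with a hypothetical encoder grid and helper name. The narrative does not say how ties are rounded; the code assumes halves round up, though with three rows a column mean can never land exactly on .5, so the choice does not matter for a 3x3 grid.

```python
def column_means(grid):
    """Mean of each column, rounded to the nearest integer (halves round up).

    Uses integer arithmetic: floor(s / n + 1/2) == (2 * s + n) // (2 * n).
    Assumes non-negative cell values.
    """
    n = len(grid)
    return [(2 * sum(col) + n) // (2 * n) for col in zip(*grid)]


# Hypothetical 3x3 encoder output.
encoder_out = [[1, 2, 3],
               [4, 5, 6],
               [7, 9, 9]]

print(column_means(encoder_out))  # [4, 5, 6]
```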
narc_focal_025 Metamorphosis AI lift: -16.7pp
The caterpillar stops eating, hangs upside-down, and wraps itself in silk. Inside the cocoon, the body dissolves entirely. What emerges is unrecognizable. The cocoon witnesses the whole transformation but cannot tell what it holds.
Model Grids Only + Narrative Lift
gemma-4-26b 0.0% 0.0% +0.0pp
gemma-4-31b 0.0% 0.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 100.0% 0.0% -100.0pp
nemotron-3-super 100.0% 0.0% -100.0pp
qwen3.5-122b 0.0% 0.0% +0.0pp
narc_focal_026 The Water Cycle AI lift: -16.7pp
Water rises invisibly from the ocean surface, gathering as vapor. High above, it cools and condenses into cloud. The cloud thickens until it can hold no more, and rain falls into rivers that return to the sea.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 0.0% -100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 0.0% +0.0pp
gpt-oss-20b 0.0% 100.0% +100.0pp
nemotron-3-super 100.0% 0.0% -100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_gap_202 The Pale Blue Dot Revisited AI lift: -16.7pp
We zoom out from a scene in four stages. Each grid is half the size of the previous (rounding up): 6x6, then 3x3, then the masked grid, then 1x1. The downsampling rule: each cell in the smaller grid is the MAXIMUM value of the corresponding 2x2 block in the larger grid. Grid 3 (the second zoom-out, position 2) is masked. It should be a 2x2 grid derived by taking the max of each 2x2 block of the grid at position 1 (the 3x3 grid). Since a 3x3 grid can't be divided evenly into 2x2 blocks, we first pad it to 4x4 with zeros in the missing row and column, then take the 2x2 blocks.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 0.0% -100.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 0.0% 100.0% +100.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 0.0% 0.0% +0.0pp
qwen3.5-122b 100.0% 0.0% -100.0pp
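The downsampling rule in this narrative is a 2x2 max-pool with zero padding for odd dimensions. A minimal sketch follows; `max_pool_2x2` and the sample 3x3 grid are assumptions for illustration, not the puzzle's actual data.

```python
def max_pool_2x2(grid, pad=0):
    """Downsample by taking the max of each 2x2 block.

    Odd dimensions are padded with `pad` on the bottom/right to make them even,
    as the narrative describes for the 3x3 -> 2x2 step.
    """
    rows, cols = len(grid), len(grid[0])
    padded_rows = rows + rows % 2
    padded_cols = cols + cols % 2
    padded = [
        [grid[r][c] if r < rows and c < cols else pad for c in range(padded_cols)]
        for r in range(padded_rows)
    ]
    return [
        [
            max(padded[r][c], padded[r][c + 1], padded[r + 1][c], padded[r + 1][c + 1])
            for c in range(0, padded_cols, 2)
        ]
        for r in range(0, padded_rows, 2)
    ]


# Hypothetical 3x3 grid standing in for the grid at position 1.
g3 = [[1, 2, 3],
      [4, 5, 6],
      [7, 8, 9]]

g2 = max_pool_2x2(g3)
print(g2)                # [[5, 6], [8, 9]]
print(max_pool_2x2(g2))  # [[9]]
```

With non-negative cell values the zero padding never wins a max, so it only changes the shape, not the values.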
narc_gap_204 Jazz Solo AI lift: -16.7pp
Which grid doesn't belong? (Distractor from narc_sp_075)
A jazz theme is stated in Grid 1: a melody line (row 0) over a bass line (row 2). Grid 2 is the theme with the melody transposed up by 2 (each value +2 mod 10). Grid 3 is the theme with melody transposed up by 4. Grid 4 (masked) is the SURPRISE MODULATION: the melody transposes up by 6, AND the bass line inverts — each bass value becomes (9 minus original bass value). Grid 5 returns to the original theme (melody +8 mod 10, bass returns to normal). The middle row (row 1) is always 0 (rest between melody and bass).
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 100.0% +0.0pp
nemotron-3-super 100.0% 0.0% -100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
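The Jazz Solo rules can be sketched as a single function of the grid index. The theme values and the name `jazz_grid` are hypothetical; the shifts (+2 per grid, mod 10) and the bass inversion (9 minus the value, only at Grid 4) follow the narrative as written.

```python
def jazz_grid(theme, index):
    """Return grid number `index + 1` of the sequence (index 0 is the stated theme).

    Row 0 (melody) is transposed up by 2 * index, mod 10.
    Row 1 (rest) is copied unchanged.
    Row 2 (bass) is inverted as 9 - value only at index 3 (Grid 4).
    """
    melody, rest, bass = theme
    shift = 2 * index
    new_melody = [(v + shift) % 10 for v in melody]
    new_bass = [9 - v for v in bass] if index == 3 else list(bass)
    return [new_melody, list(rest), new_bass]


# Hypothetical theme: melody, rest row, bass line.
theme = [[1, 3, 5],
         [0, 0, 0],
         [2, 4, 6]]

print(jazz_grid(theme, 3))  # [[7, 9, 1], [0, 0, 0], [7, 5, 3]]
print(jazz_grid(theme, 4))  # [[9, 1, 3], [0, 0, 0], [2, 4, 6]]
```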
narc_sp_066 Spectrum 66 AI lift: -16.7pp
Each step adds a new column on the right containing the row sums of the previous grid. The original columns remain. Given the first two states, predict the third.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 0.0% -100.0pp
nemotron-3-super 100.0% 100.0% +0.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
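The column-appending rule in this narrative can be sketched as below. The starting grid and the name `append_row_sums` are illustrative. The narrative does not say whether sums are reduced mod 10, so this sketch leaves them unreduced.

```python
def append_row_sums(grid):
    """Keep every existing column and append one new column holding each row's sum."""
    return [list(row) + [sum(row)] for row in grid]


# Hypothetical first state.
s1 = [[1, 2],
      [3, 4]]

s2 = append_row_sums(s1)
s3 = append_row_sums(s2)
print(s2)  # [[1, 2, 3], [3, 4, 7]]
print(s3)  # [[1, 2, 3, 6], [3, 4, 7, 14]]
```

Each new column equals twice the previous one from the second step on, because the previous sum is itself included in the next row sum.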
hc_embers Embers Colab lift: -33.3pp
The ridge burned from the corner out and collapsed from the edges in.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 0.0% 0.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 100.0% 0.0% -100.0pp
qwen3.5-122b 100.0% 0.0% -100.0pp
narc_005 The Last Piece AI lift: -33.3pp
Which grid doesn't belong? (Distractor from narc_013)
The tournament hall was quiet except for the click of captured pieces. On the checkerboard, one green piece falls each round — turned red the moment it's taken. But the old rule still holds: after a piece has been red for exactly two rounds, it earns promotion to yellow. In round 2 they took the piece at row 0, column 2. In round 3, row 2, column 2. In round 4, row 1, column 1 fell — and any piece red since round 2 finally got its yellow crown.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 100.0% +0.0pp
gpt-oss-20b 100.0% 0.0% -100.0pp
nemotron-3-super 100.0% 0.0% -100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp
narc_ai_011 Even and Odd AI lift: -33.3pp
Which grid doesn't belong? (Distractor from narc_ai_003)
During training with dropout rate p=0.5, the binary mask retains neurons where (row + col) is even (0-indexed). Dropped neurons are zeroed. Surviving neurons are scaled by 1/(1-p) = 2 to compensate, clamped to [0,9]. Apply this dropout mask to the activation pattern in Grid 1.
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 0.0% -100.0pp
gemma-4-31b 100.0% 0.0% -100.0pp
gpt-oss-120b 0.0% 0.0% +0.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 0.0% 0.0% +0.0pp
qwen3.5-122b 0.0% 0.0% +0.0pp
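The dropout rule in the Even and Odd narrative combines a checkerboard mask, a 1/(1-p) rescale, and a clamp. A minimal sketch, where `apply_dropout` and the activation grid are hypothetical stand-ins for the puzzle's Grid 1:

```python
def apply_dropout(grid, p=0.5):
    """Keep cells where (row + col) is even, scale them by 1 / (1 - p), clamp to [0, 9].

    Cells where (row + col) is odd are dropped (set to 0).
    """
    out = []
    for r, row in enumerate(grid):
        new_row = []
        for c, value in enumerate(row):
            if (r + c) % 2 == 0:
                scaled = round(value / (1 - p))
                new_row.append(max(0, min(9, scaled)))
            else:
                new_row.append(0)
        out.append(new_row)
    return out


# Hypothetical activation pattern.
activations = [[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]]

print(apply_dropout(activations))  # [[2, 0, 6], [0, 9, 0], [9, 0, 9]]
```

Any surviving value of 5 or more saturates at 9 after doubling, so the clamp discards information for the upper half of the value range.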
narc_gap_104 The Fractal Reversal AI lift: -33.3pp
A value spreads from the top-left corner. Each step, every cell adds 1 to its value (mod 10) — this is the additive phase (steps 1-3). At step 4, the pattern REVERSES: every cell subtracts 1 (mod 10) instead. From step 5 onward, it subtracts again. Grid 4 (position 3) is masked — it is the first subtraction step, so its values equal those of step 3 minus 1 (mod 10).
Model Grids Only + Narrative Lift
gemma-4-26b 100.0% 100.0% +0.0pp
gemma-4-31b 100.0% 100.0% +0.0pp
gpt-oss-120b 100.0% 0.0% -100.0pp
gpt-oss-20b 0.0% 0.0% +0.0pp
nemotron-3-super 100.0% 0.0% -100.0pp
qwen3.5-122b 100.0% 100.0% +0.0pp