StochasticGoose is an action-learning agent for the ARC-AGI-3 Agent Preview Competition. It uses a simple reinforcement learning approach to predict which actions will cause frame changes, enabling more efficient exploration than random selection.
- Lead Developer: Dries Smit
- Adviser/Reviewer: Jack Cole
The action learning agent uses a CNN-based model to predict which actions (ACTION1-ACTION6) will result in new frame states. This enables more targeted exploration by biasing action selection toward actions predicted to cause changes.
Key Features:
- CNN with shared backbone for action and coordinate prediction
- Binary classification: predicts if actions will change the current frame
- Hierarchical sampling: first select action type, then coordinate if needed. The coordinate sampling is done purely through convolution to retain the 2D grid bias.
- Efficient experience buffer that stores all experiences with hash-based deduplication for maximum sample efficiency under the ~200k-sample constraint (see the sketch after this list)
- Dynamic model reset when reaching new levels
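A minimal sketch of the deduplication idea (the class, method names, and hashing scheme here are illustrative, not the repo's actual API):

import hashlib

class DedupBuffer:
    """Stores unique (state, action) -> frame_changed examples, keyed by a content hash."""
    def __init__(self, max_size: int = 200_000):
        self.max_size = max_size
        self.examples: dict[bytes, tuple] = {}

    def add(self, state_bytes: bytes, action: int, frame_changed: bool) -> bool:
        # Hash the (state, action) pair so repeated experiences are stored only once.
        key = hashlib.sha256(state_bytes + bytes([action])).digest()
        if key in self.examples or len(self.examples) >= self.max_size:
            return False
        self.examples[key] = (state_bytes, action, frame_changed)
        return True

Keying on a hash of the full state-action pair keeps every distinct observation while discarding exact repeats, which matters under a fixed sample budget.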
Requirements:
- Python 3.10+
- CUDA-capable GPU (recommended)
- uv package manager
Clone the repository with its submodules:
git clone --recurse-submodules [email protected]:DriesSmit/ARC3-solution.git
cd ARC3-solution
Copy the example environment file and set your API key (available from https://three.arcprize.org/user):
cd ARC-AGI-3-Agents
cp .env-example .env
# Then edit .env file and replace the empty ARC_API_KEY= with your actual API key
cd ..
make install
Add the following code to ARC-AGI-3-Agents/agents/__init__.py (under the imports and before load_dotenv()):
import sys
import os

# Make the repository root importable so the custom agent module can be found.
sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))))
from custom_agent import *
Also, add the following field to the FrameData class in ARC-AGI-3-Agents/agents/structs.py (after the full_reset field):
available_actions: list[GameAction] = Field(default_factory=list)
Then run the agent:
make action
Model Architecture:
- Input: 16-channel one-hot encoded frames (64x64)
- Backbone: 4-layer CNN (32→64→128→256 channels)
- Action Head: Predicts frame-change probabilities for ACTION1-ACTION5
- Coordinate Head: Predicts 64x64 click-position probabilities for ACTION6, keeping the 2D inductive bias by using convolutional layers instead of flattened representations
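A minimal PyTorch sketch consistent with the description above (module names, kernel sizes, and the pooling choice in the action head are assumptions; the actual model may differ):

import torch
import torch.nn as nn

class ChangePredictor(nn.Module):
    """Shared CNN backbone with an action head and a fully convolutional coordinate head."""
    def __init__(self):
        super().__init__()
        # 4-layer backbone: 16 one-hot input channels -> 32 -> 64 -> 128 -> 256.
        chans = [16, 32, 64, 128, 256]
        layers = []
        for c_in, c_out in zip(chans[:-1], chans[1:]):
            layers += [nn.Conv2d(c_in, c_out, kernel_size=3, padding=1), nn.ReLU()]
        self.backbone = nn.Sequential(*layers)
        # Action head: global pooling + linear -> one change logit per ACTION1-ACTION5.
        self.action_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(256, 5))
        # Coordinate head: a 1x1 convolution keeps the 2D grid bias; one logit per cell.
        self.coord_head = nn.Conv2d(256, 1, kernel_size=1)

    def forward(self, frames: torch.Tensor):
        feats = self.backbone(frames)                      # frames: (B, 16, 64, 64)
        action_logits = self.action_head(feats)            # (B, 5)
        coord_logits = self.coord_head(feats).squeeze(1)   # (B, 64, 64) for ACTION6
        return action_logits, coord_logits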
Training:
- Supervised Learning: (state, action) → frame_changed labels
- Experience Buffer: 200K unique state-action pairs with hash-based deduplication
- Dynamic Reset: Clears buffer and resets model when reaching new levels
- Loss: Binary cross-entropy with light entropy regularization
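A sketch of the loss, assuming the entropy term is computed from the same sigmoid probabilities (the 0.01 weight is illustrative, not the repo's actual value):

import torch
import torch.nn.functional as F

def change_loss(logits: torch.Tensor, changed: torch.Tensor, entropy_weight: float = 0.01):
    """Binary cross-entropy on frame-change labels with a light entropy bonus."""
    # Main objective: did this (state, action) pair change the frame?
    bce = F.binary_cross_entropy_with_logits(logits, changed.float())
    # Entropy regularization discourages overconfident, saturated predictions.
    p = torch.sigmoid(logits)
    entropy = -(p * torch.log(p + 1e-8) + (1 - p) * torch.log(1 - p + 1e-8)).mean()
    return bce - entropy_weight * entropy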
Inference:
- Stochastic Sampling: Uses sigmoid change probabilities for action selection
- Hierarchical Selection: First sample the action type, then a coordinate if ACTION6 is chosen (see the sketch after this list)
- Change Prediction: Biases exploration toward actions predicted to cause changes
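A sketch of the hierarchical selection step (how ACTION6's overall score is combined with the other actions is an assumption here; the repo may aggregate differently):

import torch

def sample_action(action_logits: torch.Tensor, coord_logits: torch.Tensor):
    """Pick an action type first; sample a click cell only if ACTION6 is chosen."""
    # Change probabilities for ACTION1-ACTION5, plus a score for ACTION6 taken
    # here as the best cell's probability, renormalized for sampling.
    p_actions = torch.sigmoid(action_logits)                 # (5,)
    p_click = torch.sigmoid(coord_logits)                    # (64, 64)
    scores = torch.cat([p_actions, p_click.max().reshape(1)])
    choice = torch.multinomial(scores / scores.sum(), 1).item()
    if choice < 5:
        return choice + 1, None                              # ACTION1..ACTION5
    # ACTION6: sample (y, x) from the normalized 64x64 coordinate probabilities.
    flat = p_click.flatten()
    idx = torch.multinomial(flat / flat.sum(), 1).item()
    return 6, divmod(idx, p_click.shape[-1])

Sampling from the coordinate map directly, rather than flattening it through a linear layer, is what preserves the 2D grid bias mentioned under Key Features.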
The agent generates comprehensive logs and TensorBoard metrics:
# View training metrics
make tensorboard
# Open http://localhost:6006 in browser
ARC3/
├── ARC-AGI-3-Agents/ # Competition framework (submodule)
├── custom_agents/
│ ├── __init__.py # Agent registration
│ ├── action.py # Main action learning agent
│ └── view_utils.py # Visualization utilities
├── custom_agents.py # Agent imports
├── Makefile # Build commands
├── README.md # This file
├── requirements.txt # Python dependencies
└── utils.py # Shared utilities
# Standard competition run
make action
# Run with specific game ID
uv run ARC-AGI-3-Agents/main.py --agent=action --game=vc33
# View logs and metrics
make tensorboard
# Clean generated files
make clean