Did you ever want to SELECT * FROM clean_data_you_think_exists
? Well, now you can!
A DuckDB extension that lets you query any data using natural language or run arbitrary JavaScript code directly in your SQL queries.
⚠️ WARNING: HIGHLY EXPERIMENTAL⚠️ This extension is in early experimental stage and should be used with extreme caution.
- DO NOT use in production environments
- The generated code may be unpredictable or incorrect
- Always review the generated queries before executing
- API calls to LLMs incur costs - monitor your usage
- Proceed at your own risk!
Query any data using natural language in DuckDB, or execute JavaScript code directly! Powered by LLMs (OpenAI/Anthropic) and Deno.
# Clone the repo
git clone --recurse-submodules https://github.com/nicosuave/wizard
cd wizard
# Set your API key
export OPENAI_API_KEY="your-api-key" # or ANTHROPIC_API_KEY
# Build the extension
make build # for debug build
# OR
make release # for optimized build
# Try it out!
duckdb -unsigned
Then in DuckDB:
LOAD 'build/release/wizard.duckdb_extension';
SELECT * FROM wizard('bitcoin price');
-- Load the extension
LOAD 'path/to/wizard.duckdb_extension';
-- Fetch recent earthquake data
SELECT * FROM wizard('recent earthquakes for past month');
-- Get cryptocurrency prices
SELECT * FROM wizard('Bitcoin price in USD');
-- Weather information
SELECT * FROM wizard('current weather in Seattle');
-- Programming jokes
SELECT * FROM wizard('random programming joke');
-- Complex queries work too!
SELECT magnitude, place, time
FROM wizard('earthquakes magnitude > 5.0 past week')
WHERE place LIKE '%California%'
ORDER BY magnitude DESC;
-- Run arbitrary JavaScript code directly!
SELECT * FROM js('
async function fetch_data() {
// You can use any JavaScript here, including npm packages
const response = await fetch("https://api.github.com/repos/duckdb/duckdb");
const data = await response.json();
return [{
name: data.name,
stars: data.stargazers_count,
language: data.language
}];
}
');
-- Use npm packages in your JavaScript
SELECT * FROM js('
import dayjs from "npm:dayjs";
async function fetch_data() {
const now = dayjs();
return [{
current_time: now.format(),
unix_timestamp: now.unix(),
day_of_week: now.format("dddd")
}];
}
');
- Rust toolchain (for building)
- Make
- Git
- Clone the repository
git clone --recurse-submodules https://github.com/nicosuave/wizard
cd wizard
- Set your LLM API key (you'll need one of these)
# For OpenAI
export OPENAI_API_KEY="your-openai-api-key"
# OR for Anthropic (untested)
export ANTHROPIC_API_KEY="your-anthropic-api-key"
- Build the extension
# Debug build (faster compilation, includes debug symbols)
make build
# OR Release build (optimized for performance)
make release
The extension will be built to:
- Debug:
build/debug/wizard.duckdb_extension
- Release:
build/release/wizard.duckdb_extension
Always run DuckDB in unsigned mode to load custom extensions:
# Start DuckDB in unsigned mode
duckdb -unsigned
# Or with a database file
duckdb -unsigned mydata.db
Then load the extension:
-- For release build
LOAD 'build/release/wizard.duckdb_extension';
-- Or for debug build
LOAD 'build/debug/wizard.duckdb_extension';
-- Now you can use it!
SELECT * FROM wizard('current weather in Seattle');
The wizard extension operates with significant constraints:
What it CAN do:
- Make HTTP/HTTPS requests using Deno's built-in fetch
- Import and use any npm package via Deno's npm specifier
- Parse JSON responses natively
- Transform API responses into DuckDB tables
- Execute modern JavaScript/TypeScript code
- Cache responses for performance (60x speedup)
What it CANNOT do:
- Access the local filesystem (sandboxed)
- Make direct database connections
- Execute arbitrary system commands
The extension works by having the LLM generate JavaScript code that:
- Uses Deno's built-in fetch() for HTTP requests
- Can import npm packages like
npm:yahoo-finance2
ornpm:dayjs
- Returns data as an array of objects
Responses are cached to improve performance:
- Same queries return instantly from cache (60x faster)
- Cache persists for the session
- Use
--bust-cache
flag to force a fresh API call:SELECT * FROM wizard('bitcoin price --bust-cache');
- Debug mode available with
--debug
flag to see generated code
# Start DuckDB in unsigned mode
duckdb -unsigned
# Load the extension (adjust path based on your build type)
LOAD 'build/release/wizard.duckdb_extension';
# Query away!
SELECT * FROM wizard('show me Tesla stock data for the last week');
You can also execute JavaScript code directly without going through an LLM:
-- Simple example
SELECT * FROM js('
async function fetch_data() {
return [
{ message: "Hello from JavaScript!", value: 42 }
];
}
');
-- Fetch data from APIs
SELECT * FROM js('
async function fetch_data() {
const response = await fetch("https://api.coinbase.com/v2/exchange-rates?currency=BTC");
const data = await response.json();
return [{
currency: "BTC",
usd_price: parseFloat(data.data.rates.USD),
timestamp: new Date().toISOString()
}];
}
');
-- Use npm packages
SELECT * FROM js('
import { format } from "npm:date-fns";
async function fetch_data() {
const dates = ["2024-01-01", "2024-06-15", "2024-12-25"];
return dates.map(date => ({
original: date,
formatted: format(new Date(date), "MMMM do, yyyy"),
day_of_week: format(new Date(date), "EEEE")
}));
}
');
The js()
function:
- Executes arbitrary JavaScript/TypeScript code
- Has access to Deno's fetch API and npm packages
- Must define an async
fetch_data()
function that returns an array of objects - Runs in the same sandboxed Deno environment as wizard-generated code
- Your natural language query is sent to an LLM (OpenAI or Anthropic)
- The LLM generates JavaScript code that fetches the requested data
- The code is executed in a sandboxed Deno environment
- Deno handles all HTTP requests and npm package imports
- Results are returned as a DuckDB table that you can query with SQL
The extension leverages Deno's secure runtime and built-in fetch API, plus its ability to import npm packages directly.
- "No API key found": Set your OPENAI_API_KEY or ANTHROPIC_API_KEY environment variable
- "Extension not found": Make sure you've built the extension with
make release
- Rate limits: The extension uses real API calls, so you may hit rate limits with many queries
- Slow first query: The first query calls the LLM API; subsequent identical queries use cache
make test_debug # Test debug build
make test_release # Test release build
MIT License - see LICENSE file for details.