AI-Powered SQL Governance & Query Rewriting Library
AIDataRoleGuard is an intelligent SQL rewriting engine designed for the age of autonomous data access. As AI agents increasingly generate and execute SQL queriesβwhether in analytics tools, automation scripts, or natural language interfacesβAIDataRoleGuard ensures sensitive data stays protected through role-based masking and granular permission enforcement.
AI systems are amazing at discovering patterns, asking complex questions, and generating SQL on the flyβbut they often lack contextual awareness of data sensitivity and organizational policy.
AIDataRoleGuard steps in as the safeguard between generated SQL and your database, rewriting queries to enforce access rules and protect sensitive information. No matter how the SQL is writtenβby humans or AIβthe library ensures it only retrieves what the user's role allows.
π§ AI is great at asking questions. AIDataRoleGuard makes sure it never asks for answers it shouldn't see.
AIDataRoleGuard acts as an intelligent middleware layer between your application and database, intercepting and rewriting SQL queries based on role-based policies:
βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ
β AI Agent β β Application β β AIDataRoleGuard β β Database β
β Generated βββββΆβ Layer βββββΆβ SQL Rewriter βββββΆβ (MySQL/PG/ β
β SQL β β β β β β SQLite/etc) β
βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ
β
βΌ
βββββββββββββββββββ
β Role Config β
β & Audit Log β
βββββββββββββββββββ
How It Works:
- Query Interception: Captures SQL before it reaches the database
- Role Resolution: Identifies the current user's role and permissions
- Policy Application: Applies masking, filtering, and access rules
- Query Rewriting: Transforms the original SQL to enforce policies
- Audit Logging: Records all transformations for compliance and monitoring
- AI-Powered SQL Rewriting: Dynamically transforms
SELECT
,INSERT
,UPDATE
, andDELETE
based on role config. - Policy-Driven Architecture: Use clean JSON/YAML to declare access, masking, and mutations.
- Stack-Agnostic Design: Plug it into any backend that emits SQLβNode.js, Python, Java, Go.
- Field-Level Masking: Replace sensitive columns with safe placeholders like
'***'
. - Audit Logging: Track every rewrite for compliance, debugging, and behavioral analytics.
- RAG Systems: Protect customer PII when AI agents query knowledge bases
- Business Intelligence: Ensure AI-generated reports respect data governance
- Natural Language Query: Shield sensitive data in conversational analytics tools
- Automated Insights: Control what data AI can access for pattern discovery
- Multi-Tenant SaaS: Automatically enforce tenant isolation in shared databases
- Role-Based Dashboards: Dynamically filter data based on user permissions
- API Gateways: Add data governance to microservices without code changes
- Data Exploration Tools: Allow analysts to explore data within security boundaries
- GDPR Compliance: Automatically mask personal data for non-privileged users
- SOX Controls: Ensure financial data access follows strict authorization rules
- Healthcare (HIPAA): Protect patient information in medical record systems
- Financial Services: Maintain data privacy in banking and trading applications
- Data Anonymization: Automatically sanitize production data for development environments
- A/B Testing: Control feature access based on user segments and roles
- Staging Environments: Ensure test environments don't expose sensitive production data
roles:
- name: ai-agent
permissions:
- resource: customer_data
actions: [select]
mask: ["email", "creditCard", "ssn"]
- name: admin
permissions:
- resource: customer_data
actions: [select, insert, update, delete]
mask: []
Input:
SELECT customerName, email FROM customer_data;
Output for ai-agent role:
SELECT customerName, '***' AS email FROM customer_data;
βββ SQLProcessor
β βββ processQuery(query: string, permissions: Permission[]) β string
β βββ extractTableNames(query: string) β string[]
β βββ extractColumns(query: string) β string[]
β βββ maskColumns(sql: string, maskingRules: MaskingRule[]) β string
β βββ addWhereFilters(sql: string, filters: string[]) β string
β βββ injectRowLevelSecurity(sql: string, userContext: UserContext) β string
β βββ validateBasicSyntax(query: string) β boolean
β βββ isSelectQuery(query: string) β boolean
- All-in-one SQL processing - extraction, validation, and rewriting
- Direct string manipulation using regex patterns
- Fast and simple - perfect for AI-generated queries
- Focus on SELECT statements - primary use case for AI agents
βββ RoleEngine
β βββ resolveRole(context: UserContext) β Role
β βββ getPermissions(role: Role, resource: string) β Permission[]
β βββ canAccess(role: Role, action: string, resource: string) β boolean
β βββ getMaskingRules(role: Role, resource: string) β MaskingRule[]
- Manages role resolution from user context
- Enforces permission checks
- Provides masking rules for sensitive fields
βββ PolicyManager
β βββ loadPolicies(source: string) β Policy[]
β βββ validatePolicy(policy: Policy) β ValidationResult
β βββ refreshPolicies() β void
β βββ getPolicyForResource(resource: string) β Policy
- Loads role configurations from YAML/JSON
- Validates policy syntax and logic
- Supports hot-reloading of policies
βββ AuditLogger
β βββ logQueryRewrite(original: string, rewritten: string, context: AuditContext) β void
β βββ logAccessAttempt(user: string, resource: string, action: string) β void
β βββ logPolicyViolation(violation: PolicyViolation) β void
β βββ generateAuditReport(timeRange: TimeRange) β AuditReport
- Records all query transformations
- Tracks access attempts and violations
- Generates compliance reports
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Application Layer β
β βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ β
β β AI Agents β β Web Apps β β APIs β β
β βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β AIDataRoleGuard Core β
β βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ β
β β SQL Processor β β Role Engine β β Policy Manager β β
β βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ β
β βββββββββββββββββββ βββββββββββββββββββ β
β β Audit Logger β β Cache Layer β β
β βββββββββββββββββββ βββββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Database Layer β
β βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ β
β β MySQL β β PostgreSQL β β SQLite β β
β βββββββββββββββββββ βββββββββββββββββββ βββββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
# Enhanced role configuration with more granular controls
roles:
- name: ai-agent
permissions:
- resource: customer_data
actions: [select]
conditions:
- field: created_date
operator: ">="
value: "2024-01-01"
mask:
- field: email
type: partial
pattern: "***@domain.com"
- field: ssn
type: full
replacement: "***-**-****"
- field: creditCard
type: tokenize
algorithm: "sha256"
row_filters:
- "status = 'active'"
- "tenant_id = ${user.tenant_id}"
- resource: analytics_data
actions: [select, insert]
aggregate_only: true # Only allow aggregate functions
- name: data-analyst
permissions:
- resource: customer_data
actions: [select]
mask:
- field: ssn
type: full
replacement: "***-**-****"
time_restrictions:
- days: ["monday", "tuesday", "wednesday", "thursday", "friday"]
- hours: ["09:00", "17:00"]
-
Phase 1: Core String Processor (MVP)
- Regex-based SQL parsing for SELECT statements
- Column masking with simple replacements
- Basic WHERE clause injection
- Target: 90% of AI-generated queries
-
Phase 2: Enhanced String Processing
- Support for INSERT/UPDATE/DELETE operations
- Advanced masking patterns (partial, tokenization)
- Multi-table query support
- Target: Handle edge cases and expand coverage
-
Phase 3: Performance & Scale
- Query result caching
- Policy hot-reloading
- Audit logging optimization
- Target: Production-ready performance
-
Phase 4: AI-Enhanced Features
- Query pattern analysis
- Automated policy recommendations
- Anomaly detection
- Target: Intelligent governance
class AIDataRoleGuard {
constructor(
private sqlProcessor: SQLProcessor,
private roleEngine: RoleEngine,
private auditLogger: AuditLogger
) {}
rewriteQuery(sql: string, userRole: string): string {
// 1. Get permissions for user role
const permissions = this.roleEngine.getPermissions(userRole);
// 2. Process and rewrite query in one step
const rewrittenSQL = this.sqlProcessor.processQuery(sql, permissions);
// 3. Log the transformation
this.auditLogger.log(sql, rewrittenSQL, userRole);
return rewrittenSQL;
}
}
class SQLProcessor {
processQuery(sql: string, permissions: Permission[]): string {
// 1. Extract table and column info
const tables = this.extractTableNames(sql);
const columns = this.extractColumns(sql);
// 2. Apply masking rules
let rewrittenSQL = this.maskColumns(sql, permissions.getMaskingRules());
// 3. Add row-level filters
rewrittenSQL = this.addWhereFilters(rewrittenSQL, permissions.getRowFilters());
return rewrittenSQL;
}
private maskColumns(sql: string, maskingRules: MaskingRule[]): string {
let maskedSQL = sql;
for (const rule of maskingRules) {
const pattern = new RegExp(`\\b${rule.column}\\b`, 'gi');
maskedSQL = maskedSQL.replace(pattern, `'${rule.replacement}' AS ${rule.column}`);
}
return maskedSQL;
}
private addWhereFilters(sql: string, filters: string[]): string {
if (filters.length === 0) return sql;
// Simple approach: inject WHERE clause
if (sql.toLowerCase().includes('where')) {
return sql.replace(/where/i, `WHERE ${filters.join(' AND ')} AND`);
} else {
return sql.replace(/from\s+\w+/i, `$& WHERE ${filters.join(' AND ')}`);
}
}
}
- AI queries are predictable: Most AI systems generate clean, simple SELECT statements
- Fast processing: String manipulation is much faster than AST parsing
- Easy to debug: Regex patterns are straightforward to understand and modify
- Sufficient coverage: Handles 90%+ of real-world AI-generated queries
- Simple testing: Easy to write unit tests for string transformations
- Core Logic: TypeScript/Node.js for cross-platform compatibility
- String Processing: Native regex with well-tested patterns
- Configuration: YAML/JSON with simple validation
- Caching: In-memory LRU cache for policies (Redis optional)
- Logging: Simple structured logging (JSON format)
- Testing: Jest with comprehensive regex pattern tests
Key Dependencies (Minimal):
js-yaml
- YAML configuration parsingjoi
orzod
- Configuration validationwinston
- Structured logging- No SQL parsing libraries needed!