This project provides tools to generate realistic company documents (contracts, invoices, payroll) and load them into a Neo4j graph database for analysis and querying.
- Customer Contracts
- Vendor Contracts
- Customer Invoices
- Vendor Invoices
- Employee Payroll Documents
- Department Payroll Reports
- Schema Management
- Document Ingestion
- Relationship Mapping
- Set up Development Environment
devbox shell
poetry install
- Configure Environment Variables
Copy the example environment file and configure your settings:
cp .env.example .env
Required environment variables:
  - NEO4J_URI: Your Neo4j database URI (default: neo4j://localhost:7687)
  - NEO4J_USERNAME: Neo4j database username
  - NEO4J_PASSWORD: Neo4j database password
  - OPENAI_API_KEY: Your OpenAI API key for contract generation
  - LLAMA_PARSE_API_KEY: Your Llama Parse API key for document parsing
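A quick pre-flight check can catch a missing setting before any generator or loader runs. The following is a minimal sketch, not part of the project: the helper name `missing_variables` is hypothetical, and it assumes the values from `.env` have already been exported into the process environment (this README does not say how the project loads them).

```python
import os

# Variable names are copied from the list above.
REQUIRED_VARS = [
    "NEO4J_URI",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
    "OPENAI_API_KEY",
    "LLAMA_PARSE_API_KEY",
]

# Only NEO4J_URI has a documented default.
DEFAULTS = {"NEO4J_URI": "neo4j://localhost:7687"}


def missing_variables(env):
    """Return required names that are unset or empty and have no default."""
    missing = []
    for name in REQUIRED_VARS:
        value = env.get(name) or DEFAULTS.get(name)
        if not value:
            missing.append(name)
    return missing
```

Calling `missing_variables(os.environ)` returns an empty list when everything is configured; otherwise it lists the names that still need a value in `.env`.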
- Generate Company Documents
Run the following generators to create sample documents:
# Generate customer contracts
poetry run python -m src.generators.generate_contracts
# Generate vendor contracts
poetry run python -m src.generators.generate_vendor_contracts
# Generate customer invoices
poetry run python -m src.generators.generate_invoices
# Generate vendor invoices
poetry run python -m src.generators.generate_vendor_invoices
# Generate payroll documents
poetry run python -m src.generators.generate_payrolls
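If running the five generators by hand becomes tedious, they can be driven from one small script. This is a sketch under assumptions: the helper names `generator_commands` and `run_generators` are hypothetical, and the order simply mirrors the order of the commands above (the README does not state whether the generators depend on one another).

```python
import subprocess

# Module names are copied from the commands above.
GENERATOR_MODULES = [
    "src.generators.generate_contracts",
    "src.generators.generate_vendor_contracts",
    "src.generators.generate_invoices",
    "src.generators.generate_vendor_invoices",
    "src.generators.generate_payrolls",
]


def generator_commands(modules=GENERATOR_MODULES):
    """Build one `poetry run python -m <module>` command per generator."""
    return [["poetry", "run", "python", "-m", module] for module in modules]


def run_generators(run=subprocess.run):
    """Run each generator in order, stopping at the first failure."""
    for command in generator_commands():
        # check=True makes subprocess.run raise CalledProcessError on a non-zero exit.
        run(command, check=True)
```

Passing the runner as a parameter keeps the function easy to dry-run: supplying a function that only prints each command shows what would be executed without invoking Poetry.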
- Set up Graph Database
Apply the graph schema:
poetry run python -m src.graph.apply_schema
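Applying the schema fails if the database cannot be reached, so it can help to check the configured URI first. The sketch below uses only the standard library and is not part of the project: `parse_neo4j_uri` and `is_reachable` are hypothetical helpers, and the fallback port 7687 is taken from the default URI listed earlier. A successful TCP connection only shows that something is listening on that port; it does not validate the credentials.

```python
import socket
from urllib.parse import urlparse

# URI schemes accepted by Neo4j drivers.
NEO4J_SCHEMES = {"neo4j", "neo4j+s", "neo4j+ssc", "bolt", "bolt+s", "bolt+ssc"}


def parse_neo4j_uri(uri):
    """Split a Neo4j URI into (host, port), falling back to port 7687."""
    parsed = urlparse(uri)
    if parsed.scheme not in NEO4J_SCHEMES:
        raise ValueError(f"Unsupported Neo4j URI scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError(f"No host found in URI: {uri!r}")
    return parsed.hostname, parsed.port or 7687


def is_reachable(uri, timeout=3.0):
    """Return True if a TCP connection to the URI's host and port succeeds."""
    host, port = parse_neo4j_uri(uri)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `is_reachable("neo4j://localhost:7687")` returns `False` when no local Neo4j instance is running, which is a cheaper signal than a failed schema run.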
- Load Documents into Graph Database
Ingest generated documents:
poetry run python -m src.graph.loader
All generated documents are stored in the company_documents directory with the following structure: