Sirush/Jiten

Credits

  • Sudachi.rs - Morphological analyzer
  • Nazeka - Original deconjugation rules, deconjugator
  • JL - Updated deconjugation rules, deconjugator port
  • Ichiran - Parser tests
  • JMDict - Dictionary
  • JmdictFurigana - Furigana dictionary for JMDict
  • Lapis - Anki notetype
  • Hatsuon - Pitch accent display

Installation

System Requirements

  • Operating System: Linux, macOS, or Windows (with Docker support)
  • CPU & Memory: At least 2 CPU cores and 4 GB of RAM are recommended for development
  • Ports: Ensure ports 8080, 3001, and 3005 are available for the API, web, and Umami services respectively
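A quick way to confirm that the three ports are free before starting the stack is to try binding to each of them. The following is only a sketch in Python and is not part of the Jiten repository; the helper name `port_is_free` is hypothetical, and it checks the loopback interface only, so results may differ on hosts with unusual socket settings.

```python
import socket

# Ports listed above: API, web, and Umami.
REQUIRED_PORTS = {8080: "API", 3001: "Web", 3005: "Umami"}


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if host:port can be bound, i.e. nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" (or a permission error).
            return False
    return True


if __name__ == "__main__":
    for port, service in REQUIRED_PORTS.items():
        state = "free" if port_is_free(port) else "in use"
        print(f"{port} ({service}): {state}")
```

If a port is reported as in use, stop the conflicting process or change the port mapping before starting the services.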

Prerequisites

Before you begin, make sure you have the following installed on your machine:

  • Git
  • Docker
  • Docker Compose

Installation Steps - Frontend

1. Clone the Repository

Open your terminal and run the following command to clone the repository:

git clone https://github.com/Sirush/Jiten.git
cd Jiten

2. Configure Environment Variables

The project uses an environment file to set various configuration options for the services. Follow these steps:

  1. Copy the provided .env.example file to a new file named .env:

    cp .env.example .env
  2. Open the .env file in your favorite text editor and modify the variables as needed.
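After editing, it can help to check that .env still defines every variable present in .env.example, for instance after pulling a newer version of the repository. The sketch below is an illustration rather than project tooling: the function names are hypothetical, and it assumes both files use plain KEY=VALUE lines with # comments.

```python
from pathlib import Path


def env_keys(path: str) -> set[str]:
    """Collect variable names from KEY=VALUE lines, skipping blanks and # comments."""
    keys = set()
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.add(line.split("=", 1)[0].strip())
    return keys


def missing_keys(example: str = ".env.example", actual: str = ".env") -> list[str]:
    """Names defined in the example file but absent from the actual file."""
    return sorted(env_keys(example) - env_keys(actual))


if __name__ == "__main__":
    if Path(".env.example").exists() and Path(".env").exists():
        for name in missing_keys():
            print(f"missing from .env: {name}")
    else:
        print("Run this from the project root after creating .env")
```

Run it from the project root; no output means every variable from .env.example has an entry in .env.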

3. Start the Services

With Docker and Docker Compose installed and the environment variables configured, you can now start the application:

  1. Open a terminal in the project root directory.

  2. Run the following command to spin up the services:

    docker-compose up -d

    This command will:

    • Build and run the API and Web services using their respective Dockerfiles.
    • Pull and run the Postgres and Umami containers.
    • Create and attach the required volumes.

Installation Steps - CLI

coming soon™

Services Overview

  • Postgres:

    • Database for the project.
  • API:

    • Built from the Jiten.Api/Dockerfile.
    • Exposes port 8080.
  • Web:

    • Nuxt frontend.
    • Built from the Jiten.Web/Dockerfile.
    • Exposes port 3001.
  • Umami:

    • Umami web analytics, backed by PostgreSQL.
    • Exposes port 3005.

Additional Notes

  • Traefik Labels:
    The services are configured with Traefik labels for reverse proxying. Make sure your Traefik setup is compatible if you plan to use it for routing (e.g., the rules Host(api.jiten.moe), Host(jiten.moe), and Host(umami.jiten.moe)).

  • Local Development:
    When developing locally, you might want to adjust URLs in the .env file to match your local environment (e.g., API_BASE_URL=http://localhost:8080/api).

  • Persistent Storage:
    The Docker Compose file defines persistent volumes (postgres_data, uploads, and dictionaries) to store data across container restarts.

  • Installing Pgroonga:
    The search function requires Pgroonga, a full-text search engine for PostgreSQL that supports Japanese. Follow the instructions in the Pgroonga documentation for your platform, then execute the following commands to activate the extension in your database and create the index:

CREATE EXTENSION IF NOT EXISTS pgroonga;
CREATE INDEX "IX_Decks_Title_Pgroonga" ON jiten."Decks"
USING pgroonga ("OriginalTitle", "RomajiTitle", "EnglishTitle");

Parser performance & cache

Activating the cache can offer an appreciable speedup at the cost of RAM.

Here are three scenarios, each run on 75 decks totaling 42 million moji (characters) with 8 threads:

  • Word cache & deconjugator cache: 316,303 ms / 8 GB RAM / 8M moji/min
  • Word cache only: 324,542 ms / 3.7 GB RAM / 7.8M moji/min
  • No cache: 502,354 ms / 3 GB RAM / 5M moji/min

The best option is to enable the word cache only: the deconjugator cache adds only ~3% more speed while roughly doubling RAM usage (3.7 GB to 8 GB).
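The throughput and speedup figures can be reproduced from the raw timings. The short Python calculation below is a sketch for checking the arithmetic; it assumes the corpus is exactly 42,000,000 moji (the text gives only a rounded total), and the function names are illustrative.

```python
MOJI_TOTAL = 42_000_000  # rounded corpus size quoted above (assumption: exact)

# Scenario -> (elapsed milliseconds, RAM in GB), as reported above.
RUNS = {
    "word + deconjugator cache": (316_303, 8.0),
    "word cache only": (324_542, 3.7),
    "no cache": (502_354, 3.0),
}


def moji_per_minute(total_moji: int, elapsed_ms: int) -> float:
    """Throughput in moji per minute for a run of elapsed_ms milliseconds."""
    return total_moji / (elapsed_ms / 60_000)


def speedup_percent(slower_ms: int, faster_ms: int) -> float:
    """Throughput gain of the faster run over the slower one, in percent."""
    return (slower_ms / faster_ms - 1) * 100


for name, (ms, ram_gb) in RUNS.items():
    rate = moji_per_minute(MOJI_TOTAL, ms) / 1e6
    print(f"{name}: {rate:.1f}M moji/min at {ram_gb} GB RAM")

print(f"deconjugator cache gain: {speedup_percent(324_542, 316_303):.1f}%")
print(f"word cache gain over no cache: {speedup_percent(502_354, 324_542):.1f}%")
```

With these numbers the word cache alone accounts for a gain of roughly 55% over no cache, while adding the deconjugator cache contributes about 2.6% more, which is the ~3% quoted above.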
