Machine Learning toolkit for Natural Language Processing.
Written for LxMLS - Lisbon Machine Learning Summer School
- Scientific Python and Mathematical background
- Linear Classifiers (Gradient Descent)
- Feed-forward models in deep learning (Backpropagation)
- Sequence models in deep learning
- Attention Models (Transformers)
- Multimodal Models
Note
Bear in mind that the main purpose of the toolkit is educational. You may resort to other toolboxes if you are looking for efficient implementations of the algorithms described.
Important
Use the student branch, not this one 🚨!
Download the code. If you are used to git, just clone the student branch. For example, from the command line run
git clone https://github.com/LxMLS/lxmls-toolkit.git lxmls-toolkit-student
cd lxmls-toolkit-student
git checkout student
Install uv
Linux and MacOS
curl -LsSf https://astral.sh/uv/install.sh | sh
Windows
Open Command Prompt (search for cmd) to run the following commands.
First, check if your system has git using
git --version
If git isn't installed, run the following command to install it
winget install Git.Git
Then, install uv using
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
If that errors out, try
winget install astral-sh.uv
If you do not have the proper python version, install it with
uv python install 3.12
If you have an Nvidia GPU, get the CUDA version supported by your driver with nvidia-smi, or the installed CUDA toolkit version with nvcc --version.
Choose the torch index that matches your system (exactly one of cpu, cu118, cu124, cu126) and set up the environment:
uv sync --extra {cpu, cu118, cu124, cu126}
For example, if you're on MacOS you'd use
uv sync --extra cpu
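If you are unsure which extra to pick, the choice can be sketched as a small helper. This is only an illustration: the function name pick_extra is hypothetical and not part of the toolkit, and it assumes that a driver reporting a given CUDA version can run wheels built for that version or an older one, falling back to cpu otherwise.

```python
from typing import Optional

# Extras listed in the install command above, newest CUDA build first.
CUDA_EXTRAS = [((12, 6), "cu126"), ((12, 4), "cu124"), ((11, 8), "cu118")]


def pick_extra(cuda_version: Optional[str]) -> str:
    """Suggest a value for `uv sync --extra` (hypothetical helper).

    cuda_version is the "CUDA Version" string shown by nvidia-smi,
    e.g. "12.4", or None when no Nvidia GPU is available.
    """
    if not cuda_version:
        return "cpu"
    major, minor = (int(part) for part in cuda_version.split(".")[:2])
    for minimum, extra in CUDA_EXTRAS:
        # Assumption: pick the newest build that does not exceed the
        # CUDA version the driver supports.
        if (major, minor) >= minimum:
            return extra
    # Driver older than every listed CUDA build: use the CPU wheels.
    return "cpu"


if __name__ == "__main__":
    print("uv sync --extra", pick_extra("12.4"))
```

Treat the result as a starting point and check it against the versions your driver actually supports.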
Activate the virtual environment with
Linux and MacOS
source ./.venv/bin/activate
Windows
.venv\Scripts\activate
Important
Remember to run scripts from the root directory lxmls-toolkit-student.
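A quick way to confirm that the setup worked is to run a short check with the environment activated. The sketch below is not part of the toolkit. The name check_environment is hypothetical, and the minimum version of 3.12 is an assumption taken from the uv python install 3.12 command above.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 12)):
    """Return a dict of basic sanity checks for the active interpreter."""
    return {
        # Interpreter is at least the assumed minimum version.
        "python_ok": sys.version_info[:2] >= min_python,
        # True when running inside a virtual environment such as .venv.
        "in_venv": sys.prefix != sys.base_prefix,
        # True when torch can be found, without importing it.
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If in_venv or torch_installed is reported as no, re-run the activation and uv sync steps.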
Note
The following instructions are for developers building the toolkit.
Install the ruff linter and ty type-checker with
uv sync --extra dev
To run all tests, install pytest with
uv sync --extra test
and run
pytest -m "not gpu" -n auto
Run GPU-intensive tests with a single worker using
pytest -m gpu -n 1