Machine Learning applied to Natural Language Processing Toolkit used in the Lisbon Machine Learning Summer School

LxMLS/lxmls-toolkit

LxMLS 2025

Machine Learning toolkit for Natural Language Processing.
Written for LxMLS - Lisbon Machine Learning Summer School

  • Scientific Python and Mathematical background
  • Linear Classifiers (Gradient Descent)
  • Feed-forward models in deep learning (Backpropagation)
  • Sequence models in deep learning
  • Attention Models (Transformers)
  • Multimodal Models

Note

Bear in mind that the main purpose of the toolkit is educational. You may resort to other toolboxes if you are looking for efficient implementations of the algorithms described.

Instructions for Students

Important

Use the student branch, not this one 🚨!

Download the code. If you are used to git, just clone the student branch. For example, from the command line run

git clone https://github.com/LxMLS/lxmls-toolkit.git lxmls-toolkit-student
cd lxmls-toolkit-student
git checkout student

Install uv

Linux and MacOS

curl -LsSf https://astral.sh/uv/install.sh | sh

Windows
Open Command Prompt (search for cmd) to run the following commands.

First, check if your system has git using

git --version

If git isn't installed, run the following command to install it

winget install Git.Git

Then, install uv using

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

If that errors out, try

winget install astral-sh.uv

Set up environment

Reference
If you do not have the required Python version, install it with

uv python install 3.12

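If you have an NVIDIA GPU, check your CUDA version with nvidia-smi (reports the highest CUDA version the installed driver supports) or nvcc --version (reports the installed CUDA toolkit version, if any).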

Reference
Choose exactly one torch extra (cpu, cu118, cu124, or cu126) based on your system and set up the environment:

uv sync --extra {cpu, cu118, cu124, cu126}

For example, if you're on MacOS you'd use

uv sync --extra cpu
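
If you are unsure which extra to pick, the choice can be sketched in Python. This is a hypothetical helper, not part of the toolkit: it assumes nvidia-smi prints a "CUDA Version: X.Y" field in its header, and it assumes the rule "take the newest extra that does not exceed the CUDA version your driver supports", falling back to cpu otherwise. The function names are made up for illustration.

```python
import re
import shutil
import subprocess

# Extras named in this README, paired with the CUDA version each one targets.
# Kept in ascending order so the last usable entry is the newest one.
CUDA_EXTRAS = [((11, 8), "cu118"), ((12, 4), "cu124"), ((12, 6), "cu126")]


def parse_cuda_version(smi_output):
    """Return (major, minor) from nvidia-smi's header, or None if absent."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def pick_extra(cuda_version):
    """Assumed rule: newest extra that does not exceed the driver's CUDA version."""
    if cuda_version is None:
        return "cpu"
    usable = [name for version, name in CUDA_EXTRAS if version <= cuda_version]
    return usable[-1] if usable else "cpu"


def detect_extra():
    """Query nvidia-smi if it is on PATH; otherwise fall back to the CPU extra."""
    if shutil.which("nvidia-smi") is None:
        return "cpu"
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    return pick_extra(parse_cuda_version(result.stdout))


if __name__ == "__main__":
    # Prints a suggested command; review it before running.
    print(f"uv sync --extra {detect_extra()}")
```

Treat the printed command as a suggestion only. On macOS, or on any machine without nvidia-smi, the sketch always suggests the cpu extra.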

Activate the virtual environment with

Linux and MacOS

source ./.venv/bin/activate

Windows

.venv\Scripts\activate

Important

Remember to run scripts from the root directory, lxmls-toolkit-student.
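
To confirm the setup worked, a short sanity check along these lines may help. It is a hypothetical script, not shipped with the toolkit: it assumes a virtual environment is detectable via sys.prefix, that the project root is the directory containing pyproject.toml, and that torch was installed by uv sync. The name check_environment is made up for illustration.

```python
import sys
from pathlib import Path


def check_environment(root=None):
    """Collect basic facts about the current Python environment."""
    root = Path(root) if root is not None else Path.cwd()
    report = {
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
        "venv_active": sys.prefix != sys.base_prefix,
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        # uv reads pyproject.toml, so its presence suggests this is the project root.
        "in_project_root": (root / "pyproject.toml").is_file(),
    }
    try:
        import torch  # Installed by `uv sync --extra ...` if setup succeeded.

        report["torch"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        report["torch"] = None
        report["cuda_available"] = False
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If venv_active or in_project_root comes back False, re-run the activation command and change into lxmls-toolkit-student before running any scripts.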

Development

Note

The following instructions are for developers building the toolkit.

Install the ruff linter & ty type-checker with

uv sync --extra dev

To run the tests, first install pytest with

uv sync --extra test

and run

pytest -m "not gpu" -n auto

Run the GPU-intensive tests with a single worker using

pytest -m gpu -n 1
