An effort to bring LLMs to very resource-constrained systems.
This is a toy project. Please don't take it too seriously :)
Bazel is the primary way to build and test.
Build:
bazel build tools:ullm
Test:
bazel test ...
Run:
bazel run tools/ullm -- -p "The quick brown fox jumped. Where did they go?"
A Makefile is also provided to make it easier to tweak the build. It is not as full-featured as the Bazel setup.
Build:
make clean && make fetchdeps && make -j`nproc`
Run:
./out/ullm.elf -c out/stories15M.bin -t out/llama2.c/tokenizer.bin -p "The quick brown fox jumped. Where did he go?"
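A minimal sketch of a wrapper around the run command above, which checks that the expected build artifacts exist before launching. The file paths come from the Makefile run example; the `need` and `run_ullm` helper names are hypothetical, not part of the project.

```shell
# Hypothetical pre-flight check before invoking the binary.
# Paths are taken from the Makefile run example in this README.
need() { [ -f "$1" ] || { echo "missing: $1" >&2; return 1; }; }

run_ullm() {
  # Refuse to run if the binary, model, or tokenizer is absent.
  need out/ullm.elf && need out/stories15M.bin && need out/llama2.c/tokenizer.bin || return 1
  ./out/ullm.elf -c out/stories15M.bin -t out/llama2.c/tokenizer.bin -p "$1"
}
```

Usage: `run_ullm "The quick brown fox jumped. Where did he go?"` after a successful `make` build.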
ullm contains a heavily modified fork of the llama2.c project.
This code retains the MIT license, as noted in the file headers.
Thanks to Andrej Karpathy for the llama2.c project. I really enjoyed experimenting with the code.