rwkv-mobile

An RWKV inference runtime with multiple supported backends.

Goals:

  • Easy integration on different platforms, including mobile devices, using Flutter or native C++.
  • Support inference on different hardware, such as the Qualcomm Hexagon NPU or general-purpose CPUs/GPUs.
  • Provide easy-to-use C APIs (see the sketch below).
  • Provide an API server compatible with AI00_server (OpenAI-compatible API).

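The C API is not documented here, so the snippet below is only a hedged sketch of what embedding the runtime from C could look like. Every identifier in it (rwkv_runtime, rwkv_runtime_create, and so on) is a hypothetical placeholder, not the actual rwkv-mobile interface; the intended flow is simply: pick a backend by name, load a model, run generation, release the handle.

```c
/* Hypothetical sketch only -- these names are placeholders, not the
 * actual rwkv-mobile C API. */
#include <stdio.h>

typedef struct rwkv_runtime rwkv_runtime;   /* opaque runtime handle (placeholder) */

/* Placeholder declarations; the real header, names and signatures may differ. */
rwkv_runtime *rwkv_runtime_create(const char *backend_name);   /* e.g. "web-rwkv", "ncnn" */
int  rwkv_runtime_load_model(rwkv_runtime *rt, const char *model_path);
int  rwkv_runtime_generate(rwkv_runtime *rt, const char *prompt,
                           char *output, int output_size);
void rwkv_runtime_destroy(rwkv_runtime *rt);

int main(void) {
    rwkv_runtime *rt = rwkv_runtime_create("web-rwkv");
    if (!rt) return 1;

    char reply[4096];
    if (rwkv_runtime_load_model(rt, "path/to/model") == 0 &&
        rwkv_runtime_generate(rt, "Hello!", reply, (int)sizeof(reply)) == 0) {
        printf("%s\n", reply);
    }

    rwkv_runtime_destroy(rt);
    return 0;
}
```

An opaque handle plus plain-C entry points keeps the surface straightforward to bind from Flutter (via dart:ffi) or other host languages, which is why a C-style API is listed as a goal.
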
Supported or planned backends:

  • WebRWKV (WebGPU): Compatible with most PC graphics cards, as well as Metal on macOS. It does not work with Qualcomm's proprietary Adreno GPU driver, however.
  • llama.cpp: Runs on Android devices using CPU inference.
  • ncnn: Initial support for unquantized RWKV v6/v7 models (suitable for running tiny models everywhere).
  • Qualcomm Hexagon NPU: Based on Qualcomm's QNN SDK.
  • CoreML (WIP): Running RWKV on the Apple Neural Engine, based on Apple's CoreML framework.
  • To be continued...

How to build:

  • Install Rust and Cargo (needed to build the web-rwkv backend)
  • git clone --recursive https://github.com/MollySophia/rwkv-mobile
  • cd rwkv-mobile && mkdir build && cd build
  • cmake ..
  • cmake --build . -j $(nproc)

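To cross-compile for Android, the standard NDK CMake toolchain file can be used. The flags below are only a sketch using stock NDK variables; the project may require additional, backend-specific options (for example Rust cross-compilation targets for the web-rwkv backend):

  • cmake .. -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_PLATFORM=android-24
  • cmake --build . -j $(nproc)
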
TODO:

  • Better tensor abstraction for different backends (a purely illustrative sketch follows below)
  • Batch inference for all backends

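Purely as an illustration of what a backend-agnostic tensor abstraction could look like (an assumption for discussion, not the project's planned design), one common pattern in C is an opaque tensor paired with a per-backend vtable of operations:

```c
/* Illustration only: one possible shape for a backend-agnostic tensor,
 * not rwkv-mobile's actual or planned design. */
#include <stddef.h>

typedef enum { DTYPE_FP32, DTYPE_FP16, DTYPE_INT8 } tensor_dtype;

typedef struct tensor {
    tensor_dtype dtype;
    size_t shape[4];
    size_t n_dims;
    void *data;          /* backend-owned storage (host or device memory) */
    void *backend_ctx;   /* opaque per-backend context */
} tensor;

/* Each backend fills in this vtable so the runtime core can allocate,
 * transfer, and run kernels without backend-specific branches. */
typedef struct tensor_ops {
    tensor *(*alloc)(void *backend_ctx, tensor_dtype dtype,
                     const size_t *shape, size_t n_dims);
    void (*release)(tensor *t);
    void (*upload)(tensor *t, const void *host_src, size_t bytes);
    void (*download)(const tensor *t, void *host_dst, size_t bytes);
    void (*matmul)(const tensor *a, const tensor *b, tensor *out);
} tensor_ops;
```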