Rustling

Rustling is a blazingly fast library for computational linguistics.

Installation

Using pip:

pip install rustling

Using conda:

conda install -c conda-forge rustling
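After either command finishes, a quick check that the package is visible to the current interpreter can be useful. The sketch below uses only the standard library's importlib.metadata and assumes the installed distribution is named rustling, as in the commands above. The helper name installed_version is illustrative and not part of Rustling.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the install succeeded, otherwise None.
print(installed_version("rustling"))
```

This queries package metadata only, so it makes no assumptions about Rustling's own API.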

For Pyodide, pre-built WASM wheels are available from each GitHub release; look for the .whl file with emscripten in the filename. These wheels are built with multithreading disabled because Pyodide does not support it.
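Picking the right file from a release's asset list can be sketched as a simple filename filter. The asset names below are hypothetical placeholders, not actual release files, and the helper pyodide_wheels is illustrative only. The sketch assumes nothing beyond the rule stated above: a .whl file with emscripten in its name.

```python
from typing import Iterable, List


def pyodide_wheels(asset_names: Iterable[str]) -> List[str]:
    """Return the asset names that look like Pyodide (emscripten) wheels."""
    return [
        name
        for name in asset_names
        if name.endswith(".whl") and "emscripten" in name
    ]


# Hypothetical asset names, for illustration only.
assets = [
    "example-1.0.0-cp312-cp312-manylinux_2_17_x86_64.whl",
    "example-1.0.0-cp312-cp312-emscripten_3_1_58_wasm32.whl",
    "example-1.0.0.tar.gz",
]
print(pyodide_wheels(assets))
```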

Rustling is also available in Rust.

Performance

Rustling is benchmarked against the Python implementations in NLTK, wordseg (v0.0.5), pylangacq (v0.19.1), and hmmlearn (v0.3.3). See benchmarks/ for full details and reproduction scripts.

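| Component | Task | Speedup | vs. |
|---|---|---|---|
| Language Models | Fit | 11x | NLTK |
| Language Models | Score | 2x | NLTK |
| Language Models | Generate | 86–107x | NLTK |
| Word Segmentation | LongestStringMatching | 9x | wordseg |
| POS Tagging | Training | 5x | NLTK |
| POS Tagging | Tagging | 17x | NLTK |
| HMM | Fit | 14x | hmmlearn |
| HMM | Predict | 0.9x | hmmlearn |
| HMM | Score | 5x | hmmlearn |
| CHAT Parsing | Reading from a ZIP archive | 30x | pylangacq |
| CHAT Parsing | Reading from strings | 35x | pylangacq |
| CHAT Parsing | Parsing utterances | 15x | pylangacq |
| CHAT Parsing | Parsing tokens | 8x | pylangacq |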
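The table does not spell out how each speedup figure is computed. A common convention, assumed here, is the reference implementation's run time divided by Rustling's run time for the same task, so 11x means roughly eleven times faster and a value below 1x (such as 0.9x) means slightly slower. The sketch below illustrates that arithmetic with the standard library's timeit and two stand-in workloads. It is not the project's benchmark code, which lives in benchmarks/, and all function names are illustrative.

```python
import timeit
from typing import Callable


def best_time(func: Callable[[], object], repeat: int = 5, number: int = 10) -> float:
    """Return the best observed time per call, in seconds."""
    totals = timeit.repeat(func, repeat=repeat, number=number)
    return min(totals) / number


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Assumed definition: baseline time divided by candidate time."""
    if candidate_seconds <= 0:
        raise ValueError("candidate_seconds must be positive")
    return baseline_seconds / candidate_seconds


# Stand-in workloads (not Rustling or NLTK code): the same sum computed two ways.
def slow_sum() -> int:
    total = 0
    for i in range(10_000):
        total += i
    return total


def fast_sum() -> int:
    return sum(range(10_000))


ratio = speedup(best_time(slow_sum), best_time(fast_sum))
print(f"{ratio:.1f}x")
```

Taking the minimum over several repeats reduces noise from other processes, which is one reasonable choice among several for summarizing timings.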

Source Code

The source code is available on GitHub.