Rustling

Rustling is a blazingly fast library for computational linguistics. It is written in Rust, with Python bindings.

Installation

pip install rustling
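
To confirm that the install succeeded, one option is to query the installed distribution's metadata from Python. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical and is not part of Rustling. It looks up the distribution name used in the `pip` command above and does not assume anything about Rustling's own API.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    # "rustling" is the distribution name from the pip command above.
    print(installed_version("rustling"))
```

If this prints `None`, the package is not visible to the interpreter that ran the script, which often means `pip` installed into a different environment.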

Performance

The benchmarks compare Rustling against the Python implementations in NLTK, wordseg (v0.0.5), pylangacq (v0.19.1), and hmmlearn (v0.3.3). See benchmarks/ for full details and reproduction scripts.

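| Component         | Task                       | Speedup  | vs.       |
|-------------------|----------------------------|----------|-----------|
| Language Models   | Fit                        | 10x      | NLTK      |
| Language Models   | Score                      | 1.9x     | NLTK      |
| Language Models   | Generate                   | 106–114x | NLTK      |
| Word Segmentation | LongestStringMatching      | 9x       | wordseg   |
| POS Tagging       | Training                   | 5x       | NLTK      |
| POS Tagging       | Tagging                    | 18x      | NLTK      |
| HMM               | Fit                        | 13x      | hmmlearn  |
| HMM               | Predict                    | 0.9x     | hmmlearn  |
| HMM               | Score                      | 5x       | hmmlearn  |
| CHAT Parsing      | Reading from a ZIP archive | 43x      | pylangacq |
| CHAT Parsing      | Reading from strings       | 70x      | pylangacq |
| CHAT Parsing      | Parsing utterances         | 15x      | pylangacq |
| CHAT Parsing      | Parsing tokens             | 9x       | pylangacq |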
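
For readers who want to see how a speedup figure of this kind can be derived, here is a minimal sketch. It assumes that speedup means the reference implementation's wall-clock time divided by the candidate's time, so a value below 1 (such as 0.9x) would mean the candidate is slower. The function `speedup` and the two placeholder workloads are hypothetical and are not taken from the scripts in benchmarks/, which may measure things differently.

```python
import timeit


def speedup(baseline, candidate, repeat=5, number=10):
    """Return baseline time divided by candidate time.

    Each callable is timed `repeat` times, running it `number` times per
    measurement; the fastest measurement of each is used to reduce noise.
    """
    t_base = min(timeit.repeat(baseline, repeat=repeat, number=number))
    t_cand = min(timeit.repeat(candidate, repeat=repeat, number=number))
    return t_base / t_cand


# Placeholder workloads: hypothetical stand-ins, not Rustling or NLTK calls.
def slow_sum():
    total = 0
    for i in range(10_000):
        total += i
    return total


def fast_sum():
    return sum(range(10_000))


if __name__ == "__main__":
    print(f"{speedup(slow_sum, fast_sum):.1f}x")
```

Taking the minimum over several repeats is a common choice because the fastest run is the least affected by background activity on the machine; a real benchmark would also control for input size and warm-up.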

Source Code

The source code is available on GitHub.