Rustling
Rustling is a blazingly fast library for computational linguistics. It is written in Rust, with Python bindings.
Installation
pip install rustling
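To confirm that the install succeeded, one option is to query the package metadata from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration, and the distribution name `rustling` is taken from the `pip` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="rustling"):
    """Return the installed version string of dist_name, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version if pip installed the package into this environment.
    print(installed_version() or "rustling is not installed")
```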
Performance
Rustling is benchmarked against the pure Python implementations in NLTK, wordseg (v0.0.5), and pylangacq (v0.19.1). Each speedup is reported relative to the library named in the "vs." column. See benchmarks/ for full details and reproduction scripts.
| Component | Task | Speedup | vs. |
|---|---|---|---|
| Language Models | Fit | 10x | NLTK |
| | Score | 2x | NLTK |
| | Generate | 80–112x | NLTK |
| Word Segmentation | LongestStringMatching | 9x | wordseg |
| | RandomSegmenter | 1.1x | wordseg |
| POS Tagging | Training | 5x | NLTK |
| | Tagging | 7x | NLTK |
| CHAT Parsing | from_dir | 55x | pylangacq |
| | from_zip | 48x | pylangacq |
| | from_files | 63x | pylangacq |
| | from_strs | 116x | pylangacq |
| | words() | 3x | pylangacq |
| | utterances() | 15x | pylangacq |
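
A speedup figure of this kind is a ratio of runtimes: the baseline's time divided by the faster implementation's time. The project's actual methodology lives in benchmarks/; the snippet below is only a rough sketch of how such a ratio can be measured with the standard library. The names `median_time` and `speedup` and the two stand-in workloads are hypothetical and are not part of Rustling or its benchmark scripts.

```python
import statistics
import timeit


def median_time(func, repeat=5, number=10):
    """Return the median per-call wall-clock time of func() in seconds."""
    runs = timeit.repeat(func, repeat=repeat, number=number)
    return statistics.median(runs) / number


def speedup(baseline, candidate, repeat=5, number=10):
    """Return how many times faster candidate() runs than baseline()."""
    return median_time(baseline, repeat, number) / median_time(candidate, repeat, number)


# Stand-in workloads (hypothetical): swap in the two implementations to compare.
def slow_sum():
    total = 0
    for i in range(100_000):
        total += i
    return total


def fast_sum():
    return sum(range(100_000))


if __name__ == "__main__":
    print(f"{speedup(slow_sum, fast_sum):.1f}x")
```

Taking the median over several repeats reduces the influence of one-off noise such as a cold cache or a background process. Results still vary by machine, so ratios measured this way are only comparable when both sides run on the same hardware and input data.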
Source Code
The source code is available on GitHub.