rustling.tagging
Part-of-speech tagging.
Package Contents
- class rustling.tagging.AveragedPerceptronTagger(*, frequency_threshold: int = 10, ambiguity_threshold: float = 0.95, n_iter: int = 5)
A part-of-speech tagger using an averaged perceptron model.
This implementation is adapted from the textblob-aptagger codebase (MIT license); the original implementation is by Matthew Honnibal.
- predict(words: Sequence[str]) → list[str]
Predict tags for the words.
- Parameters:
words – A segmented sentence or phrase, where each word is a string.
- Returns:
The list of predicted tags.
- fit(tagged_sents: Sequence[Sequence[tuple[str, str]]]) → None
Fit the model to tagged training sentences.
- Parameters:
tagged_sents – A list of segmented and tagged sentences for training. Each sentence is a sequence of (word, tag) tuples.
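A minimal usage sketch of fit() followed by predict(), using only the signatures documented here. The toy sentences and tag labels are illustrative assumptions, and the import is guarded so the snippet still runs where rustling is not installed. No expected output is shown, since the predicted tags depend on the trained model.

```python
# Toy training data in the documented shape: each sentence is a
# sequence of (word, tag) tuples. The tag labels are illustrative.
tagged_sents = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]

try:
    from rustling.tagging import AveragedPerceptronTagger
except ImportError:
    AveragedPerceptronTagger = None  # rustling is not installed

if AveragedPerceptronTagger is not None:
    # All constructor arguments are keyword-only; n_iter=5 is the default.
    tagger = AveragedPerceptronTagger(n_iter=5)
    tagger.fit(tagged_sents)
    # predict() takes one segmented sentence and returns a list of tags.
    print(tagger.predict(["the", "cat", "barks"]))
```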
- save(path: str) → None
Save the model to a JSON file.
- Parameters:
path – The path where the model will be saved as a JSON file.
- load(path: str) → None
Load a model from a JSON file.
- Parameters:
path – The path where the model, stored as a JSON file, is located.
- Raises:
FileNotFoundError – If the file at the given path does not exist.
EnvironmentError – If the file cannot be read as a tagger model.
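One way to handle the two documented exceptions is sketched below. In Python 3, EnvironmentError is an alias of OSError and FileNotFoundError is a subclass of it, so the more specific handler has to come first. The helper name `try_load` is hypothetical and not part of rustling; it accepts any object that exposes a `load(path)` method.

```python
def try_load(tagger, path: str) -> str:
    """Call tagger.load(path) and report the outcome as a short string.

    `try_load` is a hypothetical helper, not part of rustling.
    """
    try:
        tagger.load(path)
    except FileNotFoundError:
        # Must come first: FileNotFoundError is a subclass of OSError.
        return "missing"
    except EnvironmentError:
        # In Python 3, EnvironmentError is an alias of OSError.
        return "unreadable"
    return "loaded"
```

A caller could, for example, fall back to training with fit() when the result is "missing" and surface an error when it is "unreadable".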
- property weights: dict[str, dict[str, float]]
Get the model’s weights dictionary.
- Returns:
A dictionary mapping features to their weight vectors.
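To illustrate how a nested feature-to-weights mapping of this shape is typically used, here is a small generic averaged perceptron in pure Python. It is a sketch of the general technique with timestamp-based lazy averaging, not rustling's actual implementation; the class name `TinyAveragedPerceptron` and the `word=...` feature strings are assumptions made for the example.

```python
from collections import defaultdict


class TinyAveragedPerceptron:
    """Hypothetical, minimal averaged perceptron (not rustling's code)."""

    def __init__(self, classes):
        self.classes = set(classes)
        self.weights = {}                 # feature -> {tag: weight}
        self._totals = defaultdict(float)  # accumulated weight over time
        self._tstamps = defaultdict(int)   # step of the last change
        self.i = 0                         # number of examples seen

    def predict(self, features):
        scores = defaultdict(float)
        for feat in features:
            for tag, weight in self.weights.get(feat, {}).items():
                scores[tag] += weight
        # Ties are broken by tag name so the result is deterministic.
        return max(self.classes, key=lambda tag: (scores[tag], tag))

    def update(self, truth, guess, features):
        self.i += 1
        if truth == guess:
            return
        for feat in features:
            w = self.weights.setdefault(feat, {})
            for tag, delta in ((truth, 1.0), (guess, -1.0)):
                key = (feat, tag)
                # Credit the old weight for every step it was in effect.
                self._totals[key] += (self.i - self._tstamps[key]) * w.get(tag, 0.0)
                self._tstamps[key] = self.i
                w[tag] = w.get(tag, 0.0) + delta

    def average_weights(self):
        if self.i == 0:
            return
        for feat, w in self.weights.items():
            for tag, weight in w.items():
                key = (feat, tag)
                total = self._totals[key] + (self.i - self._tstamps[key]) * weight
                w[tag] = total / self.i


data = [({"word=the"}, "DET"), ({"word=dog"}, "NOUN"), ({"word=runs"}, "VERB")]
model = TinyAveragedPerceptron({"DET", "NOUN", "VERB"})
for _ in range(5):  # five passes, mirroring the n_iter default
    for feats, tag in data:
        model.update(tag, model.predict(feats), feats)
model.average_weights()
print(model.predict({"word=the"}))  # prints "DET"
```

Averaging the weights over all training steps, rather than keeping only the final values, is what distinguishes an averaged perceptron from a plain one; it tends to reduce sensitivity to the last few examples seen.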
- property tagdict: dict[str, str]
Get the tag dictionary.
- Returns:
A dictionary mapping words to their most likely tags.
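The sketch below shows how a tag dictionary like this can be built from training data. It assumes that frequency_threshold and ambiguity_threshold play the same role as the corresponding thresholds in textblob-aptagger, where a word is stored only if it is frequent enough and almost always carries the same tag. Whether rustling applies exactly this rule is not confirmed by this page, and `make_tagdict` is a hypothetical name.

```python
from collections import Counter, defaultdict


def make_tagdict(tagged_sents, frequency_threshold=10, ambiguity_threshold=0.95):
    """Map frequent, nearly unambiguous words to their most common tag.

    Hypothetical helper following the textblob-aptagger approach.
    """
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag in sent:
            counts[word][tag] += 1

    tagdict = {}
    for word, tag_freqs in counts.items():
        tag, mode = tag_freqs.most_common(1)[0]
        n = sum(tag_freqs.values())
        # Keep the word only if it is frequent and rarely ambiguous.
        if n >= frequency_threshold and mode / n >= ambiguity_threshold:
            tagdict[word] = tag
    return tagdict


# "the" is always DET (10 of 10); "run" is VERB in only 10 of 12 cases.
sents = [[("the", "DET"), ("run", "VERB")]] * 10 + [[("run", "NOUN")]] * 2
print(make_tagdict(sents))  # prints {'the': 'DET'}
```

Under this assumption, words found in the dictionary can be tagged by lookup alone, leaving the perceptron to decide the ambiguous or rare cases.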
- property classes: set[str]
Get the set of POS tag classes.
- Returns:
A set of all tag classes in the model.