1.1.1 Python Layer
Core engine for the “Trends” tool.
Tokenizes jsexpr. TODO: document this.
Tokens that do not satisfy should_use_token for this language are discarded.
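A minimal sketch of this tokenize-and-filter step, under stated assumptions: the function names, signatures, and the toy pipeline below are illustrative placeholders, not the module's actual API. A toy callable stands in for a real spacy.language.Language so the sketch runs without downloading a model.

```python
from types import SimpleNamespace
from typing import Callable, Iterable, Iterator

def tokenize(nlp: Callable, text: str, should_use: Callable) -> Iterator[str]:
    """Hypothetical sketch: run `text` through the spaCy pipeline `nlp`
    and yield the lemma of each token that passes the `should_use` filter."""
    for token in nlp(text):
        if should_use(token):
            yield token.lemma_

# Toy pipeline standing in for spacy.language.Language: capitalized words
# are tagged as nouns, everything else as numerals (purely for the demo).
def toy_nlp(text: str) -> Iterable:
    return [
        SimpleNamespace(lemma_=w.lower(), pos_="NOUN" if w[0].isupper() else "NUM")
        for w in text.split()
    ]

lemmas = list(tokenize(toy_nlp, "Paris 1889 Eiffel", lambda t: t.pos_ == "NOUN"))
print(lemmas)  # ['paris', 'eiffel']
```

With a real pipeline, `nlp` would be a loaded spaCy model and the filter would be should_use_token as described below.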
The purpose of the “text” field is to provide an example of an actual use of the word, as the lemma FIXME; but some words (e.g. “DuFay”) shouldn’t be. (Also, some lemmas are strange, like “whatev”.)
Recognizes tokens that should be included in counting, with respect to the given spacy.language.Language instance.
Some kinds of tokens that are excluded:
Part-of-speech tags considered “boring”. Currently, every tag except "NOUN" (common noun) and "PROPN" (proper noun) is considered “boring”; this notably excludes "NUM" (numeral) and "SYM" (symbol).
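The part-of-speech filter can be sketched as follows; this is an assumption-laden illustration, not the module's actual implementation. The predicate only needs a spaCy-style pos_ attribute, so lightweight stand-in objects work here without loading a model.

```python
from types import SimpleNamespace

# Hypothetical sketch: every part-of-speech tag except NOUN (common noun)
# and PROPN (proper noun) is treated as "boring" and excluded.
NOT_BORING = {"NOUN", "PROPN"}

def should_use_token(token) -> bool:
    """Return True if `token` should be included in counting.

    `token` only needs a spaCy Token-style `pos_` attribute,
    so a stand-in object suffices for this sketch.
    """
    return token.pos_ in NOT_BORING

# Stand-ins illustrating the rule (no spaCy model required):
print(should_use_token(SimpleNamespace(pos_="NOUN")))  # True: common noun kept
print(should_use_token(SimpleNamespace(pos_="NUM")))   # False: numeral dropped
```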