The text to analyze
Private char
Frequency maps for characters and words
Private sentences
Private syllable
Private Optional syllable
Cached syllable stats
Private Readonly text
The original text to analyze
Private word
Private words
Tokenized words and sentences
Private Static Readonly REGEX
Regular expressions used in text analysis
Private compute
Computes character and word frequencies from the tokenized text.
Private compute
Private estimate
Estimates the number of syllables in a word using a simple heuristic. Uses caching to avoid redundant calculations for identical words.
The word to estimate syllables for
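The cached estimate described above might look roughly like the sketch below. The function name `estimateSyllables`, the module-level `syllableCache`, and the vowel-group rule are all assumptions made for illustration; the class's actual heuristic is not shown in this reference.

```typescript
// Hypothetical cache keyed by lowercased word (an assumption, not the class's field).
const syllableCache = new Map<string, number>();

function estimateSyllables(word: string): number {
  const key = word.toLowerCase();
  const cached = syllableCache.get(key);
  if (cached !== undefined) return cached; // identical word seen before

  // Assumed heuristic: each run of consecutive vowels counts as one syllable,
  // and every word has at least one syllable.
  const groups = key.match(/[aeiouy]+/g);
  const count = Math.max(1, groups ? groups.length : 0);

  syllableCache.set(key, count);
  return count;
}
```

A rule this simple miscounts words with silent vowels, which is the usual trade-off of heuristic syllable counters.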
Gets the average sentence length in words.
Gets the average number of syllables per word in the text.
Gets the average word length in the text.
Gets the frequency of each character in the text.
Gets the least common words (hapax legomena) in the text. Hapax legomena are words that occur only once in the text.
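As an illustration of the definition, hapax legomena can be found by counting each word and keeping those with a count of one. The helper name `hapaxLegomena` and its array-based signature are assumptions, not the class's API.

```typescript
// Hypothetical standalone helper: returns words that occur exactly once,
// in order of first appearance.
function hapaxLegomena(words: string[]): string[] {
  const counts = new Map<string, number>();
  for (const w of words) {
    counts.set(w, (counts.get(w) ?? 0) + 1);
  }
  const result: string[] = [];
  counts.forEach((n, w) => {
    if (n === 1) result.push(w);
  });
  return result;
}
```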
Calculates Honoré's R statistic for the text, a measure of lexical richness.
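Honoré's R is commonly given as R = 100 · ln(N) / (1 − V1/V), where N is the total number of words, V the number of distinct words, and V1 the number of words that occur once. The sketch below follows that formula; the function name `honoreR` and the choice to return `NaN` in the degenerate cases are assumptions, and the class may handle them differently.

```typescript
// Hypothetical helper implementing the commonly cited form of Honoré's R.
function honoreR(words: string[]): number {
  const counts = new Map<string, number>();
  for (const w of words) {
    counts.set(w, (counts.get(w) ?? 0) + 1);
  }
  let hapax = 0; // V1: words occurring exactly once
  counts.forEach((n) => {
    if (n === 1) hapax++;
  });
  const total = words.length; // N
  const distinct = counts.size; // V
  // Assumption: the statistic is undefined for empty input or when every
  // distinct word is a hapax (the denominator would be zero).
  if (total === 0 || hapax === distinct) return NaN;
  return (100 * Math.log(total)) / (1 - hapax / distinct);
}
```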
Gets the original text length in characters.
Calculates the LIX (läsbarhetsindex) score for the text. The LIX score is a readability index that combines average word length and sentence length.
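For reference, the conventional LIX definition is the average sentence length plus the percentage of long words (more than six letters). The sketch below implements that conventional form; the function name `lix`, its parameters, and the zero return for empty input are assumptions, and the class's own computation may differ.

```typescript
// Hypothetical helper: conventional LIX = words/sentences + 100 * longWords/words.
function lix(words: string[], sentenceCount: number, longWordLength = 7): number {
  // Assumption: return 0 rather than dividing by zero on empty input.
  if (words.length === 0 || sentenceCount === 0) return 0;
  const longWords = words.filter((w) => w.length >= longWordLength).length;
  return words.length / sentenceCount + (100 * longWords) / words.length;
}
```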
Gets the ratio of long words (words with length >= len) to total words.
Optional len: number = 7
Minimum length for a word to be considered long
Gets the number of words with at most a specified maximum syllable count.
Maximum syllable count for a word to be included
Gets the median number of syllables per word in the text.
Gets the number of words with at least a specified minimum syllable count.
Minimum syllable count for a word to be included
Gets the number of monosyllabic words (words with exactly one syllable).
Gets the most common words in the text, limited to a specified number.
Optional limit: number = 5
Maximum number of common words to return
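A minimal sketch of a top-N word count with the same default limit follows. The name `mostCommonWords` and the `[word, count]` tuple return shape are assumptions; the class's actual return type is not documented here.

```typescript
// Hypothetical helper: counts words and returns the `limit` most frequent
// as [word, count] pairs, highest count first.
function mostCommonWords(words: string[], limit = 5): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const w of words) {
    counts.set(w, (counts.get(w) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit);
}
```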
Calculates various readability scores based on the text.
This method supports multiple readability metrics, selected with the metric parameter.
Optional metric: "flesch" | "fleschde" | "kincaid" = 'flesch'
The readability metric to calculate
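The three metric names plausibly correspond to the Flesch Reading Ease, its German adaptation (Amstad), and the Flesch-Kincaid grade level. The sketch below uses the commonly published coefficients for those formulas; the function name `readabilityScore`, its inputs, and the mapping of `"fleschde"` to the Amstad variant are assumptions rather than confirmed behavior of the class.

```typescript
type Metric = "flesch" | "fleschde" | "kincaid";

// asl: average sentence length in words; asw: average syllables per word.
function readabilityScore(asl: number, asw: number, metric: Metric = "flesch"): number {
  if (metric === "fleschde") {
    // Amstad's German adaptation of the Flesch Reading Ease (assumed mapping).
    return 180 - asl - 58.5 * asw;
  }
  if (metric === "kincaid") {
    // Flesch-Kincaid grade level.
    return 0.39 * asl + 11.8 * asw - 15.59;
  }
  // Flesch Reading Ease (English).
  return 206.835 - 1.015 * asl - 84.6 * asw;
}
```

The two Flesch variants yield higher scores for easier text, while the Kincaid grade rises with difficulty, so results from different metrics are not directly comparable.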
Estimates the reading time for the text based on words per minute (WPM).
Optional wpm: number = 200
Words per minute for the calculation
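The estimate reduces to dividing the word count by the reading speed. In the sketch below, the name `readingTimeMinutes`, the unit of the result (minutes), and the error on a non-positive `wpm` are assumptions; this reference does not state the unit the class returns.

```typescript
// Hypothetical helper: reading time in minutes for a given word count.
function readingTimeMinutes(wordCount: number, wpm = 200): number {
  if (wpm <= 0) {
    throw new RangeError("wpm must be positive");
  }
  return wordCount / wpm;
}
```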
Gets the number of sentences in the text.
Gets the ratio of short words (words with length <= len) to total words.
Optional len: number = 3
Maximum length for a word to be considered short
Estimates the number of syllables in the text.
Gets the frequency of Unicode codepoints in the text.
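Counting by codepoint differs from counting by UTF-16 code unit for characters outside the Basic Multilingual Plane, such as most emoji. The following sketch shows one way to count codepoints; the name `codepointFrequency` and the `Map<number, number>` return type are assumptions.

```typescript
// Hypothetical helper: maps each Unicode codepoint to its number of occurrences.
function codepointFrequency(text: string): Map<number, number> {
  const counts = new Map<number, number>();
  // Array.from iterates a string by codepoint, so surrogate pairs stay together.
  for (const ch of Array.from(text)) {
    const cp = ch.codePointAt(0);
    if (cp === undefined) continue;
    counts.set(cp, (counts.get(cp) ?? 0) + 1);
  }
  return counts;
}
```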
Calculates the ratio of uppercase letters to total letters in the text.
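One way to compute this ratio without Unicode property escapes is to treat a character as a letter when its upper and lower case forms differ. This is an assumption that only covers cased scripts, and the name `uppercaseRatio` and the zero return for letter-free text are likewise hypothetical.

```typescript
// Hypothetical helper: share of letters that are uppercase, in the range 0..1.
function uppercaseRatio(text: string): number {
  let letters = 0;
  let upper = 0;
  for (const ch of Array.from(text)) {
    // Assumed letter test: the character has distinct upper and lower forms.
    if (ch.toLowerCase() === ch.toUpperCase()) continue;
    letters++;
    if (ch === ch.toUpperCase()) upper++;
  }
  // Assumption: text without letters yields 0 instead of NaN.
  return letters === 0 ? 0 : upper / letters;
}
```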
Gets the number of words in the text.
Gets a histogram of word frequencies in the text.
Calculates the Wiener Sachtextformel (WSTF) scores for the text. The WSTF scores are a set of readability metrics based on word and sentence characteristics.
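The four Wiener Sachtextformel variants are usually quoted with the coefficients below, where MS is the percentage of words with three or more syllables, SL the mean sentence length in words, IW the percentage of words longer than six letters, and ES the percentage of monosyllabic words. The interface and function names are assumptions, and the class may compute or round its results differently.

```typescript
// Hypothetical input shape; all percentages are on a 0..100 scale.
interface WstfInput {
  ms: number; // % of words with three or more syllables
  sl: number; // mean sentence length in words
  iw: number; // % of words longer than six letters
  es: number; // % of monosyllabic words
}

interface WstfScores {
  wstf1: number;
  wstf2: number;
  wstf3: number;
  wstf4: number;
}

// Commonly cited coefficients for the four formula variants.
function wienerSachtextformel({ ms, sl, iw, es }: WstfInput): WstfScores {
  return {
    wstf1: 0.1935 * ms + 0.1672 * sl + 0.1297 * iw - 0.0327 * es - 0.875,
    wstf2: 0.2007 * ms + 0.1682 * sl + 0.1373 * iw - 2.779,
    wstf3: 0.2963 * ms + 0.1905 * sl - 1.1144,
    wstf4: 0.2744 * ms + 0.2656 * sl - 1.693,
  };
}
```

Each score approximates the school grade for which a German text is suitable, so higher values indicate harder text.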
Checks if the text contains any numbers.
Private tokenize
Tokenizes the input text into words and sentences.
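A tokenizer of this kind can be sketched with two regular expressions. The patterns below (ASCII letters, digits, and apostrophes for words; `.`, `!`, and `?` as sentence terminators), the lowercasing of words, and the names `tokenize` and `Tokens` are assumptions; the expressions held in the class's REGEX member are not shown in this reference.

```typescript
// Hypothetical result shape for the sketch.
interface Tokens {
  words: string[];
  sentences: string[];
}

function tokenize(text: string): Tokens {
  // Assumed word pattern: runs of ASCII letters, digits, or apostrophes.
  const matches = text.toLowerCase().match(/[a-z0-9']+/g);
  const words: string[] = matches ? Array.from(matches) : [];

  // Assumed sentence rule: split on runs of terminal punctuation, drop empties.
  const sentences = text
    .split(/[.!?]+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  return { words, sentences };
}
```

An ASCII-only word pattern drops accented and non-Latin letters, so a production tokenizer would likely need Unicode-aware patterns.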
Constructs a new TextAnalyzer instance with the provided input text.