textstat: a text readability calculation package

mac 2025-05-02


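textstat is a Python package for computing text readability. At the article, paragraph, or sentence level, it can compute:

Syllable count: syllable_count

Word count: lexicon_count

Sentence count: sentence_count

Various readability formulas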

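Currently supported languages are English (en), German (de), Spanish (es), French (fr), Italian (it), Dutch (nl), Polish (pl), and Russian (ru). Chinese is not supported at present.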

The available readability formulas are:

The Flesch Reading Ease formula

Flesch-Kincaid Grade Level

The Fog Scale (Gunning FOG Formula)

The SMOG Index

Automated Readability Index

The Coleman-Liau Index

Linsear Write Formula

Dale-Chall Readability Score
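The first two formulas in the list above have widely published closed forms. As a minimal sketch, assuming the standard coefficients and that the word, sentence, and syllable counts are already known, they can be written in plain Python. The function names and sample counts below are illustrative only. They are not textstat's internals, and textstat's own counting rules may give slightly different results on real text.

```python
def flesch_reading_ease_from_counts(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease from raw counts (higher means easier to read)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade_from_counts(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level from raw counts (approximate US school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for a short passage: 100 words, 5 sentences, 150 syllables.
ease = flesch_reading_ease_from_counts(100, 5, 150)
grade = flesch_kincaid_grade_from_counts(100, 5, 150)
print(ease, grade)
```

Both formulas depend on the same two ratios, words per sentence and syllables per word, which is why the syllable and sentence counters described in this post are the building blocks for the scores.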

Installation

!pip3 install textstat

Syllable count

textstat.syllable_count(text)

import textstat

test = 'Playing games'
textstat.syllable_count(test)

Run

3

Word count

textstat.lexicon_count(text, removepunct=True)

test2 = "Playing games has always!"
textstat.lexicon_count(test2, removepunct=True)

Run

4
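To make the effect of removepunct concrete, here is a rough pure-Python approximation, assuming the count is simply the number of whitespace-separated tokens after optionally deleting ASCII punctuation. The helper name lexicon_count_sketch is hypothetical, and textstat's real implementation may treat some characters (apostrophes, for example) differently.

```python
import string


def lexicon_count_sketch(text: str, removepunct: bool = True) -> int:
    """Count whitespace-separated tokens, optionally stripping punctuation first."""
    if removepunct:
        # Delete every ASCII punctuation character before splitting.
        text = text.translate(str.maketrans("", "", string.punctuation))
    return len(text.split())


print(lexicon_count_sketch("Playing games has always!"))           # 4
print(lexicon_count_sketch("Hello , world !", removepunct=True))   # 2
print(lexicon_count_sketch("Hello , world !", removepunct=False))  # 4
```

In this sketch the flag only changes the result when punctuation stands alone as its own token. Punctuation attached to a word, as in "always!", does not add to the count either way.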

Readability

Each function takes text as input and returns a readability value.

textstat.flesch_reading_ease(text)

textstat.smog_index(text)

textstat.flesch_kincaid_grade(text)

textstat.coleman_liau_index(text)

textstat.automated_readability_index(text)

textstat.dale_chall_readability_score(text)

textstat.difficult_words(text)

textstat.linsear_write_formula(text)

textstat.gunning_fog(text)

textstat.text_standard(text)

For details on each formula, see the project on GitHub:

https://github.com/shivam5992/textstat

It explains how each score is calculated and how to interpret it.

test_data = (
    "Playing games has always been thought to be important to the development of well-balanced "
    "and creative children; however, what part, if any, they should play in the lives of "
    "adults has never been researched that deeply. I believe that playing games is every bit "
    "as important for adults as for children. Not only is taking time out to play games with our "
    "children and other adults valuable to building interpersonal relationships but is also a wonderful way "
    "to release built up tension."
)

Run

print(textstat.flesch_reading_ease(test_data))
print(textstat.smog_index(test_data))
print(textstat.flesch_kincaid_grade(test_data))
print(textstat.coleman_liau_index(test_data))
print(textstat.automated_readability_index(test_data))
print(textstat.dale_chall_readability_score(test_data))
print(textstat.difficult_words(test_data))
print(textstat.linsear_write_formula(test_data))
print(textstat.gunning_fog(test_data))
print(textstat.text_standard(test_data))

Run

52.23
12.5
12.8
11.03
15.5
6.72
9
16.333333333333332
12.38
12th and 13th grade
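The first value in the output above, 52.23, is the Flesch Reading Ease score. As a small sketch of how such a score is usually read, the helper below maps a score to the commonly cited difficulty bands. The function name flesch_band is hypothetical and not part of textstat, and the exact band labels vary slightly between sources.

```python
def flesch_band(score: float) -> str:
    """Map a Flesch Reading Ease score to a commonly cited difficulty band."""
    bands = [
        (90, "Very easy"),
        (80, "Easy"),
        (70, "Fairly easy"),
        (60, "Standard"),
        (50, "Fairly difficult"),
        (30, "Difficult"),
    ]
    # Bands are ordered from easiest to hardest; return the first one reached.
    for lower_bound, label in bands:
        if score >= lower_bound:
            return label
    return "Very confusing"


print(flesch_band(52.23))  # Fairly difficult
```

Unlike the grade-level scores in the rest of the output, a higher Flesch Reading Ease value means easier text, so the two kinds of numbers should not be compared directly.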
