
Readability

Readability is the ease with which a reader can understand a written text. In natural language, the readability of text depends on its content (the complexity of its vocabulary and syntax) and its presentation (such as typographic aspects that affect legibility, like font size, line height, character spacing, and line length).[1] Researchers have used various factors to measure readability, such as:

  • Speed of perception
  • Perceptibility at a distance
  • Perceptibility in peripheral vision
  • Visibility
  • Reflex blink technique
  • Rate of work (reading speed)
  • Eye movements
  • Fatigue in reading[2]
  • Cognitively-motivated features[3]
  • Word difficulty
  • N-gram analysis[4]
  • Semantic Richness[5]

Higher readability eases reading effort and speed for any reader, but it makes a larger difference for those who do not have high reading comprehension.

Readability exists in both natural language and programming languages, though in different forms. In programming, factors such as programmer comments, choice of loop structure, and choice of names can determine the ease with which humans read computer program code.

Numeric readability metrics (also known as readability tests or readability formulas) for natural language tend to use simple measures like word length (by letter or syllable), sentence length, and sometimes some measure of word frequency. They can be built into word processors,[6] can score documents, paragraphs, or sentences, and are a much cheaper and faster alternative to a readability survey involving human readers. They are faster to calculate than more accurate measures of syntactic and semantic complexity. In some cases they are used to estimate appropriate grade level.

Definition

Readability has been defined in various ways, e.g., in The Literacy Dictionary,[7] and by Jeanne Chall and Edgar Dale,[8] G. Harry McLaughlin,[9] and William DuBay.[10]

Applications

Easy reading helps learning and enjoyment,[11] and can save money.[12]

Much research has focused on matching prose to reading skill, resulting in formulas for use in research, government, teaching, publishing, the military, medicine, and business.[13][14]

Readability and newspaper readership

Several studies in the 1940s showed that even small increases in readability greatly increase readership in large-circulation newspapers.

In 1947, Donald Murphy of Wallace's Farmer used a split-run edition to study the effects of making text easier to read. He found that reducing from the 9th to the 6th-grade reading level increased readership by 43% for an article on 'nylon'. The result was a gain of 42,000 readers in a circulation of 275,000. He also found a 60% increase in readership for an article on corn, with better responses from people under 35.[15]

Wilbur Schramm interviewed 1,050 newspaper readers. He found that an easier reading style helps determine how much of an article is read. This was called reading persistence, depth, or perseverance. He also found that people read less of long articles than of short ones. A story nine paragraphs long will lose 3 out of 10 readers by the fifth paragraph; a shorter story will lose only two. Schramm also found that the use of subheads, bold-face paragraphs, and stars to break up a story actually loses readers.[16]

A study in 1947 by Melvin Lostutter showed that newspapers generally were written at a level five years above the ability of average American adult readers.

The reading ease of newspaper articles was not found to have much connection with the education, experience, or personal interest of the journalists writing the stories. It instead had more to do with the convention and culture of the industry. Lostutter argued for more readability testing in newspaper writing. Improved readability must be a "conscious process somewhat independent of the education and experience of the staff writers."[17]

A study by Charles Swanson in 1948 showed that better readability increases the total number of paragraphs read by 93% and the number of readers reading every paragraph by 82%.[18]

In 1948, Bernard Feld did a study of every item and ad in the Birmingham News of 20 November 1947. He divided the items into those above the 8th-grade level and those at the 8th grade or below. He chose the 8th-grade breakpoint, as that was determined to be the average reading level of adult readers. An 8th-grade text "...will reach about 50% of all American grown-ups," he wrote. Among the wire-service stories, the lower group got two-thirds more readers, and among local stories, 75% more readers. Feld also believed in drilling writers in Flesch's clear-writing principles.[19]

Both Rudolf Flesch and Robert Gunning worked extensively with newspapers and the wire services to improve readability. Largely through their efforts, within a few years the readability of US newspapers went from the 16th- to the 11th-grade level, where it remains today.

The two publications with the largest circulations, TV Guide (13 million) and Reader's Digest (12 million), are written at the 9th-grade level.[10] The most popular novels are written at the 7th-grade level. This is consistent with the average adult reading at the 9th-grade level. It also shows that, for recreation, people read texts that are two grades below their actual reading level.[20]

The George Klare studies

George Klare and his colleagues looked at the effects of greater reading ease on Air Force recruits. They found that more readable texts resulted in greater and more complete learning. They also increased the amount read in a given time, and made for easier acceptance.[21][22]

Other studies by Klare showed how the reader's skills,[23] prior knowledge,[24] interest, and motivation[23][24] affect reading ease.

Early research

In the 1880s, English professor L. A. Sherman found that the English sentence was getting shorter. In Elizabethan times, the average sentence was 50 words long. In his own time, it was 23 words long.

Sherman's work established that:

  • Literature is a subject for statistical analysis.
  • Shorter sentences and concrete terms help people to make sense of what is written.
  • Speech is easier to understand than text.
  • Over time, text becomes easier if it is more like speech.

Sherman wrote: "Literary English, in short, will follow the forms of standard spoken English from which it comes. No man should talk worse than he writes, no man should write better than he should talk.... The oral sentence is clearest because it is the product of millions of daily efforts to be clear and strong. It represents the work of the race for thousands of years in perfecting an effective instrument of communication."[25]

In 1889 in Russia, the writer Nikolai A. Rubakin published a study of over 10,000 texts written by everyday people.[26] From these texts, he took 1,500 words he thought most people understood. He found that the main blocks to comprehension are unfamiliar words and long sentences.[27] Starting with his own journal at the age of 13, Rubakin published many articles and books on science and many subjects for the great numbers of new readers throughout Russia. In Rubakin's view, the people were not fools. They were simply poor and in need of cheap books, written at a level they could grasp.[26]

In 1921, Harry D. Kitson published The Mind of the Buyer, one of the first books to apply psychology to marketing. Kitson's work showed that each type of reader bought and read their own type of text. On reading two newspapers and two magazines, he found that short sentence length and short word length were the best contributors to reading ease.[28]

Text leveling

The earliest reading ease assessment is the subjective judgment termed text leveling. Formulas do not fully address the various content, purpose, design, visual input, and organization of a text.[29][30][31] Text leveling is commonly used to rank the reading ease of texts in areas where reading difficulties are easy to identify, such as books for young children. At higher levels, ranking reading ease becomes more difficult, as individual difficulties become harder to identify. This has led to better ways to assess reading ease.

Vocabulary frequency lists

In the 1920s, the scientific movement in education looked for tests to measure students' achievement to aid in curriculum development. Teachers and educators had long known that, to improve reading skill, readers—especially beginning readers—need reading material that closely matches their ability. University-based psychologists did much of the early research, which was later taken up by textbook publishers.[11]

Educational psychologist Edward Thorndike of Columbia University noted that, in Russia and Germany, teachers used word frequency counts to match books to students. Word skill was the best sign of intellectual development, and the strongest predictor of reading ease. In 1921, Thorndike published Teachers Word Book, which contained the frequencies of 10,000 words.[32] It made it easier for teachers to choose books that matched class reading skills. It also provided a basis for future research on reading ease.

Until computers came along, word frequency lists were the best aids for grading reading ease of texts.[20] In 1981 the World Book Encyclopedia listed the grade levels of 44,000 words.[33]

Early children's readability formulas

In 1923, Bertha A. Lively and Sidney L. Pressey published the first reading ease formula. They were concerned that junior high school science textbooks had so many technical words. They felt that teachers spent all class time explaining these words. They argued that their formula would help to measure and reduce the "vocabulary burden" of textbooks. Their formula used five variable inputs and six constants. For each thousand words, it counted the number of unique words, the number of words not on the Thorndike list, and the median index number of the words found on the list. Manually, it took three hours to apply the formula to a book.[34]

After the Lively–Pressey study, people looked for formulas that were more accurate and easier to apply. By 1980, over 200 formulas had been published in different languages.[35] In 1928, Carleton Washburne and Mabel Vogel created the first modern readability formula. They validated it by using an outside criterion, and it correlated .845 with test scores of students who read and liked the criterion books.[36] It was also the first to introduce the variable of interest to the concept of readability.[37]

Between 1929 and 1939, Alfred Lewerenz of the Los Angeles School District published several new formulas.[38][39][40][41][42]

In 1934, Edward Thorndike published his formula. He wrote that word skills can be increased if the teacher introduces new words and repeats them often.[43] In 1939, W. W. Patty and W. I. Painter published a formula for measuring the vocabulary burden of textbooks. This was the last of the early formulas that used the Thorndike vocabulary-frequency list.[44]

Early adult readability formulas

During the recession of the 1930s, the U.S. government invested in adult education. In 1931, Douglas Waples and Ralph Tyler published What Adults Want to Read About. It was a two-year study of adult reading interests. Their book showed not only what people read but what they would like to read. They found that many readers lacked suitable reading materials: they would have liked to learn but the reading materials were too hard for them.[45]

Lyman Bryson of Teachers College, Columbia University found that many adults had poor reading ability due to poor education. Even though colleges had long tried to teach how to write in a clear and readable style, Bryson found that it was rare. He wrote that such language is the result of a "...discipline and artistry that few people who have ideas will take the trouble to achieve... If simple language were easy, many of our problems would have been solved long ago."[20] Bryson helped set up the Readability Laboratory at the college. Two of his students were Irving Lorge and Rudolf Flesch.

In 1934, Ralph Ojemann investigated adult reading skills, factors that most directly affect reading ease, and causes of each level of difficulty. He did not invent a formula, but a method for assessing the difficulty of materials for parent education. He was the first to assess the validity of this method by using 16 magazine passages tested on actual readers. He evaluated 14 measurable and three reported factors that affect reading ease.

Ojemann emphasized the reported features, such as whether the text was coherent or unduly abstract. He used his 16 passages to compare and judge the reading ease of other texts, a method now called scaling. He showed that even though these factors cannot be measured, they cannot be ignored.[46]

Also in 1934, Ralph Tyler and Edgar Dale published the first adult reading ease formula based on passages on health topics from a variety of textbooks and magazines. Of 29 factors that are significant for young readers, they found ten that are significant for adults. They used three of these in their formula.[47]

In 1935, William S. Gray of the University of Chicago and Bernice Leary of Xavier College in Chicago published What Makes a Book Readable, one of the most important books in readability research. Like Dale and Tyler, they focused on what makes books readable for adults of limited reading ability. Their book included the first scientific study of the reading skills of American adults. The sample included 1,690 adults from a variety of settings and regions. The test used a number of passages from newspapers, magazines, and books—as well as a standard reading test. They found a mean grade score of 7.81 (eighth month of the seventh grade). About one-third read at the 2nd to 6th-grade level, one-third at the 7th to 12th-grade level, and one-third at the 13th–17th grade level.

The authors emphasized that one-half of the adult population at that time lacked suitable reading materials. They wrote, "For them, the enriching values of reading are denied unless materials reflecting adult interests are adapted to their needs." The poorest readers, one-sixth of the adult population, need "simpler materials for use in promoting functioning literacy and in establishing fundamental reading habits."[48]

Gray and Leary then analyzed 228 variables that affect reading ease and divided them into four types:

  1. Content
  2. Style
  3. Format
  4. Organization

They found that content was most important, followed closely by style. Third was format, followed closely by organization. They found no way to measure content, format, or organization, but they could measure variables of style. From the 17 significant measurable style variables, they selected five to create a formula.

Their formula had a correlation of .645 with comprehension as measured by reading tests given to about 800 adults.[48]

In 1939, Irving Lorge published an article that reported other combinations of variables that indicate difficulty more accurately than the ones Gray and Leary used. His research also showed that, "The vocabulary load is the most important concomitant of difficulty."[49] In 1944, Lorge published his Lorge Index, a readability formula that used three variables and set the stage for simpler and more reliable formulas that followed.[50]

By 1940, investigators had:

  • Successfully used statistical methods to analyze reading ease
  • Found that unusual words and sentence length were among the first causes of reading difficulty
  • Used vocabulary and sentence length in formulas to predict reading ease

Popular readability formulas

The Flesch formulas

In 1943, Rudolf Flesch published his PhD dissertation, Marks of a Readable Style, which included a readability formula to predict the difficulty of adult reading material. Investigators in many fields began using it to improve communications. One of the variables it used was personal references, such as names and personal pronouns. Another variable was affixes.[51]

In 1948, Flesch published his Reading Ease formula in two parts. Rather than using grade levels, it used a scale from 0 to 100, with 0 equivalent to the 12th grade and 100 equivalent to the 4th grade. It dropped the use of affixes. The second part of the formula predicts human interest by using personal references and the number of personal sentences. The new formula correlated 0.70 with the McCall-Crabbs reading tests.[52] The original formula is:

Reading Ease score = 206.835 − (1.015 × ASL) − (84.6 × ASW)
Where: ASL = average sentence length (number of words divided by number of sentences)
ASW = average word length in syllables (number of syllables divided by number of words)
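As a minimal sketch, the formula can be computed in Python from raw counts (syllable counting itself is left as an input, since it is the hard part in practice):

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease (1948): higher scores mean easier text."""
    asl = words / sentences        # average sentence length
    asw = syllables / words        # average syllables per word
    return 206.835 - 1.015 * asl - 84.6 * asw

# e.g. a 100-word sample with 5 sentences and 130 syllables
score = flesch_reading_ease(100, 5, 130)   # ≈ 76.6
```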

Publishers discovered that the Flesch formulas could increase readership up to 60%. Flesch's work also made an enormous impact on journalism. The Flesch Reading Ease formula became one of the most widely used, tested, and reliable readability metrics.[53][54] In 1951, Farr, Jenkins, and Patterson simplified the formula further by changing the syllable count. The modified formula is:

New reading ease score = 1.599 × nosw − 1.015 × sl − 31.517
Where: nosw = number of one-syllable words per 100 words and
sl = average sentence length in words.[55]

In 1975, in a project sponsored by the U.S. Navy, the Reading Ease formula was recalculated to give a grade-level score. The new formula is now called the Flesch–Kincaid grade-level formula.[56] The Flesch–Kincaid formula is one of the most popular and heavily tested formulas. It correlates 0.91 with comprehension as measured by reading tests.[10]
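The recalculated formula, published as the Flesch–Kincaid grade level (0.39 × ASL + 11.8 × ASW − 15.59, per Kincaid et al.; the coefficients are not given in the text above), can be sketched as:

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level (Kincaid et al., 1975)."""
    asl = words / sentences        # average sentence length
    asw = syllables / words        # average syllables per word
    return 0.39 * asl + 11.8 * asw - 15.59

grade = flesch_kincaid_grade(100, 5, 130)   # 7.55, i.e. mid-7th-grade text
```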

The Dale–Chall formula

Edgar Dale, a professor of education at Ohio State University, was one of the first critics of Thorndike's vocabulary-frequency lists. He claimed that they did not distinguish between the different meanings that many words have. He created two new lists of his own. One, his "short list" of 769 easy words, was used by Irving Lorge in his formula. The other was his "long list" of 3,000 easy words, which were understood by 80% of fourth-grade students. However, the word list must be extended with regular plurals of nouns, regular past-tense and progressive forms of verbs, and so on. In 1948, Dale incorporated this list into a formula he developed with Jeanne S. Chall, who later founded the Harvard Reading Laboratory.

To apply the formula:

  1. Select several 100-word samples throughout the text.
  2. Compute the average sentence length in words (divide the number of words by the number of sentences).
  3. Compute the percentage of words NOT on the Dale–Chall word list of 3,000 easy words.
  4. Compute this equation from 1948:
    Raw score = 0.1579 × PDW + 0.0496 × ASL, if PDW is less than 5%; otherwise compute
    Raw score = 0.1579 × PDW + 0.0496 × ASL + 3.6365

Where:

Raw score = uncorrected reading grade of a student who can answer one-half of the test questions on a passage.
PDW = Percentage of difficult words not on the Dale–Chall word list.
ASL = Average sentence length

Finally, to compensate for the "grade-equivalent curve", apply the following chart for the Final Score:

Raw score      Final score
4.9 and below  Grade 4 and below
5.0–5.9        Grades 5–6
6.0–6.9        Grades 7–8
7.0–7.9        Grades 9–10
8.0–8.9        Grades 11–12
9.0–9.9        Grades 13–15 (college)
10 and above   Grades 16 and above[57]
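The steps and chart above can be sketched as follows (the word list itself is assumed to be available, entering only through the PDW percentage):

```python
def dale_chall_raw(pdw, asl):
    """1948 Dale-Chall raw score.
    pdw: percentage of words NOT on the 3,000-word list (7.5 means 7.5%)
    asl: average sentence length in words
    """
    score = 0.1579 * pdw + 0.0496 * asl
    if pdw >= 5:            # harder texts get the adjustment constant
        score += 3.6365
    return score

def dale_chall_grade(raw):
    """Map a raw score onto the grade-equivalent chart."""
    bands = [(4.9, "Grade 4 and below"), (5.9, "Grades 5-6"),
             (6.9, "Grades 7-8"), (7.9, "Grades 9-10"),
             (8.9, "Grades 11-12"), (9.9, "Grades 13-15 (college)")]
    for upper, label in bands:
        if raw <= upper:
            return label
    return "Grades 16 and above"
```

For example, a sample with 10% difficult words and 15-word sentences scores 0.1579 × 10 + 0.0496 × 15 + 3.6365 ≈ 5.96.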

Correlating 0.93 with comprehension as measured by reading tests, the Dale–Chall formula is among the most reliable formulas and is widely used in scientific research.

In 1995, Dale and Chall published a new version of their formula with an upgraded word list, the New Dale–Chall readability formula.[58] Its formula is:

Raw score = 64 − 0.95 × PDW − 0.69 × ASL

The Gunning fog formula

In the 1940s, Robert Gunning helped bring readability research into the workplace. In 1944, he founded the first readability consulting firm dedicated to reducing the "fog" in newspapers and business writing. In 1952, he published The Technique of Clear Writing with his own Fog Index, a formula that correlates 0.91 with comprehension as measured by reading tests.[10] The formula is one of the most reliable and simplest to apply:

Grade level = 0.4 × (average sentence length + percentage of Hard Words)
Where: Hard Words = words with more than two syllables.[59]
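A minimal sketch from raw counts:

```python
def gunning_fog(words, sentences, hard_words):
    """Gunning Fog Index; hard words have more than two syllables."""
    average_sentence_length = words / sentences
    percent_hard = 100 * hard_words / words
    return 0.4 * (average_sentence_length + percent_hard)

# 100 words in 5 sentences with 10 hard words
level = gunning_fog(100, 5, 10)   # 0.4 * (20 + 10) = 12.0
```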

Fry readability graph

In 1963, while teaching English teachers in Uganda, Edward Fry developed his Readability Graph. It became one of the most popular formulas and easiest to apply.[60][61] The Fry Graph correlates 0.86 with comprehension as measured by reading tests.[10]

McLaughlin's SMOG formula

Harry McLaughlin determined that word length and sentence length should be multiplied rather than added as in other formulas. In 1969, he published his SMOG (Simple Measure of Gobbledygook) formula:

SMOG grading = 3 + √(polysyllable count).
Where: polysyllable count = number of words of more than two syllables in a sample of 30 sentences.[9]

The SMOG formula correlates 0.88 with comprehension as measured by reading tests.[10] It is often recommended for use in healthcare.[62]
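McLaughlin's published formula takes the square root of the polysyllable count; a sketch:

```python
import math

def smog_grade(polysyllables):
    """SMOG grade from the polysyllable count in a 30-sentence sample."""
    return 3 + math.sqrt(polysyllables)

grade = smog_grade(25)   # 3 + 5 = 8.0, i.e. an 8th-grade reading level
```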

The FORCAST formula

In 1973, a study commissioned by the US military of the reading skills required for different military jobs produced the FORCAST formula. Unlike most other formulas, it uses only a vocabulary element, making it useful for texts without complete sentences. The formula satisfied requirements that it would be:

  • Based on Army-job reading materials.
  • Suitable for the young adult-male recruits.
  • Easy enough for Army clerical personnel to use without special training or equipment.

The formula is:

Grade level = 20 − (N / 10)
Where N = number of single-syllable words in a 150-word sample.[63]

The FORCAST formula correlates 0.66 with comprehension as measured by reading tests.[10]
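A sketch of the formula:

```python
def forcast_grade(monosyllables):
    """FORCAST grade level; monosyllables counted in a 150-word sample."""
    return 20 - monosyllables / 10

# a passage with 110 one-syllable words per 150 words
level = forcast_grade(110)   # 20 - 11 = 9.0
```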

The Golub Syntactic Density Score

The Golub Syntactic Density Score was developed by Lester Golub in 1974. It is among a smaller subset of readability formulas that concentrate on the syntactic features of a text. To calculate the reading level of a text, a sample of several hundred words is taken from the text. The number of words in the sample is counted, as are the number of T-units. A T-unit is defined as an independent clause and any dependent clauses attached to it. Other syntactical units are then counted and entered into the following table:

  1. Words/T-unit × .95
  2. Subordinate clauses/T-unit × .90
  3. Main clause word length (mean) × .20
  4. Subordinate clause length (mean) × .50
  5. Number of modals (will, shall, can, may, must, would...) × .65
  6. Number of be and have forms in the auxiliary × .40
  7. Number of prepositional phrases × .75
  8. Number of possessive nouns and pronouns × .70
  9. Number of adverbs of time (when, then, once, while...) × .60
  10. Number of gerunds, participles, and absolute phrases × .85

Each count is multiplied by its weight; the weighted counts are added, and the total is divided by the number of T-units. Finally, the quotient is entered into the following table to arrive at a final readability score.

SDS 0.5 1.3 2.1 2.9 3.7 4.5 5.3 6.1 6.9 7.7 8.5 9.3 10.1 10.9
Grade 1 2 3 4 5 6 7 8 9 10 11 12 13 14
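A sketch of the calculation (the count names are illustrative; the weights are those in the table above, and the SDS-to-grade chart is linear, with grade n corresponding to SDS 0.5 + 0.8 × (n − 1)):

```python
# Weights from Golub's table; the dict keys are illustrative names.
GOLUB_WEIGHTS = {
    "words_per_t_unit": 0.95,
    "subordinate_clauses_per_t_unit": 0.90,
    "main_clause_word_length": 0.20,
    "subordinate_clause_length": 0.50,
    "modals": 0.65,
    "be_have_auxiliaries": 0.40,
    "prepositional_phrases": 0.75,
    "possessives": 0.70,
    "adverbs_of_time": 0.60,
    "gerunds_participles_absolutes": 0.85,
}

def golub_sds(counts, t_units):
    """Syntactic Density Score: weighted counts summed, divided by T-units."""
    return sum(GOLUB_WEIGHTS[k] * v for k, v in counts.items()) / t_units

def sds_to_grade(sds):
    """Invert the linear SDS-to-grade chart (SDS 0.5 -> grade 1, step 0.8)."""
    return round((sds - 0.5) / 0.8) + 1
```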

Measuring coherence and organization

For centuries, teachers and educators have seen the importance of organization, coherence, and emphasis in good writing. Beginning in the 1970s, cognitive theorists began teaching that reading is really an act of thinking and organization. The reader constructs meaning by mixing new knowledge into existing knowledge. Because of the limits of the reading ease formulas, some research looked at ways to measure the content, organization, and coherence of text. Although this did not improve the reliability of the formulas, their efforts showed the importance of these variables in reading ease.

Studies by Walter Kintsch and others showed the central role of coherence in reading ease, mainly for people learning to read.[64] In 1983, Susan Kemper devised a formula based on physical states and mental states. However, she found this was no better than word familiarity and sentence length in showing reading ease.[65]

Bonnie Meyer and others tried to use organization as a measure of reading ease. While this did not result in a formula, they showed that people read faster and retain more when a text is organized into topics. Meyer found that a visible plan for presenting content greatly helps readers to assess a text. A hierarchical plan shows how the parts of the text are related. It also aids the reader in blending new information into existing knowledge structures.[66]

Bonnie Armbruster found that the most important feature for learning and comprehension is textual coherence, which comes in two types:

  • Global coherence, which integrates high-level ideas as themes in an entire section, chapter, or book.
  • Local coherence, which joins ideas within and between sentences.

Armbruster confirmed Kintsch's finding that coherence and structure are more help for younger readers.[67] R. C. Calfee and R. Curley built on Bonnie Meyer's work and found that an unfamiliar underlying structure can make even simple text hard to read. They brought in a graded system to help students progress from simpler story lines to more advanced and abstract ones.[68]

Many other studies looked at the effects on reading ease of other text variables, including:

  • Image words, abstraction, direct and indirect statements, types of narration and sentences, phrases, and clauses;[48]
  • Difficult concepts;[54]
  • Idea density;[69]
  • Human interest;[59][70]
  • Nominalization;[71]
  • Active and passive voice;[72][73][74][75]
  • Embeddedness;[73]
  • Structural cues;[76][77]
  • The use of images;[78][79]
  • Diagrams and line graphs;[80]
  • Highlighting;[81]
  • Fonts and layout;[82]
  • Document age.[83]

Advanced readability formulas

The John Bormuth formulas

John Bormuth of the University of Chicago studied reading ease using the new Cloze deletion test developed by Wilson Taylor. His work supported earlier research, including on the degree of reading ease best suited to each kind of reading. The best level for classroom "assisted reading" is a slightly difficult text that causes a "set to learn", and for which readers can correctly answer 50% of the questions of a multiple-choice test. The best level for unassisted reading is one for which readers can correctly answer 80% of the questions. These cutoff scores were later confirmed by Vygotsky[84] and Chall and Conard.[85] Among other things, Bormuth confirmed that vocabulary and sentence length are the best indicators of reading ease. He showed that measures of reading ease work as well for adults as for children: what children find hard, adults of the same reading level find hard too. He also developed several new measures of cutoff scores. One of the best known was the Mean Cloze Formula, which was used in 1981 to produce the Degree of Reading Power system used by the College Entrance Examination Board.[86][87][88]

The Lexile framework

In 1988, Jack Stenner and his associates at MetaMetrics, Inc. published a new system, the Lexile Framework, for assessing readability and matching students with appropriate texts.

The Lexile framework uses average sentence length, and average word frequency in the American Heritage Intermediate Corpus to predict a score on a 0–2000 scale. The AHI Corpus includes five million words from 1,045 published works often read by students in grades three to nine.

The Lexile Book Database has more than 100,000 titles from more than 450 publishers. By knowing a student's Lexile score, a teacher can find books that match his or her reading level.[89]

ATOS readability formula for books

In 2000, researchers of the School Renaissance Institute and Touchstone Applied Science Associates published their Advantage-TASA Open Standard (ATOS) Reading Ease Formula for Books. They sought a formula that was easy to use and could be applied to any text.

The project was one of the largest readability studies ever undertaken. The developers of the formula used 650 normed reading texts and 474 million words from all the text in 28,000 books read by students. The project also used the reading records of more than 30,000 readers who read and were tested on 950,000 books.

They found that three variables give the most reliable measure of text reading ease:

  • words per sentence
  • average grade level of words
  • characters per word

They also found that:

  • To help learning, the teacher should match book reading ease with reading skill.
  • Reading often helps with reading gains.
  • For reading alone below the 4th grade, the best learning gain requires at least 85% comprehension.
  • Advanced readers need 92% comprehension for independent reading.
  • Book length can be a good measure of reading ease.
  • Feedback and interaction with the teacher are the most important factors in reading.[90][91]

Coh-Metrix psycholinguistic measurements

Coh-Metrix can be used in many different ways to investigate the cohesion of the explicit text and the coherence of the mental representation of the text. "Our definition of cohesion consists of characteristics of the explicit text that play some role in helping the reader mentally connect ideas in the text."[92] The definition of coherence is the subject of much debate. Theoretically, the coherence of a text is defined by the interaction between linguistic representations and knowledge representations. While coherence can be defined as characteristics of the text (i.e., aspects of cohesion) that are likely to contribute to the coherence of the mental representation, Coh-Metrix measurements provide indices of these cohesion characteristics.[92]

Other formulas

Artificial Intelligence (AI) approach

Unlike the traditional readability formulas, artificial intelligence approaches to readability assessment (also known as Automatic Readability Assessment) incorporate myriad linguistic features and construct statistical prediction models to predict text readability.[4][93] These approaches typically consist of three steps: 1. a training corpus of individual texts, 2. a set of linguistic features to be computed from each text, and 3. a machine learning model to predict the readability, using the computed linguistic feature values.[94][95][93]
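The three steps can be illustrated with a toy corpus and a deliberately simple model (a nearest-centroid classifier over two hand-picked features; real systems use far richer feature sets and learned models):

```python
def features(text):
    """Step 2: compute simple linguistic feature values for one text."""
    words = text.split()
    sentences = max(text.count("."), 1)
    return (len(words) / sentences,                    # avg sentence length
            sum(len(w) for w in words) / len(words))   # avg word length

# Step 1: a toy training corpus labeled by readability level
corpus = {
    "easy": ["The cat sat. It was big.", "A dog ran. He was fast."],
    "hard": ["Prevailing macroeconomic indicators fluctuated considerably."],
}

# Step 3: the "model" here is a nearest centroid in feature space
centroids = {level: [sum(vals) / len(texts)
                     for vals in zip(*map(features, texts))]
             for level, texts in corpus.items()}

def predict(text):
    fx = features(text)
    return min(centroids, key=lambda lv: sum((a - b) ** 2
                                             for a, b in zip(fx, centroids[lv])))
```

For instance, predict("The sun rose. It was warm.") lands nearest the "easy" centroid.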

Corpora

WeeBit

In 2012, Sowmya Vajjala at the University of Tübingen created the WeeBit corpus by combining educational articles from the Weekly Reader and BBC Bitesize websites, which provide texts for different age groups.[95] In total, there are 3,125 articles divided into five readability levels (from age 7 to 16). The WeeBit corpus has been used in several AI-based readability assessment studies.[96]

Newsela

Wei Xu (University of Pennsylvania), Chris Callison-Burch (University of Pennsylvania), and Courtney Napoles (Johns Hopkins University) introduced the Newsela corpus to the academic field in 2015.[97] The corpus is a collection of thousands of news articles professionally leveled to different reading complexities by professional editors at Newsela. The corpus was originally introduced for text simplification research, but was also used for text readability assessment.[98]

Linguistic features

Lexico-Semantic

The type-token ratio is one of the features often used to capture lexical richness, a measure of vocabulary range and diversity. To measure the lexical difficulty of a word, its relative frequency in a representative corpus like the Corpus of Contemporary American English (COCA) is often used. Some examples of lexico-semantic features used in readability assessment include:[96]

  • Average number of syllables per word
  • Out-of-vocabulary rate, in comparison to the full corpus
  • Type-token ratio: the ratio of unique terms to total terms observed
  • Ratio of function words, in comparison to the full corpus
  • Ratio of pronouns, in comparison to the full corpus
  • Language model perplexity (comparing the text to generic or genre-specific models)
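Two of these features can be computed directly from tokenized text. In this sketch the reference vocabulary is a five-word toy stand-in for a real frequency list such as COCA:

```python
def type_token_ratio(words):
    """Ratio of unique terms to total terms observed."""
    return len({w.lower() for w in words}) / len(words)

def out_of_vocabulary_rate(words, vocabulary):
    """Fraction of tokens not found in a reference word list."""
    return sum(1 for w in words if w.lower() not in vocabulary) / len(words)

# Toy reference vocabulary standing in for a frequency list such as COCA.
vocab = {"the", "cat", "sat", "on", "mat"}
tokens = "the cat sat on the ubiquitous mat".split()

ttr = type_token_ratio(tokens)               # 6 unique terms out of 7
oov = out_of_vocabulary_rate(tokens, vocab)  # only "ubiquitous" is out of vocabulary
```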

In addition, Lijun Feng pioneered cognitively-motivated features (mostly lexical) in 2009, during her doctoral studies at the City University of New York (CUNY).[99] The cognitively-motivated features were originally designed for adults with intellectual disabilities, but were shown to improve readability assessment accuracy in general. Cognitively-motivated features, combined with a logistic regression model, can reduce the average error of the Flesch–Kincaid grade level by more than 70%. Features introduced by Feng include:

  • Number of lexical chains in document
  • Average number of unique entities per sentence
  • Average number of entity mentions per sentence
  • Total number of unique entities in document
  • Total number of entity mentions in document
  • Average lexical chain length
  • Average lexical chain span
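The four entity-density features above can be computed once entity mentions have been extracted; this sketch assumes a hypothetical upstream named-entity recognizer has already produced per-sentence mention lists (lexical-chain features would need additional tooling and are omitted):

```python
def entity_density_features(sentence_mentions):
    """Entity-density features from per-sentence entity-mention lists.

    `sentence_mentions` holds, for each sentence, the list of entity mentions
    found in it (here supplied by hand; normally by an NER system).
    """
    n = len(sentence_mentions)
    all_mentions = [m for sent in sentence_mentions for m in sent]
    return {
        "avg_unique_entities_per_sentence": sum(len(set(s)) for s in sentence_mentions) / n,
        "avg_entity_mentions_per_sentence": sum(len(s) for s in sentence_mentions) / n,
        "total_unique_entities": len(set(all_mentions)),
        "total_entity_mentions": len(all_mentions),
    }

# Two sentences: the first mentions "Feng" and "CUNY", the second "Feng" again.
feats = entity_density_features([["Feng", "CUNY"], ["Feng"]])
```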

Syntactic

Syntactic complexity is correlated with longer processing times in text comprehension.[100] It is common to use a rich set of syntactic features to predict the readability of a text. More advanced variants of syntactic readability features are frequently computed from parse trees. Emily Pitler and Ani Nenkova (both University of Pennsylvania) are considered pioneers in evaluating parse-tree syntactic features and making them widely used in readability assessment.[101][96] Some examples include:

  • Average sentence length
  • Average parse tree height
  • Average number of noun phrases per sentence
  • Average number of verb phrases per sentence
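Parse-tree height, for instance, can be read off a bracketed constituency parse by tracking nesting depth. The parses below are hand-written examples rather than the output of any particular parser:

```python
def tree_height(parse):
    """Height of a parse tree given in bracketed (S-expression) form."""
    depth = max_depth = 0
    for ch in parse:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
    return max_depth

# Hand-written constituency parses (normally produced by a parser).
parses = [
    "(S (NP (DT The) (NN cat)) (VP (VBD sat)))",
    "(S (NP (PRP It)) (VP (VBD was) (ADJP (JJ warm))))",
]
avg_height = sum(tree_height(p) for p in parses) / len(parses)
```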

Using the readability formulas

The accuracy of readability formulas increases when they are averaged over a large number of works. The tests generate a score based on characteristics such as average word length (an unreliable proxy for semantic difficulty; some formulas also take word frequency into account) and average sentence length (an unreliable proxy for syntactic complexity).

Most experts agree that simple readability formulas like the Flesch–Kincaid grade level can be highly misleading. Even though traditional features like average sentence length correlate strongly with reading difficulty, readability is much more complex to measure. The data-driven, artificial intelligence approach (see above) has been studied to address this shortcoming.
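For reference, the Flesch–Kincaid grade level depends on only two surface ratios, which is exactly what makes it cheap and also what limits it. The grade formula below is the published one (Kincaid et al., 1975); the syllable counter is a crude vowel-group heuristic, not the official counting rules:

```python
import re

def count_syllables(word):
    """Crude syllable estimate: count vowel groups (a rough heuristic only)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Flesch-Kincaid grade level (Kincaid et al., 1975)."""
    return (0.39 * (total_words / total_sentences)
            + 11.8 * (total_syllables / total_words)
            - 15.59)

# 100 words in 10 sentences, averaging 1.3 syllables per word:
grade = flesch_kincaid_grade(100, 10, 130)  # 3.65
```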

Writing experts have warned that attempting to simplify a text only by changing the length of its words and sentences may produce text that is more difficult to read. All the variables are tightly related: if one is changed, the others must also be adjusted, including approach, voice, person, tone, typography, design, and organization.

Writing for a class of readers other than one's own is very difficult; it takes training, method, and practice. Writers of novels and children's books are among those who are good at it. Writing experts all advise that, besides using a formula, writers observe all the norms of good writing, which are essential for producing readable texts. Writers should study the texts used by their audience and their reading habits: for a 5th-grade audience, the writer should study and learn from good-quality 5th-grade materials.[20][59][70][102][103][104][105]

See also

References

  1. ^ "Typographic Readability and Legibility". Web Design Envato Tuts+. Retrieved 2020-08-17.
  2. ^ Tinker, Miles A. (1963). Legibility of Print. Iowa: Iowa State University Press. pp. 5–7. ISBN 0-8138-2450-8.
  3. ^ Feng, Lijun; Elhadad, Noémie; Huenerfauth, Matt (March 2009). "Cognitively Motivated Features for Readability Assessment". Proceedings of the 12th Conference of the European Chapter of the ACL: 229–237.
  4. ^ a b Xia, Menglin; Kochmar, Ekaterina; Briscoe, Ted (June 2016). "Text Readability Assessment for Second Language Learners". Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications: 12–22. arXiv:1906.07580. doi:10.18653/v1/W16-0502.
  5. ^ Lee, Bruce W.; Jang, Yoo Sung; Lee, Jason Hyung-Jong (Nov 2021). "Pushing on Text Readability Assessment: A Transformer Meets Handcrafted Linguistic Features". Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: 10669–10686. arXiv:2109.12258. doi:10.18653/v1/2021.emnlp-main.834. S2CID 237940206.
  6. ^ "How to get readability in word & improve content readability". 18 April 2021.
  7. ^ Harris, Theodore L. and Richard E. Hodges, eds. 1995. The Literacy Dictionary, The Vocabulary of Reading and Writing. Newark, DE: International Reading Assn.
  8. ^ Dale, Edgar and Jeanne S. Chall. 1949. "The concept of readability." Elementary English 26:23.
  9. ^ a b McLaughlin, G. H. 1969. "SMOG grading-a new readability formula." Journal of reading 22:639–646.
  10. ^ a b c d e f g DuBay, W. H. 2006. Smart language: Readers, Readability, and the Grading of Text. Costa Mesa:Impact Information.
  11. ^ a b Fry, Edward B. 2006. "Readability." Reading Hall of Fame Book. Newark, DE: International Reading Assn.
  12. ^ Kimble, Joe. 1996–97. Writing for dollars. Writing to please. Scribes journal of legal writing 6. Available online at: http://www.plainlanguagenetwork.org/kimble/dollars.htm
  13. ^ Fry, E. B. 1986. Varied uses of readability measurement. Paper presented at the 31st Annual Meeting of the International Reading Association, Philadelphia, PA.
  14. ^ Rabin, A. T. 1988 "Determining difficulty levels of text written in languages other than English." In Readability: Its past, present, and future, eds. B. L. Zakaluk and S. J. Samuels. Newark, DE: International Reading Association.
  15. ^ Murphy, D. 1947. "How plain talk increases readership 45% to 60%." Printer's ink. 220:35–37.
  16. ^ Schramm, W. 1947. "Measuring another dimension of newspaper readership." Journalism quarterly 24:293–306.
  17. ^ Lostutter, M. 1947. "Some critical factors in newspaper readability." Journalism quarterly 24:307–314.
  18. ^ Swanson, C. E. 1948. "Readability and readership: A controlled experiment." Journalism quarterly 25:339–343.
  19. ^ Feld, B. 1948. "Empirical test proves clarity adds readers." Editor and publisher 81:38.
  20. ^ a b c d Klare, G. R. and B. Buck. 1954. Know Your Reader: The scientific approach to readability. New York: Heritage House.
  21. ^ Klare, G. R., J. E. Mabry, and L. M. Gustafson. 1955. "The relationship of style difficulty to immediate retention and to acceptability of technical material." Journal of educational psychology 46:287–295.
  22. ^ Klare, G. R., E. H. Shuford, and W. H. Nichols. 1957. "The relationship of style difficulty, practice, and efficiency of reading and retention." Journal of Applied Psychology. 41:222–26.
  23. ^ a b Klare, G. R. 1976. "A second look at the validity of the readability formulas." Journal of reading behavior. 8:129–52.
  24. ^ a b Klare, G. R. 1985. "Matching reading materials to readers: The role of readability estimates in conjunction with other information about comprehensibility." In Reading, thinking, and concept development, eds. T. L Harris and E. J. Cooper. New York: College Entrance Examination Board.
  25. ^ Sherman, Lucius Adelno 1893. Analytics of literature: A manual for the objective study of English prose and poetry. Boston: Ginn and Co.
  26. ^ a b Choldin, M.T. (1979), "Rubakin, Nikolai Aleksandrovic", in Kent, Allen; Lancour, Harold; Nasri, William Z.; Daily, Jay Elwood (eds.), Encyclopedia of library and information science, vol. 26 (illustrated ed.), CRC Press, pp. 178–79, ISBN 9780824720261
  27. ^ Lorge, I. 1944. "Word lists as background for communication." Teachers College Record 45:543–552.
  28. ^ Kitson, Harry D. 1921. The Mind of the Buyer. New York: Macmillan.
  29. ^ Clay, M. 1991. Becoming literate: The construction of inner control. Portsmouth, NH: Heinneman.
  30. ^ Fry, E. B. 2002. "Text readability versus leveling." Reading Teacher 56 no. 23:286–292.
  31. ^ Chall, J. S., J. L. Bissex, S. S. Conard, and S. H. Sharples. 1996. Qualitative assessment of text difficulty: A practical guide for teachers and writers. Cambridge MA: Brookline Books.
  32. ^ Thorndike E.L. 1921 The teacher's word book. 1932 A teacher's word book of the twenty thousand words found most frequently and widely in general reading for children and young people. 1944 (with J.E. Lorge) The teacher's word book of 30,000 words.
  33. ^ Dale, E. and J. O'Rourke. 1981. The living word vocabulary: A national vocabulary inventory. World Book-Childcraft International.
  34. ^ Lively, Bertha A. and S. L. Pressey. 1923. "A method for measuring the 'vocabulary burden' of textbooks." Educational administration and supervision 9:389–398.
  35. ^ DuBay, William H. 2004. The Principles of Readability. p. 2.
  36. ^ The Classic Readability Studies, William H. DuBay, Editor (chapter on Washburne, C. and M. Vogel. 1928).
  37. ^ Washburne, C. and M. Vogel. 1928. "An objective method of determining grade placement of children's reading material." Elementary school journal 28:373–81.
  38. ^ Lewerenz, A. S. 1929. "Measurement of the difficulty of reading materials." Los Angeles educational research bulletin 8:11–16.
  39. ^ Lewerenz, A. S. 1929. "Objective measurement of diverse types of reading material." Los Angeles educational research bulletin 9:8–11.
  40. ^ Lewerenz, A. S. 1930. "Vocabulary grade placement of typical newspaper content." Los Angeles educational research bulletin 10:4–6.
  41. ^ Lewerenz, A. S. 1935. "A vocabulary grade placement formula." Journal of experimental education 3: 236
  42. ^ Lewerenz, A. S. 1939. "Selection of reading materials by pupil ability and interest." Elementary English review 16:151–156.
  43. ^ Thorndike, E. 1934. "Improving the ability to read." Teachers college record 36:1–19, 123–44, 229–41. October, November, December.
  44. ^ Patty. W. W. and W. I. Painter. 1931. "A technique for measuring the vocabulary burden of textbooks." Journal of educational research 24:127–134.
  45. ^ Waples, D. and R. Tyler. 1931. What adults want to read about. Chicago: University of Chicago Press.
  46. ^ Ojemann, R. H. 1934. "The reading ability of parents and factors associated with reading difficulty of parent-education materials." University of Iowa studies in child welfare 8:11–32.
  47. ^ Dale, E. and R. Tyler. 1934. "A study of the factors influencing the difficulty of reading materials for adults of limited reading ability." Library quarterly 4:384–412.
  48. ^ a b c Gray, W. S. and B. Leary. 1935. What makes a book readable. Chicago: Chicago University Press.
  49. ^ Lorge, I. 1939. "Predicting reading difficulty of selections for children." Elementary English Review 16:229–233.
  50. ^ Lorge, I. 1944. "Predicting readability." Teachers college record 45:404–419.
  51. ^ Flesch, R. "Marks of a readable style." Columbia University contributions to education, no. 187. New York: Bureau of Publications, Teachers College, Columbia University.
  52. ^ Flesch, R. 1948. "A new readability yardstick." Journal of Applied Psychology 32:221–33.
  53. ^ Klare, G. R. 1963. The measurement of readability. Ames, Iowa: University of Iowa Press.
  54. ^ a b Chall, J. S. 1958. Readability: An appraisal of research and application. Columbus, OH: Bureau of Educational Research, Ohio State University.
  55. ^ Farr, J. N., J. J. Jenkins, and D. G. Paterson. 1951. "Simplification of the Flesch Reading Ease Formula." Journal of Applied Psychology. 35, no. 5:333–357.
  56. ^ Kincaid, J. P., R. P. Fishburne, R. L. Rogers, and B. S. Chissom. 1975. Derivation of new readability formulas (Automated Readability Index, Fog Count, and Flesch Reading Ease Formula) for Navy enlisted personnel. CNTECHTRA Research Branch Report 8-75.
  57. ^ Dale, E. and J. S. Chall. 1948. '"A formula for predicting readability". Educational research bulletin Jan. 21 and Feb 17, 27:1–20, 37–54.
  58. ^ Chall, J. S. and E. Dale. 1995. Readability revisited: The new Dale–Chall readability formula. Cambridge, MA: Brookline Books.
  59. ^ a b c Gunning, R. 1952. The Technique of Clear Writing. New York: McGraw–Hill.
  60. ^ Fry, E. B. 1963. Teaching faster reading. London: Cambridge University Press.
  61. ^ Fry, E. B. 1968. "A readability formula that saves time." Journal of reading 11:513–516.
  62. ^ Doak, C. C., L. G. Doak, and J. H. Root. 1996. Teaching patients with low literacy skills. Philadelphia: J. P. Lippincott Company.
  63. ^ Caylor, J. S., T. G. Sticht, L. C. Fox, and J. P. Ford. 1973. Methodologies for determining reading requirements of military occupational specialties: Technical report No. 73-5. Alexandria, VA: Human Resources Research Organization.
  64. ^ Kintsch, W. and J. R. Miller 1981. "Readability: A view from cognitive psychology." In Teaching: Research reviews. Newark, DE: International Reading Assn.
  65. ^ Kemper, S. 1983. "Measuring the inference load of a text." Journal of educational psychology 75, no. 3:391–401.
  66. ^ Meyer, B. J. 1982. "Reading research and the teacher: The importance of plans." College composition and communication 33, no. 1:37–49.
  67. ^ Armbruster, B. B. 1984. "The problem of inconsiderate text" In Comprehension instruction, ed. G. Duffy. New York: Longmann, p. 202–217.
  68. ^ Calfee, R. C. and R. Curley. 1984. "Structures of prose in content areas." In Understanding reading comprehension, ed. J. Flood. Newark, DE: International Reading Assn., pp. 414–430.
  69. ^ Dolch. E. W. 1939. "Fact burden and reading difficulty." Elementary English review 16:135–138.
  70. ^ a b Flesch, R. (1949). The Art of Readable Writing. New York: Harper. OCLC 318542.
  71. ^ Coleman, E. B. and P. J. Blumenfeld. 1963. "Cloze scores of nominalization and their grammatical transformations using active verbs." Psychology reports 13:651–654.
  72. ^ Gough, P. B. 1965. "Grammatical transformations and the speed of understanding." Journal of verbal learning and verbal behavior 4:107–111.
  73. ^ a b Coleman, E. B. 1966. "Learning of prose written in four grammatical transformations." Journal of Applied Psychology 49:332–341.
  74. ^ Clark, H. H. and S. E. Haviland. 1977. "Comprehension and the given-new contract." In Discourse production and comprehension, ed. R. O. Freedle. Norwood, NJ: Ablex Press, pp. 1–40.
  75. ^ Hornby, P. A. 1974. "Surface structure and presupposition." Journal of verbal learning and verbal behavior 13:530–538.
  76. ^ Spyridakis, J. H. 1989. "Signaling effects: A review of the research-Part 1." Journal of technical writing and communication 19, no 3:227-240.
  77. ^ Spyridakis, J. H. 1989. "Signaling effects: Increased content retention and new answers-Part 2." Journal of technical writing and communication 19, no. 4:395–415.
  78. ^ Halbert, M. G. 1944. "The teaching value of illustrated books." American school board journal 108, no. 5:43–44.
  79. ^ Vernon, M. D. 1946. "Learning from graphic material." British journal of psychology 36:145–158.
  80. ^ Felker, D. B., F. Pickering, V. R. Charrow, V. M. Holland, and J. C. Redish. 1981. Guidelines for document designers. Washington, D. C: American Institutes for Research.
  81. ^ Klare, G. R., J. E. Mabry, and L. M. Gustafson. 1955. "The relationship of patterning (underlining) to immediate retention and to acceptability of technical material." Journal of Applied Psychology 39, no 1:40–42.
  82. ^ Klare, G. R. 1957. "The relationship of typographic arrangement to the learning of technical material." Journal of Applied Psychology 41, no 1:41–45.
  83. ^ Jatowt, A. and K. Tanaka. 2012. "Longitudinal analysis of historical texts' readability." Proceedings of Joint Conference on Digital Libraries 2012 353-354
  84. ^ Vygotsky, L. 1978. Mind in society. Cambridge, MA: Harvard University Press.
  85. ^ Chall, J. S. and S. S. Conard. 1991. Should textbooks challenge students? The case for easier or harder textbooks. New York: Teachers College Press.
  86. ^ Bormuth, J. R. 1966. "Readability: A new approach." Reading research quarterly 1:79–132.
  87. ^ Bormuth, J. R. 1969. Development of readability analysis: Final Report, Project no 7-0052, Contract No. OEC-3-7-0070052-0326. Washington, D. C.: U. S. Office of Education, Bureau of Research, U. S. Department of Health, Education, and Welfare.
  88. ^ Bormuth, J. R. 1971. Development of standards of readability: Towards a rational criterion of passage performance. Washington, D. C.: U. S. Office of Education, Bureau of Research, U. S. Department of Health, Education, and Welfare.
  89. ^ Stenner, A. J., I Horabin, D. R. Smith, and R. Smith. 1988. The Lexile Framework. Durham, NC: Metametrics.
  90. ^ School Renaissance Institute. 2000. The ATOS readability formula for books and how it compares to other formulas. Madison, WI: School Renaissance Institute, Inc.
  91. ^ Paul, T. 2003. Guided independent reading. Madison, WI: School Renaissance Institute, Inc. http://www.renlearn.com/GIRP2008.pdf
  92. ^ a b Graesser, A.C.; McNamara, D.S.; Louwerse, M.M. (2003), Sweet, A.P.; Snow, C.E. (eds.), "What do readers need to learn in order to process coherence relations in narrative and expository text", Rethinking reading comprehension, New York: Guilford Publications, pp. 82–98
  93. ^ a b Lee, Bruce W.; Lee, Jason (Dec 2020). "LXPER Index 2.0: Improving Text Readability Assessment Model for L2 English Students in Korea". Proceedings of the 6th Workshop on Natural Language Processing Techniques for Educational Applications: 20–24. arXiv:2010.13374.
  94. ^ Feng, Lijun; Jansche, Martin; Huenerfauth, Matt; Elhadad, Noémie (August 2010). "A Comparison of Features for Automatic Readability Assessment". Coling 2010: Posters: 276–284.
  95. ^ a b Vajjala, Sowmya; Meurers, Detmar (June 2012). "On Improving the Accuracy of Readability Classification using Insights from Second Language Acquisition". Proceedings of the Seventh Workshop on Building Educational Applications Using NLP: 163–173.
  96. ^ a b c Collins-Thompson, Kevyn (2015). "Computational assessment of text readability: A survey of current and future research". International Journal of Applied Linguistics. 165 (2): 97–135. doi:10.1075/itl.165.2.01col.
  97. ^ Xu, Wei; Callison-Burch, Chris; Napoles, Courtney (2015). "Problems in Current Text Simplification Research: New Data Can Help". Transactions of the Association for Computational Linguistics. 3: 283–297. doi:10.1162/tacl_a_00139. S2CID 17817489.
  98. ^ Deutsch, Tovly; Jasbi, Masoud; Shieber, Stuart (July 2020). "Linguistic Features for Readability Assessment". Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications: 1–17. arXiv:2006.00377. doi:10.18653/v1/2020.bea-1.1.
  99. ^ Feng, Lijun; Elhadad, Noémie; Huenerfauth, Matt (March 2009). "Cognitively motivated features for readability assessment". EACL '09: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics. Eacl '09: 229–237. doi:10.3115/1609067.1609092. S2CID 13888774.
  100. ^ Gibson, Edward (1998). "Linguistic complexity: locality of syntactic dependencies". Cognition. 68 (1): 1–76. doi:10.1016/S0010-0277(98)00034-1. PMID 9775516. S2CID 377292.
  101. ^ Pitler, Emily; Nenkova, Ani (October 2008). "Revisiting Readability: A Unified Framework for Predicting Text Quality". Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing: 186–195.
  102. ^ Flesch, R. 1946. The art of plain talk. New York: Harper.
  103. ^ Flesch, R. 1979. How to write in plain English: A book for lawyers and consumers. New York: Harpers.
  104. ^ Klare, G. R. 1980. How to write readable English. London: Hutchinson.
  105. ^ Fry, E. B. 1988. "Writeability: the principles of writing for increased comprehension." In Readability: Its past, present, and future, eds. B. I. Zakaluk and S. J. Samuels. Newark, DE: International Reading Assn.

Further reading

  • Harris, A. J. and E. Sipay. 1985. How to increase reading ability, 8th Ed. New York & London: Longman.
  • Ruddell, R. B. 1999. Teaching children to read and write. Boston: Allyn and Bacon.
  • Manzo, A. V. and U. C. Manzo. 1995. Teaching children to be literate. Fort Worth: Harcourt Brace.
  • Vacca, J. A., R. Vacca, and M. K. Gove. 1995. Reading and learning to read. New York: HarperCollins.

External links

  • Readability Scoring Tool - Scores against many readability formulas at once - Readable.io
  • Readability Tests - Joe's Web Tools
  • Text Content Analysis Tool - UsingEnglish.com, free membership required

readability, website, service, code, readability, computer, programming, source, code, ease, with, which, reader, understand, written, text, natural, language, readability, text, depends, content, complexity, vocabulary, syntax, presentation, such, typographic. For the website see Readability service For code readability see Computer programming Readability of source code Readability is the ease with which a reader can understand a written text In natural language the readability of text depends on its content the complexity of its vocabulary and syntax and its presentation such as typographic aspects that affect legibility like font size line height character spacing and line length 1 Researchers have used various factors to measure readability such as Speed of perception Perceptibility at a distance Perceptibility in peripheral vision Visibility Reflex blink technique Rate of work reading speed Eye movements Fatigue in reading 2 Cognitively motivated features 3 Word difficulty N gram analysis 4 Semantic Richness 5 Higher readability eases reading effort and speed for any reader but it makes a larger difference for those who do not have high reading comprehension Readability exists in both natural language and programming languages though in different forms In programming things such as programmer comments choice of loop structure and choice of names can determine the ease with which humans can read computer program code Numeric readability metrics also known as readability tests or readability formulas for natural language tend to use simple measures like word length by letter or syllable sentence length and sometimes some measure of word frequency They can be built into word processors 6 can score documents paragraphs or sentences and are a much cheaper and faster alternative to a readability survey involving human readers They are faster to calculate than more accurate measures of syntactic and semantic complexity In some cases they are used to estimate 
appropriate grade level Contents 1 Definition 2 Applications 2 1 Readability and newspaper readership 2 2 The George Klare studies 3 Early research 4 Text leveling 5 Vocabulary frequency lists 6 Early children s readability formulas 7 Early adult readability formulas 8 Popular readability formulas 8 1 The Flesch formulas 8 2 The Dale Chall formula 8 3 The Gunning fog formula 8 4 Fry readability graph 8 5 McLaughlin s SMOG formula 8 6 The FORCAST formula 8 7 The Golub Syntactic Density Score 8 8 Measuring coherence and organization 9 Advanced readability formulas 9 1 The John Bormuth formulas 9 2 The Lexile framework 9 3 ATOS readability formula for books 9 4 CohMetrix psycholinguistics measurements 10 Other formulas 11 Artificial Intelligence AI approach 11 1 Corpora 11 1 1 WeeBit 11 1 2 Newsela 11 2 Linguistic features 11 2 1 Lexico Semantic 11 2 2 Syntactic 12 Using the readability formulas 13 See also 14 References 15 Further reading 16 External linksDefinition EditPeople have defined readability in various ways e g in The Literacy Dictionary 7 Jeanne Chall and Edgar Dale 8 G Harry McLaughlin 9 William DuBay 10 further explanation needed Applications EditEasy reading helps learning and enjoyment 11 and can save money 12 Much research has focused on matching prose to reading skill resulting in formulas for use in research government teaching publishing the military medicine and business 13 14 Readability and newspaper readership Edit Several studies in the 1940s showed that even small increases in readability greatly increases readership in large circulation newspapers In 1947 Donald Murphy of Wallace s Farmer used a split run edition to study the effects of making text easier to read He found that reducing from the 9th to the 6th grade reading level increased readership by 43 for an article on nylon The result was a gain of 42 000 readers in a circulation of 275 000 He also found a 60 increase in readership for an article on corn with better responses from 
people under 35 15 Wilber Schramm interviewed 1 050 newspaper readers He found that an easier reading style helps to determine how much of an article is read This was called reading persistence depth or perseverance He also found that people will read less of long articles than of short ones A story nine paragraphs long will lose 3 out of 10 readers by the fifth paragraph A shorter story will lose only two Schramm also found that the use of subheads bold face paragraphs and stars to break up a story actually lose readers 16 A study in 1947 by Melvin Lostutter showed that newspapers generally were written at a level five years above the ability of average American adult readers The reading ease of newspaper articles was not found to have much connection with the education experience or personal interest of the journalists writing the stories It instead had more to do with the convention and culture of the industry Lostutter argued for more readability testing in newspaper writing Improved readability must be a conscious process somewhat independent of the education and experience of the staffs writers 17 A study by Charles Swanson in 1948 showed that better readability increases the total number of paragraphs read by 93 and the number of readers reading every paragraph by 82 18 In 1948 Bernard Feld did a study of every item and ad in the Birmingham News of 20 November 1947 He divided the items into those above the 8th grade level and those at the 8th grade or below He chose the 8th grade breakpoint as that was determined to be the average reading level of adult readers An 8th grade text will reach about 50 of all American grown ups he wrote Among the wire service stories the lower group got two thirds more readers and among local stories 75 more readers Feld also believed in drilling writers in Flesch s clear writing principles 19 Both Rudolf Flesch and Robert Gunning worked extensively with newspapers and the wire services in improving readability Mainly through 
their efforts in a few years the readability of US newspapers went from the 16th to the 11th grade level where it remains today The two publications with the largest circulations TV Guide 13 million and Reader s Digest 12 million are written at the 9th grade level 10 The most popular novels are written at the 7th grade level This supports the fact that the average adult reads at the 9th grade level It also shows that for recreation people read texts that are two grades below their actual reading level 20 The George Klare studies Edit George Klare and his colleagues looked at the effects of greater reading ease on Air Force recruits They found that more readable texts resulted in greater and more complete learning They also increased the amount read in a given time and made for easier acceptance 21 22 Other studies by Klare showed how the reader s skills 23 prior knowledge 24 interest and motivation 23 24 affect reading ease Early research EditIn the 1880s English professor L A Sherman found that the English sentence was getting shorter In Elizabethan times the average sentence was 50 words long In his own time it was 23 words long Sherman s work established that Literature is a subject for statistical analysis Shorter sentences and concrete terms help people to make sense of what is written Speech is easier to understand than text Over time text becomes easier if it is more like speech Sherman wrote Literary English in short will follow the forms of standard spoken English from which it comes No man should talk worse than he writes no man should write better than he should talk The oral sentence is clearest because it is the product of millions of daily efforts to be clear and strong It represents the work of the race for thousands of years in perfecting an effective instrument of communication 25 In 1889 in Russia the writer Nikolai A Rubakin published a study of over 10 000 texts written by everyday people 26 From these texts he took 1 500 words he thought most 
people understood He found that the main blocks to comprehension are unfamiliar words and long sentences 27 Starting with his own journal at the age of 13 Rubakin published many articles and books on science and many subjects for the great numbers of new readers throughout Russia In Rubakin s view the people were not fools They were simply poor and in need of cheap books written at a level they could grasp 26 In 1921 Harry D Kitson published The Mind of the Buyer one of the first books to apply psychology to marketing Kitson s work showed that each type of reader bought and read their own type of text On reading two newspapers and two magazines he found that short sentence length and short word length were the best contributors to reading ease 28 Text leveling EditThe earliest reading ease assessment is the subjective judgment termed text leveling Formulas do not fully address the various content purpose design visual input and organization of a text 29 30 31 Text leveling is commonly used to rank the reading ease of texts in areas where reading difficulties are easy to identify such as books for young children At higher levels ranking reading ease becomes more difficult as individual difficulties become harder to identify This has led to better ways to assess reading ease Vocabulary frequency lists EditIn the 1920s the scientific movement in education looked for tests to measure students achievement to aid in curriculum development Teachers and educators had long known that to improve reading skill readers especially beginning readers need reading material that closely matches their ability University based psychologists did much of the early research which was later taken up by textbook publishers 11 Educational psychologist Edward Thorndike of Columbia University noted that in Russia and Germany teachers used word frequency counts to match books to students Word skill was the best sign of intellectual development and the strongest predictor of reading ease In 
1921 Thorndike published Teachers Word Book which contained the frequencies of 10 000 words 32 It made it easier for teachers to choose books that matched class reading skills It also provided a basis for future research on reading ease Until computers came along word frequency lists were the best aids for grading reading ease of texts 20 In 1981 the World Book Encyclopedia listed the grade levels of 44 000 words 33 Early children s readability formulas EditIn 1923 Bertha A Lively and Sidney L Pressey published the first reading ease formula They were concerned that junior high school science textbooks had so many technical words They felt that teachers spent all class time explaining these words They argued that their formula would help to measure and reduce the vocabulary burden of textbooks Their formula used five variable inputs and six constants For each thousand words it counted the number of unique words the number of words not on the Thorndike list and the median index number of the words found on the list Manually it took three hours to apply the formula to a book 34 After the Lively Pressey study people looked for formulas that were more accurate and easier to apply By 1980 over 200 formulas were published in different languages 35 citation needed In 1928 Carleton Washburne and Mabel Vogel created the first modern readability formula They validated it by using an outside criterion and correlated 845 with test scores of students who read and liked the criterion books 36 It was also the first to introduce the variable of interest to the concept of readability 37 Between 1929 and 1939 Alfred Lewerenz of the Los Angeles School District published several new formulas 38 39 40 41 42 In 1934 Edward Thorndike published his formula He wrote that word skills can be increased if the teacher introduces new words and repeats them often 43 In 1939 W W Patty and W I Painter published a formula for measuring the vocabulary burden of textbooks This was the last of the 
early formulas that used the Thorndike vocabulary frequency list.[44]

Early adult readability formulas

During the recession of the 1930s, the U.S. government invested in adult education. In 1931, Douglas Waples and Ralph Tyler published What Adults Want to Read About, a two-year study of adult reading interests. Their book showed not only what people read but what they would like to read. They found that many readers lacked suitable reading materials: they would have liked to learn, but the reading materials were too hard for them.[45]

Lyman Bryson of Teachers College, Columbia University, found that many adults had poor reading ability due to poor education. Even though colleges had long tried to teach how to write in a clear and readable style, Bryson found that such writing was rare. He wrote that such language is the result of a "discipline and artistry that few people who have ideas will take the trouble to achieve. If simple language were easy, many of our problems would have been solved long ago."[20] Bryson helped set up the Readability Laboratory at the college. Two of his students were Irving Lorge and Rudolf Flesch.

In 1934, Ralph Ojemann investigated adult reading skills, the factors that most directly affect reading ease, and the causes of each level of difficulty. He did not invent a formula but a method for assessing the difficulty of materials for parent education. He was the first to assess the validity of this method by using 16 magazine passages tested on actual readers. He evaluated 14 measurable and three reported factors that affect reading ease. Ojemann emphasized the reported features, such as whether the text was coherent or unduly abstract. He used his 16 passages to compare and judge the reading ease of other texts, a method now called scaling. He showed that even though these factors cannot be measured, they cannot be ignored.[46]

Also in 1934, Ralph Tyler and Edgar Dale published the first adult reading ease formula, based on passages on health topics from a variety of textbooks and
magazines. Of 29 factors that are significant for young readers, they found ten that are significant for adults. They used three of these in their formula.[47]

In 1935, William S. Gray of the University of Chicago and Bernice Leary of Xavier College in Chicago published What Makes a Book Readable, one of the most important books in readability research. Like Dale and Tyler, they focused on what makes books readable for adults of limited reading ability. Their book included the first scientific study of the reading skills of American adults. The sample included 1,690 adults from a variety of settings and regions. The test used a number of passages from newspapers, magazines, and books, as well as a standard reading test. They found a mean grade score of 7.81 (the eighth month of the seventh grade). About one third read at the 2nd to 6th grade level, one third at the 7th to 12th grade level, and one third at the 13th to 17th grade level.

The authors emphasized that one half of the adult population at that time lacked suitable reading materials. They wrote: "For them, the enriching values of reading are denied unless materials reflecting adult interests are adapted to their needs." The poorest readers, one sixth of the adult population, needed "simpler materials for use in promoting functioning literacy and in establishing fundamental reading habits".[48]

Gray and Leary then analyzed 228 variables that affect reading ease and divided them into four types:

* Content
* Style
* Format
* Organization

They found that content was most important, followed closely by style. Third was format, followed closely by organization. They found no way to measure content, format, or organization, but they could measure variables of style. Among the 17 significant measurable style variables, they selected five to create a formula:

* Average sentence length
* Number of different hard words
* Number of personal pronouns
* Percentage of unique words
* Number of prepositional phrases

Their formula had a correlation of 0.645 with comprehension as measured by reading
tests given to about 800 adults.[48]

In 1939, Irving Lorge published an article that reported other combinations of variables that indicate difficulty more accurately than the ones Gray and Leary used. His research also showed that "the vocabulary load is the most important concomitant of difficulty."[49] In 1944, Lorge published his Lorge Index, a readability formula that used three variables and set the stage for the simpler and more reliable formulas that followed.[50]

By 1940, investigators had:

* Successfully used statistical methods to analyze reading ease
* Found that unusual words and sentence length were among the first causes of reading difficulty
* Used vocabulary and sentence length in formulas to predict reading ease

Popular readability formulas

The Flesch formulas

Main article: Flesch–Kincaid readability tests

In 1943, Rudolf Flesch published his PhD dissertation, Marks of a Readable Style, which included a readability formula to predict the difficulty of adult reading material. Investigators in many fields began using it to improve communications. One of the variables it used was personal references, such as names and personal pronouns. Another variable was affixes.[51]

In 1948, Flesch published his Reading Ease formula in two parts. Rather than using grade levels, it used a scale from 0 to 100, with 0 equivalent to the 12th grade and 100 equivalent to the 4th grade. It dropped the use of affixes. The second part of the formula predicts human interest by using personal references and the number of personal sentences. The new formula correlated 0.70 with the McCall–Crabbs reading tests.[52] The original formula is:

Reading Ease score = 206.835 − (1.015 × ASL) − (84.6 × ASW)

where ASL = average sentence length (the number of words divided by the number of sentences) and ASW = average word length in syllables (the number of syllables divided by the number of words).

Publishers discovered that the Flesch formulas could increase readership by up to 60%. Flesch's work also made an enormous impact on journalism. The Flesch Reading Ease formula became one of the most widely used, tested, and reliable readability metrics.[53][54]
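As a minimal sketch, the 1948 Reading Ease formula can be computed mechanically from the two averages it defines. The tokenizer and the vowel-group syllable counter below are illustrative assumptions (practical implementations use pronunciation dictionaries or more careful rules), so scores will differ slightly from hand counts:

```python
import re

def count_syllables(word):
    # Rough heuristic: one syllable per group of consecutive vowels.
    # A stand-in for the careful syllable counting the formula assumes.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def reading_ease(text):
    """Flesch Reading Ease: 206.835 - 1.015*ASL - 84.6*ASW."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    asl = len(words) / len(sentences)                           # words per sentence
    asw = sum(count_syllables(w) for w in words) / len(words)   # syllables per word
    return 206.835 - 1.015 * asl - 84.6 * asw

print(round(reading_ease("The cat sat on the mat. It was warm."), 1))  # 117.7
```

Short, one-syllable words in short sentences push the score above 100, i.e., easier than 4th-grade text on Flesch's scale.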
In 1951, Farr, Jenkins, and Patterson simplified the formula further by changing the syllable count. The modified formula is:

New Reading Ease score = (1.599 × nosw) − (1.015 × sl) − 31.517

where nosw = number of one-syllable words per 100 words and sl = average sentence length in words.[55]

In 1975, in a project sponsored by the U.S. Navy, the Reading Ease formula was recalculated to give a grade-level score. The new formula is now called the Flesch–Kincaid grade-level formula.[56] The Flesch–Kincaid formula is one of the most popular and heavily tested formulas. It correlates 0.91 with comprehension as measured by reading tests.[10]

The Dale–Chall formula

Main article: Dale–Chall readability formula

Edgar Dale, a professor of education at Ohio State University, was one of the first critics of Thorndike's vocabulary frequency lists. He claimed that they did not distinguish between the different meanings that many words have. He created two new lists of his own. One, his "short list" of 769 easy words, was used by Irving Lorge in his formula. The other was his "long list" of 3,000 easy words, which were understood by 80% of fourth-grade students. The word lists have to be extended with regular plurals of nouns, regular past-tense forms of verbs, progressive forms of verbs, and so on. In 1948, Dale incorporated this list into a formula he developed with Jeanne S. Chall, who later founded the Harvard Reading Laboratory.

To apply the formula:

1. Select several 100-word samples throughout the text.
2. Compute the average sentence length in words (divide the number of words by the number of sentences).
3. Compute the percentage of words NOT on the Dale–Chall word list of 3,000 easy words.
4. Compute this equation (from 1948):

Raw score = (0.1579 × PDW) + (0.0496 × ASL) if the percentage of PDW is less than 5%; otherwise, compute
Raw score = (0.1579 × PDW) + (0.0496 × ASL) + 3.6365

where Raw score = uncorrected reading grade of a student who can answer one half of the test questions
on a passage, PDW = percentage of difficult words (not on the Dale–Chall word list), and ASL = average sentence length.

Finally, to compensate for the grade-equivalent curve, apply the following chart for the final score:

Raw score: Final score
4.9 and below: Grade 4 and below
5.0–5.9: Grades 5–6
6.0–6.9: Grades 7–8
7.0–7.9: Grades 9–10
8.0–8.9: Grades 11–12
9.0–9.9: Grades 13–15 (college)
10 and above: Grades 16 and above[57]

Correlating 0.93 with comprehension as measured by reading tests, the Dale–Chall formula is the most reliable formula and is widely used in scientific research.[citation needed]

In 1995, Dale and Chall published a new version of their formula with an upgraded word list, the New Dale–Chall readability formula.[58] Its formula is:

Raw score = 64 − (0.95 × PDW) − (0.69 × ASL)

The Gunning fog formula

Main article: Gunning fog index

In the 1940s, Robert Gunning helped bring readability research into the workplace. In 1944, he founded the first readability consulting firm, dedicated to reducing the "fog" in newspapers and business writing. In 1952, he published The Technique of Clear Writing with his own Fog Index, a formula that correlates 0.91 with comprehension as measured by reading tests.[10] The formula is one of the most reliable and simplest to apply:

Grade level = 0.4 × (average sentence length + percentage of hard words)

where hard words = words with more than two syllables.[59]

Fry readability graph

Main article: Fry readability formula

In 1963, while teaching English teachers in Uganda, Edward Fry developed his Readability Graph. It became one of the most popular formulas and one of the easiest to apply.[60][61] The Fry Graph correlates 0.86 with comprehension as measured by reading tests.[10]

McLaughlin's SMOG formula

Main article: SMOG

Harry McLaughlin determined that word length and sentence length should be multiplied rather than added, as in other formulas. In 1969, he published his SMOG (Simple Measure of Gobbledygook) formula:

SMOG grading = 3 + √(polysyllable count)

where polysyllable count = number of words of more than two syllables in a sample of 30 sentences.[9]
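Each of the formulas just described (Dale–Chall, Fog, SMOG) reduces to a couple of arithmetic lines once the counts are in hand. The sketch below assumes the caller supplies those counts, sidestepping the word-list lookup and syllable counting that a full implementation would need:

```python
import math

def dale_chall_raw(pct_difficult_words, avg_sentence_length):
    # Raw score = 0.1579 * PDW + 0.0496 * ASL, plus 3.6365 when PDW exceeds 5%.
    score = 0.1579 * pct_difficult_words + 0.0496 * avg_sentence_length
    if pct_difficult_words > 5:
        score += 3.6365
    return score

def gunning_fog(avg_sentence_length, pct_hard_words):
    # Grade level = 0.4 * (average sentence length + % of words with 3+ syllables).
    return 0.4 * (avg_sentence_length + pct_hard_words)

def smog_grade(polysyllable_count):
    # SMOG grading = 3 + sqrt(polysyllables in a 30-sentence sample).
    return 3 + math.sqrt(polysyllable_count)

# A passage with 10% unfamiliar words and 15-word sentences:
print(round(dale_chall_raw(10, 15), 2))  # 5.96, i.e., grades 5-6 on the chart
```

Note how the three formulas trade off inputs: Dale–Chall needs a word list, Fog and SMOG need only syllable counts, which is why the latter two are so easy to apply by hand.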
The SMOG formula correlates 0.88 with comprehension as measured by reading tests.[10] It is often recommended for use in healthcare.[62]

The FORCAST formula

In 1973, a study commissioned by the U.S. military of the reading skills required for different military jobs produced the FORCAST formula. Unlike most other formulas, it uses only a vocabulary element, making it useful for texts without complete sentences. The formula satisfied requirements that it would be:

* Based on Army job reading materials
* Suitable for young adult male recruits
* Easy enough for Army clerical personnel to use without special training or equipment

The formula is:

Grade level = 20 − (N / 10)

where N = number of single-syllable words in a 150-word sample.[63]

The FORCAST formula correlates 0.66 with comprehension as measured by reading tests.[10]

The Golub Syntactic Density Score

The Golub Syntactic Density Score was developed by Lester Golub in 1974. It is among a smaller subset of readability formulas that concentrate on the syntactic features of a text. To calculate the reading level of a text, a sample of several hundred words is taken from the text. The number of words in the sample is counted, as is the number of T-units. A T-unit is defined as an independent clause and any dependent clauses attached to it. Other syntactic units are then counted and weighted as follows:

1. Words per T-unit × 0.95
2. Subordinate clauses per T-unit × 0.90
3. Main clause word length (mean) × 0.20
4. Subordinate clause length (mean) × 0.50
5. Number of modals (will, shall, can, may, must, would) × 0.65
6. Number of "be" and "have" forms in the auxiliary × 0.40
7. Number of prepositional phrases × 0.75
8. Number of possessive nouns and pronouns × 0.70
9. Number of adverbs of time (when, then, once, while) × 0.60
10. Number of gerunds, participles, and absolute phrases × 0.85

Users add the weighted counts and divide the total by the number of T-units. Finally, the quotient is entered into the following table to arrive at a final
readability score:

SDS:   0.5  1.3  2.1  2.9  3.7  4.5  5.3  6.1  6.9  7.7  8.5  9.3  10.1  10.9
Grade: 1    2    3    4    5    6    7    8    9    10   11   12   13    14

Measuring coherence and organization

For centuries, teachers and educators have seen the importance of organization, coherence, and emphasis in good writing. Beginning in the 1970s, cognitive theorists began teaching that reading is really an act of thinking and organization: the reader constructs meaning by mixing new knowledge into existing knowledge. Because of the limits of the reading ease formulas, some research looked at ways to measure the content, organization, and coherence of text. Although this did not improve the reliability of the formulas, these efforts showed the importance of such variables in reading ease.

Studies by Walter Kintsch and others showed the central role of coherence in reading ease, mainly for people learning to read.[64] In 1983, Susan Kemper devised a formula based on physical states and mental states. However, she found this was no better than word familiarity and sentence length in showing reading ease.[65]

Bonnie Meyer and others tried to use organization as a measure of reading ease. While this did not result in a formula, they showed that people read faster and retain more when the text is organized in topics. She found that a visible plan for presenting content greatly helps readers to assess a text. A hierarchical plan shows how the parts of the text are related, and it also aids the reader in blending new information into existing knowledge structures.[66]

Bonnie Armbruster found that the most important feature for learning and comprehension is textual coherence, which comes in two types:

* Global coherence, which integrates high-level ideas as themes in an entire section, chapter, or book
* Local coherence, which joins ideas within and between sentences

Armbruster confirmed Kintsch's finding that coherence and structure are more helpful for younger readers.[67] R. C. Calfee and R. Curley built on Bonnie Meyer's work and found that an unfamiliar
underlying structure can make even simple text hard to read. They brought in a graded system to help students progress from simpler story lines to more advanced and abstract ones.[68]

Many other studies looked at the effects on reading ease of other text variables, including:

* Image words, abstraction, direct and indirect statements, and types of narration, sentences, phrases, and clauses[48]
* Difficult concepts[54]
* Idea density[69]
* Human interest[59][70]
* Nominalization[71]
* Active and passive voice[72][73][74][75]
* Embeddedness[73]
* Structural cues[76][77]
* The use of images[78][79]
* Diagrams and line graphs[80]
* Highlighting[81]
* Fonts and layout[82]
* Document age[83]

Advanced readability formulas

The John Bormuth formulas

John Bormuth of the University of Chicago looked at reading ease using the new cloze deletion test developed by Wilson Taylor. His work supported earlier research, including on the degree of reading ease appropriate for each kind of reading. The best level for classroom "assisted reading" is a slightly difficult text that causes a "set to learn" and for which readers can correctly answer 50% of the questions on a multiple-choice test. The best level for unassisted reading is one for which readers can correctly answer 80% of the questions. These cutoff scores were later confirmed by Vygotsky[84] and by Chall and Conard.[85]

Among other things, Bormuth confirmed that vocabulary and sentence length are the best indicators of reading ease. He showed that the measures of reading ease worked as well for adults as for children: the things that children find hard are the same for adults of the same reading levels. He also developed several new measures of cutoff scores. One of the most well known was the Mean Cloze Formula, which was used in 1981 to produce the Degrees of Reading Power system used by the College Entrance Examination Board.[86][87][88]

The Lexile framework

In 1988, Jack Stenner and his associates at MetaMetrics, Inc. published a new system, the Lexile Framework, for assessing readability and matching
students with appropriate texts. The Lexile framework uses average sentence length and average word frequency in the American Heritage Intermediate Corpus to predict a score on a 0–2000 scale. The AHI Corpus includes five million words from 1,045 published works often read by students in grades three to nine. The Lexile Book Database has more than 100,000 titles from more than 450 publishers. By knowing a student's Lexile score, a teacher can find books that match his or her reading level.[89]

ATOS readability formula for books

In 2000, researchers at the School Renaissance Institute and Touchstone Applied Science Associates published their Advantage-TASA Open Standard (ATOS) Reading Ease Formula for Books. They worked on a formula that was easy to use and that could be used with any texts. The project was one of the widest reading ease projects ever: the developers used 650 normed reading texts, 474 million words from all the text in 28,000 books read by students, and the reading records of more than 30,000 students who read and were tested on 950,000 books. They found that three variables give the most reliable measure of text reading ease:

* Words per sentence
* Average grade level of words
* Characters per word

They also found that:

* To help learning, the teacher should match book reading ease with reading skill
* Reading often helps with reading gains
* For reading alone below the 4th grade, the best learning gain requires at least 85% comprehension
* Advanced readers need 92% comprehension for independent reading
* Book length can be a good measure of reading ease
* Feedback and interaction with the teacher are the most important factors in reading[90][91]

Coh-Metrix psycholinguistic measurements

Coh-Metrix can be used in many different ways to investigate the cohesion of the explicit text and the coherence of the mental representation of the text. "Our definition of cohesion consists of characteristics of the explicit text that play some role in helping the reader
mentally connect ideas in the text."[92] The definition of coherence is the subject of much debate. Theoretically, the coherence of a text is defined by the interaction between linguistic representations and knowledge representations. Coherence can be defined as characteristics of the text (i.e., aspects of cohesion) that are likely to contribute to the coherence of the mental representation, and Coh-Metrix measurements provide indices of these cohesion characteristics.[92]

Other formulas

* Automated Readability Index (1967)
* Linsear Write
* Raygor readability estimate (1977)
* Spache readability formula (1952)

Artificial intelligence (AI) approach

Unlike the traditional readability formulas, artificial intelligence approaches to readability assessment (also known as automatic readability assessment) incorporate a myriad of linguistic features and construct statistical prediction models to predict text readability.[4][93] These approaches typically consist of three steps: (1) a training corpus of individual texts, (2) a set of linguistic features to be computed from each text, and (3) a machine learning model to predict the readability from the computed linguistic feature values.[94][95][93]

Corpora

WeeBit

In 2012, Sowmya Vajjala at the University of Tübingen created the WeeBit corpus by combining educational articles from the Weekly Reader website and the BBC Bitesize website, which provide texts for different age groups.[95] In total, there are 3,125 articles, divided into five readability levels for ages 7 to 16. The WeeBit corpus has been used in several AI-based readability assessment studies.[96]

Newsela

Wei Xu (University of Pennsylvania), Chris Callison-Burch (University of Pennsylvania), and Courtney Napoles (Johns Hopkins University) introduced the Newsela corpus to the academic field in 2015.[97] The corpus is a collection of thousands of news articles leveled to different reading complexities by professional editors at Newsela. The corpus was originally introduced for text
simplification research but was also used for text readability assessment.[98]

Linguistic features

Lexico-semantic

The type–token ratio is one of the features often used to capture lexical richness, a measure of vocabulary range and diversity. To measure the lexical difficulty of a word, the relative frequency of the word in a representative corpus, such as the Corpus of Contemporary American English (COCA), is often used. Some examples of lexico-semantic features used in readability assessment include:[96]

* Average number of syllables per word
* Out-of-vocabulary rate in comparison to the full corpus
* Type–token ratio: the ratio of unique terms to total terms observed
* Ratio of function words in comparison to the full corpus
* Ratio of pronouns in comparison to the full corpus
* Language model perplexity, comparing the text to generic or genre-specific models

In addition, Lijun Feng pioneered cognitively motivated features (mostly lexical) in 2009, during her doctoral study at the City University of New York (CUNY).[99] The cognitively motivated features were originally designed for adults with intellectual disability but were shown to improve readability assessment accuracy in general. Cognitively motivated features, in combination with a logistic regression model, can correct the average error of the Flesch–Kincaid grade level by more than 70%. The features introduced by Feng include:

* Number of lexical chains in the document
* Average number of unique entities per sentence
* Average number of entity mentions per sentence
* Total number of unique entities in the document
* Total number of entity mentions in the document
* Average lexical chain length
* Average lexical chain span

Syntactic

Syntactic complexity is correlated with longer processing times in text comprehension.[100] It is common to use a rich set of these syntactic features to predict the readability of a text. The more advanced variants of syntactic readability features are frequently computed from parse trees.
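Two of the lexico-semantic features listed above are easy to compute directly. This sketch implements the type–token ratio and average syllables per word; the tokenizer and vowel-group syllable heuristic are simplifying assumptions, not the feature extractors used in the cited studies:

```python
import re

def tokens(text):
    # Naive tokenizer: lowercase alphabetic runs (an assumption for illustration).
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(text):
    # Ratio of unique terms to total terms observed.
    toks = tokens(text)
    return len(set(toks)) / len(toks)

def avg_syllables_per_word(text):
    # Vowel-group counting stands in for a real syllable counter.
    counts = [max(1, len(re.findall(r"[aeiouy]+", t))) for t in tokens(text)]
    return sum(counts) / len(counts)

print(type_token_ratio("the cat saw the cat"))  # 0.6: three unique terms out of five
```

In practice these raw features are normalized (e.g., over fixed-length samples, since the type–token ratio falls as texts grow) before being fed to a prediction model.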
Emily Pitler (University of Pennsylvania) and Ani Nenkova (University of Pennsylvania) are considered pioneers in evaluating parse-tree syntactic features and making them widely used in readability assessment.[101][96] Some examples include:

* Average sentence length
* Average parse tree height
* Average number of noun phrases per sentence
* Average number of verb phrases per sentence

Using the readability formulas

The accuracy of readability formulas increases when finding the average readability of a large number of works. The tests generate a score based on characteristics such as statistical average word length (used as an unreliable proxy for semantic difficulty; sometimes word frequency is taken into account) and sentence length (used as an unreliable proxy for the syntactic complexity of the work).

Most experts agree that simple readability formulas like the Flesch–Kincaid grade level can be highly misleading. Even though traditional features like average sentence length correlate highly with reading difficulty, the measure of readability is much more complex. The artificial intelligence, data-driven approach (see above) has been studied to tackle this shortcoming.

Writing experts have warned that an attempt to simplify the text only by changing the length of the words and sentences may result in text that is more difficult to read. All the variables are tightly related: if one is changed, the others must also be adjusted, including approach, voice, person, tone, typography, design, and organization.

Writing for a class of readers other than one's own is very difficult. It takes training, method, and practice. Among those who are good at this are writers of novels and children's books. The writing experts all advise that, besides using a formula, writers observe all the norms of good writing, which are essential for writing readable texts. Writers should study the texts used by their audience and their reading habits. This means that for a 5th-grade audience, the writer should study and learn good-quality 5th-grade materials.[20][59][70][102][103][104][105]
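The advice above about averaging over many samples can be illustrated with the Automated Readability Index listed under "Other formulas", which needs only character, word, and sentence counts (the coefficients below are the published ARI constants). This is a sketch under those assumptions, not a validated implementation:

```python
import re

def ari(text):
    # Automated Readability Index:
    #   4.71 * (chars/words) + 0.5 * (words/sentences) - 21.43
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    chars = sum(len(w) for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43

def average_grade(samples):
    # Scoring many samples and averaging smooths out sample-to-sample noise,
    # which is the practice the section above recommends.
    return sum(ari(s) for s in samples) / len(samples)
```

A single short passage gives a noisy, even negative, grade estimate; averaging `ari` over many 100-word samples drawn throughout a work gives a far steadier one.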
See also

* Asemic writing
* Plain language
* Verbosity
* Accessible publishing
* George R. Klare
* William S. Gray
* Miles Tinker
* Bourbaki dangerous bend symbol

References

"Typographic Readability and Legibility". Web Design Envato Tuts+. Retrieved 2020-08-17.
Tinker, Miles A. (1963). Legibility of Print. Iowa: Iowa State University Press. pp. 5–7. ISBN 0-8138-2450-8.
Feng, Lijun; Elhadad, Noemie; Huenerfauth, Matt (March 2009). "Cognitively Motivated Features for Readability Assessment". Proceedings of the 12th Conference of the European Chapter of the ACL: 229–237.
Xia, Menglin; Kochmar, Ekaterina; Briscoe, Ted (June 2016). "Text Readability Assessment for Second Language Learners". Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications: 12–22. arXiv:1906.07580. doi:10.18653/v1/W16-0502.
Lee, Bruce W.; Jang, Yoo Sung; Lee, Jason Hyung-Jong (Nov 2021). "Pushing on Text Readability Assessment: A Transformer Meets Handcrafted Linguistic Features". Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: 10669–10686. arXiv:2109.12258. doi:10.18653/v1/2021.emnlp-main.834. S2CID 237940206.
"How to get readability in word & improve content readability". 18 April 2021.
Harris, Theodore L. and Richard E. Hodges, eds. (1995). The Literacy Dictionary: The Vocabulary of Reading and Writing. Newark, DE: International Reading Assn.
Dale, Edgar and Jeanne S. Chall (1949). "The concept of readability". Elementary English 26: 23.
McLaughlin, G. H. (1969). "SMOG grading: a new readability formula". Journal of Reading 22: 639–646.
DuBay, W. H. (2006). Smart Language: Readers, Readability, and the Grading of Text. Costa Mesa: Impact Information.
Fry, Edward B. (2006). Readability. Reading Hall of Fame Book. Newark, DE: International Reading Assn.
Kimble, Joe (1996–97). "Writing for dollars, writing to please". Scribes Journal of Legal Writing 6. Available online at http://www.plainlanguagenetwork.org/kimble/dollars.htm
Fry, E. B. (1986). Varied uses of readability measurement. Paper
presented at the 31st Annual Meeting of the International Reading Association, Philadelphia, PA.
Rabin, A. T. (1988). "Determining difficulty levels of text written in languages other than English". In Readability: Its Past, Present, and Future, eds. B. L. Zakaluk and S. J. Samuels. Newark, DE: International Reading Association.
Murphy, D. (1947). "How plain talk increases readership 45 to 60". Printer's Ink 220: 35–37.
Schramm, W. (1947). "Measuring another dimension of newspaper readership". Journalism Quarterly 24: 293–306.
Lostutter, M. (1947). "Some critical factors in newspaper readability". Journalism Quarterly 24: 307–314.
Swanson, C. E. (1948). "Readability and readership: A controlled experiment". Journalism Quarterly 25: 339–343.
Feld, B. (1948). "Empirical test proves clarity adds readers". Editor and Publisher 81: 38.
Klare, G. R. and B. Buck (1954). Know Your Reader: The Scientific Approach to Readability. New York: Heritage House.
Klare, G. R., J. E. Mabry, and L. M. Gustafson (1955). "The relationship of style difficulty to immediate retention and to acceptability of technical material". Journal of Educational Psychology 46: 287–295.
Klare, G. R., E. H. Shuford, and W. H. Nichols (1957). "The relationship of style difficulty, practice, and efficiency of reading and retention". Journal of Applied Psychology 41: 222–226.
Klare, G. R. (1976). "A second look at the validity of the readability formulas". Journal of Reading Behavior 8: 129–152.
Klare, G. R. (1985). "Matching reading materials to readers: The role of readability estimates in conjunction with other information about comprehensibility". In Reading, Thinking, and Concept Development, eds. T. L. Harris and E. J. Cooper. New York: College Entrance Examination Board.
Sherman, Lucius Adelno (1893). Analytics of Literature: A Manual for the Objective Study of English Prose and Poetry. Boston: Ginn and Co.
Choldin, M. T. (1979). "Rubakin, Nikolai Aleksandrovic". In Kent, Allen; Lancour, Harold; Nasri, William Z.; Daily, Jay Elwood (eds.), Encyclopedia of Library and Information Science, vol. 26 (illustrated ed.). CRC Press. pp. 178–79. ISBN 9780824720261.
Lorge, I. (1944). "Word
lists as background for communication". Teachers College Record 45: 543–552.
Kitson, Harry D. (1921). The Mind of the Buyer. New York: Macmillan.
Clay, M. (1991). Becoming Literate: The Construction of Inner Control. Portsmouth, NH: Heinemann.
Fry, E. B. (2002). "Text readability versus leveling". Reading Teacher 56, no. 23: 286–292.
Chall, J. S., J. L. Bissex, S. S. Conard, and S. H. Sharples (1996). Qualitative Assessment of Text Difficulty: A Practical Guide for Teachers and Writers. Cambridge, MA: Brookline Books.
Thorndike, E. L. (1921). The Teacher's Word Book; (1932) A Teacher's Word Book of the Twenty Thousand Words Found Most Frequently and Widely in General Reading for Children and Young People; (1944, with J. E. Lorge) The Teacher's Word Book of 30,000 Words.
Dale, E. and J. O'Rourke (1981). The Living Word Vocabulary: A National Vocabulary Inventory. World Book–Childcraft International.
Lively, Bertha A. and S. L. Pressey (1923). "A method for measuring the vocabulary burden of textbooks". Educational Administration and Supervision 9: 389–398.
DuBay, William H. (2004). The Principles of Readability. p. 2.
The Classic Readability Studies, William H. DuBay, ed.; chapter on Washburne, C. and M. Vogel (1928). "An objective method of determining grade placement of children's reading material". Elementary School Journal 28: 373–381.
Lewerenz, A. S. (1929). "Measurement of the difficulty of reading materials". Los Angeles Educational Research Bulletin 8: 11–16.
Lewerenz, A. S. (1929). "Objective measurement of diverse types of reading material". Los Angeles Educational Research Bulletin 9: 8–11.
Lewerenz, A. S. (1930). "Vocabulary grade placement of typical newspaper content". Los Angeles Educational Research Bulletin 10: 4–6.
Lewerenz, A. S. (1935). "A vocabulary grade placement formula". Journal of Experimental Education 3: 236.
Lewerenz, A. S. (1939). "Selection of reading materials by pupil ability and interest". Elementary English Review 16: 151–156.
Thorndike, E. (1934). "Improving the ability to read". Teachers College Record 36: 1–19, 123–144, 229–241 (October, November, December).
Patty, W. W. and W. I.
Painter (1931). "A technique for measuring the vocabulary burden of textbooks". Journal of Educational Research 24: 127–134.
Waples, D. and R. Tyler (1931). What Adults Want to Read About. Chicago: University of Chicago Press.
Ojemann, R. H. (1934). "The reading ability of parents and factors associated with reading difficulty of parent education materials". University of Iowa Studies in Child Welfare 8: 11–32.
Dale, E. and R. Tyler (1934). "A study of the factors influencing the difficulty of reading materials for adults of limited reading ability". Library Quarterly 4: 384–412.
Gray, W. S. and B. Leary (1935). What Makes a Book Readable. Chicago: University of Chicago Press.
Lorge, I. (1939). "Predicting reading difficulty of selections for children". Elementary English Review 16: 229–233.
Lorge, I. (1944). "Predicting readability". Teachers College Record 45: 404–419.
Flesch, R. Marks of a Readable Style. Columbia University Contributions to Education, no. 187. New York: Bureau of Publications, Teachers College, Columbia University.
Flesch, R. (1948). "A new readability yardstick". Journal of Applied Psychology 32: 221–233.
Klare, G. R. (1963). The Measurement of Readability. Ames, Iowa: University of Iowa Press.
Chall, J. S. (1958). Readability: An Appraisal of Research and Application. Columbus, OH: Bureau of Educational Research, Ohio State University.
Farr, J. N., J. J. Jenkins, and D. G. Paterson (1951). "Simplification of the Flesch Reading Ease Formula". Journal of Applied Psychology 35, no. 5: 333–357.
Kincaid, J. P., R. P. Fishburne, R. L. Rogers, and B. S. Chissom (1975). Derivation of New Readability Formulas (Automated Readability Index, Fog Count, and Flesch Reading Ease Formula) for Navy Enlisted Personnel. CNTECHTRA Research Branch Report 8-75.
Dale, E. and J. S. Chall (1948). "A formula for predicting readability". Educational Research Bulletin, Jan. 21 and Feb. 17, 27: 1–20, 37–54.
Chall, J. S. and E. Dale (1995). Readability Revisited: The New Dale–Chall Readability Formula. Cambridge, MA: Brookline Books.
Gunning, R. (1952). The Technique of Clear Writing. New York: McGraw-Hill.
Fry, E. B. (1963). Teaching Faster Reading.
London: Cambridge University Press.
Fry, E. B. (1968). "A readability formula that saves time". Journal of Reading 11: 513–516.
Doak, C. C., L. G. Doak, and J. H. Root (1996). Teaching Patients with Low Literacy Skills. Philadelphia: J. P. Lippincott Company.
Caylor, J. S., T. G. Sticht, L. C. Fox, and J. P. Ford (1973). Methodologies for Determining Reading Requirements of Military Occupational Specialties. Technical Report No. 73-5. Alexandria, VA: Human Resources Research Organization.
Kintsch, W. and J. R. Miller (1981). "Readability: A view from cognitive psychology". In Teaching: Research Reviews. Newark, DE: International Reading Assn.
Kemper, S. (1983). "Measuring the inference load of a text". Journal of Educational Psychology 75, no. 3: 391–401.
Meyer, B. J. (1982). "Reading research and the teacher: The importance of plans". College Composition and Communication 33, no. 1: 37–49.
Armbruster, B. B. (1984). "The problem of inconsiderate text". In Comprehension Instruction, ed. G. Duffy. New York: Longman. pp. 202–217.
Calfee, R. C. and R. Curley (1984). "Structures of prose in content areas". In Understanding Reading Comprehension, ed. J. Flood. Newark, DE: International Reading Assn. pp. 414–430.
Dolch, E. W. (1939). "Fact burden and reading difficulty". Elementary English Review 16: 135–138.
Flesch, R. (1949). The Art of Readable Writing. New York: Harper. OCLC 318542.
Coleman, E. B. and P. J. Blumenfeld (1963). "Cloze scores of nominalization and their grammatical transformations using active verbs". Psychology Reports 13: 651–654.
Gough, P. B. (1965). "Grammatical transformations and the speed of understanding". Journal of Verbal Learning and Verbal Behavior 4: 107–111.
Coleman, E. B. (1966). "Learning of prose written in four grammatical transformations". Journal of Applied Psychology 49: 332–341.
Clark, H. H. and S. E. Haviland (1977). "Comprehension and the given-new contract". In Discourse Production and Comprehension, ed. R. O. Freedle. Norwood, NJ: Ablex Press. pp. 1–40.
Hornby, P. A. (1974). "Surface structure and presupposition". Journal of Verbal Learning and Verbal Behavior 13: 530–538.
Spyridakis, J. H. (1989). "Signaling effects: A review of the
research Part 1 Journal of technical writing and communication 19 no 3 227 240 Spyridakis J H 1989 Signaling effects Increased content retention and new answers Part 2 Journal of technical writing and communication 19 no 4 395 415 Halbert M G 1944 The teaching value of illustrated books American school board journal 108 no 5 43 44 Vernon M D 1946 Learning from graphic material British journal of psychology 36 145 158 Felker D B F Pickering V R Charrow V M Holland and J C Redish 1981 Guidelines for document designers Washington D C American Institutes for Research Klare G R J E Mabry and L M Gustafson 1955 The relationship of patterning underlining to immediate retention and to acceptability of technical material Journal of Applied Psychology 39 no 1 40 42 Klare G R 1957 The relationship of typographic arrangement to the learning of technical material Journal of Applied Psychology 41 no 1 41 45 Jatowt A and K Tanaka 2012 Longitudinal analysis of historical texts readability Proceedings of Joint Conference on Digital Libraries 2012 353 354 Vygotsky L 1978 Mind in society Cambridge MA Harvard University Press Chall J S and S S Conard 1991 Should textbooks challenge students The case for easier or harder textbooks New York Teachers College Press Bormuth J R 1966 Readability A new approach Reading research quarterly 1 79 132 Bormuth J R 1969 Development of readability analysis Final Report Project no 7 0052 Contract No OEC 3 7 0070052 0326 Washington D C U S Office of Education Bureau of Research U S Department of Health Education and Welfare Bormuth J R 1971 Development of standards of readability Towards a rational criterion of passage performance Washington D C U S Office of Education Bureau of Research U S Department of Health Education and Welfare Stenner A J I Horabin D R Smith and R Smith 1988 The Lexile Framework Durham NC Metametrics School Renaissance Institute 2000 The ATOS readability formula for books and how it compares to other formulas Madison WI School 
Renaissance Institute Inc Paul T 2003 Guided independent reading Madison WI School Renaissance Institute Inc http www renlearn com GIRP2008 pdf a b Graesser A C McNamara D S Louwerse M M 2003 Sweet A P Snow C E eds What do readers need to learn in order to process coherence relations in narrative and expository text Rethinking reading comprehension New York Guilford Publications pp 82 98 a b Lee Bruce W Lee Jason Dec 2020 LXPER Index 2 0 Improving Text Readability Assessment Model for L2 English Students in Korea Proceedings of the 6th Workshop on Natural Language Processing Techniques for Educational Applications 20 24 arXiv 2010 13374 Feng Lijun Jansche Martin Huernerfauth Matt Elhadad Noemie August 2010 A Comparison of Features for Automatic Readability Assessment Coling 2010 Posters 276 284 a b Vajjala Sowmya Meurers Detmar June 2012 On Improving the Accuracy of Readability Classification using Insights from Second Language Acquisition Proceedings of the Seventh Workshop on Building Educational Applications Using NLP 163 173 a b c Collins Thompson Kevyn 2015 Computational assessment of text readability A survey of current and future research International Journal of Applied Linguistics 165 2 97 135 doi 10 1075 itl 165 2 01col Xu Wei Callison Burch Chris Napoles Courtney 2015 Problems in Current Text Simplification Research New Data Can Help Transactions of the Association for Computational Linguistics 3 283 297 doi 10 1162 tacl a 00139 S2CID 17817489 Deutsch Tovly Jasbi Masoud Shieber Stuart July 2020 Linguistic Features for Readability Assessment Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications 1 17 arXiv 2006 00377 doi 10 18653 v1 2020 bea 1 1 Feng Lijun Elhadad Noemie Huenerfauth Matt March 2009 Cognitively motivated features for readability assessment EACL 09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics Eacl 09 229 237 doi 10 3115 1609067 
1609092 S2CID 13888774 Gibson Edward 1998 Linguistic complexity locality of syntactic dependencies Cognition 68 1 1 76 doi 10 1016 S0010 0277 98 00034 1 PMID 9775516 S2CID 377292 Pitler Emily Nenkova Ani October 2008 Revisiting Readability A Unified Framework for Predicting Text Quality Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing 186 195 Flesch R 1946 The art of plain talk New York Harper Flesch R 1979 How to write in plain English A book for lawyers and consumers New York Harpers Klare G R 1980 How to write readable English London Hutchinson Fry E B 1988 Writeability the principles of writing for increased comprehension In Readability Its past present and future eds B I Zakaluk and S J Samuels Newark DE International Reading Assn Further reading EditHarris A J and E Sipay 1985 How to increase reading ability 8th Ed New York amp London Longman Ruddell R B 1999 Teaching children to read and write Boston Allyn and Bacon Manzo A V and U C Manzo 1995 Teaching children to be literate Fort Worth Harcourt Brace Vacca J A R Vacca and M K Gove 1995 Reading and learning to read New York HarperCollins External links Edit Look up readability in Wiktionary the free dictionary Wikiversity has learning resources about Wikiversity Readability Readability Scoring Tool Scores against many readability formulas at once Readable io Readability Tests Joe s Web Tools Text Content Analysis Tool UsingEnglish com free membership required Retrieved from https en wikipedia org w index php title Readability amp oldid 1138708376, wikipedia, wiki, book, books, library,