Project Gutenberg by Vividness
Richard Hasty
This project is a whimsical index to more than four thousand English-language texts in Project Gutenberg (www.gutenberg.org). The texts are arranged by the degree to which they contain vivid words, and the titles are displayed in colors according to the color values of the words they contain. The index keys are words, which are themselves ordered by vividness.
The keys can be used to quickly move to a level of vividness. For example, selecting 'verdant' and then 'medal' brings up a list of the eleven most vivid texts (ten cookbooks, one poetry book). Lists of slightly more or less vivid texts are linked from each page of texts.
The texts are arranged by the average vividness of the words they contain. The degree to which any word is considered vivid, and the color associated with that word, are determined through a statistical analysis of all of the Project Gutenberg texts. A word's vividness is determined by its proximity to words which denote color. A word which always occurred within ten words of a color word would have a vividness of one. 'Verdant', the most vivid frequently occurring word, has a vividness of approximately seven tenths.
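The vividness measure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: it assumes vividness is the fraction of a word's occurrences that fall within ten words of a color word, that color words themselves are not scored, and that the color vocabulary is the small hypothetical set COLOR_WORDS. The function names are likewise invented for the example.

```python
from collections import Counter

# Hypothetical color vocabulary; the project's real list is not given.
COLOR_WORDS = {"red", "green", "blue"}
WINDOW = 10  # "within ten words of a color word"


def vividness(tokens, color_words=COLOR_WORDS, window=WINDOW):
    """Map each non-color word to the fraction of its occurrences
    that lie within `window` words of any color word."""
    color_positions = [i for i, t in enumerate(tokens) if t in color_words]
    total = Counter()
    near = Counter()
    for i, word in enumerate(tokens):
        if word in color_words:
            continue  # assumption: color words themselves are not scored
        total[word] += 1
        if any(abs(i - p) <= window for p in color_positions):
            near[word] += 1
    return {w: near[w] / total[w] for w in total}


def text_vividness(tokens, scores):
    """Average vividness over the scored words of one text."""
    values = [scores[t] for t in tokens if t in scores]
    return sum(values) / len(values) if values else 0.0


tokens = ["green", "grass"] + ["filler"] * 15 + ["grass"]
scores = vividness(tokens)
```

Here 'grass' occurs twice, once next to 'green' and once seventeen words away, so under these assumptions half of its occurrences count as near a color word.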
The title colors are based on the average color values of the words in the text. The color value associated with a word is the weighted average of the red, green and blue values for color words that appear in proximity. The weight is the inverse of the distance, in words, between the word and the color word, so that an adjacent color word has a weight of one. In the sentence, "The red balloon floats over the green grass.", 'green' is one word from 'grass' and 'red' is six words away, so the word 'grass' would get six times as large a contribution towards green as towards red. The color values are multiplied by the vividness for display, making the least vivid words appear in black.
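The balloon example can be worked through in code. The sketch below weights each nearby color word by the inverse of its distance in word positions (an adjacent word counts as distance one), which is the reading that reproduces the six-to-one ratio in the example. The RGB values in COLOR_RGB, the ten-word window for color, and the function names are assumptions made for illustration.

```python
# Hypothetical RGB values for a tiny color vocabulary.
COLOR_RGB = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
}


def word_color(tokens, index, color_rgb=COLOR_RGB, window=10):
    """Weighted average RGB of the color words near tokens[index].

    Each color word is weighted by 1 / distance, where distance is the
    difference in word positions. Returns None if no color word is near.
    """
    weight_sum = 0.0
    rgb = [0.0, 0.0, 0.0]
    for j, tok in enumerate(tokens):
        distance = abs(j - index)
        if tok in color_rgb and 0 < distance <= window:
            weight = 1.0 / distance
            weight_sum += weight
            for k in range(3):
                rgb[k] += weight * color_rgb[tok][k]
    if weight_sum == 0.0:
        return None
    return tuple(c / weight_sum for c in rgb)


def display_color(color, vividness):
    """Scale a color by vividness, so low vividness tends toward black."""
    return tuple(round(c * vividness) for c in color)


tokens = "the red balloon floats over the green grass".split()
grass = word_color(tokens, tokens.index("grass"))
```

For 'grass', the weights are 1 for 'green' and 1/6 for 'red', so the green channel of the result is six times the red channel. Passing the result to display_color with a vividness of zero gives black.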
