Guide to all Text Analysis posts
Try and guess the text I am going to use as an illustrative example. At a certain stage of text processing, it has somewhat fewer than 2,000 distinct word-tokens and a total length of a bit less than 27,000 words. I rank these 2,000 distinct tokens by frequency of appearance in the text, most frequent first. Here are some of the salient tokens and their ranks, in increasing order of rank. Scan the tokens one at a time, and stop when you can guess the text they are from. Give yourself points equal to the rank of the word you stopped at, and report back to me via a comment.
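For anyone who wants to build a ranking like this for a text of their own, here is a minimal sketch in Python. It is not the exact processing used for this post: the crude regular-expression tokenizer, the function name `rank_tokens`, and the toy `sample` sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

def rank_tokens(text):
    """Rank the distinct words of `text` by frequency, most frequent first.

    Returns (ranked, total_words, distinct_words), where `ranked` is a list
    of (rank, word, count) tuples and rank 1 is the most frequent word.
    """
    # Crude tokenizer (an assumption): lowercase, keep runs of letters,
    # optionally followed by one apostrophe part, as in "don't".
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(words)
    # most_common() sorts by descending count; ties keep first-seen order.
    ranked = [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]
    return ranked, len(words), len(counts)

# Toy input, deliberately not the mystery text.
sample = "the cat saw the dog and the dog saw the cat run"
ranked, total_words, distinct_words = rank_tokens(sample)

for rank, word, count in ranked:
    print(rank, word, count)
```

Here `total_words` plays the role of the "total length" above and `distinct_words` the role of the number of distinct tokens. A real pipeline would probably also drop very common words or merge word forms before ranking, which is one way the distinct count could fall below 2,000.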
Here is more information about the words:
The word 'head' appears only 4/5 as often as the word 'queen' does. Assuming that half of the appearances of 'head' occur in neutral circumstances, the other half belong to the Queen, so she is only yelling “Off with his head!” 2/5 of the times she makes an appearance (1/2 × 4/5 = 2/5).
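The back-of-the-envelope estimate can be written out as a tiny calculation. This is only a sketch: the function name `shouting_share` is made up for illustration, and the counts 4 and 5 are stand-ins that reproduce the 4/5 ratio, not the actual word counts from the text.

```python
from fractions import Fraction

def shouting_share(head_count, queen_count, neutral_share=Fraction(1, 2)):
    """Estimate the share of 'queen' appearances that come with a shouted 'head'.

    neutral_share is the assumed fraction of 'head' appearances that have
    nothing to do with the Queen.
    """
    ratio = Fraction(head_count, queen_count)  # 'head' relative to 'queen'
    return ratio * (1 - neutral_share)         # keep only the non-neutral part

# Placeholder counts in a 4:5 ratio (an assumption, not the real counts).
share = shouting_share(4, 5)
print(share)
```

Changing `neutral_share` shows how sensitive the estimate is to the guess that half of the 'head' appearances are neutral.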
The full text can be found here.