5.7. Babble File

Babble files are a form of Markov specification used to generate text on the fly while preserving some statistical properties of the original source material. There are two types of lines in this file. The file begins with a list of tokens (words) and their relative frequencies in the source. These frequencies determine how likely each token is to be chosen when the current state has no “next” state. These lines look like:

connections 24
recommended 4
book 3

which lists the tokens connections, recommended, and book, each followed by its frequency in the source.
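As a rough illustration of how these frequencies could drive the fallback choice, the following Python sketch parses the token lines and turns them into selection probabilities. The function names (parse_token_line, fallback_probabilities, choose_fallback) are hypothetical and not part of any tool described here, and the sketch assumes each line is a single token and a number separated by whitespace.

```python
import random

def parse_token_line(line):
    """Split a 'token frequency' line into (token, frequency).

    Assumption: the frequency is the last whitespace-separated field.
    """
    token, freq = line.rsplit(None, 1)
    return token, float(freq)

def fallback_probabilities(lines):
    """Map each token to its share of the total frequency."""
    pairs = [parse_token_line(l) for l in lines if l.strip()]
    total = sum(freq for _, freq in pairs)
    return {token: freq / total for token, freq in pairs}

def choose_fallback(lines, rng=random):
    """Pick one token at random, weighted by its relative frequency."""
    pairs = [parse_token_line(l) for l in lines if l.strip()]
    tokens = [token for token, _ in pairs]
    weights = [freq for _, freq in pairs]
    return rng.choices(tokens, weights=weights, k=1)[0]

token_lines = ["connections 24", "recommended 4", "book 3"]
probs = fallback_probabilities(token_lines)
```

With the three example lines the total frequency is 31, so connections would be chosen with probability 24/31, recommended with 4/31, and book with 3/31.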

The second type of line is a pair of tokens followed by a number. The first token represents the current state, that is, the previously selected token. The second token is the next token to select, and it is followed by a relative frequency. The first token may appear on many lines, one for each possible next token. Such a sequence of lines could be combined into a single CDF line. For example:

own use. 0.5
own server 1

Both lines apply when the current state is own. Taken as cumulative values, 0.5 and 1 give each next token half of the range, so there is an equal chance of moving to either use. or server. Combined, this would be a CDF line like 0.5 use., 1.0 server.
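The grouping of pair lines into a CDF line can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the third field is treated as a cumulative value (as the 0.5 and 1 example implies), tokens contain no whitespace, and lines for a given state appear in increasing order. The names parse_transitions, cdf_line, and next_token are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

def parse_transitions(lines):
    """Group 'state next value' lines by their current state."""
    table = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # not a pair line (e.g. a 'token frequency' line)
        state, nxt, value = parts
        table[state].append((float(value), nxt))
    return table

def cdf_line(entries):
    """Render one state's entries as a single CDF-style line."""
    return ", ".join(f"{value} {token}" for value, token in entries)

def next_token(entries, u):
    """Pick the next token for a draw u in [0, 1).

    Assumption: values are cumulative and sorted ascending, so the
    first entry whose value exceeds u is the one selected.
    """
    thresholds = [value for value, _ in entries]
    index = min(bisect_right(thresholds, u), len(entries) - 1)
    return entries[index][1]

table = parse_transitions(["own use. 0.5", "own server 1"])
line = cdf_line(table["own"])
```

Here a draw below 0.5 selects use. and a draw from 0.5 up to 1 selects server, which matches the equal chance described above.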