Frumkina's Law

Frumkina's Law is a probabilistic model of the occurrence of linguistic units in text passages.

The Russian linguist Reveka M. Frumkina was the first to systematically investigate the distribution of words in text blocks of fixed length. Later, the occurrence of syntactic structures and syntactic functions was analysed as well.

Data are obtained by counting the number of occurrences of the unit under study in each passage of a text. The passage length should be chosen according to the overall probability of the unit, e.g. 100 words for the analysis of frequent words. The number of passages with x occurrences of the given unit is treated as a random variable. The probability of the unit is denoted by p; the probability of occurrence of any other unit is q = 1 - p. The probability p is itself a random variable, since the use of a word is not independent of its co-text. Under the assumption that p follows a Beta distribution, a beta-binomial distribution (also known as the negative hypergeometric distribution) is obtained for the number of occurrences x in a passage.

This model has been applied to:

Determination of the class of the unit (e.g., part of speech of a word)
Identification of text passages with respect to terminological or semantic criteria
Determination of keywords
Measurement of stylistic parameters
Diagnosis of mental illnesses
Construction of learning automata
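
The counting procedure and the distributional assumption described above can be sketched in Python. This is a minimal illustration, not the procedure used in the cited studies: the function names block_counts, frequency_table and beta_binomial_pmf, the Beta parameters a and b, and the passage length n are assumptions introduced here, and the parameterization may differ from the one used in the literature.

```python
from collections import Counter
from math import exp, lgamma


def block_counts(tokens, unit, block_len=100):
    """Count occurrences of `unit` in consecutive blocks of `block_len` tokens."""
    n_blocks = len(tokens) // block_len  # an incomplete final block is dropped
    return [
        tokens[i * block_len:(i + 1) * block_len].count(unit)
        for i in range(n_blocks)
    ]


def frequency_table(counts):
    """Map x to the number of blocks containing exactly x occurrences."""
    return dict(sorted(Counter(counts).items()))


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(x, n, a, b):
    """P(X = x) for a binomial(n, p) count whose p follows Beta(a, b).

    a and b are hypothetical parameter names; in practice they would be
    estimated from the observed frequency table.
    """
    log_choose = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_choose + log_beta(x + a, n - x + b) - log_beta(a, b))


# Toy text, invented for illustration only.
tokens = ("the cat sat on the mat and the dog sat by the door " * 25).split()
counts = block_counts(tokens, "the", block_len=10)
table = frequency_table(counts)
print(table)
```

The observed table can then be compared with the expected number of blocks, len(counts) * beta_binomial_pmf(x, n, a, b), for each x once a and b have been estimated.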

References

Altmann, G. (1988). Wiederholungen in Texten. Bochum: Brockmeyer.

Best, K.-H. (2005). Sprachliche Einheiten in Textblöcken. Glottometrics 9, 1-12.

Köhler, R. (2001). The distribution of some syntactic construction types in text blocks. In Uhlířová, L., Wimmer, G., Altmann, G., Köhler, R. (Eds.), Text as a linguistic paradigm: levels, constituents, constructs. Festschrift in honour of Luděk Hřebíček: 136-148. Trier: WVT.

Piotrowski, R.G. (1984). Text, Computer, Mensch. Bochum: Brockmeyer.

Paškovskij, V.E., Srebrjanskaja, I.I. (1971). Statističeskie ocenki pis´mennoj reči bol´nych šizofreniej. In: Inženernaja lingvistika. Leningrad: Nauka.
