In this paper we describe the criteria we adopted for selecting a 3,000,000-word corpus of contemporary written Italian texts. The corpus will serve as the basis for a frequency dictionary, which should have two main characteristics: (i) representativeness of the Italian texts that are actually read, rather than of all possible written texts; and (ii) usefulness for psycholinguistic research.
Publication type:
Contribution in conference proceedings
Publisher:
CISU, Rome, Italy
Source:
JADT 1995, III Giornate Internazionali di Analisi Statistica dei Dati Testuali, pp. 103–109, CNR, Rome, 11-13 December 1995
Date:
1995
Resource Identifier:
http://www.cnr.it/prodotto/i/265175
urn:isbn:8879751603
Language:
Italian