CompL-it: a Computational Lexicon of Italian

Authors

  • Flavia Sciolette CNR-Istituto di Linguistica Computazionale (ILC) “A. Zampolli”, Pisa, Italy
  • Andrea Bellandi
  • Emiliano Giovannetti CNR-Istituto di Linguistica Computazionale (ILC) “A. Zampolli”, Pisa, Italy
  • Simone Marchi CNR-Istituto di Linguistica Computazionale (ILC) “A. Zampolli”, Pisa, Italy

DOI:

https://doi.org/10.57574/596545646

Keywords:

Computational Lexicon, Linguistic Resources, Linguistic Linked Open Data, OntoLex-Lemon, Information Retrieval

Abstract

This paper describes CompL-it, a new open computational lexicon for contemporary Italian. The resource was built from three sources: an existing Italian lexicon, a lemmatized list of inflected forms obtained from a morphological analyser, and a set of treebanks. Integrating these resources required standardising them according to the conventions of the Linguistic Linked Open Data community, a prerequisite for their subsequent conversion into the OntoLex-Lemon model. The resulting computational lexicon comprises approximately 100,000 lexical entries, 790,000 forms, 57,000 senses, and 86,000 semantic relations. Thanks to its rich and articulated linguistic structure, the lexicon can be used to enhance information retrieval in full-text search tasks, as the paper shows.

Published

30-12-2024