LinguaStream

LinguaStream is a generic platform for Natural Language Processing (NLP), based on the incremental enrichment of electronic documents. It allows complex processing streams to be designed and evaluated by assembling analysis components of various types and levels: part-of-speech, syntactic, semantic, discourse or statistical. Each stage of the processing stream discovers and produces new information, on which the subsequent stages can rely. At the end of the stream, several tools allow the analysed documents and their annotations to be visualised.
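
The idea of incremental enrichment can be illustrated with a minimal sketch. The code below is not LinguaStream's API, which is written in Java; it is a hypothetical Python illustration in which every name (`Document`, `Annotation`, `tokenizer`, `shape_tagger`, `run_stream`) is an assumption made for the example. It shows only the general principle: each component adds a layer of annotations to a shared document, and a later component reads the layers produced before it.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Annotation:
    """A labelled span of text (hypothetical structure, for illustration only)."""
    layer: str   # name of the annotation layer, e.g. "token"
    start: int   # character offset where the span begins
    end: int     # character offset where the span ends (exclusive)
    value: str   # content of the annotation


@dataclass
class Document:
    """A text plus the annotations accumulated along the stream."""
    text: str
    annotations: List[Annotation] = field(default_factory=list)

    def layer(self, name: str) -> List[Annotation]:
        # Return a new list holding the annotations of one layer.
        return [a for a in self.annotations if a.layer == name]


# A component is any callable that enriches a document in place.
Component = Callable[[Document], None]


def tokenizer(doc: Document) -> None:
    # First stage: produce a "token" layer from the raw text.
    for m in re.finditer(r"\w+", doc.text):
        doc.annotations.append(Annotation("token", m.start(), m.end(), m.group()))


def shape_tagger(doc: Document) -> None:
    # Second stage: relies on the "token" layer produced earlier.
    for tok in doc.layer("token"):
        tag = "CAPITALISED" if tok.value[0].isupper() else "LOWER"
        doc.annotations.append(Annotation("shape", tok.start, tok.end, tag))


def run_stream(doc: Document, components: List[Component]) -> Document:
    # Apply each component in order; every stage sees all earlier annotations.
    for component in components:
        component(doc)
    return doc


doc = run_stream(Document("LinguaStream enriches documents"),
                 [tokenizer, shape_tagger])
for annotation in doc.annotations:
    print(annotation.layer, annotation.start, annotation.end, annotation.value)
```

In this sketch the order of the components matters: running `shape_tagger` before `tokenizer` would produce no "shape" annotations, because the layer it depends on would not exist yet.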

LinguaStream is above all a virtual laboratory aimed at NLP researchers. It allows complex experiments on corpora to be carried out conveniently, using various types of declarative formalisms, and considerably reduces development costs. Its uses range from corpus exploration to the development of fully functional automatic analysers. The platform comes with an integrated environment in which every step of an experiment can be carried out.

LinguaStream also provides an extensive Java API. For example, it can be integrated with Java EE servers to develop web applications based on processing streams. It is also used for teaching, and provides specific modules intended for students.

LinguaStream has been developed at the GREYC computer science research group (Université de Caen) since 2001. It is available free of charge for private use and research purposes.

References

  • "LinguaStream: An Integrated Environment for Computational Linguistics Experimentation", F. Bilhaut and A. Widlöcher (2006). In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL) (Companion Volume), Trento, Italy.
  • "Une plate-forme logicielle et une démarche pour la validation de ressources linguistiques sur corpus : application à l'évaluation de la détection automatique de cadres temporels", S. Ferrari, F. Bilhaut, A. Widlöcher, M. Laignelet (2005). In Actes des 4èmes Journées de Linguistique de Corpus, Lorient, France.
  • "La plate-forme LinguaStream : un outil d'exploration linguistique sur corpus", A. Widlöcher and F. Bilhaut (2005). In Actes de la 12e Conférence Traitement Automatique du Langage Naturel (TALN), Dourdan, France.
  • "La plate-forme LinguaStream", F. Bilhaut and A. Widlöcher (2005). Journée ATALA "Articuler les traitements sur corpus", Paris, France.
  • "The LinguaStream Platform", F. Bilhaut (2003). In Proceedings of the 19th Spanish Society for Natural Language Processing Conference (SEPLN), Alcalá de Henares, Spain, 339-340.