Welcome to the website of the Cameleon Project: Collaborative and Automatic Methods for the Multilingualisation of Lexica and Ontologies.
This is a 4-year international cooperation project that aims to create, strengthen and sustain academic exchanges between French and Brazilian researchers in the domain of multilingual lexica and ontologies. It is co-financed by CAPES (Brazil) and COFECUB (France) as project CAPES-COFECUB number 707/11. The project started in 2011 and is expected to conclude by the end of 2014.
The goal of this project is to investigate, propose, test, apply and validate automatic and collaborative techniques for developing lexical and ontological resources that are useful in multilingual applications, particularly for French, Portuguese and English.
To this end, the project investigates methods for acquiring linguistic information to build lexical resources and for integrating multilingual lexica and ontologies, with a focus on collaborative and automatic techniques. In the collaborative approach, volunteer contributors use a platform to edit dictionary entries and create links online, via a Web browser. Similarly, multilingual applications can access and contribute to the lexical resources stored on the platform automatically, through an API. The automatic approach builds resources by extracting lexical information from textual corpora, using empirical/statistical evidence and machine learning techniques.
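As a rough illustration of what "statistical evidence" from a corpus can look like, the sketch below scores adjacent word pairs by pointwise mutual information (PMI), a common association measure for spotting candidate multiword lexical entries. It is only a minimal example under stated assumptions, not the project's actual method: the function name `pmi_bigrams`, the `min_count` threshold and the whitespace tokenisation are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def pmi_bigrams(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration only. Pairs seen fewer than
    `min_count` times are skipped, since PMI is unreliable for rare events.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bi          # estimated probability of the pair
        p_w1 = unigrams[w1] / n_uni    # estimated probability of each word
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))

    # Highest-scoring pairs first: the strongest candidate entries.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy corpus; a real run would use a large tokenised text collection.
    corpus = "machine learning is fun and machine learning is useful".split()
    for pair, score in pmi_bigrams(corpus):
        print(pair, round(score, 2))
```

A ranked list of this kind is noisy by nature, which is why candidates produced this way would still need to be reviewed by human contributors before entering a lexical resource.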
Integrating automatic and collaborative methods has several advantages because the two are complementary. On the one hand, collaborative methods can use automatically generated data as a starting point, saving time and effort when creating a new instance (for a new language, domain or language pair). On the other hand, data-driven methods produce noisy results that should later be filtered by human experts. Collaborative platforms seem the most natural environment for post-editing automatically extracted lexical and ontological resources.