Violeta Seretan
Maître-Assistante
Address: LATL - Language Technology Laboratory, room L706
Department of Linguistics, University of Geneva
2, rue de Candolle
CH-1211 Geneva, Switzerland
Fly to LATL! [Google Earth link]
Phone: +41 22 379 73 62
Fax: +41 22 379 79 31
E-mail: violeta.seretan@unige.ch
I am a Maître-Assistante (Lecturer) at the Department of Linguistics, University of Geneva, and a member of the Language Technology Laboratory headed by Prof. Eric Wehrli. My work relates to two main Computational Linguistics topics, syntactic parsing and machine translation, and focuses on the corpus-based acquisition of monolingual and multilingual lexical resources for these applications. I am specifically interested in collocations, a subtype of multi-word expressions, and in the interrelation between collocations and parsing/translation. My Ph.D. thesis (defended in June 2008; supervisor Eric Wehrli) explored the use of syntactic information for improving collocation extraction.
Before joining LATL, I studied Computer Science at the University of Iasi, Romania. Both my B.Sc. and M.Sc. theses dealt with Computational Linguistics topics (HPSG, discourse structure, anaphora) and were supervised by Prof. Dan Cristea, who led the NLP Group I was a member of. As a teaching assistant, I currently give practical NLP classes to Master's students at the Faculty of Arts.
Research Interests
- collocations, multi-word expressions
- lexical acquisition
- syntactic parsing
- text alignment
- machine translation, translation aids and tools
- corpus linguistics, Web as a corpus
- textual entailment, nominalization
Projects
I am currently involved in the SNSF project "Analyse multilingue" aimed at the multilingual extension of the Fips symbolic parser (already available for French, English, German, Italian, Spanish, and Greek). I work on the development of the Romanian version, started in a previous related project, "Fips Multilingue" (2004–2006). I previously participated in the "Multra" project (2006–2009) related to the multilingual extension of the Its-2 translation system based on Fips.
I started my work at LATL with the RUIG-GIAN project "Linguistic Analysis and Collocation Extraction" (2002–2004), a joint research project with the Translation Division of the World Trade Organisation in Geneva, in which I developed a translation aid tool for collocation extraction and visualisation in parallel corpora. This project was at the root of my Ph.D. (2004–2008).
During my Ph.D., I spent 3 months as a summer intern at FX-PAL/PARC, Palo Alto, California (August–October 2005). I worked with Lorenzo Thione and Martin van der Berg on interpreting arguments of nominalizations for Question Answering [presentation; talk abstract; Java APIs for NOMLEX].
List of publications
(Peer-reviewed, reverse chronological order, h-index = 8 as computed by "Publish or Perish")
- Seretan, Violeta (in press). Syntax-Based Collocation Extraction. Springer (Text, Speech and Language Technology, volume 44). ISBN: 978-94-007-0133-5.
[bib]
- Wehrli, Eric, Violeta Seretan, and Luka Nerima (2010). Sentence analysis and collocation identification. In Proceedings of the Workshop on Multiword Expressions: from Theory to Applications (MWE 2010), pages 27–35, Beijing, China.
[html abstract] [pdf] [bib]
- Seretan, Violeta and Eric Wehrli (2010). Tools for syntactic concordancing. In Proceedings of the International Multiconference on Computer Science and Information Technology, pages 493–500, Wisła, Poland.
[html abstract] [pdf] [bib]
- Seretan, Violeta and Eric Wehrli (2010). Extending a multilingual symbolic parser to Romanian. In Dan Tufis and Corina Forascu (eds.): Multilinguality and Interoperability in Language Processing with Emphasis on Romanian, Romanian Academy Publishing House.
[html abstract] [pdf] [bib]
- Seretan, Violeta, Eric Wehrli, Luka Nerima, and Gabriela Soare (2010). FipsRomanian: towards a Romanian version of the Fips syntactic parser. In Proceedings of the Seventh Conference on International Language Resources and Evaluation (LREC'10), Valletta, Malta.
[html abstract] [pdf] [bib] [poster]
- Nerima, Luka, Eric Wehrli, and Violeta Seretan (2010). A recursive treatment of collocations. In Proceedings of the Seventh Conference on International Language Resources and Evaluation (LREC'10), Valletta, Malta.
[html abstract] [pdf] [bib]
- Wehrli, Eric, Luka Nerima, Violeta Seretan, and Yves Scherrer (2009). On-line and off-line translation aids for non-native readers. In Proceedings of the International Multiconference on Computer Science and Information Technology, pages 299–303, Mrągowo, Poland.
[html abstract] [pdf] [bib]
- Seretan, Violeta (2009). Extraction de collocations et leurs équivalents de traduction à partir de corpus parallèles ('Extracting collocations and translation equivalents from parallel corpora'). TAL, 50(1):305–332. In French.
[html abstract] [pdf] [bib] [data: VO, AN, NPN]
- Seretan, Violeta (2009). An integrated environment for extracting and translating collocations. In Proceedings of the Fifth Corpus Linguistics Conference, Liverpool, U.K.
[html abstract] [pdf] [bib]
- Wehrli, Eric, Violeta Seretan, Luka Nerima, and Lorenza Russo (2009). Collocations in a rule-based MT system: A case study evaluation of their translation adequacy. In Proceedings of the 13th Annual Meeting of the European Association for Machine Translation, pages 128–135, Barcelona, Spain.
[html abstract] [pdf] [bib]
- Michou, Athina and Violeta Seretan (2009). A tool for multi-word expression extraction in Modern Greek using syntactic parsing. In Proceedings of the Demonstrations Session at EACL 2009, pages 45–48, Athens, Greece.
[html abstract] [pdf] [bib]
- Seretan, Violeta and Eric Wehrli (forthcoming). Context-sensitive look-up in electronic dictionaries. In Rufus H. Gouws, Ulrich Heid, Wolfgang Schweickard, and Herbert Ernst Wiegand (eds.): Dictionaries. An international encyclopedia of lexicography. Supplementary volume: Recent developments with special focus on computational lexicography, Handbooks of Linguistics and Communications Science. Walter de Gruyter, Berlin/New York.
[html abstract] [pdf] [bib]
- Seretan, Violeta and Eric Wehrli (2007). Collocation translation based on sentence alignment and parsing. In Actes de la 14e conférence sur le Traitement Automatique des Langues Naturelles (TALN 2007), pages 401–410, Toulouse, France. Best Paper Award.
[html abstract] [pdf] [bib]
- Pallotta, Vincenzo, Violeta Seretan and Marita Ailomaa (2007). User requirements analysis for Meeting Information Retrieval based on query elicitation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL 2007), pages 1008–1015, Prague, Czech Republic.
[html abstract] [pdf] [bib]
- Pallotta, Vincenzo, Violeta Seretan, Marita Ailomaa, Hatem Ghorbel, and Martin Rajman (2007). Towards an argumentative coding scheme for annotating meeting dialogue data. In Proceedings of the 10th International Pragmatics Association Conference (IPrA), Göteborg, Sweden.
[html abstract] [pdf] [bib]
- Nerima, Luka, Violeta Seretan, and Eric Wehrli (2006). Le problème des collocations en TAL. Nouveaux cahiers de linguistique française, 27(2006):95–115.
[html abstract] [pdf] [bib]
- Seretan, Violeta and Eric Wehrli (2006). Accurate collocation extraction using a multilingual parser. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (COLING/ACL 2006), pages 953–960, Sydney, Australia.
[html abstract] [pdf] [bib]
- Seretan, Violeta and Eric Wehrli (2006). Multilingual collocation extraction: Issues and solutions. In Proceedings of the COLING/ACL Workshop on Multilingual Language Resources and Interoperability, pages 40–49, Sydney, Australia.
[html abstract] [pdf] [bib]
- Seretan, Violeta (2005). Induction of syntactic collocation patterns from generic syntactic relations. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI 2005), pages 1698–1699, Edinburgh, Scotland.
[html abstract] [pdf] [bib]
- Seretan, Violeta, Luka Nerima, and Eric Wehrli (2004). Multi-word collocation extraction by syntactic composition of collocation bigrams. In Nicolas Nicolov et al. (eds.): Recent Advances in Natural Language Processing III: Selected Papers from RANLP 2003, pages 91–100. Amsterdam & Philadelphia: John Benjamins. The original publication is available at www.benjamins.com.
[html abstract] [pdf] [bib]
- Seretan, Violeta, Luka Nerima, and Eric Wehrli (2004). A tool for multi-word collocation extraction and visualization in multilingual corpora. In Proceedings of the Eleventh EURALEX International Congress (EURALEX 2004), pages 755–766, Lorient, France.
[html abstract] [pdf] [bib]
- Seretan, Violeta, Luka Nerima, and Eric Wehrli (2004). Using the Web as a corpus for the syntactic-based collocation identification. In Proceedings of the International Conference on Language Resources and Evaluation (LREC 2004), pages 1871–1874, Lisbon, Portugal.
[html abstract] [pdf] [bib]
- Seretan, Violeta, Luka Nerima, and Eric Wehrli (2003). Extraction of multi-word collocations using syntactic bigram composition. In Proceedings of the Fourth International Conference on Recent Advances in NLP (RANLP-2003), pages 424–431, Borovets, Bulgaria.
[html abstract] [pdf] [bib]
- Nerima, Luka, Violeta Seretan, and Eric Wehrli (2003). Creating a multilingual collocation dictionary from large text corpora. In Proceedings of the Research Notes Session of the 10th Conference of the European Chapter of the Association for Computational Linguistics (EACL'03), pages 131–134, Budapest, Hungary.
[html abstract] [pdf] [bib]
- Seretan, Violeta (2002). Discourse analysis correction using anaphoric cues. In Proceedings of the 2nd Workshop on RObust Methods in Analysis of Natural language Data (ROMAND 2002), pages 79–86, Frascati, Italy.
[html abstract] [pdf] [bib]
- Seretan, Violeta and Dan Cristea (2002). The use of referential constraints in structuring discourse. In Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002), pages 1231–1238, Las Palmas, Spain.
[html abstract] [pdf] [bib]
Teaching
- 2002–2003 winter term
- Travaux pratiques en relation avec TALN (NLP Laboratory), Faculty of Arts, 3rd year
- 2003–2004 winter term
- Travaux pratiques en relation avec TALN (NLP Laboratory), Faculty of Arts, 3rd year
- 2004–2005 winter term
- Travaux pratiques en relation avec TALN (NLP Laboratory), Faculty of Arts, 3rd year
- Databases - seminar and laboratory for the course Informatique 2, Faculty of Arts, 2nd year
- 2005–2006 winter term
- Travaux pratiques en relation avec TALN (NLP Laboratory), Faculty of Arts, Master and DEA
- 2009–2010 autumn term
- Travaux pratiques en relation avec TALN (NLP Laboratory), Faculty of Arts, Master and DEA
Reviewing
- CLA Workshop Computational Linguistics – Applications (CLA'10), Wisła, Poland, October 18–20, 2010
- "Multiword Expressions: from Theory to Applications" (MWE 2010), Workshop at COLING 2010, Beijing, China, August 28, 2010
- The 23rd International Conference on Computational Linguistics (COLING 2010), Beijing, China, August 23–27, 2010
- The 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), Uppsala, Sweden, July 11–16, 2010
- The Seventh International Conference on Language Resources and Evaluation (LREC 2010), Valletta, Malta, May 19–21, 2010
- CICLing 2010 satellite conference PROMISE - Processing ROmanian in Multilingual, Interoperational and Scalable Environments, Iasi, Romania, March 29–31, 2010
- Transactions on Intelligent Systems and Technology journal, 2010
- ACL/IJCNLP 2009 Workshop "Multiword Expressions: Identification, Interpretation, Disambiguation and Applications (MWE 2009)", Singapore, August 6, 2009
- ConsILR 2009 Workshop "Romanian Linguistic Resources and Tools for Natural Language Processing", May 6–7, 2009
- Natural Language Engineering journal, 2008
- Language Resources and Evaluation journal, 2008
- ConsILR 2008 Workshop "Romanian Linguistic Resources and Tools for Natural Language Processing", Iasi, Romania, November 19–20, 2008
- LREC 2008 Workshop "Towards a Shared Task for Multiword Expressions (MWE 2008)", Marrakech, Morocco, June 1, 2008
- ConsILR 2007 Workshop "Romanian Linguistic Resources and Tools for Natural Language Processing", Iasi, Romania, December 14–15, 2007
- ACL 2007 Workshop "A Broader Perspective on Multiword Expressions", Prague, Czech Republic, June 28, 2007
- Doctoral Consortium at the 8th EUROLAN Summer School, Iasi, Romania, July 30 – August 2, 2007
- RANLP-07, Borovets, Bulgaria, September 27–29, 2007
- RANLP-07 Workshop on Acquisition and Management of Multilingual Lexicons, Borovets, Bulgaria, September 30, 2007
- Computational Linguistics journal, December 2007, Vol. 33, No. 4
- COLING/ACL Workshop on Multiword Expressions, Sydney, Australia, July 23, 2006
- ROMAND 2006 4th International workshop on RObust Methods in Analysis of Natural language Data at EACL 2006, Trento, Italy, April 3, 2006
Conference organisation
Last modified: November 2, 2011