I am a Permanent Researcher (tenured) at IXA Group, part of the HiTZ Centre of the University of the Basque Country UPV/EHU, where I am head of the Text Analysis unit. I got a PhD in Computer Science at City, University of London (2007), and I have since been working on Natural Language Processing at several British and Spanish institutions, including a two-year stint in industry as research project director. I have been involved as PI or collaborator in more than 40 research projects funded by the European Commission, UK research councils, the Spanish Ministry of Science and the Basque Government, and have published in major journals (Artificial Intelligence, etc.) and conferences (ACL, EMNLP, EACL, IJCAI, etc.) related to Artificial Intelligence and Natural Language Processing.
Currently my research is focused on Computational Semantics and Information Extraction, with a strong focus on multilingual and cross-lingual approaches. I was the creator and main developer of IXA pipes, a set of ready-to-use multilingual tools for linguistic processing. I am also a PMC member and committer in the OpenNLP project of the Apache Software Foundation.
Latest News
New CoNLL 2024 paper on a new Argumentation task: Critical Question Generation!
New EMNLP 2024 papers on CasiMedicos-Arg and Evaluation of Counternarratives.
Check out MedExpQA: our new Multilingual Medical QA benchmark for LLMs in the Medical Domain!
New ACL 2024 paper on Argument Mining in Data Scarce Settings: Cross-lingual Transfer and Few-shot Techniques
Area Chair at COLING 2025
Senior Programme Committee Member at ECAI 2024.
New project funded!! DeepMinor: Large Language Models for Multilingual and Multidomain Text Processing in Low Resource Scenarios. Consolidación Investigadora 2023, Ministerio de Ciencia e Innovación. Funding: €399.987,28. Ref: CNS2023-144375. Role: Principal Investigator.
Check out our ICLR 2024 accepted paper on Annotation Guidelines to Improve Zero-Shot Information-Extraction.
Demo paper presented at EACL 2024: TextBI: a multimodal dashboard to interpret multidimensional text annotations on large volumes of multilingual social media data
Three papers accepted at LREC-COLING 2024. Check the publications page for details.
Area Chair at ACL 2024
Area Chair at NAACL 2024
New paper on Neural Contextual Lemmatization published in Computational Linguistics (presented at EMNLP 2023).
New paper on Cross-Lingual Annotation Projection using text-to-text models (Findings of EMNLP 2023).
Senior Area Chair, Information Extraction track at EACL 2024.
Organizing the Workshop on NLP applied to Misinformation at the annual conference of the Spanish Society for Natural Language Processing.
Organizing TESTLINK@IberLEF 2023, to be held at the annual conference of the Spanish Society for Natural Language Processing.
New paper on Scaling Laws for BERT in Low-Resource Settings at ACL Findings 2023.
New paper on Lessons Learned from the Evaluation of Spanish Language Models published in Procesamiento del Lenguaje Natural, no. 70, March 2023, pp. 157-170.
New paper on multilingual temporal processing published in Expert Systems with Applications, Elsevier.
New project funded!! DISARGUE: Few-shot Learning and Argumentation to Detect and Fight Misinformation in Social Media, Funding Entity: Programa Transición Digital y Ecológica (TED 2021), Ministerio de Ciencia, Innovación y Universidades. Funding: €200.330. Ref: TED2021-130810B-C21. Role: Principal Investigator.
New project funded!! DeepKnowledge: Deep Language Models for Understanding and Reasoning with Multilingual Content. Funding Entity: Programa Generación del Conocimiento, Ministerio de Ciencia, Innovación y Universidades. Funding: €295.119. Ref: PID2021-127777OB-C21. Role: Principal Investigator.
Senior Action Editor at ACL Rolling Review from December 2022 to December 2023.
New paper on Multilingual Metaphor Detection to be presented at CoNLL 2022.
2 papers accepted at EMNLP 2022. Check them out!
Senior Area Chair at COLING 2022.
Area Chair at LREC 2022
New project funded!! ANTIDOTE. ArgumeNtaTIon-Driven explainable artificial intelligence fOr digiTal mEdicine. Funding Entity: CHIST-ERA - INT-Acciones de Programación Conjunta Internacional (MINECO) 2020. Funding: €148500. Ref: PCI2020-120717-2. Role: Principal Investigator (Spanish partner).
New Dataset on Stance Detection on Vaccines. We organized a shared task at IberLEF 2021 on detecting stance in favour of or against vaccines in Basque and Spanish. You can check the VaxxStance shared task overview paper here.
New paper on Social Analysis of Basque Speakers in Twitter published in the Journal of Multilingual and Multicultural Development.
I am Associate Editor at the Expert Systems with Applications journal, Elsevier.
New paper on Multilingual Stance Detection published in Expert Systems with Applications, Elsevier.
We won the SardiStance@EVALITA 2020 shared task on Stance Detection. Results and details are in the shared task overview paper.
We won the Capitel 2020 NER shared task in Spanish. Results and details are on the shared task page. You can check the paper here.
Invited to the IJCAI 2020 Journal track: We have been invited to publish our paper Language Independent Sequence Labelling for Opinion Target Extraction, Artificial Intelligence Journal, 2019. A preprint version is available.
New project funded: I am co-PI (with Simón Peña-Fernandez) of the project “Tools for the analysis of parliamentary discourses: polarization, subjectivity and affectivity in the post-truth era”. Funded by the UPV/EHU to promote collaboration between the GureIker and IXA research groups.
Two papers accepted at LREC 2020.
Final Word
“If it were not for the brute fact that the world contains more than five billion primates that are demonstrably able to produce and comprehend natural languages, mathematical linguists would long ago have been able to present convincing formal demonstrations that such production and comprehension was impossible” (Gerald Gazdar).