Third Workshop on Human Evaluation of NLP Systems (HumEval’23)

RANLP’23, Varna, Bulgaria, 7 or 8 September 2023

First Call for Papers

Website: https://humeval.github.io/

HumEval’21 at EACL 2021: https://humeval.github.io/2021/
HumEval’22 at ACL 2022: https://humeval.github.io/2022/

Workshop overview

The Third Workshop on Human Evaluation of NLP Systems (HumEval’23) invites the submission of long and short papers on substantial, original, and unpublished research on all aspects of human evaluation of NLP systems, with a focus on systems that produce language as output. We welcome work on any quality criteria relevant to NLP; on both intrinsic evaluation (which assesses systems and outputs directly) and extrinsic evaluation (which assesses systems and outputs indirectly, in terms of their impact on an external task or system); on quantitative as well as qualitative methods; and on score-based (discrete or continuous scores) as well as annotation-based (marking, highlighting) approaches.

Invited speakers

To be confirmed.

Important dates

  • Workshop paper submission deadline: 10 July 2023
  • Workshop paper acceptance notification: 5 August 2023
  • Workshop paper camera-ready versions: 25 August 2023
  • Workshop camera-ready proceedings ready: 31 August 2023

All deadlines are 23:59 UTC-12.

Papers

We invite papers on topics including, but not limited to, the following:

  • Experimental design and methods for human evaluations
  • Reproducibility of human evaluations
  • Work on inter-evaluator and intra-evaluator agreement
  • Ethical considerations in human evaluation of computational systems
  • Quality assurance for human evaluation
  • Crowdsourcing for human evaluation
  • Issues in meta-evaluation of automatic metrics by correlation with human evaluations
  • Alternative forms of meta-evaluation and validation of human evaluations
  • Comparability of different human evaluations
  • Methods for assessing the quality and the reliability of human evaluations
  • Role of human evaluation in the context of Responsible and Accountable AI

We welcome work from any subfield of NLP (and ML/AI more generally), with a particular focus on evaluation of systems that produce language as output.

Submission

Long papers

Long papers must describe substantial, original, completed and unpublished work. Wherever appropriate, concrete evaluation and analysis should be included. Long papers may consist of up to eight (8) pages of content, plus unlimited pages of references. Final versions of long papers will be given one additional page of content (up to 9 pages) so that reviewers’ comments can be taken into account. Long papers will be presented orally or as posters as determined by the programme committee. Decisions as to which papers will be presented orally and which as posters will be based on the nature rather than the quality of the work. There will be no distinction in the proceedings between long papers presented orally and as posters.

Short papers

Short paper submissions must describe original and unpublished work. Short papers should have a point that can be made in a few pages. Examples of short papers include a focused contribution, a negative result, an opinion piece, an interesting application nugget, or a small set of interesting results. Short papers may consist of up to four (4) pages of content, plus unlimited pages of references. Final versions of short papers will be given one additional page of content (up to 5 pages) so that reviewers’ comments can be taken into account. Short papers will be presented orally or as posters as determined by the programme committee. While short papers will be distinguished from long papers in the proceedings, there will be no distinction in the proceedings between short papers presented orally and as posters.

Multiple submission policy

HumEval’23 allows multiple submissions. However, if a submission has already been, or is planned to be, submitted to another event, this must be clearly stated in the submission form.

Submission procedure and templates

Please submit short and long papers directly via START by the submission deadline, 10 July 2023: https://softconf.com/ranlp23/HumEval/

Please follow the submission guidelines issued by RANLP 2023: http://ranlp.org/ranlp2023/index.php/submissions/

Optional Supplementary Materials: Appendices, Software and Data

Additionally, supplementary materials can be added in an appendix. If you wish to make software and data available to accompany the paper, please indicate this in the paper, but fully anonymise all links in the submission.

Organisers

Anya Belz, ADAPT Centre, Dublin City University, Ireland
Maja Popović, ADAPT Centre, Dublin City University, Ireland
Ehud Reiter, University of Aberdeen, UK
João Sedoc, New York University, US
Craig Thomson, University of Aberdeen, UK

For questions and comments regarding the workshop please contact the organisers at humeval.ws@gmail.com.

Programme committee

Albert Gatt, Utrecht University, NL
Leo Wanner, Universitat Pompeu Fabra, ES
Alberto José Bugarín Diz, University of Santiago de Compostela, ES
Jose Alonso, University of Santiago de Compostela, ES
Antonio Toral, Groningen University, NL
Malik Altakrori, McGill University, CA
Aoife Cahill, Dataminr, US
Malvina Nissim, Groningen University, NL
Dimitra Gkatzia, Edinburgh Napier University, UK
Yiru Li, Groningen University, NL
Margot Mieskes, University of Applied Sciences, Darmstadt, DE
Mark Cieliebak, Zurich University of Applied Sciences, CH
Diyi Yang, Georgia Tech, US
Mingqi Gao, Peking University, CN
Elizabeth Clark, Google Research, US
Mohammad Arvan, University of Illinois, Chicago, US
Filip Klubicka, Technological University Dublin, IE
Mohit Bansal, UNC Chapel Hill, US
Gavin Abercrombie, Heriot-Watt University, UK
Natalie Parde, University of Illinois, Chicago, US
Huiyuan Lai, Groningen University, NL
Ondřej Dušek, Charles University, CZ
Ondřej Plátek, Charles University, CZ
Takumi Ito, Utrecht University, NL
Rudali Huidrom, ADAPT/DCU, IE
Jackie Cheung, McGill University, CA
Saad Mahamood, Trivago N.V., DE
Pablo Mosteiro Romero, Utrecht University, NL
Steffen Eger, Bielefeld University, DE
Jie Ruan, Peking University, CN
Tanvi Dinkar, Heriot-Watt University, UK
Joel Tetreault, Dataminr, US
Verena Rieser, Heriot-Watt University, UK
John Kelleher, Technological University Dublin, IE
Xiaojun Wan, Peking University, CN
Kees van Deemter, Utrecht University, NL
Yanran Chen, Bielefeld University, DE
Lewis Watson, Edinburgh Napier University, UK
Dirk Hovy, Bocconi University, IT
Manuela Hürlimann, Zurich University of Applied Sciences, CH
Javier González Corbelle, University of Santiago de Compostela, ES