Workshop on Human Evaluation of NLP Systems (HumEval)

ACL’22, Dublin, Ireland, 26 or 27 May 2022

First Call for Papers


Previous edition (at EACL 2021):

Workshop overview

The 2nd HumEval Workshop invites the submission of long and short papers on substantial, original, and unpublished research on all aspects of human evaluation of NLP systems, including but by no means limited to NLP systems whose output is language. These aspects include intrinsic (directly on the given task) as well as extrinsic (on an external task) evaluation, quantitative as well as qualitative methods, score-based (discrete or continuous scores) as well as annotation-based (marking, highlighting) procedures, and different quality criteria.

Important dates

December 20, 2021: First Call for Workshop Papers
February 6, 2022: Second Call for Workshop Papers
February 28, 2022: Submission Deadline
March 26, 2022: Notification of Acceptance
April 10, 2022: Camera-ready papers due
26 or 27 May 2022: Workshop Date at ACL (exact date to be confirmed)

All deadlines are 23:59 UTC-12


We invite papers on topics including, but not limited to, the following:

  • Experimental design and methods for human evaluations
  • Reproducibility of human evaluations
  • Work on inter-evaluator and intra-evaluator agreement
  • Ethical considerations in human evaluation of computational systems
  • Quality assurance for human evaluation
  • Crowdsourcing for human evaluation
  • Issues in meta-evaluation of automatic metrics by correlation with human evaluations
  • Alternative forms of meta-evaluation and validation of human evaluations
  • Comparability of different human evaluations
  • Methods for assessing the quality and the reliability of human evaluations
  • Role of human evaluation in the context of Responsible and Accountable AI

We welcome work from any subfield of NLP (and ML/AI more generally), with a particular focus on evaluation of systems that produce language as output.


Long papers

Long papers must describe substantial, original, completed and unpublished work. Wherever appropriate, concrete evaluation and analysis should be included. Long papers may consist of up to eight (8) pages of content, plus unlimited pages of references. Final versions of long papers will be given one additional page of content (up to 9 pages) so that reviewers’ comments can be taken into account. Long papers will be presented orally or as posters as determined by the programme committee. Decisions as to which papers will be presented orally and which as posters will be based on the nature rather than the quality of the work. There will be no distinction in the proceedings between long papers presented orally and as posters.

Short papers

Short paper submissions must describe original and unpublished work. Short papers should have a point that can be made in a few pages. Examples of short papers are a focused contribution, a negative result, an opinion piece, an interesting application nugget, or a small set of interesting results. Short papers may consist of up to four (4) pages of content, plus unlimited pages of references. Final versions of short papers will be given one additional page of content (up to 5 pages) so that reviewers’ comments can be taken into account. Short papers will be presented orally or as posters as determined by the programme committee. While short papers will be distinguished from long papers in the proceedings, there will be no distinction in the proceedings between short papers presented orally and as posters.

Multiple submission policy

HumEval22 allows multiple submissions. However, if a submission has already been, or is planned to be, submitted to another event, this must be clearly stated in the submission form.

Submission procedure and templates

Submission is electronic; the links will be available soon. Both long and short papers must be anonymised for double-blind reviewing, must follow the ACL Author Guidelines, and must use the ACL 2022 templates available on the ACL Rolling Review website.

Optional Supplementary Materials: Appendices, Software and Data

ARR encourages the submission of supplementary materials to improve the reproducibility of results, and to enable authors to provide additional information that does not fit in the paper. Supplementary materials may include appendices, software or data. For example, preprocessing decisions, model parameters, feature templates, lengthy proofs or derivations, pseudocode, sample system inputs/outputs, and other details that are necessary for the exact replication of the work described in the paper can be put into appendices. However, if the pseudocode, derivations or model specifications are an important part of the contribution, or if they are important for the reviewers to assess the technical correctness of the work, they should be part of the main paper and not appear in appendices. Reviewers are not required to consider material in appendices. Appendices should come after the references in the submitted PDF, but do not count towards the page limit. Software should be submitted as a single .tgz or .zip archive, and data as a separate single .tgz or .zip archive. Supplementary materials must be fully anonymized to preserve the two-way anonymized reviewing policy.


Organisers

Anya Belz, ADAPT Centre, Dublin City University, Ireland
Maja Popović, ADAPT Centre, Dublin City University, Ireland
Ehud Reiter, University of Aberdeen, UK
Anastasia Shimorina, Orange, Lannion, France

For questions and comments regarding the workshop please contact the organisers at

Programme committee

Eleftherios Avramidis, DFKI, Germany
Sheila Castilho, ADAPT, Dublin City University, Ireland
Sandipan Dandapat, Microsoft, India
Ondrej Dušek, Charles University, Czechia
Markus Freitag, Google, USA
Albert Gatt, Malta University, Malta
Behnam Hedayatnia, Amazon, USA
David Howcroft, Heriot Watt University, UK
Tom Kocmi, Microsoft, Germany
Filip Klubička, ADAPT, Technological University of Dublin, Ireland
Samuel Läubli, University of Zurich, Switzerland
Chris van der Lee, Tilburg University, Netherlands
Saad Mahamood, Trivago, Germany
Nitika Mathur, University of Melbourne, Australia
Margot Mieskes, UAS Darmstadt, Germany
Emiel van Miltenburg, Tilburg University, Netherlands
Mathias Mueller, University of Zurich, Switzerland
Sergiu Nisioi, University of Bucharest, Romania
Juri Opitz, University of Heidelberg, Germany
Maike Paetzel-Prüsmann, University Potsdam, Germany
Maxime Peyrard, EPFL, Switzerland
Tim Polzehl, TU Berlin, Germany
Martin Popel, UFAL, Czechia
Verena Rieser, Heriot Watt University, UK
Samira Shaikh, UNC, USA
Wei Zhao, TU Darmstadt, Germany