INLG Conference and the ‘Evaluating NLG Evaluations’ Workshop
By Lesia Tkacz | Web Science iPhD student
Thanks to the Web Science CDT training fund, I had the opportunity to attend the 13th International Conference on Natural Language Generation (INLG 2020), which was held online from Dublin from 15 to 18 December. Natural Language Generation (NLG) is a subfield of Natural Language Processing (NLP) concerned with generating text from data. This data can take many forms, such as spreadsheets, readings from scientific or medical instruments, videos, images, or other texts. NLG is a fascinating field; although much research remains to be done on automatically generating high-quality texts that are coherent, factual, complete, and relevant, generated texts have already percolated from research labs into news media in the form of automated journalism. For some years now we have been reading generated weather, sports, finance, and election reports. More recently, academic publisher Springer released its first scientific book generated from other publications.
Attending the INLG 2020 conference was a fantastic training opportunity for me because it helped me build a better understanding of how to design my study and a stronger grasp of the discussions surrounding evaluation in NLG. My interdisciplinary PhD research intersects with areas of the NLG field whose goal is to generate creative and entertaining texts. While this remains a huge challenge that is far from being solved, it has certainly not prevented web users from creating and publishing text-based works that explore the creative potential of text generation. Of these works, my research is particularly interested in the generated novel as an emerging form of creative text generation. Generated novels often make use of web-based data and NLP/NLG tools and methods, and they are typically published and shared on web platforms. Because I want to study potential readers’ interpretation and reception of generated novels on the web, I needed to better understand how generated texts in general are evaluated, and in particular how humans currently participate in the evaluation process.
Joining the INLG workshop on ‘Evaluating NLG Evaluations’ was a high-quality and engaging way to build that understanding of the current state, discussions, and methods within NLG evaluation. The presented research, the open discussions and networking fostered by the workshop and the conference’s online format through random social break sessions, and the chance to interact informally over INLG’s Discord chat all helped deepen my understanding. The workshop topics included the current quality of NLG evaluation, human and automated metrics, reproducibility, and how shared tasks for NLG evaluation might be created. The workshop closed with participants sharing their opinions on what the future of NLG might hold, which led to fruitful brainstorming about intersections and collaboration opportunities between NLG and Human-Computer Interaction researchers, and about the importance of setting in motion plans for specialised ethics training at future events. I’m grateful to the Web Science CDT for supporting me through the training fund, and I look forward to incorporating what I learned from the workshop into my own study design.