Craig Thomson
Dublin City University / ADAPT, University of Aberdeen
Verified email at dcu.ie
Title · Cited by · Year
A gold standard methodology for evaluating accuracy in data-to-text systems
C Thomson, E Reiter
arXiv preprint arXiv:2011.03992, 2020
60 · 2020
Missing information, unresponsive authors, experimental flaws: The impossibility of assessing the reproducibility of previous human evaluations in NLP
A Belz, C Thomson, E Reiter, G Abercrombie, JM Alonso-Moral, M Arvan, ...
arXiv preprint arXiv:2305.01633, 2023
53* · 2023
Underreporting of errors in NLG output, and what to do about it
E Van Miltenburg, MA Clinciu, O Dušek, D Gkatzia, S Inglis, L Leppänen, ...
arXiv preprint arXiv:2108.01182, 2021
40 · 2021
SportSett: Basketball - a robust and maintainable data-set for natural language generation
C Thomson, E Reiter, S Sripada
Proceedings of the Workshop on Intelligent Information Processing and …, 2020
28 · 2020
Non-repeatable experiments and non-reproducible results: The reproducibility crisis in human evaluation in NLP
A Belz, C Thomson, E Reiter, S Mille
Findings of the Association for Computational Linguistics: ACL 2023, 3676-3687, 2023
23 · 2023
Evaluating factual accuracy in complex data-to-text
C Thomson, E Reiter, B Sundararajan
Computer Speech & Language 80, 101482, 2023
22 · 2023
Generation challenges: Results of the accuracy evaluation shared task
C Thomson, E Reiter
arXiv preprint arXiv:2108.05644, 2021
22 · 2021
Common flaws in running human evaluation experiments in NLP
C Thomson, E Reiter, A Belz
Computational Linguistics 50 (2), 795-805, 2024
19 · 2024
The 2024 ReproNLP shared task on reproducibility of evaluations in NLP: Overview and results
A Belz, C Thomson
Proceedings of the Fourth Workshop on Human Evaluation of NLP Systems …, 2024
17 · 2024
GEMv2: Multilingual NLG benchmarking in a single line of code
S Gehrmann, A Bhattacharjee, A Mahendiran, A Wang, A Papangelis, ...
arXiv preprint arXiv:2206.11249, 2022
16 · 2022
The 2023 WebNLG shared task on low resource languages: Overview and evaluation results (WebNLG 2023)
L Cripwell, A Belz, C Gardent, A Gatt, C Borg, M Borg, J Judge, M Lorandi, ...
Proceedings of the Workshop on Multimodal, Multilingual Natural Language …, 2023
12 · 2023
Barriers and enabling factors for error analysis in NLG research
E Van Miltenburg, M Clinciu, O Dušek, D Gkatzia, S Inglis, L Leppänen, ...
Northern European Journal of Language Technology 9 (1), 2023
10 · 2023
Shared task on evaluating accuracy
E Reiter, CA Thomson
10 · 2020
Comprehension driven document planning in natural language generation systems
C Thomson, E Reiter, S Sripada
Proceedings of The 11th International Natural Language Generation Conference, 2018
7 · 2018
Studying the impact of filling information gaps on the output quality of neural data-to-text
CA Thomson, Z Zhao, SG Sripada
5 · 2020
QCET: An interactive taxonomy of quality criteria for comparable and repeatable evaluation of NLP systems
A Belz, S Mille, C Thomson, R Huidrom
Proceedings of the 17th International Natural Language Generation Conference …, 2024
3 · 2024
(Mostly) Automatic Experiment Execution for Human Evaluations of NLP Systems
C Thomson, A Belz
Proceedings of the 17th International Natural Language Generation Conference …, 2024
2 · 2024
HEDS 3.0: The human evaluation data sheet version 3.0
A Belz, C Thomson
arXiv preprint arXiv:2412.07940, 2024
1 · 2024
Filling Gaps in Wikipedia: Leveraging Data-to-Text Generation to Improve Encyclopedic Coverage of Underrepresented Groups
S Mille, M Pronesti, C Thomson, M Lorandi, S Fitzpatrick, R Huidrom, ...
Proceedings of the 17th International Natural Language Generation Conference …, 2024
1 · 2024
Proceedings of the 3rd Workshop on Human Evaluation of NLP Systems
A Belz, M Popović, E Reiter, C Thomson, J Sedoc
Proceedings of the 3rd Workshop on Human Evaluation of NLP Systems, 2023
1 · 2023