11 resources
-
Ehud Reiter, Anja Belz | Dec 24th, 2009 | journalArticle
There is growing interest in using automatically computed corpus-based evaluation metrics to evaluate Natural Language Generation (NLG) systems, because these are often considerably cheaper than the human-based evaluations which have traditionally been used in NLG. We review previous work on NLG evaluation and on validation of automatic metrics in NLP, and then present the results of two studies of how well some metrics which are popular in other areas of NLP (notably BLEU and ROUGE)...
-
Cynthia Lee, Kelvin C.K. Wong, William K... | Feb 24th, 2009 | journalArticle
-
Klaus Zechner, Derrick Higgins, Xiaoming... | Oct 1st, 2007 | journalArticle
-
Anat Ben-Simon, Randy Elliot Bennett | Apr 24th, 2007 | journalArticle
This study evaluated a “substantively driven” method for scoring NAEP writing assessments automatically. The study used variations of an existing commercial program, e-rater®, to compare the performance of three approaches to automated essay scoring: a brute-empirical approach in which variables are selected and weighted solely according to statistical criteria, a hybrid approach in which a fixed set of variables more closely tied to the characteristics of good writing was used but the...
-
Joseph P. Turian, Luke Shen, I. Dan Mela... | Jan 1st, 2006 | conferencePaper
Evaluation of MT evaluation measures is limited by inconsistent human judgment data. Nonetheless, machine translation can be evaluated using the well-known measures precision, recall, and their average, the F-measure. The unigram-based F-measure has significantly higher correlation with human judgments than recently proposed alternatives. More importantly, this standard measure has an intuitive graphical interpretation, which can facilitate insight into how MT systems might be improved. The...
-
Alon Lavie, Kenji Sagae, Shyamsundar Jay... | Apr 24th, 2004 | bookSection
-
Chin-Yew Lin, Franz Josef Och | Apr 24th, 2004 | conferencePaper
-
Paul Deane, Kathleen M. Sheehan | Apr 24th, 2003 | conferencePaper
-
George Doddington | Apr 24th, 2002 | conferencePaper
-
Lawrence M. Rudner, T. Liang | Apr 24th, 2002 | conferencePaper
-
Kishore Papineni, Salim Roukos, Todd War... | Apr 24th, 2001 | conferencePaper