Main EvallWeb Portlet

The classification task uses as input a 3 column tsv format without headers, where the first column represents the TEST CASE, the second column represents the ID of the item and the third column represents the VALUE assigned to the item. Find an example here.

Note that, in the classification input, duplicate item ids within the same TEST CASE are not allowed. Similarly, empty values and rows with a different number of columns are not permitted. Violating these restrictions produces warnings when parsing the output file (the evaluation can continue but might not be reliable) and errors when parsing the goldstandard (the process stops until the errors are solved).
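The restrictions above can be checked before uploading a file. The following is only a minimal sketch, not EvALL's own parser: the function name `check_classification_rows` and the problem messages are hypothetical, and it assumes the file content is already loaded as a string.

```python
def check_classification_rows(text):
    """Return (line_number, message) pairs for rows that break the
    3 column classification format. Hypothetical helper, not part of EvALL."""
    problems = []
    seen = set()  # (test case, id) pairs already read
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split("\t")
        if len(fields) != 3:
            problems.append((line_number, "expected 3 columns"))
            continue
        if any(field.strip() == "" for field in fields):
            problems.append((line_number, "empty value"))
            continue
        test_case, item_id, _value = fields
        if (test_case, item_id) in seen:
            problems.append((line_number, "duplicate id in test case"))
        seen.add((test_case, item_id))
    return problems


sample = "t1\tid1\tpositive\nt1\tid2\tnegative\nt1\tid1\tnegative\nt2\tid1\n"
for problem in check_classification_rows(sample):
    print(problem)
```

In this sample, the third row repeats `id1` inside test case `t1` and the fourth row has only two columns, so both are reported.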

In the classification task, EvALL also allows as input a TWO COLUMN format without test case. Find more here.

The ranking task uses a 2 column tsv format without headers, where the first column represents the TEST CASE and the second column represents the ID of the item. The order of the rows in the output file will be interpreted as the ranking of the items. Find an example here.

As in the classification format, duplicate item ids within the same test case are not allowed. In addition, empty values and rows with a different number of columns are not permitted. Again, violating these restrictions produces warnings when parsing the output file (the evaluation can continue but might not be reliable).
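To illustrate how row order is read as the ranking, here is a small sketch that groups a 2 column output file by test case while preserving the order of the rows. The name `read_ranking` is hypothetical and the sketch assumes a well-formed file; it is not EvALL's implementation.

```python
def read_ranking(text):
    """Map each test case to its item ids, in the order the rows appear.
    Hypothetical helper; assumes every row has exactly 2 tab-separated columns."""
    rankings = {}
    for line in text.splitlines():
        test_case, item_id = line.split("\t")
        # Earlier rows are ranked first, so appending preserves the ranking.
        rankings.setdefault(test_case, []).append(item_id)
    return rankings


output_file = "q1\td3\nq1\td1\nq2\td7\n"
rankings = read_ranking(output_file)
print(rankings["q1"])
```

Here `d3` is ranked above `d1` for test case `q1` simply because its row comes first.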

In contrast, the goldstandard format for ranking should contain 3 columns in a tsv format, indicating the TEST CASE, the ID and the RELEVANCE, respectively. Note that the RELEVANCE should be represented as a positive numeric value. Higher values are ranked first. Violations of these restrictions when parsing the goldstandard will stop the evaluation until they are solved. Find an example here.
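The goldstandard rules for ranking can be sketched as follows. The names `read_ranking_gold` and `ideal_order` are hypothetical, and reading "positive" as strictly greater than zero is an assumption of this sketch. It raises an exception on the first invalid row, mirroring the fact that goldstandard errors stop the evaluation.

```python
def read_ranking_gold(text):
    """Parse TEST CASE, ID, RELEVANCE rows into {test case: {id: relevance}}.
    Hypothetical helper; raises ValueError on the first invalid row."""
    gold = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(f"line {line_number}: expected 3 columns")
        test_case, item_id, raw_relevance = fields
        try:
            relevance = float(raw_relevance)
        except ValueError:
            raise ValueError(f"line {line_number}: RELEVANCE is not numeric") from None
        # Assumption: "positive" is read as strictly greater than zero.
        if not relevance > 0:
            raise ValueError(f"line {line_number}: RELEVANCE must be positive")
        items = gold.setdefault(test_case, {})
        if item_id in items:
            raise ValueError(f"line {line_number}: duplicate id in test case")
        items[item_id] = relevance
    return gold


def ideal_order(relevances):
    """Return the ids sorted so that higher relevance values come first."""
    return sorted(relevances, key=relevances.get, reverse=True)


gold = read_ranking_gold("q1\td1\t1\nq1\td2\t3\nq1\td3\t2\n")
print(ideal_order(gold["q1"]))
```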

In the ranking task, EvALL also allows as input a TREC EVAL STYLE format. Find more here.

The clustering task uses as input a 3 column tsv without headers, where the first column represents the TEST CASE, the second column represents the ID of the item and the third column represents the CLUSTER NAME assigned to the item. Find an example here.

Notice that, in the clustering input, duplicate ids of items with different CLUSTER NAME at TEST CASE level are allowed. Duplicate ids with the same CLUSTER NAME at TEST CASE level are not permitted. Similarly, empty values or different number of columns are not permitted.

These restrictions will produce warnings when parsing the output file (the evaluation can continue but might not be reliable). These same restrictions will produce errors when parsing the goldstandard (the process will stop until errors are solved).
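The clustering duplicate rule differs from the classification one: the same id may appear several times in a test case as long as each row names a different cluster. A minimal sketch of that check, using a hypothetical function `check_clustering_rows` and hypothetical messages, could look like this:

```python
def check_clustering_rows(text):
    """Return (line_number, message) pairs for rows that break the
    3 column clustering format. Hypothetical helper, not part of EvALL."""
    problems = []
    seen = set()  # (test case, id, cluster name) triples already read
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split("\t")
        if len(fields) != 3:
            problems.append((line_number, "expected 3 columns"))
            continue
        if any(field.strip() == "" for field in fields):
            problems.append((line_number, "empty value"))
            continue
        key = tuple(fields)
        # Only an exact repeat of test case, id and cluster name is a problem.
        if key in seen:
            problems.append((line_number, "duplicate id with the same cluster name"))
        seen.add(key)
    return problems


sample = "t1\ta\tc1\nt1\ta\tc2\nt1\ta\tc1\n"
print(check_clustering_rows(sample))
```

The second row is accepted because item `a` is assigned to a different cluster (`c2`), while the third row repeats the first one exactly and is reported.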

The diversification task uses a 2 column tsv format without headers, where the first column represents the TEST CASE and the second column represents the ID of the item. The order of the rows in the output file will be interpreted as the ranking of the items. Find an example here.

Duplicate item ids within the same test case are not allowed. In addition, empty values and rows with a different number of columns are not permitted. Violating these restrictions produces warnings when parsing the output file (the evaluation can continue but might not be reliable).

In contrast, the goldstandard format for diversification must contain 5 columns in a tsv format, indicating the TEST CASE, the ID, the RELEVANCE, the ASPECT TAG and the ASPECT WEIGHT, respectively. Note that the RELEVANCE must be represented as a positive numeric value. Higher values are ranked first. Similarly, the ASPECT WEIGHT must be a positive numeric value. In the diversification goldstandard, duplicate item ids with different ASPECT TAG values within the same TEST CASE are allowed. Finally, the ASPECT WEIGHT must be the same for all items with the same ASPECT TAG within a TEST CASE. Violations of these restrictions when parsing the goldstandard will stop the evaluation until they are solved. Find an example here.
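The ASPECT WEIGHT consistency rule is the least obvious of these restrictions, so a sketch may help. The function `check_aspect_weights` and its messages are hypothetical, and treating "positive" as strictly greater than zero is an assumption. The sketch only covers the column count and the ASPECT WEIGHT rules, not the RELEVANCE or duplicate checks.

```python
def check_aspect_weights(text):
    """Return (line_number, message) pairs for rows of a 5 column
    diversification goldstandard whose ASPECT WEIGHT is invalid or
    inconsistent. Hypothetical helper, not part of EvALL."""
    problems = []
    weights = {}  # (test case, aspect tag) -> first weight seen
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split("\t")
        if len(fields) != 5:
            problems.append((line_number, "expected 5 columns"))
            continue
        test_case, _item_id, _relevance, aspect_tag, raw_weight = fields
        try:
            weight = float(raw_weight)
        except ValueError:
            problems.append((line_number, "ASPECT WEIGHT is not numeric"))
            continue
        # Assumption: "positive" is read as strictly greater than zero.
        if not weight > 0:
            problems.append((line_number, "ASPECT WEIGHT must be positive"))
            continue
        first_weight = weights.setdefault((test_case, aspect_tag), weight)
        if first_weight != weight:
            problems.append((line_number, "ASPECT WEIGHT differs for the same ASPECT TAG"))
    return problems


sample = "q1\td1\t2\ta1\t0.5\nq1\td1\t1\ta2\t0.3\nq1\td2\t1\ta1\t0.7\n"
print(check_aspect_weights(sample))
```

In the sample, item `d1` legitimately appears twice with different aspect tags, but the third row gives aspect `a1` a weight of 0.7 after it was first declared with 0.5, so that row is reported.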

In the diversification task, EvALL also allows as input a TREC EVAL STYLE format, but only for the output files, not for goldstandard files. Find more here.