
This is a draft. Some parts of this specification have not yet been implemented in the reference implementation of the problem format tool chain, which may mean that those parts are more in flux. Parts noted as deprecated are deprecated.

== Overview ==

This document describes the format of a Kattis problem package, used for distributing and sharing problems for algorithmic programming contests as well as educational use.

=== General Requirements ===

The package consists of a single directory containing the files described below or, alternatively, a ZIP compressed archive of the same files using the file extension <tt>.kpp</tt>. The name of the directory or the base name of the archive must consist solely of lower case letters a-z and digits 0-9.

All file names for files included in the package must match the following regexp:

 [a-zA-Z0-9][a-zA-Z0-9_.-]*[a-zA-Z0-9]

I.e., it must be of length at least 2, consist solely of lower or upper case letters a-z, A-Z, digits 0-9, period, dash or underscore, but must not begin or end with period, dash or underscore.

All text files for a problem must be UTF-8 encoded and not have a byte order mark.

All floating point numbers must be given as the external character sequences defined by IEEE 754-2008 and may use up to double precision.

=== Programs ===

There are a number of different kinds of programs that may be provided in the problem package: submissions, input validators, output validators, graders and generators. A program is always represented by a single file or directory; in other words, if a program consists of several files, these must be provided in a single directory. The name of the program, for the purpose of referring to it within the package, is the base name of the file or the name of the directory. There can't be two programs of the same kind with the same name.

Validators and graders, but not submissions, in the form of a directory may include two POSIX-compliant scripts "build" and "run". Either both or none of these scripts must be included. If the scripts are present, then:

* the program will be compiled by executing the build script.
* the program will be run by executing the run script.
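For example, a custom output validator provided as a directory might look like this (all file names here are hypothetical):

 output_validators/treecheck/build
 output_validators/treecheck/run
 output_validators/treecheck/checker.cc

where <tt>build</tt> compiles <tt>checker.cc</tt> and <tt>run</tt> executes the resulting binary.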

Programs without build and run scripts are built and run according to the language used. The language is determined by looking at the file endings. If the file endings do not determine a single language from the table below, building fails. In the case of Python 2 and 3, which share the same file ending, the language will be determined by looking at the shebang line, which must match the regular expressions in the table below.

For languages where there could be several entry points, the default entry point in the table below will be used.

{| class="wikitable"
! Code !! Language !! Default entry point !! File endings !! Shebang
|-
| c || C || || .c ||
|-
| cpp || C++ || || .cc, .cpp, .cxx, .c++, .C ||
|-
| csharp || C# || || .cs ||
|-
| go || Go || || .go ||
|-
| haskell || Haskell || || .hs ||
|-
| java || Java || Main || .java ||
|-
| javascript || JavaScript || main.js || .js ||
|-
| kotlin || Kotlin || MainKt || .kt ||
|-
| lisp || Common Lisp || main.{lisp,cl} || .lisp, .cl ||
|-
| objectivec || Objective-C || || .m ||
|-
| ocaml || OCaml || || .ml ||
|-
| pascal || Pascal || || .pas ||
|-
| php || PHP || main.php || .php ||
|-
| prolog || Prolog || || .pl ||
|-
| python2 || Python 2 || main.py || .py || Matches the regex "<tt>^#!.*python2 </tt>", and default if shebang does not match any other language
|-
| python3 || Python 3 || main.py || .py || Matches the regex "<tt>^#!.*python3 </tt>"
|-
| ruby || Ruby || || .rb ||
|-
| rust || Rust || || .rs ||
|-
| scala || Scala || || .scala ||
|}

=== Problem types ===

There are two types of problems: pass-fail problems and scoring problems. In pass-fail problems, submissions are basically judged as either accepted or rejected (though the "rejected" judgement is more fine-grained and divided into results such as "Wrong Answer", "Time Limit Exceeded", etc). In scoring problems, a submission that is accepted is additionally given a score, which is a numeric value (and the goal is to either maximize or minimize this value).

== Problem Metadata ==

Metadata about the problem (e.g., source, license, limits) are provided in a UTF-8 encoded YAML file named problem.yaml placed in the root directory of the package.

The keys are defined as below. Keys are optional unless explicitly stated. Any unknown keys should be treated as an error.

{| class="wikitable"
! Key !! Type !! Default !! Comments
|-
| name || String or map of strings || || Required. If a string this is the name of the problem in English. If a map the keys are language codes and the values are the name of the problem in that language. It is an error for a language to be missing if there exists a problem statement for that language.
|-
| type || String || pass-fail || One of "pass-fail" and "scoring".
|-
| author || String || || Who should get author credits. This would typically be the people that came up with the idea, wrote the problem specification and created the test data. This is sometimes omitted when authors choose to instead only give source credit, but both may be specified.
|-
| source || String || || Who should get source credit. This would typically be the name (and year) of the event where the problem was first used or created for.
|-
| source_url || String || || Link to page for source event. Must not be given if source is not.
|-
| license || String || unknown || License under which the problem may be used. Value has to be one of the ones defined below.
|-
| rights_owner || String || Value of author, if present, otherwise value of source. || Owner of the copyright of the problem. If not present, author is owner. If author is not present either, source is owner. Required if license is something other than "unknown" or "public domain". Forbidden if license is "public domain".
|-
| limits || Map with keys as defined below || see definition below ||
|-
| validation || String || default || One of "default" or "custom". If "custom", may be followed by some subset of "score" and "interactive", where "score" indicates that the validator produces a score (this is only valid for scoring problems), and "interactive" specifies that the validator is run interactively with a submission. For example, "custom interactive score".
|-
| validator_flags || String || || Will be passed as command-line arguments to each of the output validators.
|-
| scoring (previously grading, which is deprecated) || Map with keys as defined below || See definition below || Must only be used on scoring problems.
|-
| keywords || String or sequence of strings || || Set of keywords.
|-
| uuid || String || || UUID identifying the problem.
|-
| libraries || String or sequence of strings || || Set of libraries as defined below.
|-
| languages || String or sequence of strings || all || Set of languages or "all".
|}
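As an illustration, a <tt>problem.yaml</tt> for a hypothetical scoring problem might look as follows (all values are made up; see [[Sample problem.yaml]] for a complete example):

 name: Tower Building
 type: scoring
 author: Jane Doe
 source: Example Programming Contest 2019
 license: cc by-sa
 rights_owner: Jane Doe
 validation: custom score
 keywords: greedy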

=== license ===

Allowed values for license.

Values other than <tt>unknown</tt> or <tt>public domain</tt> require <tt>rights_owner</tt> to have a value.

{| class="wikitable"
! Value !! Comments !! Link
|-
| unknown || The default value. In practice means that the problem can not be used. ||
|-
| public domain || There are no known copyrights on the problem, anywhere in the world. || http://creativecommons.org/about/pdm
|-
| cc0 || CC0, "no rights reserved" || http://creativecommons.org/about/cc0
|-
| cc by || CC attribution || http://creativecommons.org/licenses/by/3.0/
|-
| cc by-sa || CC attribution, share alike || http://creativecommons.org/licenses/by-sa/3.0/
|-
| educational || May be freely used for educational purposes ||
|-
| permission || Used with permission. The author must be contacted for every additional use. ||
|}

=== limits ===

A map with the following keys:

{| class="wikitable"
! Key !! Comments !! Default !! Typical system default
|-
| time_multiplier || optional || 5 ||
|-
| time_safety_margin || optional || 2 ||
|-
| memory || optional, in MiB || system default || 2048
|-
| output || optional, in MiB || system default || 8
|-
| code || optional, in kiB || system default || 128
|-
| compilation_time || optional, in seconds || system default || 60
|-
| compilation_memory || optional, in MiB || system default || 2048
|-
| validation_time || optional, in seconds || system default || 60
|-
| validation_memory || optional, in MiB || system default || 2048
|-
| validation_output || optional, in MiB || system default || 8
|}

For most keys the system default will be used if nothing is specified. This can vary, but you SHOULD assume that it's reasonable. Only specify a limit when the problem needs a specific value, and in that case specify it explicitly even if the "typical system default" happens to be the value needed.
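For example, a problem whose intended solutions need an unusually large amount of memory might specify the following in <tt>problem.yaml</tt> (the value is hypothetical):

 limits:
   memory: 4096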

=== scoring ===

A map with the following keys:

{| class="wikitable"
! Key !! Type !! Default !! Comments
|-
| objective || String || max || One of "min" or "max" specifying whether it is a minimization or a maximization problem.
|-
| show_test_data_groups || boolean || false || Specifies whether test group results should be shown to the end user.
|}

=== libraries ===

A set of elements from the table below. A library will be available for the languages listed.

{| class="wikitable"
! Value !! Library !! Languages
|-
| gmp || GMP - The GNU Multiple Precision Arithmetic Library || C, C++
|-
| boost || Boost || C++
|}

=== languages ===

A space separated list of language codes from the table in the overview section, or <tt>all</tt>.

If a list is given, the problem may only be solved using those languages.

== Problem Statements ==

The problem statement of the problem is provided in the directory problem_statement/.

This directory must contain one file per language, for at least one language, named <tt>problem.<language>.<filetype></tt>, that contains the problem text itself, including input and output specifications, but not sample input and output. Language must be given as the shortest ISO 639 code; if needed, a hyphen and an ISO 3166-1 alpha-2 code may be appended to the ISO 639 code (previously an ISO 639-1 alpha-2 language code was required, which is deprecated). Optionally, the language code can be left out; the default is then English (<tt>en</tt>). Filetype can be <tt>tex</tt> for LaTeX, <tt>md</tt> for Markdown, or <tt>pdf</tt> for PDF.

Please note that many kinds of transformations on the problem statements, such as conversion to HTML or styling to fit in a single document containing many problems, will not be possible for PDF problem statements, so using this format should be avoided if at all possible.

Auxiliary files needed by the problem statement files must all be in <tt>problem_statement/</tt>; <tt>problem.<language>.<filetype></tt> should reference auxiliary files as if the working directory is <tt>problem_statement/</tt>. Supported image file formats are <tt>.png</tt>, <tt>.jpg</tt>, <tt>.jpeg</tt>, and <tt>.pdf</tt>.

A LaTeX file may include the problem name using the LaTeX command <tt>\problemname</tt> in case LaTeX formatting of the title is wanted. If it is not included, the problem name specified in <tt>problem.yaml</tt> will be used.
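For example, a <tt>problem.en.tex</tt> for the hypothetical problem above might begin:

 \problemname{Tower Building}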

The problem statements must only contain the actual problem statement, no sample data.

== Attachments ==

Public, i.e. non-secret, files to be made available in addition to the problem statement and sample test data are provided in the directory attachments/.

== Test data ==

If input generators are used, the files described here might not be present in the package as distributed. This section describes what must be the case after running the generators.

The test data are provided in subdirectories of data/. The sample data in data/sample/ and the secret data in data/secret/.

All input and answer files have the filename extensions <tt>.in</tt> and <tt>.ans</tt>, respectively.

=== Annotations ===

Optionally a hint, a description and an illustration file may be provided.

The hint file is a text file with filename extension <tt>.hint</tt> giving a hint for solving an input file. The hint file is meant to be given as feedback, i.e. to somebody who fails to solve the problem.

The description file is a text file with filename extension <tt>.desc</tt> describing the purpose of an input file. The description file is meant to be privileged information that explains the purpose of the related test file, e.g. what cases it's supposed to test.

The illustration is an image file with filename extension <tt>.png</tt>, <tt>.jpg</tt>, <tt>.jpeg</tt>, or <tt>.svg</tt>. The illustration is meant to be privileged information illustrating the related test file.

Input, answer, description, hint and image files are matched by the base name.
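For example, a fully annotated secret test case might consist of the following files (all names here are hypothetical):

 data/secret/007-maxsize.in
 data/secret/007-maxsize.ans
 data/secret/007-maxsize.desc
 data/secret/007-maxsize.hint
 data/secret/007-maxsize.png

all matched on the base name <tt>007-maxsize</tt>.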

=== Test Data Groups ===

The test data for the problem can be organized into a tree-like structure. Each node of this tree is represented by a directory and referred to as a test data group. Each test data group may consist of zero or more test cases (i.e., input-answer files) and zero or more subgroups of test data (i.e., subdirectories).

At the top level, the test data is divided into exactly two groups: sample and secret, but these two groups may be further split into subgroups as desired.

The result of a test data group is computed by applying a grader to all of the sub-results (test cases and subgroups) in the group. See Graders for more details.

Test files and groups will be used in lexicographical order on file base name. If a specific order is desired a numbered prefix such as 00, 01, 02, 03, and so on, can be used.

In each test data group, a file testdata.yaml may be placed to specify how the result of the test data group should be computed. If such a file is not provided for a test data group then the settings for the parent group will be used. The format of testdata.yaml is as follows:

{| class="wikitable"
! Key !! Type !! Default !! Comments
|-
| on_reject || String || break || One of "break" or "continue". Specifies how judging should proceed when a submission gets a non-Accept judgement on an individual test file or subgroup. If "break", judging proceeds immediately to grading. If "continue", judging continues judging the rest of the test files and subgroups within the group.
|-
| grading || String || default || One of "default" and "custom".
|-
| grader_flags || String || empty string || arguments passed to the grader for this test data group.
|-
| input_validator (previously input_validator_flags, which is deprecated) || String or map with the keys "name" and "flags" || empty string || If a string, this is the name of the input validator that will be used for this test data group. If a map, it gives the name as well as the flags that will be passed to the input validator.
|-
| output_validator (previously output_validator_flags, which is deprecated) || String or map with the keys "name" and "flags" || empty string || If a string, this is the name of the output validator that will be used for this test data group. If a map, it gives the name as well as the flags that will be passed to the output validator.
|-
| accept_score || String || 1 || Default score for accepted input files. May only be specified for scoring problems.
|-
| reject_score || String || 0 || Default score for rejected input files. May only be specified for scoring problems.
|-
| range || String || -inf +inf || Two numbers A and B ("inf", "-inf", "+inf" are allowed for plus/minus infinity) specifying the range of possible scores. May only be specified for scoring problems.
|}
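As an illustration, a <tt>testdata.yaml</tt> for a subgroup of a scoring problem might look like this (all values are hypothetical):

 on_reject: continue
 accept_score: 10
 range: 0 60
 grader_flags: min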

== Included Code ==

Code that should be included with all submissions is provided in one directory per supported language, called <tt>include/<language>/</tt>.

Before compilation, the files from the directory matching the language of the submission are copied in among the submission files, overwriting files from the submission in the case of a name collision. Language must be given as one of the language codes in the language table in the overview section. If any of the included files is supposed to be the main file (i.e. a driver), that file must have the language-dependent name given in that table.
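For example, a problem that ships a driver for Java and Python 3 submissions might contain (hypothetical layout):

 include/java/Main.java
 include/python3/main.py

Since these files use the default entry point names from the language table, they will act as the main files of the combined submissions.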

== Example Submissions ==

Correct and incorrect solutions to the problem are provided in subdirectories of submissions/. The possible subdirectories are:

{| class="wikitable"
! Value !! Requirement !! Comment
|-
| accepted || Accepted as a correct solution for all test files || At least one is required.
|-
| partially_accepted || Overall verdict must be Accepted. Overall score must not be max of range if objective is max and min of range if objective is min. || Must not be used for pass-fail problems.
|-
| wrong_answer || Wrong answer for some test file, but is not too slow and does not crash for any test file ||
|-
| time_limit_exceeded || Too slow for some test file. May also give wrong answer but not crash for any test file. ||
|-
| run_time_error || Crashes for some test file ||
|}

For <tt>accepted</tt> submissions to scoring problems, the expected score can be specified by including the string <tt>@EXPECTED_SCORE@:</tt> followed by the expected score somewhere in the source code, e.g. in a comment.

Every file or directory in these directories represents a separate solution. The same requirements as for submissions apply with regard to filenames. It is mandatory to provide at least one accepted solution.

Submissions must read input data from standard input, and write output to standard output.
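A minimal example: for a hypothetical "add two numbers" problem, an accepted submission stored as <tt>submissions/accepted/sum.py</tt> could be:

 #!/usr/bin/env python3
 import sys
 a, b = map(int, sys.stdin.read().split())
 print(a + b)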

== Input Validators ==

Input validators, for verifying the correctness of the input files, are provided in <tt>input_validators/</tt> (previously <tt>input_format_validators/</tt>, which is deprecated). Input validators can be specified as VIVA files (with the file ending <tt>.viva</tt>), as Checktestdata files (with the file ending <tt>.ctd</tt>), or as programs.

All input validators provided will be run on every input file. Validation fails if any validator fails.

=== Invocation ===

An input validator program must be an application (executable or interpreted) capable of being invoked with a command line call.

All input validators provided will be run on every test data file using the arguments specified for the test data group they are part of. Validation fails if any validator fails.

When invoked the input validator will get the input file on stdin.

The validator should be possible to use as follows on the command line:

 ./validator [arguments] < inputfile

=== Output ===

The input validator may output debug information on stdout and stderr. This information may be displayed to the user upon invocation of the validator.

=== Exit codes ===

The input validator must exit with code 42 on successful validation. Any other exit code means that the input file could not be confirmed as valid.

==== Dependencies ====

The validator MUST NOT read any files outside those defined in the Invocation section. Its result MUST depend only on these files and the arguments.
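As an illustration, a minimal input validator for a hypothetical problem whose input is a single integer between 1 and 1000 could look like this in Python 3:

 #!/usr/bin/env python3
 import re
 import sys
 
 data = sys.stdin.read()
 # Exactly one line: an integer without leading zeros, terminated by a newline.
 if not re.fullmatch(r'(0|[1-9][0-9]*)\n', data):
     sys.exit(1)  # any exit code other than 42 means the input is not confirmed valid
 if not 1 <= int(data) <= 1000:
     sys.exit(1)
 sys.exit(42)     # exit code 42 signals successful validation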

== Output Validators ==

Output Validators are used if the problem requires more complicated output validation than what is provided by the default diff variant described below. They are provided in output_validators/, and must adhere to the Output validator specification.

All output validators provided will be run on the output for every test data file using the arguments specified for the test data group they are part of. Validation fails if any validator fails.

=== Default Output Validator Specification ===

The default output validator is essentially a beefed-up diff. In its default mode, it tokenizes the files to compare and compares them token by token. It supports the following command-line arguments to control how tokens are compared.

{| class="wikitable"
! Arguments !! Description
|-
| case_sensitive || indicates that comparisons should be case-sensitive.
|-
| space_change_sensitive || indicates that changes in the amount of whitespace should be rejected (the default is that any sequence of 1 or more whitespace characters are equivalent).
|-
| float_relative_tolerance ε || indicates that floating-point tokens should be accepted if they are within relative error ≤ ε (see below for details).
|-
| float_absolute_tolerance ε || indicates that floating-point tokens should be accepted if they are within absolute error ≤ ε (see below for details).
|-
| float_tolerance ε || short-hand for applying ε as both relative and absolute tolerance.
|}

When supplying both a relative and an absolute tolerance, the semantics are that a token is accepted if it is within either of the two tolerances. When a floating-point tolerance has been set, any valid formatting of floating point numbers is accepted for floating point tokens. So for instance if a token in the answer file says <tt>0.0314</tt>, a token of <tt>3.14000000e-2</tt> in the output file would be accepted. If no floating point tolerance has been set, floating point tokens are treated just like any other token and have to match exactly.
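For example, a problem whose answers should be accepted within relative or absolute error 1e-6 could put the following in <tt>problem.yaml</tt>, since validator_flags are passed as command-line arguments to the output validator:

 validation: default
 validator_flags: float_tolerance 1e-6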

== Graders ==

Graders are programs that are given the sub-results of a test data group and aggregate a result for the group. They are provided in <tt>graders/</tt>.

For pass-fail problems, the grader will typically just set the verdict to accepted if all sub-results in the group were accepted, and otherwise select the "worst" error in the group (see below for the definition of "worst"), though it is possible to write a custom grader which e.g. accepts if at least half the sub-results are accepted. For scoring problems, one common grader behaviour would be to always set the verdict to Accepted, with the score being the sum of the scores of the items in the test group.

=== Invocation ===

A grader program must be an application (executable or interpreted) capable of being invoked with a command line call.

When invoked the grader will get the judgement for test files or groups on stdin and is expected to produce an aggregate result on stdout.

The grader should be possible to use as follows on the command line:

 ./grader [arguments] < judgeresults

On success, the grader must exit with exit code 0.

=== Input ===

A grader simply takes a list of results on standard input, and produces a single result on standard output. The input will have one line per test file, containing the result of judging the test file, using the code from the table below, followed by whitespace, followed by the score. (Format to be extended.)

{| class="wikitable"
! Code !! Meaning
|-
| AC || Accepted
|-
| WA || Wrong Answer
|-
| RTE || Run-Time Error
|-
| TLE || Time-Limit Exceeded
|}

The score is taken from the <tt>score.txt</tt> files produced by the output validator. If no <tt>score.txt</tt> exists, the score will be as defined by the <tt>accept_score</tt> and <tt>reject_score</tt> settings in <tt>testdata.yaml</tt>.
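For example, for a test data group with three test files where the first two are accepted, the grader might receive the following on standard input (the scores are hypothetical):

 AC 10
 AC 15
 WA 0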

=== Output ===

The grader must output the aggregate result on stdout in the same format as its input. Any other output, including no output, will result in a Judging Error.

For pass-fail problems, or for non-Accepted results on scoring problems, the score provided by the grader will always be ignored.

The grader may output debug information on stderr. This information may be displayed to the user upon invocation of the grader.
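As an illustration, a minimal custom grader that sums the scores and reports the first non-accepted verdict (similar in spirit to the default grader's first_error and sum modes described below) could look like this in Python 3:

 #!/usr/bin/env python3
 import sys
 
 verdict, score = 'AC', 0.0
 for line in sys.stdin:
     parts = line.split()
     if not parts:
         continue             # skip blank lines defensively
     code, points = parts
     if verdict == 'AC' and code != 'AC':
         verdict = code       # keep the first non-accepted verdict
     score += float(points)
 
 # Aggregate result in the same format as the input: verdict, whitespace, score.
 print(verdict, score)
 sys.exit(0)                  # exit code 0 signals success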

=== Default Grader Specification ===

The default grader has three different modes for aggregating the verdict -- <tt>worst_error</tt>, <tt>first_error</tt> and <tt>always_accept</tt> -- four different modes for aggregating the score -- <tt>sum</tt>, <tt>avg</tt>, <tt>min</tt>, <tt>max</tt> -- and two flags -- <tt>ignore_sample</tt> and <tt>accept_if_any_accepted</tt>. These modes can be set by providing their names as command line arguments (through the "grader_flags" option in [[#Test Data Groups|testdata.yaml]]). If multiple conflicting modes are given, the last one is used. Their meanings are as follows.

{| class="wikitable"
! Argument !! Type !! Description
|-
| <tt>worst_error</tt> (previously <tt>no_errors</tt>, which is deprecated) || verdict mode || Default. Verdict is accepted if all subresults are accepted, otherwise it is the first of JE, IF, RTE, MLE, TLE, OLE, WA that is the subresult of some item in the test case group. Note that in combination with the on_reject: break policy in testdata.yaml, the result will be the first error encountered.
|-
| <tt>first_error</tt> || verdict mode || Verdict is accepted if all subresults are accepted, otherwise it is the verdict of the first subresult with a non-accepted verdict. Please note that <tt>worst_error</tt> and <tt>first_error</tt> always give the same result if <tt>on_reject</tt> is set to <tt>break</tt>, and as such it is recommended to use the default.
|-
| <tt>always_accept</tt> || verdict mode || Verdict is always accepted.
|-
| <tt>sum</tt> || scoring mode || Default. Score is the sum of the input scores.
|-
| <tt>avg</tt> || scoring mode || Score is the average of the input scores.
|-
| <tt>min</tt> || scoring mode || Score is the minimum of the input scores.
|-
| <tt>max</tt> || scoring mode || Score is the maximum of the input scores.
|-
| <tt>ignore_sample</tt> || flag || Must only be used on the root level. The first subresult (sample) will be ignored, the second subresult (secret) will be used, both verdict and score.
|-
| <tt>accept_if_any_accepted</tt> || flag || Verdict is accepted if any subresult is accepted, otherwise as specified by the verdict aggregation mode.
|}
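For example, with <tt>grader_flags: min</tt> in <tt>testdata.yaml</tt>, an accepted group whose subresults score 30, 45 and 60 is given the score 30.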

== Generators ==

''Note: the generators section is a draft and has not been implemented in the reference implementation of the tool chain.''

Input generators are programs that generate input. They are provided in <tt>generators/</tt>.

=== Invocation ===

A generator program must be an application (executable or interpreted) capable of being invoked with a command line call.

The generators will be run with the test data directory (data/) as the working directory. The generator may read any existing files in that directory and should create any kind of test data file as defined in the test data section. The generator may not read or write anything outside the test data directory. The generators will be run in lexicographical order on name. If a specific order is desired a numbered prefix such as 00, 01, 02, 03, and so on, can be used.

The generators must be deterministic, i.e. always produce the same input file when given the same arguments.

The generators must be idempotent, i.e. running them multiple times should result in the same contents of the test data directory as running them once.

== See also ==

* [[Output validator]]
* [[Sample problem.yaml]]
* [[Problem format directory structure]]
* [[Problem Format Verification]]