Problem Format

This is a draft.

Overview

This document describes the format of a Kattis problem package, used for distributing and sharing problems for algorithmic programming contests as well as educational use.

General Requirements

The package consists of a single directory containing files as described below, or alternatively a ZIP compressed archive of the same files using the file extension .kpp. The name of the directory or the basename of the archive is called the short name of the problem. It must consist solely of lower case letters a-z and digits 0-9.

All file names for files included in the package must match the following regexp

[a-zA-Z0-9][a-zA-Z0-9_.-]*[a-zA-Z0-9]

I.e., a file name must be at least two characters long, may consist solely of lower or upper case letters a-z, A-Z, digits 0-9, period, dash and underscore, and must not begin or end with a period, dash or underscore.

All text files for a problem must be UTF-8 encoded and not have a byte order mark.

All floating point numbers must be given as the external character sequences defined by IEEE 754-2008 and may use up to double precision.

All programs (validators, graders or submissions) are represented by a single file or a directory. Validators and graders (but not submissions) provided as a directory may include two POSIX-compliant scripts, "build" and "run". Either both or neither of these scripts must be included. If the scripts are present, then:

  • the program must be compiled by executing the build script.
  • the program must be run by executing the run script.

Programs without these scripts will be compiled and run using the default system build and run script for the language in question.
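
As a hypothetical illustration (the file names, compiler and flags below are assumptions, not part of the specification), a C++ output validator packaged as a directory could contain a build and a run script along these lines:

build:

#!/bin/sh
# Compile the validator source that ships in the same directory.
g++ -O2 -o validator validator.cpp

run:

#!/bin/sh
# Forward all arguments (and stdin) to the compiled validator.
exec ./validator "$@"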

Problem types

There are two types of problems: pass-fail problems and scoring problems. In pass-fail problems, submissions are judged as either accepted or rejected (though the "rejected" judgement is more fine-grained and divided into results such as "Wrong Answer", "Time Limit Exceeded", etc.). In scoring problems, a submission that is accepted is additionally given a score, which is a numeric value (and the goal is to either maximize or minimize this value).

Problem Metadata

Metadata about the problem (e.g., source, license, limits) are provided in a UTF-8 encoded YAML file named <short_name>/problem.yaml.

The keys are defined as below. All keys are optional. Any unknown keys should be ignored.

Key Type Default Comments
uuid String UUID identifying the problem.
type String pass-fail One of "pass-fail" and "scoring".
author String or sequence of strings Who should get author credits. This would typically be the people that came up with the idea, wrote the problem specification and created the test data. This is sometimes omitted when authors choose to instead only give source credit, but both may be specified.
source String Who should get source credit. This would typically be the name (and year) of the event where the problem was first used or created for.
source-url String Link to page for source event. Should not be given if source is not.
license String unknown License under which the problem may be used. The value must be one of those defined below.
rights_owner String Value of author, if present, otherwise value of source. Owner of the copyright of the problem. If not present, author is owner. If author is not present either, source is owner.
keywords
limits Map with keys as defined below see definition below
validation String default One of "default" or "custom". If "custom", may be followed by some subset of "score" and "interactive", where "score" indicates that the validator produces a score (this is only valid for scoring problems), and "interactive" specifies that the validator is run interactively with a submission. For example, "custom interactive score". [Interactive is not implemented yet.]
validator_flags String will be passed as command-line arguments to each of the output validators
grading Map with keys as defined below see definition below
libraries [not available yet] String set of libraries (delimited by spaces) as defined below.
languages [not available yet] String all set of languages (delimited by spaces) or all
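
The following is a hypothetical problem.yaml illustrating some of the keys above (all values are made up for the example):

uuid: 550e8400-e29b-41d4-a716-446655440000
type: pass-fail
author: Jane Doe
source: Example Contest 2013
source-url: http://example.org/contest2013
license: cc by-sa
rights_owner: Jane Doe
validation: default
validator_flags: float_tolerance 1e-6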

license

Allowed values for license.

Values other than unknown or public domain require rights_owner to have a value.

Value Comments Link
unknown The default value. In practice means that the problem cannot be used.
public domain There are no known copyrights on the problem, anywhere in the world. http://creativecommons.org/about/pdm
cc0 CC0, "no rights reserved" http://creativecommons.org/about/cc0
cc by CC attribution http://creativecommons.org/licenses/by/3.0/
cc by-sa CC attribution, share alike http://creativecommons.org/licenses/by-sa/3.0/
educational May be freely used for educational purposes
permission Used with permission. The author must be contacted for every additional use.

limits

A map with the following keys:

Key Comments Default
time_multiplier optional 5
time_safety_margin optional 2
memory optional, in MB 2048
output optional, in MB 8
compilation_time optional, in seconds 60
validation_time optional, in seconds 60
validation_memory optional, in MB 2048
validation_output optional, in MB 8
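
As a hypothetical example, a problem that needs extra memory for validation but less for submissions could include the following in problem.yaml (the numbers are illustrative only):

limits:
  memory: 1024
  validation_memory: 4096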

grading

A map with the following keys:

Key Type Default Comments
on_reject String first_error One of "first_error", "worst_error" and "grade". Specifies how judging should proceed when a submission gets a non-Accept judgement on an individual test file.
accept_score String 1 Default score for accepted input files. Must only be used for scoring problems.
reject_score String 0 Default score for rejected input files. Must only be used for scoring problems.
objective String max One of "min" or "max" specifying whether it is a minimization or a maximization problem. Must only be used for scoring problems.
range String -inf +inf Two numbers A and B ("inf", "-inf", "+inf" are allowed for plus/minus infinity) specifying the range of possible scores. Must only be used for scoring problems.
first_error
The first non-accepted judgement will be the result for the entire submission. The order of test files is defined in the section on test data.
worst_error
The first error among RTE, TLE and WA, in that order, that is the result of some test file will be the result for the entire submission.
grade
The result of all test files will be sent to the grader that will determine the score.
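
A hypothetical grading section for a maximization scoring problem in which the results of all test files are passed to a grader could look as follows (the values are illustrative only):

grading:
  on_reject: grade
  objective: max
  accept_score: 100
  reject_score: 0
  range: 0 100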

libraries

A space separated list from below. A library will be available for the languages listed.

Value Library Languages
gmp GMP - The GNU Multiple Precision Arithmetic Library C, C++
boost Boost C++

languages

A space separated list of language codes from the table below, or all.

If a list is given, the problem may only be solved using those languages.

Code Language Default main file
c C
cpp C++
csharp C#
go Go
java Java Main.java
objectivec Objective-C
pascal Pascal
python2 Python 2 main.py
python3 Python 3 main.py

Problem Statements

The problem statement of the problem is provided in the directory <short_name>/problem_statement/.

This directory must contain one LaTeX file per language, named problem.<language>.tex, that contains the problem text itself, including input and output specifications, but not sample input and output. The language must be given as an ISO 639-1 alpha-2 language code. Optionally, the language code can be left out; the default is then English.

Files needed by this file must all be in <short_name>/problem_statement/, and problem.<language>.tex should reference auxiliary files as if the working directory is <short_name>/problem_statement/.

The LaTeX file must include the problem name using the LaTeX command \problemname. If the problem name contains special LaTeX code (for instance math or escaped characters), a "plain" version of the problem name must be specified by a comment of the form

%% plainproblemname: Problem Name

somewhere in the file.

The problem statements must only contain the actual problem statement, no sample data (the .tex file is included in a template that adds these). A typical problem.<language>.tex looks as follows:

\problemname{Problem Name}
%% plainproblemname: Problem Name

<Problem statement here>

\section*{Input}

<Input description here>

\section*{Output}

<Output description here>

Test data

The test data are provided in subdirectories of <short_name>/data/. The sample data are placed in <short_name>/data/sample/ and the secret data in <short_name>/data/secret/.

Input and answer files have the filename extensions .in and .ans, respectively.

Test files will be used in lexicographical order on file name. If a specific order is needed a numbered prefix such as 00, 01, 02, 03, and so on, can be used.

Annotations

Optionally, a description file, a hint file, or both may be present. The description file is a text file with filename extension .desc describing the purpose of an input file. The hint file is a text file with filename extension .hint giving a hint for solving an input file. The description file is meant to be privileged information, whereas the hint file is meant to be given to anybody who needs it, i.e., fails to solve the problem. The hint file might not be used at all, depending on how the problem is used, e.g., when used in a programming contest.

Input, answer, description and hint files are matched by the base name.
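
As a hypothetical illustration (the file names are examples only), a data directory could look as follows, with description and hint files matched to their input files by base name:

data/
  sample/
    01.in
    01.ans
  secret/
    01-small.in
    01-small.ans
    01-small.desc
    02-large.in
    02-large.ans
    02-large.hint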

Test Data Groups

For scoring problems, the set of all the test data for the problem can be organized into a tree-like structure. Each node of this tree is represented by a directory and referred to as a test data group. Each test data group may consist of zero or more test cases (i.e., input-answer files) and zero or more subgroups of test data (i.e., subdirectories).

At the top level, the test data is divided into exactly two groups: sample and secret, but these two groups may be further split into subgroups as desired.

The result of a test data group is computed by applying a grader to all of the items (test cases and subgroups) in the group. For instance, if a group specifies that the "sum" grader should be used, the result of that group will always be accepted with a score that is the sum of the scores of the items in that group.

In each test data group, a file testdata.yaml may be placed to specify how the result of the test data group should be computed. If such a file is not provided for a test data group then the settings for the parent group will be used. The format of testdata.yaml is as follows:

Key Type Default Comments
grading String default One of "default" and "custom".
grader_flags String empty string arguments passed to the grader for this test data group.
input_validator_flags String empty string arguments passed to the input validator for this test data group.
output_validator_flags String empty string arguments passed to the output validator for this test data group.
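
A hypothetical testdata.yaml for a group whose score is the minimum of its items and whose output is checked with a floating-point tolerance could look as follows (the flag values are illustrative only):

grading: default
grader_flags: min
output_validator_flags: float_tolerance 1e-6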

Included Code

Code that should be included with all submissions is provided in one directory per supported language, called <short_name>/include/<language>/.

Before compilation, the files from the directory matching the language of the submission are copied in among the submission files, overwriting files from the submission in the case of a name collision. Language must be given as one of the language codes in the language table in the metadata section. If any of the included files is supposed to be the main file (i.e., a driver), that file must have the language-dependent name given in the table referred to above.
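
As a hypothetical example, a problem shipping drivers for Java and Python 3 could contain the following, where Main.java and main.py are the language-dependent main file names from the table above:

include/
  java/
    Main.java
  python3/
    main.py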

Example Submissions

Correct and incorrect solutions to the problem are provided in subdirectories of <short_name>/submissions/. The possible subdirectories are:

Value Requirement Comment
accepted Accepted as a correct solution for all test files At least one is required.
wrong_answer Wrong answer for some test file, but is not too slow and does not crash for any test file
time_limit_exceeded Too slow for some test file. May also give wrong answer but not crash for any test file.
run_time_error Crashes for some test file

For scoring problems, the expected score of an accepted submission can be specified by including the string @EXPECTED_SCORE@: followed by the expected score somewhere in the source code, e.g. in a comment.
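
For instance, an accepted Python submission might contain a comment such as the following (the value is only an example):

# @EXPECTED_SCORE@: 100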

Every file or directory in these directories represents a separate solution. The same requirements as for submissions apply with regard to file names. It is mandatory to provide at least one accepted solution.

Submissions must read input data from standard input, and write output to standard output.

Input Validators

Input Validators, for verifying the correctness of the input files, are provided in <short_name>/input_format_validators/.

Invocation

An input validator program must be an application (executable or interpreted) capable of being invoked with a command line call.

All input validators provided will be run on every test data file using the arguments specified for the test data group they are part of. Validation fails if any validator fails.

When invoked, the input validator will get the input file on stdin.

It should be possible to use the validator as follows on the command line:

 ./validator [arguments] < inputfile

Output

The input validator may output debug information on stdout and stderr. This information may be displayed to the user upon invocation of the validator.

Exit codes

The input validator must exit with code 42 on successful validation. Any other exit code means that the input file could not be confirmed as valid.
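
Below is a minimal sketch of an input validator written in Python, for a hypothetical input format consisting of a single line with one integer n, 1 <= n <= 1000 (the format and bounds are assumptions for the example):

import re
import sys

data = sys.stdin.read()
# Require exactly one line containing a non-negative integer without leading zeros.
if not re.fullmatch(r'(0|[1-9][0-9]*)\n', data):
    sys.exit(43)   # any exit code other than 42 means the input was not confirmed valid
n = int(data)
if not 1 <= n <= 1000:
    sys.exit(43)
sys.exit(42)       # 42 signals successful validation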

Output Validators

Output Validators are used if the problem requires more complicated output validation than what is provided by the default diff variant described below. They are provided in <short_name>/output_validators/, and must adhere to the Output validator specification.

All output validators provided will be run on the output for every test data file using the arguments specified for the test data group they are part of. Validation fails if any validator fails.

Default Output Validator Specification

The default output validator is essentially a beefed-up diff. In its default mode, it tokenizes the files to compare and compares them token by token. It supports the following command-line arguments to control how tokens are compared.

Arguments Description
case_sensitive indicates that comparisons should be case-sensitive.
space_change_sensitive indicates that changes in the amount of whitespace should be rejected (the default is that any sequence of one or more whitespace characters is treated as equivalent).

float_relative_tolerance ε indicates that floating-point tokens should be accepted if they are within relative error ≤ ε (see below for details).
float_absolute_tolerance ε indicates that floating-point tokens should be accepted if they are within absolute error ≤ ε (see below for details).
float_tolerance ε short-hand for applying ε as both relative and absolute tolerance.

When supplying both a relative and an absolute tolerance, the semantics are that a token is accepted if it is within either of the two tolerances. When a floating-point tolerance has been set, any valid formatting of floating point numbers is accepted for floating point tokens. So for instance, if a token in the answer file says 0.0314, a token of 3.14000000e-2 in the output file would be accepted. If no floating point tolerance has been set, floating point tokens are treated just like any other token and have to match exactly.
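
The following Python sketch illustrates the acceptance rule for a single floating-point token when both tolerances are set; taking the relative error relative to the answer-file value is an assumption of this sketch:

def float_token_accepted(answer, output, abs_tol, rel_tol):
    diff = abs(answer - output)
    # Accept if the difference is within either the absolute or the relative tolerance.
    return diff <= abs_tol or diff <= rel_tol * abs(answer)

# The example from the text: 0.0314 in the answer file matches 3.14000000e-2 in the output.
print(float_token_accepted(0.0314, float("3.14000000e-2"), 1e-6, 1e-6))  # True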

Graders

Graders are programs that are given the results of judging individual test files and calculate an aggregate result. They are provided in <short_name>/graders/.

Invocation

A grader program must be an application (executable or interpreted) capable of being invoked with a command line call.

When invoked, the grader will get the judgement for test files or groups on stdin and is expected to produce an aggregate result on stdout.

It should be possible to use the grader as follows on the command line:

./grader [arguments] < judgeresults

Input

A grader simply takes a list of results on standard input, and produces a single result on standard output. The input will have one line per test file, containing the result of judging that test file, using the code from the table below, followed by whitespace, followed by the score.

Code Meaning
AC Accepted
WA Wrong Answer
RTE Run-Time Error
TLE Time-Limit Exceeded

The score is taken from the score.txt files produced by the output validator. If no score.txt exists, the score will be as defined by the grading accept_score and reject_score settings from problem.yaml.

Output

The grader must output the aggregate result on stdout. The output must be one of the codes listed in the table above followed by a number. Any other output, including no output, will result in a Judging Error.

Note that in principle a grader can get as input (and produce as output) a non-Accepted result with a non-zero score. However, for the final result of a submission, the score will be ignored for non-Accepted results.

The grader may output debug information on stderr. This information may be displayed to the user upon invocation of the grader.

Default Grader Specification

The default grader always gives the verdict AC, regardless of input results. In its default mode, the score is the sum of the input scores, but this can be changed by providing one of "sum", "avg", "max", "min" as a command-line argument (through the "grader_flags" option in testdata.yaml).

Argument Description
sum score is sum of input scores (default).
avg score is average of input scores.
min score is minimum of input scores.
max score is maximum of input scores.
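
The following is a minimal Python sketch of a grader following the protocol above and mimicking the behaviour of the default grader; reading the aggregation mode from the first command-line argument is an assumption of this sketch, not part of the specification:

import sys

mode = sys.argv[1] if len(sys.argv) > 1 else "sum"
scores = []
for line in sys.stdin:
    parts = line.split()
    if not parts:
        continue
    # Each line holds a result code (AC, WA, RTE, TLE) followed by a score.
    scores.append(float(parts[1]) if len(parts) > 1 else 0.0)

aggregate = {
    "sum": sum,
    "avg": lambda s: sum(s) / len(s) if s else 0.0,
    "min": lambda s: min(s) if s else 0.0,
    "max": lambda s: max(s) if s else 0.0,
}[mode](scores)

# Like the default grader, always give the verdict AC.
print("AC", aggregate)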

Generators

See also