Difference between revisions of "Problem Format"

Revision as of 15:39, 18 July 2013

This is a draft.

Overview

This document describes the Kattis problem format, used for distributing and sharing problems for algorithmic programming contests as well as educational use.

All problems must have a "short name" consisting solely of lower case letters a-z and digits 0-9. All files related to a given problem are provided in a directory named after the short name of the problem.

All file names must match the following regexp

[a-zA-Z0-9][a-zA-Z0-9_.-]*[a-zA-Z0-9]

I.e., it must be of length at least 2, consist solely of lower or upper case letters a-z, A-Z, digits 0-9, period, dash or underscore, but must not begin or end with period, dash or underscore.

All text files for a problem must be UTF-8 encoded and not have a byte order mark.

Problem Metadata

Metadata about the problem (e.g., source, license, limits) are provided in a UTF-8 encoded YAML file named <short-name>/problem.yaml.

The keys are defined as below. All keys are optional. Any unknown keys should be ignored.

Key	Type	Default	Comments
author	String or sequence of strings		Who should get author credits. This would typically be the people that came up with the idea, wrote the problem specification and created the test data. This is sometimes omitted when authors choose to instead only give source credit, but both may be specified.
source	String		Who should get source credit. This would typically be the name (and year) of the event where the problem was first used or created for.
source-url	String		Link to page for source event. Should not be given if source is not.
license	String	unknown	License under which the problem may be used. Value have to be one of the ones defined below.
rights_owner	String	Value of author, if present, otherwise value of source.	Owner of the copyright of the problem. If not present, author is owner. If author is not present either, source is owner.
keywords
limits	Map with keys as defined below	see definition below
validator	String		set of values from below (delimited by spaces)
rejecthandling	String	first_error	One of "first_error", "worst_error" and "no_score" for some value X.
scoringtype	String		One of "min" or "max" optionally followed by two numbers A and B giving the range of the scores
libraries [not available yet]	String		set of libraries (delimited by spaces) as defined below.
languages [not available yet]	String	all	set of languages (delimited by spaces) or all

license

Allowed values for license.

Values other than unknown or public domain requires rights_owner to have a value.

Value	Comments	Link
unknown	The default value. In practice means that the problem can not be used.
public domain	There are no known copyrights on the problem, anywhere in the world.	http://creativecommons.org/about/pdm
cc0	CC0, "no rights reserved"	http://creativecommons.org/about/cc0
cc by	CC attribution	http://creativecommons.org/licenses/by/3.0/
cc by-sa	CC attribution, share alike	http://creativecommons.org/licenses/by-sa/3.0/
educational	May be freely used for educational purposes
permission	Used with permission. The author must be contacted for every additional use.

limits

A map with the following keys:

Key	Comments	Default
time_multiplier	optional	5
time_safety_margin	optional	2
memory	optional, in Mb	2048
output	optional, in Mb	8
compilation_time	optional, in seconds	60
validation_time	optional, in seconds	60
validation_memory	optional, in Mb	2048
validation_output	optional, in Mb	8

libraries

A space separated list from below. A library will be available for the languages listed.

Value	Library	Languages
gmp	GMP - The GNU Multiple Precision Arithmetic Library	C, C++
boost	Boost	C++

validator

A space separated list from the following:

Value	Comments
case_sensitive	upper/lower case differences are significant. If this parameter is not specified, any changes in case are to be ignored.
space_change_sensitive	whitespace differences are significant. If this parameter is not specified, any non-zero amounts of whitespace are considered identical.
float_relative_tolerance X	accepts token if it is a floating point number and the relative error is <= X
float_absolute_tolerance X	accepts token if it is a floating point number and the absolute error is <= X
float_tolerance X	accepts token if either "float_relative_tolerance X" or "float_absolute_tolerance X" would accept
custom	use a custom output validator. Must be the first value. All following values will be passed as command-line arguments to each of the output validators

Problem Statements

The problem statement of the problem is provided in the directory <short_name>/problem_statement/.

This directory must contain one LaTeX file per language, named problem.<language>.tex, that contains the problem text itself, including input and output specifications, but not sample input and output. Language must be given as an ISO 639-1 alpha-2 language code. Optionally, the language code can be left out, the default is then English.

A template will be provided that \imports this file as well as the sample input and output. The format of the problem statement is described in Problem Statement Template. Files needed by this file must all be in <short_name>/problem_statement/ , problem.tex should reference auxiliary files as if the working directory is <short_name>/problem_statement/.

Test data

Basic usage

The test data are provided in subdirectories of <short_name>/data/. The sample data in <short_name>/data/sample/ and the secret data in <short_name>/data/secret/.

All input and answer files have the filename extension .in and .ans respectively.

Test files will be used in lexicographical order on file name. If a specific order is needed a numbered prefix such as 00, 01, 02, 03, and so on, can be used.

Advanced usage: test data groups

The set of all the test data for the problem is organized into a tree-like structure. Each node of this tree is represented by a directory and referred to as a test data group. Each test data group may consist of zero or more test cases (i.e., input-answer files) and zero or more subroups of test data (i.e., subdirectories).

At the top level, the test data is divided into exactly two groups: sample and secret, but these two groups may be further split into subgroups as desired.

In each test data group, a file grading.yaml may be placed to specify how the result of the test data group should be computed. If such a file is not provided for a test data group then the grading for the parent group will be used.

Example

Todo

Annotating the test data

Optionally a description or a hint file (or both) may be present. The description file is a text file with filename extension .desc describing the purpose of an input file. The hint file is a text file with filename extension.hint giving a hint for solving an input file. The description file is meant to be privileged information, whereas the hint file is meant to be given to anybody who needs it, i.e. fails to solve the problem. The hint file might not be used at all, depending on how the problem is used, e.g. when used in a programming contest.

Input, answer, description and hint files are matched by the base name.

Included Code

Code that should be included with all submissions are provided in one directory per supported language, called <short_name>/include/<language>/.

The files should be copied from a language directory based on the language of the submission, to the submission files before compiling, overwriting files from the submission in the case of name collision. Language must be given as one of the language codes in the table below. If any of the included files are supposed to be the main file (i.e. a driver), that file must have the langage dependent name as given in the table below.

Code	Language	Default main file
c	C
cpp	C++
csharp	C#
go	Go
java	Java	Main.java
objectivec	Objective-C
pascal	Pascal
python2	Python 2	main.py
python3	Python 3	main.py

Example Submissions

Correct and incorrect solutions to the problem are provided in subdirectories of <short_name>/submissions/. The possible subdirectories are:

Value	Requirement	Comment
accepted	Accepted as a correct solution for all test files	At least one is required.
wrong_answer	Wrong answer for some test file, but is not too slow and does not crash for any test file
time_limit_exceeded	Too slow for some test file. May also give wrong answer but not crash for any test file.
run_time_error	Crashes for some test file

For submissions of type accepted and scoring problems, the expected score can be specified by including the string @EXPECTED_SCORE@: followed by the expected score somewhere in the source code, e.g. in a comment.

Every file or directory in these directories represents a separate solution. Same requirements as for submissions with regards to filenames. It is mandatory to provide at least one accepted solution.

Submissions must read input data from standard input, and write output to standard output.

Validators

Input Format Validators

Input Format Validators, for verifying the correctness of the input files, are provided in <short_name>/input_format_validators/. They must adhere to the Input format validator specification.

Output Validators

Output Validators are used if the problem requires more complicated output validation than what is provided by the default diff variant described below. They are provided in <short_name>/output_validators/, and must adhere to the Output validator specification.

File Conventions For Validators

A validator is either a file or a directory. A validator in the form of a directory may include two scripts "build" and "run". Either both or none of these scripts must be included. If the scripts are present, then:

validator must be compiled by executing the build script.
the validator must be run by executing the run script.

Otherwise, the validator will be compiled and run as if it was a submission (except that it is given the command-line arguments specified in problem.yaml, if there are any).

Default Validator Capabilities

The default validator is essentially a beefed-up diff. In its default mode, it tokenizes the files to compare and compares them token by token. It supports the following command-line arguments to control how tokens are compared.

Arguments	Description
`case_sensitive`	indicates that comparisons should be case-sensitive.
`space_change_sensitive`	indicates that changes in the amount of whitespace should be rejected (the default is that any sequence of 1 or more whitespace characters are equivalent).
`float_relative_tolerance ε`	indicates that floating-point tokens should be accepted if they are within relative error ≤ ε (see below for details).
`float_absolute_tolerance ε`	indicates that floating-point tokens should be accepted if they are within absolute error ≤ ε (see below for details).
`float_tolerance ε`	short-hand for applying ε as both relative and absolute tolerance.

When supplying both a relative and an absolute tolerance, the semantics are that a token is accepted if it is within either of the two tolerances. When a floating-point tolerance has been set, any valid formatting of floating point numbers is accepted for floating point tokens. So for instance if a token in the answer file says 0.0314, a token of 3.14000000e-2 in the output file would be accepted. If no floating point tolerance has been set, floating point tokens are treated just like any other token and has to match exactly.

Graders

Graders are used if the problem requires more complicated result aggregation than what is provided by the default grader described below. They are provided in <short_name>/graders/, and must adhere to the Grader specification.

Default Grader Capabilities

The default grader supports the most common forms of aggregation of results.

Value	Meaning	Comment
first_error	Will return AC if every test file is AC otherwise it will return the first non AC answer.	Any `.score` files are ignored.
sum	Will return the sum of the scores in the `.score` files.	Assumes that `.score` files consist of a single line with a single number.

@@ Line 142: / Line 142: @@
 == Test data ==
+=== Basic usage ===
 The test data are provided in subdirectories of <tt><short_name>/data/</tt>. The sample data in <tt><short_name>/data/sample/</tt> and the secret data in <tt><short_name>/data/secret/</tt>.
 All input and answer files have the filename extension .in and .ans respectively.
+Test files will be used in lexicographical order on file name. If a specific order is needed a numbered prefix such as 00, 01, 02, 03, and so on, can be used.
+=== Advanced usage: test data groups ===
+The set of all the test data for the problem is organized into a tree-like structure.  Each node of this tree is represented by a directory and referred to as a test data group.  Each test data group may consist of zero or more test cases (i.e., input-answer files) and zero or more subroups of test data (i.e., subdirectories).
+At the top level, the test data is divided into exactly two groups: <tt>sample</tt> and <tt>secret</tt>, but these two groups may be further split into subgroups as desired.
+In each test data group, a file <tt>grading.yaml</tt> may be placed to specify how the result of the test data group should be computed.  If such a file is not provided for a test data group then the grading for the parent group will be used.
+==== Example ====
+Todo
+=== Annotating the test data ===
 Optionally a description or a hint file (or both) may be present. The description file is a text file with filename extension <tt>.desc</tt> describing the purpose of an input file. The hint file is a text file with filename extension<tt>.hint</tt> giving a hint for solving an input file. The description file is meant to be privileged information, whereas the hint file is meant to be given to anybody who needs it, i.e. fails to solve the problem. The hint file might not be used at all, depending on how the problem is used, e.g. when used in a programming contest.
 Input, answer, description and hint files are matched by the base name.
-Test files will be used in lexicographical order. If a specific order is needed a numbered prefix such as 00, 01, 02, 03, and so on, can be used.
 == Included Code ==

Difference between revisions of "Problem Format"

Revision as of 15:39, 18 July 2013

Contents

Overview

Problem Metadata

license

limits

libraries

validator

Problem Statements

Test data

Basic usage

Advanced usage: test data groups

Example

Annotating the test data

Included Code

Example Submissions

Validators

Input Format Validators

Output Validators

File Conventions For Validators

Default Validator Capabilities

Graders

Default Grader Capabilities

Generators

See also

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Tools