# $Revision: 12143 $
# $Author: saulius $
# $Date: 2025-02-10 19:59:03 +0000 (Mon, 10 Feb 2025) $

Generate an atomic model of an object
=====================================

Saulius Gražulis

Vilnius, 2025

CONVENTIONS
===========

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [1].

PROGRAM
=======

Write a Perl or Ada program that places specified atoms on a given
surface, line or in a given volume (i.e. on a given manifold). The
program MUST be able to work in all currently supported Debian or
compatible and Ubuntu or compatible OSes.

The program should read in one or several parameter files in CSV [2] or
TSV [3] formats. Each row of these files, with the exception of the
possible header row, should contain a full set of parameters for
generation of the atom set. Multiple parameter files as well as multiple
parameter sets in each file are permitted. If no input files are given,
the program MUST read its parameters in the same format from STDIN. For
each parameter row the program MUST generate the corresponding atom set
and write it to STDOUT in XYZ [4] format. Atom types can be given either
as atomic numbers or as atom names from the periodic table of elements.

If column names are given in the parameter file, the parameters MUST be
identified by the column names. If the header is not given, the
parameters must be identified by a predefined parameter order. Both
available parameter names and the default parameter order MUST be
specified in the program documentation, in the comments and in the help
printout. The program input (i.e. file names specified on the command
line) can simultaneously contain files both with and without column
names.

The parameter files MAY contain incomplete parameter lists. For
parameter files with column names, arbitrary columns can be missing.
For parameter files without column names, the missing columns shall be
the trailing columns (i.e. if a program has 10 parameters and the input
file has 5 columns, parameters 1 to 5 (indexed from 1) shall be assumed
to be present, and parameters 6 to 10 shall be assumed to be missing).
The missing parameters SHOULD assume default values; if no reasonable
default value is available for a missing parameter, the program should
issue a clear error message about this.

Program name
------------

The program name MUST be constructed according to the following pattern:

    gen-<manifold>-atoms

or

    gen_<manifold>_atoms

where the <manifold> placeholder MUST be changed to the name of the
manifold (i.e. 3D figure) assigned to you. For example, if you are
generating atoms on a sphere, your program should be called

    gen-sphere-atoms

or

    gen_sphere_atoms

The program name MUST NOT contain an extension.

Program invocation
------------------

It must be possible to invoke the program in all of the following ways:

    gen-sphere-atoms parameter_set_1.tsv parameter_set_2.csv
    gen-sphere-atoms < parameters.tsv
    gen-sphere-atoms *.tsv /dev/stdin *.csv

IMPLEMENTATION REQUIREMENTS
===========================

In this assignment input, computations and output must be implemented
using the standard facilities of the language, without the use of
external (non-standard) libraries. External libraries are permitted for
math, random number generator and basic input-output (string reading)
functions.

INPUTS
======

Program inputs must be in CSV [2] or TSV [3] format, with columns
specified as described above. To avoid the use of external dependencies
and to simplify code it is permissible to restrict the input CSV format
by requiring that one CSV record is always placed on a single physical
line, and that the values do not contain commas.

The final parameter SHOULD be a random seed for the random number
generator that would permit the values to be recomputed from the output.
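
The input rules above (one record per physical line, no embedded commas,
names taken from a header when present, defaults for missing parameters,
a seed that makes the output reproducible) can be sketched in plain Perl.
This is only a minimal sketch under stated assumptions: the helper names
split_record, is_header and parse_row, the default values, and the rule
that a header row is recognized by its first field being "Nr" are all
hypothetical choices, not requirements of the task.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ASSUMPTION: default column order copied from the sphere example in this
# document; every program must document its own order.
my @DEFAULT_ORDER = qw(Nr AtomType AtomCount CenterX CenterY CenterZ
                       Radius RndSeed);

# ASSUMPTION: hypothetical default values for parameters that are missing.
my %DEFAULTS = ( CenterX => 0, CenterY => 0, CenterZ => 0 );

# Split one physical line into fields. A line containing a tab is treated
# as TSV, anything else as restricted CSV (no quoting, no embedded commas).
sub split_record {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                      # strip the line terminator
    my $separator = $line =~ /\t/ ? "\t" : ",";
    return split /\Q$separator\E/, $line, -1;  # -1 keeps trailing empties
}

# ASSUMPTION: a header row is recognized by its first field being "Nr".
sub is_header {
    my (@fields) = @_;
    return @fields && $fields[0] eq 'Nr';
}

# Map the values of one data row to parameter names. Columns beyond the
# known names are ignored; empty or absent values keep their defaults.
sub parse_row {
    my ($columns, @values) = @_;
    my %param = %DEFAULTS;
    for my $i (0 .. $#values) {
        last if $i > $#{$columns};
        next if $values[$i] eq '';
        $param{ $columns->[$i] } = $values[$i];
    }
    return \%param;
}

# Demonstration on two in-memory lines: a header with some columns left
# out, followed by one parameter set.
my @columns = @DEFAULT_ORDER;
for my $line ( "Nr,AtomType,AtomCount,Radius,RndSeed\n",
               "1,C,100,15.0,34345345\n" ) {
    my @fields = split_record($line);
    if ( is_header(@fields) ) {
        @columns = @fields;
        next;
    }
    my $param = parse_row( \@columns, @fields );
    # Seeding the generator makes the subsequent rand() calls repeatable.
    srand( $param->{RndSeed} ) if defined $param->{RndSeed};
    print join( ", ", map { "$_=$param->{$_}" } sort keys %{$param} ), "\n";
}
```

Calling srand with the same RndSeed before generating an atom set makes
rand return the same sequence on the same Perl installation, which is
what allows an atom set to be recomputed from the seed recorded with it.
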
The first parameter MUST have the name "Nr", which specifies the number
of the parameter set in the file. This number SHOULD be unique in the
file.

Example
-------

For a program that generates atoms on a sphere, the following parameter
file content can be used (CSV format):

    Nr,AtomType,AtomCount,CenterX,CenterY,CenterZ,Radius,RndSeed
    1,C,100,10.5,9.8,12.6,15.0,34345345
    2,0,120,0,0,0,80.0,34546456

Here atom type "0" may be used to indicate that atoms of random type
should be generated for this atom set.

OUTPUTS
=======

Output must be in the XYZ [4] format.

Example
-------

Two hydrogen atoms placed on a 10.0 Å radius sphere might look like
this:

    2
    gen_sphere_atoms <<< 1,H,2,0,0,0,10.0,38973458
    H 0.0 0.0 10.0
    H 0.0 0.0 -10.0

or, if atomic numbers are used:

    2
    gen-sphere-atoms <<< 1,1,2,0,0,0,10.0,38973458
    1 0.0 0.0 10.0
    1 0.0 0.0 -10.0

ERROR DIAGNOSTICS
=================

The program MAY use the native Ada or Perl diagnostics. The Perl warn()
and die() subroutines may be used where appropriate.

Error messages SHOULD contain at least:

-- the name of the program that diagnosed the error;

-- the name of the file that was being processed when the error
   happened (if appropriate); use the "-" string (with quotation marks)
   or the name "STDIN" (without the quotes) if STDIN was being processed
   when the error happened;

-- the line of the file and the character position within the line (the
   column) where the error was detected (if appropriate);

-- a short (20–40 characters) citation of the context where the error
   happened (if appropriate);

-- a short but informative message indicating the cause of the error and
   possible actions to rectify the situation.

Exclamation marks SHOULD NOT be used in the messages.

The following classes of errors MUST be diagnosed, as a minimum:

-- error opening parameter file;
-- error reading parameter file;
-- wrong parameter format or type;
-- unknown parameter name in the file header;
-- error writing output results;
-- mathematical computation error (e.g. division by zero).

Example of an error message
---------------------------

    gen_sphere_atoms: cannot open parameter file "param.tsv" -- permission denied.

EXIT STATUS
===========

The program MUST return at least the following exit codes to its calling
environment:

    0 – successful termination of the program;
    1 – error in one or several parameter sets;
    2 – file read or write error;
    3 – mathematical computation error.

Other status codes with higher values are allowed. All possible status
codes MUST be documented in the program description and comments.

References
==========

1. S. Bradner "Key words for use in RFCs to Indicate Requirement
   Levels", RFC 2119, URI: https://tools.ietf.org/html/rfc2119

2. Library of Congress. CSV, Comma Separated Values (RFC 4180).
   https://www.loc.gov/preservation/digital/formats/fdd/fdd000323.shtml
   [accessed: 2022-04-05T11:50+03:00]

3. Library of Congress. TSV, Tab-Separated Values.
   https://www.loc.gov/preservation/digital/formats/fdd/fdd000533.shtml
   [accessed: 2022-04-05T11:51+03:00]

4. Wikipedia (2025) XYZ file format. URL:
   https://en.wikipedia.org/wiki/XYZ_file_format
   [accessed: 2025-02-09T17:01+02:00, permalink:
   https://en.wikipedia.org/w/index.php?title=XYZ_file_format&oldid=1270820839].

Colophon
========

$Id: atom-generation-task.txt 12143 2025-02-10 19:59:03Z saulius $
$Header: file:///home/saulius/svn-repositories/paskaitos/VU/bioinformatika-III/u%C5%BEduotys-praktikai/atom%C5%B3-generavimo-u%C5%BEduotis/tasks/en/atom-generation-task.txt 12143 2025-02-10 19:59:03Z saulius $
$URL: file:///home/saulius/svn-repositories/paskaitos/VU/bioinformatika-III/u%C5%BEduotys-praktikai/atom%C5%B3-generavimo-u%C5%BEduotis/tasks/en/atom-generation-task.txt $