
One syntax to rule them all

Last modified on 2008/02/08




There are primitive data types (numbers, strings, identifiers) and three compound constructs: lists, records and tuples. Each data element can have a tag, which is the equivalent of a class name in programming.

A record is the basic construct for describing an object. The following example is a record with three fields: a string, an int and an identifier:


name = "John"

age = 21

sex = male


Sometimes, especially when there are many records of the same type, it is not practical to write out the field names. You can use tuples (records with anonymous fields) instead:

Person( "John" 21 male )

Tuples are very different from lists. Tuples (round brackets) describe single entities, whereas lists (curly brackets) are used for collections:

Team {

Person( "John" 21 male )

Person( "Bill" 33 male )

Person( "Alice" 28 female )

}


A unituple (single-element tuple) is considered equivalent to its element. This rule is compatible with the mathematical definition of tuples and makes it possible to use round brackets for grouping as well.

Elements of tuples and lists can be separated with commas or ended with semicolons, but in most cases this is not necessary.
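
The unituple rule can be illustrated with a small sketch. The Python below is not part of Harpoon: it models a parsed tuple as a Python tuple and a parsed list as a Python list, and the helper name normalize is hypothetical. It assumes untagged tuples and simply collapses every single-element tuple to its element, so brackets used only for grouping disappear.

```python
def normalize(value):
    """Collapse single-element tuples, recursively (hypothetical helper)."""
    if isinstance(value, tuple):
        items = tuple(normalize(v) for v in value)
        if len(items) == 1:
            return items[0]   # a unituple is equivalent to its element
        return items          # empty and longer tuples stay tuples
    if isinstance(value, list):
        return [normalize(v) for v in value]  # lists are never collapsed
    return value
```

Under these assumptions, normalize((("John",),)) gives the plain string, while a list with one element stays a list, which matches the distinction between tuples and lists drawn above.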


Every data element can have a tag. In the above examples the tags were Person and Team. They are not always needed, but they add higher-level meaning to the data and make it more self-describing. Here is an example of a tagged record containing untagged fields (a string, an integer and a list):


title = "Peopleware"

year = 1999

authors = { "Tom Demarco" "Timothy Lister" }


Tags can also be added in the postfix style:


width = 14_mm

height = 2_cm


This can be used, for example, to repeat the tag in an XML-like way or to add units to values.

Note: Postfix tags used to be written with the '/' character. Now that there are identifier operators, the slash is being freed for use in operators. At the moment (v 5.5.84) you can still use a slash for postfix tags, but this will change.
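
As a rough illustration of how a reader of such data might separate a value from its postfix tag, here is a minimal Python sketch. It is not Harpoon's own lexer: the helper name split_postfix is hypothetical, and the sketch assumes the value is an integer literal followed by an underscore and an identifier-like tag, as in the two fields above.

```python
import re

# assumption: an integer literal, an underscore, then an identifier-like tag
_POSTFIX = re.compile(r"(-?\d+)_([A-Za-z]\w*)")

def split_postfix(token):
    """Split a token such as 14_mm into (value, tag); hypothetical helper."""
    m = _POSTFIX.fullmatch(token)
    if m is None:
        raise ValueError(f"not a postfix-tagged integer: {token!r}")
    return int(m.group(1)), m.group(2)
```

With this sketch, the token 14_mm splits into the integer 14 and the tag mm.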

Primitive data

There are four common primitive data types, which correspond to the literals that can be used in Harpoon expressions:

  • Int – signed 32-bit integer,

  • Real – double-precision floating-point number,

  • String – finite sequence of characters (UTF-8), e.g. "hello",

  • Ident – identifier, e.g. true.

Strings in Harpoon use the backslash (\) as an escape character. It makes the lexer ignore the white space that follows it, and it makes it possible to express a new line (\n), a tab (\t), a quote (\"), a backslash (\\) and some other special characters. Strings can span multiple lines.

The numeric types are currently machine-dependent, which is of course not ideal. This is going to change in the future with the introduction of two machine-independent general types:
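
The escape rules can be sketched in a few lines of Python. This is only an illustration, not Harpoon's lexer: the function name unescape is hypothetical, it works on the text between the quotes, it covers only the four escapes named above, and it assumes that a backslash followed by white space drops the whole run of white space.

```python
def unescape(body):
    """Decode the escapes in the text between the quotes of a string literal."""
    simple = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        i += 1  # skip the backslash itself
        if i == len(body):
            raise ValueError("dangling backslash at end of string")
        nxt = body[i]
        if nxt in simple:
            out.append(simple[nxt])
            i += 1
        elif nxt.isspace():
            # assumption: backslash + white space drops the whole run
            while i < len(body) and body[i].isspace():
                i += 1
        else:
            raise ValueError("escape not covered by this sketch: \\" + nxt)
    return "".join(out)
```

Under that assumption, a backslash at the end of a line lets a long string continue on the next line without the line break and indentation becoming part of the value.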

  • Integer – signed, arbitrarily large integer numbers (e.g. 1928382738271),

  • Rational – arbitrary-precision rational numbers (e.g. 3.141592).

The extension will be backward compatible, as all numbers will become Integers or Rationals, which will be automatically convertible to Ints and Reals.
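
To make the planned relationship between the general and the machine types more concrete, here is a Python sketch. It is an analogy rather than Harpoon code: Python's int and fractions.Fraction stand in for Integer and Rational, the helper names are hypothetical, and raising an error when an Integer does not fit into a 32-bit Int is an assumption, since the text does not say what happens in that case.

```python
from fractions import Fraction

INT_MIN, INT_MAX = -2**31, 2**31 - 1  # range of a signed 32-bit Int

def parse_number(text):
    """Read a decimal literal exactly: int for Integer, Fraction for Rational."""
    return Fraction(text) if "." in text else int(text)

def to_int(n):
    """Convert an Integer to a 32-bit Int; failing when out of range is an assumption."""
    if not INT_MIN <= n <= INT_MAX:
        raise OverflowError(f"{n} does not fit in a 32-bit Int")
    return int(n)

def to_real(n):
    """Convert an Integer or Rational to the nearest double-precision Real."""
    return float(n)
```

In this model the literal 3.141592 is stored exactly as a ratio of two integers and only loses precision when it is converted to a Real.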

Special values

There are two special values in Harpoon:

  • Null, expressed with an empty tuple "()",

  • Unknown, expressed with an underscore character "_".

Their meanings are very different: the null value means that there is no data (as with the result of a division by zero), whereas the unknown value means that there should be something, but we do not know or do not care what it is.


There are two kinds of comments: // starts a comment that runs to the end of the line, and a (* ... *) pair encloses a comment between the two markers. This is similar to C/C++, but much easier to type.
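
A minimal Python sketch of how the two comment forms could be stripped from a source text follows. It is not taken from the Harpoon implementation: the function name strip_comments is hypothetical, and it assumes that block comments do not nest and that comment markers inside string literals are ordinary characters.

```python
def strip_comments(src):
    """Remove // and (* ... *) comments, leaving string literals untouched."""
    out = []
    i, n = 0, len(src)
    while i < n:
        if src[i] == '"':
            # copy a string literal verbatim, honouring backslash escapes
            j = i + 1
            while j < n and src[j] != '"':
                j += 2 if src[j] == "\\" else 1
            out.append(src[i:j + 1])
            i = j + 1
        elif src.startswith("//", i):
            # line comment: skip up to, but not including, the newline
            end = src.find("\n", i)
            i = n if end == -1 else end
        elif src.startswith("(*", i):
            # block comment: skip past the closing marker (no nesting assumed)
            end = src.find("*)", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(src[i])
            i += 1
    return "".join(out)
```

The string check comes first so that a value such as "http://x" is not cut off at the double slash.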