I’ve been active on my “Ocean” programming language design project again and have created another point release which I am calling “Cataract Creek”. It contains a number of changes but the most significant are functions and references (aka pointers) so that is what I’ll discuss here.
I found as I was working on some of the design that these two really need to come together, or at least I needed some understanding of references before I could do anything useful with functions. This is because functions need some form of “by-reference” parameter to be really useful, and that means there must be some concept of references.
Possibly the best place to start is the dereference operation. If you have a reference, you need some way to get at the thing being referred to. In some languages (e.g. Python) there is no such operation – dereferencing is completely transparent and the “thing” being referred to doesn’t have an existence without some reference to access it. In Ocean, as in C, this is not the case. Though in many cases Ocean does allow transparent dereferencing, that isn’t where I want to start.
In C there are two ways to dereference a pointer: you can prefix with an asterisk, “*foo”, or you can postfix with a zero index, “foo[0]”. While the former is the idiomatic form, the latter has some real appeal, particularly when entering expressions incrementally in gdb. Somehow it seems to make sense for the dereferencing operator to come after the thing being dereferenced. That is what Pascal uses of course, spelled with a caret: “foo^”. So Ocean will use a postfix operator too, the at sign to be specific: “@”. If “foo” is a reference then “foo@” is the referenced datum. This fits well with the approach I have taken to declaring the type of a variable. A name (variable or field or parameter) is followed by a colon and then the type, and if the type is more than just a name, then moving the colon over any element of the type sort-of makes sense. So
    months:[12]string
is an array of 12 strings, and
    months[12] : string
looks a bit like saying that the type of a member of months is “string”, which it is. Similarly with references,
    parent:@node
declares the name “parent” to be a reference to a node, and
    parent@ : node
could suggest that the type of “parent@” is “node”, which, again, it is.
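For comparison, here is a small C sketch of the two C spellings of dereference mentioned above. It is plain C, not Ocean, and the helper names `deref_prefix` and `deref_index` are made up purely for illustration.

```c
/* Two equivalent ways to dereference a pointer in C.
 * Both helper names are hypothetical, chosen for this sketch. */

int deref_prefix(const int *foo)
{
    return *foo;    /* idiomatic prefix form */
}

int deref_index(const int *foo)
{
    return foo[0];  /* postfix form: index zero */
}
```

Both functions read the same object; only the position of the operator differs, which is the property the postfix “@” borrows.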
Another aspect of reference usage which gdb gave me experience with is the use of a simple dot to access a field in a structure given a reference. In C you would use “foo->parent” to access the parent field if “foo” were a pointer, but in gdb you can just use “foo.parent”. There is nothing to dislike about this syntax, so it is what Ocean will use. You can use “foo@.parent” if you want to explicitly dereference “foo” but there is no need: the “.” will dereference a pointer if needed. It will actually be able to do more than that eventually, but you’ll have to wait until I implement transparent fields in structures to understand the full implications (as will I).
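As a rough C analogue of the field access just described, the sketch below shows that C's arrow is only shorthand for an explicit dereference followed by a dot. The `struct node` layout and the helper names are assumptions made for this example, not anything taken from Ocean.

```c
#include <stddef.h>

/* Hypothetical node type with a parent link, for illustration only. */
struct node {
    struct node *parent;
    int value;
};

/* C requires "->" when starting from a pointer ... */
struct node *parent_arrow(struct node *foo)
{
    return foo->parent;
}

/* ... which is shorthand for dereferencing and then using ".". */
struct node *parent_explicit(struct node *foo)
{
    return (*foo).parent;
}

/* A node with no parent is treated as the root in this sketch. */
int is_root(const struct node *foo)
{
    return foo->parent == NULL;
}
```

In C the compiler already knows whether the left operand is a pointer, so a single dot that dereferences when needed loses no information.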
The inverse of the dereference operation is the “address-of” operation, the prefix “&” operator in C. There is no such explicit operator in Ocean: the operation exists but it is always implicit. If a reference is needed in some context, but a non-reference is given, then the address of that non-reference is automatically taken. There aren’t very many situations where this can actually happen: primarily assignment, where passing arguments to a function and returning a value are seen as special cases of assignment.
    n:number = 42
    np:@number = n
    np@ = 43
In this little code fragment, the reference “np” will refer to the variable “n”. Normally you don’t need to give a type when declaring a variable with an initial value, as the type is deduced from the value. When we want to get a reference we cannot omit the type, though a future edition might allow the referent type “number” to be omitted so “np:@=n” might work.
The third line above is one place where the dereference operator cannot be omitted. Assigning to a reference is very different to assigning to the referenced variable and the latter needs the explicit “@” operator.
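To make the implicit step visible, here is how I would render the same three lines in C, where the address-of has to be written out. This is only a sketch; the wrapper function `demo_reference` is a made-up name so the fragment can be compiled and checked.

```c
/* C rendering of the three-line fragment, with the address-of explicit.
 * "demo_reference" is a hypothetical wrapper used only for this sketch. */
int demo_reference(void)
{
    int n = 42;
    int *np = &n;  /* Ocean would take this address implicitly */
    *np = 43;      /* assign through the reference, so n changes */
    return n;      /* n now holds 43 */
}
```

The “&” on the second line is the only thing Ocean drops; the dereference on the third line stays explicit in both languages.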
This transparent taking of an address means that we need a clear distinction between references and things that are referred to (referents), so that it is clear when the address should be taken. This raises questions about the wisdom of allowing a reference to a reference. If a reference to a number is given where a reference to a reference to a number is expected, do we simply take the address or do we raise a type-mismatch error because the reference is to the wrong type? For now my decision is to disallow this possibility so that references and referents are clearly distinct. I may change my mind later but I’d rather be cautious at this stage.
It is still, of course, possible to have a reference to a reference by placing the second reference in a struct and having a reference to that struct. That might turn out to be clumsy, or it might actually improve clarity – I’ll have to wait and see. For now, a reference can only be to a named type, and that excludes arrays as well as references. We will allow references to arrays in the future, but they will include a size and be called “slices”.
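The struct workaround can be pictured in C terms roughly as follows. This is a sketch under my own assumptions: `struct number_ref` and `retarget` are hypothetical names, and C of course would also allow a plain `int **`, which is exactly what is being avoided here.

```c
/* Hypothetical wrapper: a struct whose only field is a reference. */
struct number_ref {
    int *target;
};

/* "slot" plays the role of a reference to a reference,
 * yet no pointer-to-pointer type appears anywhere. */
void retarget(struct number_ref *slot, int *new_target)
{
    slot->target = new_target;
}
```

Every level of indirection now has a named type, so it is always clear whether a given value is a reference or a referent.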
Now that we know what references look like, we can see how to pass them to functions. Hopefully it will be completely unsurprising. A function parameter declared as a reference acts as a by-reference parameter (a “var” parameter in Pascal or a “type&” parameter in C++). If a reference is passed, it is accepted as-is. If an L-value is passed, the address of it is taken. Non-reference parameters are handled much as you would expect.
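In C the same distinction looks roughly like the sketch below; the function names `increment` and `incremented` are invented for the example. The difference from Ocean is on the caller's side: C makes the caller write `&x` for an L-value, whereas an existing pointer is passed as-is in both languages.

```c
/* By-reference parameter: the caller's variable is updated. */
void increment(int *n)
{
    *n += 1;
}

/* By-value parameter: only the local copy changes. */
int incremented(int n)
{
    n += 1;
    return n;
}
```

A call such as `increment(&x)` corresponds to passing an L-value, and `increment(p)` with `p` already an `int *` corresponds to passing a reference.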
The syntax for function declarations is a little different from what you might expect from C or similar languages. The parameter list is still a list of variable declarations, but it has long bothered me that in C you can list multiple variables with the one type in a single declaration, but you cannot do that in a function declaration as the comma has a slightly different meaning. So in Ocean, the semicolon is used to separate parameters.
func insert(l:@list; v:string)
Now this argument is actually a bit odd because Ocean doesn’t allow multiple names with the one type at all, but maybe it will one day, and I still think that semicolon is the better separator here. If you have a longer list of parameters, or just like spacing things out, you can use an indented list of lines for the parameters. The correspondence between the semicolon and the end-of-line then matches the similar correspondence as statement separators.
    func insert
        l:@list
        v:string
The return type can be omitted as above, or can be either a simple type or an implicit struct with a list of return values. Each of these can appear on a single line, or with line structure:
    func add(a:number; b:number) : number
        use a + b

    func subtract
        a: number; b: number
    return number
    do
        use a - b

    func to_polar(x:number; y:number) : (rho:number; theta:number)
        rho = sqrt(x*x + y*y)
        theta = arcsine(y / rho)

    func to_cartesian
        rho: number
        theta: number
    return
        x: number
        y: number
    do
        x = rho * cosine(theta)
        y = rho * sine(theta)
Note that “use” here, which says what value to use, will likely change to “return” in a future edition.
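The “implicit struct” form of return type has a familiar C counterpart, where the struct has to be declared by hand. The sketch below uses a made-up integer example (`divmod` and `struct divmod_result` are hypothetical names) rather than the polar conversion, just to keep it free of the maths library.

```c
/* Hypothetical example: two named return values packed into a struct. */
struct divmod_result {
    int quotient;
    int remainder;
};

/* The caller must ensure b is not zero. */
struct divmod_result divmod(int a, int b)
{
    struct divmod_result r;
    r.quotient = a / b;
    r.remainder = a % b;
    return r;
}
```

Assigning to the named fields and then returning the whole struct mirrors the way the Ocean examples assign to the named return values.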
If the expected return value is a reference and the value given to “use” is a referent, then this is currently an error, but that will soon be changed to take the address of the value, providing that makes sense.
My plans for functions include default values for parameters so they may be omitted, and “varargs” support by treating the last parameter as a bit special if it is an array slice. This is where it will really be important that references (both normal and slices) are distinct from referents.
If the last parameter to a function is e.g. “args:@printable” where “printable” is an interface (which is still a long way from being implemented), then (in the future) this can be given an actual parameter of an array slice, or zero or more references or referents which support “printable”.
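None of this exists in Ocean yet, so the following is only a C sketch of the general idea of a slice as a reference plus a size, used as a trailing “varargs” parameter. The type `struct number_slice` and the function `sum` are my own illustrative names, and plain integers stand in for “printable” values.

```c
#include <stddef.h>

/* Hypothetical slice: a reference to array elements plus a count. */
struct number_slice {
    const int *items;
    size_t len;
};

/* The slice acts as a variable-length trailing argument. */
int sum(struct number_slice args)
{
    int total = 0;
    for (size_t i = 0; i < args.len; i++)
        total += args.items[i];
    return total;
}
```

An empty slice covers the “zero or more” case without needing a separate calling convention.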
But this is mostly “still to come”. For now, functions work, can receive by-reference parameters, and have a pleasing “open” syntax if wanted.