Riptide Language Reference

1. Introduction

This manual is the primary reference for the Riptide programming language and defines its core syntax and semantics. The intended audience is developers working on the language itself and curious users who wish to dig deeper into the language. This document is not a complete formal specification at this time.

If you are just getting started with Riptide, we recommend checking out the Guide first.

2. Syntax

The Riptide syntax describes how to read the source code of a valid Riptide program into valid structures.

Riptide programs are always written as a sequence of UTF-8 characters.

2.1. Lines and whitespace

Horizontal whitespace has no meaning, except when used as a separator. When whitespace separates syntactic elements, any run of one or more horizontal whitespace characters counts as a single separator.

Line separators are treated just like horizontal whitespace, except inside blocks. For greater cross-platform support, a newline can be represented in any of three ways: line feed (\n), carriage return (\r), or carriage return followed by a line feed (\r\n).
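As a rough illustration of the three newline forms, the Python sketch below normalizes them to a single line feed. The name `normalize_newlines` is hypothetical and not part of Riptide; an actual parser would more likely match these forms directly in its grammar.

```python
import re

# Match \r\n first so that it counts as one line separator, not two.
_NEWLINE = re.compile(r"\r\n|\r|\n")


def normalize_newlines(source: str) -> str:
    """Replace every supported newline form with a single line feed."""
    return _NEWLINE.sub("\n", source)
```

Listing the two-character form first in the pattern is what keeps a Windows-style line ending from being counted as two separate line breaks.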

2.2. Comments

Single-line comments begin with a hash character (#) and continue until the end of the line. Multiline comments begin with ### and end with ###. Nesting multiline comments is allowed, but the comment markers must be balanced.

Comments are ignored by the parser and are otherwise treated as whitespace.

2.3. Literals

Number literals

String literals

A string literal is a sequence of any Unicode characters enclosed within two single-quote characters.

println 'I am a string literal.'

List literals

[1 2 3 'a' 'b' 'c']

Table literals

A table literal is an expression used to construct a table with entries defined in code.

[
    foo: 'bar'
    say-hello: {
        println 'hello'
    }
]

2.4. Blocks

A block is a special section of code whose execution is deferred until a later point in the program. A block is also a means of executing multiple statements in sequential order.

A block is defined using curly braces ({ and }) and includes all code between the opening and closing braces.

{
    println "I am in a block."
}

Inside a block is a list of statements, which are each pipelines to be executed. Statements may be separated by newlines or optionally by the statement terminator, a single semicolon (;). Both separators are equivalent.

{
    println "Statement one."
    println "Statement two."; println "Statement three."
}

2.5. Formal grammar

Below is the full specification of the Riptide grammar. This is the actual specification from which the language parser is generated.

// The Riptide language grammar.
//
// This file is used both to generate the parser, and serves as the canonical
// specification for the language syntax.

// A program string. Like a block without surrounding curly braces or params.
// "SOI" and "EOI" refer to the start and end of the file, respectively.
program = { SOI ~ statement_list ~ EOI }

// Blocks are surrounded by curly braces, with an optional angle bracket
// delimited parameter list preceding it.
block = { block_params? ~ "{" ~ statement_list ~ "}" }
block_params = { "<" ~ block_params_list ~ ">" }
block_params_list = _{
    vararg_param_decl
    | param_decl ~ ("," ~ block_params_list)?
}

param_decl = { symbol }
vararg_param_decl = { "..." ~ symbol }

// A subroutine is just a block with an explicit name.
subroutine = { "sub" ~ symbol ~ block }

// Blocks and programs are lists of statements.
statement_list = { statement_separator* ~ (statement ~ (statement_separator+ ~ statement)*)? ~ statement_separator* }
statement_separator = _{ NEWLINE | ";" }

// A statement can be an import, an assignment, a return, or a pipeline.
statement = _{ import_statement | assignment_statement | return_statement | pipeline_statement }

import_statement = { KEYWORD_IMPORT ~ string_literal ~ "for" ~ import_clause }
import_clause = { import_wildcard | import_items }
import_items = { string_literal+ }
import_wildcard = { "*" }

assignment_statement = { assignment_target ~ "=" ~ expr }

// An expression used as the target of an assignment statement.
assignment_target = {
    member_access_expr
    | &"$" ~ variable_substitution
}

return_statement = { KEYWORD_RETURN ~ expr? }

pipeline_statement = { pipeline }

// Expression is the main syntax building block.
expr = {
    member_access_expr
    | unary_expr
}
unary_expr = _{
    block
    | subroutine
    | "(" ~ pipeline ~ ")"
    | cvar_scope
    | cvar
    | regex_literal
    | substitution
    | table_literal
    | list_literal
    | number_literal
    | interpolated_string
    | string_literal
}

regex_literal = ${ "`" ~ ("\\\\" | "\\/" | !"`" ~ ANY)* ~ "`" }

member_access_expr = { unary_expr ~ (member_operator ~ string_literal)+ }

// Pipelines are function calls chained together with the pipe "|" operator.
//
// The "!" prefix forces insignificant whitespace back on, which allows
// whitespace in a pipeline inside substitutions.
pipeline = !{ call ~ ("|" ~ call)* }

// A function call is a reference to a function followed by a series of argument
// expressions.
call = { named_call | unnamed_call }
named_call = { string_literal ~ call_args }
unnamed_call = { expr ~ call_args }

call_args = _{ call_arg* }
call_arg = { splat_arg | expr }
splat_arg = { "..." ~ expr }

// Reference a context variable.
cvar = ${ "@" ~ string_literal }

// Binds a context variable to a value for the duration of a scope.
cvar_scope = { KEYWORD_LET ~ cvar ~ "=" ~ expr ~ block }

// Dollar sign indicates the start of some form of substitution.
substitution = ${ &"$" ~ (
    format_substitution
    | pipeline_substitution
    | variable_substitution
) }
format_substitution = ${ "${" ~ string_literal ~ (format_flags_separator ~ format_substitution_flags)? ~ "}" }
format_substitution_flags = ${ (ASCII_ALPHANUMERIC | "_" | ".")+ }
pipeline_substitution = ${ "$(" ~ pipeline ~ ")" }
variable_substitution = ${ "$" ~ string_literal }

// A table literal expression is used to create tables declaratively.
table_literal = { "[" ~ NEWLINE* ~ ((table_literal_entry ~ NEWLINE*)+ | ":") ~ NEWLINE* ~ "]" }
table_literal_entry = { expr ~ ":" ~ expr }

// A list literal creates a list declaratively from a sequence of expressions.
list_literal = { "[" ~ (NEWLINE* ~ expr)* ~ NEWLINE* ~ "]" }

// An interpolated string is surrounded by double quotes, and is made up of a
// sequence of parts that, when stringified and concatenated in order, form the
// desired string value.
//
// Escapes are handled later in the parser pipeline.
interpolated_string = ${ "\"" ~ interpolated_string_part* ~ "\"" }
interpolated_string_part = ${ substitution | interpolated_string_literal_part }
interpolated_string_literal_part = ${ ("\\\"" | "\\$" | !"\"" ~ !"$" ~ ANY)+ }

// A literal string. String literals are static and have no runtime
// interpolation.
// Escapes are handled later in the parser pipeline.
string_literal = ${ "'" ~ single_quote_inner ~ "'" | symbol }
single_quote_inner = ${ ("\\'" | !"'" ~ ANY)* }

// Numbers are floating point.
number_literal = ${ "-"? ~ ("." ~ ASCII_DIGIT+ | ASCII_DIGIT+ ~ ("." ~ ASCII_DIGIT+)?) }

// A symbol is an unquoted string, usually used for identifying variable names.
symbol_char = _{ ASCII_ALPHANUMERIC | "_" | "-" | "?" | "!" | "." | "/" | "*" | "=" }
symbol = ${ !reserved_words ~ symbol_char ~ (!member_operator ~ symbol_char)* }

// A list of keywords that are not allowed as bare identifiers because they have
// special meaning.
reserved_words = _{ KEYWORD_IMPORT | KEYWORD_LET | KEYWORD_RETURN }

// Operator to access namespaces and table members.
member_operator = _{ "->" }

// Separator for specifying format parameters.
format_flags_separator = _{ ":" }

// Inline comments are similar to UNIX shells, where "#" starts a comment and
// includes all following characters until end of line.
COMMENT = _{ "#" ~ (!NEWLINE ~ ANY)* }

// Only horizontal whitespace is insignificant; vertical whitespace is used to
// separate statements in blocks.
WHITESPACE = _{ " " | "\t" | "\\" ~ NEWLINE }

// All reserved keywords.
KEYWORD_IMPORT = _{ "import" }
KEYWORD_LET = _{ "let" }
KEYWORD_RETURN = _{ "return" }

The grammar is written in the syntax of Pest, a modern parser generator. We recommend reading through the Pest book to get a thorough understanding of how the Riptide grammar works.
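To make one rule concrete, the following Python sketch transcribes the `number_literal` rule from the grammar above into a regular expression. It covers that single rule in isolation; the helper name `is_number_literal` is hypothetical, and how the full parser chooses between competing rules is not modeled here.

```python
import re

# Mirrors: "-"? ~ ("." ~ ASCII_DIGIT+ | ASCII_DIGIT+ ~ ("." ~ ASCII_DIGIT+)?)
NUMBER_LITERAL = re.compile(r"-?(?:\.[0-9]+|[0-9]+(?:\.[0-9]+)?)")


def is_number_literal(text: str) -> bool:
    """Return True if the whole text matches the number_literal rule."""
    return NUMBER_LITERAL.fullmatch(text) is not None
```

Under this rule a leading minus sign and a leading decimal point are accepted, while a trailing decimal point, an exponent, or a leading plus sign are not.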

3. Data types

Riptide has a simple data model and only offers a few basic data types.

3.1. Strings

The string is the most fundamental data type in Riptide. A string is a fixed-length array of bytes. Usually it contains text encoded using the UTF-8 encoding, but non-text strings are also valid.

Strings are immutable values; functions that combine or modify text return new strings rather than modify existing strings.

Since strings are immutable, it is implementation-defined as to whether strings are passed by value or by reference, as it makes no difference in program behavior.

Strings can be written without quotes, with single quotes ('), or with double quotes ("), each with a slightly different meaning.

3.2. Numbers

Only one number type is offered. All numbers are double-precision floating-point numbers.

Numbers are immutable values.

3.3. Lists

Lists are the first compound data type, that is, a container type that can hold multiple other data items. A list is a fixed-length array containing zero or more values in sequence. The contained values can be of any data type.

Lists are immutable values, and cannot be modified once they are created.

It is implementation-defined as to whether lists are passed by value or by reference, as it makes no difference in program behavior.

3.4. Tables

A table (or associative array) is a collection of key-value pairs, where each key appears at most once in the collection.

Unlike other data types, tables are mutable and can be modified in place.

Tables are passed by reference instead of by value.

The storage representation of a table is implementation-defined.
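The practical difference between immutable lists and mutable, reference-passed tables can be sketched by analogy in Python, with a tuple standing in for a Riptide list and a dict standing in for a table. This is only an analogy under those assumptions; the function names are hypothetical and say nothing about how a Riptide runtime stores either type.

```python
def append_item(items: tuple, item) -> tuple:
    # A tuple cannot be changed in place, so a new tuple is returned.
    return items + (item,)


def set_entry(table: dict, key, value) -> None:
    # The dict is shared with the caller, so this change is visible outside.
    table[key] = value


original = (1, 2, 3)
extended = append_item(original, 4)  # original is left untouched

config = {"foo": "bar"}
set_entry(config, "foo", "baz")  # config itself now maps foo to baz
```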

3.5. Closures

4. Expressions

Riptide is an expression-based language: nearly every construct is an expression, and the expression is the most important building block of the language.

Every expression has a resulting value when it is executed.

4.1. Literal expressions

A literal expression consists of a single literal value. The resulting value for a literal expression is always the literal value written. See Literals for details.

4.2. Pipeline expressions

4.3. Block expressions

A block expression defines a new block.

5. Lexical scope and variables

Variables must be explicitly declared before they are used. By default, a variable is confined to the lexical scope in which it is defined.

5.1. Function calls

5.2. Pipelines

6. Execution model

6.1. Local variables

Local variables are lexically scoped bindings of names to values, and only exist inside the function they are defined in.

Local variables are mutable in the sense that they can be redefined at any time.

Local variables can be referenced by name using the $ sigil.

Variables can be defined or reassigned using the set builtin function:

# Bind the string "Hello world!" to the variable $foo.
set foo "Hello world!"

6.2. Context variables

In contrast with local variables, which are lexically scoped, context variables are a form of global variable that offers dynamic scoping.

Context variables can be referenced by name using the @ sigil.

let @cvar = foo {
    println @cvar # foo

    let @cvar = bar {
        println @cvar # bar
    }

    println @cvar # foo
}

6.3. Binding resolution

6.4. Exceptions

As is common in many languages, exceptions offer a means of breaking out of regular control flow when runtime errors are encountered or other exceptional situations arise.

When the Riptide runtime encounters a recoverable error, it raises an exception that describes the error that occurred.

Note

Not all errors in the runtime get turned into exceptions. If an error occurs that the runtime cannot safely recover from, such as running out of memory or data corruption, the program will be aborted instead.

Riptide programs are also free to raise their own exceptions at any time during program execution using the throw builtin function.

Regardless of the origin of the exception, when an exception is raised, the current function call is aborted recursively in a process called stack unwinding, until the exception is caught. A raised exception may be caught by the first try block encountered that wraps the offending code.

If a raised exception is not caught during stack unwinding before the top of the stack is reached, then the runtime will attempt to print a stack trace of the exception if possible, then abort the program.
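Stack unwinding as described above behaves much like exception propagation in other languages. The Python sketch below is an analogy only: `RiptideError` and the function names are hypothetical, and Python's own `try`/`except` stands in for a Riptide try block.

```python
class RiptideError(Exception):
    """Stand-in for an exception raised by the runtime or by throw."""


def inner(trace):
    trace.append("inner: before throw")
    raise RiptideError("boom")


def middle(trace):
    trace.append("middle: calling inner")
    inner(trace)
    # Never reached: this call is aborted while the stack unwinds.
    trace.append("middle: after inner")


def run():
    trace = []
    try:
        middle(trace)
    except RiptideError as err:
        # The first enclosing try block catches the exception.
        trace.append(f"caught: {err}")
    return trace
```

The trace returned by `run()` contains the two entries recorded before the throw followed by the `caught: boom` entry, and never the `middle: after inner` entry.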

7. Modules

Caution

The module system has yet to be designed!

8. External commands

External commands can be executed in the same way as functions are, and use the same function call mechanism.

Native data types passed to a command as arguments are coalesced into strings and then passed in as program arguments. The function call waits for the command to finish, then returns the exit code of the command as a number.
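The calling convention for external commands can be sketched in Python as follows. The helper `run_command` is hypothetical, and the use of `str()` to coalesce arguments is an assumption; this document does not specify how each Riptide data type is converted to a string.

```python
import subprocess
import sys


def run_command(program, *args):
    """Run a command, wait for it, and return its exit code as a number."""
    # Every argument is converted to a string before being passed on.
    argv = [program] + [str(arg) for arg in args]
    completed = subprocess.run(argv)
    # Riptide has a single floating-point number type, hence float().
    return float(completed.returncode)


# Example: run the current Python interpreter as the external command.
exit_code = run_command(sys.executable, "-c", "import sys; sys.exit(3)")
```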

9. Platform interaction

9.1. Environment variables

Process environment variables are exposed to a Riptide program via the environment context variable. This variable is populated with a map of all of the current process environment variables when the runtime is initialized.

The environment map is not linked to the process environment map after initialization; modifying the contents of the map at runtime does not update the current process’s environment. Whenever a subprocess is spawned, the subprocess’s environment is created by exporting the current value of environment. This mimics normal environment variable support without the normal overhead required, and offers the benefits of being a regular context variable.

Example:

let @environment->FOO "bar" {
    printenv
}
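The snapshot-and-export behavior described in this section can be sketched in Python. The `Runtime` class and the variable name `RIPTIDE_SKETCH_FOO` are hypothetical; the sketch only illustrates that the map is copied once at startup and handed to each subprocess.

```python
import os
import subprocess
import sys


class Runtime:
    def __init__(self):
        # Snapshot taken once at initialization; not linked to os.environ.
        self.environment = dict(os.environ)

    def spawn(self, argv):
        # The child receives the current value of the map, not os.environ.
        return subprocess.run(
            argv, env=dict(self.environment), capture_output=True, text=True
        )


runtime = Runtime()
runtime.environment["RIPTIDE_SKETCH_FOO"] = "bar"

# The child sees the new entry even though this process's real
# environment was never modified.
child = runtime.spawn(
    [sys.executable, "-c", "import os; print(os.environ.get('RIPTIDE_SKETCH_FOO'))"]
)
```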

9.2. Working directory

The current "working directory" of the current process is exposed as a special cwd context variable. This variable is populated when the process starts from the working directory reported by the OS.

Changes to cwd are not required to be reflected in the process working directory, but cwd must be respected for all relative path resolution, and newly spawned processes must inherit the current value of cwd.
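One way to satisfy this requirement without changing the process working directory is to resolve every relative path against the current value of cwd. The Python sketch below shows the idea with POSIX-style paths; `resolve_path` is a hypothetical helper, not a Riptide builtin.

```python
import posixpath


def resolve_path(cwd: str, path: str) -> str:
    """Resolve a path against the given cwd value.

    Absolute paths are returned unchanged (apart from normalization),
    because posixpath.join discards cwd when path is absolute.
    """
    return posixpath.normpath(posixpath.join(cwd, path))
```

A runtime built this way would pass the same cwd value as the starting directory of every newly spawned process.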

9.3. Processes

As process parallelism and external commands are essential features of Riptide, defining how Riptide manages external and child processes is paramount.

The runtime acts as a form of process supervisor, and keeps track of all child processes owned by the current process. This removes much of the burden of managing processes from the programmer.

New child processes can be created in one of two ways:

The spawn builtin, which creates a new child process and executes a user-supplied block inside it in parallel with the current process.
Calling external commands, which executes the command in a child process.

In both of these cases, newly created processes have their process IDs recorded in the global process table, which maintains a list of all child processes the runtime is aware of.
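A minimal sketch of such a process table, written in Python under the assumption that the table maps process IDs to child handles. The class and method names are hypothetical, and only the external-command case is shown.

```python
import subprocess
import sys


class ProcessTable:
    def __init__(self):
        self.children = {}  # process ID -> handle of the child process

    def run_command(self, argv):
        """Start a command in a child process and record its process ID."""
        child = subprocess.Popen(argv)
        self.children[child.pid] = child
        return child.pid

    def reap(self, pid):
        """Wait for a child to exit, drop it from the table, return its code."""
        code = self.children[pid].wait()
        del self.children[pid]
        return code


table = ProcessTable()
pid = table.run_command([sys.executable, "-c", "pass"])
was_tracked = pid in table.children
exit_code = table.reap(pid)
```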

On Unix-like systems, when the process

9.4. Input and output

Pipes

10. Standard library

This section of the reference describes all of the built-in functions that must be provided by the Riptide runtime for any program.

10.1. Logical and control flow functions

`=`

Test equivalence.

`and`

Logical AND.

`or`

Logical OR.

`not`

Negate a boolean.

`if`

Conditional branching.

`cond`

Multiple conditional branching.

`foreach`

Iterate over a list.

10.2. Core functions

`def`

Define a new variable. Throws an exception if the variable is already defined.

def myvar "Hello, World!"

`let`

Introduces a scoped local variable binding.

def foo "bar"

let foo "baz" {
    println $foo # prints "baz"
}

println $foo # prints "bar"

`set`

Assigns a new value to an existing variable. Throws an exception if the variable is not defined.
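The contrast between `def` and `set` as described in this section can be modeled with a small Python class. This is a sketch of the stated rules only; `Scope`, `define`, `assign`, and `VariableError` are hypothetical names, and nested scopes are not modeled.

```python
class VariableError(Exception):
    """Raised when def or set is used on a variable in the wrong state."""


class Scope:
    def __init__(self):
        self.variables = {}

    def define(self, name, value):
        # Models `def`: the name must not exist yet.
        if name in self.variables:
            raise VariableError(f"variable already defined: {name}")
        self.variables[name] = value

    def assign(self, name, value):
        # Models `set`: the name must already exist.
        if name not in self.variables:
            raise VariableError(f"variable not defined: {name}")
        self.variables[name] = value
```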

`builtin`

Call the builtin function with the given name and arguments.

`command`

Execute an external command as a function.

`help`

Print out user help for using Riptide.

`clone`

Perform a deep clone of the given value and return it.

`call`

Invoke a block with the given arguments.

`list`

Create a list.

`nth`

Return nth item in list.

`source`

Evaluate a script file.

`random`

Produces an output stream of random bytes.

10.3. Environment

`env`

Get, set, or list environment variables.

`pwd`

Get the current working directory.

`cd`

Set the current working directory.

10.4. Input and output

`print`

Writes each argument given to standard output.

`println`

Writes each argument given to standard output, with a trailing newline separator.

`echo`

An alias for println.

`eprint`

Writes each argument given to standard error.

`eprintln`

Writes each argument given to standard error, with a trailing newline separator.

`read`

Read from input.

`lines`

Splits standard input into lines and executes a block for each line.

# Filter out lines starting with "//"
cat 'file.txt' | lines {
    if not (str->starts-with '//' $1) {
        println $1
    }
}

# Transform every line to upper case
cat 'file.txt' | lines {
    println (str->upper $1)
}

10.5. Working with strings

`str?`

Check if the given values are strings.

`str->format`

`str->match`

Applies a regular expression to a string and emits matches and captures.

`str->replace`

Applies a regular expression to a string and replaces matches with the received values.

`split`

Splits a string into a list by a separator.

10.6. Tables

`table-get`

`table-set`

10.7. Stream functions

`send`

Sends one or more values to the current output channel.

`recv`

Receives a value from the input channel.

10.8. Process management

`pid`

Returns the PID of the current process.

`exit`

Terminate the current process, with an optional status code.

Note

By default, all child processes will also be terminated in as safe a manner as possible before the current process exits. Child processes that do not respond will be terminated forcefully. To bypass this behavior, pass the `--orphan` flag.

`spawn`

Spawn a new process and execute a given block within it. Returns the PID of the new process.

Calling spawn will never interrupt the current fiber; the spawned fiber will not be started until at least the current fiber yields.

`kill`

Send an interrupt or signal to a running process.

`sleep`

Suspend the current process for a given amount of time.

`exec`

Execute a command, replacing the current process with the executed process.

Note

Like `exit`, `exec` will do its best to clean up the current process as safely as possible before replacing the current process.

Warning

This replaces the current process, which includes all fibers in the current process.

10.9. Exceptions

`throw`

Throw an exception.

`try`

Execute a block, and if the block throws an exception, invoke a continuation with the error as its first argument.
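The shape of `try` can be sketched in Python as a function that takes a block and a continuation. The name `try_` and the choice to return the continuation's result are assumptions; this document does not state what `try` returns.

```python
def try_(block, on_error):
    """Run block; if it raises, call on_error with the error instead."""
    try:
        return block()
    except Exception as err:
        # The error is passed as the first argument of the continuation.
        return on_error(err)


ok = try_(lambda: 1 + 1, lambda err: -1)
failed = try_(lambda: 1 / 0, lambda err: type(err).__name__)
```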

Appendix A: Design goals

The language should be simple to parse and evaluate so the interpreter can be simple, fast, and maintainable.
Offer only a few orthogonal language semantics so that the core language is easy to learn.
Support traditional command line syntax as the core of the language syntax (command args), so users can get started right away using Riptide as a shell, and then learn the language gradually afterward.
Provide a built-in module system. Let users create their own package managers that work together automatically.
Low-level functionality can be scripted through C extension modules.
Provide built-in support for concurrency through forking processes.
Scripts should fail fast using exceptions with clear messages, rather than continue lumbering along, leaving the user unclear of the state of the world.
Provide data structures needed to create complex programs.
Extend the UNIX philosophy of many small programs that work together. Instead of creating functions that run inside your shell, encourage users to create their scripts as standalone files that can be run from within any shell.

Appendix B: Influences

Riptide draws inspiration from several other languages:

Fish: A shell scripting language that trades POSIX compatibility for friendlier syntax.
Lisp: Functional composition.
Ruby: Block design.
Tcl: Everything is a command, including control structures!