1. Introduction
This manual is the primary reference for the Riptide programming language and defines its core syntax and semantics. The intended audience is developers working on the language itself and curious users who wish to dig deeper into the language. This document is not a complete formal specification at this time.
If you are just getting started with Riptide, we recommend checking out the Guide first.
2. Syntax
The Riptide syntax describes how to read the source code of a valid Riptide program into valid structures.
Riptide programs are always written as a sequence of UTF-8 characters.
2.1. Lines and whitespace
Horizontal whitespace has no meaning, except when used as a separator. When whitespace is used to separate syntactic elements, any run of one or more horizontal whitespace characters counts as a single separator.
Line separators are treated just like horizontal whitespace, except inside blocks. For greater cross-platform support, a newline can be represented in any of three ways: line feed (\n), carriage return (\r), or carriage return followed by a line feed (\r\n).
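The three newline forms can be recognized with a single pattern. The following is a minimal Python sketch, not part of the Riptide implementation; the names NEWLINE and split_lines are hypothetical and exist only for illustration.

```python
import re

# The three newline forms named above. "\r\n" is listed first so that a
# carriage return followed by a line feed counts as one separator, not two.
NEWLINE = re.compile(r"\r\n|\r|\n")


def split_lines(source):
    """Split source text on any of the three newline representations."""
    return NEWLINE.split(source)


print(split_lines("a\nb\rc\r\nd"))  # ['a', 'b', 'c', 'd']
```

Ordering the alternatives this way matters: if "\r" were tried first, a Windows-style line ending would be read as two separators with an empty line between them.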
2.2. Comments
Single line comments begin with a hash character (#) and continue until the end of the line. Multiline comments begin with ### and end with ###. Nesting multiline comments is allowed, but the comment markers must be balanced.
Comments are ignored by the parser and are otherwise treated as whitespace.
2.3. Literals
Number literals
String literals
A string literal is a sequence of any Unicode characters enclosed within two single-quote characters.
println 'I am a string literal.'
List literals
[1 2 3 'a' 'b' 'c']
Table literals
A table literal is an expression used to construct a table with entries defined in code.
[
    foo: 'bar'
    say-hello: {
        println 'hello'
    }
]
2.4. Blocks
A block is a special section of code whose execution is deferred until a later point in a program. A block is also a means of executing multiple statements in sequential order.
A block is defined using curly braces ({ and }) and includes all code between the opening and closing braces.
{
    println "I am in a block."
}
Inside a block is a list of statements, which are each pipelines to be executed. Statements may be separated by newlines or optionally by the statement terminator, a single semicolon (;). Both separators are equivalent.
{
    println "Statement one."
    println "Statement two."; println "Statement three."
}
2.5. Formal grammar
Below is the full specification of the Riptide grammar. This is the actual specification from which the language parser is generated.
// The Riptide language grammar.
//
// This file is used both to generate the parser, and serves as the canonical
// specification for the language syntax.
// A program string. Like a block without surrounding curly braces or params.
// "SOI" and "EOI" refer to the start and end of the file, respectively.
program = { SOI ~ statement_list ~ EOI }
// Blocks are surrounded by curly braces, with an optional angle bracket
// delimited parameter list preceding it.
block = { block_params? ~ "{" ~ statement_list ~ "}" }
block_params = { "<" ~ block_params_list ~ ">" }
block_params_list = _{
    vararg_param_decl
    | param_decl ~ ("," ~ block_params_list)?
}
param_decl = { symbol }
vararg_param_decl = { "..." ~ symbol }
// A subroutine is just a block with an explicit name.
subroutine = { "sub" ~ symbol ~ block }
// Blocks and programs are lists of statements.
statement_list = { statement_separator* ~ (statement ~ (statement_separator+ ~ statement)*)? ~ statement_separator* }
statement_separator = _{ NEWLINE | ";" }
// A statement can be an import, an assignment, a return, or a pipeline.
statement = _{ import_statement | assignment_statement | return_statement | pipeline_statement }
import_statement = { KEYWORD_IMPORT ~ string_literal ~ "for" ~ import_clause }
import_clause = { import_wildcard | import_items }
import_items = { string_literal+ }
import_wildcard = { "*" }
assignment_statement = { assignment_target ~ "=" ~ expr }
// An expression used as the target of an assignment statement.
assignment_target = {
    member_access_expr
    | &"$" ~ variable_substitution
}
return_statement = { KEYWORD_RETURN ~ expr? }
pipeline_statement = { pipeline }
// Expression is the main syntax building block.
expr = {
    member_access_expr
    | unary_expr
}
unary_expr = _{
    block
    | subroutine
    | "(" ~ pipeline ~ ")"
    | cvar_scope
    | cvar
    | regex_literal
    | substitution
    | table_literal
    | list_literal
    | number_literal
    | interpolated_string
    | string_literal
}
regex_literal = ${ "`" ~ ("\\\\" | "\\/" | !"`" ~ ANY)* ~ "`" }
member_access_expr = { unary_expr ~ (member_operator ~ string_literal)+ }
// Pipelines are function calls chained together with the pipe "|" operator.
//
// The "!" prefix forces insignificant whitespace back on, which allows
// whitespace in a pipeline inside substitutions.
pipeline = !{ call ~ ("|" ~ call)* }
// A function call is a reference to a function followed by a series of argument
// expressions.
call = { named_call | unnamed_call }
named_call = { string_literal ~ call_args }
unnamed_call = { expr ~ call_args }
call_args = _{ call_arg* }
call_arg = { splat_arg | expr }
splat_arg = { "..." ~ expr }
// Reference a context variable.
cvar = ${ "@" ~ string_literal }
// Binds a context variable to a value for the duration of a scope.
cvar_scope = { KEYWORD_LET ~ cvar ~ "=" ~ expr ~ block }
// Dollar sign indicates the start of some form of substitution.
substitution = ${ &"$" ~ (
    format_substitution
    | pipeline_substitution
    | variable_substitution
) }
format_substitution = ${ "${" ~ string_literal ~ (format_flags_separator ~ format_substitution_flags)? ~ "}" }
format_substitution_flags = ${ (ASCII_ALPHANUMERIC | "_" | ".")+ }
pipeline_substitution = ${ "$(" ~ pipeline ~ ")" }
variable_substitution = ${ "$" ~ string_literal }
// A table literal expression is used to create tables declaratively.
table_literal = { "[" ~ NEWLINE* ~ ((table_literal_entry ~ NEWLINE*)+ | ":") ~ NEWLINE* ~ "]" }
table_literal_entry = { expr ~ ":" ~ expr }
// A list literal creates a list declaratively from a sequence of expressions.
list_literal = { "[" ~ (NEWLINE* ~ expr)* ~ NEWLINE* ~ "]" }
// An interpolated string is surrounded by double quotes, and is made up of a
// sequence of parts that, when stringified and concatenated in order, form the
// desired string value.
//
// Escapes are handled later in the parser pipeline.
interpolated_string = ${ "\"" ~ interpolated_string_part* ~ "\"" }
interpolated_string_part = ${ substitution | interpolated_string_literal_part }
interpolated_string_literal_part = ${ ("\\\"" | "\\$" | !"\"" ~ !"$" ~ ANY)+ }
// A literal string. String literals are static and have no runtime
// interpolation.
// Escapes are handled later in the parser pipeline.
string_literal = ${ "'" ~ single_quote_inner ~ "'" | symbol }
single_quote_inner = ${ ("\\'" | !"'" ~ ANY)* }
// Numbers are floating point.
number_literal = ${ "-"? ~ ("." ~ ASCII_DIGIT+ | ASCII_DIGIT+ ~ ("." ~ ASCII_DIGIT+)?) }
// A symbol is an unquoted string, usually used for identifying variable names.
symbol_char = _{ ASCII_ALPHANUMERIC | "_" | "-" | "?" | "!" | "." | "/" | "*" | "=" }
symbol = ${ !reserved_words ~ symbol_char ~ (!member_operator ~ symbol_char)* }
// A list of keywords that are not allowed as bare identifiers because they have
// special meaning.
reserved_words = _{ KEYWORD_IMPORT | KEYWORD_LET | KEYWORD_RETURN }
// Operator to access namespaces and table members.
member_operator = _{ "->" }
// Separator for specifying format parameters.
format_flags_separator = _{ ":" }
// Inline comments are similar to UNIX shells, where "#" starts a comment and
// includes all following characters until end of line.
COMMENT = _{ "#" ~ (!NEWLINE ~ ANY)* }
// Only horizontal whitespace is insignificant; vertical whitespace is used to
// separate statements in blocks.
WHITESPACE = _{ " " | "\t" | "\\" ~ NEWLINE }
// All reserved keywords.
KEYWORD_IMPORT = _{ "import" }
KEYWORD_LET = _{ "let" }
KEYWORD_RETURN = _{ "return" }
The grammar is written in the syntax of Pest, a modern parser generator. Reading through the Pest book is a good way to gain a thorough understanding of how the Riptide grammar works.
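To make one of the rules above concrete, the number_literal rule can be transcribed into a regular expression. This is a hedged Python sketch that only checks whether a whole token has the shape the rule describes. It does not reproduce Pest's rule ordering or the rest of the parser, and the names NUMBER_LITERAL and is_number_literal are hypothetical.

```python
import re

# Transcription of:
#   "-"? ~ ("." ~ ASCII_DIGIT+ | ASCII_DIGIT+ ~ ("." ~ ASCII_DIGIT+)?)
# ASCII_DIGIT is written as [0-9] so that only ASCII digits are accepted.
NUMBER_LITERAL = re.compile(r"-?(?:\.[0-9]+|[0-9]+(?:\.[0-9]+)?)")


def is_number_literal(text):
    """Return True if the whole text has the shape of a number literal."""
    return NUMBER_LITERAL.fullmatch(text) is not None


for candidate in ["42", "-1.5", ".5", "1.", "-", "1e3"]:
    print(candidate, is_number_literal(candidate))
```

Under this reading, "42", "-1.5", and ".5" are accepted, while "1." (no digits after the point), "-" (no digits at all), and "1e3" (no exponent syntax in the rule) are rejected as complete number literals.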
3. Data types
Riptide has a simple data model and only offers a few basic data types.
3.1. Strings
The string is the most fundamental data type in Riptide. A string is a fixed-length array of bytes. Usually it contains text encoded using the UTF-8 encoding, but non-text strings are also valid.
Strings are immutable values; functions that combine or modify text return new strings rather than modify existing strings.
Since strings are immutable, it is implementation-defined as to whether strings are passed by value or by reference, as it makes no difference in program behavior.
Strings can be created without quotes, with single quotes ('), or with double quotes ("), each with a slightly different meaning.
3.2. Numbers
Only one number type is offered. All numbers are double-precision floating-point numbers.
Numbers are immutable values.
3.3. Lists
Lists are the first compound data type: a container type that can hold multiple other data items. A list is a fixed-length array containing zero or more values in sequence. The contained values can be of any data type.
Lists are immutable values, and cannot be modified once they are created.
It is implementation-defined as to whether lists are passed by value or by reference, as it makes no difference in program behavior.
3.4. Tables
A table (or associative array) is a collection of key-value pairs, where each key appears at most once in the collection.
Unlike other data types, tables are mutable and can be modified in place.
Tables are passed by reference instead of by value.
The storage representation of a table is implementation-defined.
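The practical difference between immutable lists and mutable, reference-passed tables can be illustrated by analogy. The sketch below is Python, not Riptide: a tuple stands in for a Riptide list and a dict stands in for a Riptide table, and both helper functions are hypothetical.

```python
# Stand-ins (assumptions for illustration only): a Python tuple for an
# immutable Riptide list, and a Python dict for a mutable Riptide table.


def with_item(items, item):
    """Return a new tuple; the original is left untouched."""
    return items + (item,)


def set_entry(table, key, value):
    """Mutate the table in place; every holder of the reference sees it."""
    table[key] = value


original = (1, 2, 3)
extended = with_item(original, 4)
print(original, extended)  # (1, 2, 3) (1, 2, 3, 4)

config = {"name": "riptide"}
alias = config  # the same table, not a copy
set_entry(alias, "shell", True)
print(config)  # {'name': 'riptide', 'shell': True}
```

Because the list stand-in can never change, it makes no observable difference whether it is copied or shared. The table stand-in behaves differently: a change made through one reference is visible through every other reference.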
3.5. Closures
4. Expressions
Riptide is an expression-based language: nearly every construct is an expression, and expressions are the most important building block of Riptide.
Every expression has a resulting value when it is executed.
4.1. Literal expressions
A literal expression consists of a single literal value. The resulting value for a literal expression is always the literal value written. See Literals for details.
4.2. Pipeline expressions
4.3. Block expressions
A block expression defines a new block.
5. Lexical scope and variables
Variables must be explicitly declared before they are used. By default, a variable is confined to the lexical scope in which it is defined.
5.1. Function calls
5.2. Pipelines
6. Execution model
6.1. Local variables
Local variables are lexically scoped bindings of names to values, and only exist inside the function they are defined in.
Local variables are mutable in the sense that they can be redefined at any time.
Local variables can be referenced by name using the $ sigil.
Variables can be defined or reassigned using the set builtin function:
# Bind the string "Hello world!" to the variable $foo.
set foo "Hello world!"
6.2. Context variables
In contrast with local variables, which are lexically scoped, context variables are a form of global variable that offers dynamic scoping.
Context variables can be referenced by name using the @ sigil.
let @cvar = foo {
    println @cvar # foo
    let @cvar = bar {
        println @cvar # bar
    }
    println @cvar # foo
}
6.3. Binding resolution
6.4. Exceptions
As is common in many languages, exceptions offer a means of breaking out of regular control flow when runtime errors are encountered or other exceptional situations arise.
When the Riptide runtime encounters a recoverable error, it raises an exception that describes the error that occurred.
Note: Not all errors in the runtime get turned into exceptions. If an error occurs that the runtime cannot safely recover from, such as running out of memory or data corruption, the program will be aborted instead.
Riptide programs are also free to raise their own exceptions at any time during program execution using the throw builtin function.
Regardless of the origin of the exception, when an exception is raised, the current function call is aborted recursively in a process called stack unwinding, until the exception is caught. A raised exception may be caught by the first try block encountered that wraps the offending code.
If a raised exception is not caught during stack unwinding before the top of the stack is reached, then the runtime will attempt to print a stack trace of the exception if possible, then abort the program.
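Stack unwinding as described above can be illustrated with Python's own exception mechanism, which follows the same general model. This is an analogy rather than Riptide code; RiptideError and the three functions are hypothetical names.

```python
class RiptideError(Exception):
    """Hypothetical stand-in for a raised Riptide exception."""


def inner():
    # Plays the role of a call to the throw builtin.
    raise RiptideError("something went wrong")


def middle():
    inner()
    print("never reached")  # skipped: this call is aborted during unwinding


def outer():
    # Plays the role of the first try block wrapping the offending code.
    try:
        middle()
    except RiptideError as err:
        return f"caught: {err}"


print(outer())  # caught: something went wrong
```

Both inner and middle are abandoned before they finish, and control resumes at the nearest enclosing handler. If outer had no handler, the exception would continue to the top of the stack.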
7. Modules
Caution: The module system has yet to be designed!
8. External commands
External commands can be executed in the same way as functions are, and use the same function call mechanism.
Native data types passed to a command as arguments are coalesced into strings and then passed in as program arguments. The function call waits for the command to finish, then returns the exit code of the command as a number.
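The call convention above can be modeled in a few lines. The Python sketch below is only an approximation: str() stands in for however Riptide coalesces values into strings, which this manual does not specify, and run_command is a hypothetical helper.

```python
import subprocess
import sys


def run_command(program, *args):
    """Run an external program and return its exit code as a number."""
    # Assumption: str() approximates Riptide's coalescing of values to strings.
    argv = [program] + [str(arg) for arg in args]
    completed = subprocess.run(argv)  # waits for the command to finish
    return float(completed.returncode)


# The child exits with a status equal to the number of extra arguments it saw.
child_code = "import sys; sys.exit(len(sys.argv) - 1)"
status = run_command(sys.executable, "-c", child_code, 1, 2.5, "three")
print(status)  # 3.0
```

The integer, the float, and the string all arrive in the child as plain text arguments, and the only value that comes back to the caller is the numeric exit status.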
9. Platform interaction
9.1. Environment variables
Process environment variables are exposed to a Riptide program via an environment context variable. This variable is populated with a map of all of the current process environment variables when the runtime is initialized.
The environment map is not linked to the process environment map after initialization; modifying the contents of the map at runtime does not update the current process’s environment. Whenever a subprocess is spawned, the subprocess’s environment is created by exporting the current value of environment. This mimics normal environment variable support without the normal overhead required, and offers the benefits of being a regular context variable.
Example:
let @environment->FOO "bar" {
    printenv
}
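The snapshot-and-export behavior described in this section can be sketched in Python. This is a model under assumptions, not the runtime's code: the variable name RIPTIDE_DEMO_FOO is made up for the demonstration and is assumed not to be set in the real environment.

```python
import os
import subprocess
import sys

# Snapshot taken once, as the runtime would do at initialization.
environment = dict(os.environ)

# Modifying the snapshot leaves the real process environment untouched.
environment["RIPTIDE_DEMO_FOO"] = "bar"
print("RIPTIDE_DEMO_FOO" in os.environ)

# A spawned subprocess receives the current value of the snapshot.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['RIPTIDE_DEMO_FOO'])"],
    env=environment,
    capture_output=True,
    text=True,
)
print(child.stdout.strip())  # bar
```

The child sees the variable even though the parent's own process environment was never modified, which is the decoupling the text describes.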
9.2. Working directory
The current "working directory" of the current process is exposed as a special cwd context variable. This variable is populated when the process starts from the working directory reported by the OS.
Changes to cwd are not required to be reflected in the process working directory, but cwd must be respected for all relative path resolution, and newly spawned processes must inherit the current value of cwd.
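One way to satisfy the path resolution requirement without changing the OS-level working directory is to resolve every relative path against the stored cwd value. The Python sketch below illustrates that idea only; the Runtime class and its resolve method are hypothetical, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath


class Runtime:
    """Hypothetical runtime that keeps its own cwd value."""

    def __init__(self, cwd):
        self.cwd = PurePosixPath(cwd)

    def resolve(self, path):
        # Joining with an absolute path yields that absolute path unchanged,
        # so only relative paths are affected by cwd.
        return self.cwd / path


runtime = Runtime("/home/user")
runtime.cwd = PurePosixPath("/tmp/work")  # the OS working directory is untouched

print(runtime.resolve("notes.txt"))   # /tmp/work/notes.txt
print(runtime.resolve("/etc/hosts"))  # /etc/hosts
```

A real runtime would also pass the same value as the working directory of each process it spawns, so that child processes inherit it.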
9.3. Processes
As process parallelism and external commands are essential features of Riptide, defining how Riptide manages external and child processes is paramount.
The runtime acts as a form of process supervisor, and keeps track of all child processes owned by the current process. This removes much of the burden of managing processes from the programmer.
New child processes can be created in one of two ways:
- The spawn builtin, which creates a new child process and executes a user-supplied block inside it in parallel with the current process.
- Calling external commands, which executes the command in a child process.
In both of these cases, newly created processes have their process IDs recorded in the global process table, which maintains a list of all child processes the runtime is aware of.
On Unix-like systems, when the process
9.4. Input and output
Pipes
10. Standard library
This section of the reference describes all of the built-in functions that must be provided by the Riptide runtime for any program.
10.1. Logical and control flow functions
=
Test equivalence.
and
Logical AND.
or
Logical OR.
not
Negate a boolean.
if
Conditional branching.
cond
Multiple conditional branching.
foreach
Iterate over a list.
10.2. Core functions
def
Define a new variable. Throws an exception if the variable is already defined.
def myvar "Hello, World!"
let
Introduces a scoped local variable binding.
def foo "bar"
let foo "baz" {
    println $foo # prints "baz"
}
println $foo # prints "bar"
set
Assigns a new value to an existing variable. Throws an exception if the variable is not defined.
builtin
Call the builtin function with the given name and arguments.
command
Execute an external command as a function.
help
Print out user help for using Riptide.
clone
Perform a deep clone of the given value and return it.
call
Invoke a block with the given arguments.
list
Create a list.
nth
Return nth item in list.
source
Evaluate a script file.
random
Produces an output stream of random bytes.
10.3. Environment
env
Get, set, or list environment variables.
pwd
Get the current working directory.
cd
Set the current working directory.
10.4. Input and output
print
Writes each argument given to standard output.
println
Writes each argument given to standard output, with a trailing newline separator.
echo
An alias for println.
eprint
Writes each argument given to standard error.
eprintln
Writes each argument given to standard error, with a trailing newline separator.
read
Read from input.
lines
Splits standard input into lines and executes a block for each line.
# Filter out lines starting with "//"
cat 'file.txt' | lines {
    if not (str->starts-with '//' $1) {
        println $1
    }
}
# Transform every line to upper case
cat 'file.txt' | lines {
    println (str->upper $1)
}
10.5. Working with strings
str?
Check if the given values are strings.
str->format
str->match
Applies a regular expression to a string and emits matches and captures.
str->replace
Applies a regular expression to a string and replaces matches with the received values.
split
Splits a string into a list by a separator.
10.6. Tables
table-get
table-set
10.7. Stream functions
send
Sends one or more values to the current output channel.
recv
Receives a value from the input channel.
10.8. Process management
pid
Returns the PID of the current process.
exit
Terminate the current process, with an optional status code.
Note: By default, all child processes will also be terminated in as safe a manner as possible before the current process exits. Child processes that do not respond will be terminated forcefully. To bypass this behavior, pass the --orphan flag.
spawn
Spawn a new process and execute a given block within it. Returns the PID of the new process.
Calling spawn will never interrupt the current fiber; the spawned fiber will not be started until at least the current fiber yields.
kill
Send an interrupt or signal to a running process.
sleep
Suspend the current process for a given amount of time.
exec
Execute a command, replacing the current process with the executed process.
Note: Like exit, exec will do its best to clean up the current process as safely as possible before replacing the current process.
Warning: This replaces the current process, which includes all fibers in the current process.
10.9. Exceptions
throw
Throw an exception.
try
Execute a block, and if the block throws an exception, invoke a continuation with the error as its first argument.
Appendix A: Design goals
- The language should be simple to parse and evaluate so the interpreter can be simple, fast, and maintainable.
- Only a few orthogonal language semantics so that the core language is easy to learn.
- Support traditional command line syntax as the core of the language syntax (command args), so users can get started right away using Riptide as a shell, and then learn the language gradually afterward.
- Provide a built-in module system. Let users create their own package managers that work together automatically.
- Low-level functionality can be scripted through C extension modules.
- Provide built-in support for concurrency through forking processes.
- Scripts should fail fast using exceptions with clear messages, rather than continue lumbering along, leaving the user unclear of the state of the world.
- Provide data structures needed to create complex programs.
- Extend the UNIX philosophy of many small programs that work together. Instead of creating functions that run inside your shell, encourage users to create their scripts as standalone files that can be run from within any shell.