Rust syntax

Set of rules defining correctly structured programs for the Rust programming language. From Wikipedia, the free encyclopedia.

The syntax of Rust is the set of rules defining how a Rust program is written and compiled.

[Figure: A snippet of Rust code]

Rust's syntax is similar to that of C and C++,[1][2] although many of its features were influenced by functional programming languages such as OCaml.[3]

Basics

Although Rust's syntax is heavily influenced by that of C and C++, it diverges from C++ more than the syntaxes of Java or C# do, since those languages retain more C-style declarations, primitive type names, and keywords.

Below is a "Hello, World!" program in Rust. The fn keyword denotes a function, and the println! macro (see § Macros) prints the message to standard output.[4] Statements in Rust are separated by semicolons.

fn main() {
    println!("Hello, World!");
}

Reserved words

Keywords

The following words are reserved, and may not be used as identifiers:

  • as
  • async
  • await
  • break
  • const
  • continue
  • crate
  • dyn
  • else
  • enum
  • extern
  • false
  • fn
  • for
  • if
  • impl
  • in
  • let
  • loop
  • match
  • mod
  • move
  • mut
  • pub
  • ref
  • return
  • Self
  • self
  • static
  • struct
  • super
  • trait
  • true
  • type
  • union
  • unsafe
  • use
  • where
  • while

Unused words

The following words are reserved as keywords, but currently have no use or purpose.

  • abstract
  • become
  • box
  • do
  • final
  • gen
  • macro
  • override
  • priv
  • try
  • typeof
  • unsized
  • virtual
  • yield

Variables

Variables in Rust are defined through the let keyword.[5] The example below assigns a value to the variable with name foo and outputs its value.

fn main() {
    let foo = 10;
    println!("The value of foo is {foo}");
}

Variables are immutable by default, but adding the mut keyword allows the variable to be mutated.[6] The following example uses //, which denotes the start of a comment.[7]

fn main() {
    // This code would not compile without adding "mut".
    let mut foo = 10; 
    println!("The value of foo is {foo}");
    foo = 20;
    println!("The value of foo is {foo}");
}

Multiple let statements can define multiple variables with the same name, known as variable shadowing. Variable shadowing allows transforming variables without having to name the variables differently.[8] The example below declares a new variable with the same name that is double the original value:

fn main() {
    let foo = 10;
    // This will output "The value of foo is 10"
    println!("The value of foo is {foo}");
    let foo = foo * 2;
    // This will output "The value of foo is 20"
    println!("The value of foo is {foo}");
}

Variable shadowing is also possible for values of different types. For example, going from a string to its length:

fn main() {
    let letters = "abc";
    let letters = letters.len();
}

Block expressions and control flow

A block expression is delimited by curly brackets. When the last expression inside a block does not end with a semicolon, the block evaluates to the value of that trailing expression:[9]

fn main() {
    let x = {
        println!("this is inside the block");
        1 + 2
    };
    println!("1 + 2 = {x}");
}

Trailing expressions of function bodies are used as the return value:[10]

fn add_two(x: i32) -> i32 {
    x + 2
}

if expressions

An if conditional expression executes code based on whether the given value is true. else can be used for when the value evaluates to false, and else if can be used for combining multiple expressions.[11]

fn main() {
    let x = 10;
    if x > 5 {
        println!("value is greater than five");
    }

    if x % 7 == 0 {
        println!("value is divisible by 7");
    } else if x % 5 == 0 {
        println!("value is divisible by 5");
    } else {
        println!("value is not divisible by 7 or 5");
    }
}

if and else blocks can evaluate to a value, which can then be assigned to a variable:[11]

fn main() {
    let x = 10;
    let new_x = if x % 2 == 0 { x / 2 } else { 3 * x + 1 };
    println!("{new_x}");
}

while loops

while can be used to repeat a block of code while a condition is met.[12]

fn main() {
    // Iterate over all integers from 4 to 10
    let mut value = 4;
    while value <= 10 {
        println!("value = {value}");
        value += 1;
    }
}

for loops and iterators

For loops in Rust loop over elements of a collection.[13] for expressions work over any iterator type.

fn main() {
    // Using `for` with range syntax for the same functionality as above
    // The syntax 4..=10 means the range from 4 to 10, up to and including 10.
    for value in 4..=10 {
        println!("value = {value}");
    }
}

In the above code, 4..=10 is a value of type RangeInclusive, which implements the Iterator trait. The code within the curly braces is applied to each element returned by the iterator.

Iterators can be combined with functions over iterators like map, filter, and sum. For example, the following adds up all numbers between 1 and 100 that are multiples of 3:

// The type annotation tells sum() which numeric type to produce.
let total: u32 = (1..=100).filter(|&x| x % 3 == 0).sum(); // 1683
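As a further illustration of chaining these adapters, the sketch below combines filter, map, and sum in one pipeline. The function name sum_of_even_squares is hypothetical and used only for this example.

```rust
// Hypothetical helper: sums the squares of the even numbers in 1..=n.
fn sum_of_even_squares(n: u32) -> u32 {
    (1..=n)
        .filter(|x| x % 2 == 0) // keep only the even numbers
        .map(|x| x * x)         // square each remaining number
        .sum()                  // add the squares together
}

fn main() {
    // 4 + 16 + 36 + 64 + 100
    println!("{}", sum_of_even_squares(10)); // prints 220
}
```

Iterator adapters are lazy, so no work happens until a consuming method such as sum is called.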

loop and break statements

More generally, the loop keyword allows repeating a portion of code until a break occurs. break may optionally exit the loop with a value. In the case of nested loops, labels denoted by 'label_name can be used to break an outer loop rather than the innermost loop.[14]

fn main() {
    let value = 456;
    let mut x = 1;
    let y = loop {
        x *= 10;
        if x > value {
            break x / 10;
        }
    };
    println!("largest power of ten that is smaller than or equal to value: {y}");

    let mut up = 1;
    'outer: loop {
        let mut down = 120;
        loop {
            if up > 100 {
                break 'outer;
            }

            if down < 4 {
                break;
            }

            down /= 2;
            up += 1;
            println!("up: {up}, down: {down}");
        }
        up *= 2;
    }
}

Pattern matching

The match and if let expressions can be used for pattern matching. For example, match can be used to double an optional integer value if present, and return zero otherwise:[15]

fn double(x: Option<u64>) -> u64 {
    match x {
        Some(y) => y * 2,
        None => 0,
    }
}

Equivalently, this can be written with if let and else:

fn double(x: Option<u64>) -> u64 {
    if let Some(y) = x {
        y * 2
    } else {
        0
    }
}

Types

Rust is strongly typed and statically typed, meaning that the types of all variables must be known at compilation time. Assigning a value of a particular type to a differently typed variable causes a compilation error. Type inference is used to determine the type of variables if unspecified.[16]

The default integer type is i32, and the default floating point type is f64. If the type of a literal number is not explicitly provided, it is either inferred from the context or the default type is used.[17]
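The defaulting and inference rules can be observed by checking the size of inferred values. The following is a minimal sketch; the helper name inferred_sizes is an assumption made for this example.

```rust
use std::mem::size_of_val;

// Hypothetical helper: returns the size in bytes of three inferred values.
fn inferred_sizes() -> (usize, usize, usize) {
    let a = 42;      // nothing else constrains this literal, so it defaults to i32
    let b = 2.5;     // floating-point literals default to f64
    let c: u8 = 200; // explicit type annotation
    let d = c + 55;  // the literal 55 is inferred as u8 to match c
    (size_of_val(&a), size_of_val(&b), size_of_val(&d))
}

fn main() {
    let (a, b, d) = inferred_sizes();
    println!("i32: {a} bytes, f64: {b} bytes, u8: {d} byte");
}
```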

Primitive types

Integer types in Rust are named based on the signedness and the number of bits the type takes. For example, i32 is a signed integer that takes 32 bits of storage, whereas u8 is unsigned and only takes 8 bits of storage. isize and usize take storage depending on the architecture of the computer that runs the code, for example, on computers with 32-bit architectures, both types will take up 32 bits of space.

Integer literals are written in base 10 by default, but other radices are supported with prefixes: for example, 0b11 for binary, 0o567 for octal, and 0xDB for hexadecimal. An integer literal with no other type constraint defaults to i32. Suffixes such as 4u32 can be used to explicitly set the type of a literal.[18] Byte literals such as b'X' are available to represent the ASCII value (as a u8) of a specific character.[19]

The Boolean type is referred to as bool which can take a value of either true or false. A char takes up 32 bits of space and represents a Unicode scalar value: a Unicode codepoint that is not a surrogate.[20] IEEE 754 floating point numbers are supported with f32 for single precision floats and f64 for double precision floats.[21]
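The literal forms and sizes described above can be checked directly. In this sketch the constant names are hypothetical; the literals themselves are the ones used in the text.

```rust
use std::mem::size_of;

// Hypothetical constants, one per literal form mentioned above.
const BINARY: i32 = 0b11;  // binary prefix
const OCTAL: i32 = 0o567;  // octal prefix
const HEX: i32 = 0xDB;     // hexadecimal prefix
const TYPED: u32 = 4u32;   // type suffix on the literal
const BYTE: u8 = b'X';     // byte literal: the ASCII value of 'X'

fn main() {
    println!("{BINARY} {OCTAL} {HEX} {TYPED} {BYTE}"); // prints 3 375 219 4 88
    // A char is four bytes wide regardless of the character it holds.
    println!("{}", size_of::<char>()); // prints 4
}
```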

Compound types

Compound types can contain multiple values. Tuples are fixed-size lists that can contain values whose types can be different. Arrays are fixed-size lists whose values are of the same type. Expressions of the tuple and array types can be written through listing the values, and can be accessed with .index or [index]:[22]

let tuple: (u32, i64) = (3, -3);
let array: [i8; 5] = [1, 2, 3, 4, 5];
// A second tuple, named differently so that it does not shadow `tuple`
let flags: (bool, bool) = (true, true);
let value = tuple.1; // -3
let value = array[2]; // 3

Arrays can also be constructed through copying a single value a number of times:[23]

let array2: [char; 10] = [' '; 10];

Ownership and references

Rust's ownership system consists of rules that ensure memory safety without using a garbage collector. At compile time, each value must be attached to a variable called the owner of that value, and every value must have exactly one owner.[24] Values are moved between different owners through assignment or passing a value as a function parameter. Values can also be borrowed, meaning they are temporarily passed to a different function before being returned to the owner.[25] With these rules, Rust can prevent the creation and use of dangling pointers:[25][26]

fn print_string(s: String) {
    println!("{}", s);
}

fn main() {
    let s = String::from("Hello, World");
    print_string(s); // s consumed by print_string
    // s has been moved, so cannot be used any more
    // another print_string(s); would result in a compile error
}

The function print_string takes ownership of the String value passed in. Alternatively, & can be used to indicate a reference type (in &String) and to create a reference (in &s):[27]

fn print_string(s: &String) {
    println!("{}", s);
}

fn main() {
    let s = String::from("Hello, World");
    print_string(&s); // s borrowed by print_string
    print_string(&s); // s has not been consumed; we can call the function many times
}


Because of these ownership rules, Rust types are known as affine types, meaning each value can be used at most once. This enforces a form of software fault isolation as the owner of a value is solely responsible for its correctness and deallocation.[28]

When a value goes out of scope, it is dropped by running its destructor. The destructor may be programmatically defined through implementing the Drop trait. This helps manage resources such as file handles, network sockets, and locks, since when objects are dropped, the resources associated with them are closed or released automatically.[29]
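A minimal sketch of a user-defined destructor follows. The Guard type, its fields, and the drop_order helper are hypothetical; the example records messages in a log so that the order of destruction is visible.

```rust
use std::cell::RefCell;

// Hypothetical resource type that reports when it is released.
struct Guard<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Guard<'_> {
    // Runs automatically when a Guard goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("released {}", self.name));
    }
}

// Hypothetical helper: returns the messages recorded while two guards are dropped.
fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Guard { name: "a", log: &log };
        let _b = Guard { name: "b", log: &log };
        log.borrow_mut().push("end of scope".to_string());
    } // _b is dropped first, then _a: locals are dropped in reverse declaration order
    log.into_inner()
}

fn main() {
    for line in drop_order() {
        println!("{line}");
    }
}
```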

Lifetimes

Object lifetime refers to the period of time during which a reference is valid; that is, the time between the object creation and destruction.[30] These lifetimes are implicitly associated with all Rust reference types. While often inferred, they can also be indicated explicitly with named lifetime parameters (often denoted 'a, 'b, and so on).[31]

Lifetimes in Rust can be thought of as lexically scoped, meaning that the duration of an object lifetime is inferred from the set of locations in the source code (i.e., function, line, and column numbers) for which a variable is valid.[32] For example, a reference to a local variable has a lifetime corresponding to the block it is defined in:[32]

fn main() {
    let x = 5;                // ------------------+- Lifetime 'a
                              //                   |
    let r = &x;               // -+-- Lifetime 'b  |
                              //  |                |
    println!("r: {}", r);     //  |                |
                              //  |                |
                              // -+                |
}                             // ------------------+

The borrow checker in the Rust compiler then enforces that references are only used in the locations of the source code where the associated lifetime is valid.[33][34] In the example above, storing a reference to variable x in r is valid, as variable x has a longer lifetime ('a) than variable r ('b). However, when x has a shorter lifetime, the borrow checker would reject the program:

fn main() {
    let r;                    // ------------------+- Lifetime 'a
                              //                   |
    {                         //                   |
        let x = 5;            // -+-- Lifetime 'b  |
        r = &x; // ERROR: x does  |                |
    }           // not live long -|                |
                // enough                          |
    println!("r: {}", r);     //                   |
}                             // ------------------+

Since the lifetime of the referenced variable ('b) is shorter than the lifetime of the variable holding the reference ('a), the borrow checker errors, preventing x from being used from outside its scope.[35]

Lifetimes can be indicated using explicit lifetime parameters on function arguments. For example, the following code specifies that the reference returned by the function has the same lifetime as original (and not necessarily the same lifetime as prefix):[36]

fn remove_prefix<'a>(mut original: &'a str, prefix: &str) -> &'a str {
    if original.starts_with(prefix) {
        // Slicing yields a `str`, so a borrow is needed to get a `&str`.
        original = &original[prefix.len()..];
    }
    original
}

In the compiler, ownership and lifetimes work together to prevent memory safety issues such as dangling pointers.[37][38]

User-defined types

User-defined types are created with the struct or enum keywords. The struct keyword is used to denote a record type that groups multiple related values.[39] An enum can take on different variants at runtime, and its capabilities are similar to those of algebraic data types found in functional programming languages.[40] Both records and enum variants can contain fields with different types.[41] Alternative names, or aliases, for the same type can be defined with the type keyword.[42]

The impl keyword can define methods for a user-defined type. Data and functions are defined separately. Implementations fulfill a role similar to that of classes within other languages.[43]
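The keywords described in this section can be seen together in a short sketch. All names here (Meters, Rectangle, Shape, area) are hypothetical and chosen only for illustration.

```rust
// `type` introduces an alias: Meters is just another name for f64.
type Meters = f64;

// A struct groups related values into one record.
struct Rectangle {
    width: Meters,
    height: Meters,
}

// An enum value is exactly one of its variants, each of which may carry data.
enum Shape {
    Circle { radius: Meters },
    Rect(Rectangle),
}

// Methods are defined separately from the data, in an impl block.
impl Shape {
    fn area(&self) -> Meters {
        match self {
            Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
            Shape::Rect(r) => r.width * r.height,
        }
    }
}

fn main() {
    let shapes = [
        Shape::Circle { radius: 1.0 },
        Shape::Rect(Rectangle { width: 2.0, height: 3.0 }),
    ];
    for shape in &shapes {
        println!("area = {}", shape.area());
    }
}
```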

Standard library

Option values are handled using syntactic sugar, such as the if let construction, to access the inner value (in this case, a string):[58]

fn main() {
    let name1: Option<&str> = None;
    // In this case, nothing will be printed out
    if let Some(name) = name1 {
        println!("{name}");
    }

    let name2: Option<&str> = Some("Matthew");
    // In this case, the word "Matthew" will be printed out
    if let Some(name) = name2 {
        println!("{name}");
    }
}

Pointers

To prevent the use of null pointers and their dereferencing, the basic & and &mut references are guaranteed to not be null. Rust instead uses Option for this purpose: Some(T) indicates that a value is present, and None is analogous to the null pointer.[59] Option implements a "null pointer optimization", avoiding any spatial overhead for types that cannot have a null value (references or the NonZero types, for example).[60] Though null pointers are idiomatically avoided, the null pointer constant in Rust is represented by std::ptr::null().
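The following sketch illustrates both points: Option standing in for a nullable reference, and the null pointer optimization making that Option no larger than the reference itself. The helper first_even is hypothetical.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

// Hypothetical helper: returns a reference to the first even element, if any.
// Where C might return a null pointer, Rust returns None.
fn first_even(values: &[u32]) -> Option<&u32> {
    values.iter().find(|v| **v % 2 == 0)
}

fn main() {
    match first_even(&[1, 3, 4, 5]) {
        Some(v) => println!("found {v}"),
        None => println!("no even value"),
    }

    // None reuses the bit pattern a reference can never have (null),
    // so wrapping a reference in Option costs no extra space.
    println!("{} {}", size_of::<&u32>(), size_of::<Option<&u32>>());
    // The same holds for the NonZero integer types.
    println!("{} {}", size_of::<NonZeroU32>(), size_of::<Option<NonZeroU32>>());
}
```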

Rust also supports raw pointer types *const and *mut, which may be null; however, it is impossible to dereference them unless the code is explicitly declared unsafe through the use of an unsafe block. Unlike dereferencing, the creation of raw pointers is allowed inside of safe Rust code.[61]
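A small sketch of this distinction: the raw pointer is created in safe code, while reading through it requires an unsafe block. The function read_through_raw is a hypothetical name.

```rust
// Hypothetical helper: reads an i32 through a raw pointer.
fn read_through_raw(value: &i32) -> i32 {
    // Creating the raw pointer is allowed in safe code.
    let ptr: *const i32 = value;
    // Dereferencing it is not; the unsafe block marks where the programmer,
    // not the compiler, is responsible for validity.
    // SAFETY: ptr was just derived from a live reference, so it is valid and aligned.
    unsafe { *ptr }
}

fn main() {
    // A raw pointer may be null, and it can be inspected without unsafe.
    let null: *const i32 = std::ptr::null();
    println!("is null: {}", null.is_null()); // prints "is null: true"

    println!("{}", read_through_raw(&7)); // prints 7
}
```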

Type conversion

Rust provides no implicit type conversion (coercion) between most primitive types, but explicit type conversion (casting) can be performed using the as keyword.[62]

let x = 1000;
println!("1000 as a u16 is: {}", x as u16);
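Casts with as never fail, but they can change the value when the target type cannot represent it. The sketch below shows a few such cases; the wrapper function names are hypothetical.

```rust
// Hypothetical wrappers around `as`, so each conversion can be called by name.
fn int_to_u8(x: i32) -> u8 { x as u8 }       // keeps only the low 8 bits
fn int_to_u32(x: i32) -> u32 { x as u32 }    // reinterprets the bit pattern
fn float_to_i32(x: f64) -> i32 { x as i32 }  // truncates toward zero
fn float_to_u8(x: f64) -> u8 { x as u8 }     // saturates at the target's bounds

fn main() {
    println!("{}", int_to_u8(1000));    // prints 232 (1000 modulo 256)
    println!("{}", int_to_u32(-1));     // prints 4294967295
    println!("{}", float_to_i32(3.99)); // prints 3
    println!("{}", float_to_u8(300.0)); // prints 255
}
```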

Polymorphism

Generics

Rust's more advanced features include the use of generic functions. A generic function is given generic parameters, which allow the same function to be applied to different variable types. This capability reduces duplicate code[63] and is known as parametric polymorphism.

The following program calculates the sum of two things, for which addition is implemented using a generic function:

use std::ops::Add;

// sum is a generic function with one type parameter, T
fn sum<T>(num1: T, num2: T) -> T
where  
    T: Add<Output = T>,  // T must implement the Add trait where addition returns another T
{
    num1 + num2  // num1 + num2 is syntactic sugar for num1.add(num2) provided by the Add trait
}

fn main() {
    let result1 = sum(10, 20);
    println!("Sum is: {}", result1); // Sum is: 30

    let result2 = sum(10.23, 20.45);
    println!("Sum is: {}", result2); // Sum is: 30.68
}

At compile time, polymorphic functions like sum are instantiated with the specific types the code requires; in this case, sum of integers and sum of floats.

Generics can be used in functions to allow implementing a behavior for different types without repeating the same code. Generic functions can be written in relation to other generics, without knowing the actual type.[64]

Traits

[Figure: Excerpt from std::io]

Rust's type system supports a mechanism called traits, inspired by type classes in the Haskell language,[65] to define shared behavior between different types. For example, the Add trait can be implemented for floats and integers, which can be added; and the Display or Debug traits can be implemented for any type that can be converted to a string. Traits can be used to provide a set of common behavior for different types without knowing the actual type. This facility is known as ad hoc polymorphism.

Generic functions can constrain the generic type to implement a particular trait or traits; for example, an add_one function might require the type to implement Add. This means that a generic function can be type-checked as soon as it is defined. The implementation of generics is similar to the typical implementation of C++ templates: a separate copy of the code is generated for each instantiation. This is called monomorphization and contrasts with the type erasure scheme typically used in Java and Haskell. Type erasure is also available via the keyword dyn (short for dynamic).[66] Because monomorphization duplicates the code for each type used, it can result in more optimized code for specific-use cases, but compile time and size of the output binary are also increased.[67]
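One possible shape for the add_one function mentioned above is sketched here, alongside a dyn-based function for contrast. The From<u8> bound is an assumption made so that the value 1 can be produced for any numeric T; show_dynamic is a hypothetical name.

```rust
use std::fmt::Display;
use std::ops::Add;

// Monomorphized: the compiler generates a separate copy for each T used.
// The From<u8> bound is an assumption that lets the code build a "1" of type T.
fn add_one<T>(x: T) -> T
where
    T: Add<Output = T> + From<u8>,
{
    x + T::from(1)
}

// Type-erased: a single copy of this function serves every Display type,
// with the formatting method looked up at runtime.
fn show_dynamic(x: &dyn Display) -> String {
    format!("{x}")
}

fn main() {
    println!("{}", add_one(41_i32));               // uses the i32 instantiation
    println!("{}", add_one(1.5_f64));              // uses the f64 instantiation
    println!("{}", show_dynamic(&add_one(9_u64))); // prints 10
}
```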

In addition to defining methods for a user-defined type, the impl keyword can be used to implement a trait for a type.[43] Traits can provide additional derived methods when implemented.[68] For example, the trait Iterator requires that the next method be defined for the type. Once the next method is defined, the trait can provide common functional helper methods over the iterator, such as map or filter.[69]
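As a sketch of this, the hypothetical Countdown type below defines only next, and the provided methods filter, map, and collect become available on it automatically.

```rust
// Hypothetical iterator that counts down from a starting value to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    // The only method that has to be written by hand.
    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

// Hypothetical helper using methods the Iterator trait provides for free.
fn even_countdown_times_ten(from: u32) -> Vec<u32> {
    Countdown { remaining: from }
        .filter(|n| n % 2 == 0)
        .map(|n| n * 10)
        .collect()
}

fn main() {
    println!("{:?}", even_countdown_times_ten(5)); // prints [40, 20]
}
```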

Trait objects

Rust traits are implemented using static dispatch, meaning that the type of all values is known at compile time; however, Rust also uses a feature known as trait objects to accomplish dynamic dispatch, a type of polymorphism where the implementation of a polymorphic operation is chosen at runtime. This allows for behavior similar to duck typing, where all data types that implement a given trait can be treated as functionally equivalent.[70] Trait objects are declared using the syntax dyn Tr where Tr is a trait. Trait objects are dynamically sized, therefore they must be put behind a pointer, such as Box.[71] The following example creates a list of objects where each object can be printed out using the Display trait:

use std::fmt::Display;

let v: Vec<Box<dyn Display>> = vec![
    Box::new(3),
    Box::new(5.0),
    Box::new("hi"),
];

for x in v {
    println!("{x}");
}

If an element in the list does not implement the Display trait, it will cause a compile-time error.[72]

Memory safety

Rust is designed to be memory safe. It does not permit null pointers, dangling pointers, or data races.[73][74][75][76] Data values can be initialized only through a fixed set of forms, all of which require their inputs to be already initialized.[77]

Unsafe code can subvert some of these restrictions, using the unsafe keyword.[61] Unsafe code may also be used for low-level functionality, such as volatile memory access, architecture-specific intrinsics, type punning, and inline assembly.[78]

Memory management

Rust does not use garbage collection. Memory and other resources are instead managed through the "resource acquisition is initialization" convention,[79] with optional reference counting. Rust provides deterministic management of resources, with very low overhead.[80] Values are allocated on the stack by default, and all dynamic allocations must be explicit.[81]
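The sketch below shows an explicit heap allocation with Box and optional reference counting with Rc from the standard library. The helper reference_counts is a hypothetical name.

```rust
use std::rc::Rc;

// Hypothetical helper: returns the reference count observed at three points.
fn reference_counts() -> (usize, usize, usize) {
    let shared = Rc::new(String::from("shared text"));
    let before = Rc::strong_count(&shared);

    // Cloning an Rc adds a handle and bumps the count; the String is not copied.
    let second = Rc::clone(&shared);
    let during = Rc::strong_count(&shared);

    // Dropping a handle decrements the count deterministically.
    drop(second);
    let after = Rc::strong_count(&shared);

    (before, during, after)
}

fn main() {
    let on_stack = [0u8; 4];          // lives in main's stack frame
    let on_heap = Box::new([0u8; 4]); // Box::new asks for a heap allocation explicitly
    println!("{on_stack:?} {on_heap:?}");

    println!("{:?}", reference_counts()); // prints (1, 2, 1)
} // on_heap is freed here, when its owner goes out of scope
```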

The built-in reference types using the & symbol do not involve run-time reference counting. The safety and validity of the underlying pointers is verified at compile time, preventing dangling pointers and other forms of undefined behavior.[82] Rust's type system separates shared, immutable references of the form &T from unique, mutable references of the form &mut T. A mutable reference can be coerced to an immutable reference, but not vice versa.[83]
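A brief sketch of the two reference kinds and the one-way coercion between them follows; the function names read, increment, and demo are hypothetical.

```rust
fn read(x: &i32) -> i32 {
    *x
}

fn increment(x: &mut i32) {
    *x += 1;
}

// Hypothetical helper showing shared and unique references side by side.
fn demo() -> (i32, i32) {
    let mut n = 1;

    let unique = &mut n;
    increment(unique);
    // `unique` is a &mut i32; passing it where &i32 is expected coerces it.
    let seen = read(unique);

    // Once the unique reference is no longer used, any number of shared
    // references may coexist.
    let a = &n;
    let b = &n;
    // The reverse direction is rejected: `increment(a)` would not compile.

    (seen, read(a) + read(b))
}

fn main() {
    println!("{:?}", demo()); // prints (2, 4)
}
```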

Macros

Macros allow generation and transformation of Rust code to reduce repetition. Macros come in two forms, with declarative macros defined through macro_rules!, and procedural macros, which are defined in separate crates.[84][85]

Declarative macros

A declarative macro (also called a "macro by example") is a macro, defined using the macro_rules! keyword, that uses pattern matching to determine its expansion.[86][87] Below is an example that sums over all its arguments:

macro_rules! sum {
    ( $initial:expr $(, $expr:expr )* $(,)? ) => {
        $initial $(+ $expr)*
    }
}

fn main() {
    let x = sum!(1, 2, 3);
    println!("{x}"); // prints 6
}

Procedural macros

Procedural macros are Rust functions that run and modify the compiler's input token stream, before any other components are compiled. They are generally more flexible than declarative macros, but are more difficult to maintain due to their complexity.[88][89]

Procedural macros come in three flavors:

  • Function-like macros custom!(...)
  • Derive macros #[derive(CustomDerive)]
  • Attribute macros #[custom_attribute]

Interface with C and C++

Rust has a foreign function interface (FFI) that can be used both to call code written in languages such as C from Rust and to call Rust code from those languages. As of 2024, an external library called CXX exists for calling to or from C++.[90] Rust and C differ in how they lay out structs in memory, so Rust structs may be given a #[repr(C)] attribute, forcing the same layout as the equivalent C struct.[91]
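The following sketch shows the Rust side of such an interface: a struct with C layout and a function using the C calling convention. The names Triple and sum_fields are hypothetical, the example only calls the function from Rust, and exporting it to C code would additionally require the no_mangle attribute, which is omitted here. The size noted in the comment assumes a typical target where u32 has 4-byte alignment.

```rust
use std::mem::size_of;

// With #[repr(C)] the fields are laid out in declaration order with C padding rules:
// a (1 byte) + 3 bytes padding + b (4 bytes) + c (1 byte) + 3 bytes padding.
#[repr(C)]
pub struct Triple {
    a: u8,
    b: u32,
    c: u8,
}

// `extern "C"` selects the C calling convention for this function.
pub extern "C" fn sum_fields(t: Triple) -> u32 {
    u32::from(t.a) + t.b + u32::from(t.c)
}

fn main() {
    // Assumes 4-byte alignment for u32, as on common desktop and server targets.
    println!("size of Triple: {}", size_of::<Triple>());
    println!("{}", sum_fields(Triple { a: 1, b: 2, c: 3 })); // prints 6
}
```

Without the attribute, the compiler is free to reorder fields, so the default layout should not be relied on across a language boundary.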
