By Tim McNamara

This article introduces you to the fundamentals of Rust programming, and we’ll be working through some of the language’s syntax.

This article is intended for programmers with experience in another programming language.


Language Foundations

We’re going to build grep-lite, a stripped-down version of the ubiquitous grep utility. grep-lite looks for patterns within text, and it prints lines that match the pattern. The conceptual simplicity of this model allows us to focus on the unique features of Rust.

Tip: Rust’s community strives to be welcoming and responsive to newcomers. At times, you may strike a mental pothole when you encounter terms such as “lifetime elision,” “hygienic macros,” and “algebraic types” without context. Don’t be afraid to ask for help. The community is much more welcoming than these helpful, yet opaque, terms may suggest.

A Glance at Rust’s Syntax

Rust tries to be boring and predictable where it can be. It has variables, numbers, functions and other familiar things that you’ve seen in other languages. Its blocks are delimited by curly brackets ({ and }), it uses a single equals sign as its assignment operator (=), and it’s whitespace agnostic.

A whole program with main()

Below is a short, yet complete, Rust program. It defines a function, uses variables, and prints a + b = 30 to the console.

Listing 1 Adding integers and using variables

  
 fn main() {
   let a = 10;      
   let b: i32 = 20; 
  
   let c = add(a, b);
   println!("a + b = {}", c);
 }
  
 fn add(i: i32, j: i32) -> i32 { 
   i + j 
 }
  

❶  Types can be inferred by the compiler…

❷  …or declared by the programmer when creating variables

❸  Types are required when defining functions

❹  Functions return the last expression’s result, meaning return isn’t required (but be careful, adding a semi-colon to this line changes the semantics to return () rather than i32)
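The effect of that trailing semicolon can be seen in a small sketch. The names add_one, add_one_explicit, with_value and unit are illustrative only and don't appear in the listing above.

```rust
// Returns the value of its final expression: no semicolon, no `return`.
fn add_one(n: i32) -> i32 {
    n + 1
}

// An explicit `return` also works, but is less idiomatic on the final line.
fn add_one_explicit(n: i32) -> i32 {
    return n + 1;
}

fn main() {
    // A block is an expression; its value is the value of its last expression.
    let with_value = {
        let one = 1;
        one + 1
    };

    // Ending the last line with a semicolon turns it into a statement,
    // so this block evaluates to the unit type `()` instead.
    let unit: () = {
        add_one(1);
    };

    println!("{} {} {:?}", add_one(1), with_value, unit);
    println!("{}", add_one_explicit(1));
}
```

If you added a semicolon after n + 1 inside add_one, the compiler would reject the function because its body would produce () where an i32 was promised.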

Although only eleven lines of code, there’s quite a lot packed into them. Here are some notes that should provide the gist of what’s going on.

Line 1 (fn main() {):

  • The fn keyword begins a function definition.

  • main() is the entrypoint to all Rust programs. It takes no arguments and returns no value.

  • Code blocks, also known as lexical scopes, are defined with braces ({ & }).

Line 2 (let a = 10;):

  • Use let to declare variable bindings. Variables are immutable by default, meaning that they are read-only rather than read/write.

  • Statements are delimited with semi-colons (;).
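To make the read-only default concrete, here is a minimal sketch. The function count_up is hypothetical and exists only to show the mut keyword, which opts a variable in to read/write access.

```rust
// Hypothetical helper: adds 1 to `start`, `steps` times.
fn count_up(start: i32, steps: i32) -> i32 {
    let mut total = start; // `mut` makes this binding read/write
    for _ in 0..steps {
        total += 1;
    }
    total
}

fn main() {
    let a = 10;
    // a = 11; // would not compile: `a` is immutable because it lacks `mut`
    println!("{}", count_up(a, 3));
}
```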

Line 3 (let b: i32 = 20;):

  • You can designate a specific data type to the compiler. At times, this is required because the compiler is unable to deduce a unique type on your behalf.
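One common case where the compiler needs help is converting text into a number, because many numeric types could be the target. The sketch below uses the standard library's str::parse() method. The variable names are illustrative, and unwrap() is used only to keep the example short (it stops the program if the text isn't a valid number).

```rust
fn main() {
    // parse() can produce many different numeric types,
    // so the annotation tells the compiler which one we want.
    let as_i32: i32 = "42".parse().unwrap();

    // The same hint can be written on the call itself.
    let as_u8 = "42".parse::<u8>().unwrap();

    // let unknown = "42".parse().unwrap(); // would not compile: type annotations needed

    println!("{} {}", as_i32, as_u8);
}
```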

Line 6 (println!("a + b = {}", c);):

  • println!() is a macro. Macros are function-like, but return code rather than a value. In the case of printing, every data type has its own way of being converted to a string. println!() takes care of figuring out the exact methods to call on its arguments.

  • Strings use double quotes (") rather than single quotes (').

  • String formatting uses {} as a placeholder, rather than the C-like %s or other variants.
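The same {} placeholder works for different data types. As a rough sketch, the hypothetical function describe below uses format!(), a sibling of println!() that accepts the same template syntax but returns a String rather than printing.

```rust
// Hypothetical helper: builds a message from a string, an integer and a float.
fn describe(name: &str, count: i32, ratio: f64) -> String {
    // {} uses each type's default text form; {:.1} asks for one decimal place.
    format!("{} has {} items ({:.1}%)", name, count, ratio)
}

fn main() {
    println!("{}", describe("haystack", 10, 12.5));
}
```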

Line 9 (fn add(…) -> i32 {):

  • Rust’s syntax for defining functions should be legible to anyone who has worked with a typed programming language before. Parameters are delimited by commas, and types follow the variable names. The “dagger” or “thin arrow” syntax (->) indicates the return type, rather than the more traditional colon.

 

Compiling Code with rustc

 The Rust compiler, rustc, can be invoked to create working executables from source code. To compile a single file of Rust code called first-steps.rs into a working program:

 – Make sure that first-steps.rs includes a main() function

 – Open a shell such as cmd.exe, bash or Terminal

 – Move to the directory that includes first-steps.rs

 – Execute the command rustc first-steps.rs

 A file named first-steps (or first-steps.exe on Windows) is created. To execute it, enter first-steps.exe on Windows or ./first-steps on other operating systems. Projects larger than a single file tend to be compiled with a higher-level tool called cargo. cargo understands whole crates and executes rustc on your behalf.

 

Starting out With Numbers

Computers have been associated with numbers for longer than you’ve been able to say formula translator. Numeric literals for integers and floating point numbers are relatively straightforward.

The code below prints a single line to the console:

  
 20; 21; 21; 1000000
  

Listing 2 Numeric Literals and Basic Operations in Rust

  
 fn main() {
   let twenty = 20; 
   let twenty_one: i32 = twenty + 1; 
   let floats_okay = 21.0; 
   let one_million = 1_000_000; 
  
   println!("{}; {}; {}; {}", twenty, twenty_one, floats_okay, one_million);
 }
  

❶  Rust infers a type on your behalf if you don’t supply one…

❷  …which is done by adding type annotations (i32)

❸  Floating point literals require no special syntax

❹  Underscores can be used to increase readability and are ignored by the compiler
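A type can also be attached directly to a literal as a suffix, which is sometimes handier than annotating the variable. The sketch below is illustrative and its variable names are made up. It also shows the types Rust falls back to when nothing else constrains a literal.

```rust
fn main() {
    let small = 20u8;         // type suffix written directly on the literal
    let big = 1_000_000_i64;  // underscores and suffixes can be combined
    let ratio = 21.0_f32;     // floating point literals accept suffixes too

    let default_int = 20;     // with no other hints, integers default to i32
    let default_float = 21.0; // and floating point literals default to f64

    println!("{} {} {}", small, big, ratio);

    // size_of_val reports how many bytes each value occupies
    println!(
        "{} {}",
        std::mem::size_of_val(&default_int),
        std::mem::size_of_val(&default_float)
    );
}
```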

Rust also has built-in support for numeric literals in base 2 (binary), base 8 (octal) and base 16 (hexadecimal). This notation is also available within the formatting macros, such as println!. To demonstrate, the following output is produced by the code that follows.

 3 30 300
 11 11110 100101100
 3 36 454
 3 1e 12c

Listing 3 Using Base 2, Base 8 and Base 16 Numeric Literals

  
 fn main() {
   let three = 0b11;          
   let thirty = 0o36;         
   let three_hundred = 0x12C; 
  
   println!("{} {} {}", three, thirty, three_hundred);
   println!("{:b} {:b} {:b}", three, thirty, three_hundred);
   println!("{:o} {:o} {:o}", three, thirty, three_hundred);
   println!("{:x} {:x} {:x}", three, thirty, three_hundred);
 }

❶  0b → binary (base 2)

❷  0o → octal (base 8)

❸  0x → hexadecimal (base 16)
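Going a little further, the formatting macros can include the base prefix in their output, and the standard library can read such digits back from text. This is a minimal sketch. The helper names to_hex and from_hex are hypothetical, while the # formatting flag and i32::from_str_radix come from the standard library.

```rust
// Hypothetical helper: formats a number as hexadecimal, with the 0x prefix.
fn to_hex(n: i32) -> String {
    // The `#` flag asks the formatter to include the base prefix.
    format!("{:#x}", n)
}

// Hypothetical helper: reads hexadecimal digits (without the 0x prefix).
fn from_hex(digits: &str) -> i32 {
    // unwrap() stops the program on invalid input; fine for a sketch.
    i32::from_str_radix(digits, 16).unwrap()
}

fn main() {
    println!("{} {}", to_hex(300), from_hex("12c"));
    println!("{:#b} {:#o}", 3, 30);
}
```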

Rust contains a full complement of numeric types:

  • i8, i16, i32, i64 – signed integers ranging from 8-bit to 64-bit

  • u8, u16, u32, u64 – unsigned integers ranging from 8-bit to 64-bit

  • f32, f64 – floating point numbers in 32-bit and 64-bit variants

  • isize, usize – integers that assume the CPU’s “native” width (e.g. on 64-bit CPUs, usize and isize are 64 bits wide)

The number families are:

  • signed integers (i) can represent negative as well as positive integers

  • unsigned integers (u) can only represent zero and positive integers, but can count twice as high as their signed counterparts

  • floating point numbers (f) can approximate real numbers and have special values for infinity, negative infinity and “not a number”
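These differences between the families can be checked from code. The sketch below relies on the MIN and MAX constants that the standard library defines on each integer type. The function divide is a hypothetical helper used to produce the special floating point values.

```rust
// Hypothetical helper: plain floating point division.
fn divide(a: f64, b: f64) -> f64 {
    a / b
}

fn main() {
    // Signed and unsigned types of the same width cover different ranges.
    println!("i8: {} to {}", i8::MIN, i8::MAX);
    println!("u8: {} to {}", u8::MIN, u8::MAX);

    // Floating point division by zero doesn't crash; it yields special values.
    let infinity = divide(1.0, 0.0);
    let not_a_number = divide(0.0, 0.0);
    println!("{} {}", infinity, -infinity);
    println!("{}", not_a_number.is_nan());
}
```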

The widths are the number of bits that the type uses in RAM and in the CPU. Types that take up more space, such as u32 vs i8, can represent more numbers at the expense of needing to store extra zeros for smaller numbers.

Number    Type    Bit Pattern in Memory
20        u32     00000000000000000000000000010100
20        i8      00010100
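If you'd like to see these bit patterns for yourself, the binary formatter can pad its output with leading zeros up to the type's full width. This is a rough sketch and the helper names bits_u32 and bits_i8 are made up for illustration.

```rust
// Hypothetical helper: all 32 bits of a u32, padded with leading zeros.
fn bits_u32(n: u32) -> String {
    format!("{:032b}", n)
}

// Hypothetical helper: all 8 bits of an i8, padded with leading zeros.
fn bits_i8(n: i8) -> String {
    format!("{:08b}", n)
}

fn main() {
    println!("{}", bits_u32(20));
    println!("{}", bits_i8(20));

    // size_of reports each type's width in bytes (8 bits per byte).
    println!("{} {}", std::mem::size_of::<u32>(), std::mem::size_of::<i8>());
}
```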

Although we’ve only touched on numbers, we have nearly enough exposure to Rust to create a prototype of our pattern matching program.

The listing below prints 42 to the console when compiled and run. The syntax on line 6, *reference, may be unfamiliar. reference is a reference to some number within haystack, and *reference reads the value that it refers to. That value is bound to item, which is then compared against needle. This is known as dereferencing.

Listing 4 Searching for an integer in an array of integers

  
 fn main() {
   let needle = 42;
   let haystack = [1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862]; 
  
   for reference in haystack.iter() { 
     let item = *reference;           
     if item == needle {
       println!("{}", item);
     }
  
     // if reference == &needle {
     //   println!("{}", reference);
     // }
   }
 }

❶  Array literal for a list of integers

❷  The haystack.iter() method returns an iterator over haystack that provides references, enabling access to individual elements

❸  item is the value referred to by reference

❹  This block provides an alternative form of the previous code. reference == &needle converts needle to a reference and compares against that.
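The same loop can be moved into a function of its own, which is a small step towards grep-lite. The sketch below is one possible arrangement rather than the article's method: count_matches is a hypothetical name, and the &[i32] parameter type (a borrowed view of an array) hasn't been introduced yet.

```rust
// Hypothetical helper: counts how many elements of `haystack` equal `needle`.
fn count_matches(haystack: &[i32], needle: i32) -> usize {
    let mut count = 0;
    for reference in haystack.iter() {
        // Dereference to compare the value itself against needle.
        if *reference == needle {
            count += 1;
        }
    }
    count
}

fn main() {
    let haystack = [1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862];
    println!("{}", count_matches(&haystack, 42));

    // For a simple yes/no answer, the standard library provides contains().
    println!("{}", haystack.contains(&42));
}
```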

This general pattern comes in handy with more complex examples. Before looking at more complex types, let’s take a moment to discuss one of Rust’s more novel features – the match keyword.

Type-aware control flow with match

Let’s say that we wanted to match against multiple patterns. Although it’s possible to use if/else blocks, these can become cumbersome and brittle. Rust’s match keyword, analogous to other languages’ switch keyword, can provide an alternative which is easier to read and more robust. Rust ensures that you’re testing against all possible values: the compiler rejects a match that leaves any value unhandled, so corner cases can’t slip through. match returns when the first match is found.

The code below prints these two lines to the screen:

  
 42: hit!
 132: hit!
  

Listing 5 Using the match keyword to match on multiple values

  
 fn main() {
   // let needle = 42;         
   let haystack = [1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862];
  
   for reference in haystack.iter() {
     let item = *reference;
  
     let result = match item { 
       42 | 132 => "hit!",     
       _ => "miss",            
     };
  
     if result == "hit!" {
       println!("{}: {}", item, result);
     }
   }
 }
  

❶  The variable needle is now redundant

❷  match is an expression that returns a value that can be bound to a variable

❸  42 | 132 matches both 42 and 132

❹  _ is a wildcard pattern that matches everything
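Patterns aren't limited to single values. As a rough sketch, the hypothetical function classify below combines literal values, an inclusive range (1..=9) and a guard (if n < 0). Its return type, &'static str, is simply the type of a string literal.

```rust
// Hypothetical helper: labels a number using several kinds of pattern.
fn classify(n: i32) -> &'static str {
    match n {
        0 => "zero",
        1..=9 => "single digit",  // inclusive range pattern
        42 | 132 => "hit!",       // either value
        _ if n < 0 => "negative", // wildcard with an extra condition
        _ => "miss",              // everything else
    }
}

fn main() {
    for n in [0, 5, 42, -3, 1000].iter() {
        println!("{}: {}", n, classify(*n));
    }
}
```

Arms are tested from top to bottom, and removing the final _ arm would cause a compile error because some values would be left without a matching arm.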

The match keyword plays an important role within the Rust language. Many control structures, such as looping, are defined in terms of match under the hood. match expressions shine when combined with the Option type.
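Both points can be sketched together. An Option is either Some(value) or None, and an iterator's next() method returns one each time it's called. The hypothetical function first_hit below spells out, roughly, what a for loop does on your behalf. It is an approximation for illustration, not the compiler's exact expansion.

```rust
// Hypothetical helper: returns the first element equal to 42 or 132, if any.
fn first_hit(haystack: &[i32]) -> Option<i32> {
    let mut iter = haystack.iter();

    // Roughly what `for` does: call next() until it returns None.
    loop {
        match iter.next() {
            // next() yields references; the & in the pattern unwraps one.
            Some(&item) => {
                if item == 42 || item == 132 {
                    return Some(item);
                }
            }
            None => return None,
        }
    }
}

fn main() {
    let haystack = [1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862];

    // match forces both possibilities to be handled.
    match first_hit(&haystack) {
        Some(item) => println!("{}: hit!", item),
        None => println!("no hits"),
    }
}
```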

Now that we’ve taken a good look at defining numbers and working with some of Rust’s control flow mechanisms, let’s move on to adding structure to programs with functions.

Getting Stuff Done With Functions

Since the introduction of structured programming, one of the fundamental abstractions provided by programming languages has been the subroutine. In Rust, stand-alone subroutines are created as functions. Methods, which are functions tied directly to a specific object, also exist and will be discussed with impl (implementation) blocks.

Looking back to where the article begins, the snippet in listing 1 contained a small function, add(). Repeated below, add takes two i32 values and returns another:

Listing 6 Extract of lines 9-11 from listing 1

  
 fn add(i: i32, j: i32) -> i32 { 
   i + j
 }
  

❶  add takes two integer parameters and returns an integer. The two arguments are bound to local variables i and j.

For the moment, let’s concentrate on the syntax of each of the elements. Rust’s function signatures can grow quite complex and therefore it pays to understand what’s happening with simple ones. A visual picture of each of the pieces is provided below. Anyone who’s programmed in a strongly typed programming language should be able to squint their way through.

Rust’s functions require that you specify your parameters’ types and the function’s return type.


Figure 1. Rust’s Function Definition Syntax
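To practise reading signatures, here is a sketch of two functions that a tool like grep-lite might contain. Both is_match and report are hypothetical names chosen for this example, and &str (a borrowed string) is a type this article hasn't covered. Note that a function that returns nothing omits the -> part entirely.

```rust
// Hypothetical helper: two string parameters in, one bool out.
fn is_match(line: &str, pattern: &str) -> bool {
    line.contains(pattern)
}

// Hypothetical helper: no `->`, so this function returns no value.
fn report(line_number: usize, line: &str) {
    println!("{}: {}", line_number, line);
}

fn main() {
    let pattern = "Rust";
    let text = "Rust is fun\nso is grep\nRust in Action";

    // lines() splits the text; enumerate() counts from zero.
    for (index, line) in text.lines().enumerate() {
        if is_match(line, pattern) {
            report(index + 1, line);
        }
    }
}
```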


That’s all for this article. If you want to learn more about Rust, download the free first chapter of Rust in Action and see this Slideshare presentation.