7 - Managing Growing Projects with Packages, Crates, and Modules

So far all of our examples have lived in a single file, but almost any non-trivial program would be too large to fit in a single file. Rust provides a number of tools to help us organize a project:

Modules group together related code, and let you control what is a private implementation detail and what should be public interface.
Crates are a tree of modules that produce a library or an executable. Back in chapter 2 we installed the rand crate to help us generate random numbers.
Paths are used to reference a function, struct, enum, or other item from some other module or crate.
Packages are what you actually check into git (or your source control of choice) - a package contains a library crate, one or more binary crates, or both.
Workspaces let you group together a set of related packages, similar to a monorepo. We'll wait until chapter 14 to talk about workspaces.

7.1 Packages and Crates

In many compiled languages - C or Java for example - the "unit of compilation" is a single file. In a C project, every .c source file compiles to a single object file, and then all the object files get linked together into a single executable (or into a dynamically linked library). If you change a .c file in a big project, you only have to recompile that single .c file and then relink all the object files.

In Rust, the unit of compilation is the crate. Crates come in two forms - library crates and binary crates, but most of the crates you're going to deal with are libraries, so the terms "crate", "library", and "library crate" are all used interchangeably when talking about Rust libraries.

A package is purely a cargo concept (in other words, rustc doesn't know anything about packages). A package is what you get when you run cargo new - a Cargo.toml file, and a src folder (possibly with subfolders) containing one or more source files.

The crate root is the file that rustc starts working from when it compiles a given crate. If a package contains a src/main.rs file with a main function, then the package contains a binary crate with the same name as the package. If a package contains a src/lib.rs, then it contains a library crate (again with the same name as the package). If it has both, then the package contains both a library crate and a binary crate. Having both is a very common pattern. For example, if you were writing a library to convert JPG images to PNG format, you might include both a library crate that other developers can use and a binary crate implementing a command line tool that uses the library.

If you want to include more than one binary crate in a package, you can add files in src/bin. Each file placed there will be compiled as a separate binary crate.

7.2 Defining Modules to Control Scope and Privacy

A module is quite similar to a package in Go, and is somewhat similar to a package in Java. If you're a JavaScript developer, then a module is close to an ESM module, except you can split a module up over multiple files.

rustc always starts at the crate root (usually src/main.rs or src/lib.rs) - it compiles this file, and any time it finds a mod statement, it adds the associated module to the crate and compiles it too. Suppose in main.rs we have the statement mod garden - Rust will look for the code for this module in three places:

Inline (e.g. mod garden { /* code goes here */ })
In src/garden.rs.
In src/garden/mod.rs (older style)

Similarly, modules can define submodules. src/garden.rs can have a mod vegetables that might be defined in src/garden/vegetables.rs (note that garden's submodules go in a folder named "garden", not in the same folder).

Note that we've marked the src/garden/mod.rs version as "older style". This is still supported (and as we'll see in chapter 11 it's very handy for writing integration tests) but the src/garden.rs style is the one you should use by default. If you try to mix the [name].rs and [name]/mod.rs styles in the same module, you'll get a compiler error.

Every identifier at the top level of a Rust module has a path associated with it. If we created a struct called Asparagus in the vegetables module, then the absolute path for that struct would be crate::garden::vegetables::Asparagus. (This is a little bit like a file system path, but with ::s instead of slashes.)

Each identifier declared in a module is also private by default, meaning it can only be accessed by code inside that module (or its submodules; submodules always have visibility into the private details of their ancestors). To make any identifier public you use the pub keyword, like pub fn do_the_thing() {...} or pub mod garden.

The use keyword is used to bring paths into scope. If you use crate::garden::vegetables::Asparagus in a file, then you can write Asparagus to refer to this struct instead of using the full path.
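To make the three ideas above concrete (paths, privacy, and use), here is a minimal sketch that declares the garden modules inline rather than in separate files. The functions water, tend, and plant are hypothetical names invented for this illustration.

```rust
// Inline stand-in for what would normally live in src/garden.rs
// and src/garden/vegetables.rs.
pub mod garden {
    pub mod vegetables {
        #[derive(Debug, PartialEq)]
        pub struct Asparagus;

        // Private: callable only from `vegetables` and its submodules.
        fn water() -> &'static str {
            "watered"
        }

        // Public wrapper (hypothetical) around the private helper.
        pub fn tend() -> &'static str {
            water()
        }
    }
}

// Bring the struct into scope so we can write `Asparagus` directly.
use crate::garden::vegetables::Asparagus;

pub fn plant() -> Asparagus {
    Asparagus
}
```

Calling garden::vegetables::water() from plant would be rejected by the compiler, because water is private to the vegetables module.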

In the restaurant industry, the dining room and other parts of the restaurant where customers go are called the "front of house", and the kitchen, offices, and other parts where customers are rarely seen are called the "back of house". Let's create a library for managing a restaurant. We'll run cargo new restaurant --lib to create a library crate, and in src/lib.rs we'll put:

src/lib.rs
mod front_of_house {
    mod hosting {
        fn add_to_waitlist() {}

        fn seat_at_table() {}
    }

    mod serving {
        fn take_order() {}

        fn serve_order() {}

        fn take_payment() {}
    }
}

We've just defined the modules inline here, because this is convenient for the purposes of an example, but usually we'd split these modules up into multiple files.

7.3 Paths for Referring to an Item in the Module Tree

To refer to an item in the module tree, we use a path. Paths come in two forms:

An absolute path starts from the crate root. For external library crates we're using, this starts with the name of the crate (e.g. rand) and for code within the current crate it starts with crate.
A relative path starts from the current module. It starts with an identifier in the current module or with self or super.

Using our restaurant example, let's say we want to call the add_to_waitlist function. From the top level of src/lib.rs we could do this in two ways:

src/lib.rs
mod front_of_house {
    pub mod hosting {
        pub fn add_to_waitlist() {}
    }
}

pub fn eat_at_restaurant() {
    // Absolute path
    crate::front_of_house::hosting::add_to_waitlist();

    // Relative path, since this function is defined in
    // the same module as `front_of_house`.
    front_of_house::hosting::add_to_waitlist();
}

Relative paths have the clear advantage that they are shorter. Absolute paths have the advantage that, if you move a function from one module to another, all the absolute paths in that function won't have to change (although obviously all the paths pointing to the moved function will).

We have to mark hosting and add_to_waitlist as pub in order for eat_at_restaurant to compile. This is because eat_at_restaurant is defined at the root of the crate, and hosting and add_to_waitlist are in child modules. Code in one module cannot access the private contents of another module unless that other module is one of its ancestors. Going the other way, add_to_waitlist could access private members of front_of_house or the root of the crate.
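The direction of this rule is easy to get backwards, so here is a small sketch of it. The helpers house_rules and greeting are hypothetical additions to the restaurant example, and the functions return strings only so that the result can be inspected.

```rust
// Private helper at the crate root.
fn house_rules() -> &'static str {
    "no shoes, no service"
}

mod front_of_house {
    // Private to `front_of_house`, but visible to its descendants.
    fn greeting() -> &'static str {
        "welcome"
    }

    pub mod hosting {
        pub fn add_to_waitlist() -> String {
            // A descendant may use private items from any ancestor.
            format!(
                "{}: {}",
                crate::front_of_house::greeting(),
                crate::house_rules()
            )
        }
    }
}

pub fn eat_at_restaurant() -> String {
    // `front_of_house` is defined in the same module as this function,
    // so it needs no `pub`; `hosting` and `add_to_waitlist` do.
    front_of_house::hosting::add_to_waitlist()
}
```

Calling front_of_house::greeting() from eat_at_restaurant would not compile, since the crate root is an ancestor of front_of_house and not the other way around.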

Starting Relative Paths with super

The super keyword is used in a module path in exactly the same way as .. is used in a file path - it goes up one level in the module tree:

src/lib.rs
fn deliver_order() {}

mod back_of_house {
    fn fix_incorrect_order() {
        cook_order();
        // Call into `deliver_order` in the parent module.
        super::deliver_order();
    }

    fn cook_order() {}
}

Making Structs and Enums Public

pub is used to make an identifier visible outside of the module, but there are a few special considerations for structs and enums. When you make a struct public, by default all of its fields are private and can only be accessed inside the module. You need to mark individual fields as pub if you want them to be visible to code outside the module:

src/lib.rs
mod back_of_house {
    pub struct Breakfast {
        pub toast: String,
        seasonal_fruit: String,
    }

    impl Breakfast {
        // Constructor for a summer Breakfast.
        pub fn summer(toast: &str) -> Breakfast {
            Breakfast {
                toast: String::from(toast),
                seasonal_fruit: String::from("peaches"),
            }
        }
    }
}

pub fn eat_at_restaurant() {
    // Order a breakfast in the summer with Rye toast
    let mut meal = back_of_house::Breakfast::summer("Rye");

    // Change our mind about what bread we'd like.
    // We can write directly to `toast` because it is `pub`:
    meal.toast = String::from("Wheat");
    println!("I'd like {} toast please", meal.toast);

    // The next line won't compile if we uncomment it; we're not allowed
    // to see or modify the seasonal fruit that comes with the meal,
    // since we're outside of the `back_of_house` module.
    // meal.seasonal_fruit = String::from("blueberries");
}

The pub toast field can be read and written outside of the back_of_house module, but the private seasonal_fruit cannot. Note that the existence of this private field implies that other modules won't be able to create a new instance of Breakfast, since they won't be able to set this private field. Here we've created a public associated function called summer to act as a sort of constructor.

Enums behave in exactly the opposite way to structs. When we make an enum pub, all of its variants and all fields defined on all variants are automatically pub as well. There's no way to have an enum with some fields that are private.
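As a rough illustration of the enum rule, the sketch below marks only the enum itself as pub and then uses both a variant and a variant's field from outside the module. The Appetizer enum and the describe_order function are assumptions made up for this example.

```rust
mod back_of_house {
    // Only the enum is marked `pub`; its variants and their
    // fields become public automatically.
    #[derive(Debug)]
    pub enum Appetizer {
        Soup,
        Salad { dressing: String },
    }
}

pub fn describe_order() -> String {
    let soup = back_of_house::Appetizer::Soup;
    // The `dressing` field can be set here without its own `pub`.
    let salad = back_of_house::Appetizer::Salad {
        dressing: String::from("ranch"),
    };
    format!("{:?} then {:?}", soup, salad)
}
```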

7.4 - Bringing Paths into Scope with the `use` Keyword

We've already seen many examples of using the use keyword to bring something into scope. We can use it with modules within our crate to bring members from a child module into scope, too:

src/lib.rs
use crate::front_of_house::hosting;

mod front_of_house {
    pub mod hosting {
        pub fn add_to_waitlist() {}
    }
}

pub fn eat_at_restaurant() {
    // Don't need to write `front_of_house::hosting::add_to_waitlist()`
    // here because we brought `hosting` into scope with the `use`
    // above.

    hosting::add_to_waitlist();
}

You can think about use a bit like a symbolic link in a file system, or a bit like JavaScript's import { add_to_waitlist } from './hosting.js'. It adds a symbol to the scope of the use statement.

One thing to note about modules is that, whether they're split into another file or used inline, the mod keyword always creates a new scope that doesn't inherit anything from the parent scope. When we create a scope using braces, in most cases we assume all symbols from outside those braces will be available in the child scope. For example:

fn say_hello() {
    let name = "Jason";

    {
        println!("Hello {}!", name);
    }
}

Here name is visible inside the scope created by the inner braces. The scope created by mod however doesn't bring in anything from the parent scope:

src/lib.rs
mod front_of_house {
    mod serving {
        fn serve_order() {}
    }

    pub mod hosting {
        fn seat_at_table() {
            // This won't compile!  `serving` is undefined here,
            // even though it was defined one scope up.
            serving::serve_order();
        }
    }
}
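One possible way to make the listing above compile, sketched here under the assumption that we are free to change visibility, is to reach serving through super and to mark serve_order as pub. The string return values are added only so the call can be checked.

```rust
mod front_of_house {
    mod serving {
        // `pub` so that the sibling module `hosting` may call it.
        pub fn serve_order() -> &'static str {
            "order served"
        }
    }

    pub mod hosting {
        pub fn seat_at_table() -> &'static str {
            // `super` names the parent module, `front_of_house`,
            // which is where `serving` is declared.
            super::serving::serve_order()
        }
    }
}

pub fn eat_at_restaurant() -> &'static str {
    front_of_house::hosting::seat_at_table()
}
```

The serving module itself can stay private: hosting is a descendant of front_of_house, so it can see front_of_house's private items.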

This also means a use statement at the top level of a file will only bring a symbol into scope for the top-level module, and not for any inline modules. If you want to use a symbol inside an inline mod, you'll need to put the use inside that module:

src/lib.rs
use crate::front_of_house::hosting;


mod front_of_house {
    pub mod hosting {
        pub fn add_to_waitlist() {}
    }
}


mod customer {
    // We need this `use` here, even though we're already
    // `use`ing hosting at the top level, because
    // `mod customer` creates a new scope that doesn't
    // inherit any symbols.
    use crate::front_of_house::hosting;

    pub fn eat_at_restaurant() {
        hosting::add_to_waitlist();
    }
}

This seems a bit strange at first, but it makes sense when you realize that modules are generally intended to be split across multiple files. If you move the contents of an inline mod out into a file, you won't have to deal with the fact that there are a bunch of symbols you don't have access to anymore.

Creating Idiomatic use Paths

These two listings do the same thing:

// Version 1
use crate::front_of_house::hosting;
fn fn1() {
    hosting::add_to_waitlist();
}

// Version 2
use crate::front_of_house::hosting::add_to_waitlist;
fn fn2() {
    add_to_waitlist();
}

But the first one is considered idiomatic and the second is not. Generally we use a module, and don't use individual functions within a module. This makes for more typing, because we need to type the name of the module, but it also makes it clear that the function comes from some other module and isn't defined locally.

On the other hand, when bringing structs and enums into scope, we generally use the individual struct or enum instead of the parent module. For example, we use std::collections::HashMap;, and then just type HashMap.
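A short sketch of the struct convention in practice follows. The count_orders function is a hypothetical helper written for this illustration.

```rust
// Idiomatic: import the type itself, then refer to it as `HashMap`.
use std::collections::HashMap;

// Count how many times each dish appears in a list of orders.
pub fn count_orders(orders: &[&str]) -> HashMap<String, u32> {
    let mut counts = HashMap::new();
    for dish in orders {
        *counts.entry(dish.to_string()).or_insert(0) += 1;
    }
    counts
}
```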

If you want to use two data types from different modules that have the same name, you can either refer to them by their namespace:

use std::fmt;
use std::io;

fn fn1() -> fmt::Result {...}
fn fn2() -> io::Result<()> {...}

Or you can use the as keyword to rename a symbol:

use std::fmt::Result as FmtResult;
use std::io::Result as IoResult;

fn fn1() -> FmtResult {...}
fn fn2() -> IoResult<()> {...}
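The listings above elide the function bodies, so here is a compilable sketch of the renaming approach. The Table type and the parse_table function are assumptions invented for this example; only the two Result types come from the standard library.

```rust
use std::fmt;
use std::fmt::Result as FmtResult;
use std::io;
use std::io::Result as IoResult;

// Hypothetical type: a numbered table in the restaurant.
pub struct Table(pub u32);

impl fmt::Display for Table {
    // `FmtResult` is `std::fmt::Result`, which takes no type parameter.
    fn fmt(&self, f: &mut fmt::Formatter) -> FmtResult {
        write!(f, "table {}", self.0)
    }
}

// `IoResult<T>` is `std::io::Result<T>`, which needs a success type.
pub fn parse_table(input: &str) -> IoResult<Table> {
    input
        .trim()
        .parse::<u32>()
        .map(Table)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))
}
```

Without the renames, both imports would try to bring a type called Result into the same scope and the compiler would reject the second one.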

Re-exporting Names with `pub use`

You can "re-export" a symbol from some other module in your module:

mod colors {
    pub struct Color {
        // Fields are `pub` so code outside `colors` can build a `Color`.
        pub red: u8,
        pub green: u8,
        pub blue: u8,
    }
}

mod ansi {
    pub fn color_string(message: &str, color: crate::colors::Color) -> String {
        // --snip--
    }

    // Re-export `Color` from colors.
    pub use crate::colors::Color;
}

fn log_error(message: &str) {
    let red = ansi::Color{red: 255, green: 0, blue: 0};
    println!("{}", ansi::color_string(message, red));
}

Here callers of ansi can use ansi::Color, even though Color is actually defined in the colors module. This is very handy when there's a type you're using from some other crate that's central to your module. It's also handy when the internal organization of your library might be different from the public API you want to share. We'll talk about this idea more in chapter 14.
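The same idea applied to the restaurant library might look like the following sketch, in which the private front_of_house module stays hidden and only hosting is exposed. The return value is an assumption added so that the call can be checked.

```rust
mod front_of_house {
    pub mod hosting {
        pub fn add_to_waitlist() -> &'static str {
            "added to waitlist"
        }
    }
}

// Re-export: code outside this crate can now write
// `restaurant::hosting::add_to_waitlist()` without knowing
// that `front_of_house` exists.
pub use crate::front_of_house::hosting;

pub fn eat_at_restaurant() -> &'static str {
    // The re-export also brings `hosting` into scope locally.
    hosting::add_to_waitlist()
}
```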

Using External Packages

Many useful crates are published on crates.io, and we can use these by adding them to the "dependencies" section of Cargo.toml. We did this back in chapter 2 when we built our guessing game. There we added the rand crate:

Cargo.toml

[dependencies]
rand = "0.8.5"

And then we used the Rng trait from rand:

use rand::Rng;

fn main() {
    let secret_number = rand::thread_rng().gen_range(1..=100);
}

Using Nested Paths to Clean Up Large use Lists

These two sets of use statements are equivalent:

// This:
use std::cmp::Ordering;
use std::io;


// Can be shortened to this:
use std::{cmp::Ordering, io};

There's no difference between these, the second is just shorter. If we want to use a module and some members of that module, we can use the self keyword:

// This:
use std::io;
use std::io::Write;


// Can be shortened to this:
use std::io::{self, Write};
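Here is a small sketch showing why you might want both names at once: io is needed to spell io::Result, and Write is needed for the trait bound and for writeln!. The write_receipt function is hypothetical.

```rust
// `self` imports `std::io` itself; `Write` imports the trait.
use std::io::{self, Write};

// Write a one-line receipt to any writer (a file, stdout, a buffer).
pub fn write_receipt(out: &mut impl Write, total_cents: u32) -> io::Result<()> {
    writeln!(out, "Total: ${}.{:02}", total_cents / 100, total_cents % 100)
}
```

Because the function accepts any Write implementor, it can be exercised with an in-memory Vec<u8> as easily as with a real file.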

The Glob Operator

This brings all public symbols in the std::io::prelude module into scope:

use std::io::prelude::*;

This is generally something you'd only do in two cases. The first is this example, where a crate has defined a custom prelude that brings a lot of commonly used symbols into scope. The second involves unit tests; we generally write tests for a given module in a child module called "tests", so we very frequently use super::*; to bring everything in the module we're testing down into the tests module. We'll talk more about testing in chapter 11.
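As a sketch of the first case, the function below relies on the glob import to get the BufRead trait, which is what provides read_line. The first_line function is a hypothetical helper, and Cursor is used only as an in-memory reader.

```rust
// The glob brings the `Read`, `Write`, `BufRead` and `Seek` traits into scope.
use std::io::prelude::*;
use std::io::Cursor;

// Return the first line of `text`, without its line ending.
pub fn first_line(text: &str) -> String {
    let mut reader = Cursor::new(text.as_bytes());
    let mut line = String::new();
    // `read_line` is a `BufRead` method, available thanks to the glob.
    reader
        .read_line(&mut line)
        .expect("reading from an in-memory buffer should not fail");
    line.trim_end().to_string()
}
```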

Separating Modules into Different Files

We've used the inline style for modules in this chapter because we've been working with short examples, but in real life any non-trivial program is going to be split across multiple files:

In src/lib.rs:

src/lib.rs
// Load the module from src/front_of_house.rs.
mod front_of_house;

pub fn eat_at_restaurant() {
    front_of_house::add_to_waitlist();
}

And then in src/front_of_house.rs:

src/front_of_house.rs
pub fn add_to_waitlist() {}

You only need to load a file with mod once in your entire module tree, not in every place it is used. mod is not like include or import from other programming languages. There just has to be one mod somewhere to let rustc know it should include the file in the crate.

Note that it's perfectly acceptable to have a small module be defined inline. You can always move it into its own file later if it grows. Since the path of a symbol doesn't change based on whether a module is inline or in a separate file, moving inline code into files doesn't require any refactoring work. Tests are generally defined in an inline module.

Continue to chapter 8.
