File systems

What are file systems (persistent storage)?

File systems, also known as persistent storage, allow you to store your large datasets and the state of your instance, for example:

  • Packages installed system-wide using apt-get.

  • Python packages installed using pip.

  • conda and Python venv virtual environments.

Lambda GPU Cloud file systems have a capacity of 8 exabytes (8,000,000 terabytes), except for file systems created in the Texas, USA (us-south-1) region, which have a capacity of 10 terabytes. You can have a total of 24 file systems.

How are file systems billed?

Persistent storage is billed per GB used per month, in increments of 1 hour.

For example, based on the price of $0.20 per GB used per month:

  • If you use 1,000 GB of your file system capacity for an entire month (30 days, or 720 hours), you’ll be billed $200.00.

  • If you use 1,000 GB of your file system capacity for a single day (24 hours), you’ll be billed $6.67.

The actual price of persistent storage will be displayed when you create your file system.
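The billing arithmetic in the example above can be sketched as a small shell function. This is an illustration only, not Lambda's billing implementation: the function name storage_cost is hypothetical, and it assumes a billing month of 720 hours (30 days) and the example rate of $0.20 per GB per month.

```shell
# Estimate a storage charge in dollars.
# Arguments: GB used, hours used, optional rate per GB per month.
# Assumption: a billing month is 720 hours (30 days), as in the example.
storage_cost() {
  local gb_used="$1" hours="$2" rate="${3:-0.20}"
  awk -v gb="$gb_used" -v h="$hours" -v r="$rate" \
    'BEGIN { printf "%.2f\n", gb * r * h / 720 }'
}

storage_cost 1000 720   # entire month: prints 200.00
storage_cost 1000 24    # single day:   prints 6.67
```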

Can file systems be accessed without an instance?

Persistent storage file systems can't be accessed unless attached to an instance at the time the instance is launched.

For this reason, it's recommended that you keep a local copy of the files you have saved in your persistent storage file systems. This can be done using rsync.
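As a minimal sketch of keeping a local copy with rsync, the function below, run from your local machine, pulls a file system's contents from a running instance over SSH. The function name pull_backup is hypothetical, and the ubuntu login and the /home/ubuntu/FILE-SYSTEM-NAME mount path are assumptions based on the example mount point on this page; adjust them to your setup.

```shell
# Pull a file system's contents from a running instance to this machine.
# Assumptions: you can SSH in as "ubuntu", and the file system is mounted
# at /home/ubuntu/<file-system-name> on the instance.
pull_backup() {
  local instance_ip="$1" fs_name="$2" dest="$3"
  # -a preserves permissions and timestamps; --partial keeps partially
  # transferred files so an interrupted run can pick up where it left off.
  rsync -a --partial --progress \
    "ubuntu@${instance_ip}:/home/ubuntu/${fs_name}/" "${dest%/}/"
}

# Usage (hypothetical values):
#   pull_backup INSTANCE_IP persistent-storage ~/persistent-storage-backup
```

The trailing slash on the source path tells rsync to copy the directory's contents rather than the directory itself.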

File systems can't be attached to running instances and can't be mounted remotely, for example, using NFS.

Moreover, file systems can only be attached to instances in the same region. For example, a file system created in the us-west-1 (California, USA) region can only be attached to instances in the us-west-1 region.

File systems can't be transferred from one region to another. However, you can copy data between file systems using tools such as rsync.

Lambda GPU Cloud currently doesn't offer block or object storage.

Can I set a limit (quota) on my file system usage?

Currently, you can't set a limit (quota) on your persistent storage file system usage.

You can see the usage of a persistent storage file system from within an instance by running df -h -BG. This command will produce output similar to:

Filesystem           1G-blocks  Used   Available Use% Mounted on
udev                       99G    0G         99G   0% /dev
tmpfs                      20G    1G         20G   1% /run
/dev/vda1                1357G   23G       1335G   2% /
tmpfs                      99G    0G         99G   0% /dev/shm
tmpfs                       1G    0G          1G   0% /run/lock
tmpfs                      99G    0G         99G   0% /sys/fs/cgroup
persistent-storage 8589934592G    0G 8589934592G   0% /home/ubuntu/persistent-storage
/dev/vda15                  1G    1G          1G   6% /boot/efi
/dev/loop0                  1G    1G          0G 100% /snap/core20/1822
/dev/loop1                  1G    1G          0G 100% /snap/lxd/24061
/dev/loop2                  1G    1G          0G 100% /snap/snapd/18357
tmpfs                      20G    0G         20G   0% /run/user/1000

In the example output above:

  • The name of the file system is persistent-storage.

  • The size of the file system is 8589934592G (8 exabytes).

  • The available capacity of the file system is 8589934592G.

  • The used percentage of the file system is 0%.

  • The file system is mounted on /home/ubuntu/persistent-storage.
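To pull only the file system's row out of that output, a small filter such as the following sketch may help. The helper name fs_usage is hypothetical, and it assumes df -BG style columns in which the mount point is the last field.

```shell
# Print used space, available space, and use% for one mount point,
# reading `df -BG` output on standard input.
fs_usage() {
  local mount_point="$1"
  awk -v mp="$mount_point" \
    '$NF == mp { print "used=" $3, "available=" $4, "use=" $5 }'
}

# Usage:
#   df -BG | fs_usage /home/ubuntu/persistent-storage
```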

You can also use the Cloud API's /file-systems endpoint to find out your file system usage.

How do I use persistent storage to save datasets and system state?

You can use the Lambda Cloud Storage feature to save:

  • Large datasets that you don’t want to re-upload every time you start an instance

  • The state of your system, including software packages and configurations

You can have up to 24 persistent storage file systems.

Preserving the state of your system

To save the state of your system, we recommend creating containers using Docker or other container software. System state includes:

  • Packages installed system-wide using apt-get

  • Python packages installed using pip

  • conda environments

You can also create a script that runs the commands needed to re-create your system state. For example:

sudo apt-get update && \
sudo apt-get install -y PACKAGE_0 PACKAGE_1 PACKAGE_2 && \
pip install PACKAGE_3 PACKAGE_4 PACKAGE_5

Run the script each time you start an instance.

If you only need to preserve Python packages and not packages installed system-wide, you can create a Python virtual environment.

You can also create a conda environment.

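For the highest performance when training, we recommend copying your dataset, containers, and virtual environments from persistent storage to your instance's local disk, for example, to a directory in your home directory outside the file system's mount point. The copy can take some time, but reading from local disk can greatly increase the speed of training.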
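One way to stage data on local disk before training is sketched below. The helper name stage_local and the example paths are hypothetical; the sketch uses cp -a, though rsync would work equally well.

```shell
# Copy a directory from persistent storage to local disk before training.
# Arguments: source directory, destination directory.
stage_local() {
  local src="$1" dest="$2"
  # "src/." copies the directory's contents, including hidden files;
  # -a preserves permissions, timestamps, and symlinks.
  mkdir -p "$dest" && cp -a "$src/." "$dest/"
}

# Usage (hypothetical paths):
#   stage_local ~/persistent-storage/my-dataset ~/my-dataset
```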
