7 - Can my data be recovered once I've terminated my instance?
Warning
We cannot recover your data once you’ve terminated your instance! Before
terminating an instance, make sure to back up all data that you want to keep.
If you want to save data even after you terminate your instance, create a
persistent filesystem.
Note
The persistent filesystem must be attached to your instance before you start
the instance; it can't be attached after the instance has already started.
When you create a persistent filesystem, a directory with the name of your
persistent filesystem is created in your home directory. For example, if the
name of your persistent filesystem is PERSISTENT-FILESYSTEM, the directory
is created at /home/ubuntu/PERSISTENT-FILESYSTEM. Data not stored in this
directory is erased once you terminate your instance and cannot be
recovered.
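As a sketch of backing up before termination: copy anything you want to keep into the persistent filesystem's directory. The names below are placeholders; on a real instance, the PERSISTENT-FILESYSTEM directory already exists and is created here only so the sketch is self-contained.

```shell
# On a Lambda instance, ~/PERSISTENT-FILESYSTEM already exists; we create it
# here only to make this sketch self-contained. Replace the names with your own.
mkdir -p ~/PERSISTENT-FILESYSTEM
mkdir -p ~/results && echo "final model weights" > ~/results/model.txt

# Copy everything you want to keep into the persistent filesystem's directory.
cp -r ~/results ~/PERSISTENT-FILESYSTEM/
```

Data copied this way survives instance termination; everything left outside the persistent filesystem's directory does not.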
8 - Can you provide an estimate of how much a job will cost?
We can’t estimate how much your job will cost or how long it’ll take to
complete on one of our instances. This is because we don’t know the details of
your job, such as how your program works.
However, the performance of our instances is close to what you’d expect from
bare metal machines with the same GPUs.
In order to estimate how much your job will cost or how long it’ll take to
complete, we suggest you create an instance and benchmark your program.
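As an illustration of the benchmark-and-extrapolate approach, here's a Python sketch: time a few iterations of your training step, then extrapolate total runtime and cost. The step function, step count, and hourly price below are placeholders, not real workload numbers or Lambda prices.

```python
import time

def estimate_job_cost(step_fn, total_steps, price_per_hour, warmup=3, timed=10):
    """Estimate job cost by timing a few steps and extrapolating."""
    for _ in range(warmup):          # warm up caches before timing
        step_fn()
    start = time.perf_counter()
    for _ in range(timed):
        step_fn()
    seconds_per_step = (time.perf_counter() - start) / timed
    total_hours = seconds_per_step * total_steps / 3600
    return total_hours * price_per_hour

# Dummy "training step" and placeholder numbers; substitute your real step,
# step count, and the instance's actual hourly price.
cost = estimate_job_cost(lambda: sum(range(10000)),
                         total_steps=100000, price_per_hour=1.10)
```

Run the timing loop on the actual instance type you plan to use, since per-step time varies by GPU.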
Tip
Check out our GPU benchmarks to form
a general idea of the performance provided by our instances. Keep in mind that
real-world performance doesn’t always match the performance provided by
benchmarks.
9 - Do you support Kubernetes?
We currently don’t support Kubernetes, also known as K8s.
10 - How are on-demand instances invoiced?
On-demand instances are billed in
one-minute increments, from the moment you spin up (start) the instance to
the moment you terminate (stop) it.
Warning
Be sure to terminate any instances that you’re not using!
You will be billed for all minutes that an instance is running, even if the
instance isn’t actively being used.
Invoices are sent weekly for the previous week’s usage.
Note
On-demand instances require us to maintain excess capacity at all times so we
can meet the changing workloads of our customers. For this reason, on-demand
instances are priced higher than reserved instances.
Conversely, we offer
reserved GPU Cloud instances
at significant savings over on-demand instances, since they allow us to more
accurately determine our capacity needs ahead of time.
11 - How do I change my password?
To reset your Lambda Cloud password, visit the
Reset Password page.
12 - How do I get started using the dashboard?
The dashboard makes it easy to get
started using Lambda GPU Cloud.
Review the license agreements and terms of service. If you agree to them,
click I agree to the above to launch your instance.
In the dashboard, you should now see your instance listed. Once your instance
has finished booting, you’ll be provided with the details needed to begin
using your instance.
13 - How do I get started using the Demos feature?
It currently isn’t possible to host a demo on an existing instance.
Note
The new instance hosting your demo can be used like any other Lambda GPU Cloud
on-demand instance. For example, you can SSH into the instance and
open Jupyter Notebook on the
instance.
Demos can be hosted on multi-GPU instance types. However, a demo
uses only one of the GPUs.
Also, demos currently can’t be hosted on H100 instances.
Add a demo to your Lambda GPU Cloud account
In the left sidebar of the
dashboard, click Demos. Then,
click the Add demo button at the top-right of the dashboard.
The Add a demo dialog will appear.
Under Demo Source URL, enter the URL of the Git repository containing
your demo’s source code.
Note
The Demos feature looks in your Git repository for a file named
README.md. If the file doesn’t exist, or if the file doesn’t contain the
required properties, you’ll receive a Demo misconfigured error.
The README.md must have, at the top, a YAML block containing the
following:
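The block below is a sketch of that YAML front matter, inferred from the property names described next; it follows the Hugging Face Spaces front-matter convention. Replace the uppercase placeholders as described below.

```yaml
---
sdk: gradio
sdk_version: GRADIO-VERSION
app_file: PATH-TO-APP-FILE
---
```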
Replace GRADIO-VERSION with the version of Gradio your demo is built
with, for example, 3.24.1.
Replace PATH-TO-APP-FILE with the path to your Gradio application file
(the file containing the Gradio
interface code),
relative to the root of your Git repository. For example, if your Gradio
application file is named app.py and is located in the root directory of
your Git repository, replace PATH-TO-APP-FILE with app.py.
Properties other than sdk, sdk_version, and app_file are ignored by
the Demos feature.
Select Unlisted if you want your demo accessible only by those who know your
demo’s URL.
Under Name, give your demo a name. If you choose to make your demo
public, the name of your demo will appear in the Lambda library of public
models. The name of your demo will also appear in your demo’s URL.
(Optional) Under Description, enter a description for your demo.
The description shows under the name of your demo in your library of demos.
If your demo is public, the description also shows under the name of your
demo in the Lambda library of public models.
Note
You can’t change the name or description of your demo once you add it.
However, you can delete your demo then add it again.
Click Add demo, then follow the prompts to launch a new instance to
host your demo.
Tip
To host a demo that’s already added to your account, in the
Demos dashboard, find the row
containing the demo you want to host, then click Host.
The link to your demo might temporarily appear in the Instances dashboard,
then disappear. This is expected behavior and doesn’t mean your instance or
demo is broken.
The models used by demos are often several gigabytes in size, and can take 5
to 15 minutes to download and load.
Once your instance is launched and your demo is accessible, a link with
your demo’s name will appear under the Demo column. Click the link to
access your demo.
Tip
To see a gallery of all of your demos, at the top-right of the Demos
dashboard, click the See your demos button.
Troubleshooting demos
If you experience trouble accessing your demo, the Demos logs can be helpful
for troubleshooting.
To view the Demos log files, SSH into your instance or open a terminal in
Jupyter Notebook, then run:
sudo bash -c 'for f in /root/virt-sysprep-firstboot.log ~demo/bootstrap.log; do printf "### BEGIN $f\n\n"; cat $f; printf "\n### END $f\n\n"; done > demos_debug_logs.txt; printf "### BEGIN journalctl -u lambda-demos.service\n\n$(journalctl -u lambda-demos.service)\n\n### END journalctl -u lambda-demos.service" >> demos_debug_logs.txt'
This command will produce a file named demos_debug_logs.txt containing the
logs for the Demos feature. You can review the logs from within your instance
by running less demos_debug_logs.txt. Alternatively, you can download the
file locally to review or share.
Note
The Lambda Support team provides only basic
support for the Demos feature. However, assistance might be available in the
community forum.
Here are some examples of how problems present in logs:
Misconfigured README.md file
### BEGIN /home/demo/bootstrap.log
Cloning into '/home/demo/source'...
Traceback (most recent call last):
File "<stdin>", line 17, in <module>
File "<stdin>", line 15, in load
File "pydantic/main.py", line 526, in pydantic.main.BaseModel.parse_obj
File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 3 validation errors for Metadata
sdk
field required (type=value_error.missing)
sdk_version
field required (type=value_error.missing)
app_file
field required (type=value_error.missing)
Created symlink /etc/systemd/system/multi-user.target.wants/lambda-demos-error-server.service → /etc/systemd/system/lambda-demos-error-server.service.
Bootstrap failed: misconfigured
### END /home/demo/bootstrap.log
Not a Gradio app
### BEGIN /home/demo/bootstrap.log
Cloning into '/home/demo/source'...
Traceback (most recent call last):
File "<stdin>", line 17, in <module>
File "<stdin>", line 15, in load
File "pydantic/main.py", line 526, in pydantic.main.BaseModel.parse_obj
File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 2 validation errors for Metadata
sdk
unexpected value; permitted: 'gradio' (type=value_error.const; given=docker; permitted=('gradio',))
sdk_version
field required (type=value_error.missing)
Created symlink /etc/systemd/system/multi-user.target.wants/lambda-demos-error-server.service → /etc/systemd/system/lambda-demos-error-server.service.
Bootstrap failed: misconfigured
### END /home/demo/bootstrap.log
14 - How do I get started using the Firewall feature?
The Firewall feature allows you to
configure firewall rules to restrict incoming traffic to your instances.
Note
Firewall rules configured using the Firewall feature apply to all of your
instances outside of the Texas, USA (us-south-1) region.
To use the Firewall feature:
Click Firewall in the left sidebar of the dashboard to open your
firewall settings.
Under General Settings, use the toggle next to Allow ICMP traffic
(ping) to allow or restrict incoming ICMP traffic to your instances.
Note
For network diagnostic tools such as ping and mtr to be able to reach
your instances, you need to allow incoming ICMP traffic.
Next to Inbound Rules, click Edit to configure incoming TCP and UDP
traffic rules.
In the drop-down menu under Type, select:
Custom TCP to manually configure a rule to allow incoming TCP traffic.
Custom UDP to manually configure a rule to allow incoming UDP traffic.
HTTPS to automatically configure a rule to allow incoming HTTPS traffic.
SSH to automatically configure a rule to allow incoming SSH traffic.
All TCP to automatically configure a rule to allow all incoming TCP traffic.
All UDP to automatically configure a rule to allow all incoming UDP traffic.
Warning
If you don’t have a rule to allow incoming traffic to port TCP/22, you
won’t be able to access your instances using SSH.
In the Source field, either:
Click the 🔎 to automatically enter your current IP address.
Enter a single IP address, for example, 203.0.113.1.
Enter an IP address range in CIDR notation, for example,
203.0.113.0/24.
To allow incoming traffic from any source, enter 0.0.0.0/0.
If you choose Custom TCP or Custom UDP, enter a Port range.
Port range can be:
A single port, for example, 8080.
A range of ports, for example, 8080-8081.
(Optional) Enter a Description for the rule.
(Optional) Click Add rule to add additional rules.
(Optional) Click the x next
to any rule you want to delete.
Click Update to apply your changes.
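To sanity-check a CIDR source range before adding it to a rule, you can use Python's standard `ipaddress` module. The addresses below are the documentation's own examples.

```python
import ipaddress

# The example /24 source range from the firewall instructions above.
rule_source = ipaddress.ip_network("203.0.113.0/24")

# Membership tests show which client addresses the rule would match.
inside = ipaddress.ip_address("203.0.113.1") in rule_source    # True
outside = ipaddress.ip_address("198.51.100.7") in rule_source  # False
```

The range `0.0.0.0/0` contains every IPv4 address, which is why entering it as a Source allows traffic from anywhere.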
15 - How do I get started using the Team feature?
Create a team
In the dashboard, click Team at the bottom-left of the dashboard. Then,
click Invite at the top-right of the Team dashboard.
Enter the email address of the person you want to invite to your team.
Select their role in the team, either an Admin or a Member. Then,
click Send invitation.
Warning
Be sure to invite only trusted persons to your team!
Currently, the only differences between the Admin and Member roles are
that an Admin can:
Invite others to the team.
Remove others from the team.
Modify payment information.
Change the team name.
This means that a person with a Member role can, for example:
Launch instances that will incur charges.
Terminate instances that should continue to run.
Note
You can’t send an invitation to an email address already associated with a
Lambda Cloud account. If you try to, you’ll be presented with a message
that says there is already a Lambda Cloud account associated with the email
address you’re trying to send an invitation to.
The person you’re inviting to your team must first close their existing
Lambda Cloud account before they can be invited to your team.
The person you invited to your team will receive an email letting them know
that they’ve been invited to a team on Lambda Cloud.
In that email, they should click Join the Team.
Note
Until the person you invited to your team accepts their invitation, they
will be listed in the Team dashboard as Invitation pending.
You can delete the invitation while it’s pending by clicking ⋮ where
the person is listed in your Team dashboard, then choosing Delete
invitation.
Note
If the person you invited to your team doesn’t receive their invitation,
you have to delete their invitation then invite them again.
In the Team dashboard of the person you invited to your team, the person will
see that they are on your team. In your Team dashboard, you’ll see the person
you invited listed.
Change a teammate’s role
To change the role of a person on your team from Member to Admin, click
⋮ where the person is listed in your Team dashboard, then choose Change
to Admin.
Conversely, to change the role of a person on your team from Admin to
Member, click ⋮ where the person is listed in your Team dashboard, then
choose Change to Member.
Close a teammate’s account
To close a teammate’s account, click the ⋮ where your teammate is listed
in your Team dashboard. Then, choose Deactivate user.
Warning
Carefully review the information in the dialog box that pops up.
Change team name
To change the name of your team, click Settings at the bottom-left of the
dashboard, then click Edit team name. Enter a new name for your team, then
click Update team name.
16 - How do I learn my instance's private IP address and other info?
To learn your instance’s private IP address, SSH into your instance and run:
ip -4 -br addr show | grep '10.'
The above command will output, for example:
enp5s0 UP 10.19.60.24/20
In the above example, the instance’s private IP address is 10.19.60.24.
Tip
If you want your instance’s private IP address and only that address,
run the following command instead:
ip -4 -br addr show | grep -Eo '10\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
The above command will output, for example:
10.19.60.24
Learn what ports on your instance are publicly accessible
You can use Nmap to learn what ports on your instance are publicly
accessible, that is, reachable over the Internet.
Note
The instructions below assume you’re running Ubuntu on your computer.
First, install Nmap on your computer (not on your instance) by running:
sudo apt install -y nmap
Next, run:
nmap -Pn INSTANCE-IP-ADDRESS
Replace INSTANCE-IP-ADDRESS with your instance’s IP address, which you can
get from the Cloud dashboard.
The command will output, for example:
Starting Nmap 7.80 ( https://nmap.org ) at 2023-01-11 13:22 PST
Nmap scan report for 129.159.46.35
Host is up (0.041s latency).
Not shown: 999 filtered ports
PORT STATE SERVICE
22/tcp open ssh
Nmap done: 1 IP address (1 host up) scanned in 6.42 seconds
In the above example, TCP port 22 (SSH) is publicly accessible.
Note
If nmap doesn’t show TCP/22 (SSH) or any other ports open, your instance’s
firewall rules might be blocking the traffic.
To allow incoming connections to ports other than TCP port 22, the instance’s
default firewall rules need to be modified.
iptables can be used to modify the firewall rules or disable the firewall
entirely.
Warning
The instructions below completely disable the firewall on the instance and
don’t follow best security practices.
For security, firewalls should be configured to allow incoming connections
only to the ports needed for specific services. For example:
TCP/80 and TCP/443 for http and https, respectively
TCP/22 for ssh
TCP/25 for smtp
The instructions below provide a quick, simple, but insecure way to ensure
network traffic between instances isn’t being blocked by a firewall.
We highly recommend that you:
Read the iptables manual page to
learn more advanced usage of iptables, including how to fine-tune firewall
rules
Configure your firewall rules to allow incoming connections only to the
ports needed for the services that you’re running
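A sketch of that recommended approach follows: accept established connections and a short list of service ports, then drop everything else. The port selection is an example only; adapt it to the services you're actually running.

```shell
# Keep already-established connections and loopback traffic working.
sudo iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
sudo iptables -A INPUT -i lo -j ACCEPT

# Allow only the ports your services need (examples: ssh and https).
sudo iptables -A INPUT -p tcp --dport 22 -j ACCEPT    # ssh
sudo iptables -A INPUT -p tcp --dport 443 -j ACCEPT   # https

# Drop all other incoming traffic by default.
sudo iptables -P INPUT DROP
```

Add the ACCEPT rules before setting the DROP policy, or an SSH session can be cut off mid-configuration.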
To entirely disable the firewall on the instance, run the following commands:
sudo iptables -P INPUT ACCEPT    # accept all incoming traffic by default
sudo iptables -P OUTPUT ACCEPT   # accept all outgoing traffic by default
sudo iptables -P FORWARD ACCEPT  # accept all forwarded traffic by default
sudo iptables -F                 # delete all rules in the filter table
sudo iptables -t nat -F          # delete all rules in the nat table
sudo iptables -t mangle -F       # delete all rules in the mangle table
sudo iptables -X                 # delete all non-built-in chains
To keep the firewall disabled after rebooting, run:
sudo iptables-save | sudo tee /etc/iptables/rules.v4 > /dev/null
Note
The firewall that restricts traffic from the Internet to your Lambda GPU Cloud
reserved instances is managed by Lambda.
For the highest performance when training, we recommend copying your dataset,
containers, and virtual environments from persistent storage to your home
directory. This can take some time but greatly increases the speed of
training.
24 - How long does it take for instances to launch?
Single-GPU instances usually take 3-5 minutes to launch.
Multi-GPU instances usually take 10-15 minutes to launch.
Note
Jupyter Notebook and
Demos can take a few minutes after an
instance launches to become accessible.
Note
Billing starts the moment an instance begins booting.
25 - Is it possible to open ports other than for SSH?
By default, all ports are open to TCP and UDP traffic. ICMP traffic is also
allowed by default.
It’s possible to allow more than one SSH key to access your instance. To do
so, you need to add public keys to ~/.ssh/authorized_keys. You can do
this with the echo command.
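For example, assuming you've already generated the new key pair, a sketch of appending its public key looks like the following. The key string is a placeholder, not a real key; substitute the contents of your public key file.

```shell
# Make sure the .ssh directory exists with the correct permissions.
mkdir -p ~/.ssh && chmod 700 ~/.ssh

# Append the additional public key; replace the placeholder with your real key.
echo 'ssh-ed25519 AAAAC3Nza...PLACEHOLDER you@your-computer' >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```

Each line in authorized_keys is one public key, so you can repeat the echo for as many keys as you need.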
Note
This FAQ assumes that you’ve already generated another SSH key pair, that is,
a private key and a public key.
Your account will be permanently banned from Lambda GPU Cloud. Your account
will be referred for collection. Legal action may be taken against you.
29 - What network bandwidth does Lambda GPU Cloud provide?
Utah, USA region (us-west-3)
The bandwidth between instances in our Utah, USA region (us-west-3) can be up
to 200 Gbps.
Bandwidth to the Internet can be up to 20 Gbps.
Texas, USA region (us-south-1)
The bandwidth between instances in our Texas, USA region (us-south-1) can be
up to 200 Gbps.
Bandwidth to the Internet can be up to 20 Gbps.
Arizona, USA region (us-west-2)
The bandwidth between instances in our Arizona, USA region (us-west-2) can be
up to 3.7 Gbps when using the instances’ private IP addresses.
When using the instances’ public IP addresses, the bandwidth can be up to
3.5 Gbps.
Bandwidth to the Internet can be up to 10 Gbps.
Virginia, USA region (us-east-1)
The bandwidth between instances in our Virginia, USA region (us-east-1) can be
up to 7.2 Gbps when using the instances’ private IP addresses.
Bandwidth to the Internet can be up to 10 Gbps.
Arizona, USA (us-west-2) ↔ Virginia, USA (us-east-1)
The bandwidth between instances in our Arizona, USA (us-west-2) and Virginia,
USA (us-east-1) regions can be up to 100 Mbps.
Note
The bandwidth tests for connections between instances were performed using
iPerf3.
The bandwidth tests for connections to the Internet were performed using
Speedtest CLI.
Note
Real-world network bandwidth depends on a variety of factors, including the
total number of connections opened by your applications.
Note
We’re in the process of testing the network bandwidth in our other regions.
30 - What should I do about timeout waiting for RPC from GSP errors?
If you’re seeing error messages about Timeout waiting for RPC from GSP! in
your instance’s logs, the system software installed on your instance needs to
be upgraded.
Note
nvidia-smi might also produce output similar to the following:
31 - Why am I seeing an error about NMI received for unknown reason?
You can safely disregard the error message: “Uhhuh. NMI received for unknown
reason […] .”
This error message might show up in, for example:
The log file /var/log/syslog.
The output of the command dmesg.
The output of the command journalctl.
The error message results from a bug in AMD’s newer processors, including
processors used in our servers. The bug has no impact other than causing the
“NMI received for unknown reason” error message to appear in system logs.
Tip
To learn more about the “NMI received for unknown reason” error message, see:
32 - Why are some instance types grayed out when I try to launch an instance?
If you try to launch an instance from the dashboard and see that the instance
type you want is grayed out, then we’re currently at capacity for that
instance type.
33 - Why can't my program find the NVIDIA cuDNN library?
Unfortunately, the
NVIDIA cuDNN license
limits how cuDNN can be used on our instances.
On our instances, cuDNN can only be used by the PyTorch® framework and
TensorFlow library installed as part of
Lambda Stack.
Other software, including PyTorch and TensorFlow installed outside of Lambda
Stack, won’t be able to find and use the cuDNN library installed on our
instances.
Tip
Software outside of Lambda Stack usually looks for the cuDNN library files in
/usr/lib/x86_64-linux-gnu. However, on our instances, the cuDNN library
files are in /usr/lib/python3/dist-packages/tensorflow.
Creating symbolic links, or “symlinks,” for the cuDNN library files might
allow your program to find the cuDNN library on our instances.
Run the following command to create symlinks for the cuDNN library files:
for cudnn_so in /usr/lib/python3/dist-packages/tensorflow/libcudnn*; do
    sudo ln -s "$cudnn_so" /usr/lib/x86_64-linux-gnu/
done
34 - Why is my credit or debit card being declined?
Common reasons why credit and debit card transactions are declined include:
The card is a prepaid card
We don’t accept prepaid cards. We only accept major credit and debit cards.
The purchase is being made from a country we don’t support
We currently only support customers in the following regions:
United States
Canada
Chile
Iceland
United Arab Emirates
Saudi Arabia
South Africa
Israel
Taiwan
South Korea
Japan
Singapore
Australia
New Zealand
United Kingdom
Switzerland
European Union
The purchase is being made while you’re connected to a VPN
Purchases made while using a VPN are flagged as suspicious.
The card issuer is denying our pre-authorization charge
We make a $10 pre-authorization charge to a card before accepting it for
payment, similar to how gas stations and hotels do. If the card issuer denies
the pre-authorization charge, then we can’t accept the card for payment.
Wrong CVV or ZIP Code is being entered
Card purchases won’t go through if the CVV (security code) is entered
incorrectly. Also, card purchases will be denied if the ZIP Code doesn’t match
with the card billing address.