# ParaTools Pro for E4S™ Getting Started with Google Cloud Platform (GCP)

## General Background Information
In the following tutorial, we roughly follow the same steps as the "quickstart tutorial" from the Google HPC-Toolkit project. For the purposes of this tutorial, we make the following assumptions:
- You have created a Google Cloud account.
- You have created a Google Cloud project appropriate for this tutorial and it is selected.
- You have set up billing for your Google Cloud project.
- You have enabled the Compute Engine API.
- You have enabled the Filestore API.
- You have enabled the Cloud Storage API.
- You have enabled the Service Usage API.
- You have enabled the Secret Manager API.
- You are aware of the costs for running instances on GCP Compute Engine and of the costs of using the ParaTools Pro for E4S™ GCP marketplace VM image.
- You are comfortable using the GCP Cloud Shell, or are running locally (which is what this tutorial assumes), are familiar with SSH and a terminal, and have installed and initialized the gcloud CLI.
## Tutorial

### Getting Set Up
First, let's grab your `PROJECT_ID` and `PROJECT_NUMBER`.
Navigate to the GCP project selector and select the project that you'll be using for this tutorial.
Take note of the `PROJECT_ID` and `PROJECT_NUMBER`.
Open your local shell or the GCP Cloud Shell, and run the following commands:
Set a default project you will be using for this tutorial. If you have multiple projects you can switch back to a different one when you are finished.
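A minimal sketch of those commands, assuming you substitute the values you noted from the project selector for the placeholders:

```bash
# Record the project ID and number noted from the project selector
# (placeholder values; replace with your own)
export PROJECT_ID=<your-project-id>
export PROJECT_NUMBER=<your-project-number>

# Set the default project used by subsequent gcloud commands
gcloud config set project "${PROJECT_ID}"
```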
Next, ensure that the default Compute Engine service account is enabled:
```bash
gcloud iam service-accounts enable \
  --project="${PROJECT_ID}" \
  ${PROJECT_NUMBER}-compute@developer.gserviceaccount.com
```
Then grant the `roles/editor` IAM role to the service account:
```bash
gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
  --member=serviceAccount:${PROJECT_NUMBER}-compute@developer.gserviceaccount.com \
  --role=roles/editor
```
### Install the Google Cloud HPC-Toolkit
First, install the dependencies of `ghpc`. Instructions to do this are included below.
If you encounter trouble, please check the latest instructions from Google,
available here. If you are running the Google Cloud Shell, you do not need to install the dependencies and can skip to cloning the HPC-Toolkit.
**Install the Google Cloud HPC-Toolkit Prerequisites**
Please download and install any missing software packages from the following list:
- Terraform version 1.2.0 or later
- Packer version 1.7.9 or later
- Go version 1.18 or later. Ensure that the `GOPATH` is set up and `go` is on your `PATH`. You may need to add a few lines to your `.profile` or `.bashrc` startup "dot" file (see the sketch after this list).
- Git
- `make` (see below for instructions specific to your OS)
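For the `GOPATH` setup, a minimal sketch of the kind of lines you might add to your startup file (the exact paths are assumptions; adjust them to your Go installation):

```bash
# Make Go-installed binaries visible to your shell
export GOPATH="${HOME}/go"
export PATH="${PATH}:${GOPATH}/bin"
```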
Note

Most of the packages above may be installable through your OS's package manager.
For example, if you have Homebrew on macOS you should be able to `brew install <package_name>`
for most of these items, where `<package_name>` is, e.g., `go`.
Once all the software listed above has been verified and/or installed, clone the Google Cloud HPC-Toolkit and change directories to the cloned repository:
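For example, assuming the upstream GoogleCloudPlatform/hpc-toolkit repository (substitute a fork if you use one):

```bash
git clone https://github.com/GoogleCloudPlatform/hpc-toolkit.git
cd hpc-toolkit
```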
Next, build the HPC-Toolkit and verify the version and that it built correctly. If you would like to install the compiled binary to a location on your `$PATH`, you can install the `ghpc` binary into `/usr/local/bin`, or, if you do not have `root` privileges or do not want to install the binary into a system-wide location, you can install `ghpc` into `${HOME}/bin` and then ensure this is on your path (see the sketch below).
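A sketch of those steps (the `install` destinations are assumptions; any directory on your `PATH` works):

```bash
# Build ghpc and check that it runs and reports its version
make
./ghpc --version

# Option 1: install system-wide (requires root privileges)
sudo install ghpc /usr/local/bin/

# Option 2: install into ${HOME}/bin and put it on your PATH
mkdir -p "${HOME}/bin"
install ghpc "${HOME}/bin/"
export PATH="${HOME}/bin:${PATH}"
```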
### Grant ADC Access to Terraform and Enable OS Login
Generate cloud credentials associated with your Google Cloud account and grant Terraform access to the Application Default Credential (ADC).
Note
If you are using the Cloud Shell you can skip this step.
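Assuming the standard gcloud workflow, this is done with:

```bash
# Generate Application Default Credentials for Terraform to use
gcloud auth application-default login
```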
To be able to connect to VMs in the cluster, OS Login must be enabled. Unless OS Login is already enabled at the organization level, enable it at the project level.
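A minimal sketch of doing this, assuming OS Login is enabled through project-wide metadata:

```bash
# Enable OS Login for all VMs in the project
gcloud compute project-info add-metadata \
  --metadata enable-oslogin=TRUE
```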
### Deploy the Cluster
Copy the ParaTools-Pro-slurm-cluster-blueprint-example from the ParaTools Pro for E4S™ documentation to your clipboard, then paste it into a file named `ParaTools-Pro-Slurm-Cluster-Blueprint.yaml`. After copying the text, in your terminal do the following:
```bash
cat > ParaTools-Pro-Slurm-Cluster-Blueprint.yaml
# paste the copied text # (1)
# press Ctrl-d to add an end-of-file character
cat ParaTools-Pro-Slurm-Cluster-Blueprint.yaml # Check the file copied correctly # (2)
```
1. Usually `Ctrl-v`, or `Command-v` on macOS
2. This is optional, but usually a good idea
Using your favorite editor, select appropriate instance types for the compute partitions, and remove the h3 partition if you do not have access to h3 instances yet. See the expandable annotations and pay extra attention to the highlighted lines in the ParaTools-Pro-slurm-cluster-blueprint-example.
Pay Attention
In particular:
- Determine if you want to pass the `${PROJECT_ID}` on the command line or in the blueprint
- Verify that the `image_family` key matches the image for ParaTools Pro for E4S™ from the GCP marketplace
- Adjust the region and zone used, if desired
- Limit the IP `ranges` to those you will be connecting from via SSH in the `ssh-login` `firewall_rules` rule, if in a production setting. If you plan to connect only from the cloud shell, the `ssh-login` `firewall_rules` rule may be completely removed.
- Set an appropriate `machine_type` and `dynamic_node_count_max` for your `compute_node_group`.
Once the blueprint is configured to be consistent with your GCP usage quotas and your preferences, set deployment variables and create the deployment folder.
Create deployment folder

```bash
./ghpc create e4s-23.11-cluster-slurm-gcp-5-9-hpc-rocky-linux-8.yaml \
  --vars project_id=${PROJECT_ID} # (1)!
```
1. If you uncommented and updated the `vars.project_id:` you do not need to pass `--vars project_id=...` on the command line. If you're bringing a cluster back online that was previously deleted, but the blueprint has been modified and the deployment folder is still present, the `-w` flag will let you overwrite the deployment folder contents with the latest changes.
Note
It may take a few minutes to finish provisioning your cluster.
Now the cluster can be deployed. Run the following command to deploy your ParaTools Pro for E4S™ cluster:
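The deploy command takes the deployment folder created by `./ghpc create` above; `<deployment-folder>` below is a placeholder for the deployment name set in your blueprint:

```bash
./ghpc deploy <deployment-folder>
```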
At this point you will be prompted to review or accept the proposed changes. You may review them if you like, but you should press `a` for accept once satisfied.
### Connect to the Cluster
Once the cluster is deployed, ssh to the login node.

1. Go to the Compute Engine > VM Instances page.
2. Click on `ssh` for the login node of the cluster. You may need to approve Google authentication before the session can connect.
### Deletion of the Cluster
It is very important that when you are done using the cluster you destroy it with `ghpc`. If your instances were deleted in a different manner, see here. To delete your cluster correctly, do the following:
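Assuming the same placeholder deployment folder used during deployment:

```bash
./ghpc destroy <deployment-folder>
```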
At this point you will be prompted to review or accept the proposed changes. You may review them if you like, but you should press `a` for accept once satisfied, and the deletion will proceed.