# ParaTools Pro for E4S™ Getting Started with Google Cloud Platform (GCP)

## General Background Information
In the following tutorial, we roughly follow the same steps as the "quickstart tutorial" from the Google HPC-Toolkit project. For the purposes of this tutorial, we make the following assumptions:
- You have created a Google Cloud account.
- You have created a Google Cloud project appropriate for this tutorial and it is selected.
- You have set up billing for your Google Cloud project.
- You have enabled the Compute Engine API.
- You have enabled the Filestore API.
- You have enabled the Cloud Storage API.
- You have enabled the Service Usage API.
- You have enabled the Secret Manager API.
- You are aware of the costs for running instances on GCP Compute Engine and of the costs of using the ParaTools Pro for E4S™ GCP marketplace VM image.
- You are comfortable using the GCP Cloud Shell, or are running locally (which is what this tutorial assumes), are familiar with SSH and a terminal, and have installed and initialized the gcloud CLI.
## Tutorial

### Getting Set Up
First, let's grab your `PROJECT_ID` and `PROJECT_NUMBER`.
Navigate to the GCP project selector and select the project that you'll be using for this tutorial.
Take note of the `PROJECT_ID` and `PROJECT_NUMBER`.
Open your local shell or the GCP Cloud Shell, and run the following commands:
Set a default project you will be using for this tutorial. If you have multiple projects you can switch back to a different one when you are finished.
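A minimal sketch of those commands, assuming you substitute the values you noted from the project selector for the placeholders:

```bash
# Record the project ID and number noted from the project selector
# (placeholder values; replace with your own)
export PROJECT_ID=<your-project-id>
export PROJECT_NUMBER=<your-project-number>

# Set the default project used by subsequent gcloud commands
gcloud config set project "${PROJECT_ID}"
```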
Next, ensure that the default Compute Engine service account is enabled:
```bash
gcloud iam service-accounts enable \
  --project="${PROJECT_ID}" \
  ${PROJECT_NUMBER}-compute@developer.gserviceaccount.com
```
Then grant the `roles/editor` IAM role to the service account:
```bash
gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
  --member=serviceAccount:${PROJECT_NUMBER}-compute@developer.gserviceaccount.com \
  --role=roles/editor
```
### Install the Google Cloud HPC-Toolkit
First, install the dependencies of `ghpc`. Instructions to do this are included below.
If you encounter trouble, please check the latest instructions from Google,
available here. If you are running the Google Cloud Shell, you do not need to install the dependencies and can skip to cloning the HPC-Toolkit.
**Install the Google Cloud HPC-Toolkit Prerequisites**
Please download and install any missing software packages from the following list:
- Terraform version 1.2.0 or later
- Packer version 1.7.9 or later
- Go version 1.18 or later. Ensure that the `GOPATH` is set up and `go` is on your `PATH`. You may need to add a few lines to your `.profile` or `.bashrc` startup "dot" file (see the sketch after this list).
- Git
- `make` (see below for instructions specific to your OS)
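For the `GOPATH` setup, a minimal sketch of the kind of lines you might add to your startup file (the exact paths are assumptions; adjust them to your Go installation):

```bash
# Make Go-installed binaries visible to your shell
export GOPATH="${HOME}/go"
export PATH="${PATH}:${GOPATH}/bin"
```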
Note

Most of the packages above may be installable through your OS's package manager.
For example, if you have Homebrew on macOS you should be able to `brew install <package_name>`
for most of these items, where `<package_name>` is, e.g., `go`.
Once all the software listed above has been verified and/or installed, clone the Google Cloud HPC-Toolkit and change directories to the cloned repository:
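For example, assuming the upstream GoogleCloudPlatform/hpc-toolkit repository (substitute a fork if you use one):

```bash
git clone https://github.com/GoogleCloudPlatform/hpc-toolkit.git
cd hpc-toolkit
```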
Next, build the HPC-Toolkit and verify the version and that it built correctly. If you would like to install the compiled binary to a location on your `$PATH`, you can install the `ghpc` binary into `/usr/local/bin`, or, if you do not have `root` privileges or do not want to install the binary into a system-wide location, you can install `ghpc` into `${HOME}/bin` and then ensure this is on your path (see the sketch below).
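A sketch of those steps (the `install` destinations are assumptions; any directory on your `PATH` works):

```bash
# Build ghpc and check that it runs and reports its version
make
./ghpc --version

# Option 1: install system-wide (requires root privileges)
sudo install ghpc /usr/local/bin/

# Option 2: install into ${HOME}/bin and put it on your PATH
mkdir -p "${HOME}/bin"
install ghpc "${HOME}/bin/"
export PATH="${HOME}/bin:${PATH}"
```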
### Grant ADC Access to Terraform and Enable OS Login
Generate cloud credentials associated with your Google Cloud account and grant Terraform access to the Application Default Credential (ADC).
Note
If you are using the Cloud Shell you can skip this step.
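Assuming the standard gcloud workflow, this is done with:

```bash
# Generate Application Default Credentials for Terraform to use
gcloud auth application-default login
```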
To be able to connect to VMs in the cluster, OS Login must be enabled. Unless OS Login is already enabled at the organization level, enable it at the project level.
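A minimal sketch of doing this, assuming OS Login is enabled through project-wide metadata:

```bash
# Enable OS Login for all VMs in the project
gcloud compute project-info add-metadata \
  --metadata enable-oslogin=TRUE
```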
### Deploy the Cluster
Copy the ParaTools-Pro-slurm-cluster-blueprint-example from the ParaTools Pro for E4S™ documentation to your clipboard, then paste it into a file named `ParaTools-Pro-Slurm-Cluster-Blueprint.yaml`. After copying the text, in your terminal do the following:
```bash
cat > ParaTools-Pro-Slurm-Cluster-Blueprint.yaml
# paste the copied text # (1)
# press Ctrl-d to add an end-of-file character
cat ParaTools-Pro-Slurm-Cluster-Blueprint.yaml # Check the file copied correctly # (2)
```
1. Usually `Ctrl-v`, or `Command-v` on macOS
2. This is optional, but usually a good idea
Using your favorite editor, select appropriate instance types for the compute partitions, and remove the h3 partition if you do not have access to h3 instances yet. See the expandable annotations and pay extra attention to the highlighted lines in the ParaTools-Pro-slurm-cluster-blueprint-example.
Pay Attention
In particular:
- Determine if you want to pass the `${PROJECT_ID}` on the command line or in the blueprint
- Verify that the `image_family` key matches the image for ParaTools Pro for E4S™ from the GCP marketplace
- Adjust the region and zone used, if desired
- Limit the IP `ranges` to those you will be connecting from via SSH in the `ssh-login` `firewall_rules` rule, if in a production setting. If you plan to connect only from the cloud shell, the `ssh-login` `firewall_rules` rule may be completely removed.
- Set an appropriate `machine_type` and `dynamic_node_count_max` for your `compute_node_group`.
Once the blueprint is configured to be consistent with your GCP usage quotas and your preferences, set deployment variables and create the deployment folder.
Create deployment folder

```bash
./ghpc create e4s-23.11-cluster-slurm-gcp-5-9-hpc-rocky-linux-8.yaml \
  --vars project_id=${PROJECT_ID} # (1)!
```
1. If you uncommented and updated the `vars.project_id:` you do not need to pass `--vars project_id=...` on the command line. If you're bringing a cluster back online that was previously deleted, but the blueprint has been modified and the deployment folder is still present, the `-w` flag will let you overwrite the deployment folder contents with the latest changes.
Note
It may take a few minutes to finish provisioning your cluster.
Now the cluster can be deployed. Run the following command to deploy your ParaTools Pro for E4S™ cluster:
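The deploy command takes the deployment folder created by `./ghpc create` above; `<deployment-folder>` below is a placeholder for the deployment name set in your blueprint:

```bash
./ghpc deploy <deployment-folder>
```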
At this point you will be prompted to review or accept the proposed changes. You may review them if you like, but you should press `a` for accept once satisfied.
### Connect to the Cluster
Once the cluster is deployed, ssh to the login node.

1. Go to the Compute Engine > VM Instances page.
2. Click on `ssh` for the login node of the cluster. You may need to approve Google authentication before the session can connect.
### Deletion of the Cluster
It is very important that when you are done using the cluster you destroy it with `ghpc`. If your instances were deleted in a different manner, see here. To delete your cluster correctly, do the following:
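Assuming the same placeholder deployment folder used during deployment:

```bash
./ghpc destroy <deployment-folder>
```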
At this point you will be prompted to review or accept the proposed changes. You may review them if you like, but you should press `a` for accept once satisfied, and the deletion will proceed.