Monday, 16 January 2017

The Sun in the Clouds: Using Amazon EC2 for High Performance Computing

Over the last few years HPC on demand has become a practical option, with a strong market lead from Microsoft with its Azure service and from Amazon with EC2.

Traditional HPC systems are normally shared systems that use scheduling software to optimise utilisation. Although these systems satisfy the needs of specific communities with large computational demands (e.g. the computational solar physics community), running such a system for a large, diverse community is a challenge. The main issues are the wait time for a system to come online, the necessity of scheduling jobs, and the fact that the operating system and software stack are dictated by the wider user community.

On-demand research computing infrastructures such as those delivered by Amazon EC2 and Azure can reduce the time taken to get a system online with the software stack required by the researcher. Such an environment is ideal for researchers working interactively to develop research applications and to analyse and visualise data. Setting up such systems is made much easier by open-source deployment software such as Alces Flight.

In this post we use Alces Flight and Amazon EC2 to set up a compute node with solar physics software developed at The University of Sheffield, e.g. SMAUG, SAC and pysac. We use Amazon EC2 nodes with GPUs to run SMAUG and compare the costs and performance of running the codes on local HPC infrastructure and on Amazon EC2. We also describe how to set up the software stack on Amazon EC2 using Alces Flight.

To run the benchmarks the following steps were followed:
  1. Set up and configure account with Amazon EC2
  2. Set up machine using Alces Flight
  3. Configure Flight and install the required packages
  4. Configure smaug and run the benchmarks
  5. Performance results

Set up and configure account with Amazon EC2

Setting up and configuring an account with EC2 is straightforward: simply register via the AWS home page.

After logging in with your account you are presented with the console, which offers a large range of services; we will use the EC2 service. To get started we need to set up a key pair, which is used to allow secure access to the machine. To configure this from the EC2 console, select Key Pairs under the Network & Security group. This shows a list of the active key pairs that you own, together with buttons for creating and deleting key pairs. From here we also download a key pair to our local/client machine so that we can access our instance.

The Amazon documentation is very good; it includes information on setting up a key pair.
To access my Amazon instances it was necessary to download the Amazon key (.pem file) and copy it into the .ssh folder under my home directory. Note that the permissions for this key must be set correctly, e.g. using the command
chmod 400 my-key-pair.pem
For the PuTTY and WinSCP clients on Windows it is necessary to convert the .pem key to a .ppk file using PuTTYgen; details are given in the Amazon documentation.
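A key pair can also be created from the command line with the AWS CLI. The following is a minimal sketch, assuming the CLI has already been configured with your credentials; the key name my-key-pair is just an example.

#create a key pair with the AWS CLI and save the private key locally
aws ec2 create-key-pair --key-name my-key-pair --query 'KeyMaterial' --output text > my-key-pair.pem
chmod 400 my-key-pair.pem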

Set up machine using Alces Flight

Using Alces Flight has been made much easier by some excellent articles and documentation. The Amazon news article below provides a good overview.
New in AWS Marketplace: Alces Flight – Effortless HPC on Demand
However, to get started the online documentation is once again very good.
Flight Appliance Documentation

When configuring the machine it is important to use the Amazon spot market and to start with instances of type t2.micro; this keeps set-up costs to a minimum. As soon as the machine has been configured we change the instance type to either g2.2xlarge or p2.8xlarge for our GPU tests.
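The instance type can be changed from the EC2 console while the instance is stopped, or scripted with the AWS CLI. The following is a minimal sketch; the instance id i-0123456789abcdef0 is a placeholder for your own instance.

#stop the instance, switch it to a GPU instance type, then restart it
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --instance-type "{\"Value\": \"g2.2xlarge\"}"
aws ec2 start-instances --instance-ids i-0123456789abcdef0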

Setting up the instance with Alces Flight

  1. Log in to your Amazon account, select the EC2 dashboard and click on Instances to see all your machines.
  2. Visit the Alces Flight documentation and select the option for launching an AWS instance, e.g. the step-by-step instructions.
  3. Go to the Amazon AWS Marketplace, type "alces flight" in the search box and select the Community Edition. Select the region for pricing, e.g. EU (Ireland), and the delivery type, e.g. AMI or HPC instance.
  4. In the next dialog check the version, region and EC2 instance type, and also check the key pair used (this should correspond to the name of one of the keys in your AWS management console; check under Network & Security -> Key Pairs, where there is also an option to create a key pair).
  5. Clicking one-click launch will bring up a confirmation message; check your AWS console under Instances and get a coffee while the instance initialises.
  6. Log in using ssh -Y -i "yourkey.pem" alces@XX.XXX.XX.X (the .pem file downloaded from the console, which should be accessible from where you log in).

Usage instructions and connection information for the instance can be obtained by ticking the activated instance in the EC2 control panel and clicking the usage instructions tab, or by right-clicking the activated instance in the control panel and selecting Connect from the drop-down menu.
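The public IP address needed for the ssh command can also be looked up with the AWS CLI; a minimal sketch, with the instance id again a placeholder:

#query the public IP address of the running instance
aws ec2 describe-instances --instance-ids i-0123456789abcdef0 --query 'Reservations[0].Instances[0].PublicIpAddress' --output text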

Configure Flight and install the required packages

When you have logged into your instance, the prompt will show that the node is unconfigured, along with a warning message that it is not yet operational. Use the command alces configure node to configure it. For running single-GPU tests with SMAUG we just require a master node without any slave nodes.

The following commands return the configuration information

alces about node
alces about identity
To start a gnome session use the command:
alces session start gnome
The session will respond with connection information which can be used with client applications such as TigerVNC.
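For example, if the session reports a display such as XX.XXX.XX.X:1 together with a password, the desktop can typically be reached from a local machine with the TigerVNC viewer (the display number here is only an example):

#connect to the remote gnome desktop with the TigerVNC client
vncviewer XX.XXX.XX.X:1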
Typical commands used from Alces Flight are:

'module avail'            - show available application environments
'module add <modulename>' - add a module to your current environment

'alces gridware'          - manage software for your environment
'alces howto'             - guides on how to use your research environment
'alces session'           - start and manage interactive sessions
'alces storage'           - configure and address storage facilities
'alces template'          - tailored job script templates

'qstat'                   - show summary of running jobs
'qsub'                    - submit a job script
'qdesktop'                - submit an interactive session request

'aws help'                - show help for AWS CLI

's3cmd --help'            - show help for S3cmd
's3cmd ls [<bucket>]'     - list objects or buckets
's3cmd put <file> <s3>'   - put file into bucket
's3cmd get <s3> <file>'   - get file from bucket

To install the required packages we use the alces gridware install command. Alces Gridware is a powerful and extremely useful utility; the diverse range of packages that may be installed can be seen in the Gridware package repository.
The following lines can be placed in a script file which may then be run, and can be used for testing and experimenting with the applications that we will use.


#set up packages using alces gridware
alces gridware install compilers/gcc
alces gridware install ffmpeg
alces gridware install imagemagick
alces gridware install graphicsmagick
alces gridware install grace
alces gridware install paraview
alces gridware install nvidia-cuda
alces gridware install anaconda3
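Once installed, the packages are made available as environment modules. The exact module names depend on the versions Gridware resolves for your instance; as a sketch, the Anaconda package installed above might be loaded with something like the following.

#list the installed modules and load one of them (module names/versions may differ)
module avail
module load apps/anaconda3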

Further information for working with the Alces cluster is available in the Alces Flight documentation.
To install the SAC and SMAUG applications required for running our tests, the following commands were run from a script file:


#script to upload and install projects in an Alces Flight installation

cd ~
mkdir proj

svn checkout --username mikeg64
svn checkout --username mikeg64

git clone
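If the final clone above fetches the pysac repository, it can be installed into the Anaconda Python environment in the usual way for a Python package. This is only a sketch; the directory name pysac and the use of pip are assumptions.

#install pysac into the active Python environment (directory name assumed)
cd ~/pysac
pip install -e .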

Configure smaug and run the benchmarks

To install the version of SMAUG on GitHub (commit 788e630) used for the performance tests, the following git commands were used:

cd smaug
git checkout 788e6303aa2ac670f8490f6e193587cbedf7383a

or, equivalently, using the short commit hash:

cd smaug
git reset --hard 788e630
We will be using the Orszag-Tang model test. To set this up, change to the src folder of the smaug distribution and type
make ot

To compile with the version of CUDA on the Amazon instance, load the CUDA compiler module installed with Alces Gridware. To find the path to the nvcc compiler, type which nvcc. This path should be used in the make_inputs file; it will be necessary to change the CUDA and CUDACCFLAGS settings as illustrated below.
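Loading the module and locating the compiler might look like the following; the depot hash and version string are specific to our instance and may differ on yours.

#load the Gridware CUDA toolkit module and locate the nvcc compiler
module load libs/nvidia-cuda
which nvcc
#e.g. /opt/gridware/depots/77cbfdae/el7/pkg/libs/nvidia-cuda/7.5.18/bin/toolkit/bin/nvcc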

Compilation used the following settings:
CUDA = /opt/gridware/depots/77cbfdae/el7/pkg/libs/nvidia-cuda/7.5.18/bin/toolkit
CUDACCFLAGS = --ptxas-options=-v -arch sm_20  -maxrregcount=32 -DUSE_SAC

It is necessary to set the architecture flag correctly. Guidelines for setting the architecture flag for GPUs are documented by NVIDIA; the GPU feature list from the nvcc documentation is summarised below.

In the CUDA naming scheme, GPUs are named sm_xy, where x denotes the GPU generation number, and y the version in that generation. Additionally, to facilitate comparing GPU capabilities, CUDA attempts to choose its GPU names such that if x1y1 <= x2y2 then all non-ISA related capabilities of sm_x1y1 are included in those of sm_x2y2. From this it indeed follows that sm_30 is the base Kepler model, and it also explains why higher entries in the tables are always functional extensions to the lower entries. This is denoted by the plus sign in the table. Moreover, if we abstract from the instruction encoding, it implies that sm_30's functionality will continue to be included in all later GPU generations. As we will see next, this property will be the foundation for application compatibility support by nvcc.

sm_20 (deprecated)
Basic features
+ Fermi support
sm_30 and sm_32
+ Kepler support
+ Unified memory programming
sm_35
+ Dynamic parallelism support
sm_50, sm_52, and sm_53
+ Maxwell support
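Since the GRID K520 in the g2.2xlarge instance is a Kepler device with compute capability 3.0, the sm_20 flag shown above still works (the embedded PTX is JIT-compiled for the newer device), but the flag can also be set to match the hardware. A possible make_inputs line for the K520 would be:

CUDACCFLAGS = --ptxas-options=-v -arch sm_30  -maxrregcount=32 -DUSE_SAC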

After configuring the machine we stop the instance and change the instance type to g2.2xlarge, ready to run the compiled version of SMAUG on the GPU.

The GPU information for our Amazon image is as follows:
Device: GRID K520 on Amazon g2.2xlarge
4132864 GDDR memory
CUDA compute capability 3.0
8 multiprocessors
0.797 GHz clock
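This information can be checked on any instance with the nvidia-smi utility (or with the deviceQuery example if the CUDA samples have been built); for example:

#report the GPU name, memory and clock for the attached device
nvidia-smi
nvidia-smi -q -d MEMORY,CLOCK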

The configuration files for the Orszag-Tang test are contained in a public compressed archive at the following link (a shortened link is also given).


To use these we change the model size and config filename in the iosmaugparams.h file in the include directory.
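As a purely illustrative sketch, such an edit can be scripted with sed; the parameter names ni and nj below are hypothetical placeholders, and the real names and values in iosmaugparams.h may differ.

#hypothetical example: adjust the model grid size set in iosmaugparams.h
cd ~/smaug/include
sed -i 's/ni=128/ni=256/' iosmaugparams.h
sed -i 's/nj=128/nj=256/' iosmaugparams.h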

Performance results

The table below shows the timings obtained for a range of different NVIDIA GPUs; these are the timings in seconds for 100 iterations.

The speed-up factor compared to a single core of the Intel Xeon X5650 (Westmere) CPU is shown in the plot below.

Although our results show that the K520 is slower than the M2070 and the K20, they still demonstrate the usefulness of the K520 for running smaller MHD models. We will report on the results for the larger Amazon GPU instances as the tests continue. The ease with which we have been able to access and set up GPUs is highly encouraging.

Further details about SMAUG and its benchmarking are given in the publications listed below.

Useful References

Guide for researchers by Amazon
AWS Global Data Egress Waiver - removes the worry of estimating network traffic charges and allows invoice billing to be set up.

New P2 Instance Type for Amazon EC2 – Up to 16 GPUs

AWS Marketplace - alces flight instances

A Fast MHD Code for Gravitationally Stratified Media using Graphical Processing Units: SMAUG

Magnetohydrodynamic code for gravitationally-stratified media



