Cluster

From RedwoodCenter
Revision as of 23:42, 16 April 2013 by Urs (talk | contribs) (Added details about current and future hardware.)

Video Tutorial

Taught by our cluster aficionado and computer master, Mayur Mudigonda. Please watch it and/or read the information below before asking him questions: [1]

General Information

Contrary to popular belief, our cluster is not a magical computational powerhouse that will take your code and make it run hundreds of times faster. Read on to find out what it is, and how you might utilize it.

We have about a dozen somewhat heterogeneous machines, many of which can be matched or exceeded in performance today by a $300-$400 desktop, or a laptop costing twice as much. There are exceptions. A couple of machines have graphics cards (GPUs) which cost about the same, but to take advantage of them your code needs to be written specifically for the GPU using CUDA / OpenCL, or it needs to call into libraries and packages which do that dirty work for you, such as PyCUDA / PyOpenCL or Jacket for Matlab. A few machines have a bit of extra memory (12-16 GB). The network connectivity is comparable to what we have in the lab (i.e. it is not some exotic ultra-fast network interface with a fancy topology).

Given the above, there are three typical use cases for the cluster. First, you have independent jobs that can run in parallel, so several machines will complete the task faster even though no single machine may be faster than your own laptop. Second, you have a long-running job which may take a day, and you don't want to leave your laptop on at all times and be unable to use it. Third, your code uses a communication scheme (such as MPI) to have multiple machines work cooperatively on a problem. For the cluster to be useful and well utilized, everyone should submit jobs to the queue (see qsub further down on this page for the details); a job may not start right away, but it will run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes to perform computation.

Hardware Overview

We are in the process of upgrading the older machines in the cluster. We currently have:

 2x  n0000-n0001  Intel Xeon E5410 @ 2.33GHz  4-core  (2007)    with Tesla GPUs
 10x n0002-n0011  AMD Opteron 248  @ 2.2GHz   2-core  (2003/4)  no GPU
 2x  n0012-n0013  Intel Xeon X5650 @ 2.66GHz  6-core  (2010)    with Fermi GPUs

The AMD nodes will be replaced in April 2013 by

 10x n0002-n0011 Intel Xeon E3-1220 @ 3.1GHz 4-core (2012)

In addition to the compute nodes we own a file server

 NetOp 4TB

which is mounted as scratch space.


Getting an account and crypto card

If reading the above does not deter you, in order to get an account on the cluster, please send an email to Mayur Mudigonda (lastname AT berk...edu) with the following information:

   Full Name <emailaddress> desiredusername

You can also include a note about which PI you are working with. Note: the desired username must be 3-8 characters long, so desiredusername would have been truncated to desiredu in this case.

It takes the SCS folks about one to two weeks to make the accounts and ship a new crypto card token to you. You'll need to sign and fax back to them a form that arrives with the crypto card, and then leave the hardcopy of it in my box. -pi

Directory setup

home directory quota

There is a 10GB quota limit enforced on $HOME directory (/global/home/users/username) usage. Please keep your usage below this limit. There will be NETAPP snapshots in place on this file system, so we suggest you store only your source code and scripts in this area and store all your data under /clusterfs/cortex (see below).

In order to see your current quota and usage, use the following command:

 quota -s

data

For large amounts of data, please create a directory

 /clusterfs/cortex/scratch/username

and store the data inside that directory. Note that, unlike the home directory, scratch space is not backed up and the permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive, which is shared by everyone at the Redwood Center.
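As a minimal sketch of the step above: the helper name make_scratch is hypothetical, and the guard around the real path is only there so the snippet does nothing when run on a machine that does not have /clusterfs mounted.

```shell
#!/bin/sh
# Hypothetical helper: create <base>/<username> and print the resulting path.
make_scratch() {
    # $1 = base directory, $2 = username
    mkdir -p "$1/$2" && echo "$1/$2"
}

# Only touch the real scratch space if it is actually mounted here.
if [ -d /clusterfs/cortex/scratch ]; then
    make_scratch /clusterfs/cortex/scratch "$(id -un)"
fi
```

Running du -sh on your scratch directory from time to time is a simple way to see how much of the shared 4 TB you are using.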

Connect

get a password

  • Press the "PASSWORD" button to power on the CryptoCard. You will see a "PIN?" prompt.
  • Enter your PIN, and press the "ENT" key.
  • You should see 7 digits presented like a phone number; this is your one-time password

ssh to the gateway computer (hadley)

note: please don't use the gateway for computations (e.g. matlab)!

 ssh -Y neuro-calhpc.berkeley.edu (or hadley.berkeley.edu) 

and use your crypto password

Setup environment

  • put all your customizations into your .bashrc
  • for login shells, .bash_profile is used, which in turn loads .bashrc
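Bash reads .bash_profile only for login shells, and .bashrc is picked up there only if .bash_profile sources it explicitly. If yours does not already do so, the usual idiom looks like the sketch below (an assumption about your setup, not a copy of the cluster's default files):

```shell
# ~/.bash_profile
# Load the per-shell customizations from ~/.bashrc, if that file exists.
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi
```

With this in place, settings such as the NODES export or module load lines only need to live in .bashrc.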

Using a Windows machine

Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Install the following two pieces of software as a workaround:

  • Install a Unix environment emulator to interface directly with the cluster. Cygwin [2] seems to work well. Useful tools include the text-editor "vim" as well as the X-11 forwarding package for displaying graphics on your local machine. Login via:
 ssh -Y [username]@hadley.berkeley.edu
  • Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [3] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.

Useful commands

Start interactive session on compute node

  • start interactive session (connecting to the gateway with ssh -C enables compression, which can improve X11 forwarding speed):
 qsub -X -I
  • start interactive session on particular node (nodes n0000.cortex and n0001.cortex have GPUs):
 qsub -X -I -l nodes=n0001.cortex

Perceus commands

The perceus manual is here

  • listing available cluster nodes:
 wwstats
  • list cluster usage
 wwtop
  • to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc
 export NODES='*cortex'
  • module list
  • module avail
  • module help


Resource Manager PBS

  • Job Scheduler MOAB
  • List running jobs:
 qstat -a
  • List the nodes allocated to a given job (here, job 98):
 qstat -n 98
  • sample script
 #!/bin/bash
 
 #PBS -q cortex
 #PBS -l nodes=1:ppn=2:cortex
 #PBS -l walltime=01:00:00
 #PBS -o path-to-output
 #PBS -e path-to-error
 cd /global/home/users/kilian/sample_executables
 cat $PBS_NODEFILE
 mpirun -np 8 /bin/hostname
 sleep 60
  • submit script
 qsub scriptname
  • interactive session
 qsub -I -q cortex -l nodes=1:ppn=2:cortex -l walltime=00:15:00
  • flush STDOUT and STDERR to files in your home directory so you can tail the output of the job while it's running
 qsub -k oe scriptname
  • remove a queued/running job (you can get the job_id from qstat)
 qdel job_id
  • list nodes that your job is running on
 cat $PBS_NODEFILE
  • run the program on several cores
 mpirun -np 4 -mca btl ^openib sample_executables/mpi_hello
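The sample commands above hard-code the value passed to mpirun -np. Since $PBS_NODEFILE contains one line per allocated core, a job script can derive the process count from it instead. The following is a sketch under that assumption; count_slots is a hypothetical helper name, and the mpi_hello path is the one used above.

```shell
#!/bin/bash
# Hypothetical helper: count the lines (= allocated cores) in a node file.
count_slots() {
    wc -l < "$1" | tr -d ' '
}

if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    nprocs=$(count_slots "$PBS_NODEFILE")
    echo "running on $nprocs slots"
    # Start exactly as many MPI processes as cores were requested.
    mpirun -np "$nprocs" -mca btl '^openib' sample_executables/mpi_hello
else
    # Outside a PBS job the variable is not set, so do nothing.
    echo "PBS_NODEFILE is not set; run this inside a PBS job" >&2
fi
```

This keeps the -np value in step with the nodes=...:ppn=... request, so changing the resource request does not require editing the mpirun line.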

Finding out the list of occupants on each cluster node

  • One can find out which users are using a particular node by logging into the node via ssh, e.g.
 ssh n0000.cortex
  • After logging into the node, type
 top
  • This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.

Starting multiple jobs with one command

  • To iterate over a one-dimensional parameter space, you can use the -t option of qsub. From the man page:
  -t array_request
 Specifies the task ids of a job array. Single task arrays are allowed.
 The array_request argument is an integer id or a range of integers.
 Multiple ids or id ranges can be combined in a comma delimited list.
 Examples: -t 1-100 or -t 1,10,50-100
 An optional slot limit can be specified to limit the amount of jobs
 that can run concurrently in the job array. The default value is
 unlimited. The slot limit must be the last thing specified in the
 array_request and is delimited from the array by a percent sign (%).
  
 qsub script.sh -t 0-299%5
 
 This sets the slot limit to 5. Only 5 jobs from this array can run at
 the same time. Note: You can use qalter to modify slot limits on an
 array. The server parameter max_slot_limit can be used to set a global
 slot limit policy.

When using -t to start an array of jobs, each job should use the environment variable called PBS_ARRAYID to figure out which parameter value to grab. In the case above, qsub script.sh -t 0-299%5 would launch 300 jobs, but only run 5 of those jobs at a time, and each job should figure out its job number by using the PBS_ARRAYID variable. Try it with a small set of jobs, where script.sh just does this: echo $PBS_ARRAYID to see how it would work.
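As an illustration of the mapping from task id to parameter value, here is a minimal sketch of such a job script. The parameter name epsilon, its start value 0.8 and its step 0.1 are made up for the example; only PBS_ARRAYID comes from the scheduler.

```shell
#!/bin/bash
#PBS -q cortex
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:10:00

# PBS_ARRAYID is set by the scheduler for each task of a job array.
# Defaulting to 0 lets you try the script locally without qsub.
task_id="${PBS_ARRAYID:-0}"

# Hypothetical one-dimensional grid: task ids 0, 1, 2, ... map to
# epsilon = 0.80, 0.90, 1.00, ...
epsilon=$(awk -v i="$task_id" 'BEGIN { printf "%.2f", 0.8 + 0.1 * i }')

echo "task $task_id uses epsilon=$epsilon"
# Replace the echo with the real work, e.g. a program that takes $epsilon.
```

Saved under a name of your choice and submitted with -t 0-4, this would give five tasks covering epsilon from 0.80 to 1.20.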

The -t option only covers ranges of integers. If you want to pass specific values, including strings, you can use the -v option of qsub to pass variables from the submitting shell to the job script.

Software

Matlab

note: remember to start an interactive session before starting matlab!

In order to use matlab, you have to load the matlab environment:

 module load matlab

Once the matlab environment is loaded, you can start a matlab session by running

 matlab -nodesktop

An example PBS script for running matlab code is

 #!/bin/bash
 #PBS -q cortex
 # request 1 node with 2 CPUs
 #PBS -l nodes=1:ppn=2
 # reserve time on the selected cores
 #PBS -l walltime=01:00:00
 module load matlab
 matlab -nodisplay -nojvm << EOF
 test  % here you should have whatever you would normally type in the Matlab prompt
 exit
 EOF

If you would like to see who is using matlab licenses, enter

 lmstat

Python

We have several Python Distributions installed: The Enthought Python Distribution (EPD), the Source Python Distribution (SPD) and Sage. The easiest way to get started is probably to use EPD (see below).

Enthought Python Distribution (EPD)

We have the Enthought Python Distribution 7.2.0 installed [EPD]. To use it, follow these steps:

  • login to the gateway server using "ssh -Y" (see above)
  • start an interactive session using "qsub -I -X" (see above)
  • load the python environment module:
 module load python/epd
  • start ipython:
 ipython -pylab
  • run the following commands inside ipython to test the setup:
 from enthought.mayavi import mlab
 mlab.test_contour3d()


CUDA

CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: GPGPU. We have installed the CUDA 3.0 driver and toolkit.

In order to use CUDA, you have to load the CUDA environment:

 module load cuda

Obtain GPU lock in python

If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card.

If you are using Python, you can obtain a GPU lock by running

 import gpu_lock
 gpu_lock.obtain_lock_id()

The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.

Obtain GPU lock for Jacket in Matlab

If you are using Matlab, you can obtain a GPU lock by running

 addpath('/clusterfs/cortex/software/gpu_lock');
 addpath('/clusterfs/cortex/software/jacket/engine');
 gpu_id = obtain_gpu_lock_id();
 gselect(gpu_id);

By default, obtain_gpu_lock_id() throws an error when all GPU cards are taken. Alternatively, obtain_gpu_lock_id(true) returns -1 when no card is available, and you can then write your own code to handle that case.

ginfo tells you which gpu card you are using.

The following lines should also be in your .bashrc

 ## jacket stuff!
 module load cuda
 export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH

CUDA SDK (Outdated since version change to 3.0)

You can install the CUDA SDK by running

 bash /clusterfs/cortex/software/cuda-2.3/src/cudasdk_2.3_linux.run

You can compile all the code examples by running

 module load X11
 module load Mesa/7.4.4
 cd ~/NVIDIA_GPU_Computing_SDK/C
 make

The compiled examples can be found in the directory

 ~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release

note: The examples using graphics with OpenGL don't seem to run on a remote X server. In order to make them work, we probably need to install something like virtualgl.


Usage Tips

Here are some tips on how to effectively use the cluster.

Embarrassingly Parallel Submissions

Here is an alternate script to do embarrassingly parallel submissions on the cluster.

iterate.sh

 #!/bin/sh
 param1=11    # not used below
 param2=1.2   # upper bound of the Epsilon sweep
 param3=.75   # upper bound of the Beta sweep
 # LeapSize
 for i in 14 15 16
 do
     # Epsilon
     for j in $(seq .8 .1 $param2)
     do
         # Beta
         for k in $(seq .65 .01 $param3)
         do
             echo $i,$j,$k
             qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"
         done
     done
 done

param_test.sh

 #!/bin/bash
 #PBS -q cortex
 #PBS -l nodes=1:ppn=2:gpu
 #PBS -l walltime=10:35:00
 #PBS -o /global/home/users/mayur/Logs
 #PBS -e /global/home/users/mayur/Errors
 cd /global/home/users/mayur/HMC_reducedflip/
 module load matlab
 echo "Epsilon = ",$Epsilon
 echo "Leap Size = ",$LeapSize
 echo "Beta = ",$Beta
 matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"

Now run ./iterate.sh
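Because a nested sweep like this can submit many jobs at once, it can help to preview the qsub commands first. The following is a sketch of a dry-run wrapper. The DRY_RUN variable and the submit function are hypothetical names introduced for this example, the parameter lists are written out explicitly, and Beta is fixed to a single value to keep it short.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints the qsub commands;
# set DRY_RUN=0 to actually submit.
DRY_RUN="${DRY_RUN:-1}"

submit() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "qsub $*"
    else
        qsub "$@"
    fi
}

count=0
for i in 14 15 16; do                 # LeapSize
    for j in 0.8 0.9 1.0 1.1 1.2; do  # Epsilon
        submit param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=0.65"
        count=$((count + 1))
    done
done
echo "$count jobs"
```

Checking the printed job count against what you expect is a cheap way to catch a mistyped range before it floods the queue.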

Mounting Cluster File System

Mounting the cluster file system remotely allows you to easily access files on the cluster, and allows you to use local programs to edit code or examine simulation outputs locally (very useful). I often edit the remote code using a text editor running on my local machine. This allows you to take advantage of the niceties of a native editor without having to copy code back and forth before you run a simulation on the cluster.

On Linux distributions you can mount your cluster home directory locally using sshfs [4]

 sshfs hadley.berkeley.edu: <mount-dir>
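For repeated use, the mount and unmount steps can be wrapped in two small shell functions. This is a sketch only: the function names and the ~/cluster mount point are assumptions, and fusermount -u is the usual way to unmount a FUSE file system on Linux.

```shell
# Assumed local mount point; override by setting CLUSTER_MOUNT beforehand.
CLUSTER_MOUNT="${CLUSTER_MOUNT:-$HOME/cluster}"

# Mount the cluster home directory of the given user.
# Usage: cluster_mount yourusername
cluster_mount() {
    mkdir -p "$CLUSTER_MOUNT" &&
    sshfs "$1@hadley.berkeley.edu:" "$CLUSTER_MOUNT"
}

# Unmount it again when you are done.
cluster_umount() {
    fusermount -u "$CLUSTER_MOUNT"
}
```

Placing these in your local .bashrc makes both commands available in every new terminal.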

On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [5]

Support Requests

  • If you have a problem that is not covered on this page, you can send an email to our user list:
 redwood_cluster@lists.berkeley.edu
  • If you need additional help from the LBL group, send an email to their email list. Please always cc our email list as well.
 hpcshelp@lbl.gov
  • In urgent cases, you can also email Krishna Muriki (LBL User Services) directly.