Cluster — Redwood Center wiki (rctn.org), revision of 2017-01-11 by Jesselivezey
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster: you have independent jobs that can run in parallel, so several machines will complete the task faster even though any one machine might not be faster than your own laptop; you have a long-running job which may take a day, and you don't want to have to leave your laptop on at all times and be unable to use it; your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem; or you want to do long GPU computations.<br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs to the queue (see '''SLURM''' further down on this page for the details). A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Cluster Administration ==<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-non-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we have a 17TB file server at<br />
/clusterfs/cortex/users<br />
which is mounted as scratch space.<br />
<br />
In brief, we have 14 nodes with over 60 cores and 4 GPUs.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long; a longer name will be truncated (e.g. '''desiredusername''' would become '''desiredu''').<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/authentication/linotp-usage] to set up the Google Authenticator application, which gives you a one-time password for logging into the cluster.<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/users/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 17 TB for this drive that is shared by everyone at the Redwood Center.<br />
<br />
== Connect ==<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add a -C flag to the command above to compress data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
==== Google Authenticator App (get a password) ====<br />
<br />
* Open the Google Authenticator app<br />
* Enter your personal PIN<br />
* Enter the one-time PIN<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
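For example, a minimal ~/.bash_profile following this convention might look like the sketch below (a common idiom, not a cluster-mandated file):<br />

```shell
# ~/.bash_profile -- read by login shells; delegate to ~/.bashrc so that
# login and interactive shells share one set of customizations.
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi
```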
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Install the following two pieces of software as a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh"; Editors -> "vim" is also recommended. Then you can use the instructions in ''ssh to a login node'' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM ===<br />
<br />
SLURM is our scheduler, and understanding it well is important for having a good time doing research on the cluster. SLURM administers the cluster's resources: it finds resources for your job, and it helps others do the same, so we are not stepping on each other's toes. There are some dos and don'ts when using SLURM.<br />
<br />
* Logging in -- when you log in to the cluster, you land on the login node. We do not own the login node and share it with other members of the Berkeley Research Consortium, so it is important not to run anything here *at all*.<br />
<br />
* Information on Submitting, Monitoring, Reviewing Jobs can be found here. You can do many simple BASH tricks to submit a large number of embarrassingly parallel jobs on the cluster. This is great for parameter sweeps. <br />
<br />
* Storage -- every user gets a 10 GB quota gratis from the BRC. This is your home folder or where you land when you login. In addition to this there's a 20TB scratch space (/clusterfs/cortex/scratch) shared by all members of the Redwood Center. We have a log of how much space is being used by each member who writes into the scratch folder at (TODO)<br />
<br />
* We have 4 GPU nodes and information on requesting and using them can be found here. When you request a GPU as a resource, you get the whole node along with it. <br />
<br />
* We have a debug queue that can be requested for research here<br />
<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. For a Matlab job, it would typically look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
--time defines the walltime of the job, an upper bound on the estimated runtime; the job will be killed after this time has elapsed. --mem-per-cpu specifies how much memory the job requires per CPU; the default is 1GB.<br />
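For GPU jobs (recall that requesting a GPU allocates the whole node), a job script might look like the following sketch. The --gres=gpu:1 flag is standard SLURM, but the GRES name, module name, and program below are assumptions to adapt:<br />

```shell
#!/bin/bash -l
#SBATCH -p cortex
#SBATCH --time=03:30:00
#SBATCH --gres=gpu:1          # request one GPU (GRES name is an assumption)
cd /clusterfs/cortex/scratch/working/dir/for/your/code
module load cuda              # module name is an assumption; see 'module avail'
./my_gpu_program              # placeholder for your executable
```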
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to capture output from the running jobs:<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be written to outputfile.txt and any errors, if the job crashes, to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows user names, the jobdescriptor passed to sbatch, runtime, and nodes.<br />
<br />
<br />
To start an interactive session on the cluster (this requires specifying the partition and walltime, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to the cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users on a particular node by sshing into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and you would like to send them a friendly reminder.<br />
<br />
= Job Management =<br />
<br />
In order to coordinate our cluster usage patterns fairly, our cluster uses a job manager known as SLURM. If you are planning to run jobs on the cluster, you should be using SLURM! Learn how [http://redwood.berkeley.edu/wiki/Cluster_Job_Management here].<br />
<br />
= Software =<br />
Information on what software is installed on the cluster and how to access it is [http://redwood.berkeley.edu/wiki/Cluster-Software here].<br />
<br />
== Matlab ==<br />
Matlab instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Matlab here].<br />
<br />
== Python ==<br />
Python instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Python here].<br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script for embarrassingly parallel submissions on the cluster. Note that it uses the legacy Torque/PBS interface (qsub and #PBS directives) rather than SLURM.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
  #Epsilon<br />
  for j in $(seq .8 .1 $param2);<br />
  do<br />
    #Beta<br />
    for k in $(seq .65 .01 $param3);<br />
    do<br />
      echo $i,$j,$k<br />
      qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
    done<br />
  done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
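Since iterate.sh above uses the legacy qsub interface, here is a hedged sketch of the same sweep using SLURM's sbatch --export (the flag is standard SLURM; param_test.sh would need #SBATCH directives in place of the #PBS ones, and explicit value lists stand in for the seq calls). It prints the submission commands as a dry run; pipe the output to sh to actually submit:<br />

```shell
#!/bin/sh
# Dry-run SLURM version of iterate.sh: emit one sbatch command per
# parameter combination instead of calling qsub directly.
gen_cmds() {
    for LeapSize in 14 15 16; do
        for Epsilon in 0.8 0.9 1.0 1.1 1.2; do
            for Beta in 0.65 0.70 0.75; do
                echo "sbatch --export=LeapSize=$LeapSize,Epsilon=$Epsilon,Beta=$Beta param_test.sh"
            done
        done
    done
}
gen_cmds    # 3 x 5 x 3 = 45 submission commands
```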
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their list (please always cc our email list as well), or visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs), and one very clever wombat who can optimize your neural network for you if you ask nicely. The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. Lastly, if you want to do long GPU computations. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''SLURM''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Cluster Administration ==<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-non-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we ave a 17TB file server at<br />
/clusterfs/cortex/users<br />
which is mounted as scratch space.<br />
<br />
In brief, we have 14 nodes with over 60 cores and 4 GPUs.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/authentication/linotp-usage] to set up the Google Authenticator application, which gives you a one-time password for logging into the cluster.<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/users/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 17 TB for this drive that is shared by everyone at the Redwood Center.<br />
<br />
== Connect ==<br />
<br />
==== Google Authenticator App (get a password) ====<br />
<br />
* Open the google Authenticator App<br />
* Enter your personal pin<br />
* Enter the one-time pin<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session you can add a -C flag to the command above to enable compression data to be sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM ===<br />
<br />
SLURM is our scheduler. It is very important you understand SLURM well to have a good time doing research on the cluster. SLURM is our administrator on the cluster, it helps you find resources for your job. It also helps others do the same, so we are not stepping on each others' toes. There are some do's and don'ts with using SLURM.<br />
<br />
* Logging in -- when you login to the cluster, you end up landing on the login node. We do not own the login node and share this with other members of the Berkeley Research Consortium. So, it is important not to run anything here *at all*<br />
<br />
* Information on Submitting, Monitoring, Reviewing Jobs can be found here. You can do many simple BASH tricks to submit a large number of embarrassingly parallel jobs on the cluster. This is great for parameter sweeps. <br />
<br />
* Storage -- every user gets a 10 GB quota gratis from the BRC. This is your home folder or where you land when you login. In addition to this there's a 20TB scratch space (/clusterfs/cortex/scratch) shared by all members of the Redwood Center. We have a log of how much space is being used by each member who writes into the scratch folder at (TODO)<br />
<br />
* We have 4 GPU nodes and information on requesting and using them can be found here. When you request a GPU as a resource, you get the whole node along with it. <br />
<br />
* We have a debug queue that can be requested for research here<br />
<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where the myscript.sh is an shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
the --time defines the walltime of the job, which is an upper bound on the estimated runtime. The job will be killed after this time is elapsed. --mem specifies how much memory the job requires, the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errofile.txt -J jobdescriptor myscript.sh<br />
<br />
the output of the job will be piped to outputfile.txt and any errors if the job crashes to errofile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names jobdescriptor passed to sbatch, runtime and nodes.<br />
<br />
<br />
To start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Job Management =<br />
<br />
In order to coordinate our cluster usage patterns fairly, our cluster uses a job manager known as SLURM. If your are planning to run jobs on the cluster you should be using SLURM! Learn how [http://redwood.berkeley.edu/wiki/Cluster_Job_Management here].<br />
<br />
= Software =<br />
Information on what software is installed on the cluster and how to access it is [http://redwood.berkeley.edu/wiki/Cluster-Software here].<br />
<br />
== Matlab ==<br />
Matlab instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Matlab here].<br />
<br />
== Python ==<br />
Python instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Python here].<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = ",$Epsilon<br />
echo "Leap Size = ",$LeapSize<br />
echo "Beta = ",$Beta<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list. Please always cc our email list as well. Or visit their website[https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8670Cluster2017-01-11T00:43:42Z<p>Jesselivezey: /* Getting an account and one-time password service */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs), and one very clever wombat who can optimize your neural network for you if you ask nicely. The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. Lastly, if you want to do long GPU computations. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''SLURM''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Cluster Administration ==<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-non-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we ave a 17TB file server at<br />
/clusterfs/cortex/users<br />
which is mounted as scratch space.<br />
<br />
In brief, we have 14 nodes with over 60 cores and 4 GPUs.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/authentication/linotp-usage] to set up the Google Authenticator application, which gives you a one-time password for logging into the cluster.<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/users/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 17 TB for this drive that is shared by everyone at the Redwood Center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session you can add a -C flag to the command above to enable compression data to be sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM ===<br />
<br />
SLURM is our scheduler. It is very important you understand SLURM well to have a good time doing research on the cluster. SLURM is our administrator on the cluster, it helps you find resources for your job. It also helps others do the same, so we are not stepping on each others' toes. There are some do's and don'ts with using SLURM.<br />
<br />
* Logging in -- when you login to the cluster, you end up landing on the login node. We do not own the login node and share this with other members of the Berkeley Research Consortium. So, it is important not to run anything here *at all*<br />
<br />
* Information on Submitting, Monitoring, Reviewing Jobs can be found here. You can do many simple BASH tricks to submit a large number of embarrassingly parallel jobs on the cluster. This is great for parameter sweeps. <br />
<br />
* Storage -- every user gets a 10 GB quota gratis from the BRC. This is your home folder or where you land when you login. In addition to this there's a 20TB scratch space (/clusterfs/cortex/scratch) shared by all members of the Redwood Center. We have a log of how much space is being used by each member who writes into the scratch folder at (TODO)<br />
<br />
* We have 4 GPU nodes and information on requesting and using them can be found here. When you request a GPU as a resource, you get the whole node along with it. <br />
<br />
* We have a debug queue that can be requested for research here<br />
<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where the myscript.sh is an shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, an upper bound on the estimated runtime; the job is killed once this time has elapsed. --mem-per-cpu specifies how much memory the job requires per CPU; the default is 1 GB.<br />
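An equivalent submission script for a Python job might look like the following (a sketch: the module name and script name are assumptions, and the commented-out --gres line shows how a GPU node could be requested):<br />

```shell
#!/bin/bash -l
#SBATCH -p cortex
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=2G
##SBATCH --gres=gpu:1   # uncomment to request a GPU (you get the whole node)
cd /clusterfs/cortex/scratch/working/dir/for/your/code
# The module name below is an assumption; check `module avail` for what is installed.
module load python
python myscript.py
```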
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to capture output from running jobs:<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The standard output of the job is written to outputfile.txt, and any errors (e.g. if the job crashes) to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows the user name, the job descriptor passed to sbatch, the runtime, and the nodes for each job.<br />
<br />
<br />
To start an interactive session on the cluster (the partition and walltime must be specified, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
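A couple of small helper functions can make queue-watching easier; a sketch you could add to your .bashrc (the function names are made up; squeue itself comes with SLURM):<br />

```shell
# Queue-watching helpers; add these lines to ~/.bashrc.
myjobs()   { squeue -u "$USER"; }         # your own pending and running jobs
partjobs() { squeue -p "${1:-cortex}"; }  # jobs on a partition (default: cortex)
```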
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* To restrict the scope of these commands to the cortex cluster, add the following line to your .bashrc:<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* You can find out which users are on a particular node by sshing into it, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send them a friendly reminder.<br />
<br />
= Job Management =<br />
<br />
In order to coordinate our cluster usage patterns fairly, our cluster uses a job manager known as SLURM. If you are planning to run jobs on the cluster, you should be using SLURM! Learn how [http://redwood.berkeley.edu/wiki/Cluster_Job_Management here].<br />
<br />
= Software =<br />
Information on what software is installed on the cluster and how to access it is [http://redwood.berkeley.edu/wiki/Cluster-Software here].<br />
<br />
== Matlab ==<br />
Matlab instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Matlab here].<br />
<br />
== Python ==<br />
Python instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Python here].<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate way to do embarrassingly parallel submissions on the cluster. Note that the scripts below use the older PBS syntax (qsub and #PBS directives) rather than SLURM's sbatch.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
# Upper bounds for the Epsilon and Beta sweeps<br />
eps_max=1.2<br />
beta_max=.75<br />
# LeapSize<br />
for i in 14 15 16; do<br />
  # Epsilon<br />
  for j in $(seq .8 .1 $eps_max); do<br />
    # Beta<br />
    for k in $(seq .65 .01 $beta_max); do<br />
      echo $i,$j,$k<br />
      qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
    done<br />
  done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
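Since the cluster now uses SLURM, the same sweep can also be written with sbatch instead of qsub. A sketch, under the assumption that param_test.sh has been converted to use #SBATCH directives (--export passes the three variables into the job's environment):<br />

```shell
#!/bin/bash
# SLURM version of the sweep: submit one job per (LeapSize, Epsilon, Beta) combination.
for LeapSize in 14 15 16; do
    for Epsilon in $(seq .8 .1 1.2); do
        for Beta in $(seq .65 .01 .75); do
            echo $LeapSize,$Epsilon,$Beta
            # Guarded so the loop can be dry-run on a machine without SLURM installed:
            command -v sbatch >/dev/null && \
                sbatch -p cortex --export=ALL,LeapSize=$LeapSize,Epsilon=$Epsilon,Beta=$Beta param_test.sh
        done
    done
done
```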
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their list below; please always cc our email list as well. You can also visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs), and one very clever wombat who can optimize your neural network for you if you ask nicely. The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. Lastly, if you want to do long GPU computations. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''SLURM''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Cluster Administration ==<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-non-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we ave a 17TB file server at<br />
/clusterfs/cortex/users<br />
which is mounted as scratch space.<br />
<br />
In brief, we have 14 nodes with over 60 cores and 4 GPUs.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/users/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 17 TB for this drive that is shared by everyone at the Redwood Center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session you can add a -C flag to the command above to enable compression data to be sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM ===<br />
<br />
SLURM is our scheduler. It is very important you understand SLURM well to have a good time doing research on the cluster. SLURM is our administrator on the cluster, it helps you find resources for your job. It also helps others do the same, so we are not stepping on each others' toes. There are some do's and don'ts with using SLURM.<br />
<br />
* Logging in -- when you login to the cluster, you end up landing on the login node. We do not own the login node and share this with other members of the Berkeley Research Consortium. So, it is important not to run anything here *at all*<br />
<br />
* Information on Submitting, Monitoring, Reviewing Jobs can be found here. You can do many simple BASH tricks to submit a large number of embarrassingly parallel jobs on the cluster. This is great for parameter sweeps. <br />
<br />
* Storage -- every user gets a 10 GB quota gratis from the BRC. This is your home folder or where you land when you login. In addition to this there's a 20TB scratch space (/clusterfs/cortex/scratch) shared by all members of the Redwood Center. We have a log of how much space is being used by each member who writes into the scratch folder at (TODO)<br />
<br />
* We have 4 GPU nodes and information on requesting and using them can be found here. When you request a GPU as a resource, you get the whole node along with it. <br />
<br />
* We have a debug queue that can be requested for research here<br />
<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where the myscript.sh is an shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
the --time defines the walltime of the job, which is an upper bound on the estimated runtime. The job will be killed after this time is elapsed. --mem specifies how much memory the job requires, the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errofile.txt -J jobdescriptor myscript.sh<br />
<br />
the output of the job will be piped to outputfile.txt and any errors if the job crashes to errofile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names jobdescriptor passed to sbatch, runtime and nodes.<br />
<br />
<br />
To start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Job Management =<br />
<br />
In order to coordinate our cluster usage patterns fairly, our cluster uses a job manager known as SLURM. If your are planning to run jobs on the cluster you should be using SLURM! Learn how [http://redwood.berkeley.edu/wiki/Cluster_Job_Management here].<br />
<br />
= Software =<br />
Information on what software is installed on the cluster and how to access it is [http://redwood.berkeley.edu/wiki/Cluster-Software here].<br />
<br />
== Matlab ==<br />
Matlab instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Matlab here].<br />
<br />
== Python ==<br />
Python instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Python here].<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = ",$Epsilon<br />
echo "Leap Size = ",$LeapSize<br />
echo "Beta = ",$Beta<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list. Please always cc our email list as well. Or visit their website[https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8634Cluster2016-10-12T18:10:13Z<p>Jesselivezey: /* Data */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs), and one very clever wombat who can optimize your neural network for you if you ask nicely. The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. Lastly, if you want to do long GPU computations. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''SLURM''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Cluster Administration ==<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-non-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we ave a 17TB file server at<br />
/clusterfs/cortex/users<br />
which is mounted as scratch space.<br />
<br />
In brief, we have 14 nodes with over 60 cores and 4 GPUs.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 17 TB for this drive that is shared by everyone at the Redwood Center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session you can add a -C flag to the command above to enable compression data to be sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM ===<br />
<br />
SLURM is our scheduler. It is very important you understand SLURM well to have a good time doing research on the cluster. SLURM is our administrator on the cluster, it helps you find resources for your job. It also helps others do the same, so we are not stepping on each others' toes. There are some do's and don'ts with using SLURM.<br />
<br />
* Logging in -- when you login to the cluster, you end up landing on the login node. We do not own the login node and share this with other members of the Berkeley Research Consortium. So, it is important not to run anything here *at all*<br />
<br />
* Information on Submitting, Monitoring, Reviewing Jobs can be found here. You can do many simple BASH tricks to submit a large number of embarrassingly parallel jobs on the cluster. This is great for parameter sweeps. <br />
<br />
* Storage -- every user gets a 10 GB quota gratis from the BRC. This is your home folder or where you land when you login. In addition to this there's a 20TB scratch space (/clusterfs/cortex/scratch) shared by all members of the Redwood Center. We have a log of how much space is being used by each member who writes into the scratch folder at (TODO)<br />
<br />
* We have 4 GPU nodes and information on requesting and using them can be found here. When you request a GPU as a resource, you get the whole node along with it. <br />
<br />
* We have a debug queue that can be requested for research here<br />
<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where the myscript.sh is an shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
the --time defines the walltime of the job, which is an upper bound on the estimated runtime. The job will be killed after this time is elapsed. --mem specifies how much memory the job requires, the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errofile.txt -J jobdescriptor myscript.sh<br />
<br />
the output of the job will be piped to outputfile.txt and any errors if the job crashes to errofile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names jobdescriptor passed to sbatch, runtime and nodes.<br />
<br />
<br />
To start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Job Management =<br />
<br />
In order to coordinate our cluster usage patterns fairly, our cluster uses a job manager known as SLURM. If your are planning to run jobs on the cluster you should be using SLURM! Learn how [http://redwood.berkeley.edu/wiki/Cluster_Job_Management here].<br />
<br />
= Software =<br />
Information on what software is installed on the cluster and how to access it is [http://redwood.berkeley.edu/wiki/Cluster-Software here].<br />
<br />
== Matlab ==<br />
Matlab instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Matlab here].<br />
<br />
== Python ==<br />
Python instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Python here].<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = ",$Epsilon<br />
echo "Leap Size = ",$LeapSize<br />
echo "Beta = ",$Beta<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list. Please always cc our email list as well. Or visit their website[https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8633Cluster2016-10-12T18:09:39Z<p>Jesselivezey: /* Hardware Overview */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs), and one very clever wombat who can optimize your neural network for you if you ask nicely. The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. Lastly, if you want to do long GPU computations. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''SLURM''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Cluster Administration ==<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-non-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we ave a 17TB file server at<br />
/clusterfs/cortex/users<br />
which is mounted as scratch space.<br />
<br />
In brief, we have 14 nodes with over 60 cores and 4 GPUs.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood Center.<br />
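For example, a first-time setup might look like the following sketch ($USER is assumed to match your cluster account name):<br />

```shell
# One-time setup of a personal scratch directory (sketch; only works on
# the cluster itself, where /clusterfs/cortex/scratch is mounted).
mkdir -p /clusterfs/cortex/scratch/$USER

# Check how much of the shared scratch space you are currently using.
du -sh /clusterfs/cortex/scratch/$USER
```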
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session, you can add the -C flag to the command above to compress data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
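A minimal ~/.bash_profile following this convention is:<br />

```shell
# ~/.bash_profile -- read by login shells; defer everything to ~/.bashrc
# so that login and non-login shells behave the same.
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi
```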
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and does not natively interface with a Unix environment. Download the following two pieces of software as a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can follow the instructions detailed in '''ssh to a login node''' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM ===<br />
<br />
SLURM is our scheduler, and understanding it well will make your time on the cluster much smoother. SLURM allocates resources for your job, and does the same for everyone else, so we are not stepping on each other's toes. There are some do's and don'ts when using SLURM.<br />
<br />
* Logging in -- when you log in to the cluster, you land on a login node. We do not own the login nodes and share them with other members of the Berkeley Research Consortium, so it is important not to run anything here *at all*.<br />
<br />
* Information on submitting, monitoring, and reviewing jobs can be found here. Simple BASH tricks let you submit a large number of embarrassingly parallel jobs, which is great for parameter sweeps. <br />
<br />
* Storage -- every user gets a 10 GB quota gratis from the BRC. This is your home folder or where you land when you login. In addition to this there's a 20TB scratch space (/clusterfs/cortex/scratch) shared by all members of the Redwood Center. We have a log of how much space is being used by each member who writes into the scratch folder at (TODO)<br />
<br />
* We have 4 GPU nodes and information on requesting and using them can be found here. When you request a GPU as a resource, you get the whole node along with it. <br />
<br />
* We have a debug queue that can be requested for research here<br />
<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. For a matlab job, it would typically look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time flag defines the walltime of the job, an upper bound on the estimated runtime; the job will be killed once this time has elapsed. --mem-per-cpu specifies how much memory the job requires per CPU; the default is 1GB. <br />
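For a GPU job, a variant of the script above might look like the following sketch. The --gres=gpu:1 line is the generic SLURM way to request a GPU, and the module and script names are placeholders -- check with the admins for the exact form used on our nodes:<br />

```shell
#!/bin/bash -l
#SBATCH -p cortex
#SBATCH --time=08:00:00
#SBATCH --mem-per-cpu=4G
#SBATCH --gres=gpu:1                        # generic SLURM GPU request; verify locally
cd /clusterfs/cortex/scratch/$USER/mycode   # hypothetical working directory
module load cuda                            # module name is a placeholder
python train.py                             # hypothetical GPU script
```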
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be written to outputfile.txt and, if the job crashes, any errors to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows user names, the job descriptor passed to sbatch, runtime, and nodes.<br />
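A few common variations (these are standard SLURM commands; the job id shown is hypothetical):<br />

```shell
squeue -u "$USER"     # show only your own pending and running jobs
squeue -p cortex      # show jobs in the cortex partition
scancel 12345         # cancel job 12345 (hypothetical job id)
```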
<br />
<br />
To start an interactive session on the cluster (you must specify the partition and walltime, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find the list of users on a particular node by sshing into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and you would like to send them a friendly reminder.<br />
<br />
= Job Management =<br />
<br />
In order to coordinate our cluster usage fairly, our cluster uses a job manager known as SLURM. If you are planning to run jobs on the cluster, you should be using SLURM! Learn how [http://redwood.berkeley.edu/wiki/Cluster_Job_Management here].<br />
<br />
= Software =<br />
Information on what software is installed on the cluster and how to access it is [http://redwood.berkeley.edu/wiki/Cluster-Software here].<br />
<br />
== Matlab ==<br />
Matlab instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Matlab here].<br />
<br />
== Python ==<br />
Python instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Python here].<br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster (note: these examples use the older PBS/Torque qsub syntax, not SLURM).<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
# Upper bounds for the Epsilon and Beta sweeps<br />
param2=1.2<br />
param3=.75<br />
# LeapSize<br />
for i in 14 15 16<br />
do<br />
    # Epsilon<br />
    for j in $(seq .8 .1 $param2)<br />
    do<br />
        # Beta<br />
        for k in $(seq .65 .01 $param3)<br />
        do<br />
            echo $i,$j,$k<br />
            qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
        done<br />
    done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
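On the current SLURM setup, the same sweep can also be expressed as a single job array instead of a submission loop. The sketch below shows only the index-to-parameter mapping; the grid is 3 x 5 x 11 = 165 points, so it would be submitted with sbatch --array=0-164 (the partition is taken from the examples above; the mapping itself is an illustration, not a required form):<br />

```shell
#!/bin/bash
#SBATCH -p cortex
#SBATCH --time=10:35:00
# Map one SLURM array index onto the (LeapSize, Epsilon, Beta) grid
# used by iterate.sh above.
leaps=(14 15 16)
epsilons=($(seq 0.8 0.1 1.2))     # 5 values
betas=($(seq 0.65 0.01 0.75))     # 11 values

id=${SLURM_ARRAY_TASK_ID:-0}      # SLURM sets this for each array task
ne=${#epsilons[@]}
nb=${#betas[@]}

LeapSize=${leaps[$(( id / (ne * nb) ))]}
Epsilon=${epsilons[$(( (id / nb) % ne ))]}
Beta=${betas[$(( id % nb ))]}

echo "$LeapSize,$Epsilon,$Beta"   # task 0 prints 14,0.8,0.65
```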
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their list below. Please always cc our email list as well. You can also visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8632Cluster2016-10-12T18:01:43Z<p>Jesselivezey: /* Job Management */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs), and one very clever wombat who can optimize your neural network for you if you ask nicely. The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. Lastly, if you want to do long GPU computations. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''SLURM''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Cluster Administration ==<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-non-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
In brief, we have 14 nodes with over 60 cores and 4 GPUs.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood Center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session you can add a -C flag to the command above to enable compression data to be sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM ===<br />
<br />
SLURM is our scheduler. It is very important you understand SLURM well to have a good time doing research on the cluster. SLURM is our administrator on the cluster, it helps you find resources for your job. It also helps others do the same, so we are not stepping on each others' toes. There are some do's and don'ts with using SLURM.<br />
<br />
* Logging in -- when you login to the cluster, you end up landing on the login node. We do not own the login node and share this with other members of the Berkeley Research Consortium. So, it is important not to run anything here *at all*<br />
<br />
* Information on Submitting, Monitoring, Reviewing Jobs can be found here. You can do many simple BASH tricks to submit a large number of embarrassingly parallel jobs on the cluster. This is great for parameter sweeps. <br />
<br />
* Storage -- every user gets a 10 GB quota gratis from the BRC. This is your home folder or where you land when you login. In addition to this there's a 20TB scratch space (/clusterfs/cortex/scratch) shared by all members of the Redwood Center. We have a log of how much space is being used by each member who writes into the scratch folder at (TODO)<br />
<br />
* We have 4 GPU nodes and information on requesting and using them can be found here. When you request a GPU as a resource, you get the whole node along with it. <br />
<br />
* We have a debug queue that can be requested for research here<br />
<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where the myscript.sh is an shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
the --time defines the walltime of the job, which is an upper bound on the estimated runtime. The job will be killed after this time is elapsed. --mem specifies how much memory the job requires, the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errofile.txt -J jobdescriptor myscript.sh<br />
<br />
the output of the job will be piped to outputfile.txt and any errors if the job crashes to errofile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names jobdescriptor passed to sbatch, runtime and nodes.<br />
<br />
<br />
To start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Job Management =<br />
<br />
In order to coordinate our cluster usage patterns fairly, our cluster uses a job manager known as SLURM. If your are planning to run jobs on the cluster you should be using SLURM! Learn how [http://redwood.berkeley.edu/wiki/Cluster_Job_Management here].<br />
<br />
= Software =<br />
Information on what software is installed on the cluster and how to access it is [http://redwood.berkeley.edu/wiki/Cluster-Software here].<br />
<br />
== Matlab ==<br />
Matlab instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Matlab here].<br />
<br />
== Python ==<br />
Python instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Python here].<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = ",$Epsilon<br />
echo "Leap Size = ",$LeapSize<br />
echo "Beta = ",$Beta<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list. Please always cc our email list as well. Or visit their website[https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8631Cluster2016-10-12T18:01:23Z<p>Jesselivezey: /* Data */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs), and one very clever wombat who can optimize your neural network for you if you ask nicely. The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. Lastly, if you want to do long GPU computations. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''SLURM''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Cluster Administration ==<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-non-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
In brief, we have 14 nodes with over 60 cores and 4 GPUs.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood Center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session you can add a -C flag to the command above to enable compression data to be sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM ===<br />
<br />
SLURM is our scheduler. It is very important you understand SLURM well to have a good time doing research on the cluster. SLURM is our administrator on the cluster, it helps you find resources for your job. It also helps others do the same, so we are not stepping on each others' toes. There are some do's and don'ts with using SLURM.<br />
<br />
* Logging in -- when you login to the cluster, you end up landing on the login node. We do not own the login node and share this with other members of the Berkeley Research Consortium. So, it is important not to run anything here *at all*<br />
<br />
* Information on Submitting, Monitoring, Reviewing Jobs can be found here. You can do many simple BASH tricks to submit a large number of embarrassingly parallel jobs on the cluster. This is great for parameter sweeps. <br />
<br />
* Storage -- every user gets a 10 GB quota gratis from the BRC. This is your home folder or where you land when you login. In addition to this there's a 20TB scratch space (/clusterfs/cortex/scratch) shared by all members of the Redwood Center. We have a log of how much space is being used by each member who writes into the scratch folder at (TODO)<br />
<br />
* We have 4 GPU nodes and information on requesting and using them can be found here. When you request a GPU as a resource, you get the whole node along with it. <br />
<br />
* We have a debug queue that can be requested for research here<br />
<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where the myscript.sh is an shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
the --time defines the walltime of the job, which is an upper bound on the estimated runtime. The job will be killed after this time is elapsed. --mem specifies how much memory the job requires, the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errofile.txt -J jobdescriptor myscript.sh<br />
<br />
the output of the job will be piped to outputfile.txt and any errors if the job crashes to errofile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names jobdescriptor passed to sbatch, runtime and nodes.<br />
<br />
<br />
To start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Job Management =<br />
<br />
In order to coordinate our cluster usage patterns fairly, our cluster uses a job manager known as SLURM. If your are planning to run jobs on the cluster you should be using SLURM! Learn how [http://redwood.berkeley.edu/wiki/Cluster_Job_Management here].<br />
<br />
<br />
= Software =<br />
Information on what software is installed on the cluster and how to access it is [http://redwood.berkeley.edu/wiki/Cluster-Software here].<br />
<br />
== Matlab ==<br />
Matlab instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Matlab here].<br />
<br />
== Python ==<br />
Python instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Python here].<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Sweep bounds (param1 is unused; the leap sizes are listed explicitly in the loop below)<br />
param1=11<br />
#Epsilon upper bound<br />
param2=1.2<br />
#Beta upper bound<br />
param3=.75<br />
#Leap Size<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
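Since this cluster now uses SLURM, the same sweep can be expressed with sbatch instead of qsub. The sketch below assumes the job script has been ported to SLURM and saved as param_test.sbatch (a hypothetical name); by default it only prints the submission commands so the sweep can be checked before anything is queued:<br />
<br />
```shell
#!/bin/sh
# Hypothetical SLURM version of iterate.sh: prints (or submits) one
# sbatch call per (LeapSize, Epsilon, Beta) combination.
LC_ALL=C          # ensure '.' is the decimal separator for seq
export LC_ALL

generate_sweep () {
    for i in 14 15 16; do                       # LeapSize
        for j in $(seq 0.8 0.1 1.2); do         # Epsilon
            for k in $(seq 0.65 0.01 0.75); do  # Beta
                echo "sbatch --export=ALL,LeapSize=$i,Epsilon=$j,Beta=$k param_test.sbatch"
            done
        done
    done
}

if [ "${DRYRUN:-1}" = 1 ]; then
    generate_sweep           # dry run: just print the submission commands
else
    generate_sweep | sh      # actually submit the jobs
fi
```
<br />
Run with DRYRUN=0 to submit; the exported variables are then available inside param_test.sbatch just as with qsub -v.<br />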
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their list (please always cc our email list as well), or visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster-Software&diff=8630Cluster-Software2016-10-12T18:00:00Z<p>Jesselivezey: /* Using Theano */</p>
<hr />
<div>= Software =<br />
<br />
== Matlab ==<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs the matlab script ''scriptname'', passing it the two variables $variable1 and $variable2 (via matlab's command syntax, so they arrive as strings).<br />
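To submit this script with concrete values, the two variables can be passed through the job's environment (a sketch; matlab_job.sh is a hypothetical filename for the script above, and 3 and 0.5 are placeholder values):<br />
<br />
```shell
# --export makes the named variables visible inside the batch script
sbatch --export=ALL,variable1=3,variable2=0.5 matlab_job.sh
```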
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 distributions can be loaded through<br />
module load python/anaconda2<br />
or<br />
module load python/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name username@hpc.brc.berkeley.edu:/global/home/users/username/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own Python distribution, Anaconda is a good choice. To get it, go to the [http://continuum.io/downloads Continuum downloads] page and select the Linux distribution (the penguin icon).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda-version_info.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing.<br />
The<br />
<br />
#SBATCH --constraint=cortex_k40<br />
#SBATCH --gres=gpu:1<br />
<br />
options must be used in order to schedule a node with a GPU (use the constraint cortex_fermi instead of cortex_k40 to request a Fermi node).<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
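Putting the pieces together, a minimal GPU batch script might look like the following sketch (the constraint, time limit, and program name are placeholders to adjust for your job):<br />
<br />
```shell
#!/bin/bash -l
#SBATCH -p cortex
#SBATCH --constraint=cortex_k40
#SBATCH --gres=gpu:1
#SBATCH --time=02:00:00
module load cuda
nvidia-smi          # confirm the GPU is visible from the job
./my_gpu_program    # hypothetical executable
```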
<br />
=== Using Theano ===<br />
Theano expects gcc to be the default compiler, so you'll need to unload the Intel compiler module.<br />
<br />
module unload intel<br />
<br />
Theano caches certain compiled functions and these will sometimes cause errors when Theano or CUDA gets updated. If you are experiencing problems with Theano, you can try clearing the cache with<br />
theano-cache clear<br />
and if you still have problems you can delete the .theano folder from your home directory.<br />
<br />
==== Using the GPU ====<br />
<br />
You must request a GPU node. The Anaconda Python distribution comes with a version of Theano that should work. If you need newer Theano features, the development version can be obtained from the [https://github.com/Theano/Theano github repository], installed locally, and added to your PYTHONPATH if you are using the preinstalled Python versions. If you have a local Python install, you can install Theano with<br />
python setup.py develop<br />
from the repository folder.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a configuration that worked as of June 2015 is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
==== Using the CPU ====<br />
<br />
Theano can also run on the CPU. Any of the CPU nodes will work. You will want to have Theano use the MKL BLAS library that comes with Anaconda and so your .theanorc might look like<br />
<br />
[global]<br />
device = cpu<br />
floatX = float32<br />
ldflags = -lmkl_rt</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Seminars&diff=8613Seminars2016-09-27T21:32:15Z<p>Jesselivezey: /* Tentative / Confirmed Speakers */</p>
<hr />
<div>== Instructions ==<br />
<br />
# Check the internal calendar (here) for a free seminar slot. Seminars are usually Wednesdays at noon, but it is flexible in case there is a day that works better for the speaker. However, it is usually best to avoid booking multiple speakers in the same week - it leads to "seminar burnout" and reduced attendance. But use your own judgement here - if it's a good opportunity and that's the only time that works, then go ahead with it.<br />
# Once you have proposed a date to a speaker, fill in the speaker information under the appropriate date (or change if necessary). Use the status field to indicate whether the date is tentative or confirmed. Please also include your name as ''host'' in case somebody wants to contact you.<br />
# Once the invitation is confirmed with the speaker, change the status field to 'confirmed'. Also notify the webmaster (Bruno) [mailto:baolshausen@berkeley.edu] that we have a confirmed speaker so that he/she can update the public web page. Please include a title and abstract.<br />
# Natalie (HWNI) checks our web page regularly and will send out an announcement a week before and also include with the weekly neuro announcements, but if you don't get it confirmed until the last minute then make sure to email Natalie [mailto:nrterranova@berkeley.edu] as well to give her a heads up so she knows to send out an announcement in time.<br />
# If the speaker needs accommodations you should contact Natalie [mailto:nrterranova@berkeley.edu] to reserve a room at the faculty club. Tell her it's for a Redwood speaker so she knows how to bill it.<br />
# During the visit you will need to look after the visitor, schedule visits with other labs, make plans for lunch, dinner, etc., and introduce the speaker at the seminar (don't ask Bruno to do this at the last moment). Save receipts for any meals you paid for.<br />
# After the seminar and before the speaker leaves, make sure to give them Natalie's contact info and have them email her their receipts, explaining it's for reimbursement for a Redwood seminar. Natalie will then process the reimbursement. She can also help you with getting reimbursed for any expenses you incurred for meals and entertainment.<br />
<br />
== Tentative / Confirmed Speakers ==<br />
<br />
'''Sept. 27, 2016'''<br />
* Speaker: Yoshua Bengio<br />
* Time: 11:00<br />
* Affiliation: Univ Montreal<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''Oct/Nov. TBD, 2016'''<br />
* Speaker: Alexander Stubbs<br />
* Time: 12:00<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno/Michael Levy<br />
* Status: tentative<br />
* Title: Could chromatic aberration allow for an alternative evolutionary pathway towards color vision?<br />
* Abstract: We present a mechanism by which organisms with only a single photoreceptor, which have a monochromatic view of the world, can achieve color discrimination. An off-axis pupil and the principle of chromatic aberration (where different wavelengths come to focus at different distances behind a lens) can combine to provide “color-blind” animals with a way to distinguish colors. As a specific example, we constructed a computer model of the visual system of cephalopods (octopus, squid, and cuttlefish) that have a single unfiltered photoreceptor type. We compute a quantitative image quality budget for this visual system and show how chromatic blurring dominates the visual acuity in these animals in shallow water. This proposed mechanism is consistent with the extensive suite of visual/behavioral and physiological data that has been obtained from cephalopod studies and offers a possible solution to the apparent paradox of vivid chromatic behaviors in color blind animals. Moreover, this proposed mechanism has potential applicability in organisms with limited photoreceptor complements, such as spiders and dolphins.<br />
<br />
'''Oct. 05, 2016'''<br />
* Speaker: Paul Rhodes<br />
* Time: 12:00<br />
* Affiliation: Specific Technologies<br />
* Host: Dylan/Bruno<br />
* Status: unconfirmed<br />
* Title: A novel and important problem in spatiotemporal pattern classification<br />
* Abstract: Specific Technologies uses a sensor response that consists of a vector time series, a spatiotemporal fingerprint, to classify bacteria at the strain level during their growth. The identification of resistant strains of bacteria has become one of the world's great problems (here is a link to a $20M prize that the US govt has issued: https://www.nih.gov/news-events/news-releases/federal-prize-competition-seeks-innovative-ideas-combat-antimicrobial-resistance). We are using deep convolutional nets to do this classification, but they are instantaneous, and so do not capture the temporal patterns that are often at the core of what differentiates strains. So using the full temporal character of the sensor response time series is a cutting edge neural ML problem, and important to society too.<br />
<br />
'''October 19, 2016'''<br />
* Speaker: Nihat Ay<br />
* Time: 12:00<br />
* Affiliation: Max Planck Leipzig<br />
* Host: Bruno/Max<br />
* Status: tentative<br />
* Title: <br />
* Abstract:<br />
<br />
'''October 26, 2016'''<br />
* Speaker: Eric Jonas<br />
* Time: 12:00<br />
* Affiliation: UC Berkeley<br />
* Host: Charles Frye<br />
* Status: confirmed<br />
* Title: Could a neuroscientist understand a microprocessor?<br />
* Abstract:<br />
<br />
== Previous Seminars ==<br />
<br />
=== 2016/17 academic year ===<br />
<br />
'''Sept. 7, 2016'''<br />
* Speaker: Dan Stowell<br />
* Time: 12:00<br />
* Affiliation: Queen Mary, University of London<br />
* Host: Frederic Theunissen<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''Sept. 8, 2016'''<br />
* Speaker: Barb Finlay<br />
* Time: 12:00<br />
* Affiliation: Cornell Univ<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
=== 2015/16 academic year ===<br />
<br />
'''July 21, 2015'''<br />
* Speaker: Felix Effenberger<br />
* Affiliation: <br />
* Host: Chris H.<br />
* Status: confirmed<br />
* Title: <br />
* Abstract<br />
<br />
'''July 22, 2015'''<br />
* Speaker: Lav Varshney<br />
* Affiliation: Urbana-Champaign<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract<br />
<br />
'''July 23, 2015'''<br />
* Speaker: Xuemin Wei<br />
* Affiliation: Univ Penn<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract<br />
<br />
'''July 29, 2015'''<br />
* Speaker: Gonzalo Otazu<br />
* Affiliation: Cold Spring Harbor Laboratory, Long Island, NY<br />
* Host: Mike D<br />
* Status: Confirmed<br />
* Title: The Role of Cortical Feedback in Olfactory Processing<br />
* Abstract: The olfactory bulb receives rich glutamatergic projections from the piriform cortex. However, the dynamics and importance of these feedback signals remain unknown. In the first part of this talk, I will present data from multiphoton calcium imaging of cortical feedback in the olfactory bulb of awake mice. Responses of feedback boutons were sparse, odor specific, and often outlasted stimuli by several seconds. Odor presentation either enhanced or suppressed the activity of boutons. However, any given bouton responded with stereotypic polarity across multiple odors, preferring either enhancement or suppression. Inactivation of piriform cortex increased odor responsiveness and pairwise similarity of mitral cells but had little impact on tufted cells. We propose that cortical feedback differentially impacts these two output channels of the bulb by specifically decorrelating mitral cell responses to enable odor separation. In the second part of the talk I will introduce a computational model of odor identification in natural scenes that uses cortical feedback and how the model predictions match our experimental data.<br />
<br />
'''Aug 19, 2015'''<br />
* Speaker: Wujie Zhang<br />
* Affiliation: Columbia<br />
* Host: Bruno/Michael Yartsev<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''Sept 2, 2015'''<br />
* Speaker: Jeremy Maitin-Shepard<br />
* Affiliation: Computer Science, UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Combinatorial Energy Learning for Image Segmentation<br />
* Abstract: Recent advances in volume electron microscopy make it possible to image neuronal tissue volumes containing hundreds of thousands of neurons at sufficient resolution to discern even the finest neuronal processes. Accurate 3-D segmentation of these processes, densely packed in petavoxel-scale volumes, is the key bottleneck in reconstructing large-scale neural circuits.<br />
<br />
'''Sept 8, 2015'''<br />
* Speaker: Jennifer Hasler<br />
* Affiliation: Georgia Tech<br />
* Host: Bruno/Mika<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''October 29, 2015'''<br />
* Speaker: Garrett Kenyon<br />
* Affiliation: Los Alamos National Laboratory<br />
* Host: Dylan<br />
* Status: confirmed<br />
* Title: A Deconvolutional Competitive Algorithm (DCA)<br />
* Abstract: The Locally Competitive Algorithm (LCA) is a neurally-plausible sparse solver based on lateral inhibition between leaky integrator neurons. LCA accounts for many linear and nonlinear response properties of V1 simple cells, including end-stopping and contrast-invariant orientation tuning. Here, we describe a convolutional implementation of LCA in which a column of feature vectors is replicated with a stride that is much smaller than the diameter of the corresponding kernels, allowing the construction of dictionaries that are many times more overcomplete than without replication. Using a local Hebbian rule that minimizes sparse reconstruction error, we are able to learn representations from unlabeled imagery, including monocular and stereo video streams, that in some cases support near state-of-the-art performance on object detection, action classification and depth estimation tasks, with a simple linear classifier. We further describe a scalable approach to building a hierarchy of convolutional LCA layers, which we call a Deconvolutional Competitive Algorithm (DCA). All layers in a DCA are trained simultaneously and all layers contribute to a single image reconstruction, with each layer deconvolving its representation through all lower layers back to the image plane. We show that a 3-layer DCA trained on short video clips obtained from hand-held cameras exhibits a clear segregation of image content, with features in the top layer reconstructing large-scale structures while features in the middle and bottom layers reconstruct progressively finer details. Lastly, we describe PetaVision, an open source, cloud-friendly, high-performance neural simulation toolbox that was used to perform the numerical studies presented here.<br />
<br />
'''Nov 18, 2015'''<br />
* Speaker: Hillel Adesnik<br />
* Affiliation: Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
'''Nov 17, 2015'''<br />
* Speaker: Manuel Lopez<br />
* Affiliation: <br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: <br />
* Abstract<br />
<br />
'''Dec 2, 2015'''<br />
* Speaker: Steven Brumby<br />
* Affiliation: [http://www.descarteslabs.com/ Descartes Labs]<br />
* Host: Dylan<br />
* Status: confirmed<br />
* Title: Seeing the Earth in the Cloud<br />
* Abstract: The proliferation of transistors has increased the performance of computing systems by over a factor of a million in the past 30 years, and is also dramatically increasing the amount of data in existence, driving improvements in sensor, communication and storage technology. Multi-decadal Earth and planetary remote sensing global datasets at the petabyte scale (8×10^15 bits) are now available in commercial clouds, and new satellite constellations are planning to generate petabytes of images per year, providing daily global coverage at a few meters per pixel. Cloud storage with adjacent high-bandwidth compute, combined with recent advances in neuroscience-inspired machine learning for computer vision, is enabling understanding of the world at a scale and at a level of granularity never before feasible. We report here on a computation processing over a petabyte of compressed raw data from 2.8 quadrillion pixels (2.8 petapixels) acquired by the US Landsat and MODIS programs over the past 40 years. Using commodity cloud computing resources, we convert the imagery to a calibrated, georeferenced, multiresolution tiled format suited for machine-learning analysis. We believe ours is the first application to process, in less than a day, on generally available resources, over a petabyte of scientific image data. We report on work using this reprocessed dataset for experiments demonstrating country-scale food production monitoring, an indicator for famine early warning. <br />
<br />
'''Dec 14, 2015'''<br />
* Speaker: Bill Softky <br />
* Affiliation:<br />
* Host: Bruno<br />
* Status: confirmed <br />
* Title: Screen addiction - informal Redwood group seminar<br />
<br />
'''Dec 16, 2015'''<br />
* Speaker: Mike Landy<br />
* Affiliation: Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
'''Feb 3, 2016'''<br />
* Speaker: Ping-Chen Huang<br />
* Affiliation: Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
'''Feb 17, 2016'''<br />
* Speaker: Andrew Saxe<br />
* Affiliation: Harvard<br />
* Host: Jesse<br />
* Status: confirmed<br />
* Title: Hallmarks of Deep Learning in the Brain<br />
<br />
'''Feb 24, 2016'''<br />
* Speaker: Miguel Perpinan<br />
* Affiliation: UC Merced<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
<br />
'''Mar 1, 2016'''<br />
* Speaker: Leon Gatys<br />
* Affiliation: Univ Tubingen<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
'''Mar 7-9, 2016'''<br />
* NICE workshop<br />
<br />
'''Mar 9, 2016'''<br />
* Tatiana Engel - HWNI job talk at 12:00<br />
<br />
'''Mar 16, 2016'''<br />
* Talia Lerner - HWNI job talk at 12:00<br />
<br />
'''Mar 23, 2016'''<br />
* Speaker: Kwabena Boahen<br />
* Affiliation: Stanford<br />
* Host: Max Kanwal/Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
'''April 11, 2016'''<br />
* Speaker: Hao Su<br />
* Time: at 12:00<br />
* Affiliation: Geometric Computing Lab and Artificial Intelligence Lab, Stanford University<br />
* Host: Yubei<br />
* Status: confirmed<br />
* Title: [Tentative] Joint Analysis for 2D Images and 3D shapes<br />
* Abstract: Coming<br />
<br />
'''May 04, 2016'''<br />
* Speaker: Zhengya Zhang<br />
* Time: 12:00<br />
* Affiliation: Electrical Engineering and Computer Science, University of Michigan<br />
* Host: Dylan, Bruno<br />
* Status: Confirmed<br />
* Title: Sparse Coding ASIC Chips for Feature Extraction and Classification<br />
* Abstract: Hardware-based computer vision accelerators will be an essential part of future mobile and autonomous devices to meet the low power and real-time processing requirement. To realize a high energy efficiency and high throughput, the accelerator architecture can be massively parallelized and tailored to the underlying algorithms, which is an advantage over software-based solutions and general-purpose hardware. In this talk, I will present three application-specific integrated circuit (ASIC) chips that implement the sparse and independent local network (SAILnet) algorithm and the locally competitive algorithm (LCA) for feature extraction and classification. Two of the chips were designed using an array of leaky integrate-and-fire neurons. Sparse activations of the neurons make possible an efficient grid-ring architecture to deliver an image processing throughput of 1 G pixel/s using only 200 mW. The third chip was designed using a convolution approach. Sparsity is again an important factor that enabled the use of sparse convolvers to achieve an effective performance of 900 G operations/s using less than 150 mW.<br />
<br />
'''May 18, 2016'''<br />
* Speaker: Melanie Mitchell<br />
* Affiliation: Portland State University and Santa Fe Institute<br />
* Host: Dylan<br />
* Time: 12:00<br />
* Status: confirmed<br />
* Title: Using Analogy to Recognize Visual Situations<br />
* Abstract: Enabling computers to recognize abstract visual situations remains a hard open problem in artificial intelligence. No machine vision system comes close to matching human ability at identifying the contents of images or visual scenes, or at recognizing abstract similarity between different scenes, even though such abilities pervade human cognition. In this talk I will describe my research on getting computers to flexibly recognize visual situations by integrating low-level vision algorithms with an agent-based model of higher-level concepts and analogy-making.<br />
* Bio: Melanie Mitchell is Professor of Computer Science at Portland State University, and External Professor and Member of the Science Board at the Santa Fe Institute. She received a Ph.D. in Computer Science from the University of Michigan. Her dissertation, in collaboration with her advisor Douglas Hofstadter, was the development of Copycat, a computer program that makes analogies. She is the author or editor of five books and over 70 scholarly papers in the fields of artificial intelligence, cognitive science, and complex systems. Her most recent book, Complexity: A Guided Tour (Oxford, 2009), won the 2010 Phi Beta Kappa Science Book Award. It was also named by Amazon.com as one of the ten best science books of 2009, and was longlisted for the Royal Society's 2010 book prize. Melanie directs the Santa Fe Institute's Complexity Explorer project, which offers online courses and other educational resources related to the field of complex systems.<br />
<br />
'''June 8, 2016'''<br />
* Speaker: Kris Bouchard<br />
* Time: 12:00<br />
* Affiliation: LBNL<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: The union of intersections method<br />
* Abstract:<br />
<br />
'''June 15, 2016'''<br />
* Speaker: James Blackmon<br />
* Time: 12:00<br />
* Affiliation: San Francisco State University<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
=== 2014/15 academic year ===<br />
<br />
'''2 July 2014'''<br />
* Speaker: Kelly Clancy<br />
* Affiliation: Feldman lab<br />
* Host: Guy<br />
* Status: confirmed<br />
* Title: Volitional control of neural assemblies in L2/3 of motor and somatosensory cortices<br />
* Abstract: I'll be talking about a joint effort between the Feldman, Carmena and Costa labs to study abstract task learning by small neuronal assemblies in intact networks. Brain-machine interfaces are a unique tool for studying learning, thanks to the direct mapping between neural activity and reward. We trained mice to operantly control an auditory cursor using spike-related calcium signals recorded with two-photon imaging in motor and somatosensory cortex, allowing us to assess the effects of learning with great spatial detail. Mice rapidly learned to modulate activity in layer 2/3 neurons, evident both across and within sessions. Interestingly, even neurons that exhibited very low or no spontaneous spiking--so-called 'silent' cells that are invisible to electrode-based techniques--could be behaviorally up-modulated for task performance. Learning was accompanied by modifications of firing correlations in spatially localized networks at fine scales.<br />
<br />
'''23 July 2014'''<br />
* Speaker: Gautam Agarwal<br />
* Affiliation: UC Berkeley/Champalimaud<br />
* Host: Friedrich Sommer<br />
* Status: confirmed<br />
* Title: Unsolved Mysteries of Hippocampal Dynamics<br />
* Abstract: Two radically different forms of electrical activity can be observed in the rat hippocampus: spikes and local field potentials (LFPs). Hippocampal pyramidal neurons are mostly silent, yet spike vigorously as the subject encounters particular locations in its environment. In contrast, LFPs appear to lack place-selectivity, persisting regardless of the rat's location. Recently, we found that in fact one can recover from LFPs the spatial information present in the underlying neuronal population, showing how these two signals are two sides of the same coin. Nonetheless, there are many aspects of the LFP that remain mysterious. I will review several observations and explanatory gaps which await further study. These include: the relationship of LFP patterns to anatomy; the elusive structure of gamma waves; complex forms of cross-frequency coupling; variations in LFP patterns seen when the rat explores its world more freely; reconciling the memory and navigation roles of the hippocampus.<br />
<br />
'''6 Aug 2014'''<br />
* Speaker: Georg Martius<br />
* Affiliation: Max Planck Institute, Leipzig<br />
* Host: Fritz Sommer<br />
* Status: confirmed<br />
* Title: Information driven self-organization of robotic behavior<br />
* Abstract: Autonomy is a puzzling phenomenon in nature and a major challenge in the world of artifacts. A key feature of autonomy in both natural and artificial systems is seen in the ability for independent exploration. In animals and humans, the ability to modify one's own pattern of activity is not only an indispensable trait for adaptation and survival in new situations, it also provides a learning system with novel information for improving its cognitive capabilities, and it is essential for development. Efficient exploration in high-dimensional spaces is a major challenge in building learning systems. We propose to implement the exploration as a deterministic law derived from maximizing an information quantity. More specifically, we use the predictive information of the sensor process (of a robot) to obtain an update rule (exploration dynamics) for the controller parameters. To be adequate in robotics applications, the non-stationary nature of the underlying time series has to be taken into account, which we do by proposing the time-local predictive information (TiPI). Importantly, the exploration dynamics is derived analytically, and by this we link information theory and dynamical systems. Without a random component, the change in the parameters is deterministically given as a function of the states in a certain time window. For an embodied system this means in particular that constraints, responses and current knowledge of the dynamical interaction with the environment can directly be used to advance further exploration. Randomness is replaced with spontaneity, which we demonstrate restricts the search space automatically to the physically relevant dimensions. Its effectiveness will be presented with various experiments on high-dimensional robotic systems, and we argue that this is a promising way to avoid the curse of dimensionality. This talk describes joint work with Ralf Der and Nihat Ay.<br />
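The core quantity in this proposal, predictive information, is a mutual information between the past and the future of the sensor process. As a minimal hedged sketch (the binary toy sensor stream and the one-step estimator below are illustrative inventions, not the TiPI estimator from the talk), it can be computed for a discrete sensor series as follows:<br />

```python
import numpy as np

rng = np.random.default_rng(4)

# A toy binary "sensor" stream with strong one-step structure:
# the next reading repeats the current one with probability 0.9.
n = 20000
s = np.zeros(n, dtype=int)
for t in range(1, n):
    s[t] = s[t - 1] if rng.random() < 0.9 else 1 - s[t - 1]

def predictive_information(series):
    # One-step predictive information I(s_t ; s_{t+1}) in bits,
    # estimated from the empirical joint histogram.
    past, future = series[:-1], series[1:]
    joint = np.zeros((2, 2))
    for a, b in zip(past, future):
        joint[a, b] += 1
    joint /= joint.sum()
    p_past, p_future = joint.sum(axis=1), joint.sum(axis=0)
    mask = joint > 0
    return float(np.sum(joint[mask] *
                        np.log2(joint[mask] /
                                np.outer(p_past, p_future)[mask])))

pi = predictive_information(s)
print(round(pi, 2))   # near 1 - H(0.9) ~ 0.53 bits
```

A structureless (i.i.d.) stream would give a value near zero, which is why maximizing this quantity drives a controller toward coordinated rather than random exploration.<br />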
<br />
'''15 Aug 2014'''<br />
* Speaker: Juergen Schmidhuber<br />
* Affiliation: IDSIA, Switzerland<br />
* Host: James/Shariq<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''2 Sept 2014'''<br />
* Speaker: Oriol Vinyals <br />
* Affiliation: Google<br />
* Host: Guy<br />
* Status: confirmed<br />
* Title: Machine Translation with Long-Short Term Memory Models<br />
* Abstract: Supervised large deep neural networks have achieved good results on speech recognition and computer vision. Although very successful, deep neural networks can only be applied to problems whose inputs and outputs can be conveniently encoded as vectors of fixed dimensionality; they cannot easily be applied to problems whose inputs and outputs are sequences. In this work, we show how to use a large deep Long Short-Term Memory (LSTM) model to solve domain-agnostic supervised sequence-to-sequence problems with minimal manual engineering. Our model uses one LSTM to map the input sequence to a vector of fixed dimensionality and another LSTM to map the vector to the output sequence. We applied our model to a machine translation task and achieved encouraging results. On the WMT'14 translation task from English to French, a model combination of 6 large LSTMs achieves a BLEU score of 32.3 (where a larger score is better). For comparison, a strong standard statistical MT baseline achieves a BLEU score of 33.3. When we use our LSTM to rescore the n-best lists produced by the SMT baseline, we achieve a BLEU score of 36.3, which is a new state of the art. This is joint work with Ilya Sutskever and Quoc Le.<br />
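The encoder-decoder wiring described in the abstract can be sketched compactly. The sketch below is a hedged illustration only: it uses plain tanh RNN cells with made-up sizes in place of the large LSTMs of the actual system, and it omits training and the softmax output layer.<br />

```python
import numpy as np

rng = np.random.default_rng(3)
H = 8                                     # hidden size (arbitrary toy value)

# Tiny random recurrent cells. The talk's model uses LSTMs; a plain tanh
# RNN is used here only to show the encoder/decoder wiring.
def make_cell(in_dim, h_dim):
    return (0.1 * rng.normal(size=(h_dim, in_dim)),
            0.1 * rng.normal(size=(h_dim, h_dim)))

def step(cell, x, h):
    W_in, W_rec = cell
    return np.tanh(W_in @ x + W_rec @ h)

enc, dec = make_cell(4, H), make_cell(H, H)

def encode(inputs):
    # Fold the whole input sequence into one fixed-size vector.
    h = np.zeros(H)
    for x in inputs:
        h = step(enc, x, h)
    return h

def decode(summary, n_steps):
    # Unroll the decoder from the summary vector; in the real model each
    # step would also emit a softmax over the output vocabulary.
    h, outputs = summary, []
    for _ in range(n_steps):
        h = step(dec, summary, h)
        outputs.append(h)
    return outputs

source = [rng.normal(size=4) for _ in range(5)]   # 5 input "tokens"
v = encode(source)                                 # fixed-dim summary
out = decode(v, n_steps=3)                         # 3 output steps
print(v.shape, len(out))                           # (8,) 3
```

The key design point the abstract describes is visible here: the input sequence of any length is compressed into the single fixed-dimensional vector `v`, which is the only channel between the two networks.<br />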
<br />
'''19 Sept 2014'''<br />
* Speaker: Gary Marcus<br />
* Affiliation: NYU<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''24 Sept 2014'''<br />
* Speaker: Alyosha Efros<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''30 Sep 2014'''<br />
* Speaker: Alejandro Bujan<br />
* Affiliation:<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: Propagation and variability of evoked responses: the role of correlated inputs and oscillations<br />
* Abstract: <br />
<br />
'''8 Oct 2014'''<br />
* Speaker: Siyu Zhang<br />
* Affiliation: UC Berkeley<br />
* Host: Karl<br />
* Status: confirmed<br />
* Title: Long-range and local circuits for top-down modulation of visual cortical processing<br />
* Abstract:<br />
<br />
'''15 Oct 2014'''<br />
* Speaker: Tamara Broderick<br />
* Affiliation: UC Berkeley<br />
* Host: Yvonne/James<br />
* Status: confirmed<br />
* Title: Feature allocations, probability functions, and paintboxes<br />
* Abstract: Clustering involves placing entities into mutually exclusive categories. We wish to relax the requirement of mutual exclusivity, allowing objects to belong simultaneously to multiple classes, a formulation that we refer to as "feature allocation." The first step is a theoretical one. In the case of clustering the class of probability distributions over exchangeable partitions of a dataset has been characterized (via exchangeable partition probability functions and the Kingman paintbox). These characterizations support an elegant nonparametric Bayesian framework for clustering in which the number of clusters is not assumed to be known a priori. We establish an analogous characterization for feature allocation; we define notions of "exchangeable feature probability functions" and "feature paintboxes" that lead to a Bayesian framework that does not require the number of features to be fixed a priori. The second step is a computational one. Rather than appealing to Markov chain Monte Carlo for Bayesian inference, we develop a method to transform Bayesian methods for feature allocation (and other latent structure problems) into optimization problems with objective functions analogous to K-means in the clustering setting. These yield approximations to Bayesian inference that are scalable to large inference problems.<br />
<br />
'''29 Oct 2014'''<br />
* Speaker: Ken Nakayama<br />
* Affiliation: Harvard<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Topics in higher level visuo-motor control<br />
* Abstract: TBA<br />
<br />
'''5 Nov 2014''' - **BVLC retreat**<br />
<br />
'''20 Nov 2014'''<br />
* Speaker: Haruo Hasoya<br />
* Affiliation: ATR Institute, Japan<br />
* Host: Bruno<br />
* Status: tentative<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''9 Dec 2014'''<br />
* Speaker: Dirk DeRidder<br />
* Affiliation: Dunedin School of Medicine, University of Otago, New Zealand<br />
* Host: Bruno/Walter Freeman<br />
* Status: confirmed<br />
* Title: The Bayesian brain, phantom percepts and brain implants<br />
* Abstract: TBA<br />
<br />
'''January 14, 2015'''<br />
* Speaker: Kevin O'Regan<br />
* Affiliation: CNRS - Université Paris Descartes<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''January 21, 2015'''<br />
* Speaker: Adrienne Fairhall<br />
* Affiliation: University of Washington<br />
* Host: Mike Schachter<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''January 26, 2015'''<br />
* Speaker: Abraham Peled<br />
* Affiliation: Mental Health Center, 'Technion' Israel Institute of Technology<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Clinical Brain Profiling: A Neuro-Computational psychiatry<br />
* Abstract: TBA<br />
<br />
'''January 28, 2015'''<br />
* Speaker: Rich Ivry<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Embodied Decision Making: System interactions in sensorimotor adaptation and reinforcement learning<br />
* Abstract:<br />
<br />
'''February 11, 2015'''<br />
* Speaker: Mark Lescroart<br />
* Affiliation: UC Berkeley<br />
* Host: Karl<br />
* Status: tentative<br />
* Title: <br />
* Abstract:<br />
<br />
'''February 25, 2015'''<br />
* Speaker: Steve Chase<br />
* Affiliation: CMU<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Joint Redwood/CNEP seminar<br />
* Abstract:<br />
<br />
'''March 3, 2015'''<br />
* Speaker: Andreas Herz<br />
* Affiliation: Bernstein Center, Munich<br />
* Host: Bruno/Fritz<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''March 3, 2015 - 4:00'''<br />
* Speaker: James Cooke<br />
* Affiliation: Oxford<br />
* Host: Mike Deweese<br />
* Status: confirmed<br />
* Title: Neural Circuitry Underlying Contrast Gain Control in Primary Auditory Cortex<br />
* Abstract:<br />
<br />
'''March 4, 2015'''<br />
* Speaker: Bill Sprague<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: V1 disparity tuning and the statistics of disparity in natural viewing<br />
* Abstract:<br />
<br />
'''March 11, 2015'''<br />
* Speaker: Jozsef Fiser<br />
* Affiliation: Central European University<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''April 1, 2015'''<br />
* Speaker: Saeed Saremi<br />
* Affiliation: Salk Inst<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''April 15, 2015'''<br />
* Speaker: Zahra M. Aghajan<br />
* Affiliation: UCLA<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: Hippocampal Activity in Real and Virtual Environments<br />
* Abstract:<br />
<br />
'''May 7, 2015'''<br />
* Speaker: Santani Teng<br />
* Affiliation: MIT<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''May 13, 2015'''<br />
* Speaker: Harri Valpola<br />
* Affiliation: ZenRobotics<br />
* Host: Brian<br />
* Status: Tentative<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''June 24, 2015'''<br />
* Speaker: Kendrick Kay<br />
* Affiliation: Department of Psychology, Washington University in St. Louis<br />
* Host: Karl<br />
* Status: Confirmed<br />
* Title: Using functional neuroimaging to reveal the computations performed by the human visual system<br />
* Abstract: Visual perception is the result of a complex set of computational transformations performed by neurons in the visual system. Functional magnetic resonance imaging (fMRI) is ideally suited for identifying these transformations, given its excellent spatial resolution and ability to monitor activity across the numerous areas of visual cortex. In this talk, I will review past research in which we used fMRI to develop increasingly accurate models of the stimulus transformations occurring in early and intermediate visual areas. I will then describe recent research in which we successfully extend this approach to high-level visual areas involved in perception of visual categories (e.g. faces) and demonstrate how top-down attention modulates bottom-up stimulus representations. Finally, I will discuss ongoing research targeting regions of ventral temporal cortex that are essential for skilled reading. Our model-based approach, combined with high-field laminar measurements, is expected to provide an integrated picture of how bottom-up stimulus transformations and top-down cognitive factors interact to support rapid and accurate word recognition. Development of quantitative models and associated experimental paradigms may help us understand and diagnose impairments in neural processing that underlie visual disorders such as dyslexia and prosopagnosia.<br />
<br />
=== 2013/14 academic year ===<br />
<br />
'''9 Oct 2013'''<br />
* Speaker: Ekaterina Brocke<br />
* Affiliation: KTH University, Stockholm, Sweden<br />
* Host: Tony<br />
* Status: confirmed<br />
* Title: Multiscale modeling in Neuroscience: first steps towards multiscale co-simulation tool development.<br />
* Abstract: Multiscale modeling and simulation attract an increasing number of neuroscientists studying how different levels of organization (networks of neurons, cellular/subcellular levels) interact with each other across multiple scales of space and time to mediate different brain functions. Different scales are usually described by different physical and mathematical formalisms, making it non-trivial to perform the integration. In this talk, I will discuss key phenomena in neuroscience that can be addressed using subcellular/cellular models and possible approaches to performing multiscale simulations, in particular a co-simulation method. I will also introduce several multiscale "toy" models of cellular/subcellular levels that were developed with the aim of understanding numerical and technical problems which might appear during co-simulation. Finally, I will present the first steps made towards the development of a multiscale co-simulation tool.<br />
<br />
'''29 Oct 2013 - note: 4:00'''<br />
* Speaker: Mitya Chklovskii<br />
* Affiliation: HHMI/Janelia Farm<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''30 Oct 2013'''<br />
* Speaker: Ilya Nemenman<br />
* Affiliation: Emory University, Departments of Physics and Biology<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: Large N in neural data -- expecting the unexpected.<br />
* Abstract: Recently it has become possible to directly measure simultaneous collective states of many biological components, such as neural activities, genetic sequences, or gene expression profiles. These data are revealing striking results, suggesting, for example, that biological systems are tuned to criticality, and that effective models of these systems based on only pairwise interactions among constitutive components provide surprisingly good fits to the data. We will explore a handful of simplified theoretical models, largely focusing on statistical mechanics of Ising spins, that suggest plausible explanations for these observations. Specifically, I will argue that, at least in certain contexts, these intriguing observations should be expected in multivariate interacting data in the thermodynamic limit of many interacting components.<br />
<br />
'''31 Oct 2013'''<br />
* Speaker: Oriol Vinyals<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno/Brian<br />
* Status: confirmed<br />
* Title: Beyond Deep Learning: Scalable Methods and Models for Learning<br />
* Abstract: In this talk I will briefly describe several techniques I explored in my thesis that improve how to efficiently model signal representations and learn useful information from them. The building block of my dissertation is based on machine learning approaches to classification, where a (typically non-linear) function is learned from labeled examples to map from signals to some useful information (e.g. an object class present in an image, or a word present in an acoustic signal). One of the motivating factors of my work has been advances in neural networks with deep architectures (which have led to the terminology "deep learning") that have shown state-of-the-art performance in acoustic modeling and object recognition -- the main focus of this thesis. In my work, I have contributed to both the learning (or training) of such architectures through faster and more robust optimization techniques, and also to the simplification of the deep architecture model to an approach that is simple to optimize. Furthermore, I derived a theoretical bound showing a fundamental limitation of shallow architectures based on sparse coding (which can be seen as a one-hidden-layer neural network), thus justifying the need for deeper architectures, while also empirically verifying these architectural choices on speech recognition. Many of my contributions have been used in a wide variety of applications, products and datasets as a result of many collaborations within ICSI and Berkeley, but also at Microsoft Research and Google Research.<br />
<br />
'''6 Nov 2013'''<br />
* Speaker: Garrett T. Kenyon<br />
* Affiliation: Los Alamos National Laboratory, The New Mexico Consortium<br />
* Host: Dylan Paiton<br />
* Status: Confirmed<br />
* Title: Using Locally Competitive Algorithms to Model Top-Down and Lateral Interactions<br />
* Abstract: Cortical connections consist of feedforward, feedback and lateral pathways. Infragranular layers project down the cortical hierarchy to both supra- and infragranular layers at the previous processing level, while the neurons in supragranular layers are linked by extensive long-range lateral projections that cross multiple cortical columns. However, most functional models of visual cortex only account for feedforward connections. Additionally, most models of visual cortex fail to account both for the thalamic projections to non-striate areas and the reciprocal connections from extrastriate areas back to the thalamus. In this talk, I will describe how a modified Locally Competitive Algorithm (LCA; Rozell et al, Neural Comp, 2008) can be used as a unifying framework for exploring the role of top-down and lateral cortical pathways within the context of deep, sparse, generative models. I will also describe an open source software tool called PetaVision that can be used to implement and execute hierarchical LCA-based models on multi-core, multi-node computer platforms without requiring specific knowledge of parallel-programming constructs.<br />
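For reference, the single-layer LCA of Rozell et al. (2008), which the talk's hierarchical models extend, can be sketched in a few lines. The toy dictionary and parameter values below are illustrative assumptions, not taken from the talk or from PetaVision.<br />

```python
import numpy as np

rng = np.random.default_rng(2)

# A toy overcomplete dictionary: 8-dimensional signals, 16 atoms.
Phi = rng.normal(size=(8, 16))
Phi /= np.linalg.norm(Phi, axis=0)       # unit-norm columns

# Signal built from two atoms plus a little noise.
x = 1.5 * Phi[:, 3] - 1.0 * Phi[:, 10] + 0.01 * rng.normal(size=8)

lam, tau, dt = 0.1, 10.0, 1.0            # threshold and time constants (toy)
G = Phi.T @ Phi - np.eye(16)             # lateral competition weights
b = Phi.T @ x                            # feedforward drive
u = np.zeros(16)                         # membrane potentials

def threshold(u, lam):
    # Soft threshold: active coefficients a from potentials u.
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

for _ in range(500):
    a = threshold(u, lam)
    u += (dt / tau) * (b - u - G @ a)    # LCA dynamics (Rozell et al., 2008)

a = threshold(u, lam)
print(np.nonzero(np.abs(a) > 0.2)[0])    # indices of strongly active atoms
```

The `G @ a` term implements the lateral inhibition between units that the talk generalizes to top-down and long-range lateral pathways; at convergence the code `a` is a sparse approximate solution to the corresponding lasso problem.<br />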
<br />
'''14 Nov 2013 (note: Thursday), ***12:30pm*** '''<br />
* Speaker: Geoffrey J Goodhill<br />
* Affiliation: Queensland Brain Institute and School of Mathematics and Physics, The University of Queensland, Australia<br />
* Host: Mike DeWeese<br />
* Status: Confirmed<br />
* Title: Computational principles of neural wiring development<br />
* Abstract: Brain function depends on precise patterns of neural wiring. An axon navigating to its target must make guidance decisions based on noisy information from molecular cues in its environment. I will describe a combination of experimental and computational work showing that (1) axons may act as ideal observers when sensing chemotactic gradients, (2) the complex influence of calcium and cAMP levels on guidance decisions can be predicted mathematically, (3) the morphology of growth cones at the axonal tip can be understood in terms of just a few eigenshapes, and remarkably these shapes oscillate in time with periods ranging from minutes to hours. Together this work may shed light on how neural wiring goes wrong in some developmental brain disorders, and how best to promote appropriate regrowth of axons after injury.<br />
<br />
'''4 Dec 2013'''<br />
* Speaker: Zhenwen Dai<br />
* Affiliation: FIAS, Goethe University Frankfurt, Germany.<br />
* Host: Georgios Exarchakis<br />
* Status: Confirmed<br />
* Title: What Are the Invariant Occlusive Components of Image Patches? A Probabilistic Generative Approach <br />
* Abstract: We study optimal image encoding based on a generative approach with non-linear feature combinations and explicit position encoding. By far most approaches to unsupervised learning of visual features, such as sparse coding or ICA, account for translations by representing the same features at different positions. Some earlier models used a separate encoding of features and their positions to facilitate invariant data encoding and recognition. All probabilistic generative models with explicit position encoding have so far assumed a linear superposition of components to encode image patches. Here, we for the first time apply a model with non-linear feature superposition and explicit position encoding for patches. By avoiding linear superpositions, the studied model represents a closer match to component occlusions which are ubiquitous in natural images. In order to account for occlusions, the non-linear model encodes patches qualitatively very different from linear models by using component representations separated into mask and feature parameters. We first investigated encodings learned by the model using artificial data with mutually occluding components. We find that the model extracts the components, and that it can correctly identify the occlusive components with the hidden variables of the model. On natural image patches, the model learns component masks and features for typical image components. By using reverse correlation, we estimate the receptive fields associated with the model’s hidden units. We find many Gabor-like or globular receptive fields as well as fields sensitive to more complex structures. Our results show that probabilistic models that capture occlusions and invariances can be trained efficiently on image patches, and that the resulting encoding represents an alternative model for the neural encoding of images in the primary visual cortex. <br />
<br />
'''11 Dec 2013'''<br />
* Speaker: Kai Siedenburg<br />
* Affiliation: UC Davis, Petr Janata's Lab.<br />
* Host: Jesse Engel<br />
* Status: Confirmed<br />
* Title: Characterizing Short-Term Memory for Musical Timbre<br />
* Abstract: Short-term memory is a cognitive faculty central to the apprehension of music and speech. Little is known, however, about memory for musical timbre despite its “sisterhood” with speech; after all, speech can be regarded as a sequencing of vocal timbre. Past research has isolated many characteristic effects of verbal memory. Are these also in play for non-vocal timbre sequences? We studied this question by considering short-term memory for serial order. Using timbres and dissimilarity data from McAdams et al. (Psych. Research, 1995), we employed a same/different discrimination paradigm. Experiment 1 (N = 30 MU + 30 nonMU) revealed effects of sequence length and timbral dissimilarity of items, as well as an interaction of musical training and pitch variability: in contrast to musicians, non-musicians' performance was impaired by simultaneous changes in pitch, compared to a constant-pitch baseline. Experiment 2 (N = 22) studied whether musicians' memory for timbre sequences was independent of pitch irrespective of the degree of complexity of pitch progressions. Comparing sequences with pitch changing within and across standard and comparison to a constant-pitch baseline, performance was now clearly impaired for the variable-pitch condition. Experiment 3 (N = 22) showed primacy and recency effects for musicians, and reproduced a positive effect of timbral heterogeneity of sequences. Our findings demonstrate the presence of hallmark effects of verbal memory such as similarity, word length, and primacy/recency in the domain of non-vocal timbre, and suggest that memory for speech and non-vocal timbre sequences might to a large extent share underlying mechanisms.<br />
<br />
'''12 Dec 2013'''<br />
* Speaker: Matthias Bethge<br />
* Affiliation: University of Tubingen<br />
* Host: Bruno<br />
* Status: tentative<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''22 Jan 2014'''<br />
* Speaker: Thomas Martinetz<br />
* Affiliation: Univ Luebeck<br />
* Host: Bruno/Fritz<br />
* Status: confirmed<br />
* Title: Orthogonal Sparse Coding and Sensing<br />
* Abstract: Sparse coding has been a very successful concept, since many natural signals have the property of being sparse in some dictionary (basis). Some natural signals are even sparse in an orthogonal basis, most prominently natural images, which are sparse in a respective wavelet transform. An encoding in an orthogonal basis has a number of advantages, e.g., finding the optimal coding coefficients is simply a projection instead of being NP-hard. Given some data, we want to find the orthogonal basis which provides the sparsest code. This problem can be seen as a generalization of Principal Component Analysis. We present an algorithm, Orthogonal Sparse Coding (OSC), which is able to find this basis very robustly. On natural images, it compresses on the level of JPEG, but it can adapt to arbitrary and special data sets and achieve significant improvements. With the property of being sparse in some orthogonal basis, we show how signals can be sensed very efficiently in a hierarchical manner with at most k log D sensing actions. This hierarchical sensing might relate to the way we sense the world, with interesting applications in active vision.<br />
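The advantage of orthogonality mentioned in the abstract (optimal sparse coefficients come from a projection rather than an NP-hard search) can be illustrated directly. The basis below is a random orthonormal matrix standing in for a learned OSC dictionary; this is a sketch of the encoding step only, not of the OSC learning algorithm itself.<br />

```python
import numpy as np

rng = np.random.default_rng(1)
D = 16                                   # signal dimension (toy value)

# A random orthonormal basis (columns), standing in for a learned basis.
U, _ = np.linalg.qr(rng.normal(size=(D, D)))

# A signal that is exactly 3-sparse in that basis.
coeffs = np.zeros(D)
coeffs[[2, 7, 11]] = [3.0, -2.0, 1.5]
x = U @ coeffs

# Optimal k-sparse encoding in an orthonormal basis: project, keep top-k.
def encode(x, U, k):
    a = U.T @ x                          # projection gives exact coefficients
    keep = np.argsort(np.abs(a))[-k:]    # k largest-magnitude entries
    sparse = np.zeros_like(a)
    sparse[keep] = a[keep]
    return sparse

a_hat = encode(x, U, k=3)
print(np.allclose(U @ a_hat, x))         # perfect reconstruction: True
```

For a general (overcomplete, non-orthogonal) dictionary the same best-k-terms problem requires combinatorial search, which is the contrast the abstract draws.<br />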
<br />
'''29 Jan 2014'''<br />
* Speaker: David Klein<br />
* Affiliation: Audience<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''5 Feb 2014''' (leave open for Barth/Martinetz seminar)<br />
<br />
'''12 Feb 2014'''<br />
* Speaker: Ilya Sutskever <br />
* Affiliation: Google<br />
* Host: Zayd<br />
* Status: confirmed<br />
* Title: Continuous vector representations for machine translation<br />
* Abstract: Dictionaries and phrase tables are the basis of modern statistical machine translation systems. I will present a method that can automate the process of generating and extending dictionaries and phrase tables. Our method can translate missing word and phrase entries by learning language structures using large monolingual data, and by mapping between the languages using a small bilingual dataset. It uses distributed representations of words and learns a linear mapping between vector spaces of languages. Despite its simplicity, our method is surprisingly effective: we can achieve almost 90% precision@5 for translation of words between English and Spanish. This method makes little assumption about the languages, so it can be used to extend and refine dictionaries and translation tables for any language pairs. Joint work with Tomas Mikolov and Quoc Le.<br />
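The core of the method described in the abstract is a linear map between two embedding spaces, fit by least squares on a small seed dictionary, with translation by nearest neighbor in the target space. The sketch below uses fabricated toy embeddings (a random rotation plus noise) rather than real word2vec vectors, so it shows the mechanics only.<br />

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy embeddings: 5 "English" words and their "Spanish" counterparts.
# In the talk's setting these would come from monolingual models trained
# on large corpora; here we fabricate a ground-truth rotation plus noise.
d = 4
X = rng.normal(size=(5, d))                  # source-language vectors (rows)
R, _ = np.linalg.qr(rng.normal(size=(d, d)))
Z = X @ R + 0.01 * rng.normal(size=(5, d))   # target-language vectors

# Learn the linear map W minimizing ||X W - Z||_F via least squares,
# using the known word pairs as the small bilingual seed dictionary.
W, *_ = np.linalg.lstsq(X, Z, rcond=None)

# Translate: map a source vector across and pick the nearest target vector.
def translate(i):
    query = X[i] @ W
    dists = np.linalg.norm(Z - query, axis=1)
    return int(np.argmin(dists))

print([translate(i) for i in range(5)])      # each word maps to its pair
```

The simplicity is the point of the talk: no assumption about the languages enters beyond the existence of a roughly linear correspondence between their embedding geometries.<br />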
<br />
'''25 Feb 2014'''<br />
* Speaker: Alexander Terekhov <br />
* Affiliation: CNRS - Université Paris Descartes<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Constructing space: how a naive agent can learn spatial relationships by observing sensorimotor contingencies<br />
* Abstract:<br />
<br />
'''12 March 2014'''<br />
* Speaker: Carlos Portera-Cailliau<br />
* Affiliation: UCLA<br />
* Host: Mike<br />
* Status: confirmed<br />
* Title: Circuit defects in the neocortex of Fmr1 knockout mice<br />
* Abstract: TBA<br />
<br />
'''19 March 2014'''<br />
* Speaker: Dean Buonomano<br />
* Affiliation: UCLA<br />
* Host: Mike<br />
* Status: confirmed<br />
* Title: State-dependent Networks: Timing and Computations Based on Neural Dynamics and Short-term Plasticity<br />
* Abstract: The brain’s ability to seamlessly assimilate and process spatial and temporal information is critical to most behaviors, from understanding speech to playing the piano. Indeed, because the brain evolved to navigate a dynamic world, timing and temporal processing represent a fundamental computation. We have proposed that timing and the processing of temporal information emerge from the interaction between incoming stimuli and the internal state of neural networks. The internal state is defined not only by ongoing activity (the active state) but by time-varying synaptic properties, such as short-term synaptic plasticity (the hidden state). One prediction of this hypothesis is that timing is a general property of cortical circuits. We provide evidence in this direction by demonstrating that in vitro cortical networks can “learn” simple temporal patterns. Finally, previous theoretical studies have suggested that recurrent networks capable of self-perpetuating activity hold significant computational potential. However, harnessing the computational potential of these networks has been hampered by the fact that such networks are chaotic. We show that it is possible to “tame” chaos through recurrent plasticity, and create a novel and powerful general framework for how cortical circuits compute.<br />
<br />
'''26 March 2014'''<br />
* Speaker: Robert G. Smith<br />
* Affiliation: University of Pennsylvania<br />
* Host: Mike S<br />
* Status: confirmed<br />
* Title: Role of Dendritic Computation in the Direction-Selective Circuit of Retina<br />
* Abstract: The retina utilizes a variety of signal processing mechanisms to compute direction from image motion. The computation is accomplished by a circuit that includes starburst amacrine cells (SBACs), which are GABAergic neurons presynaptic to direction-selective ganglion cells (DSGCs). SBACs are symmetric neurons with several branched dendrites radiating out from the soma. When a stimulus moving back and forth along a SBAC dendrite sequentially activates synaptic inputs, larger post-synaptic potentials (PSPs) are produced in the dendritic tips when the stimulus moves outwards from the soma. The directional difference in EPSP amplitude is further amplified near the dendritic tips by voltage-gated channels to produce directional release of GABA. Reciprocal inhibition between adjacent SBACs may also amplify directional release. Directional signals in the independent SBAC branches are preserved because each dendrite makes selective contacts only with DSGCs of the appropriate preferred direction. Directional signals are further enhanced within the dendritic arbor of the DSGC, which essentially comprises an array of distinct dendritic compartments. Each of these dendritic compartments locally sums excitatory and inhibitory inputs, amplifies them with voltage-gated channels, and generates spikes that propagate to the axon via the soma. Overall, the computation of direction in the retina is performed by several local dendritic mechanisms, both presynaptic and postsynaptic, with the result that directional responses are robust over a broad range of stimuli.<br />
<br />
'''16 April 2014'''<br />
* Speaker: David Pfau<br />
* Affiliation: Columbia<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''22 April 2014 *Tuesday*'''<br />
* Speaker: Jochen Braun<br />
* Affiliation: Otto-von-Guericke University, Magdeburg<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Dynamics of visual perception and collective neural activity<br />
* Abstract:<br />
<br />
'''29 April 2014'''<br />
* Speaker: Giuseppe Vitiello<br />
* Affiliation: University of Salerno<br />
* Host: Fritz/Walter Freeman<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''30 April 2014'''<br />
* Speaker: Masataka Watanabe<br />
* Affiliation: University of Tokyo / Max Planck Institute for Biological Cybernetics<br />
* Host: Gautam Agarwal<br />
* Status: confirmed<br />
* Title: Turing Test for Machine Consciousness and the Chaotic Spatiotemporal Fluctuation Hypothesis<br />
* Abstract: I propose an experimental method to test various hypotheses on consciousness. Inspired by Sperry's observation that split-brain patients possess two independent streams of consciousness, the idea is to implement candidate neural mechanisms of visual consciousness onto an artificial cortical hemisphere and test whether subjective experience is evoked in the device's visual hemifield. In contrast to modern neurosynthetic devices, I show that mimicking interhemispheric connectivity assures that authentic and fine-grained subjective experience arises only when a stream of consciousness is generated within the device. It is valid under a widely believed assumption regarding interhemispheric connectivity and neuronal stimulus-invariance. (I will briefly explain my own evidence that human V1 does not respond to changes in the contents of visual awareness [1].)<br />
<br />
If consciousness is actually generated within the device, we should be able to construct a case where two objects presented in the device's visual field are distinguishable by visual experience but not by what is communicated through the brain-machine interface. As strange as it may sound, and clearly violating the laws of physics, this is likely to be happening in the intact brain, where unified subjective bilateral vision and its verbal report occur without the total interhemispheric exchange of conscious visual information.<br />
<br />
Together, I present a hypothesis on the neural mechanism of consciousness, “The Chaotic Spatiotemporal Fluctuation Hypothesis”, that passes the proposed test for visual qualia and also explains how the physics that we know of today is violated. Here, neural activity is divided into two components, the time-averaged activity and the residual temporally fluctuating activity, where the former serves as the content of consciousness (neuronal population vector) and the latter as consciousness itself. The content is “read” into consciousness in the sense that every local perturbation caused by a change in the neuronal population vector creates a spatiotemporal wave in the fluctuation component that travels throughout the system. Deterministic chaos assures that every local difference makes a difference to the whole of the dynamics, as in the butterfly effect, serving as a foundation for the holistic nature of consciousness. I will present data from simultaneous electrophysiology-fMRI recordings and human fMRI [2] that support the existence of such large-scale causal fluctuation.<br />
<br />
Here, the chaotic fluctuation cannot be decoded to trace back the original perturbation in the neuronal population vector, because the initial states of all neurons would be required with infinite precision to do so. Hence what is transmitted between the two hemispheres is not "information" in the normal sense. This illustrates the violation of physics by the metaphysical assumption, "chaotic spatiotemporal fluctuation is consciousness", where unification of bilateral vision and the solving of visual tasks (e.g. perfect symmetry detection) are achieved without exchanging the otherwise required Shannon information between the two hemispheres.<br />
<br />
Finally, minimal and realistic versions of the proposed test for visual qualia can be conducted on laboratory animals to validate the hypothesis. These versions deal with two biological hemispheres, which we already know support consciousness. We dissect the interhemispheric connectivity and replace it with an artificial connection capable of filtering out the neural fluctuation component. A limited interhemispheric connectivity may be sufficient, which would drastically reduce the technological challenge. If the subject is capable of conducting a bilateral stimulus-matching task with the full artificial interhemispheric connectivity, but not when the fluctuation component is filtered out, this can be considered strong supporting evidence for the hypothesis.<br />
<br />
1. Watanabe, M., Cheng, K., Ueno, K., Asamizuya, T., Tanaka, K., Logothetis, N., Attention but not awareness modulates the BOLD signal in the human V1 during binocular suppression. Science, 2011. 334(6057): p. 829-31.<br />
<br />
2. Watanabe, M., Bartels, A., Macke, J., Logothetis, N., Temporal jitter of the BOLD signal reveals a reliable initial dip and improved spatial resolution. Curr Biol, 2013. 23(21): p. 2146-50.<br />
<br />
'''11 June 2014'''<br />
* Speaker: Stuart Hameroff<br />
* Affiliation: University of Arizona, Tucson<br />
* Host: Gautam<br />
* Status: confirmed<br />
* Title: ‘Tuning the brain’ – Treating mental states through microtubule vibrations <br />
* Abstract: Do mental states derive entirely from brain neuronal membrane activities? Neuronal interiors are organized by microtubules (‘MTs’), protein polymers proposed to encode memory, process information and support consciousness. Using nanotechnology, Bandyopadhyay’s group at MIT has shown coherent vibrations (megahertz to 10 kilohertz) from microtubule bundles inside active neurons, vibrations (electric field potentials ~40 to 50 mV) able to influence membrane potentials. This suggests EEG rhythms are ‘beat’ frequencies of megahertz vibrations in microtubules inside neurons (Hameroff and Penrose, 2014), and that consciousness and cognition involve vibrational patterns resonating across scales in the brain, more like music than computation. MT megahertz vibrations may be a useful therapeutic target for ‘tuning’ mood and mental states. Among noninvasive transcranial brain stimulation techniques (TMS, TDcS), transcranial ultrasound (TUS) delivers megahertz mechanical vibrations. Applied at the scalp, low intensity, sub-thermal TUS safely reaches the brain. In human studies, brief (15 to 30 seconds) TUS at 0.5, 2 and 8 megahertz to frontal-temporal cortex results in 40 minutes or longer of reported mood improvement, and focused TUS enhances sensory discrimination (Legon et al, 2014). In vitro, ultrasound promotes neurite outgrowth in embryonic neurons (Raman), and stabilizes microtubules against disassembly (Gupta). (In Alzheimer’s disease, MTs disassemble and release tau.) These findings suggest ‘tuning the brain’ with TUS should be a safe, effective and inexpensive treatment for Alzheimer’s, traumatic brain injury, depression, anxiety, PTSD and other disorders.<br />
<br />
References: Hameroff S, Penrose R (2014) Phys Life Rev http://www.sciencedirect.com/science/article/pii/S1571064513001188; Sahu et al (2013) Biosens Bioelectron 47:141–8; Sahu et al (2013) Appl Phys Lett 102:123701; Legon et al (2014) Nature Neuroscience 17: 322–329<br />
<br />
'''25 June 2014'''<br />
* Speaker: Peter Loxley<br />
* Affiliation: <br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: The two-dimensional Gabor function adapted to natural image statistics: An analytical model of simple-cell responses in the early visual system<br />
* Abstract: TBA<br />
<br />
=== 2012/13 academic year ===<br />
<br />
'''26 Sept 2012''' <br />
* Speaker: Jason Yeatman<br />
* Affiliation: Department of Psychology, Stanford University<br />
* Host: Bruno/Susana Chung<br />
* Status: confirmed<br />
* Title: The Development of White Matter and Reading Skills<br />
* Abstract: The development of cerebral white matter involves both myelination and pruning of axons, and the balance between these two processes may differ between individuals. Cross-sectional measures of white matter development mask the interplay between these active developmental processes and their connection to cognitive development. We followed a cohort of 39 children longitudinally for three years, and measured white matter development and reading development using diffusion tensor imaging and behavioral tests. In the left arcuate and inferior longitudinal fasciculus, children with above-average reading skills initially had low fractional anisotropy (FA) with a steady increase over the 3-year period, while children with below-average reading skills had higher initial FA that declined over time. We describe a dual-process model of white matter development that balances biological processes that have opposing effects on FA, such as axonal myelination and pruning, to explain the pattern of results.<br />
<br />
'''8 Oct 2012''' <br />
* Speaker: Sophie Deneve<br />
* Affiliation: Laboratoire de Neurosciences cognitives, ENS-INSERM<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Balanced spiking networks can implement dynamical systems with predictive coding<br />
* Abstract: Neural networks can integrate sensory information and generate continuously varying outputs, even though individual neurons communicate only with spikes (all-or-none events). Here we show how this can be done efficiently if spikes communicate "prediction errors" between neurons. We focus on the implementation of linear dynamical systems and derive a spiking network model from a single optimization principle. Our model naturally accounts for two puzzling aspects of cortex. First, it provides a rationale for the tight balance and correlations between excitation and inhibition. Second, it predicts asynchronous and irregular firing as a consequence of predictive population coding, even in the limit of vanishing noise. We show that our spiking networks have error-correcting properties that make them far more accurate and robust than comparable rate models. Our approach suggests spike times do matter when considering how the brain computes, and that the reliability of cortical representations could have been strongly underestimated.<br />
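The optimization principle in the abstract can be illustrated with a toy simulation: each neuron fires only when its spike would reduce the network's coding error, and the decoded spike trains then track a one-dimensional leaky integrator. This is only a minimal sketch in the spirit of such predictive-coding spiking networks, not the authors' model; all weights and parameter values below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, dt = 20, 2000, 1e-3
lam = 10.0                          # leak rate of target dynamics and decoder
G = rng.choice([-0.1, 0.1], N)      # assumed decoding weights Gamma_i
thresh = G ** 2 / 2                 # spike iff it reduces the squared error

x, xhat = 0.0, 0.0                  # true state and network estimate
r = np.zeros(N)                     # filtered spike trains
n_spikes, sq_err = 0, []
for t in range(T):
    c = 5.0 * np.sin(2 * np.pi * t * dt)   # input drive c(t)
    x += dt * (-lam * x + c)               # target linear dynamical system
    V = G * (x - xhat)                     # "membrane potential" = projected error
    i = np.argmax(V - thresh)
    if V[i] > thresh[i]:                   # greedy: at most one spike per step
        r[i] += 1.0
        n_spikes += 1
    r -= dt * lam * r                      # postsynaptic decay
    xhat = G @ r                           # readout: prediction of x
    sq_err.append((x - xhat) ** 2)

print("spikes:", n_spikes, "mean squared tracking error:", np.mean(sq_err))
```

The spike rule (fire when the projected error exceeds half the squared decoding weight) is what keeps the estimate within one spike's worth of the true trajectory, which is the error-correcting property the abstract refers to.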
<br />
<br />
'''19 Oct 2012'''<br />
* Speaker: Gert Van Dijck<br />
* Affiliation: Cambridge<br />
* Host: Urs<br />
* Status: confirmed<br />
* Title: A solution to identifying neurones using extracellular activity in awake animals: a probabilistic machine-learning approach<br />
* Abstract: Electrophysiological studies over the last fifty years have been hampered by the difficulty of reliably assigning signals to identified cortical neurones. Previous studies have employed a variety of measures based on spike timing or waveform characteristics to tentatively classify other neurone types (Vos et al., Eur. J. Neurosci., 1999; Prsa et al., J. Neurosci., 2009), in some cases supported by juxtacellular labelling (Simpson et al., Prog. Brain Res., 2005; Holtzman et al., J. Physiol., 2006; Barmack and Yakhnitsa, J. Neurosci., 2008; Ruigrok et al., J. Neurosci., 2011), or intracellular staining and / or assessment of membrane properties (Chadderton et al., Nature, 2004; Jorntell and Ekerot, J. Neurosci., 2006; Rancz et al., Nature, 2007). Anaesthetised animals have been widely used as they can provide a ground-truth through neuronal labelling, which is much harder to achieve in awake animals, where spike-derived measures tend to be relied upon (Lansink et al., Eur. J. Neurosci., 2010). Whilst spike-shapes carry potentially useful information for classifying neuronal classes, they vary with electrode type and the geometric relationship between the electrode and the spike generation zone (Van Dijck et al., Int. J. Neural Syst., 2012). Moreover, spike-shape measurement is achieved with a variety of techniques, making it difficult to compare and standardise between laboratories. In this study we build probabilistic models on the statistics derived from the spike trains of spontaneously active neurones in the cerebellum and the ventral midbrain. The mean spike frequency in combination with the log-interval-entropy (Bhumbra and Dyball, J. Physiol.-London, 2004) of the inter-spike-interval distribution yields the highest prediction accuracy. The cerebellum model consists of two sub-models: a molecular layer - Purkinje layer model and a granular layer - Purkinje layer model. The first model identifies molecular layer interneurones and Purkinje cells with high accuracy (92.7 %), while the latter identifies Golgi cells, granule cells, mossy fibers and Purkinje cells with high accuracy (99.2 %). Furthermore, it is shown that the model trained on anaesthetized rat and decerebrate cat data has broad applicability to other species and behavioural states: anaesthetized mice (80 %), awake rabbits (94.2 %) and awake rhesus monkeys (89 - 90 %). Recently, optogenetics has made it possible to obtain a ground-truth about cell classes. Using opto-genetically identified GABA-ergic and dopaminergic cells, we build similar statistical models to identify these neuron types in the ventral midbrain. Hence, this illustrates that our approach will be of general use to a broad variety of laboratories.<br />
<br />
'''Tuesday, 23 Oct 2012''' <br />
* Speaker: Jaimie Sleigh<br />
* Affiliation: University of Auckland<br />
* Host: Fritz/Andrew Szeri<br />
* Status: confirmed<br />
* Title: Is General Anesthesia a failure of cortical information integration?<br />
* Abstract: General anesthesia and natural sleep share some commonalities and some differences. Quite a lot is known about the chemical and neuronal effects of general anesthetic drugs. There are two main groups of anesthetic drugs, which can be distinguished by their effects on the EEG. The most commonly used drugs exert a strong GABAergic action; whereas a second group is characterized by minimal GABAergic effects, but significant NMDA blockade. It is less clear which of these various effects, and how, result in the failure of the patient to wake up when the surgeon cuts them. I will present some results from experimental brain slice work, and theoretical mean field modelling of anesthesia and sleep, that support the idea that the final common mechanism of both types of anaesthesia is fragmentation of long distance information flow in the cortex.<br />
<br />
'''31 Oct 2012''' (Halloween)<br />
* Speaker: Jonathan Landy<br />
* Affiliation: UCSB<br />
* Host: Mike DeWeese<br />
* Status: Confirmed<br />
* Title: Mean-field replica theory: review of basics and a new approach<br />
* Abstract: Replica theory provides a general method for evaluating the mode of a distribution, and has varied applications to problems in statistical mechanics, signal processing, etc. Evaluation of the formal expressions arising in replica theory represents a formidable technical challenge, but one that physicists have apparently intuited correct methods for handling. In this talk, I will first provide a review of the historical development of replica theory, covering: 1) motivation, 2) the intuited "Parisi ansatz" solution, 3) continued controversies, and 4) a survey of applications (including to neural networks). Following this, I will discuss an exploratory effort of mine, aimed at developing an ansatz-free solution method. As an example, I will work out the phase diagram for a simple spin-glass model. This talk is intended primarily as a tutorial.<br />
<br />
'''7 Nov 2012''' <br />
* Speaker: Tom Griffiths<br />
* Affiliation: UC Berkeley<br />
* Host:Daniel Little<br />
* Status: Confirmed<br />
* Title: Identifying human inductive biases<br />
* Abstract: People are remarkably good at acquiring complex knowledge from limited data, as is required in learning causal relationships, categories, or aspects of language. Successfully solving inductive problems of this kind requires having good "inductive biases" - constraints that guide inductive inference. Viewed abstractly, understanding human learning requires identifying these inductive biases and exploring their origins. I will argue that probabilistic models of cognition provide a framework that can facilitate this project, giving a transparent characterization of the inductive biases of ideal learners. I will outline how probabilistic models are traditionally used to solve this problem, and then present a new approach that uses Markov chain Monte Carlo algorithms as the basis for an experimental method that magnifies the effects of inductive biases.<br />
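The experimental method mentioned at the end has a simple computational core: if a participant chooses between the current stimulus and a proposed one according to the Luce choice rule, that choice acts as a Barker acceptance step, so the chain's stationary distribution is the participant's implicit distribution. The sketch below simulates such a procedure with an assumed Gaussian "mental" category standing in for a real subject; all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def p(x):
    """The simulated participant's hidden category distribution
    (an assumed Gaussian with mean 2.0 and s.d. 0.5, unnormalized)."""
    return np.exp(-0.5 * ((x - 2.0) / 0.5) ** 2)

def choose(current, proposal):
    """Luce choice between two stimuli == Barker acceptance rule."""
    return rng.random() < p(proposal) / (p(proposal) + p(current))

x, samples = 0.0, []
for _ in range(20000):
    prop = x + rng.normal(0, 0.5)     # symmetric random-walk proposal
    if choose(x, prop):
        x = prop                      # participant "picked" the proposal
    samples.append(x)

samples = np.array(samples[2000:])    # discard burn-in
print(samples.mean(), samples.std())  # recovers the hidden mean and s.d.
```

Because the Barker rule satisfies detailed balance with a symmetric proposal, the recorded choices sample the hidden distribution, which is how the chain "magnifies" a subject's inductive biases into an estimable density.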
<br />
'''19 Nov 2012''' (Monday) (Thanksgiving week)<br />
* Speaker: Bin Yu<br />
* Affiliation: Dept. of Statistics and EECS, UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Representation of Natural Images in V4<br />
* Abstract: The functional organization of area V4 in the mammalian ventral visual pathway is far from being well understood. V4 is believed to play an important role in the recognition of shapes and objects and in visual attention, but the complexity of this cortical area makes it hard to analyze. In particular, no current model of V4 has shown good predictions for neuronal responses to natural images and there is no consensus on the primary role of V4.<br />
In this talk, we present analysis of electrophysiological data on the response of V4 neurons to natural images. We propose a new computational model that achieves comparable prediction performance for V4 as for V1 neurons. Our model does not rely on any pre-defined image features but only on invariance and sparse coding principles. We interpret our model using sparse principal component analysis and discover two groups of neurons: those selective to texture versus those selective to contours. This supports the thesis that one primary role of V4 is to extract objects from background in the visual field. Moreover, our study also confirms the diversity of V4 neurons. Among those selective to contours, some of them are selective to orientation, others to acute curvature features.<br />
(This is joint work with J. Mairal, Y. Benjamini, B. Willmore, M. Oliver and J. Gallant.)<br />
<br />
'''30 Nov 2012''' <br />
* Speaker: Yan Karklin<br />
* Affiliation: NYU<br />
* Host: Tyler<br />
* Status: confirmed<br />
* Title: <br />
* Abstract: <br />
<br />
'''10 Dec 2012 (note this would be the Monday after NIPS)''' <br />
* Speaker: Marius Pachitariu<br />
* Affiliation: Gatsby / UCL<br />
* Host: Urs<br />
* Status: confirmed<br />
* Title: NIPS paper "Learning visual motion in recurrent neural networks"<br />
* Abstract: We present a dynamic nonlinear generative model for visual motion based on a latent representation of binary-gated Gaussian variables connected in a network. Trained on sequences of images by an STDP-like rule, the model learns to represent different movement directions in different variables. We use an online approximate inference scheme that can be mapped to the dynamics of networks of neurons. Probed with drifting grating stimuli and moving bars of light, neurons in the model show patterns of responses analogous to those of direction-selective simple cells in primary visual cortex. We show how the computations of the model are enabled by a specific pattern of learnt asymmetric recurrent connections. I will also briefly discuss our application of recurrent neural networks as statistical models of simultaneously recorded spiking neurons.<br />
<br />
'''12 Dec 2012''' <br />
* Speaker: Ian Goodfellow<br />
* Affiliation: U Montreal<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''7 Jan 2013'''<br />
* Speaker: Stuart Hameroff<br />
* Affiliation: University of Arizona <br />
* Host: Gautam Agarwal<br />
* Status: confirmed<br />
* Title: Quantum cognition and brain microtubules <br />
* Abstract: Cognitive decision processes are generally seen as classical Bayesian probabilities, but may be better suited to quantum mathematics. For example: 1) Psychological conflict, ambiguity and uncertainty can be viewed as (quantum) superposition of multiple possible judgments and beliefs. 2) Measurement (e.g. answering a question, reaching a decision) reduces possibilities to definite states (‘constructing reality’, ‘collapsing the wave function’). 3) Previous questions influence subsequent answers, so sequence affects outcomes (‘contextual non-commutativity’). 4) Judgments and choices may deviate from classical logic, suggesting random, or ‘non-computable’ quantum influences. Can quantum cognition operate in the brain? Do classical brain activities simulate quantum processes? Or have biomolecular quantum devices evolved? In this talk I will discuss how a finer scale, intra-neuronal level of quantum information processing in cytoskeletal microtubules can accumulate, operate upon and integrate quantum information and memory for self-collapse to classical states which regulate axonal firings, controlling behavior.<br />
<br />
'''Monday 14 Jan 2013, 1:00pm'''<br />
* Speaker: Dibyendu Mandal <br />
* Affiliation: Physics Dept., University of Maryland (Jarzynski group)<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: An exactly solvable model of Maxwell’s demon<br />
* Abstract: The paradox of Maxwell’s demon has stimulated numerous thought experiments, leading to discussions about the thermodynamic implications of information processing. However, the field has lacked a tangible example or model of an autonomous, mechanical system that reproduces the actions of the demon. To address this issue, we introduce an explicit model of a device that can deliver work to lift a mass against gravity by rectifying thermal fluctuations, while writing information to a memory register. We solve for the steady-state behavior of the model and construct its nonequilibrium phase diagram. In addition to the engine-like action described above, we identify a Landauer eraser region in the phase diagram where the model uses externally supplied work to remove information from the memory register. Our model offers a simple paradigm for investigating the thermodynamics of information processing by exposing a transparent mechanism of operation.<br />
<br />
'''23 Jan 2013'''<br />
* Speaker: Carlos Brody<br />
* Affiliation: Princeton<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: Neural substrates of decision-making in the rat<br />
* Abstract: Gradual accumulation of evidence is thought to be a fundamental component of decision-making. Over the last 16 years, research in non-human primates has revealed neural correlates of evidence accumulation in parietal and frontal cortices, and other brain areas. However, the circuit mechanisms underlying these neural correlates remain unknown. Reasoning that a rodent model of evidence accumulation would allow a greater number of experimental subjects, and therefore experiments, as well as facilitate the use of molecular tools, we developed a rat accumulation of evidence task, the "Poisson Clicks" task. In this task, sensory evidence is delivered in pulses whose precisely-controlled timing varies widely within and across trials. The resulting data are analyzed with models of evidence accumulation that use the richly detailed information of each trial’s pulse timing to distinguish between different decision mechanisms. The method provides great statistical power, allowing us to: (1) provide compelling evidence that rats are indeed capable of gradually accumulating evidence for decision-making; (2) accurately estimate multiple parameters of the decision-making process from behavioral data; and (3) measure, for the first time, the diffusion constant of the evidence accumulator, which we show to be optimal (i.e., equal to zero). In addition, the method provides a trial-by-trial, moment-by-moment estimate of the value of the accumulator, which can then be compared in awake behaving electrophysiology experiments to trial-by-trial, moment-by-moment neural firing rate measures. Based on such a comparison, we describe data and a novel analysis approach that reveals differences between parietal and frontal cortices in the neural encoding of accumulating evidence. Finally, using semi-automated training methods to produce tens of rats trained in the Poisson Clicks accumulation of evidence task, we have also used pharmacological inactivation to ask, for the first time, whether parietal and frontal cortices are required for accumulation of evidence, and we are using optogenetic methods to rapidly and transiently inactivate brain regions so as to establish precisely when, during each decision-making trial, each brain region's activity is necessary for performance of the task.<br />
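For intuition about the task structure, a noiseless accumulator applied to Poisson click trains (consistent with the abstract's finding of a near-zero diffusion constant, i.e. near-perfect integration) can be simulated in a few lines. The click rates and trial duration below are illustrative assumptions, not the fitted values from the study.

```python
import numpy as np

rng = np.random.default_rng(2)

def accuracy(rate_r, rate_l, dur=0.5, n=5000):
    """Fraction of simulated trials on which a perfect, noise-free
    accumulator (sign of the right-minus-left click-count difference)
    picks the correct side, with 'right' as the generative side."""
    right = rng.poisson(rate_r * dur, n)   # right click counts per trial
    left = rng.poisson(rate_l * dur, n)    # left click counts per trial
    return np.mean(right > left)

acc_easy = accuracy(30, 10)   # large rate difference: easy trials
acc_hard = accuracy(22, 18)   # small rate difference: hard trials
print(acc_easy, acc_hard)
```

Even with perfect integration, accuracy is limited by the Poisson variability of the stimulus itself, which is exactly the property that lets trial-by-trial click timing constrain model parameters so tightly.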
<br />
'''28 Jan 2013'''<br />
* Speaker: Eugene M. Izhikevich<br />
* Affiliation: Brain Corporation<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: Spikes<br />
* Abstract: Most communication in the brain is via spikes. While we understand the spike-generation mechanism of individual neurons, we fail to appreciate the spike-timing code and its role in neural computations. The speaker starts with simple models of neuronal spiking and bursting, describes small neuronal circuits that learn spike-timing code via spike-timing dependent plasticity (STDP), and finishes with biologically detailed and anatomically accurate large-scale brain models.<br />
<br />
'''29 Jan 2013'''<br />
* Speaker: Goren Gordon<br />
* Affiliation: Weizmann Institute<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: Hierarchical Curiosity Loops – Model, Behavior and Robotics<br />
* Abstract: Autonomously learning about one's own body and its interaction with the environment is a formidable challenge, yet it is ubiquitous in biology: every animal pup and every human infant accomplishes this task in their first few months of life. Furthermore, biological agents’ curiosity actively drives them to explore and experiment in order to expedite their learning progress. To bridge the gap between biological and artificial agents, a formal mathematical theory of curiosity was developed that attempts to explain observed biological behaviors and enable curiosity emergence in robots. In the talk, I will present the hierarchical curiosity loops model, its application to rodents' exploratory behavior, and its implementation in a fully autonomously learning and behaving reaching robot.<br />
<br />
'''29 Jan 2013'''<br />
* Speaker: Jenny Read<br />
* Affiliation: Institute of Neuroscience, Newcastle University<br />
* Host: Sarah<br />
* Status: confirmed<br />
* Title: Stereoscopic vision<br />
* Abstract: [To be written]<br />
<br />
'''7 Feb 2013'''<br />
* Speaker: Valero Laparra<br />
* Affiliation: University of Valencia<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Empirical statistical analysis of phases in Gabor filtered natural images<br />
* Abstract:<br />
<br />
'''20 Feb 2013'''<br />
* Speaker: Dolores Bozovic<br />
* Affiliation: UCLA<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: Bifurcations and phase-locking dynamics in the auditory system<br />
* Abstract: The inner ear constitutes a remarkable biological sensor that exhibits nanometer-scale sensitivity of mechanical detection. The first step in auditory processing is performed by hair cells, which convert movement into electrical signals via opening of mechanically gated ion channels. These cells operate in a viscous medium, but can nevertheless sustain oscillations, amplify incoming signals, and even exhibit spontaneous motility, indicating the presence of an underlying active amplification system. Theoretical models have proposed that a hair cell constitutes a nonlinear system with an internal feedback mechanism that can drive it across a bifurcation and into an unstable regime. Our experiments explore the nonlinear response as well as feedback mechanisms that enable self-tuning already at the peripheral level, as measured in vitro on sensory tissue. A simple dynamical systems framework will be discussed that captures the main features of the experimentally observed behavior in the form of an Arnold Tongue.<br />
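The phase-locking phenomenon behind an Arnold tongue can be illustrated with the standard sine-circle map, a textbook minimal model of a periodically forced oscillator; this is a generic stand-in, not the speaker's hair-cell model, and the parameter values are illustrative.

```python
import math

def winding(omega, K, n=10000):
    """Average rotation per iterate of the (lifted) sine-circle map
    theta -> theta + omega + (K / 2pi) * sin(2pi * theta)."""
    theta = 0.0
    for _ in range(n):
        theta += omega + (K / (2 * math.pi)) * math.sin(2 * math.pi * theta)
    return theta / n

# Without forcing (K = 0) the winding number simply equals the detuning
# omega; at critical forcing (K = 1) a detuning near 1/2 falls inside the
# 1:2 Arnold tongue and the winding number locks exactly to 1/2.
print(winding(0.495, 0.0))   # ~0.495: unlocked
print(winding(0.495, 1.0))   # ~0.5: mode-locked
```

The signature of a tongue is that the winding number stays pinned at a rational value over a finite range of detunings once the forcing is strong enough, which is the structure the experiments map out for driven hair bundles.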
<br />
'''27 March 2013'''<br />
* Speaker: Dale Purves<br />
* Affiliation: Duke<br />
* Host: Sarah<br />
* Status: confirmed<br />
* Title: How Visual Evolution Determines What We See<br />
* Abstract: Information about the physical world is excluded from visual stimuli by the nature of biological vision (the inverse optics problem). Nonetheless, humans and other visual animals routinely succeed in their environments. The talk will explain how the assignment of perceptual values to visual stimuli according to the frequency of occurrence of stimulus patterns resolves the inverse problem and determines the basic visual qualities we see. This interpretation of vision implies that the best (and perhaps the only) way to understand visual system circuitry is to evolve it, an idea supported by recent work.<br />
<br />
'''9 April 2013'''<br />
* Speaker: Mounya Elhilali<br />
* Affiliation: Johns Hopkins<br />
* Host: Tyler<br />
* Status: confirmed<br />
* Title: Attention at the cocktail party: Neural bases and computational strategies for auditory scene analysis<br />
* Abstract: The perceptual organization of sounds in the environment into coherent objects is a feat constantly facing the auditory system. It manifests itself in the everyday challenge faced by humans and animals alike to parse complex acoustic information arising from multiple sound sources into separate auditory streams. While seemingly effortless, uncovering the neural mechanisms and computational principles underlying this remarkable ability remain a challenge for both the experimental and theoretical neuroscience communities. In this talk, I discuss the potential role of neuronal tuning in mammalian primary auditory cortex in mediating this process. I also examine the role of mechanisms of attention in adapting this neural representation to reflect both the sensory content and the changing behavioral context of complex acoustic scenes.<br />
<br />
'''17th of April 2013'''<br />
* Speaker: Wiktor Młynarski<br />
* Affiliation: Max Planck Institute for Mathematics in the Sciences<br />
* Host: Urs<br />
* Status: confirmed<br />
* Title: Statistical Models of Binaural Sounds<br />
* Abstract: The auditory system exploits disparities in the sounds arriving at the left and right ear to extract information about the spatial configuration of sound sources. According to the widely acknowledged Duplex Theory, sounds of low frequency are localized based on Interaural Time Differences (ITDs) and localization of high frequency sources relies on Interaural Level Differences (ILDs). Natural sounds, however, possess a rich structure and contain multiple frequency components. This leads to the question: what are the contributions of different cues to sound position identification in the natural environment and how much information do they carry about its spatial structure? In this talk, I will present my attempts to answer the above questions using statistical, generative models of naturalistic (simulated) and fully natural binaural sounds.<br />
<br />
'''15 May 2013'''<br />
* Speaker: Byron Yu<br />
* Affiliation: CMU<br />
* Host: Bruno/Jose (jointly sponsored with CNEP)<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''22 May 2013'''<br />
* Speaker: Bijan Pesaran<br />
* Affiliation: NYU<br />
* Host: Bruno/Jose (jointly sponsored with CNEP)<br />
* Status: confirmed <br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
=== 2011/12 academic year ===<br />
<br />
'''15 Sep 2011 (Thursday, at noon)'''<br />
* Speaker: Kathrin Berkner<br />
* Affiliation: Ricoh Innovations Inc.<br />
* Host: Ivana Tosic<br />
* Status: Confirmed<br />
* Title: TBD<br />
* Abstract: TBD<br />
<br />
'''21 Sep 2011'''<br />
* Speaker: Mike Kilgard<br />
* Affiliation: UT Dallas<br />
* Host: Michael Silver<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''27 Sep 2011'''<br />
* Speaker: Moshe Gur<br />
* Affiliation: Dept. of Biomedical Engineering, Technion, Israel Institute of Technology<br />
* Host: Bruno/Stan<br />
* Status: Confirmed<br />
* Title: On the unity of perception: How does the brain integrate activity evoked at different cortical loci?<br />
* Abstract: Any physical device we know, including computers, when comparing A to B must send the information to point C. I have done experiments in three modalities, somatosensory, auditory, and visual, in which two different loci in primary cortex are stimulated, and I argue that the "machine" converging hypothesis cannot explain the perceptual results. Thus we must assume a non-converging mechanism whereby the brain, at times, can compare (integrate, process) events that take place at different loci without sending the information to a common target. Once we allow for such a mechanism, many phenomena can be viewed differently. Take, for example, the question of how and where multi-sensory integration takes place: we perceive a synchronized talking face, yet detailed visual and auditory information are represented at very different brain loci.<br />
<br />
'''5 Oct 2011'''<br />
* Speaker: Susanne Still<br />
* Affiliation: University of Hawaii at Manoa<br />
* Host: Jascha<br />
* Status: confirmed<br />
* Title: Predictive power, memory and dissipation in learning systems operating far from thermodynamic equilibrium<br />
* Abstract: Understanding the physical processes that underlie the functioning of biological computing machinery often requires describing processes that occur far from thermodynamic equilibrium. In recent years significant progress has been made in this area, most notably through Jarzynski's work relation and Crooks' fluctuation theorem. In this talk I will explore how dissipation of energy is related to a system's information-processing inefficiency. The focus is on driven systems that are embedded in a stochastic operating environment. If we describe the system as a state machine, then we can interpret the stochastic dynamics as performing a computation that results in an (implicit) model of the stochastic driving signal. I will show that instantaneous non-predictive information, which serves as a measure of model inefficiency, provides a lower bound on the average dissipated work. This implies that learning systems with larger predictive power can operate more energy-efficiently. We might speculate that biological systems have evolved to reflect this kind of adaptation. One interesting insight is that this requirement, derived from purely physical notions, is perfectly in line with the general belief that a useful model must be predictive (at fixed model complexity). Our result thereby ties together ideas from learning theory with basic non-equilibrium thermodynamics.<br />
<br />
'''19 Oct 2011'''<br />
* Speaker: Graham Cummins<br />
* Affiliation: WSU<br />
* Host: Jeff Teeters<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''26 Oct 2011'''<br />
* Speaker: Shinji Nishimoto<br />
* Affiliation: Gallant lab, UC Berkeley<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''14 Dec 2011'''<br />
* Speaker: Austin Roorda<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: How the unstable eye sees a stable and moving world<br />
* Abstract:<br />
<br />
'''11 Jan 2012'''<br />
* Speaker: Ken Nakayama<br />
* Affiliation: Harvard University<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Subjective Contours<br />
* Abstract: The concept of the receptive field in visual science has been transformative. It fueled the great discoveries of the second half of the 20th century, providing the dominant understanding of how the visual system works at its early stages. Its reign has been extended to the field of object recognition, where, in the form of a linear classifier, it provides a framework for understanding visual object recognition (DiCarlo and Cox, 2007).<br />
Untamed, however, are areas of visual perception, now more or less ignored, dubbed variously the 2.5-D sketch, mid-level vision, or surface representations. Here, neurons with their receptive fields seem unable to bridge the gap, to supply us with even a plausible speculative framework for understanding amodal completion, subjective contours, and other surface phenomena. Correspondingly, these areas have become a backwater, ignored, leapt over.<br />
Subjective contours, however, remain as vivid as ever, even more so.<br />
Every day, our visual system makes countless visual inferences as to the layout of the world's surfaces and objects. What's remarkable is that subjective contours visibly reveal these inferences.<br />
<br />
'''Tuesday, 24 Jan 2012'''<br />
* Speaker: Aniruddha Das<br />
* Affiliation: Columbia University<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''22 Feb 2012'''<br />
* Speaker: Elad Schneidman <br />
* Affiliation: Department of Neurobiology, Weizmann Institute of Science<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Sparse high order interaction networks underlie learnable neural population codes<br />
* Abstract:<br />
<br />
'''29 Feb 2012 (at noon as usual)'''<br />
* Speaker: Heather Read<br />
* Affiliation: U. Connecticut<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: "Transformation of sparse temporal coding from auditory colliculus and cortex"<br />
* Abstract: TBD<br />
<br />
'''1 Mar 2012 (note: Thurs)'''<br />
* Speaker: Daniel Zoran<br />
* Affiliation: Hebrew University, Jerusalem<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''7 Mar 2012'''<br />
* Speaker: David Sivak<br />
* Affiliation: UCB<br />
* Host: Mike DeWeese<br />
* Status: Confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''8 Mar 2012'''<br />
* Speaker: Ivan Schwab<br />
* Affiliation: UC Davis<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Evolution's Witness: How Eyes Evolved<br />
* Abstract:<br />
<br />
'''14 Mar 2012'''<br />
* Speaker: David Sussillo<br />
* Affiliation:<br />
* Host: Jascha<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''18 April 2012'''<br />
* Speaker: Kristofer Bouchard<br />
* Affiliation: UCSF<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Cortical Foundations of Human Speech Production<br />
* Abstract:<br />
<br />
'''23 May 2012''' (rescheduled from April 11)<br />
* Speaker: Logan Grosenick<br />
* Affiliation: Stanford, Deisseroth & Suppes Labs<br />
* Host: Jascha<br />
* Status: confirmed<br />
* Title: Acquisition, creation, & analysis of 4D light fields with applications to calcium imaging & optogenetics<br />
* Abstract: In Light Field Microscopy (LFM), images can be computationally refocused after they are captured [1]. This permits acquiring focal stacks and reconstructing volumes from a single camera frame. In Light Field Illumination (LFI), the same ideas can be used to create an illumination system that can deliver focused light to any position in a volume without moving optics, and these two devices (LFM/LFI) can be used together in the same system [2]. So far, these imaging and illumination systems have largely been used independently in proof-of-concept experiments [1,2]. In this talk I will discuss applications of a combined scanless volumetric imaging and volumetric illumination system applied to 4D calcium imaging and photostimulation of neurons in vivo and in vitro. The volumes resulting from these methods are large (>500,000 voxels per time point), collected at 10-100 frames per second, and highly correlated in space and time. Analyzing such data has required the development and application of machine learning methods appropriate to large, sparse, nonnegative data, as well as the estimation of neural graphical models from calcium transients. This talk will cover the reconstruction and creation of volumes in a microscope using Light Fields [1,2], and the current state-of-the-art for analyzing these large volumes in the context of calcium imaging and optogenetics. <br />
<br />
[1] M. Levoy, R. Ng, A. Adams, M. Footer, and M. Horowitz. Light Field Microscopy. ACM Transactions on Graphics 25(3), Proceedings of SIGGRAPH 2006.<br />
[2] M. Levoy, Z. Zhang, and I. McDowall. Recording and controlling the 4D light field in a microscope. Journal of Microscopy, Volume 235, Part 2, 2009, pp. 144-162. Cover article.<br />
<br />
BIO: Logan received bachelor's degrees with honors in Biology and Psychology and a master's in Statistics from Stanford. He is a Ph.D. candidate in the Neurosciences Program working in the labs of Karl Deisseroth and Patrick Suppes, and a trainee at the Stanford Center for Mind, Brain, and Computation. He is interested in developing and applying novel computational imaging and machine learning techniques in order to observe, control, and understand neuronal circuit dynamics.<br />
<br />
'''7 June 2012''' (Thursday)<br />
* Speaker: Mitya Chklovskii<br />
* Affiliation: Janelia<br />
* Host: Bruno<br />
* Status:<br />
* Title:<br />
* Abstract:<br />
<br />
'''27 June 2012''' <br />
* Speaker: Jerry Feldman<br />
* Affiliation:<br />
* Host: Bruno<br />
* Status:<br />
* Title:<br />
* Abstract:<br />
<br />
'''30 July 2012''' <br />
* Speaker: Lucas Theis<br />
* Affiliation: Matthias Bethge lab, Werner Reichardt Centre for Integrative Neuroscience, Tübingen<br />
* Host: Jascha<br />
* Status: Confirmed<br />
* Title: Hierarchical models of natural images<br />
* Abstract: Probabilistic models of natural images have been used to solve a variety of computer vision tasks as well as a means to better understand the computations performed by the visual system in the brain. Many theoretical considerations and biological observations suggest that natural image models should be hierarchically organized, yet to date the best-known models are still based on what are better described as shallow representations. In this talk, I will present two image models. One is based on the idea of Gaussianization for greedily constructing hierarchical generative models. I will show that when combined with independent subspace analysis, it is able to compete with the state of the art for modeling image patches. The other model combines mixtures of Gaussian scale mixtures with a directed graphical model and multiscale image representations, and is able to generate highly structured images of arbitrary size. Evaluating the model's likelihood and comparing it to a large number of other image models shows that it might well be the best model for natural images yet.<br />
<br />
(joint work with Reshad Hosseini and Matthias Bethge)<br />
<br />
=== 2010/11 academic year ===<br />
<br />
'''02 Sep 2010'''<br />
* Speaker: Johannes Burge<br />
* Affiliation: University of Texas at Austin<br />
* Host: Jimmy<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''8 Sep 2010'''<br />
* Speaker: Tobi Szuts<br />
* Affiliation: Meister Lab/ Harvard U.<br />
* Host: Mike DeWeese<br />
* Status: Confirmed<br />
* Title: Wireless recording of neural activity in the visual cortex of a freely moving rat.<br />
* Abstract: Conventional neural recording systems restrict behavioral experiments to a flat indoor environment compatible with the cable that tethers the subject to the recording instruments. To overcome these constraints, we developed a wireless multi-channel system for recording neural signals from a freely moving animal the size of a rat or larger. The device takes up to 64 voltage signals from implanted electrodes, samples each at 20 kHz, time-division multiplexes them onto a single output line, and transmits that output by radio frequency to a receiver and recording computer more than 60 m away. The system introduces less than 4 µV RMS of electrode-referred noise, comparable to wired recording systems and considerably less than biological noise. The system has a greater channel count or transmission distance than existing telemetry systems. The wireless system has been used to record from the visual cortex of a rat during unconstrained conditions. Outdoor recordings show that V1 activity is modulated by nest-building activity. During unguided behavior indoors, neurons responded rapidly and consistently to changes in light level, suppressive effects were prominent in response to an illuminant transition, and firing rate was strongly modulated by locomotion. Neural firing in the visual cortex is relatively sparse, and moderate correlations are observed over large distances, suggesting that synchrony is driven by global processes.<br />
<br />
'''29 Sep 2010'''<br />
* Speaker: Vikash Gilja<br />
* Affiliation: Stanford University<br />
* Host: Charles<br />
* Status: Confirmed<br />
* Title: Towards Clinically Viable Neural Prosthetic Systems.<br />
* Abstract:<br />
<br />
'''20 Oct 2010'''<br />
* Speaker: Alexandre Francois<br />
* Affiliation: USC<br />
* Host: <br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''3 Nov 2010'''<br />
* Speaker: Eric Jonas and Vikash Mansinghka<br />
* Affiliation: Navia Systems<br />
* Host: Jascha<br />
* Status: Confirmed<br />
* Title: Natively Probabilistic Computation: Principles, Artifacts, Architectures and Applications<br />
* Abstract: Complex probabilistic models and Bayesian inference are becoming increasingly critical across science and industry, especially in large-scale data analysis. They are also central to our best computational accounts of human cognition, perception and action. However, all these efforts struggle with the infamous curse of dimensionality. Rich probabilistic models can seem hard to write and even harder to solve, as specifying and calculating probabilities often appears to require the manipulation of exponentially (and sometimes infinitely) large tables of numbers.<br />
<br />
We argue that these difficulties reflect a basic mismatch between the needs of probabilistic reasoning and the deterministic, functional orientation of our current hardware, programming languages and CS theory. To mitigate these issues, we have been developing a stack of abstractions for natively probabilistic computation, based around stochastic simulators (or samplers) for distributions, rather than evaluators for deterministic functions. Ultimately, our aim is to produce a model of computation and the associated hardware and programming tools that are as suited for uncertain inference and decision-making as our current computers are for precise arithmetic.<br />
<br />
In this talk, we will give an overview of the entire stack of abstractions supporting natively probabilistic computation, with technical detail on several hardware and software artifacts we have implemented so far. We will also touch on some new theoretical results regarding the computational complexity of probabilistic programs. Throughout, we will motivate and connect this work to some current applications in biomedical data analysis and computer vision, as well as potential hypotheses regarding the implementation of probabilistic computation in the brain.<br />
<br />
This talk includes joint work with Keith Bonawitz, Beau Cronin, Cameron Freer, Daniel Roy and Joshua Tenenbaum.<br />
<br />
BRIEF BIOGRAPHY<br />
<br />
Vikash Mansinghka is a co-founder and the CTO of Navia Systems, a venture-funded startup company building natively probabilistic computing machines. He spent 10 years at MIT, eventually earning an SB in Mathematics, an SB in Computer Science, an MEng in Computer Science, and a PhD in Computation. He held graduate fellowships from the NSF and MIT Lincoln Laboratory, and his PhD dissertation won the 2009 MIT George M. Sprowls award for best dissertation in computer science. He currently serves on DARPA's Information Science and Technology (ISAT) Study Group.<br />
<br />
Eric Jonas is a co-founder of Navia Systems, responsible for in-house accelerated inference research and development. He spent ten years at MIT, where he earned SB degrees in electrical engineering and computer science and in neurobiology, and an MEng in EECS, with a neurobiology PhD expected soon. He's passionate about biological applications of probabilistic reasoning and hopes to use Navia's capabilities to combine data from biological science, clinical histories, and patient outcomes into seamless models.<br />
<br />
'''8 Nov 2010'''<br />
* Speaker: Patrick Ruther<br />
* Affiliation: Imtek, University of Freiburg<br />
* Host: Tim<br />
* Status: Confirmed<br />
* Title: TBD<br />
* Abstract: TBD<br />
<br />
'''10 Nov 2010'''<br />
* Speaker: Aurel Lazar<br />
* Affiliation: Department of Electrical Engineering, Columbia University<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Encoding Visual Stimuli with a Population of Hodgkin-Huxley Neurons<br />
* Abstract: We first present a general framework for the reconstruction of natural video scenes encoded with a population of spiking neural circuits with random thresholds. The visual encoding system consists of a bank of filters, modeling the visual receptive fields, in cascade with a population of neural circuits, modeling encoding with spikes in the early visual system. The neuron models considered include integrate-and-fire neurons and ON-OFF neuron pairs with threshold-and-fire spiking mechanisms. All thresholds are assumed to be random. We show that for both time-varying and space-time-varying stimuli, neural spike encoding is akin to taking noisy measurements on the stimulus. Second, we formulate the reconstruction problem as the minimization of a suitable cost functional in a finite-dimensional vector space and provide an explicit algorithm for stimulus recovery. We also present a general solution using the theory of smoothing splines in Reproducing Kernel Hilbert Spaces. We provide examples for both synthetic video and natural scenes, and show that the quality of the reconstruction degrades gracefully as the threshold variability of the neurons increases. Third, we demonstrate a number of simple operations on the original visual stimulus, including translations, rotations and zooming. All these operations are natively executed in the spike domain. The processed spike trains are decoded for the faithful recovery of the stimulus and its transformations. Finally, we extend the above results to neural encoding circuits built with Hodgkin-Huxley neurons.<br />
References:<br />
Aurel A. Lazar, Eftychios A. Pnevmatikakis and Yiyin Zhou, Encoding Natural Scenes with Neural Circuits with Random Thresholds, Vision Research, 2010, Special Issue on Mathematical Models of Visual Coding, http://dx.doi.org/10.1016/j.visres.2010.03.015<br />
Aurel A. Lazar, Population Encoding with Hodgkin-Huxley Neurons, IEEE Transactions on Information Theory, Volume 56, Number 2, pp. 821-837, February 2010, Special Issue on Molecular Biology and Neuroscience, http://dx.doi.org/10.1109/TIT.2009.2037040<br />
<br />
'''11 Nov 2010''' (UCB holiday)<br />
* Speaker: Martha Nari Havenith<br />
* Affiliation: UCL<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: Finding spike timing in the visual cortex - Oscillations as the internal clock of vision?<br />
* Abstract:<br />
<br />
'''19 Nov 2010''' (note: on Friday because of SFN)<br />
* Speaker: Dan Butts<br />
* Affiliation: UMD<br />
* Host: Tim<br />
* Status: Confirmed<br />
* Title: Common roles of inhibition in visual and auditory processing.<br />
* Abstract: The role of inhibition in sensory processing is often obscured in extracellular recordings, because the absence of a neuronal response associated with inhibition might also be explained by a simple lack of excitation. However, increasingly, evidence from intracellular recordings demonstrates important roles of inhibition in shaping the stimulus selectivity of sensory neurons in both the visual and auditory systems. We have developed a nonlinear modeling approach that can identify putative excitatory and inhibitory inputs to a neuron using standard extracellular recordings, and have applied these techniques to understand the role of inhibition in shaping sensory processing in visual and auditory areas. In pre-cortical visual areas (retina and LGN), we find that inhibition likely plays a role in generating temporally precise responses, and mediates adaptation to changing contrast. In an auditory pre-cortical area (the inferior colliculus), identified inhibition has a nearly identical appearance and functions in temporal processing and adaptation. Thus, we predict common roles of inhibition in these sensory areas, and more generally demonstrate general methods for characterizing the nonlinear computations that comprise sensory processing.<br />
<br />
'''24 Nov 2010'''<br />
* Speaker: Eizaburo Doi<br />
* Affiliation: NYU<br />
* Host: Jimmy<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
<br />
'''29 Nov 2010 - informal talk'''<br />
* Speaker: Eero Lehtonen<br />
* Affiliation: UTU Finland<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Memristors<br />
* Abstract:<br />
<br />
'''1 Dec 2010'''<br />
* Speaker: Gadi Geiger<br />
* Affiliation: MIT<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: Visual and Auditory Perceptual Modes that Characterize Dyslexics<br />
* Abstract: I will describe how dyslexics' visual and auditory perception is wider and more diffuse than that of typical readers, which suggests broader neural tuning in dyslexics. I will also describe how this mode of processing relates to difficulties in reading. Finally, strengthening the argument, and more importantly helping dyslexics, I will describe a regimen of practice that improves reading in dyslexics while narrowing perception.<br />
<br />
<br />
'''13 Dec 2010'''<br />
* Speaker: Jörg Lücke<br />
* Affiliation: FIAS<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Linear and Non-linear Approaches to Component Extraction and Their Applications to Visual Data<br />
* Abstract: In the nervous system of humans and animals, sensory data are represented as combinations of elementary data components. While for data such as sound waveforms the elementary components combine linearly, other data can better be modeled by non-linear forms of component superpositions. I motivate and discuss two models with binary latent variables: one using standard linear superpositions of basis functions and one using non-linear superpositions. Crucial for the applicability of both models are efficient learning procedures. I briefly introduce a novel training scheme (ET) and show how it can be applied to probabilistic generative models. For linear and non-linear models the scheme efficiently infers the basis functions as well as the level of sparseness and data noise. In large-scale applications to image patches, we show results on the statistics of inferred model parameters. Differences between the linear and non-linear models are discussed, and both models are compared to results of standard approaches in the literature and to experimental findings. Finally, I briefly discuss learning in a recent model that takes explicit component occlusions into account.<br />
<br />
'''15 Dec 2010'''<br />
* Speaker: Claudia Clopath<br />
* Affiliation: Université Paris Descartes<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
<br />
'''18 Jan 2011'''<br />
* Speaker: Siwei Lyu<br />
* Affiliation: Computer Science Department, University at Albany, SUNY<br />
* Host: Bruno<br />
* Status: confirmed <br />
* Title: Divisive Normalization as an Efficient Coding Transform: Justification and Evaluation<br />
* Abstract:<br />
<br />
'''19 Jan 2011'''<br />
* Speaker: David Field (informal talk)<br />
* Affiliation: <br />
* Host: Bruno<br />
* Status: Tentative<br />
* Title: <br />
* Abstract:<br />
<br />
'''25 Jan 2011'''<br />
* Speaker: Ruth Rosenholtz<br />
* Affiliation: Dept. of Brain & Cognitive Sciences, Computer Science and AI Lab, MIT<br />
* Host: Bruno<br />
* Status: Confirmed <br />
* Title: What your visual system sees where you are not looking<br />
* Abstract:<br />
<br />
'''26 Jan 2011'''<br />
* Speaker: Ernst Niebur<br />
* Affiliation: Johns Hopkins U<br />
* Host: Fritz<br />
* Status: Confirmed <br />
* Title: <br />
* Abstract:<br />
<br />
'''16 March 2011'''<br />
* Speaker: Vladimir Itskov<br />
* Affiliation: University of Nebraska-Lincoln<br />
* Host: Chris<br />
* Status: Confirmed <br />
* Title: <br />
* Abstract:<br />
<br />
'''23 March 2011'''<br />
* Speaker: Bruce Cumming<br />
* Affiliation: National Institutes of Health<br />
* Host: Ivana<br />
* Status: Confirmed<br />
* Title: TBD<br />
* Abstract:<br />
<br />
'''27 April 2011'''<br />
* Speaker: Lubomir Bourdev<br />
* Affiliation: Computer Science, UC Berkeley<br />
* Host:Bruno<br />
* Status: Confirmed<br />
* Title: "Poselets and Their Applications in High-Level Computer Vision Problems"<br />
* Abstract:<br />
<br />
'''12 May 2011 (note: Thursday)'''<br />
* Speaker: Jack Culpepper<br />
* Affiliation: Redwood Center/EECS<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''26 May 2011'''<br />
* Speaker: Ian Stevenson<br />
* Affiliation: Northwestern University<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Explaining tuning curves by estimating interactions between neurons<br />
* Abstract: One of the central tenets of systems neuroscience is that tuning curves are a byproduct of the interactions between neurons. Using multi-electrode recordings and recently developed inference techniques we can begin to examine this idea in detail and study how well we can explain the functional properties of neurons using the activity of other simultaneously recorded neurons. Here we examine datasets from 6 different brain areas recorded during typical sensorimotor tasks each with ~100 simultaneously recorded neurons. Using these datasets we measured the extent to which interactions between neurons can explain the tuning properties of individual neurons. We found that, in almost all areas, modeling interactions between 30-50 neurons allows more accurate spike prediction than tuning curves. This suggests that tuning can, in some sense, be explained by interactions between neurons in a variety of brain areas, even when recordings consist of relatively small numbers of neurons.<br />
<br />
'''1 June 2011'''<br />
* Speaker: Michael Oliver<br />
* Affiliation: Gallant lab<br />
* Host: Bruno<br />
* Status: Tentative <br />
* Title: <br />
* Abstract:<br />
<br />
'''8 June 2011'''<br />
* Speaker: Alyson Fletcher<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: tentative<br />
* Title: Generalized Approximate Message Passing for Neural Receptive Field Estimation and Connectivity<br />
* Abstract: Fundamental to understanding sensory encoding and connectivity of neurons are effective tools for developing and validating complex mathematical models from experimental data. In this talk, I present a graphical models approach to the problems of neural connectivity reconstruction under multi-neuron excitation and of receptive field estimation of sensory neurons in response to stimuli. I describe a new class of Generalized Approximate Message Passing (GAMP) algorithms for a general class of inference problems on graphical models, based on Gaussian approximations of loopy belief propagation. The GAMP framework is extremely general, provides a systematic procedure for incorporating a rich class of nonlinearities, and is computationally tractable with large amounts of data. In addition, for both the connectivity reconstruction and parameter estimation problems, I show that GAMP-based estimation can naturally incorporate sparsity constraints in the model that arise from the fact that only a small fraction of the potential inputs have any influence on the output of a particular neuron. A simulation of reconstruction of cortical neural mapping under multi-neuron excitation shows that GAMP offers improvement over previous compressed sensing methods. The GAMP method is also validated on estimation of linear-nonlinear-Poisson (LNP) cascade models for neural responses of salamander retinal ganglion cells.<br />
<br />
=== 2009/10 academic year ===<br />
<br />
'''2 September 2009''' <br />
* Speaker: Keith Godfrey<br />
* Affiliation: University of Cambridge<br />
* Host: Tim<br />
* Status: Confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''7 October 2009'''<br />
* Speaker: Anita Schmid<br />
* Affiliation: Cornell University<br />
* Host: Kilian<br />
* Status: Confirmed<br />
* Title: Subpopulations of neurons in visual area V2 perform differentiation and integration operations in space and time<br />
* Abstract: The interconnected areas of the visual system work together to find object boundaries in visual scenes. Primary visual cortex (V1) mainly extracts oriented luminance boundaries, while secondary visual cortex (V2) also detects boundaries defined by differences in texture. How the outputs of V1 neurons are combined to allow for the extraction of these more complex boundaries in V2 is as yet unclear. To address this question, we probed the processing of orientation signals in single neurons in V1 and V2, focusing on response dynamics of neurons to patches of oriented gratings and to combinations of gratings in neighboring patches and sequential time frames. We found two kinds of response dynamics in V2, both of which are different from those of V1 neurons. While V1 neurons in general prefer one orientation, one subpopulation of V2 neurons ("transient") shows a temporally dynamic preference, resulting in a preference for changes in orientation. The second subpopulation of V2 neurons ("sustained") responds similarly to V1 neurons, but with a delay. The dynamics of nonlinear responses to combinations of gratings reinforce these distinctions: the dynamics enhance the preference of V1 neurons for continuous orientations, and enhance the preference of V2 transient neurons for discontinuous ones. We propose that transient neurons in V2 perform a differentiation operation on the V1 input, both spatially and temporally, while the sustained neurons perform an integration operation. We show that a simple feedforward network with delayed inhibition can account for the temporal but not for the spatial differentiation operation.<br />
<br />
'''28 October 2009'''<br />
* Speaker: Andrea Benucci<br />
* Affiliation: Institute of Ophthalmology, University College London<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Stimulus dependence of the functional connectivity between neurons in primary visual cortex<br />
* Abstract: It is known that visual stimuli are encoded by the concerted activity of large populations of neurons in visual cortical areas. However, it is only recently that recording techniques have been made available to study such activations from large ensembles of neurons simultaneously, with millisecond temporal precision and tens of microns spatial resolution. I will present data from voltage-sensitive dye (VSD) imaging and multi-electrode recordings (“Utah” probes) from the primary visual cortex of the cat (V1). I will discuss the relationship between two fundamental cortical maps of the visual system: the map of retinotopy and the map of orientation. Using spatially localized and full-field oriented stimuli, we studied the functional interdependency of these maps. I will describe traveling and standing waves of cortical activity and their key role as a dynamical substrate for the spatio-temporal coding of visual information. I will further discuss the properties of the spatio-temporal code in the context of continuous visual stimulation. While recording population responses to a sequence of oriented stimuli, we asked how responses to individual stimuli summate over time. We found that such rules are mostly linear, supporting the idea that spatial and temporal codes in area V1 operate largely independently. However, these linear rules of summation fail when the visual drive is removed, suggesting that the visual cortex can readily switch between a dynamical regime where either feed-forward or intra-cortical inputs determine the response properties of the network.<br />
<br />
'''12 November 2009 (Thursday)'''<br />
* Speaker: Song-Chun Zhu<br />
* Affiliation: UCLA<br />
* Host: Jimmy<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''18 November 2009'''<br />
* Speaker: Dan Graham<br />
* Affiliation: Dept. of Mathematics, Dartmouth College<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: The Packet-Switching Brain: A Hypothesis<br />
* Abstract: Despite great advances in our understanding of neural responses to natural stimuli, the basic structure of the neural code remains elusive. In this talk, I will describe a novel hypothesis regarding the fundamental structure of neural coding in mammals. In particular, I propose that an internet-like routing architecture (specifically packet-switching) underlies neocortical processing, and I propose means of testing this hypothesis via neural response sparseness measurements. I will synthesize a host of suggestive evidence that supports this notion and will, more generally, argue in favor of a large scale shift from the now dominant “computer metaphor,” to the “internet metaphor.” This shift is intended to spur new thinking with regard to neural coding, and its main contribution is to privilege communication over computation as the prime goal of neural systems.<br />
<br />
'''16 December 2009'''<br />
* Speaker: Pietro Berkes<br />
* Affiliation: Volen Center for Complex Systems, Brandeis University<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Generative models of vision: from sparse coding toward structured models<br />
* Abstract: From a computational perspective, one can think of visual perception as the problem of analyzing the light patterns detected by the retina to recover their external causes. This process requires combining the incoming sensory evidence with internal prior knowledge about general properties of visual elements and the way they interact, and can be formalized in a class of models known as causal generative models. In the first part of the talk, I will discuss the first and most established generative model, namely the sparse coding model. Sparse coding has been largely successful in showing how the main characteristics of simple cells receptive fields can be accounted for based uniquely on the statistics of natural images. I will briefly review the evidence supporting this model, and contrast it with recent data from the primary visual cortex of ferrets and rats showing that the sparseness of neural activity over development and anesthesia seems to follow trends opposite to those predicted by sparse coding. In the second part, I will argue that the generative point of view calls for models of natural images that take into account more of the structure of the visual environment. I will present a model that takes a first step in this direction by incorporating the fundamental distinction between identity and attributes of visual elements. After learning, the model mirrors several aspects of the organization of V1, and results in a novel interpretation of complex and simple cells as parallel population of cells, coding for different aspects of the visual input. Further steps toward more structured generative models might thus lead to the development of a more comprehensive account of visual processing in the visual cortex.<br />
<br />
'''6 January 2010'''<br />
* Speaker: Susanne Still<br />
* Affiliation: U of Hawaii<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''20 January 2010'''<br />
* Speaker: Tom Dean<br />
* Affiliation: Google<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Accelerating Computer Vision and Machine Learning Algorithms with Graphics Processors<br />
* Abstract: Graphics processors (GPUs) and massively-multi-core architectures are becoming more powerful, less costly and more energy efficient, and the related programming language issues are beginning to sort themselves out. That said most researchers don’t want to be writing code that depends on any particular architecture or parallel programming model. Linear algebra, Fourier analysis and image processing have standard libraries that are being ported to exploit SIMD parallelism in GPUs. We can depend on the massively-multiple-core machines du jour to support these libraries and on the high-performance-computing (HPC) community to do the porting for us or with us. These libraries can significantly accelerate important applications in image processing, data analysis and information retrieval. We can develop APIs and the necessary run-time support so that code relying on these libraries will run on any machine in a cluster of computers but exploit GPUs whenever available. This strategy allows us to move toward hybrid computing models that enable a wider range of opportunities for parallelism without requiring the special training of programmers or the disadvantages of developing code that depends on specialized hardware or programming models. This talk summarizes the state of the art in massively-multi-core architectures, presents experimental results that demonstrate the potential for significant performance gains in the two general areas of image processing and machine learning, provides examples of the proposed programming interface, and some more detailed experimental results on one particular problem involving video-content analysis.<br />
<br />
'''27 January 2010'''<br />
* Speaker: David Philiponna<br />
* Affiliation: Paris<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''24 February 2010'''<br />
* Speaker: Gordon Pipa<br />
* Affiliation: U Osnabrueck/MPI Frankfurt<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''3 March 2010'''<br />
* Speaker: Gaute Einevoll<br />
* Affiliation: UMB, Norway<br />
* Host: Amir<br />
* Status: Confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
<br />
'''4 March 2010'''<br />
* Speaker: Harvey Swadlow<br />
* Affiliation: <br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''8 April 2010'''<br />
* Speaker: Alan Yuille <br />
* Affiliation: UCLA<br />
* Host: Amir<br />
* Status: Confirmed (for 1pm)<br />
* Title: <br />
* Abstract:<br />
<br />
'''28 April 2010'''<br />
* Speaker: Dharmendra Modha - cancelled<br />
* Affiliation: IBM<br />
* Host:Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''5 May 2010'''<br />
* Speaker: David Zipser<br />
* Affiliation: UCB<br />
* Host: Daniel Little<br />
* Status: Tentative<br />
* Title: Brytes 2<br />
* Abstract: Brytes are little brains that can be assembled into larger, smarter brains. In my first talk I presented a biologically plausible, computationally tractable model of brytes and described how they can be used as subunits to build brains with interesting behaviors.<br />
<br />
In this talk I will first show how large numbers of brytes can cooperate to perform complicated actions such as arm and hand manipulations in the presence of obstacles. Then I describe a strategy for a higher level of control that informs each bryte what role it should play in accomplishing the current task. These results could have considerable significance for understanding the brain and possibly be applicable to robotics and BMI.<br />
<br />
'''12 May 2010'''<br />
* Speaker: Frank Werblin (Redwood group meeting - internal only)<br />
* Affiliation: Berkeley<br />
* Host: Bruno<br />
* Status: Tentative<br />
* Title: <br />
* Abstract:<br />
<br />
'''19 May 2010'''<br />
* Speaker: Anna Judith<br />
* Affiliation: UCB<br />
* Host: Daniel Little (Redwood Lab Meeting - internal only)<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8505Cluster2016-04-06T02:13:58Z<p>Jesselivezey: /* Hardware Overview */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases are: you have independent jobs that can run in parallel, so several machines will complete the task faster even though any single machine may be no faster than your own laptop; you have a long-running job that may take a day, and you don't want to leave your laptop on at all times and be unable to use it; your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem; or you want to run long GPU computations. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs to the queue (see '''SLURM''' further down on this page for the details). A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Cluster Administration ==<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-non-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a 4TB NetApp file server, which is mounted as scratch space.<br />
<br />
In brief, we have 14 nodes with over 60 cores and 4 GPUs.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so it would be truncated to '''desiredu''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory (/global/home/users/username) usage. Please keep your usage below this limit. There will be NetApp snapshots in place in this file system, so we suggest you store only your source code and scripts in this area and store all your data under /clusterfs/cortex (see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add a -C flag to the command above to enable compression of the data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software as a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ''ssh to a login node'' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM ===<br />
<br />
SLURM is our scheduler, and understanding it well will make your research on the cluster go much more smoothly. SLURM allocates resources for your jobs, and does the same for everyone else, so we are not stepping on each other's toes. There are some dos and don'ts when using SLURM.<br />
<br />
* Logging in -- when you log in to the cluster, you land on a login node. We do not own the login nodes and share them with other members of the Berkeley Research Computing (BRC) community, so it is important not to run anything here *at all*.<br />
<br />
* Information on submitting, monitoring, and reviewing jobs can be found here. You can use many simple bash tricks to submit a large number of embarrassingly parallel jobs on the cluster, which is great for parameter sweeps. <br />
<br />
* Storage -- every user gets a 10GB quota gratis from the BRC; this is your home folder, where you land when you log in. In addition, there is a 20TB scratch space (/clusterfs/cortex/scratch) shared by all members of the Redwood Center. We keep a log of how much space is being used by each member who writes into the scratch folder at (TODO)<br />
<br />
* We have 4 GPU nodes and information on requesting and using them can be found here. When you request a GPU as a resource, you get the whole node along with it. <br />
<br />
* We have a debug queue that can be requested; details can be found here<br />
<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a Matlab job, it would look like:<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the wall time of the job, an upper bound on the estimated runtime; the job will be killed after this time has elapsed. --mem-per-cpu specifies how much memory the job requires per CPU (the default is 1GB). <br />
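For a Python job, an analogous submission script might look like the following sketch (the working directory, module version, and script name are illustrative placeholders, not actual paths on our cluster):<br />

```shell
#!/bin/bash -l
#SBATCH -p cortex
#SBATCH --time=03:30:00
#SBATCH --mem-per-cpu=2G
# Directory and script below are placeholders -- substitute your own.
cd /clusterfs/cortex/scratch/yourusername/yourproject
module load python        # pick a specific version from 'module avail'
python myscript.py
```

Submit it with sbatch exactly as shown above for the Matlab case.<br />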
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be piped to outputfile.txt, and any errors (if the job crashes) to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names, the job descriptor passed to sbatch, runtime, and nodes.<br />
<br />
<br />
To start an interactive session on the cluster (this requires specifying the partition and walltime, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to the cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out which users are using a particular node by ssh-ing into it, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Job Management =<br />
<br />
In order to coordinate our cluster usage patterns fairly, our cluster uses a job manager known as SLURM. If you are planning to run jobs on the cluster you should be using SLURM! Learn how [http://redwood.berkeley.edu/wiki/Cluster_Job_Management here].<br />
<br />
<br />
= Software =<br />
Information on what software is installed on the cluster and how to access it is [http://redwood.berkeley.edu/wiki/Cluster-Software here].<br />
<br />
== Matlab ==<br />
Matlab instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Matlab here].<br />
<br />
== Python ==<br />
Python instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Python here].<br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternative set of scripts for making embarrassingly parallel submissions on the cluster. Note that these scripts use the older PBS syntax (qsub); see the SLURM section above for the current scheduler.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
    #Epsilon<br />
    for j in $(seq .8 .1 $param2);<br />
    do<br />
        #Beta<br />
        for k in $(seq .65 .01 $param3);<br />
        do<br />
            echo $i,$j,$k<br />
            qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
        done<br />
    done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = ",$Epsilon<br />
echo "Leap Size = ",$LeapSize<br />
echo "Beta = ",$Beta<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
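Under SLURM, the same sweep would use sbatch with --export instead of qsub -v. The sketch below (an assumed translation, not an official script from this cluster) does a dry run that prints one submission command per parameter combination, using the same parameter values as iterate.sh above:<br />

```shell
#!/bin/sh
# Hypothetical SLURM translation of iterate.sh: print the sbatch command
# for every (LeapSize, Epsilon, Beta) combination instead of submitting.
submit_sweep () {
    for LeapSize in 14 15 16; do
        for Epsilon in $(seq 0.8 0.1 1.2); do
            for Beta in $(seq 0.65 0.01 0.75); do
                # Drop the leading 'echo ' to actually submit each job.
                echo "sbatch --export=ALL,LeapSize=$LeapSize,Epsilon=$Epsilon,Beta=$Beta param_test.sh"
            done
        done
    done
}
submit_sweep
```

Inside param_test.sh, the exported variables are then available as $LeapSize, $Epsilon, and $Beta, just as with PBS's -v flag.<br />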
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their website [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Seminars&diff=8474Seminars2016-02-13T22:41:36Z<p>Jesselivezey: /* Tentative / Confirmed Speakers */</p>
<hr />
<div>== Instructions ==<br />
<br />
# Check the internal calendar (here) for a free seminar slot. Seminars are usually Wednesdays at noon, but this is flexible in case another day works better for the speaker. However, it is usually best to avoid booking multiple speakers in the same week - it leads to "seminar burnout" and reduced attendance. But use your own judgement here - if it's a good opportunity and that's the only time that works, then go ahead with it.<br />
# Once you have proposed a date to a speaker, fill in the speaker information under the appropriate date (or change if necessary). Use the status field to indicate whether the date is tentative or confirmed. Please also include your name as ''host'' in case somebody wants to contact you.<br />
# Once the invitation is confirmed with the speaker, change the status field to 'confirmed'. Also notify the webmaster (Bruno) [mailto:baolshausen@berkeley.edu] that we have a confirmed speaker so that he/she can update the public web page. Please include a title and abstract.<br />
# Natalie (HWNI) checks our web page regularly and will send out an announcement a week before and also include with the weekly neuro announcements, but if you don't get it confirmed until the last minute then make sure to email Natalie [mailto:nrterranova@berkeley.edu] as well to give her a heads up so she knows to send out an announcement in time.<br />
# If the speaker needs accommodations you should contact Natalie [mailto:nrterranova@berkeley.edu] to reserve a room at the faculty club. Tell her it's for a Redwood speaker so she knows how to bill it.<br />
# During the visit you will need to look after the visitor, schedule visits with other labs, make plans for lunch, dinner, etc., and introduce the speaker at the seminar (don't ask Bruno to do this at the last moment). Save receipts for any meals you paid for.<br />
# After the seminar and before the speaker leaves, make sure to give them Natalie's contact info and have them email her their receipts, explaining it's for reimbursement for a Redwood seminar. Natalie will then process the reimbursement. She can also help you with getting reimbursed for any expenses you incurred for meals and entertainment.<br />
<br />
== Tentative / Confirmed Speakers ==<br />
<br />
<br />
'''Feb 3, 2016'''<br />
* Speaker: Ping-Chen Huang<br />
* Affiliation: Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
'''Feb 17, 2016'''<br />
* Speaker: Andrew Saxe<br />
* Affiliation: Harvard<br />
* Host: Jesse<br />
* Status: confirmed<br />
* Title: Hallmarks of Deep Learning in the Brain<br />
<br />
'''Mar 1, 2016'''<br />
* Speaker: Leon Gatys<br />
* Affiliation: Univ Tubingen<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
'''Mar 7-9, 2016'''<br />
* NICE workshop<br />
<br />
'''Mar 23, 2016'''<br />
* Speaker: Kwabena Boahen<br />
* Affiliation: Stanford<br />
* Host: Max Kanwal/Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
'''Mar 30, 2016'''<br />
* Tony Zador HWNI talk at 12:00<br />
<br />
'''May 18, 2016'''<br />
* Speaker: Melanie Mitchell<br />
* Affiliation: Portland State University<br />
* Host: Dylan<br />
* Status: confirmed<br />
* Title:<br />
<br />
== Previous Seminars ==<br />
<br />
=== 2015/16 academic year ===<br />
<br />
'''July 21, 2015'''<br />
* Speaker: Felix Effenberger<br />
* Affiliation: <br />
* Host: Chris H.<br />
* Status: confirmed<br />
* Title: <br />
* Abstract<br />
<br />
'''July 22, 2015'''<br />
* Speaker: Lav Varshney<br />
* Affiliation: Urbana-Champaign<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract<br />
<br />
'''July 23, 2015'''<br />
* Speaker: Xuemin Wei<br />
* Affiliation: Univ Penn<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract<br />
<br />
'''July 29, 2015'''<br />
* Speaker: Gonzalo Otazu<br />
* Affiliation: Cold Spring Harbor Laboratory, Long Island, NY<br />
* Host: Mike D<br />
* Status: Confirmed<br />
* Title: The Role of Cortical Feedback in Olfactory Processing<br />
* Abstract: The olfactory bulb receives rich glutamatergic projections from the piriform cortex. However, the dynamics and importance of these feedback signals remain unknown. In the first part of this talk, I will present data from multiphoton calcium imaging of cortical feedback in the olfactory bulb of awake mice. Responses of feedback boutons were sparse, odor specific, and often outlasted stimuli by several seconds. Odor presentation either enhanced or suppressed the activity of boutons. However, any given bouton responded with stereotypic polarity across multiple odors, preferring either enhancement or suppression. Inactivation of piriform cortex increased odor responsiveness and pairwise similarity of mitral cells but had little impact on tufted cells. We propose that cortical feedback differentially impacts these two output channels of the bulb by specifically decorrelating mitral cell responses to enable odor separation. In the second part of the talk I will introduce a computational model of odor identification in natural scenes that uses cortical feedback and how the model predictions match our experimental data.<br />
<br />
'''Aug 19, 2015'''<br />
* Speaker: Wujie Zhang<br />
* Affiliation: Columbia<br />
* Host: Bruno/Michael Yartsev<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''Sept 2, 2015'''<br />
* Speaker: Jeremy Maitin-Shepard<br />
* Affiliation: Computer Science, UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Combinatorial Energy Learning for Image Segmentation<br />
* Abstract: Recent advances in volume electron microscopy make it possible to image neuronal tissue volumes containining hundreds of thousands of neurons at sufficient resolution to discern even the finest neuronal processes. Accurate 3-D segmentation of these processes densely packed in these petavoxel-scale volumes is the key bottleneck in reconstructing large-scale neural circuits.<br />
<br />
'''Sept 8, 2015'''<br />
* Speaker: Jennifer Hasler<br />
* Affiliation: Georgia Tech<br />
* Host: Bruno/Mika<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''October 29, 2015'''<br />
* Speaker: Garrett Kenyon<br />
* Affiliation: Los Alamos National Laboratory<br />
* Host: Dylan<br />
* Status: confirmed<br />
* Title: A Deconvolutional Competitive Algorithm (DCA)<br />
* Abstract: The Locally Competitive Algorithm (LCA) is a neurally-plausible sparse solver based on lateral inhibition between leaky integrator neurons. LCA accounts for many linear and nonlinear response properties of V1 simple cells, including end-stopping and contrast-invariant orientation tuning. Here, we describe a convolutional implementation of LCA in which a column of feature vectors is replicated with a stride that is much smaller than the diameter of the corresponding kernels, allowing the construction of dictionaries that are many times more overcomplete than without replication. Using a local Hebbian rule that minimizes sparse reconstruction error, we are able to learn representations from unlabeled imagery, including monocular and stereo video streams, that in some cases support near state-of-the-art performance on object detection, action classification and depth estimation tasks, with a simple linear classifier. We further describe a scalable approach to building a hierarchy of convolutional LCA layers, which we call a Deconvolutional Competitive Algorithm (DCA). All layers in a DCA are trained simultaneously and all layers contribute to a single image reconstruction, with each layer deconvolving its representation through all lower layers back to the image plane. We show that a 3-layer DCA trained on short video clips obtained from hand-held cameras exhibits a clear segregation of image content, with features in the top layer reconstructing large-scale structures while features in the middle and bottom layers reconstruct progressively finer details. Lastly, we describe PetaVision, an open source, cloud-friendly, high-performance neural simulation toolbox that was used to perform the numerical studies presented here.<br />
<br />
'''Nov 18, 2015'''<br />
* Speaker: Hillel Adesnik<br />
* Affiliation: Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
'''Nov 17, 2015'''<br />
* Speaker: Manuel Lopez<br />
* Affiliation: <br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: <br />
* Abstract<br />
<br />
'''Dec 2, 2015'''<br />
* Speaker: Steven Brumby<br />
* Affiliation: [http://www.descarteslabs.com/ Descartes Labs]<br />
* Host: Dylan<br />
* Status: confirmed<br />
* Title: Seeing the Earth in the Cloud<br />
* Abstract: The proliferation of transistors has increased the performance of computing systems by over a factor of a million in the past 30 years, and is also dramatically increasing the amount of data in existence, driving improvements in sensor, communication and storage technology. Multi-decadal Earth and planetary remote sensing global datasets at the petabyte scale (8×10^15 bits) are now available in commercial clouds, and new satellite constellations are planning to generate petabytes of images per year, providing daily global coverage at a few meters per pixel. Cloud storage with adjacent high-bandwidth compute, combined with recent advances in neuroscience-inspired machine learning for computer vision, is enabling understanding of the world at a scale and at a level of granularity never before feasible. We report here on a computation processing over a petabyte of compressed raw data from 2.8 quadrillion pixels (2.8 petapixels) acquired by the US Landsat and MODIS programs over the past 40 years. Using commodity cloud computing resources, we convert the imagery to a calibrated, georeferenced, multiresolution tiled format suited for machine-learning analysis. We believe ours is the first application to process, in less than a day, on generally available resources, over a petabyte of scientific image data. We report on work using this reprocessed dataset for experiments demonstrating country-scale food production monitoring, an indicator for famine early warning. <br />
<br />
'''Dec 14, 2015'''<br />
* Speaker: Bill Softky <br />
* Affiliation:<br />
* Host: Bruno<br />
* Status: confirmed <br />
* Title: Screen addition - informal Redwood group seminar<br />
<br />
'''Dec 16, 2015'''<br />
* Speaker: Mike Landy<br />
* Affiliation: Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
=== 2014/15 academic year ===<br />
<br />
'''2 July 2014'''<br />
* Speaker: Kelly Clancy<br />
* Affiliation: Feldman lab<br />
* Host: Guy<br />
* Status: confirmed<br />
* Title: Volitional control of neural assemblies in L2/3 of motor and somatosensory cortices<br />
* Abstract: I'll be talking about a joint effort between the Feldman, Carmena and Costa labs to study abstract task learning by small neuronal assemblies in intact networks. Brain-machine interfaces are a unique tool for studying learning, thanks to the direct mapping between neural activity and reward. We trained mice to operantly control an auditory cursor using spike-related calcium signals recorded with two-photon imaging in motor and somatosensory cortex, allowing us to assess the effects of learning with great spatial detail. Mice rapidly learned to modulate activity in layer 2/3 neurons, evident both across and within sessions. Interestingly, even neurons that exhibited very low or no spontaneous spiking--so-called 'silent' cells that are invisible to electrode-based techniques--could be behaviorally up-modulated for task performance. Learning was accompanied by modifications of firing correlations in spatially localized networks at fine scales.<br />
<br />
'''23 July 2014'''<br />
* Speaker: Gautam Agarwal<br />
* Affiliation: UC Berkeley/Champalimaud<br />
* Host: Friedrich Sommer<br />
* Status: confirmed<br />
* Title: Unsolved Mysteries of Hippocampal Dynamics<br />
* Abstract: Two radically different forms of electrical activity can be observed in the rat hippocampus: spikes and local field potentials (LFPs). Hippocampal pyramidal neurons are mostly silent, yet spike vigorously as the subject encounters particular locations in its environment. In contrast, LFPs appear to lack place-selectivity, persisting regardless of the rat's location. Recently, we found that in fact one can recover from LFPs the spatial information present in the underlying neuronal population, showing how these two signals are two sides of the same coin. Nonetheless, there are many aspects of the LFP that remain mysterious. I will review several observations and explanatory gaps which await further study. These include: the relationship of LFP patterns to anatomy; the elusive structure of gamma waves; complex forms of cross-frequency coupling; variations in LFP patterns seen when the rat explores its world more freely; reconciling the memory and navigation roles of the hippocampus.<br />
<br />
'''6 Aug 2014'''<br />
* Speaker: Georg Martius<br />
* Affiliation: Max Planck Institute, Leipzig<br />
* Host: Fritz Sommer<br />
* Status: confirmed<br />
* Title: Information driven self-organization of robotic behavior<br />
* Abstract: Autonomy is a puzzling phenomenon in nature and a major challenge in the world of artifacts. A key feature of autonomy in both natural and artificial systems is the capacity for independent exploration. In animals and humans, the ability to modify one's own pattern of activity is not only an indispensable trait for adaptation and survival in new situations; it also provides a learning system with novel information for improving its cognitive capabilities, and it is essential for development. Efficient exploration in high-dimensional spaces is a major challenge in building learning systems. We propose to implement exploration as a deterministic law derived from maximizing an information quantity. More specifically, we use the predictive information of the sensor process (of a robot) to obtain an update rule (exploration dynamics) for the controller parameters. To be adequate for robotics applications, the non-stationary nature of the underlying time series has to be taken into account, which we do by proposing the time-local predictive information (TiPI). Importantly, the exploration dynamics is derived analytically, and by this we link information theory and dynamical systems. Without a random component, the change in the parameters is deterministically given as a function of the states in a certain time window. For an embodied system this means in particular that constraints, responses and current knowledge of the dynamical interaction with the environment can directly be used to advance further exploration. Randomness is replaced with spontaneity, which we demonstrate restricts the search space automatically to the physically relevant dimensions. Its effectiveness will be demonstrated with various experiments on high-dimensional robotic systems, and we argue that this is a promising way to avoid the curse of dimensionality. This talk describes joint work with Ralf Der and Nihat Ay.<br />
<br />
'''15 Aug 2014'''<br />
* Speaker: Juergen Schmidhuber<br />
* Affiliation: IDSIA, Switzerland<br />
* Host: James/Shariq<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''2 Sept 2014'''<br />
* Speaker: Oriol Vinyals <br />
* Affiliation: Google<br />
* Host: Guy<br />
* Status: confirmed<br />
* Title: Machine Translation with Long Short-Term Memory Models<br />
* Abstract: Large supervised deep neural networks have achieved good results on speech recognition and computer vision. Although very successful, deep neural networks can only be applied to problems whose inputs and outputs can be conveniently encoded with vectors of fixed dimensionality; they cannot easily be applied to problems whose inputs and outputs are sequences. In this work, we show how to use a large deep Long Short-Term Memory (LSTM) model to solve domain-agnostic supervised sequence-to-sequence problems with minimal manual engineering. Our model uses one LSTM to map the input sequence to a vector of a fixed dimensionality and another LSTM to map the vector to the output sequence. We applied our model to a machine translation task and achieved encouraging results. On the WMT'14 translation task from English to French, a model combination of 6 large LSTMs achieves a BLEU score of 32.3 (where a larger score is better). For comparison, a strong standard statistical MT baseline achieves a BLEU score of 33.3. When we use our LSTM to rescore the n-best lists produced by the SMT baseline, we achieve a BLEU score of 36.3, which is a new state of the art. This is joint work with Ilya Sutskever and Quoc Le.<br />
<br />
'''19 Sept 2014'''<br />
* Speaker: Gary Marcus<br />
* Affiliation: NYU<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''24 Sept 2014'''<br />
* Speaker: Alyosha Efros<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''30 Sep 2014'''<br />
* Speaker: Alejandro Bujan<br />
* Affiliation:<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: Propagation and variability of evoked responses: the role of correlated inputs and oscillations<br />
* Abstract: <br />
<br />
'''8 Oct 2014'''<br />
* Speaker: Siyu Zhang<br />
* Affiliation: UC Berkeley<br />
* Host: Karl<br />
* Status: confirmed<br />
* Title: Long-range and local circuits for top-down modulation of visual cortical processing<br />
* Abstract:<br />
<br />
'''15 Oct 2014'''<br />
* Speaker: Tamara Broderick<br />
* Affiliation: UC Berkeley<br />
* Host: Yvonne/James<br />
* Status: confirmed<br />
* Title: Feature allocations, probability functions, and paintboxes<br />
* Abstract: Clustering involves placing entities into mutually exclusive categories. We wish to relax the requirement of mutual exclusivity, allowing objects to belong simultaneously to multiple classes, a formulation that we refer to as "feature allocation." The first step is a theoretical one. In the case of clustering the class of probability distributions over exchangeable partitions of a dataset has been characterized (via exchangeable partition probability functions and the Kingman paintbox). These characterizations support an elegant nonparametric Bayesian framework for clustering in which the number of clusters is not assumed to be known a priori. We establish an analogous characterization for feature allocation; we define notions of "exchangeable feature probability functions" and "feature paintboxes" that lead to a Bayesian framework that does not require the number of features to be fixed a priori. The second step is a computational one. Rather than appealing to Markov chain Monte Carlo for Bayesian inference, we develop a method to transform Bayesian methods for feature allocation (and other latent structure problems) into optimization problems with objective functions analogous to K-means in the clustering setting. These yield approximations to Bayesian inference that are scalable to large inference problems.<br />
<br />
'''29 Oct 2014'''<br />
* Speaker: Ken Nakayama<br />
* Affiliation: Harvard<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Topics in higher level visuo-motor control<br />
* Abstract: TBA<br />
<br />
'''5 Nov 2014''' - '''BVLC retreat'''<br />
<br />
'''20 Nov 2014'''<br />
* Speaker: Haruo Hosoya<br />
* Affiliation: ATR Institute, Japan<br />
* Host: Bruno<br />
* Status: tentative<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''9 Dec 2014'''<br />
* Speaker: Dirk DeRidder<br />
* Affiliation: Dunedin School of Medicine, University of Otago, New Zealand<br />
* Host: Bruno/Walter Freeman<br />
* Status: confirmed<br />
* Title: The Bayesian brain, phantom percepts and brain implants<br />
* Abstract: TBA<br />
<br />
'''January 14, 2015'''<br />
* Speaker: Kevin O'Regan<br />
* Affiliation: CNRS - Université Paris Descartes<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''January 21, 2015'''<br />
* Speaker: Adrienne Fairhall<br />
* Affiliation: University of Washington<br />
* Host: Mike Schachter<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''January 26, 2015'''<br />
* Speaker: Abraham Peled<br />
* Affiliation: Mental Health Center, 'Technion' Israel Institute of Technology<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Clinical Brain Profiling: A Neuro-Computational psychiatry<br />
* Abstract: TBA<br />
<br />
'''January 28, 2015'''<br />
* Speaker: Rich Ivry<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Embodied Decision Making: System interactions in sensorimotor adaptation and reinforcement learning<br />
* Abstract:<br />
<br />
'''February 11, 2015'''<br />
* Speaker: Mark Lescroart<br />
* Affiliation: UC Berkeley<br />
* Host: Karl<br />
* Status: tentative<br />
* Title: <br />
* Abstract:<br />
<br />
'''February 25, 2015'''<br />
* Speaker: Steve Chase<br />
* Affiliation: CMU<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Joint Redwood/CNEP seminar<br />
* Abstract:<br />
<br />
'''March 3, 2015'''<br />
* Speaker: Andreas Herz<br />
* Affiliation: Bernstein Center, Munich<br />
* Host: Bruno/Fritz<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''March 3, 2015 - 4:00'''<br />
* Speaker: James Cooke<br />
* Affiliation: Oxford<br />
* Host: Mike Deweese<br />
* Status: confirmed<br />
* Title: Neural Circuitry Underlying Contrast Gain Control in Primary Auditory Cortex<br />
* Abstract:<br />
<br />
'''March 4, 2015'''<br />
* Speaker: Bill Sprague<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: V1 disparity tuning and the statistics of disparity in natural viewing<br />
* Abstract:<br />
<br />
'''March 11, 2015'''<br />
* Speaker: Jozsef Fiser<br />
* Affiliation: Central European University<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''April 1, 2015'''<br />
* Speaker: Saeed Saremi<br />
* Affiliation: Salk Inst<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''April 15, 2015'''<br />
* Speaker: Zahra M. Aghajan<br />
* Affiliation: UCLA<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: Hippocampal Activity in Real and Virtual Environments<br />
* Abstract:<br />
<br />
'''May 7, 2015'''<br />
* Speaker: Santani Teng<br />
* Affiliation: MIT<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''May 13, 2015'''<br />
* Speaker: Harri Valpola<br />
* Affiliation: ZenRobotics<br />
* Host: Brian<br />
* Status: Tentative<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''June 24, 2015'''<br />
* Speaker: Kendrick Kay<br />
* Affiliation: Department of Psychology, Washington University in St. Louis<br />
* Host: Karl<br />
* Status: Confirmed<br />
* Title: Using functional neuroimaging to reveal the computations performed by the human visual system<br />
* Abstract: Visual perception is the result of a complex set of computational transformations performed by neurons in the visual system. Functional magnetic resonance imaging (fMRI) is ideally suited for identifying these transformations, given its excellent spatial resolution and ability to monitor activity across the numerous areas of visual cortex. In this talk, I will review past research in which we used fMRI to develop increasingly accurate models of the stimulus transformations occurring in early and intermediate visual areas. I will then describe recent research in which we successfully extend this approach to high-level visual areas involved in perception of visual categories (e.g. faces) and demonstrate how top-down attention modulates bottom-up stimulus representations. Finally, I will discuss ongoing research targeting regions of ventral temporal cortex that are essential for skilled reading. Our model-based approach, combined with high-field laminar measurements, is expected to provide an integrated picture of how bottom-up stimulus transformations and top-down cognitive factors interact to support rapid and accurate word recognition. Development of quantitative models and associated experimental paradigms may help us understand and diagnose impairments in neural processing that underlie visual disorders such as dyslexia and prosopagnosia.<br />
<br />
=== 2013/14 academic year ===<br />
<br />
'''9 Oct 2013'''<br />
* Speaker: Ekaterina Brocke<br />
* Affiliation: KTH University, Stockholm, Sweden<br />
* Host: Tony<br />
* Status: confirmed<br />
* Title: Multiscale modeling in Neuroscience: first steps towards multiscale co-simulation tool development.<br />
* Abstract: Multiscale modeling and simulation attract an increasing number of neuroscientists who study how different levels of organization (networks of neurons, cellular/subcellular levels) interact with each other across multiple scales of space and time to mediate different brain functions. Different scales are usually described by different physical and mathematical formalisms, making it non-trivial to perform the integration. In this talk, I will discuss key phenomena in neuroscience that can be addressed using subcellular/cellular models, and possible approaches to performing multiscale simulations, in particular a co-simulation method. I will also introduce several multiscale "toy" models of cellular/subcellular levels that were developed with the aim of understanding the numerical and technical problems which might appear during co-simulation. Finally, the first steps made towards the development of a multiscale co-simulation tool will be presented.<br />
<br />
'''29 Oct 2013 - note: 4:00'''<br />
* Speaker: Mitya Chklovskii<br />
* Affiliation: HHMI/Janelia Farm<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''30 Oct 2013'''<br />
* Speaker: Ilya Nemenman<br />
* Affiliation: Emory University, Departments of Physics and Biology<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: Large N in neural data -- expecting the unexpected.<br />
* Abstract: Recently it has become possible to directly measure simultaneous collective states of many biological components, such as neural activities, genetic sequences, or gene expression profiles. These data are revealing striking results, suggesting, for example, that biological systems are tuned to criticality, and that effective models of these systems based on only pairwise interactions among constitutive components provide surprisingly good fits to the data. We will explore a handful of simplified theoretical models, largely focusing on statistical mechanics of Ising spins, that suggest plausible explanations for these observations. Specifically, I will argue that, at least in certain contexts, these intriguing observations should be expected in multivariate interacting data in the thermodynamic limit of many interacting components.<br />
<br />
'''31 Oct 2013'''<br />
* Speaker: Oriol Vinyals<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno/Brian<br />
* Status: confirmed<br />
* Title: Beyond Deep Learning: Scalable Methods and Models for Learning<br />
* Abstract: In this talk I will briefly describe several techniques I explored in my thesis that improve how to efficiently model signal representations and learn useful information from them. The building block of my dissertation is based on machine learning approaches to classification, where a (typically non-linear) function is learned from labeled examples to map from signals to some useful information (e.g., an object class present in an image, or a word present in an acoustic signal). One of the motivating factors of my work has been advances in neural networks with deep architectures (which have led to the terminology "deep learning") and which have shown state-of-the-art performance in acoustic modeling and object recognition -- the main focus of this thesis. In my work, I have contributed to both the learning (or training) of such architectures through faster and robust optimization techniques, and also to the simplification of the deep architecture model to an approach that is simple to optimize. Furthermore, I derived a theoretical bound showing a fundamental limitation of shallow architectures based on sparse coding (which can be seen as a one hidden layer neural network), thus justifying the need for deeper architectures, while also empirically verifying these architectural choices on speech recognition. Many of my contributions have been used in a wide variety of applications, products and datasets as a result of many collaborations within ICSI and Berkeley, but also at Microsoft Research and Google Research.<br />
<br />
'''6 Nov 2013'''<br />
* Speaker: Garrett T. Kenyon<br />
* Affiliation: Los Alamos National Laboratory, The New Mexico Consortium<br />
* Host: Dylan Paiton<br />
* Status: Confirmed<br />
* Title: Using Locally Competitive Algorithms to Model Top-Down and Lateral Interactions<br />
* Abstract: Cortical connections consist of feedforward, feedback and lateral pathways. Infragranular layers project down the cortical hierarchy to both supra- and infragranular layers at the previous processing level, while the neurons in supragranular layers are linked by extensive long-range lateral projections that cross multiple cortical columns. However, most functional models of visual cortex only account for feedforward connections. Additionally, most models of visual cortex fail to account both for the thalamic projections to non-striate areas and the reciprocal connections from extrastriate areas back to the thalamus. In this talk, I will describe how a modified Locally Competitive Algorithm (LCA; Rozell et al, Neural Comp, 2008) can be used as a unifying framework for exploring the role of top-down and lateral cortical pathways within the context of deep, sparse, generative models. I will also describe an open source software tool called PetaVision that can be used to implement and execute hierarchical LCA-based models on multi-core, multi-node computer platforms without requiring specific knowledge of parallel-programming constructs.<br />
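The LCA dynamics referenced above can be summarized in a few lines. Below is a minimal NumPy sketch of the standard soft-thresholded LCA of Rozell et al. (2008), not the PetaVision implementation; the function name, step count, and parameter values are illustrative assumptions.<br />

```python
import numpy as np

def lca_sparse_code(x, W, lam=0.1, tau=10.0, n_steps=200):
    """Minimal sketch of the Locally Competitive Algorithm (LCA).

    x : input vector, shape (d,)
    W : dictionary, shape (d, k), assumed to have unit-norm columns
    Membrane potentials u are driven toward the feedforward input while
    active units inhibit each other through dictionary overlaps.
    """
    b = W.T @ x                        # feedforward drive
    G = W.T @ W - np.eye(W.shape[1])   # lateral inhibition (column overlaps)
    u = np.zeros(W.shape[1])
    soft = lambda v: np.where(np.abs(v) > lam, v - lam * np.sign(v), 0.0)
    for _ in range(n_steps):
        a = soft(u)                    # thresholded activities
        u += (b - u - G @ a) / tau     # LCA membrane dynamics
    return soft(u)                     # final sparse code
```

With an orthonormal dictionary the lateral term vanishes and the dynamics reduce to leaky integration followed by soft thresholding.<br />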
<br />
'''14 Nov 2013 (note: Thursday), 12:30pm'''<br />
* Speaker: Geoffrey J Goodhill<br />
* Affiliation: Queensland Brain Institute and School of Mathematics and Physics, The University of Queensland, Australia<br />
* Host: Mike DeWeese<br />
* Status: Confirmed<br />
* Title: Computational principles of neural wiring development<br />
* Abstract: Brain function depends on precise patterns of neural wiring. An axon navigating to its target must make guidance decisions based on noisy information from molecular cues in its environment. I will describe a combination of experimental and computational work showing that (1) axons may act as ideal observers when sensing chemotactic gradients, (2) the complex influence of calcium and cAMP levels on guidance decisions can be predicted mathematically, (3) the morphology of growth cones at the axonal tip can be understood in terms of just a few eigenshapes, and remarkably these shapes oscillate in time with periods ranging from minutes to hours. Together this work may shed light on how neural wiring goes wrong in some developmental brain disorders, and how best to promote appropriate regrowth of axons after injury.<br />
<br />
'''4 Dec 2013'''<br />
* Speaker: Zhenwen Dai<br />
* Affiliation: FIAS, Goethe University Frankfurt, Germany.<br />
* Host: Georgios Exarchakis<br />
* Status: Confirmed<br />
* Title: What Are the Invariant Occlusive Components of Image Patches? A Probabilistic Generative Approach <br />
* Abstract: We study optimal image encoding based on a generative approach with non-linear feature combinations and explicit position encoding. By far most approaches to unsupervised learning of visual features, such as sparse coding or ICA, account for translations by representing the same features at different positions. Some earlier models used a separate encoding of features and their positions to facilitate invariant data encoding and recognition. All probabilistic generative models with explicit position encoding have so far assumed a linear superposition of components to encode image patches. Here, we for the first time apply a model with non-linear feature superposition and explicit position encoding for patches. By avoiding linear superpositions, the studied model represents a closer match to component occlusions which are ubiquitous in natural images. In order to account for occlusions, the non-linear model encodes patches qualitatively very different from linear models by using component representations separated into mask and feature parameters. We first investigated encodings learned by the model using artificial data with mutually occluding components. We find that the model extracts the components, and that it can correctly identify the occlusive components with the hidden variables of the model. On natural image patches, the model learns component masks and features for typical image components. By using reverse correlation, we estimate the receptive fields associated with the model’s hidden units. We find many Gabor-like or globular receptive fields as well as fields sensitive to more complex structures. Our results show that probabilistic models that capture occlusions and invariances can be trained efficiently on image patches, and that the resulting encoding represents an alternative model for the neural encoding of images in the primary visual cortex. <br />
<br />
'''11 Dec 2013'''<br />
* Speaker: Kai Siedenburg<br />
* Affiliation: UC Davis, Petr Janata's Lab.<br />
* Host: Jesse Engel<br />
* Status: Confirmed<br />
* Title: Characterizing Short-Term Memory for Musical Timbre<br />
* Abstract: Short-term memory is a cognitive faculty central for the apprehension of music and speech. Little is known, however, about memory for musical timbre despite its “sisterhood” with speech; after all, speech can be regarded as sequencing of vocal timbre. Past research has isolated many characteristic effects of verbal memory. Are these also at play for non-vocal timbre sequences? We studied this question by considering short-term memory for serial order. Using timbres and dissimilarity data from McAdams et al. (Psych. Research, 1995), we employed a same/different discrimination paradigm. Experiment 1 (N = 30 MU + 30 nonMU) revealed effects of sequence length and timbral dissimilarity of items, as well as an interaction of musical training and pitch variability: in contrast to musicians, non-musicians' performance was impaired by simultaneous changes in pitch, compared to a constant pitch baseline. Experiment 2 (N = 22) studied whether musicians' memory for timbre sequences was independent of pitch irrespective of the degree of complexity of pitch progressions. Comparing sequences with pitch changing within and across standard and comparison to a constant pitch baseline, performance was now clearly impaired for the variable pitch condition. Experiment 3 (N = 22) showed primacy and recency effects for musicians, and reproduced a positive effect of timbral heterogeneity of sequences. Our findings demonstrate the presence of hallmark effects of verbal memory such as similarity, word length, and primacy/recency for the domain of non-vocal timbre, and suggest that memory for speech and non-vocal timbre sequences might to a large extent share underlying mechanisms.<br />
<br />
'''12 Dec 2013'''<br />
* Speaker: Matthias Bethge<br />
* Affiliation: University of Tubingen<br />
* Host: Bruno<br />
* Status: tentative<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''22 Jan 2014'''<br />
* Speaker: Thomas Martinetz<br />
* Affiliation: Univ Luebeck<br />
* Host: Bruno/Fritz<br />
* Status: confirmed<br />
* Title: Orthogonal Sparse Coding and Sensing<br />
* Abstract: Sparse coding has been a very successful concept, since many natural signals have the property of being sparse in some dictionary (basis). Some natural signals are even sparse in an orthogonal basis, most prominently natural images, which are sparse in a respective wavelet transform. An encoding in an orthogonal basis has a number of advantages, e.g., finding the optimal coding coefficients is simply a projection instead of being NP-hard. Given some data, we want to find the orthogonal basis which provides the sparsest code. This problem can be seen as a generalization of Principal Component Analysis. We present an algorithm, Orthogonal Sparse Coding (OSC), which is able to find this basis very robustly. On natural images, it compresses on the level of JPEG, but it can adapt to arbitrary and special data sets and achieve significant improvements. With the property of being sparse in some orthogonal basis, we show how signals can be sensed very efficiently in a hierarchical manner with at most k log D sensing actions. This hierarchical sensing might relate to the way we sense the world, with interesting applications in active vision.<br />
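The remark that optimal coding in an orthogonal basis reduces to a projection can be made concrete. The sketch below is an illustration of that fact, not the OSC algorithm itself (which learns the basis): given an orthonormal basis, the best k-sparse code is obtained by transforming the signal and keeping the k largest-magnitude coefficients, with no combinatorial search.<br />

```python
import numpy as np

def best_k_sparse_code(x, W, k):
    """For an orthonormal basis W (columns orthonormal), the optimal
    k-sparse code of x is a projection followed by hard thresholding."""
    a = W.T @ x                        # orthogonal transform (projection)
    drop = np.argsort(np.abs(a))[:-k]  # indices of the smallest coefficients
    a[drop] = 0.0                      # keep only the k largest in magnitude
    return a
```

Reconstruction is then simply `W @ a`, and orthonormality guarantees this thresholded code minimizes the reconstruction error among all k-sparse codes.<br />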
<br />
'''29 Jan 2014'''<br />
* Speaker: David Klein<br />
* Affiliation: Audience<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''5 Feb 2014''' (leave open for Barth/Martinetz seminar)<br />
<br />
'''12 Feb 2014'''<br />
* Speaker: Ilya Sutskever <br />
* Affiliation: Google<br />
* Host: Zayd<br />
* Status: confirmed<br />
* Title: Continuous vector representations for machine translation<br />
* Abstract: Dictionaries and phrase tables are the basis of modern statistical machine translation systems. I will present a method that can automate the process of generating and extending dictionaries and phrase tables. Our method can translate missing word and phrase entries by learning language structures using large monolingual data, and by mapping between the languages using a small bilingual dataset. It uses distributed representations of words and learns a linear mapping between vector spaces of languages. Despite its simplicity, our method is surprisingly effective: we can achieve almost 90% precision@5 for translation of words between English and Spanish. This method makes few assumptions about the languages, so it can be used to extend and refine dictionaries and translation tables for any language pairs. Joint work with Tomas Mikolov and Quoc Le.<br />
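The linear-mapping idea in the abstract can be sketched in a few lines of NumPy. This is an illustrative reconstruction under stated assumptions (a least-squares fit on a small seed dictionary, then cosine nearest neighbors in the target space), not the authors' code; all names are hypothetical.<br />

```python
import numpy as np

def fit_translation_matrix(X, Z):
    """Learn a linear map M minimizing ||X M - Z||_F, where rows of X are
    source-language word vectors and rows of Z are the target-language
    vectors of their known translations (the seed dictionary)."""
    M, *_ = np.linalg.lstsq(X, Z, rcond=None)
    return M

def translate(vec, M, target_vecs):
    """Map a source word vector into the target space and rank all target
    words by cosine similarity; the top index is the proposed translation."""
    mapped = vec @ M
    sims = target_vecs @ mapped / (
        np.linalg.norm(target_vecs, axis=1) * np.linalg.norm(mapped) + 1e-12)
    return np.argsort(-sims)
```

Precision@5 as reported in the abstract would correspond to the correct target word appearing among the first five indices returned by `translate`.<br />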
<br />
'''25 Feb 2014'''<br />
* Speaker: Alexander Terekhov <br />
* Affiliation: CNRS - Université Paris Descartes<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Constructing space: how a naive agent can learn spatial relationships by observing sensorimotor contingencies<br />
* Abstract:<br />
<br />
'''12 March 2014'''<br />
* Speaker: Carlos Portera-Cailliau<br />
* Affiliation: UCLA<br />
* Host: Mike<br />
* Status: confirmed<br />
* Title: Circuit defects in the neocortex of Fmr1 knockout mice<br />
* Abstract: TBA<br />
<br />
'''19 March 2014'''<br />
* Speaker: Dean Buonomano<br />
* Affiliation: UCLA<br />
* Host: Mike<br />
* Status: confirmed<br />
* Title: State-dependent Networks: Timing and Computations Based on Neural Dynamics and Short-term Plasticity<br />
* Abstract: The brain’s ability to seamlessly assimilate and process spatial and temporal information is critical to most behaviors, from understanding speech to playing the piano. Indeed, because the brain evolved to navigate a dynamic world, timing and temporal processing represent a fundamental computation. We have proposed that timing and the processing of temporal information emerges from the interaction between incoming stimuli and the internal state of neural networks. The internal state, is defined not only by ongoing activity (the active state) but by time-varying synaptic properties, such as short-term synaptic plasticity (the hidden state). One prediction of this hypothesis is that timing is a general property of cortical circuits. We provide evidence in this direction by demonstrating that in vitro cortical networks can “learn” simple temporal patterns. Finally, previous theoretical studies have suggested that recurrent networks capable of self-perpetuating activity hold significant computational potential. However, harnessing the computational potential of these networks has been hampered by the fact that such networks are chaotic. We show that it is possible to “tame” chaos through recurrent plasticity, and create a novel and powerful general framework for how cortical circuits compute.<br />
<br />
'''26 March 2014'''<br />
* Speaker: Robert G. Smith<br />
* Affiliation: University of Pennsylvania<br />
* Host: Mike S<br />
* Status: confirmed<br />
* Title: Role of Dendritic Computation in the Direction-Selective Circuit of Retina<br />
* Abstract: The retina utilizes a variety of signal processing mechanisms to compute direction from image motion. The computation is accomplished by a circuit that includes starburst amacrine cells (SBACs), which are GABAergic neurons presynaptic to direction-selective ganglion cells (DSGCs). SBACs are symmetric neurons with several branched dendrites radiating out from the soma. When a stimulus moving back and forth along a SBAC dendrite sequentially activates synaptic inputs, larger post-synaptic potentials (PSPs) are produced in the dendritic tips when the stimulus moves outwards from the soma. The directional difference in EPSP amplitude is further amplified near the dendritic tips by voltage-gated channels to produce directional release of GABA. Reciprocal inhibition between adjacent SBACs may also amplify directional release. Directional signals in the independent SBAC branches are preserved because each dendrite makes selective contacts only with DSGCs of the appropriate preferred-direction. Directional signals are further enhanced within the dendritic arbor of the DSGC, which essentially comprises an array of distinct dendritic compartments. Each of these dendritic compartments locally sum excitatory and inhibitory inputs, amplifies them with voltage-gated channels, and generates spikes that propagate to the axon via the soma. Overall, the computation of direction in the retina is performed by several local dendritic mechanisms both presynaptic and postsynaptic, with the result that directional responses are robust over a broad range of stimuli.<br />
<br />
'''16 April 2014'''<br />
* Speaker: David Pfau<br />
* Affiliation: Columbia<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''22 April 2014 *Tuesday*'''<br />
* Speaker: Jochen Braun<br />
* Affiliation: Otto-von-Guericke University, Magdeburg<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Dynamics of visual perception and collective neural activity<br />
* Abstract:<br />
<br />
'''29 April 2014'''<br />
* Speaker: Giuseppe Vitiello<br />
* Affiliation: University of Salerno<br />
* Host: Fritz/Walter Freeman<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''30 April 2014'''<br />
* Speaker: Masataka Watanabe<br />
* Affiliation: University of Tokyo / Max Planck Institute for Biological Cybernetics<br />
* Host: Gautam Agarwal<br />
* Status: confirmed<br />
* Title: Turing Test for Machine Consciousness and the Chaotic Spatiotemporal Fluctuation Hypothesis<br />
* Abstract: I propose an experimental method to test various hypotheses on consciousness. Inspired by Sperry's observation that split-brain patients possess two independent streams of consciousness, the idea is to implement candidate neural mechanisms of visual consciousness in an artificial cortical hemisphere and test whether subjective experience is evoked in the device's visual hemifield. In contrast to modern neurosynthetic devices, I show that mimicking interhemispheric connectivity assures that authentic and fine-grained subjective experience arises only when a stream of consciousness is generated within the device. The argument is valid under a widely believed assumption regarding interhemispheric connectivity and neuronal stimulus-invariance. (I will briefly present my own evidence that human V1 does not respond to changes in the contents of visual awareness [1].)<br />
<br />
If consciousness is actually generated within the device, we should be able to construct a case where two objects presented in the device's visual field are distinguishable by visual experience but not by what is communicated through the brain-machine interface. As strange as it may sound, and although it clearly violates the laws of physics, this is likely to be happening in the intact brain, where unified subjective bilateral vision and its verbal report occur without the total interhemispheric exchange of conscious visual information.<br />
<br />
Together, I present a hypothesis on the neural mechanism of consciousness, “The Chaotic Spatiotemporal Fluctuation Hypothesis”, that passes the proposed test for visual qualia and also explains how the physics we know of today is violated. Here, neural activity is divided into two components, the time-averaged activity and the residual temporally fluctuating activity, where the former serves as the content of consciousness (the neuronal population vector) and the latter as consciousness itself. The content is “read” into consciousness in the sense that every local perturbation caused by a change in the neuronal population vector creates a spatiotemporal wave in the fluctuation component that travels throughout the system. Deterministic chaos assures that every local difference makes a difference to the whole of the dynamics, as in the butterfly effect, serving as a foundation for the holistic nature of consciousness. I will present data from simultaneous electrophysiology-fMRI recordings and human fMRI [2] that support the existence of such large-scale causal fluctuation.<br />
<br />
Here, the chaotic fluctuation cannot be decoded to trace back the original perturbation in the neuronal population vector, because the initial states of all neurons would be required with infinite precision to do so. Hence what is transmitted between the two hemispheres is not "information" in the normal sense. This illustrates the violation of physics under the metaphysical assumption that "chaotic spatiotemporal fluctuation is consciousness", where unification of bilateral vision and the solving of visual tasks (e.g. perfect symmetry detection) are achieved without exchanging the otherwise required Shannon information between the two hemispheres.<br />
<br />
Finally, minimal and realistic versions of the proposed test for visual qualia can be conducted on laboratory animals to validate the hypothesis. The test deals with two biological hemispheres, which we already know contain consciousness. We dissect the interhemispheric connectivity and form instead an artificial one that is capable of filtering out the neural fluctuation component. A limited interhemispheric connectivity may be sufficient, which would drastically reduce the technological challenge. If the subject is capable of conducting a bilateral stimulus-matching task with the full artificial interhemispheric connectivity, but not when the fluctuation component is filtered out, this can be considered strong supporting evidence for the hypothesis.<br />
<br />
1. Watanabe, M., Cheng, K., Ueno, K., Asamizuya, T., Tanaka, K., Logothetis, N., Attention but not awareness modulates the BOLD signal in the human V1 during binocular suppression. Science, 2011. 334(6057): p. 829-31.<br />
<br />
2. Watanabe, M., Bartels, A., Macke, J., Logothetis, N., Temporal jitter of the BOLD signal reveals a reliable initial dip and improved spatial resolution. Curr Biol, 2013. 23(21): p. 2146-50.<br />
<br />
'''11 June 2014'''<br />
* Speaker: Stuart Hameroff<br />
* Affiliation: University of Arizona, Tucson<br />
* Host: Gautam<br />
* Status: confirmed<br />
* Title: ‘Tuning the brain’ – Treating mental states through microtubule vibrations <br />
* Abstract: Do mental states derive entirely from brain neuronal membrane activities? Neuronal interiors are organized by microtubules (‘MTs’), protein polymers proposed to encode memory, process information and support consciousness. Using nanotechnology, Bandyopadhyay’s group at the National Institute for Materials Science (NIMS) in Tsukuba, Japan has shown coherent vibrations (megahertz to 10 kilohertz) from microtubule bundles inside active neurons, vibrations (electric field potentials ~40 to 50 mV) able to influence membrane potentials. This suggests EEG rhythms are ‘beat’ frequencies of megahertz vibrations in microtubules inside neurons (Hameroff and Penrose, 2014), and that consciousness and cognition involve vibrational patterns resonating across scales in the brain, more like music than computation. MT megahertz vibrations may be a useful therapeutic target for ‘tuning’ mood and mental states. Among noninvasive transcranial brain stimulation techniques (TMS, tDCS), transcranial ultrasound (TUS) uses megahertz mechanical vibrations. Applied at the scalp, low-intensity, sub-thermal ultrasound safely reaches the brain. In human studies, brief (15 to 30 seconds) TUS at 0.5, 2 and 8 megahertz to frontal-temporal cortex results in 40 minutes or longer of reported mood improvement, and focused TUS enhances sensory discrimination (Legon et al, 2014). In vitro, ultrasound promotes neurite outgrowth in embryonic neurons (Raman), and stabilizes microtubules against disassembly (Gupta). (In Alzheimer’s disease, MTs disassemble and release tau.) These findings suggest ‘tuning the brain’ with TUS should be a safe, effective and inexpensive treatment for Alzheimer’s, traumatic brain injury, depression, anxiety, PTSD and other disorders. <br />
<br />
References: Hameroff S, Penrose R (2014) Phys Life Rev http://www.sciencedirect.com/science/article/pii/S1571064513001188; Sahu et al (2013) Biosens Bioelectron 47:141–8; Sahu et al (2013) Appl Phys Lett 102:123701; Legon et al (2014) Nature Neuroscience 17: 322–329<br />
<br />
'''25 June 2014'''<br />
* Speaker: Peter Loxley<br />
* Affiliation: <br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: The two-dimensional Gabor function adapted to natural image statistics: An analytical model of simple-cell responses in the early visual system<br />
* Abstract: TBA<br />
<br />
=== 2012/13 academic year ===<br />
<br />
'''26 Sept 2012''' <br />
* Speaker: Jason Yeatman<br />
* Affiliation: Department of Psychology, Stanford University<br />
* Host: Bruno/Susana Chung<br />
* Status: confirmed<br />
* Title: The Development of White Matter and Reading Skills<br />
* Abstract: The development of cerebral white matter involves both myelination and pruning of axons, and the balance between these two processes may differ between individuals. Cross-sectional measures of white matter development mask the interplay between these active developmental processes and their connection to cognitive development. We followed a cohort of 39 children longitudinally for three years, and measured white matter development and reading development using diffusion tensor imaging and behavioral tests. In the left arcuate and inferior longitudinal fasciculus, children with above-average reading skills initially had low fractional anisotropy (FA) with a steady increase over the 3-year period, while children with below-average reading skills had higher initial FA that declined over time. We describe a dual-process model of white matter development that balances biological processes that have opposing effects on FA, such as axonal myelination and pruning, to explain the pattern of results.<br />
<br />
'''8 Oct 2012''' <br />
* Speaker: Sophie Deneve<br />
* Affiliation: Laboratoire de Neurosciences cognitives, ENS-INSERM<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Balanced spiking networks can implement dynamical systems with predictive coding<br />
* Abstract: Neural networks can integrate sensory information and generate continuously varying outputs, even though individual neurons communicate only with spikes (all-or-none events). Here we show how this can be done efficiently if spikes communicate "prediction errors" between neurons. We focus on the implementation of linear dynamical systems and derive a spiking network model from a single optimization principle. Our model naturally accounts for two puzzling aspects of cortex. First, it provides a rationale for the tight balance and correlations between excitation and inhibition. Second, it predicts asynchronous and irregular firing as a consequence of predictive population coding, even in the limit of vanishing noise. We show that our spiking networks have error-correcting properties that make them far more accurate and robust than comparable rate models. Our approach suggests that spike times do matter when considering how the brain computes, and that the reliability of cortical representations may have been strongly underestimated.<br />
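The core idea of the abstract can be illustrated with a toy simulation (a hypothetical sketch with invented parameters, not the speaker's published model): a small population tracks a leaky integrator, and each neuron fires only when its spike would reduce the readout's prediction error.

```python
import numpy as np

# Toy sketch of predictive coding with spikes: N neurons track x(t),
# a leaky integral of a step input c(t). A neuron fires only when its
# spike would reduce the readout error |x - x_hat|.
np.random.seed(0)
N, T, dt, lam = 20, 2000, 1e-3, 10.0
D = np.random.randn(N) * 0.1           # decoding weights (assumed scale)
r = np.zeros(N)                        # filtered spike trains
x, n_spikes, err = 0.0, 0, []
for t in range(T):
    c = 5.0 if t < T // 2 else -5.0    # step input
    x += dt * (-lam * x + c)           # target: leaky integrator
    x_hat = D @ r                      # linear readout of spikes
    gains = D * (x - x_hat) - D**2 / 2.0   # error reduction per spike
    i = int(np.argmax(gains))
    if gains[i] > 0:                   # greedy spike rule
        r[i] += 1.0
        n_spikes += 1
    r -= dt * lam * r                  # readout decays like the target
    err.append(abs(x - D @ r))
print(f"{n_spikes} spikes, mean error {np.mean(err[200:]):.3f}")
```

Because spikes fire only when they reduce the error, the readout hugs the target with irregular spike timing, loosely illustrating the balance and irregularity claims in the abstract.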
<br />
<br />
'''19 Oct 2012'''<br />
* Speaker: Gert Van Dijck<br />
* Affiliation: Cambridge<br />
* Host: Urs<br />
* Status: confirmed<br />
* Title: A solution to identifying neurones using extracellular activity in awake animals: a probabilistic machine-learning approach<br />
* Abstract: Electrophysiological studies over the last fifty years have been hampered by the difficulty of reliably assigning signals to identified cortical neurones. Previous studies have employed a variety of measures based on spike timing or waveform characteristics to tentatively classify other neurone types (Vos et al., Eur. J. Neurosci., 1999; Prsa et al., J. Neurosci., 2009), in some cases supported by juxtacellular labelling (Simpson et al., Prog. Brain Res., 2005; Holtzman et al., J. Physiol., 2006; Barmack and Yakhnitsa, J. Neurosci., 2008; Ruigrok et al., J. Neurosci., 2011), or intracellular staining and / or assessment of membrane properties (Chadderton et al., Nature, 2004; Jorntell and Ekerot, J. Neurosci., 2006; Rancz et al., Nature, 2007). Anaesthetised animals have been widely used as they can provide a ground truth through neuronal labelling, which is much harder to achieve in awake animals, where spike-derived measures tend to be relied upon (Lansink et al., Eur. J. Neurosci., 2010). Whilst spike shapes carry potentially useful information for classifying neuronal classes, they vary with electrode type and the geometric relationship between the electrode and the spike generation zone (Van Dijck et al., Int. J. Neural Syst., 2012). Moreover, spike-shape measurement is achieved with a variety of techniques, making it difficult to compare and standardise between laboratories. In this study we build probabilistic models on the statistics derived from the spike trains of spontaneously active neurones in the cerebellum and the ventral midbrain. The mean spike frequency in combination with the log-interval entropy (Bhumbra and Dyball, J. Physiol.-London, 2004) of the inter-spike-interval distribution yields the highest prediction accuracy. The cerebellum model consists of two sub-models: a molecular layer - Purkinje layer model and a granular layer - Purkinje layer model. The first model identifies molecular layer interneurones and Purkinje cells with high accuracy (92.7%), while the latter identifies Golgi cells, granule cells, mossy fibers and Purkinje cells with high accuracy (99.2%). Furthermore, it is shown that the model trained on anaesthetized rat and decerebrate cat data has broad applicability to other species and behavioural states: anaesthetized mice (80%), awake rabbits (94.2%) and awake rhesus monkeys (89-90%). Recently, optogenetics has made it possible to obtain a ground truth about cell classes. Using optogenetically identified GABAergic and dopaminergic cells, we build similar statistical models to identify these neuron types in the ventral midbrain. Hence, our approach should be of general use to a broad variety of laboratories.<br />
<br />
'''Tuesday, 23 Oct 2012''' <br />
* Speaker: Jaimie Sleigh<br />
* Affiliation: University of Auckland<br />
* Host: Fritz/Andrew Szeri<br />
* Status: confirmed<br />
* Title: Is General Anesthesia a failure of cortical information integration?<br />
* Abstract: General anesthesia and natural sleep share some commonalities and some differences. Quite a lot is known about the chemical and neuronal effects of general anesthetic drugs. There are two main groups of anesthetic drugs, which can be distinguished by their effects on the EEG. The most commonly used drugs exert a strong GABAergic action; whereas a second group is characterized by minimal GABAergic effects, but significant NMDA blockade. It is less clear which and how these various effects result in failure of the patient to wake up when the surgeon cuts them. I will present some results from experimental brain slice work, and theoretical mean field modelling of anesthesia and sleep, that support the idea that the final common mechanism of both types of anaesthesia is fragmentation of long distance information flow in the cortex.<br />
<br />
'''31 Oct 2012''' (Halloween)<br />
* Speaker: Jonathan Landy<br />
* Affiliation: UCSB<br />
* Host: Mike DeWeese<br />
* Status: Confirmed<br />
* Title: Mean-field replica theory: review of basics and a new approach<br />
* Abstract: Replica theory provides a general method for evaluating the mode of a distribution, and has varied applications to problems in statistical mechanics, signal processing, etc. Evaluation of the formal expressions arising in replica theory represents a formidable technical challenge, but one that physicists have apparently intuited correct methods for handling. In this talk, I will first provide a review of the historical development of replica theory, covering: 1) motivation, 2) the intuited "Parisi ansatz" solution, 3) continued controversies, and 4) a survey of applications (including to neural networks). Following this, I will discuss an exploratory effort of mine aimed at developing an ansatz-free solution method. As an example, I will work out the phase diagram for a simple spin-glass model. This talk is intended primarily as a tutorial.<br />
<br />
'''7 Nov 2012''' <br />
* Speaker: Tom Griffiths<br />
* Affiliation: UC Berkeley<br />
* Host: Daniel Little<br />
* Status: Confirmed<br />
* Title: Identifying human inductive biases<br />
* Abstract: People are remarkably good at acquiring complex knowledge from limited data, as is required in learning causal relationships, categories, or aspects of language. Successfully solving inductive problems of this kind requires having good "inductive biases" - constraints that guide inductive inference. Viewed abstractly, understanding human learning requires identifying these inductive biases and exploring their origins. I will argue that probabilistic models of cognition provide a framework that can facilitate this project, giving a transparent characterization of the inductive biases of ideal learners. I will outline how probabilistic models are traditionally used to solve this problem, and then present a new approach that uses Markov chain Monte Carlo algorithms as the basis for an experimental method that magnifies the effects of inductive biases.<br />
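The Markov chain Monte Carlo idea mentioned at the end can be illustrated with iterated learning (a toy numerical example with invented prior and likelihood values, not the speaker's experimental design): each Bayesian learner samples a hypothesis from the posterior given a datum produced by the previous learner, and the chain's stationary distribution is the shared prior, i.e. the inductive bias.

```python
import numpy as np

# Iterated learning as a Markov chain over hypotheses (toy example).
# Learner t+1 samples a hypothesis from the posterior given one datum
# generated by learner t; the chain's stationary distribution is the
# prior -- so running the chain "reads out" the inductive bias.
H, D = 3, 4                                    # hypotheses, data values
rng = np.random.default_rng(1)
prior = np.array([0.6, 0.3, 0.1])              # the inductive bias (assumed)
lik = rng.dirichlet(np.ones(D), size=H)        # P(d | h), one row per h
post = lik * prior[:, None]                    # Bayes: P(h | d) up to a constant
post /= post.sum(axis=0, keepdims=True)

T = lik @ post.T                 # T[h, h'] = sum_d P(d | h) P(h' | d)
pi = np.full(H, 1.0 / H)         # start the chain far from the prior
for _ in range(2000):
    pi = pi @ T
print(np.round(pi, 3))           # converges to the prior
```

The fixed point follows directly from Bayes' rule: averaging the posterior over data drawn from the prior predictive returns the prior.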
<br />
'''19 Nov 2012''' (Monday) (Thanksgiving week)<br />
* Speaker: Bin Yu<br />
* Affiliation: Dept. of Statistics and EECS, UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Representation of Natural Images in V4<br />
* Abstract: The functional organization of area V4 in the mammalian ventral visual pathway is far from being well understood. V4 is believed to play an important role in the recognition of shapes and objects and in visual attention, but the complexity of this cortical area makes it hard to analyze. In particular, no current model of V4 has shown good predictions for neuronal responses to natural images and there is no consensus on the primary role of V4.<br />
In this talk, we present analysis of electrophysiological data on the response of V4 neurons to natural images. We propose a new computational model that achieves comparable prediction performance for V4 as for V1 neurons. Our model does not rely on any pre-defined image features but only on invariance and sparse coding principles. We interpret our model using sparse principal component analysis and discover two groups of neurons: those selective to texture versus those selective to contours. This supports the thesis that one primary role of V4 is to extract objects from background in the visual field. Moreover, our study also confirms the diversity of V4 neurons. Among those selective to contours, some of them are selective to orientation, others to acute curvature features.<br />
(This is joint work with J. Mairal, Y. Benjamini, B. Willmore, M. Oliver and J. Gallant.)<br />
<br />
'''30 Nov 2012''' <br />
* Speaker: Yan Karklin<br />
* Affiliation: NYU<br />
* Host: Tyler<br />
* Status: confirmed<br />
* Title: <br />
* Abstract: <br />
<br />
'''10 Dec 2012 (note this would be the Monday after NIPS)''' <br />
* Speaker: Marius Pachitariu<br />
* Affiliation: Gatsby / UCL<br />
* Host: Urs<br />
* Status: confirmed<br />
* Title: NIPS paper "Learning visual motion in recurrent neural networks"<br />
* Abstract: We present a dynamic nonlinear generative model for visual motion based on a latent representation of binary-gated Gaussian variables connected in a network. Trained on sequences of images by an STDP-like rule, the model learns to represent different movement directions in different variables. We use an online approximate inference scheme that can be mapped to the dynamics of networks of neurons. Probed with drifting grating stimuli and moving bars of light, neurons in the model show patterns of responses analogous to those of direction-selective simple cells in primary visual cortex. We show how the computations of the model are enabled by a specific pattern of learnt asymmetric recurrent connections. I will also briefly discuss our application of recurrent neural networks as statistical models of simultaneously recorded spiking neurons.<br />
<br />
'''12 Dec 2012''' <br />
* Speaker: Ian Goodfellow<br />
* Affiliation: U Montreal<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''7 Jan 2013'''<br />
* Speaker: Stuart Hameroff<br />
* Affiliation: University of Arizona <br />
* Host: Gautam Agarwal<br />
* Status: confirmed<br />
* Title: Quantum cognition and brain microtubules <br />
* Abstract: Cognitive decision processes are generally modeled with classical Bayesian probability, but may be better suited to quantum mathematics. For example: 1) Psychological conflict, ambiguity and uncertainty can be viewed as (quantum) superposition of multiple possible judgments and beliefs. 2) Measurement (e.g. answering a question, reaching a decision) reduces possibilities to definite states (‘constructing reality’, ‘collapsing the wave function’). 3) Previous questions influence subsequent answers, so sequence affects outcomes (‘contextual non-commutativity’). 4) Judgments and choices may deviate from classical logic, suggesting random, or ‘non-computable’ quantum influences. Can quantum cognition operate in the brain? Do classical brain activities simulate quantum processes? Or have biomolecular quantum devices evolved? In this talk I will discuss how a finer-scale, intra-neuronal level of quantum information processing in cytoskeletal microtubules can accumulate, operate upon and integrate quantum information and memory for self-collapse to classical states which regulate axonal firings, controlling behavior.<br />
<br />
'''Monday 14 Jan 2013, 1:00pm'''<br />
* Speaker: Dibyendu Mandal <br />
* Affiliation: Physics Dept., University of Maryland (Jarzynski group)<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: An exactly solvable model of Maxwell’s demon<br />
* Abstract: The paradox of Maxwell’s demon has stimulated numerous thought experiments, leading to discussions about the thermodynamic implications of information processing. However, the field has lacked a tangible example or model of an autonomous, mechanical system that reproduces the actions of the demon. To address this issue, we introduce an explicit model of a device that can deliver work to lift a mass against gravity by rectifying thermal fluctuations, while writing information to a memory register. We solve for the steady-state behavior of the model and construct its nonequilibrium phase diagram. In addition to the engine-like action described above, we identify a Landauer eraser region in the phase diagram where the model uses externally supplied work to remove information from the memory register. Our model offers a simple paradigm for investigating the thermodynamics of information processing by exposing a transparent mechanism of operation.<br />
<br />
'''23 Jan 2013'''<br />
* Speaker: Carlos Brody<br />
* Affiliation: Princeton<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: Neural substrates of decision-making in the rat<br />
* Abstract: Gradual accumulation of evidence is thought to be a fundamental component of decision-making. Over the last 16 years, research in non-human primates has revealed neural correlates of evidence accumulation in parietal and frontal cortices, and other brain areas. However, the circuit mechanisms underlying these neural correlates remain unknown. Reasoning that a rodent model of evidence accumulation would allow a greater number of experimental subjects, and therefore experiments, as well as facilitate the use of molecular tools, we developed a rat accumulation-of-evidence task, the "Poisson Clicks" task. In this task, sensory evidence is delivered in pulses whose precisely-controlled timing varies widely within and across trials. The resulting data are analyzed with models of evidence accumulation that use the richly detailed information of each trial’s pulse timing to distinguish between different decision mechanisms. The method provides great statistical power, allowing us to: (1) provide compelling evidence that rats are indeed capable of gradually accumulating evidence for decision-making; (2) accurately estimate multiple parameters of the decision-making process from behavioral data; and (3) measure, for the first time, the diffusion constant of the evidence accumulator, which we show to be optimal (i.e., equal to zero). In addition, the method provides a trial-by-trial, moment-by-moment estimate of the value of the accumulator, which can then be compared in awake behaving electrophysiology experiments to trial-by-trial, moment-by-moment neural firing rate measures. Based on such a comparison, we describe data and a novel analysis approach that reveals differences between parietal and frontal cortices in the neural encoding of accumulating evidence. Finally, using semi-automated training methods to produce tens of rats trained in the Poisson Clicks task, we have also used pharmacological inactivation to ask, for the first time, whether parietal and frontal cortices are required for accumulation of evidence, and we are using optogenetic methods to rapidly and transiently inactivate brain regions so as to establish precisely when, during each decision-making trial, each brain region's activity is necessary for performance of the task.<br />
<br />
'''28 Jan 2013'''<br />
* Speaker: Eugene M. Izhikevich<br />
* Affiliation: Brain Corporation<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: Spikes<br />
* Abstract: Most communication in the brain is via spikes. While we understand the spike-generation mechanism of individual neurons, we fail to appreciate the spike-timing code and its role in neural computations. The speaker starts with simple models of neuronal spiking and bursting, describes small neuronal circuits that learn spike-timing code via spike-timing dependent plasticity (STDP), and finishes with biologically detailed and anatomically accurate large-scale brain models.<br />
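For readers unfamiliar with the speaker's "simple model" of spiking, a minimal Euler-integration sketch (using the standard published regular-spiking parameters; the input current and step size here are illustrative choices) looks like:

```python
# Izhikevich's "simple model" of a spiking neuron (Izhikevich, 2003),
# integrated with forward-Euler steps. Parameters are the standard
# regular-spiking (RS) cortical cell values; I and dt are assumptions.
a, b, c, d = 0.02, 0.2, -65.0, 8.0
v, u = -65.0, b * -65.0          # membrane potential (mV), recovery variable
I, dt = 10.0, 0.25               # constant input current, time step (ms)
spikes = []
for step in range(int(1000 / dt)):           # simulate 1 s
    v += dt * (0.04 * v * v + 5 * v + 140 - u + I)
    u += dt * a * (b * v - u)
    if v >= 30.0:                # spike cutoff: reset v, bump u
        spikes.append(step * dt)
        v, u = c, u + d
print(f"{len(spikes)} spikes in 1 s")
```

Swapping in other published (a, b, c, d) values yields bursting, chattering, and other firing patterns from the same two equations.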
<br />
'''29 Jan 2013'''<br />
* Speaker: Goren Gordon<br />
* Affiliation: Weizmann Institute<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: Hierarchical Curiosity Loops – Model, Behavior and Robotics<br />
* Abstract: Autonomously learning about one's own body and its interaction with the environment is a formidable challenge, yet it is ubiquitous in biology: every animal pup and every human infant accomplishes this task in the first few months of life. Furthermore, biological agents’ curiosity actively drives them to explore and experiment in order to expedite their learning progress. To bridge the gap between biological and artificial agents, a formal mathematical theory of curiosity was developed that attempts to explain observed biological behaviors and enable the emergence of curiosity in robots. In the talk, I will present the hierarchical curiosity loops model, its application to rodents' exploratory behavior and its implementation in a fully autonomously learning and behaving reaching robot.<br />
<br />
'''29 Jan 2013'''<br />
* Speaker: Jenny Read<br />
* Affiliation: Institute of Neuroscience, Newcastle University<br />
* Host: Sarah<br />
* Status: confirmed<br />
* Title: Stereoscopic vision<br />
* Abstract: [To be written]<br />
<br />
'''7 Feb 2013'''<br />
* Speaker: Valero Laparra<br />
* Affiliation: University of Valencia<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Empirical statistical analysis of phases in Gabor filtered natural images<br />
* Abstract:<br />
<br />
'''20 Feb 2013'''<br />
* Speaker: Dolores Bozovic<br />
* Affiliation: UCLA<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: Bifurcations and phase-locking dynamics in the auditory system<br />
* Abstract: The inner ear constitutes a remarkable biological sensor that exhibits nanometer-scale sensitivity of mechanical detection. The first step in auditory processing is performed by hair cells, which convert movement into electrical signals via the opening of mechanically gated ion channels. These cells operate in a viscous medium, but can nevertheless sustain oscillations, amplify incoming signals, and even exhibit spontaneous motility, indicating the presence of an underlying active amplification system. Theoretical models have proposed that a hair cell constitutes a nonlinear system with an internal feedback mechanism that can drive it across a bifurcation and into an unstable regime. Our experiments explore the nonlinear response as well as feedback mechanisms that enable self-tuning already at the peripheral level, as measured in vitro on sensory tissue. A simple dynamical systems framework will be discussed that captures the main features of the experimentally observed behavior in the form of an Arnold tongue.<br />
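The dynamical-systems picture in the abstract can be illustrated with the normal form of a Hopf oscillator (a generic textbook sketch, not the speaker's specific model; the forcing amplitudes and integration parameters are invented): poised exactly at the bifurcation, the forced response grows as the cube root of the stimulus, so gain is largest for the weakest signals.

```python
import numpy as np

# Hopf normal form poised at its bifurcation (mu = 0), driven at
# resonance. In the frame rotating with the drive the dynamics reduce to
#   dw/dt = mu*w - |w|^2 w + F,
# and the steady response obeys R^3 = F at mu = 0, so the gain
# R/F = F^(-2/3) -- compressive amplification, strongest for weak input.
def response_amplitude(F, mu=0.0, T=400.0, dt=0.01):
    w = 0j                               # complex amplitude, rotating frame
    for _ in range(int(T / dt)):
        w += dt * (mu * w - abs(w) ** 2 * w + F)
    return abs(w)

for F in (1e-3, 1e-2, 1e-1, 1.0):
    R = response_amplitude(F)
    print(f"F={F:g}  R={R:.3f}  gain={R / F:.1f}")   # gain ~ F^(-2/3)
```

Detuning the drive away from resonance shrinks the amplified region, which is what traces out the Arnold-tongue shape mentioned in the abstract.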
<br />
'''27 March 2013'''<br />
* Speaker: Dale Purves<br />
* Affiliation: Duke<br />
* Host: Sarah<br />
* Status: confirmed<br />
* Title: How Visual Evolution Determines What We See<br />
* Abstract: Information about the physical world is excluded from visual stimuli by the nature of biological vision (the inverse optics problem). Nonetheless, humans and other visual animals routinely succeed in their environments. The talk will explain how the assignment of perceptual values to visual stimuli according to the frequency of occurrence of stimulus patterns resolves the inverse problem and determines the basic visual qualities we see. This interpretation of vision implies that the best (and perhaps the only) way to understand visual system circuitry is to evolve it, an idea supported by recent work.<br />
<br />
'''9 April 2013'''<br />
* Speaker: Mounya Elhilali<br />
* Affiliation: Johns Hopkins<br />
* Host: Tyler<br />
* Status: confirmed<br />
* Title: Attention at the cocktail party: Neural bases and computational strategies for auditory scene analysis<br />
* Abstract: The perceptual organization of sounds in the environment into coherent objects is a feat constantly facing the auditory system. It manifests itself in the everyday challenge, faced by humans and animals alike, of parsing complex acoustic information arising from multiple sound sources into separate auditory streams. While the process seems effortless, uncovering the neural mechanisms and computational principles underlying this remarkable ability remains a challenge for both the experimental and theoretical neuroscience communities. In this talk, I discuss the potential role of neuronal tuning in mammalian primary auditory cortex in mediating this process. I also examine the role of mechanisms of attention in adapting this neural representation to reflect both the sensory content and the changing behavioral context of complex acoustic scenes.<br />
<br />
'''17 April 2013'''<br />
* Speaker: Wiktor Młynarski<br />
* Affiliation: Max Planck Institute for Mathematics in the Sciences<br />
* Host: Urs<br />
* Status: confirmed<br />
* Title: Statistical Models of Binaural Sounds<br />
* Abstract: The auditory system exploits disparities in the sounds arriving at the left and right ear to extract information about the spatial configuration of sound sources. According to the widely acknowledged Duplex Theory, sounds of low frequency are localized based on Interaural Time Differences (ITDs) and localization of high frequency sources relies on Interaural Level Differences (ILDs). Natural sounds, however, possess a rich structure and contain multiple frequency components. This leads to the question: what are the contributions of different cues to sound position identification in the natural environment and how much information do they carry about its spatial structure? In this talk, I will present my attempts to answer the above questions using statistical, generative models of naturalistic (simulated) and fully natural binaural sounds.<br />
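The two cues named in the abstract are easy to demonstrate on synthetic data (a toy sketch, not the speaker's statistical models; the delay, attenuation, and signal parameters are invented): ITD is the lag that maximizes the cross-correlation between the two ears, and ILD is the level ratio in decibels.

```python
import numpy as np

# Estimating binaural cues from a synthetic stereo signal.
# The right ear receives a delayed, attenuated copy of the left ear.
rng = np.random.default_rng(0)
fs = 44100                         # sample rate (Hz)
true_itd = 20                      # delay in samples (~0.45 ms at 44.1 kHz)
atten = 0.5                        # right-ear attenuation -> ILD of ~6 dB
src = rng.standard_normal(fs)      # 1 s of noise as the "source"
left = src
right = np.concatenate([np.zeros(true_itd), src[:-true_itd]]) * atten

# ITD: lag of the cross-correlation peak
max_lag = 50
lags = np.arange(-max_lag, max_lag + 1)
xcorr = [np.dot(left[max_lag:-max_lag],
                np.roll(right, -lag)[max_lag:-max_lag]) for lag in lags]
itd_est = int(lags[int(np.argmax(xcorr))])

# ILD: RMS level difference in dB
ild_db = 20 * np.log10(np.std(left) / np.std(right))
print(f"ITD estimate: {itd_est} samples, ILD: {ild_db:.1f} dB")
```

With broadband natural sounds, as the abstract notes, both cues are available at once across frequency bands, which is what makes their relative contributions an empirical question.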
<br />
'''15 May 2013'''<br />
* Speaker: Byron Yu<br />
* Affiliation: CMU<br />
* Host: Bruno/Jose (jointly sponsored with CNEP)<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''22 May 2013'''<br />
* Speaker: Bijan Pesaran<br />
* Affiliation: NYU<br />
* Host: Bruno/Jose (jointly sponsored with CNEP)<br />
* Status: confirmed <br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
=== 2011/12 academic year ===<br />
<br />
'''15 Sep 2011 (Thursday, at noon)'''<br />
* Speaker: Kathrin Berkner<br />
* Affiliation: Ricoh Innovations Inc.<br />
* Host: Ivana Tosic<br />
* Status: Confirmed<br />
* Title: TBD<br />
* Abstract: TBD<br />
<br />
'''21 Sep 2011'''<br />
* Speaker: Mike Kilgard<br />
* Affiliation: UT Dallas<br />
* Host: Michael Silver<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''27 Sep 2011'''<br />
* Speaker: Moshe Gur<br />
* Affiliation: Dept. of Biomedical Engineering, Technion, Israel Institute of Technology<br />
* Host: Bruno/Stan<br />
* Status: Confirmed<br />
* Title: On the unity of perception: How does the brain integrate activity evoked at different cortical loci?<br />
* Abstract: Any physical device we know, including computers, must, when comparing A to B, send the information to a point C. I have done experiments in three modalities (somatosensory, auditory, and visual) in which two different loci in primary cortex are stimulated, and I argue that the "machine" convergence hypothesis cannot explain the perceptual results. Thus we must assume a non-converging mechanism whereby the brain, at times, can compare (integrate, process) events that take place at different loci without sending the information to a common target. Once we allow for such a mechanism, many phenomena can be viewed differently. Take, for example, the question of how and where multi-sensory integration takes place: we perceive a synchronized talking face, yet detailed visual and auditory information is represented at very different brain loci.<br />
<br />
'''5 Oct 2011'''<br />
* Speaker: Susanne Still<br />
* Affiliation: University of Hawaii at Manoa<br />
* Host: Jascha<br />
* Status: confirmed<br />
* Title: Predictive power, memory and dissipation in learning systems operating far from thermodynamic equilibrium<br />
* Abstract: Understanding the physical processes that underlie the functioning of biological computing machinery often requires describing processes that occur far from thermodynamic equilibrium. In recent years significant progress has been made in this area, most notably through Jarzynski’s work relation and Crooks’ fluctuation theorem. In this talk I will explore how dissipation of energy is related to a system's information-processing inefficiency. The focus is on driven systems that are embedded in a stochastic operating environment. If we describe the system as a state machine, then we can interpret the stochastic dynamics as performing a computation that results in an (implicit) model of the stochastic driving signal. I will show that instantaneous non-predictive information, which serves as a measure of model inefficiency, provides a lower bound on the average dissipated work. This implies that learning systems with larger predictive power can operate more energetically efficiently. One might speculate that biological systems have evolved to reflect this kind of adaptation. An interesting insight here is that a purely physical requirement turns out to be perfectly in line with the general belief that a useful model must be predictive (at fixed model complexity). Our result thereby ties together ideas from learning theory with basic non-equilibrium thermodynamics.<br />
<br />
'''19 Oct 2011'''<br />
* Speaker: Graham Cummins<br />
* Affiliation: WSU<br />
* Host: Jeff Teeters<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''26 Oct 2011'''<br />
* Speaker: Shinji Nishimoto<br />
* Affiliation: Gallant lab, UC Berkeley<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''14 Dec 2011'''<br />
* Speaker: Austin Roorda<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: How the unstable eye sees a stable and moving world<br />
* Abstract:<br />
<br />
'''11 Jan 2012'''<br />
* Speaker: Ken Nakayama<br />
* Affiliation: Harvard University<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Subjective Contours<br />
* Abstract: The concept of the receptive field in visual science has been transformative. It fueled the great discoveries of the second half of the 20th century, providing the dominant understanding of how the visual system works at its early stages. Its reign has been extended to the field of object recognition, where, in the form of a linear classifier, it provides a framework for understanding visual object recognition (DiCarlo and Cox, 2007).<br />
Untamed, however, are areas of visual perception, now more or less ignored, dubbed variously the 2.5-D sketch, mid-level vision, or surface representations. Here, neurons with their receptive fields seem unable to bridge the gap, to supply us with even a plausible speculative framework for understanding amodal completion, subjective contours and other surface phenomena. Correspondingly, these areas have become a backwater: ignored, leapt over.<br />
Subjective contours, however, remain as vivid as ever, even more so.<br />
Every day, our visual system makes countless visual inferences as to the layout of the world's surfaces and objects. What is remarkable is that subjective contours visibly reveal these inferences.<br />
<br />
'''Tuesday, 24 Jan 2012'''<br />
* Speaker: Aniruddha Das<br />
* Affiliation: Columbia University<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''22 Feb 2012'''<br />
* Speaker: Elad Schneidman <br />
* Affiliation: Department of Neurobiology, Weizmann Institute of Science<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Sparse high order interaction networks underlie learnable neural population codes<br />
* Abstract:<br />
<br />
'''29 Feb 2012 (at noon as usual)'''<br />
* Speaker: Heather Read<br />
* Affiliation: U. Connecticut<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: "Transformation of sparse temporal coding from auditory colliculus and cortex"<br />
* Abstract: TBD<br />
<br />
'''1 Mar 2012 (note: Thurs)'''<br />
* Speaker: Daniel Zoran<br />
* Affiliation: Hebrew University, Jerusalem<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''7 Mar 2012'''<br />
* Speaker: David Sivak<br />
* Affiliation: UCB<br />
* Host: Mike DeWeese<br />
* Status: Confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''8 Mar 2012'''<br />
* Speaker: Ivan Schwab<br />
* Affiliation: UC Davis<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Evolution's Witness: How Eyes Evolved<br />
* Abstract:<br />
<br />
'''14 Mar 2012'''<br />
* Speaker: David Sussillo<br />
* Affiliation:<br />
* Host: Jascha<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''18 April 2012'''<br />
* Speaker: Kristofer Bouchard<br />
* Affiliation: UCSF<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Cortical Foundations of Human Speech Production<br />
* Abstract:<br />
<br />
'''23 May 2012''' (rescheduled from April 11)<br />
* Speaker: Logan Grosenick<br />
* Affiliation: Stanford, Deisseroth & Suppes Labs<br />
* Host: Jascha<br />
* Status: confirmed<br />
* Title: Acquisition, creation, & analysis of 4D light fields with applications to calcium imaging & optogenetics<br />
* Abstract: In Light Field Microscopy (LFM), images can be computationally refocused after they are captured [1]. This permits acquiring focal stacks and reconstructing volumes from a single camera frame. In Light Field Illumination (LFI), the same ideas can be used to create an illumination system that can deliver focused light to any position in a volume without moving optics, and these two devices (LFM/LFI) can be used together in the same system [2]. So far, these imaging and illumination systems have largely been used independently in proof-of-concept experiments [1,2]. In this talk I will discuss applications of a combined scanless volumetric imaging and volumetric illumination system applied to 4D calcium imaging and photostimulation of neurons in vivo and in vitro. The volumes resulting from these methods are large (>500,000 voxels per time point), collected at 10-100 frames per second, and highly correlated in space and time. Analyzing such data has required the development and application of machine learning methods appropriate to large, sparse, nonnegative data, as well as the estimation of neural graphical models from calcium transients. This talk will cover the reconstruction and creation of volumes in a microscope using Light Fields [1,2], and the current state-of-the-art for analyzing these large volumes in the context of calcium imaging and optogenetics. <br />
<br />
[1] M. Levoy, R. Ng, A. Adams, M. Footer, and M. Horowitz. Light Field Microscopy. ACM Transactions on Graphics 25(3), Proceedings of SIGGRAPH 2006.<br />
[2] M. Levoy, Z. Zhang, and I. McDowall. Recording and controlling the 4D light field in a microscope. Journal of Microscopy, Volume 235, Part 2, 2009, pp. 144-162. Cover article.<br />
<br />
BIO: Logan received bachelor's degrees with honors in Biology and Psychology, and a master's in Statistics, from Stanford. He is a Ph.D. candidate in the Neurosciences Program working in the labs of Karl Deisseroth and Patrick Suppes, and a trainee at the Stanford Center for Mind, Brain, and Computation. He is interested in developing and applying novel computational imaging and machine learning techniques in order to observe, control, and understand neuronal circuit dynamics.<br />
<br />
'''7 June 2012''' (Thursday)<br />
* Speaker: Mitya Chklovskii<br />
* Affiliation: Janelia<br />
* Host: Bruno<br />
* Status:<br />
* Title:<br />
* Abstract:<br />
<br />
'''27 June 2012''' <br />
* Speaker: Jerry Feldman<br />
* Affiliation:<br />
* Host: Bruno<br />
* Status:<br />
* Title:<br />
* Abstract:<br />
<br />
'''30 July 2012''' <br />
* Speaker: Lucas Theis<br />
* Affiliation: Matthias Bethge lab, Werner Reichardt Centre for Integrative Neuroscience, Tübingen<br />
* Host: Jascha<br />
* Status: Confirmed<br />
* Title: Hierarchical models of natural images<br />
* Abstract: Probabilistic models of natural images have been used to solve a variety of computer vision tasks and as a means to better understand the computations performed by the visual system in the brain. Many theoretical considerations and biological observations suggest that natural image models should be hierarchically organized, yet to date the best-known models are still based on what are better described as shallow representations. In this talk, I will present two image models. One is based on the idea of Gaussianization for greedily constructing hierarchical generative models. I will show that when combined with independent subspace analysis, it is able to compete with the state of the art for modeling image patches. The other model combines mixtures of Gaussian scale mixtures with a directed graphical model and multiscale image representations, and is able to generate highly structured images of arbitrary size. Evaluating the model's likelihood and comparing it to a large number of other image models shows that it might well be the best model of natural images yet.<br />
<br />
(joint work with Reshad Hosseini and Matthias Bethge)<br />
<br />
=== 2010/11 academic year ===<br />
<br />
'''02 Sep 2010'''<br />
* Speaker: Johannes Burge<br />
* Affiliation: University of Texas at Austin<br />
* Host: Jimmy<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''8 Sep 2010'''<br />
* Speaker: Tobi Szuts<br />
* Affiliation: Meister Lab/ Harvard U.<br />
* Host: Mike DeWeese<br />
* Status: Confirmed<br />
* Title: Wireless recording of neural activity in the visual cortex of a freely moving rat.<br />
* Abstract: Conventional neural recording systems restrict behavioral experiments to a flat indoor environment compatible with the cable that tethers the subject to the recording instruments. To overcome these constraints, we developed a wireless multi-channel system for recording neural signals from a freely moving animal the size of a rat or larger. The device takes up to 64 voltage signals from implanted electrodes, samples each at 20 kHz, time-division multiplexes them onto a single output line, and transmits that output by radio frequency to a receiver and recording computer more than 60 m away. The system introduces less than 4 µV RMS of electrode-referred noise, comparable to wired recording systems and considerably less than biological noise. The system has a greater channel count and transmission distance than existing telemetry systems. The wireless system has been used to record from the visual cortex of a rat during unconstrained conditions. Outdoor recordings show that V1 activity is modulated by nest-building activity. During unguided behavior indoors, neurons responded rapidly and consistently to changes in light level, suppressive effects were prominent in response to an illuminant transition, and firing rate was strongly modulated by locomotion. Neural firing in the visual cortex is relatively sparse, and moderate correlations are observed over large distances, suggesting that synchrony is driven by global processes.<br />
<br />
'''29 Sep 2010'''<br />
* Speaker: Vikash Gilja<br />
* Affiliation: Stanford University<br />
* Host: Charles<br />
* Status: Confirmed<br />
* Title: Towards Clinically Viable Neural Prosthetic Systems.<br />
* Abstract:<br />
<br />
'''20 Oct 2010'''<br />
* Speaker: Alexandre Francois<br />
* Affiliation: USC<br />
* Host: <br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''3 Nov 2010'''<br />
* Speaker: Eric Jonas and Vikash Mansinghka<br />
* Affiliation: Navia Systems<br />
* Host: Jascha<br />
* Status: Confirmed<br />
* Title: Natively Probabilistic Computation: Principles, Artifacts, Architectures and Applications<br />
* Abstract: Complex probabilistic models and Bayesian inference are becoming increasingly critical across science and industry, especially in large-scale data analysis. They are also central to our best computational accounts of human cognition, perception and action. However, all these efforts struggle with the infamous curse of dimensionality. Rich probabilistic models can seem hard to write and even harder to solve, as specifying and calculating probabilities often appears to require the manipulation of exponentially (and sometimes infinitely) large tables of numbers.<br />
<br />
We argue that these difficulties reflect a basic mismatch between the needs of probabilistic reasoning and the deterministic, functional orientation of our current hardware, programming languages and CS theory. To mitigate these issues, we have been developing a stack of abstractions for natively probabilistic computation, based around stochastic simulators (or samplers) for distributions, rather than evaluators for deterministic functions. Ultimately, our aim is to produce a model of computation and the associated hardware and programming tools that are as suited for uncertain inference and decision-making as our current computers are for precise arithmetic.<br />
<br />
In this talk, we will give an overview of the entire stack of abstractions supporting natively probabilistic computation, with technical detail on several hardware and software artifacts we have implemented so far. We will also touch on some new theoretical results regarding the computational complexity of probabilistic programs. Throughout, we will motivate and connect this work to some current applications in biomedical data analysis and computer vision, as well as potential hypotheses regarding the implementation of probabilistic computation in the brain.<br />
<br />
This talk includes joint work with Keith Bonawitz, Beau Cronin, Cameron Freer, Daniel Roy and Joshua Tenenbaum.<br />
<br />
BRIEF BIOGRAPHY<br />
<br />
Vikash Mansinghka is a co-founder and the CTO of Navia Systems, a venture-funded startup company building natively probabilistic computing machines. He spent 10 years at MIT, eventually earning an S.B. in Mathematics, an S.B. in Computer Science, an M.Eng. in Computer Science, and a PhD in Computation. He held graduate fellowships from the NSF and MIT Lincoln Laboratory, and his PhD dissertation won the 2009 MIT George M. Sprowls award for best dissertation in computer science. He currently serves on DARPA's Information Science and Technology (ISAT) Study Group.<br />
<br />
Eric Jonas is a co-founder of Navia Systems, responsible for in-house accelerated inference research and development. He spent ten years at MIT, where he earned S.B. degrees in electrical engineering and computer science and in neurobiology, and an M.Eng. in EECS, with a neurobiology PhD expected really soon. He’s passionate about biological applications of probabilistic reasoning and hopes to use Navia’s capabilities to combine data from biological science, clinical histories, and patient outcomes into seamless models.<br />
<br />
'''8 Nov 2010'''<br />
* Speaker: Patrick Ruther<br />
* Affiliation: Imtek, University of Freiburg<br />
* Host: Tim<br />
* Status: Confirmed<br />
* Title: TBD<br />
* Abstract: TBD<br />
<br />
'''10 Nov 2010'''<br />
* Speaker: Aurel Lazar<br />
* Affiliation: Department of Electrical Engineering, Columbia University<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Encoding Visual Stimuli with a Population of Hodgkin-Huxley Neurons<br />
* Abstract: We first present a general framework for the reconstruction of natural video scenes encoded with a population of spiking neural circuits with random thresholds. The visual encoding system consists of a bank of filters, modeling the visual receptive fields, in cascade with a population of neural circuits, modeling encoding with spikes in the early visual system. The neuron models considered include integrate-and-fire neurons and ON-OFF neuron pairs with threshold-and-fire spiking mechanisms. All thresholds are assumed to be random. We show that for both time-varying and space-time-varying stimuli, neural spike encoding is akin to taking noisy measurements on the stimulus. Second, we formulate the reconstruction problem as the minimization of a suitable cost functional in a finite-dimensional vector space and provide an explicit algorithm for stimulus recovery. We also present a general solution using the theory of smoothing splines in Reproducing Kernel Hilbert Spaces. We provide examples for both synthetic video and natural scenes, and show that the quality of the reconstruction degrades gracefully as the threshold variability of the neurons increases. Third, we demonstrate a number of simple operations on the original visual stimulus, including translations, rotations and zooming. All these operations are natively executed in the spike domain. The processed spike trains are decoded for the faithful recovery of the stimulus and its transformations. Finally, we extend the above results to neural encoding circuits built with Hodgkin-Huxley neurons.<br />
References:<br />
Aurel A. Lazar, Eftychios A. Pnevmatikakis and Yiyin Zhou, Encoding Natural Scenes with Neural Circuits with Random Thresholds, Vision Research, 2010, Special Issue on Mathematical Models of Visual Coding, http://dx.doi.org/10.1016/j.visres.2010.03.015<br />
Aurel A. Lazar, Population Encoding with Hodgkin-Huxley Neurons, IEEE Transactions on Information Theory, Volume 56, Number 2, pp. 821-837, February 2010, Special Issue on Molecular Biology and Neuroscience, http://dx.doi.org/10.1109/TIT.2009.2037040<br />
<br />
'''11 Nov 2010''' (UCB holiday)<br />
* Speaker: Martha Nari Havenith<br />
* Affiliation: UCL<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: Finding spike timing in the visual cortex - Oscillations as the internal clock of vision?<br />
* Abstract:<br />
<br />
'''19 Nov 2010''' (note: on Friday because of SFN)<br />
* Speaker: Dan Butts<br />
* Affiliation: UMD<br />
* Host: Tim<br />
* Status: Confirmed<br />
* Title: Common roles of inhibition in visual and auditory processing.<br />
* Abstract: The role of inhibition in sensory processing is often obscured in extracellular recordings, because the absence of a neuronal response associated with inhibition might also be explained by a simple lack of excitation. However, increasing evidence from intracellular recordings demonstrates important roles of inhibition in shaping the stimulus selectivity of sensory neurons in both the visual and auditory systems. We have developed a nonlinear modeling approach that can identify putative excitatory and inhibitory inputs to a neuron using standard extracellular recordings, and have applied these techniques to understand the role of inhibition in shaping sensory processing in visual and auditory areas. In pre-cortical visual areas (retina and LGN), we find that inhibition likely plays a role in generating temporally precise responses, and mediates adaptation to changing contrast. In an auditory pre-cortical area (inferior colliculus), identified inhibition has a nearly identical appearance and function in temporal processing and adaptation. Thus, we predict common roles of inhibition in these sensory areas, and more broadly demonstrate general methods for characterizing the nonlinear computations that comprise sensory processing.<br />
<br />
'''24 Nov 2010'''<br />
* Speaker: Eizaburo Doi<br />
* Affiliation: NYU<br />
* Host: Jimmy<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
<br />
'''29 Nov 2010 - informal talk'''<br />
* Speaker: Eero Lehtonen<br />
* Affiliation: UTU Finland<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Memristors<br />
* Abstract:<br />
<br />
'''1 Dec 2010'''<br />
* Speaker: Gadi Geiger<br />
* Affiliation: MIT<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: Visual and Auditory Perceptual Modes that Characterize Dyslexics<br />
* Abstract: I will describe how dyslexics’ visual and auditory perception is wider and more diffuse than that of typical readers, suggesting wider neural tuning in dyslexics. In addition, I will describe how this processing relates to difficulties in reading. Strengthening the argument, and more importantly helping dyslexics, I will describe a regimen of practice that results in improved reading in dyslexics while narrowing perception.<br />
<br />
<br />
'''13 Dec 2010'''<br />
* Speaker: Jorg Lueke<br />
* Affiliation: FIAS<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Linear and Non-linear Approaches to Component Extraction and Their Applications to Visual Data<br />
* Abstract: In the nervous system of humans and animals, sensory data are represented as combinations of elementary data components. While for data such as sound waveforms the elementary components combine linearly, other data can better be modeled by non-linear forms of component superpositions. I motivate and discuss two models with binary latent variables: one using standard linear superpositions of basis functions and one using non-linear superpositions. Crucial for the applicability of both models are efficient learning procedures. I briefly introduce a novel training scheme (ET) and show how it can be applied to probabilistic generative models. For linear and non-linear models the scheme efficiently infers the basis functions as well as the level of sparseness and data noise. In large-scale applications to image patches, we show results on the statistics of inferred model parameters. Differences between the linear and non-linear models are discussed, and both models are compared to results of standard approaches in the literature and to experimental findings. Finally, I briefly discuss learning in a recent model that takes explicit component occlusions into account.<br />
<br />
'''15 Dec 2010'''<br />
* Speaker: Claudia Clopath<br />
* Affiliation: Université Paris Descartes<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
<br />
'''18 Jan 2011'''<br />
* Speaker: Siwei Lyu<br />
* Affiliation: Computer Science Department, University at Albany, SUNY<br />
* Host: Bruno<br />
* Status: confirmed <br />
* Title: Divisive Normalization as an Efficient Coding Transform: Justification and Evaluation<br />
* Abstract:<br />
<br />
'''19 Jan 2011'''<br />
* Speaker: David Field (informal talk)<br />
* Affiliation: <br />
* Host: Bruno<br />
* Status: Tentative<br />
* Title: <br />
* Abstract:<br />
<br />
'''25 Jan 2011'''<br />
* Speaker: Ruth Rosenholtz<br />
* Affiliation: Dept. of Brain & Cognitive Sciences, Computer Science and AI Lab, MIT<br />
* Host: Bruno<br />
* Status: Confirmed <br />
* Title: What your visual system sees where you are not looking<br />
* Abstract:<br />
<br />
'''26 Jan 2011'''<br />
* Speaker: Ernst Niebur<br />
* Affiliation: Johns Hopkins U<br />
* Host: Fritz<br />
* Status: Confirmed <br />
* Title: <br />
* Abstract:<br />
<br />
'''16 March 2011'''<br />
* Speaker: Vladimir Itskov<br />
* Affiliation: University of Nebraska-Lincoln<br />
* Host: Chris<br />
* Status: Confirmed <br />
* Title: <br />
* Abstract:<br />
<br />
'''23 March 2011'''<br />
* Speaker: Bruce Cumming<br />
* Affiliation: National Institutes of Health<br />
* Host: Ivana<br />
* Status: Confirmed<br />
* Title: TBD<br />
* Abstract:<br />
<br />
'''27 April 2011'''<br />
* Speaker: Lubomir Bourdev<br />
* Affiliation: Computer Science, UC Berkeley<br />
* Host:Bruno<br />
* Status: Confirmed<br />
* Title: "Poselets and Their Applications in High-Level Computer Vision Problems"<br />
* Abstract:<br />
<br />
'''12 May 2011 (note: Thursday)'''<br />
* Speaker: Jack Culpepper<br />
* Affiliation: Redwood Center/EECS<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''26 May 2011'''<br />
* Speaker: Ian Stevenson<br />
* Affiliation: Northwestern University<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Explaining tuning curves by estimating interactions between neurons<br />
* Abstract: One of the central tenets of systems neuroscience is that tuning curves are a byproduct of the interactions between neurons. Using multi-electrode recordings and recently developed inference techniques, we can begin to examine this idea in detail and study how well we can explain the functional properties of neurons using the activity of other simultaneously recorded neurons. Here we examine datasets from six different brain areas recorded during typical sensorimotor tasks, each with ~100 simultaneously recorded neurons. Using these datasets, we measured the extent to which interactions between neurons can explain the tuning properties of individual neurons. We found that, in almost all areas, modeling interactions between 30-50 neurons allows more accurate spike prediction than tuning curves. This suggests that tuning can, in some sense, be explained by interactions between neurons in a variety of brain areas, even when recordings consist of relatively small numbers of neurons.<br />
<br />
'''1 June 2011'''<br />
* Speaker: Michael Oliver<br />
* Affiliation: Gallant lab<br />
* Host: Bruno<br />
* Status: Tentative <br />
* Title: <br />
* Abstract:<br />
<br />
'''8 June 2011'''<br />
* Speaker: Alyson Fletcher<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: tentative<br />
* Title: Generalized Approximate Message Passing for Neural Receptive Field Estimation and Connectivity<br />
* Abstract: Fundamental to understanding sensory encoding and connectivity of neurons are effective tools for developing and validating complex mathematical models from experimental data. In this talk, I present a graphical-models approach to the problems of neural connectivity reconstruction under multi-neuron excitation and of receptive field estimation of sensory neurons in response to stimuli. I describe a new class of Generalized Approximate Message Passing (GAMP) algorithms for a general class of inference problems on graphical models, based on Gaussian approximations of loopy belief propagation. The GAMP framework is extremely general, provides a systematic procedure for incorporating a rich class of nonlinearities, and is computationally tractable with large amounts of data. In addition, for both the connectivity reconstruction and parameter estimation problems, I show that GAMP-based estimation can naturally incorporate sparsity constraints in the model that arise from the fact that only a small fraction of the potential inputs have any influence on the output of a particular neuron. A simulation of reconstruction of cortical neural mapping under multi-neuron excitation shows that GAMP offers improvement over previous compressed sensing methods. The GAMP method is also validated on estimation of linear-nonlinear-Poisson (LNP) cascade models for neural responses of salamander retinal ganglion cells.<br />
<br />
=== 2009/10 academic year ===<br />
<br />
'''2 September 2009''' <br />
* Speaker: Keith Godfrey<br />
* Affiliation: University of Cambridge<br />
* Host: Tim<br />
* Status: Confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''7 October 2009'''<br />
* Speaker: Anita Schmid<br />
* Affiliation: Cornell University<br />
* Host: Kilian<br />
* Status: Confirmed<br />
* Title: Subpopulations of neurons in visual area V2 perform differentiation and integration operations in space and time<br />
* Abstract: The interconnected areas of the visual system work together to find object boundaries in visual scenes. Primary visual cortex (V1) mainly extracts oriented luminance boundaries, while secondary visual cortex (V2) also detects boundaries defined by differences in texture. How the outputs of V1 neurons are combined to allow for the extraction of these more complex boundaries in V2 is as yet unclear. To address this question, we probed the processing of orientation signals in single neurons in V1 and V2, focusing on response dynamics of neurons to patches of oriented gratings and to combinations of gratings in neighboring patches and sequential time frames. We found two kinds of response dynamics in V2, both of which are different from those of V1 neurons. While V1 neurons in general prefer one orientation, one subpopulation of V2 neurons (“transient”) shows a temporally dynamic preference, resulting in a preference for changes in orientation. The second subpopulation of V2 neurons (“sustained”) responds similarly to V1 neurons, but with a delay. The dynamics of nonlinear responses to combinations of gratings reinforce these distinctions: the dynamics enhance the preference of V1 neurons for continuous orientations, and enhance the preference of V2 transient neurons for discontinuous ones. We propose that transient neurons in V2 perform a differentiation operation on the V1 input, both spatially and temporally, while the sustained neurons perform an integration operation. We show that a simple feedforward network with delayed inhibition can account for the temporal but not for the spatial differentiation operation.<br />
<br />
'''28 October 2009'''<br />
* Speaker: Andrea Benucci<br />
* Affiliation: Institute of Ophthalmology, University College London<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Stimulus dependence of the functional connectivity between neurons in primary visual cortex<br />
* Abstract: It is known that visual stimuli are encoded by the concerted activity of large populations of neurons in visual cortical areas. However, it is only recently that recording techniques have been made available to study such activations from large ensembles of neurons simultaneously, with millisecond temporal precision and tens of microns spatial resolution. I will present data from voltage-sensitive dye (VSD) imaging and multi-electrode recordings (“Utah” probes) from the primary visual cortex of the cat (V1). I will discuss the relationship between two fundamental cortical maps of the visual system: the map of retinotopy and the map of orientation. Using spatially localized and full-field oriented stimuli, we studied the functional interdependency of these maps. I will describe traveling and standing waves of cortical activity and their key role as a dynamical substrate for the spatio-temporal coding of visual information. I will further discuss the properties of the spatio-temporal code in the context of continuous visual stimulation. While recording population responses to a sequence of oriented stimuli, we asked how responses to individual stimuli summate over time. We found that such rules are mostly linear, supporting the idea that spatial and temporal codes in area V1 operate largely independently. However, these linear rules of summation fail when the visual drive is removed, suggesting that the visual cortex can readily switch between a dynamical regime where either feed-forward or intra-cortical inputs determine the response properties of the network.<br />
<br />
'''12 November 2009 (Thursday)'''<br />
* Speaker: Song-Chun Zhu<br />
* Affiliation: UCLA<br />
* Host: Jimmy<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''18 November 2009'''<br />
* Speaker: Dan Graham<br />
* Affiliation: Dept. of Mathematics, Dartmouth College<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: The Packet-Switching Brain: A Hypothesis<br />
* Abstract: Despite great advances in our understanding of neural responses to natural stimuli, the basic structure of the neural code remains elusive. In this talk, I will describe a novel hypothesis regarding the fundamental structure of neural coding in mammals. In particular, I propose that an internet-like routing architecture (specifically packet-switching) underlies neocortical processing, and I propose means of testing this hypothesis via neural response sparseness measurements. I will synthesize a host of suggestive evidence that supports this notion and will, more generally, argue in favor of a large scale shift from the now dominant “computer metaphor,” to the “internet metaphor.” This shift is intended to spur new thinking with regard to neural coding, and its main contribution is to privilege communication over computation as the prime goal of neural systems.<br />
<br />
'''16 December 2009'''<br />
* Speaker: Pietro Berkes<br />
* Affiliation: Volen Center for Complex Systems, Brandeis University<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Generative models of vision: from sparse coding toward structured models<br />
* Abstract: From a computational perspective, one can think of visual perception as the problem of analyzing the light patterns detected by the retina to recover their external causes. This process requires combining the incoming sensory evidence with internal prior knowledge about general properties of visual elements and the way they interact, and can be formalized in a class of models known as causal generative models. In the first part of the talk, I will discuss the first and most established generative model, namely the sparse coding model. Sparse coding has been largely successful in showing how the main characteristics of simple cells' receptive fields can be accounted for based uniquely on the statistics of natural images. I will briefly review the evidence supporting this model, and contrast it with recent data from the primary visual cortex of ferrets and rats showing that the sparseness of neural activity over development and anesthesia seems to follow trends opposite to those predicted by sparse coding. In the second part, I will argue that the generative point of view calls for models of natural images that take into account more of the structure of the visual environment. I will present a model that takes a first step in this direction by incorporating the fundamental distinction between identity and attributes of visual elements. After learning, the model mirrors several aspects of the organization of V1, and results in a novel interpretation of complex and simple cells as parallel populations of cells, coding for different aspects of the visual input. Further steps toward more structured generative models might thus lead to the development of a more comprehensive account of visual processing in the visual cortex.<br />
<br />
'''6 January 2010'''<br />
* Speaker: Susanne Still<br />
* Affiliation: U of Hawaii<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''20 January 2010'''<br />
* Speaker: Tom Dean<br />
* Affiliation: Google<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Accelerating Computer Vision and Machine Learning Algorithms with Graphics Processors<br />
* Abstract: Graphics processors (GPUs) and massively-multi-core architectures are becoming more powerful, less costly and more energy efficient, and the related programming language issues are beginning to sort themselves out. That said, most researchers don’t want to be writing code that depends on any particular architecture or parallel programming model. Linear algebra, Fourier analysis and image processing have standard libraries that are being ported to exploit SIMD parallelism in GPUs. We can depend on the massively-multi-core machines du jour to support these libraries and on the high-performance-computing (HPC) community to do the porting for us or with us. These libraries can significantly accelerate important applications in image processing, data analysis and information retrieval. We can develop APIs and the necessary run-time support so that code relying on these libraries will run on any machine in a cluster of computers but exploit GPUs whenever available. This strategy allows us to move toward hybrid computing models that enable a wider range of opportunities for parallelism without requiring the special training of programmers or the disadvantages of developing code that depends on specialized hardware or programming models. This talk summarizes the state of the art in massively-multi-core architectures, presents experimental results that demonstrate the potential for significant performance gains in the two general areas of image processing and machine learning, provides examples of the proposed programming interface, and gives some more detailed experimental results on one particular problem involving video-content analysis.<br />
<br />
'''27 January 2010'''<br />
* Speaker: David Philiponna<br />
* Affiliation: Paris<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''24 February 2010'''<br />
* Speaker: Gordon Pipa<br />
* Affiliation: U Osnabrueck/MPI Frankfurt<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''3 March 2010'''<br />
* Speaker: Gaute Einevoll<br />
* Affiliation: UMB, Norway<br />
* Host: Amir<br />
* Status: Confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
<br />
'''4 March 2010'''<br />
* Speaker: Harvey Swadlow<br />
* Affiliation: <br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''8 April 2010'''<br />
* Speaker: Alan Yuille <br />
* Affiliation: UCLA<br />
* Host: Amir<br />
* Status: Confirmed (for 1pm)<br />
* Title: <br />
* Abstract:<br />
<br />
'''28 April 2010'''<br />
* Speaker: Dharmendra Modha - cancelled<br />
* Affiliation: IBM<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''5 May 2010'''<br />
* Speaker: David Zipser<br />
* Affiliation: UCB<br />
* Host: Daniel Little<br />
* Status: Tentative<br />
* Title: Brytes 2:<br />
* Abstract:<br />
<br />
Brytes are little brains that can be assembled into larger, smarter brains. In my first talk I presented a biologically plausible, computationally tractable model of brytes and described how they can be used as subunits to build brains with interesting behaviors.<br />
<br />
In this talk I will first show how large numbers of brytes can cooperate to perform complicated actions such as arm and hand manipulations in the presence of obstacles. Then I describe a strategy for a higher level of control that informs each bryte what role it should play in accomplishing the current task. These results could have considerable significance for understanding the brain and possibly be applicable to robotics and BMI.<br />
<br />
'''12 May 2010'''<br />
* Speaker: Frank Werblin (Redwood group meeting - internal only)<br />
* Affiliation: Berkeley<br />
* Host: Bruno<br />
* Status: Tentative<br />
* Title: <br />
* Abstract:<br />
<br />
'''19 May 2010'''<br />
* Speaker: Anna Judith<br />
* Affiliation: UCB<br />
* Host: Daniel Little (Redwood Lab Meeting - internal only)<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:</div>
Jesselivezey https://rctn.org/w/index.php?title=Seminars&diff=8473 Seminars 2016-02-13T04:53:09Z<p>Jesselivezey: /* Tentative / Confirmed Speakers */</p>
<hr />
<div>== Instructions ==<br />
<br />
# Check the internal calendar (here) for a free seminar slot. Seminars are usually Wednesdays at noon, but the time is flexible in case there is a day that works better for the speaker. However, it is usually best to avoid booking multiple speakers in the same week - it leads to "seminar burnout" and reduced attendance. But use your own judgement here - if it's a good opportunity and that's the only time that works, then go ahead with it.<br />
# Once you have proposed a date to a speaker, fill in the speaker information under the appropriate date (or change if necessary). Use the status field to indicate whether the date is tentative or confirmed. Please also include your name as ''host'' in case somebody wants to contact you.<br />
# Once the invitation is confirmed with the speaker, change the status field to 'confirmed'. Also notify the webmaster (Bruno) [mailto:baolshausen@berkeley.edu] that we have a confirmed speaker so that he can update the public web page. Please include a title and abstract.<br />
# Natalie (HWNI) checks our web page regularly and will send out an announcement a week before and also include it with the weekly neuro announcements, but if you don't get it confirmed until the last minute then make sure to email Natalie [mailto:nrterranova@berkeley.edu] as well to give her a heads up so she knows to send out an announcement in time.<br />
# If the speaker needs accommodations, contact Natalie [mailto:nrterranova@berkeley.edu] to reserve a room at the faculty club. Tell her it's for a Redwood speaker so she knows how to bill it.<br />
# During the visit you will need to look after the visitor, schedule visits with other labs, make plans for lunch, dinner, etc., and introduce the speaker at the seminar (don't ask Bruno to do this at the last moment). Save receipts for any meals you paid for.<br />
# After the seminar and before the speaker leaves, make sure to give them Natalie's contact info and have them email her their receipts, explaining that it's for reimbursement for a Redwood seminar. Natalie will then process the reimbursement. She can also help you with getting reimbursed for any expenses you incurred for meals and entertainment.<br />
<br />
== Tentative / Confirmed Speakers ==<br />
<br />
<br />
'''Feb 3, 2016'''<br />
* Speaker: Ping-Chen Huang<br />
* Affiliation: Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
'''Feb 17, 2016'''<br />
* Speaker: Andrew Saxe<br />
* Affiliation: Harvard<br />
* Host: Jesse<br />
* Status: confirmed<br />
* Title:<br />
<br />
'''Mar 1, 2016'''<br />
* Speaker: Leon Gatys<br />
* Affiliation: Univ Tubingen<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
'''Mar 7-9, 2016'''<br />
* NICE workshop<br />
<br />
'''Mar 23, 2016'''<br />
* Speaker: Kwabena Boahen<br />
* Affiliation: Stanford<br />
* Host: Max Kanwal/Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
'''Mar 30, 2016'''<br />
* Tony Zador HWNI talk at 12:00<br />
<br />
'''May 18, 2016'''<br />
* Speaker: Melanie Mitchell<br />
* Affiliation: Portland State University<br />
* Host: Dylan<br />
* Status: confirmed<br />
* Title:<br />
<br />
== Previous Seminars ==<br />
<br />
=== 2015/16 academic year ===<br />
<br />
'''July 21, 2015'''<br />
* Speaker: Felix Effenberger<br />
* Affiliation: <br />
* Host: Chris H.<br />
* Status: confirmed<br />
* Title: <br />
* Abstract<br />
<br />
'''July 22, 2015'''<br />
* Speaker: Lav Varshney<br />
* Affiliation: Urbana-Champaign<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract<br />
<br />
'''July 23, 2015'''<br />
* Speaker: Xuemin Wei<br />
* Affiliation: Univ Penn<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract<br />
<br />
'''July 29, 2015'''<br />
* Speaker: Gonzalo Otazu<br />
* Affiliation: Cold Spring Harbor Laboratory, Long Island, NY<br />
* Host: Mike D<br />
* Status: Confirmed<br />
* Title: The Role of Cortical Feedback in Olfactory Processing<br />
* Abstract: The olfactory bulb receives rich glutamatergic projections from the piriform cortex. However, the dynamics and importance of these feedback signals remain unknown. In the first part of this talk, I will present data from multiphoton calcium imaging of cortical feedback in the olfactory bulb of awake mice. Responses of feedback boutons were sparse, odor specific, and often outlasted stimuli by several seconds. Odor presentation either enhanced or suppressed the activity of boutons. However, any given bouton responded with stereotypic polarity across multiple odors, preferring either enhancement or suppression. Inactivation of piriform cortex increased odor responsiveness and pairwise similarity of mitral cells but had little impact on tufted cells. We propose that cortical feedback differentially impacts these two output channels of the bulb by specifically decorrelating mitral cell responses to enable odor separation. In the second part of the talk I will introduce a computational model of odor identification in natural scenes that uses cortical feedback and how the model predictions match our experimental data.<br />
<br />
'''Aug 19, 2015'''<br />
* Speaker: Wujie Zhang<br />
* Affiliation: Columbia<br />
* Host: Bruno/Michael Yartsev<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''Sept 2, 2015'''<br />
* Speaker: Jeremy Maitin-Shepard<br />
* Affiliation: Computer Science, UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Combinatorial Energy Learning for Image Segmentation<br />
* Abstract: Recent advances in volume electron microscopy make it possible to image neuronal tissue volumes containing hundreds of thousands of neurons at sufficient resolution to discern even the finest neuronal processes. Accurate 3-D segmentation of these processes, densely packed in petavoxel-scale volumes, is the key bottleneck in reconstructing large-scale neural circuits.<br />
<br />
'''Sept 8, 2015'''<br />
* Speaker: Jennifer Hasler<br />
* Affiliation: Georgia Tech<br />
* Host: Bruno/Mika<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''October 29, 2015'''<br />
* Speaker: Garrett Kenyon<br />
* Affiliation: Los Alamos National Laboratory<br />
* Host: Dylan<br />
* Status: confirmed<br />
* Title: A Deconvolutional Competitive Algorithm (DCA)<br />
* Abstract: The Locally Competitive Algorithm (LCA) is a neurally-plausible sparse solver based on lateral inhibition between leaky integrator neurons. LCA accounts for many linear and nonlinear response properties of V1 simple cells, including end-stopping and contrast-invariant orientation tuning. Here, we describe a convolutional implementation of LCA in which a column of feature vectors is replicated with a stride that is much smaller than the diameter of the corresponding kernels, allowing the construction of dictionaries that are many times more overcomplete than without replication. Using a local Hebbian rule that minimizes sparse reconstruction error, we are able to learn representations from unlabeled imagery, including monocular and stereo video streams, that in some cases support near state-of-the-art performance on object detection, action classification and depth estimation tasks, with a simple linear classifier. We further describe a scalable approach to building a hierarchy of convolutional LCA layers, which we call a Deconvolutional Competitive Algorithm (DCA). All layers in a DCA are trained simultaneously and all layers contribute to a single image reconstruction, with each layer deconvolving its representation through all lower layers back to the image plane. We show that a 3-layer DCA trained on short video clips obtained from hand-held cameras exhibits a clear segregation of image content, with features in the top layer reconstructing large-scale structures while features in the middle and bottom layers reconstruct progressively finer details. Lastly, we describe PetaVision, an open source, cloud-friendly, high-performance neural simulation toolbox that was used to perform the numerical studies presented here.<br />
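The single-layer LCA dynamics described in the abstract — leaky integrator neurons inhibiting each other to solve a sparse coding problem — can be sketched in a few lines. This is an illustrative NumPy sketch with made-up parameters (`lam`, `tau`, step count), not the PetaVision implementation:

```python
import numpy as np

def lca_sparse_code(x, Phi, lam=0.1, tau=10.0, n_steps=200):
    """Locally Competitive Algorithm (illustrative sketch): leaky
    integrator neurons with lateral inhibition compute a sparse code."""
    b = Phi.T @ x                           # feedforward drive
    G = Phi.T @ Phi - np.eye(Phi.shape[1])  # lateral inhibition (Gram - I)
    u = np.zeros(Phi.shape[1])              # membrane potentials
    for _ in range(n_steps):
        a = np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)  # soft threshold
        u += (b - u - G @ a) / tau          # leaky integrator dynamics
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)
```

At convergence the active coefficients give a sparse reconstruction of `x` from the dictionary columns, the same sparse reconstruction error that the local Hebbian rule in the abstract minimizes.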
<br />
'''Nov 18, 2015'''<br />
* Speaker: Hillel Adesnik<br />
* Affiliation: Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
'''Nov 17, 2015'''<br />
* Speaker: Manuel Lopez<br />
* Affiliation: <br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: <br />
* Abstract<br />
<br />
'''Dec 2, 2015'''<br />
* Speaker: Steven Brumby<br />
* Affiliation: [http://www.descarteslabs.com/ Descartes Labs]<br />
* Host: Dylan<br />
* Status: confirmed<br />
* Title: Seeing the Earth in the Cloud<br />
* Abstract: The proliferation of transistors has increased the performance of computing systems by over a factor of a million in the past 30 years, and is also dramatically increasing the amount of data in existence, driving improvements in sensor, communication and storage technology. Multi-decadal Earth and planetary remote sensing global datasets at the petabyte scale (8×10^15 bits) are now available in commercial clouds, and new satellite constellations are planning to generate petabytes of images per year, providing daily global coverage at a few meters per pixel. Cloud storage with adjacent high-bandwidth compute, combined with recent advances in neuroscience-inspired machine learning for computer vision, is enabling understanding of the world at a scale and at a level of granularity never before feasible. We report here on a computation processing over a petabyte of compressed raw data from 2.8 quadrillion pixels (2.8 petapixels) acquired by the US Landsat and MODIS programs over the past 40 years. Using commodity cloud computing resources, we convert the imagery to a calibrated, georeferenced, multiresolution tiled format suited for machine-learning analysis. We believe ours is the first application to process, in less than a day, on generally available resources, over a petabyte of scientific image data. We report on work using this reprocessed dataset for experiments demonstrating country-scale food production monitoring, an indicator for famine early warning. <br />
<br />
'''Dec 14, 2015'''<br />
* Speaker: Bill Softky <br />
* Affiliation:<br />
* Host: Bruno<br />
* Status: confirmed <br />
* Title: Screen addiction - informal Redwood group seminar<br />
<br />
'''Dec 16, 2015'''<br />
* Speaker: Mike Landy<br />
* Affiliation: Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title:<br />
<br />
=== 2014/15 academic year ===<br />
<br />
'''2 July 2014'''<br />
* Speaker: Kelly Clancy<br />
* Affiliation: Feldman lab<br />
* Host: Guy<br />
* Status: confirmed<br />
* Title: Volitional control of neural assemblies in L2/3 of motor and somatosensory cortices<br />
* Abstract: I'll be talking about a joint effort between the Feldman, Carmena and Costa labs to study abstract task learning by small neuronal assemblies in intact networks. Brain-machine interfaces are a unique tool for studying learning, thanks to the direct mapping between neural activity and reward. We trained mice to operantly control an auditory cursor using spike-related calcium signals recorded with two-photon imaging in motor and somatosensory cortex, allowing us to assess the effects of learning with great spatial detail. Mice rapidly learned to modulate activity in layer 2/3 neurons, evident both across and within sessions. Interestingly, even neurons that exhibited very low or no spontaneous spiking--so-called 'silent' cells that are invisible to electrode-based techniques--could be behaviorally up-modulated for task performance. Learning was accompanied by modifications of firing correlations in spatially localized networks at fine scales.<br />
<br />
'''23 July 2014'''<br />
* Speaker: Gautam Agarwal<br />
* Affiliation: UC Berkeley/Champalimaud<br />
* Host: Friedrich Sommer<br />
* Status: confirmed<br />
* Title: Unsolved Mysteries of Hippocampal Dynamics<br />
* Abstract: Two radically different forms of electrical activity can be observed in the rat hippocampus: spikes and local field potentials (LFPs). Hippocampal pyramidal neurons are mostly silent, yet spike vigorously as the subject encounters particular locations in its environment. In contrast, LFPs appear to lack place-selectivity, persisting regardless of the rat's location. Recently, we found that in fact one can recover from LFPs the spatial information present in the underlying neuronal population, showing how these two signals are two sides of the same coin. Nonetheless, there are many aspects of the LFP that remain mysterious. I will review several observations and explanatory gaps which await further study. These include: the relationship of LFP patterns to anatomy; the elusive structure of gamma waves; complex forms of cross-frequency coupling; variations in LFP patterns seen when the rat explores its world more freely; reconciling the memory and navigation roles of the hippocampus.<br />
<br />
'''6 Aug 2014'''<br />
* Speaker: Georg Martius<br />
* Affiliation: Max Planck Institute, Leipzig<br />
* Host: Fritz Sommer<br />
* Status: confirmed<br />
* Title: Information driven self-organization of robotic behavior<br />
* Abstract: Autonomy is a puzzling phenomenon in nature and a major challenge in the world of artifacts. A key feature of autonomy in both natural and artificial systems is seen in the ability for independent exploration. In animals and humans, the ability to modify its own pattern of activity is not only an indispensable trait for adaptation and survival in new situations, it also provides a learning system with novel information for improving its cognitive capabilities, and it is essential for development. Efficient exploration in high-dimensional spaces is a major challenge in building learning systems. We propose to implement the exploration as a deterministic law derived from maximizing an information quantity. More specifically, we use the predictive information of the sensor process (of a robot) to obtain an update rule (exploration dynamics) for the controller parameters. To be adequate in robotics applications, the non-stationary nature of the underlying time series has to be taken into account, which we do by proposing the time-local predictive information (TiPI). Importantly, the exploration dynamics is derived analytically, and by this we link information theory and dynamical systems. Without a random component, the change in the parameters is deterministically given as a function of the states in a certain time window. For an embodied system this means in particular that constraints, responses and current knowledge of the dynamical interaction with the environment can directly be used to advance further exploration. Randomness is replaced with spontaneity, which we demonstrate restricts the search space automatically to the physically relevant dimensions. Its effectiveness will be presented with various experiments on high-dimensional robotic systems, and we argue that this is a promising way to avoid the curse of dimensionality. This talk describes joint work with Ralf Der and Nihat Ay.<br />
<br />
'''15 Aug 2014'''<br />
* Speaker: Juergen Schmidhuber<br />
* Affiliation: IDSIA, Switzerland<br />
* Host: James/Shariq<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''2 Sept 2014'''<br />
* Speaker: Oriol Vinyals <br />
* Affiliation: Google<br />
* Host: Guy<br />
* Status: confirmed<br />
* Title: Machine Translation with Long-Short Term Memory Models<br />
* Abstract: Supervised large deep neural networks have achieved good results on speech recognition and computer vision. Although very successful, deep neural networks can only be applied to problems whose inputs and outputs can be conveniently encoded with vectors of fixed dimensionality; they cannot easily be applied to problems whose inputs and outputs are sequences. In this work, we show how to use a large deep Long Short-Term Memory (LSTM) model to solve domain-agnostic supervised sequence-to-sequence problems with minimal manual engineering. Our model uses one LSTM to map the input sequence to a vector of a fixed dimensionality and another LSTM to map the vector to the output sequence. We applied our model to a machine translation task and achieved encouraging results. On the WMT'14 translation task from English to French, a model combination of 6 large LSTMs achieves a BLEU score of 32.3 (where a larger score is better). For comparison, a strong standard statistical MT baseline achieves a BLEU score of 33.3. When we use our LSTM to rescore the n-best lists produced by the SMT baseline, we achieve a BLEU score of 36.3, which is a new state of the art. This is joint work with Ilya Sutskever and Quoc Le.<br />
<br />
'''19 Sept 2014'''<br />
* Speaker: Gary Marcus<br />
* Affiliation: NYU<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''24 Sept 2014'''<br />
* Speaker: Alyosha Efros<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''30 Sep 2014'''<br />
* Speaker: Alejandro Bujan<br />
* Affiliation:<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: Propagation and variability of evoked responses: the role of correlated inputs and oscillations<br />
* Abstract: <br />
<br />
'''8 Oct 2014'''<br />
* Speaker: Siyu Zhang<br />
* Affiliation: UC Berkeley<br />
* Host: Karl<br />
* Status: confirmed<br />
* Title: Long-range and local circuits for top-down modulation of visual cortical processing<br />
* Abstract:<br />
<br />
'''15 Oct 2014'''<br />
* Speaker: Tamara Broderick<br />
* Affiliation: UC Berkeley<br />
* Host: Yvonne/James<br />
* Status: confirmed<br />
* Title: Feature allocations, probability functions, and paintboxes<br />
* Abstract: Clustering involves placing entities into mutually exclusive categories. We wish to relax the requirement of mutual exclusivity, allowing objects to belong simultaneously to multiple classes, a formulation that we refer to as "feature allocation." The first step is a theoretical one. In the case of clustering the class of probability distributions over exchangeable partitions of a dataset has been characterized (via exchangeable partition probability functions and the Kingman paintbox). These characterizations support an elegant nonparametric Bayesian framework for clustering in which the number of clusters is not assumed to be known a priori. We establish an analogous characterization for feature allocation; we define notions of "exchangeable feature probability functions" and "feature paintboxes" that lead to a Bayesian framework that does not require the number of features to be fixed a priori. The second step is a computational one. Rather than appealing to Markov chain Monte Carlo for Bayesian inference, we develop a method to transform Bayesian methods for feature allocation (and other latent structure problems) into optimization problems with objective functions analogous to K-means in the clustering setting. These yield approximations to Bayesian inference that are scalable to large inference problems.<br />
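For reference, the clustering objective that the abstract's feature-allocation optimization is analogous to is plain K-means. The following is an illustrative sketch of Lloyd's algorithm, not the authors' method; parameter choices are made up:

```python
import numpy as np

def kmeans(X, k, n_iters=50, seed=0):
    """Lloyd's algorithm for K-means: alternate nearest-center
    assignment and center updates (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iters):
        # assign each point to its nearest center (mutually exclusive)
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each nonempty center to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers
```

The feature-allocation analogue described in the talk relaxes the mutually exclusive assignment step, allowing a point to activate several features at once.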
<br />
'''29 Oct 2014'''<br />
* Speaker: Ken Nakayama<br />
* Affiliation: Harvard<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Topics in higher level visuo-motor control<br />
* Abstract: TBA<br />
<br />
'''5 Nov 2014''' - **BVLC retreat**<br />
<br />
'''20 Nov 2014'''<br />
* Speaker: Haruo Hosoya<br />
* Affiliation: ATR Institute, Japan<br />
* Host: Bruno<br />
* Status: tentative<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''9 Dec 2014'''<br />
* Speaker: Dirk DeRidder<br />
* Affiliation: Dunedin School of Medicine, University of Otago, New Zealand<br />
* Host: Bruno/Walter Freeman<br />
* Status: confirmed<br />
* Title: The Bayesian brain, phantom percepts and brain implants<br />
* Abstract: TBA<br />
<br />
'''January 14, 2015'''<br />
* Speaker: Kevin O'Regan<br />
* Affiliation: CNRS - Université Paris Descartes<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''January 21, 2015'''<br />
* Speaker: Adrienne Fairhall<br />
* Affiliation: University of Washington<br />
* Host: Mike Schachter<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''January 26, 2015'''<br />
* Speaker: Abraham Peled<br />
* Affiliation: Mental Health Center, 'Technion' Israel Institute of Technology<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Clinical Brain Profiling: A Neuro-Computational psychiatry<br />
* Abstract: TBA<br />
<br />
'''January 28, 2015'''<br />
* Speaker: Rich Ivry<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Embodied Decision Making: System interactions in sensorimotor adaptation and reinforcement learning<br />
* Abstract:<br />
<br />
'''February 11, 2015'''<br />
* Speaker: Mark Lescroart<br />
* Affiliation: UC Berkeley<br />
* Host: Karl<br />
* Status: tentative<br />
* Title: <br />
* Abstract:<br />
<br />
'''February 25, 2015'''<br />
* Speaker: Steve Chase<br />
* Affiliation: CMU<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Joint Redwood/CNEP seminar<br />
* Abstract:<br />
<br />
'''March 3, 2015'''<br />
* Speaker: Andreas Herz<br />
* Affiliation: Bernstein Center, Munich<br />
* Host: Bruno/Fritz<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''March 3, 2015 - 4:00'''<br />
* Speaker: James Cooke<br />
* Affiliation: Oxford<br />
* Host: Mike Deweese<br />
* Status: confirmed<br />
* Title: Neural Circuitry Underlying Contrast Gain Control in Primary Auditory Cortex<br />
* Abstract:<br />
<br />
'''March 4, 2015'''<br />
* Speaker: Bill Sprague<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: V1 disparity tuning and the statistics of disparity in natural viewing<br />
* Abstract:<br />
<br />
'''March 11, 2015'''<br />
* Speaker: Jozsef Fiser<br />
* Affiliation: Central European University<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''April 1, 2015'''<br />
* Speaker: Saeed Saremi<br />
* Affiliation: Salk Inst<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''April 15, 2015'''<br />
* Speaker: Zahra M. Aghajan<br />
* Affiliation: UCLA<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: Hippocampal Activity in Real and Virtual Environments<br />
* Abstract:<br />
<br />
'''May 7, 2015'''<br />
* Speaker: Santani Teng<br />
* Affiliation: MIT<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''May 13, 2015'''<br />
* Speaker: Harri Valpola<br />
* Affiliation: ZenRobotics<br />
* Host: Brian<br />
* Status: Tentative<br />
* Title: TBA<br />
* Abstract<br />
<br />
'''June 24, 2015'''<br />
* Speaker: Kendrick Kay<br />
* Affiliation: Department of Psychology, Washington University in St. Louis<br />
* Host: Karl<br />
* Status: Confirmed<br />
* Title: Using functional neuroimaging to reveal the computations performed by the human visual system<br />
* Abstract<br />
Visual perception is the result of a complex set of computational transformations performed by neurons in the visual system. Functional magnetic resonance imaging (fMRI) is ideally suited for identifying these transformations, given its excellent spatial resolution and ability to monitor activity across the numerous areas of visual cortex. In this talk, I will review past research in which we used fMRI to develop increasingly accurate models of the stimulus transformations occurring in early and intermediate visual areas. I will then describe recent research in which we successfully extend this approach to high-level visual areas involved in perception of visual categories (e.g. faces) and demonstrate how top-down attention modulates bottom-up stimulus representations. Finally, I will discuss ongoing research targeting regions of ventral temporal cortex that are essential for skilled reading. Our model-based approach, combined with high-field laminar measurements, is expected to provide an integrated picture of how bottom-up stimulus transformations and top-down cognitive factors interact to support rapid and accurate word recognition. Development of quantitative models and associated experimental paradigms may help us understand and diagnose impairments in neural processing that underlie visual disorders such as dyslexia and prosopagnosia.<br />
<br />
=== 2013/14 academic year ===<br />
<br />
'''9 Oct 2013'''<br />
* Speaker: Ekaterina Brocke<br />
* Affiliation: KTH University, Stockholm, Sweden<br />
* Host: Tony<br />
* Status: confirmed<br />
* Title: Multiscale modeling in Neuroscience: first steps towards multiscale co-simulation tool development.<br />
* Abstract: Multiscale modeling and simulation attract an increasing number of neuroscientists who study how different levels of organization (networks of neurons, cellular/subcellular levels) interact with each other across multiple scales of space and time to mediate different brain functions. Different scales are usually described by different physical and mathematical formalisms, making it non-trivial to perform the integration. In this talk, I will discuss key phenomena in neuroscience that can be addressed using subcellular/cellular models and possible approaches to performing multiscale simulations, in particular a co-simulation method. I will also introduce several multiscale "toy" models of cellular/subcellular levels that were developed with the aim of understanding numerical and technical problems which might appear during co-simulation. Finally, I will present the first steps made towards the development of a multiscale co-simulation tool.<br />
<br />
'''29 Oct 2013 - note: 4:00'''<br />
* Speaker: Mitya Chklovskii<br />
* Affiliation: HHMI/Janelia Farm<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''30 Oct 2013'''<br />
* Speaker: Ilya Nemenman<br />
* Affiliation: Emory University, Departments of Physics and Biology<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: Large N in neural data -- expecting the unexpected.<br />
* Abstract: Recently it has become possible to directly measure simultaneous collective states of many biological components, such as neural activities, genetic sequences, or gene expression profiles. These data are revealing striking results, suggesting, for example, that biological systems are tuned to criticality, and that effective models of these systems based on only pairwise interactions among constitutive components provide surprisingly good fits to the data. We will explore a handful of simplified theoretical models, largely focusing on statistical mechanics of Ising spins, that suggest plausible explanations for these observations. Specifically, I will argue that, at least in certain contexts, these intriguing observations should be expected in multivariate interacting data in the thermodynamic limit of many interacting components.<br />
<br />
'''31 Oct 2013'''<br />
* Speaker: Oriol Vinyals<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno/Brian<br />
* Status: confirmed<br />
* Title: Beyond Deep Learning: Scalable Methods and Models for Learning<br />
* Abstract: In this talk I will briefly describe several techniques I explored in my thesis that improve how we efficiently model signal representations and learn useful information from them. The building block of my dissertation is based on machine learning approaches to classification, where a (typically non-linear) function is learned from labeled examples to map from signals to some useful information (e.g. an object class present in an image, or a word present in an acoustic signal). One of the motivating factors of my work has been advances in neural networks with deep architectures (which have led to the terminology "deep learning") and which have shown state-of-the-art performance in acoustic modeling and object recognition -- the main focus of this thesis. In my work, I have contributed both to the learning (or training) of such architectures through faster and more robust optimization techniques, and also to the simplification of the deep architecture model to an approach that is simple to optimize. Furthermore, I derived a theoretical bound showing a fundamental limitation of shallow architectures based on sparse coding (which can be seen as a one-hidden-layer neural network), thus justifying the need for deeper architectures, while also empirically verifying these architectural choices on speech recognition. Many of my contributions have been used in a wide variety of applications, products and datasets as a result of many collaborations within ICSI and Berkeley, but also at Microsoft Research and Google Research.<br />
<br />
'''6 Nov 2013'''<br />
* Speaker: Garrett T. Kenyon<br />
* Affiliation: Los Alamos National Laboratory, The New Mexico Consortium<br />
* Host: Dylan Paiton<br />
* Status: Confirmed<br />
* Title: Using Locally Competitive Algorithms to Model Top-Down and Lateral Interactions<br />
* Abstract: Cortical connections consist of feedforward, feedback and lateral pathways. Infragranular layers project down the cortical hierarchy to both supra- and infragranular layers at the previous processing level, while the neurons in supragranular layers are linked by extensive long-range lateral projections that cross multiple cortical columns. However, most functional models of visual cortex only account for feedforward connections. Additionally, most models of visual cortex fail to account both for the thalamic projections to non-striate areas and the reciprocal connections from extrastriate areas back to the thalamus. In this talk, I will describe how a modified Locally Competitive Algorithm (LCA; Rozell et al, Neural Comp, 2008) can be used as a unifying framework for exploring the role of top-down and lateral cortical pathways within the context of deep, sparse, generative models. I will also describe an open source software tool called PetaVision that can be used to implement and execute hierarchical LCA-based models on multi-core, multi-node computer platforms without requiring specific knowledge of parallel-programming constructs.<br />
<br />
'''14 Nov 2013 (note: Thursday), ***12:30pm*** '''<br />
* Speaker: Geoffrey J Goodhill<br />
* Affiliation: Queensland Brain Institute and School of Mathematics and Physics, The University of Queensland, Australia<br />
* Host: Mike DeWeese<br />
* Status: Confirmed<br />
* Title: Computational principles of neural wiring development<br />
* Abstract: Brain function depends on precise patterns of neural wiring. An axon navigating to its target must make guidance decisions based on noisy information from molecular cues in its environment. I will describe a combination of experimental and computational work showing that (1) axons may act as ideal observers when sensing chemotactic gradients, (2) the complex influence of calcium and cAMP levels on guidance decisions can be predicted mathematically, (3) the morphology of growth cones at the axonal tip can be understood in terms of just a few eigenshapes, and remarkably these shapes oscillate in time with periods ranging from minutes to hours. Together this work may shed light on how neural wiring goes wrong in some developmental brain disorders, and how best to promote appropriate regrowth of axons after injury.<br />
<br />
'''4 Dec 2013'''<br />
* Speaker: Zhenwen Dai<br />
* Affiliation: FIAS, Goethe University Frankfurt, Germany.<br />
* Host: Georgios Exarchakis<br />
* Status: Confirmed<br />
* Title: What Are the Invariant Occlusive Components of Image Patches? A Probabilistic Generative Approach <br />
* Abstract: We study optimal image encoding based on a generative approach with non-linear feature combinations and explicit position encoding. The vast majority of approaches to unsupervised learning of visual features, such as sparse coding or ICA, account for translations by representing the same features at different positions. Some earlier models used a separate encoding of features and their positions to facilitate invariant data encoding and recognition. All probabilistic generative models with explicit position encoding have so far assumed a linear superposition of components to encode image patches. Here, we apply, for the first time, a model with non-linear feature superposition and explicit position encoding to patches. By avoiding linear superpositions, the studied model represents a closer match to component occlusions, which are ubiquitous in natural images. In order to account for occlusions, the non-linear model encodes patches in a way qualitatively very different from linear models, using component representations separated into mask and feature parameters. We first investigated encodings learned by the model using artificial data with mutually occluding components. We find that the model extracts the components, and that it can correctly identify the occlusive components with the hidden variables of the model. On natural image patches, the model learns component masks and features for typical image components. By using reverse correlation, we estimate the receptive fields associated with the model’s hidden units. We find many Gabor-like or globular receptive fields as well as fields sensitive to more complex structures. Our results show that probabilistic models that capture occlusions and invariances can be trained efficiently on image patches, and that the resulting encoding represents an alternative model for the neural encoding of images in the primary visual cortex. <br />
<br />
'''11 Dec 2013'''<br />
* Speaker: Kai Siedenburg<br />
* Affiliation: UC Davis, Petr Janata's Lab.<br />
* Host: Jesse Engel<br />
* Status: Confirmed<br />
* Title: Characterizing Short-Term Memory for Musical Timbre<br />
* Abstract: Short-term memory is a cognitive faculty central to the apprehension of music and speech. Little is known, however, about memory for musical timbre despite its “sisterhood” with speech; after all, speech can be regarded as sequencing of vocal timbre. Past research has isolated many characteristic effects of verbal memory. Are these also in play for non-vocal timbre sequences? We studied this question by considering short-term memory for serial order. Using timbres and dissimilarity data from McAdams et al. (Psych. Research, 1995), we employed a same/different discrimination paradigm. Experiment 1 (N = 30 MU + 30 nonMU) revealed effects of sequence length and timbral dissimilarity of items, as well as an interaction of musical training and pitch variability: in contrast to musicians, non-musicians' performance was impaired by simultaneous changes in pitch, compared to a constant pitch baseline. Experiment 2 (N = 22) studied whether musicians' memory for timbre sequences was independent of pitch irrespective of the degree of complexity of pitch progressions. Comparing sequences with pitch changing within and across standard and comparison to a constant pitch baseline, performance was now clearly impaired for the variable pitch condition. Experiment 3 (N = 22) showed primacy and recency effects for musicians, and reproduced a positive effect of timbral heterogeneity of sequences. Our findings demonstrate the presence of hallmark effects of verbal memory such as similarity, word length, primacy/recency for the domain of non-vocal timbre, and suggest that memory for speech and non-vocal timbre sequences might to a large extent share underlying mechanisms.<br />
<br />
'''12 Dec 2013'''<br />
* Speaker: Matthias Bethge<br />
* Affiliation: University of Tubingen<br />
* Host: Bruno<br />
* Status: tentative<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''22 Jan 2014'''<br />
* Speaker: Thomas Martinetz<br />
* Affiliation: Univ Luebeck<br />
* Host: Bruno/Fritz<br />
* Status: confirmed<br />
* Title: Orthogonal Sparse Coding and Sensing<br />
* Abstract: Sparse Coding has been a very successful concept since many natural signals have the property of being sparse in some dictionary (basis). Some natural signals are even sparse in an orthogonal basis, most prominently natural images, which are sparse in a respective wavelet transform. An encoding in an orthogonal basis has a number of advantages, e.g., finding the optimal coding coefficients is simply a projection instead of being NP-hard. Given some data, we want to find the orthogonal basis which provides the sparsest code. This problem can be seen as a generalization of Principal Component Analysis. We present an algorithm, Orthogonal Sparse Coding (OSC), which is able to find this basis very robustly. On natural images, it compresses on the level of JPEG, but can adapt to arbitrary and special data sets and achieve significant improvements. With the property of being sparse in some orthogonal basis, we show how signals can be sensed very efficiently in a hierarchical manner with at most k log D sensing actions. This hierarchical sensing might relate to the way we sense the world, with interesting applications in active vision. <br />
<br />
'''29 Jan 2014'''<br />
* Speaker: David Klein<br />
* Affiliation: Audience<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''5 Feb 2014''' (leave open for Barth/Martinetz seminar)<br />
<br />
'''12 Feb 2014'''<br />
* Speaker: Ilya Sutskever <br />
* Affiliation: Google<br />
* Host: Zayd<br />
* Status: confirmed<br />
* Title: Continuous vector representations for machine translation<br />
* Abstract: Dictionaries and phrase tables are the basis of modern statistical machine translation systems. I will present a method that can automate the process of generating and extending dictionaries and phrase tables. Our method can translate missing word and phrase entries by learning language structures using large monolingual data, and by mapping between the languages using a small bilingual dataset. It uses distributed representations of words and learns a linear mapping between vector spaces of languages. Despite its simplicity, our method is surprisingly effective: we can achieve almost 90% precision@5 for translation of words between English and Spanish. This method makes little assumption about the languages, so it can be used to extend and refine dictionaries and translation tables for any language pairs. Joint work with Tomas Mikolov and Quoc Le.<br />
<br />
'''25 Feb 2014'''<br />
* Speaker: Alexander Terekhov <br />
* Affiliation: CNRS - Université Paris Descartes<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Constructing space: how a naive agent can learn spatial relationships by observing sensorimotor contingencies<br />
* Abstract:<br />
<br />
'''12 March 2014'''<br />
* Speaker: Carlos Portera-Cailliau<br />
* Affiliation: UCLA<br />
* Host: Mike<br />
* Status: confirmed<br />
* Title: Circuit defects in the neocortex of Fmr1 knockout mice<br />
* Abstract: TBA<br />
<br />
'''19 March 2014'''<br />
* Speaker: Dean Buonomano<br />
* Affiliation: UCLA<br />
* Host: Mike<br />
* Status: confirmed<br />
* Title: State-dependent Networks: Timing and Computations Based on Neural Dynamics and Short-term Plasticity<br />
* Abstract: The brain’s ability to seamlessly assimilate and process spatial and temporal information is critical to most behaviors, from understanding speech to playing the piano. Indeed, because the brain evolved to navigate a dynamic world, timing and temporal processing represent a fundamental computation. We have proposed that timing and the processing of temporal information emerge from the interaction between incoming stimuli and the internal state of neural networks. The internal state is defined not only by ongoing activity (the active state) but by time-varying synaptic properties, such as short-term synaptic plasticity (the hidden state). One prediction of this hypothesis is that timing is a general property of cortical circuits. We provide evidence in this direction by demonstrating that in vitro cortical networks can “learn” simple temporal patterns. Finally, previous theoretical studies have suggested that recurrent networks capable of self-perpetuating activity hold significant computational potential. However, harnessing the computational potential of these networks has been hampered by the fact that such networks are chaotic. We show that it is possible to “tame” chaos through recurrent plasticity, and create a novel and powerful general framework for how cortical circuits compute.<br />
<br />
'''26 March 2014'''<br />
* Speaker: Robert G. Smith<br />
* Affiliation: University of Pennsylvania<br />
* Host: Mike S<br />
* Status: confirmed<br />
* Title: Role of Dendritic Computation in the Direction-Selective Circuit of Retina<br />
* Abstract: The retina utilizes a variety of signal processing mechanisms to compute direction from image motion. The computation is accomplished by a circuit that includes starburst amacrine cells (SBACs), which are GABAergic neurons presynaptic to direction-selective ganglion cells (DSGCs). SBACs are symmetric neurons with several branched dendrites radiating out from the soma. When a stimulus moving back and forth along a SBAC dendrite sequentially activates synaptic inputs, larger post-synaptic potentials (PSPs) are produced in the dendritic tips when the stimulus moves outwards from the soma. The directional difference in EPSP amplitude is further amplified near the dendritic tips by voltage-gated channels to produce directional release of GABA. Reciprocal inhibition between adjacent SBACs may also amplify directional release. Directional signals in the independent SBAC branches are preserved because each dendrite makes selective contacts only with DSGCs of the appropriate preferred-direction. Directional signals are further enhanced within the dendritic arbor of the DSGC, which essentially comprises an array of distinct dendritic compartments. Each of these dendritic compartments locally sum excitatory and inhibitory inputs, amplifies them with voltage-gated channels, and generates spikes that propagate to the axon via the soma. Overall, the computation of direction in the retina is performed by several local dendritic mechanisms both presynaptic and postsynaptic, with the result that directional responses are robust over a broad range of stimuli.<br />
<br />
'''16 April 2014'''<br />
* Speaker: David Pfau<br />
* Affiliation: Columbia<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''22 April 2014 *Tuesday*'''<br />
* Speaker: Jochen Braun<br />
* Affiliation: Otto-von-Guericke University, Magdeburg<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Dynamics of visual perception and collective neural activity<br />
* Abstract:<br />
<br />
'''29 April 2014'''<br />
* Speaker: Giuseppe Vitiello<br />
* Affiliation: University of Salerno<br />
* Host: Fritz/Walter Freeman<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''30 April 2014'''<br />
* Speaker: Masataka Watanabe<br />
* Affiliation: University of Tokyo / Max Planck Institute for Biological Cybernetics<br />
* Host: Gautam Agarwal<br />
* Status: confirmed<br />
* Title: Turing Test for Machine Consciousness and the Chaotic Spatiotemporal Fluctuation Hypothesis<br />
* Abstract: I propose an experimental method to test various hypotheses on consciousness. Inspired by Sperry's observation that split-brain patients possess two independent streams of consciousness, the idea is to implement candidate neural mechanisms of visual consciousness onto an artificial cortical hemisphere and test whether subjective experience is evoked in the device's visual hemifield. In contrast to modern neurosynthetic devices, I show that mimicking interhemispheric connectivity assures that authentic and fine-grained subjective experience arises only when a stream of consciousness is generated within the device. It is valid under a widely believed assumption regarding interhemispheric connectivity and neuronal stimulus-invariance. (I will briefly explain my own evidence of human V1 not responding to changes in the contents of visual awareness [1])<br />
<br />
If consciousness is actually generated within the device, we should be able to construct a case where two objects presented in the device's visual field are distinguishable by visual experience but not by what is communicated through the brain-machine interface. As strange as it may sound, and clearly violating the laws of physics, this is likely to be happening in the intact brain, where unified subjective bilateral vision and its verbal report occur without the total interhemispheric exchange of conscious visual information.<br />
<br />
Together, I present a hypothesis on the neural mechanism of consciousness, “The Chaotic Spatiotemporal Fluctuation Hypothesis”, which passes the proposed test for visual qualia and also explains how physics as we know it today is violated. Here, neural activity is divided into two components, the time-averaged activity and the residual temporally fluctuating activity, where the former serves as the content of consciousness (neuronal population vector) and the latter as consciousness itself. The content is “read” into consciousness in the sense that every local perturbation caused by change in the neuronal population vector creates a spatiotemporal wave in the fluctuation component that travels throughout the system. Deterministic chaos assures that every local difference makes a difference to the whole of the dynamics, as in the butterfly effect, serving as a foundation for the holistic nature of consciousness. I will present data from simultaneous electrophysiology-fMRI recordings and human fMRI [2] that supports the existence of such large-scale causal fluctuation.<br />
<br />
Here, the chaotic fluctuation cannot be decoded to trace back the original perturbation in the neuronal population vector, because the initial states of all neurons would be required with infinite precision to do so. Hence what is transmitted between the two hemispheres is not "information" in the normal sense. This illustrates the violation of physics by the metaphysical assumption, "chaotic spatiotemporal fluctuation is consciousness", where unification of bilateral vision and the solving of visual tasks (e.g. perfect symmetry detection) are achieved without exchanging the otherwise required Shannon information between the two hemispheres.<br />
<br />
Finally, minimal and realistic versions of the proposed test for visual qualia can be conducted on laboratory animals to validate the hypothesis. These deal with two biological hemispheres, which we already know contain consciousness. We dissect interhemispheric connectivity and form instead an artificial connection that is capable of filtering out the neural fluctuation component. A limited interhemispheric connectivity may be sufficient, which would drastically reduce the technological challenge. If the subject is capable of conducting a bilateral stimulus-matching task with the full artificial interhemispheric connectivity, but not when the fluctuation component is filtered out, this can be considered strong supporting evidence for the hypothesis.<br />
<br />
1.Watanabe, M., Cheng, K., Ueno, K., Asamizuya, T., Tanaka, K., Logothetis, N., Attention but not awareness modulates the BOLD signal in the human V1 during binocular suppression. Science, 2011. 334(6057): p. 829-31.<br />
<br />
2.Watanabe, M., Bartels, A., Macke, J., Logothetis, N., Temporal jitter of the BOLD signal reveals a reliable initial dip and improved spatial resolution. Curr Biol, 2013. 23(21): p. 2146-50.<br />
<br />
'''11 June 2014'''<br />
* Speaker: Stuart Hameroff<br />
* Affiliation: University of Arizona, Tucson<br />
* Host: Gautam<br />
* Status: confirmed<br />
* Title: ‘Tuning the brain’ – Treating mental states through microtubule vibrations <br />
* Abstract: Do mental states derive entirely from brain neuronal membrane activities? Neuronal interiors are organized by microtubules (‘MTs’), protein polymers proposed to encode memory, process information and support consciousness. Using nanotechnology, Bandyopadhyay’s group at MIT has shown coherent vibrations (megahertz to 10 kilohertz) from microtubule bundles inside active neurons, vibrations (electric field potentials ~40 to 50 mV) able to influence membrane potentials. This suggests EEG rhythms are ‘beat’ frequencies of megahertz vibrations in microtubules inside neurons (Hameroff and Penrose, 2014), and that consciousness and cognition involve vibrational patterns resonating across scales in the brain, more like music than computation. MT megahertz vibrations may be a useful therapeutic target for ‘tuning’ mood and mental states. Among noninvasive transcranial brain stimulation techniques (TMS, tDCS), transcranial ultrasound (TUS) delivers megahertz mechanical vibrations. Applied at the scalp, low-intensity, sub-thermal ultrasound safely reaches the brain. In human studies, brief (15 to 30 seconds) TUS at 0.5, 2 and 8 megahertz to frontal-temporal cortex results in 40 minutes or longer of reported mood improvement, and focused TUS enhances sensory discrimination (Legon et al, 2014). In vitro, ultrasound promotes neurite outgrowth in embryonic neurons (Raman), and stabilizes microtubules against disassembly (Gupta). (In Alzheimer’s disease, MTs disassemble and release tau.) These findings suggest ‘tuning the brain’ with TUS should be a safe, effective and inexpensive treatment for Alzheimer’s, traumatic brain injury, depression, anxiety, PTSD and other disorders. <br />
<br />
References: Hameroff S, Penrose R (2014) Phys Life Rev http://www.sciencedirect.com/science/article/pii/S1571064513001188; Sahu et al (2013) Biosens Bioelectron 47:141–8; Sahu et al (2013) Appl Phys Lett 102:123701; Legon et al (2014) Nature Neuroscience 17: 322–329<br />
<br />
'''25 June 2014'''<br />
* Speaker: Peter Loxley<br />
* Affiliation: <br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: The two-dimensional Gabor function adapted to natural image statistics: An analytical model of simple-cell responses in the early visual system<br />
* Abstract: TBA<br />
<br />
=== 2012/13 academic year ===<br />
<br />
'''26 Sept 2012''' <br />
* Speaker: Jason Yeatman<br />
* Affiliation: Department of Psychology, Stanford University<br />
* Host: Bruno/Susana Chung<br />
* Status: confirmed<br />
* Title: The Development of White Matter and Reading Skills<br />
* Abstract: The development of cerebral white matter involves both myelination and pruning of axons, and the balance between these two processes may differ between individuals. Cross-sectional measures of white matter development mask the interplay between these active developmental processes and their connection to cognitive development. We followed a cohort of 39 children longitudinally for three years, and measured white matter development and reading development using diffusion tensor imaging and behavioral tests. In the left arcuate and inferior longitudinal fasciculus, children with above-average reading skills initially had low fractional anisotropy (FA) with a steady increase over the 3-year period, while children with below-average reading skills had higher initial FA that declined over time. We describe a dual-process model of white matter development that balances biological processes that have opposing effects on FA, such as axonal myelination and pruning, to explain the pattern of results.<br />
<br />
'''8 Oct 2012''' <br />
* Speaker: Sophie Deneve<br />
* Affiliation: Laboratoire de Neurosciences cognitives, ENS-INSERM<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Balanced spiking networks can implement dynamical systems with predictive coding<br />
* Abstract: Neural networks can integrate sensory information and generate continuously varying outputs, even though individual neurons communicate only with spikes -- all-or-none events. Here we show how this can be done efficiently if spikes communicate "prediction errors" between neurons. We focus on the implementation of linear dynamical systems and derive a spiking network model from a single optimization principle. Our model naturally accounts for two puzzling aspects of cortex. First, it provides a rationale for the tight balance and correlations between excitation and inhibition. Second, it predicts asynchronous and irregular firing as a consequence of predictive population coding, even in the limit of vanishing noise. We show that our spiking networks have error-correcting properties that make them far more accurate and robust than comparable rate models. Our approach suggests spike times do matter when considering how the brain computes, and that the reliability of cortical representations could have been strongly underestimated.<br />
<br />
'''19 Oct 2012'''<br />
* Speaker: Gert Van Dijck<br />
* Affiliation: Cambridge<br />
* Host: Urs<br />
* Status: confirmed<br />
* Title: A solution to identifying neurones using extracellular activity in awake animals: a probabilistic machine-learning approach<br />
* Abstract: Electrophysiological studies over the last fifty years have been hampered by the difficulty of reliably assigning signals to identified cortical neurones. Previous studies have employed a variety of measures based on spike timing or waveform characteristics to tentatively classify neurone types (Vos et al., Eur. J. Neurosci., 1999; Prsa et al., J. Neurosci., 2009), in some cases supported by juxtacellular labelling (Simpson et al., Prog. Brain Res., 2005; Holtzman et al., J. Physiol., 2006; Barmack and Yakhnitsa, J. Neurosci., 2008; Ruigrok et al., J. Neurosci., 2011), or intracellular staining and / or assessment of membrane properties (Chadderton et al., Nature, 2004; Jorntell and Ekerot, J. Neurosci., 2006; Rancz et al., Nature, 2007). Anaesthetised animals have been widely used as they can provide a ground-truth through neuronal labelling, which is much harder to achieve in awake animals, where spike-derived measures tend to be relied upon (Lansink et al., Eur. J. Neurosci., 2010). Whilst spike-shapes carry potentially useful information for classifying neuronal classes, they vary with electrode type and the geometric relationship between the electrode and the spike generation zone (Van Dijck et al., Int. J. Neural Syst., 2012). Moreover, spike-shape measurement is achieved with a variety of techniques, making it difficult to compare and standardise between laboratories. In this study we build probabilistic models on the statistics derived from the spike trains of spontaneously active neurones in the cerebellum and the ventral midbrain. The mean spike frequency in combination with the log-interval-entropy (Bhumbra and Dyball, J. Physiol.-London, 2004) of the inter-spike-interval distribution yields the highest prediction accuracy. The cerebellum model consists of two sub-models: a molecular layer - Purkinje layer model and a granular layer - Purkinje layer model. The first model identifies with high accuracy (92.7 %) molecular layer interneurones and Purkinje cells, while the latter identifies with high accuracy (99.2 %) Golgi cells, granule cells, mossy fibers and Purkinje cells. Furthermore, it is shown that the model trained on anaesthetized rat and decerebrate cat data has broad applicability to other species and behavioural states: anaesthetized mice (80 %), awake rabbits (94.2 %) and awake rhesus monkeys (89 - 90 %). Recently, optogenetics has made it possible to obtain a ground-truth about cell classes. Using optogenetically identified GABAergic and dopaminergic cells we build similar statistical models to identify these neuron types from the ventral midbrain. This illustrates that our approach will be of general use to a broad variety of laboratories.<br />
<br />
'''Tuesday, 23 Oct 2012''' <br />
* Speaker: Jaimie Sleigh<br />
* Affiliation: University of Auckland<br />
* Host: Fritz/Andrew Szeri<br />
* Status: confirmed<br />
* Title: Is General Anesthesia a failure of cortical information integration?<br />
* Abstract: General anesthesia and natural sleep share some commonalities and some differences. Quite a lot is known about the chemical and neuronal effects of general anesthetic drugs. There are two main groups of anesthetic drugs, which can be distinguished by their effects on the EEG. The most commonly used drugs exert a strong GABAergic action, whereas a second group is characterized by minimal GABAergic effects but significant NMDA blockade. It is less clear which of these various effects, and how, result in the patient's failure to wake up when the surgeon cuts them. I will present some results from experimental brain-slice work, and from theoretical mean-field modelling of anesthesia and sleep, that support the idea that the final common mechanism of both types of anesthesia is fragmentation of long-distance information flow in the cortex.<br />
<br />
'''31 Oct 2012''' (Halloween)<br />
* Speaker: Jonathan Landy<br />
* Affiliation: UCSB<br />
* Host: Mike DeWeese<br />
* Status: Confirmed<br />
* Title: Mean-field replica theory: review of basics and a new approach<br />
* Abstract: Replica theory provides a general method for evaluating the mode of a distribution, and has varied applications to problems in statistical mechanics, signal processing, etc. Evaluation of the formal expressions arising in replica theory represents a formidable technical challenge, but one that physicists have apparently intuited correct methods for handling. In this talk, I will first provide a review of the historical development of replica theory, covering: 1) motivation, 2) the intuited "Parisi ansatz" solution, 3) continued controversies, and 4) a survey of applications (including to neural networks). Following this, I will discuss an exploratory effort of mine, aimed at developing an ansatz-free solution method. As an example, I will work out the phase diagram for a simple spin-glass model. This talk is intended primarily as a tutorial.<br />
<br />
'''7 Nov 2012''' <br />
* Speaker: Tom Griffiths<br />
* Affiliation: UC Berkeley<br />
* Host:Daniel Little<br />
* Status: Confirmed<br />
* Title: Identifying human inductive biases<br />
* Abstract: People are remarkably good at acquiring complex knowledge from limited data, as is required in learning causal relationships, categories, or aspects of language. Successfully solving inductive problems of this kind requires having good "inductive biases" - constraints that guide inductive inference. Viewed abstractly, understanding human learning requires identifying these inductive biases and exploring their origins. I will argue that probabilistic models of cognition provide a framework that can facilitate this project, giving a transparent characterization of the inductive biases of ideal learners. I will outline how probabilistic models are traditionally used to solve this problem, and then present a new approach that uses Markov chain Monte Carlo algorithms as the basis for an experimental method that magnifies the effects of inductive biases.<br />
<br />
'''19 Nov 2012''' (Monday) (Thanksgiving week)<br />
* Speaker: Bin Yu<br />
* Affiliation: Dept. of Statistics and EECS, UC Berkeley<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Representation of Natural Images in V4<br />
* Abstract: The functional organization of area V4 in the mammalian ventral visual pathway is far from being well understood. V4 is believed to play an important role in the recognition of shapes and objects and in visual attention, but the complexity of this cortical area makes it hard to analyze. In particular, no current model of V4 has shown good predictions for neuronal responses to natural images and there is no consensus on the primary role of V4.<br />
In this talk, we present analysis of electrophysiological data on the response of V4 neurons to natural images. We propose a new computational model that achieves comparable prediction performance for V4 as for V1 neurons. Our model does not rely on any pre-defined image features but only on invariance and sparse coding principles. We interpret our model using sparse principal component analysis and discover two groups of neurons: those selective to texture versus those selective to contours. This supports the thesis that one primary role of V4 is to extract objects from background in the visual field. Moreover, our study also confirms the diversity of V4 neurons. Among those selective to contours, some of them are selective to orientation, others to acute curvature features.<br />
(This is joint work with J. Mairal, Y. Benjamini, B. Willmore, M. Oliver<br />
and J. Gallant.)<br />
<br />
'''30 Nov 2012''' <br />
* Speaker: Yan Karklin<br />
* Affiliation: NYU<br />
* Host: Tyler<br />
* Status: confirmed<br />
* Title: <br />
* Abstract: <br />
<br />
'''10 Dec 2012 (note this would be the Monday after NIPS)''' <br />
* Speaker: Marius Pachitariu<br />
* Affiliation: Gatsby / UCL<br />
* Host: Urs<br />
* Status: confirmed<br />
* Title: NIPS paper "Learning visual motion in recurrent neural networks"<br />
* Abstract: We present a dynamic nonlinear generative model for visual motion based on a latent representation of binary-gated Gaussian variables connected in a network. Trained on sequences of images by an STDP-like rule, the model learns to represent different movement directions in different variables. We use an online approximate inference scheme that can be mapped to the dynamics of networks of neurons. Probed with drifting grating stimuli and moving bars of light, neurons in the model show patterns of responses analogous to those of direction-selective simple cells in primary visual cortex. We show how the computations of the model are enabled by a specific pattern of learnt asymmetric recurrent connections. I will also briefly discuss our application of recurrent neural networks as statistical models of simultaneously recorded spiking neurons.<br />
<br />
'''12 Dec 2012''' <br />
* Speaker: Ian Goodfellow<br />
* Affiliation: U Montreal<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''7 Jan 2013'''<br />
* Speaker: Stuart Hameroff<br />
* Affiliation: University of Arizona <br />
* Host: Gautam Agarwal<br />
* Status: confirmed<br />
* Title: Quantum cognition and brain microtubules <br />
* Abstract: Cognitive decision processes are generally seen as classical Bayesian probabilities, but may be better suited to quantum mathematics. For example: 1) Psychological conflict, ambiguity and uncertainty can be viewed as (quantum) superposition of multiple possible judgments and beliefs. 2) Measurement (e.g. answering a question, reaching a decision) reduces possibilities to definite states (‘constructing reality’, ‘collapsing the wave function’). 3) Previous questions influence subsequent answers, so sequence affects outcomes (‘contextual non-commutativity’). 4) Judgments and choices may deviate from classical logic, suggesting random, or ‘non-computable’ quantum influences. Can quantum cognition operate in the brain? Do classical brain activities simulate quantum processes? Or have biomolecular quantum devices evolved? In this talk I will discuss how a finer scale, intra-neuronal level of quantum information processing in cytoskeletal microtubules can accumulate, operate upon and integrate quantum information and memory for self-collapse to classical states which regulate axonal firings, controlling behavior.<br />
<br />
'''Monday 14 Jan 2013, 1:00pm'''<br />
* Speaker: Dibyendu Mandal <br />
* Affiliation: Physics Dept., University of Maryland (Jarzynski group)<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: An exactly solvable model of Maxwell’s demon<br />
* Abstract: The paradox of Maxwell’s demon has stimulated numerous thought experiments, leading to discussions about the thermodynamic implications of information processing. However, the field has lacked a tangible example or model of an autonomous, mechanical system that reproduces the actions of the demon. To address this issue, we introduce an explicit model of a device that can deliver work to lift a mass against gravity by rectifying thermal fluctuations, while writing information to a memory register. We solve for the steady-state behavior of the model and construct its nonequilibrium phase diagram. In addition to the engine-like action described above, we identify a Landauer eraser region in the phase diagram where the model uses externally supplied work to remove information from the memory register. Our model offers a simple paradigm for investigating the thermodynamics of information processing by exposing a transparent mechanism of operation.<br />
<br />
'''23 Jan 2013'''<br />
* Speaker: Carlos Brody<br />
* Affiliation: Princeton<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: Neural substrates of decision-making in the rat<br />
* Abstract: Gradual accumulation of evidence is thought to be a fundamental component of decision-making. Over the last 16 years, research in non-human primates has revealed neural correlates of evidence accumulation in parietal and frontal cortices, and other brain areas. However, the circuit mechanisms underlying these neural correlates remain unknown. Reasoning that a rodent model of evidence accumulation would allow a greater number of experimental subjects, and therefore experiments, as well as facilitate the use of molecular tools, we developed a rat accumulation of evidence task, the "Poisson Clicks" task. In this task, sensory evidence is delivered in pulses whose precisely-controlled timing varies widely within and across trials. The resulting data are analyzed with models of evidence accumulation that use the richly detailed information of each trial’s pulse timing to distinguish between different decision mechanisms. The method provides great statistical power, allowing us to: (1) provide compelling evidence that rats are indeed capable of gradually accumulating evidence for decision-making; (2) accurately estimate multiple parameters of the decision-making process from behavioral data; and (3) measure, for the first time, the diffusion constant of the evidence accumulator, which we show to be optimal (i.e., equal to zero). In addition, the method provides a trial-by-trial, moment-by-moment estimate of the value of the accumulator, which can then be compared in awake behaving electrophysiology experiments to trial-by-trial, moment-by-moment neural firing rate measures. Based on such a comparison, we describe data and a novel analysis approach that reveals differences between parietal and frontal cortices in the neural encoding of accumulating evidence. 
Finally, using semi-automated training methods to produce tens of rats trained in the Poisson Clicks accumulation of evidence task, we have also used pharmacological inactivation to ask, for the first time, whether parietal and frontal cortices are required for accumulation of evidence, and we are using optogenetic methods to rapidly and transiently inactivate brain regions so as to establish precisely when, during each decision-making trial, it is that each brain region's activity is necessary for performance of the task.<br />
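The pulsed-evidence accumulation idea behind the Poisson Clicks task can be sketched as a toy simulation; names and parameter values here are hypothetical, and the authors' actual model additionally fits leak, sensory adaptation and bound parameters alongside the diffusion constant (which they report to be near zero):<br />

```python
import numpy as np

def simulate_poisson_clicks_trial(rate_left, rate_right, duration_s,
                                  sigma_a=0.0, dt=0.001, seed=None):
    """Toy evidence accumulator: right clicks add +1, left clicks -1,
    with optional diffusion noise of scale sigma_a on the accumulator.
    Returns the accumulator trajectory; the choice is sign(a[-1])."""
    rng = np.random.default_rng(seed)
    n_steps = int(duration_s / dt)
    a = np.zeros(n_steps)
    for t in range(1, n_steps):
        # Net click count in this time bin (two independent Poisson streams).
        clicks = rng.poisson(rate_right * dt) - rng.poisson(rate_left * dt)
        a[t] = a[t - 1] + clicks + sigma_a * np.sqrt(dt) * rng.normal()
    return a
```

With sigma_a = 0 (the reported optimal value), trial-to-trial variability comes only from the click times themselves.<br />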
<br />
'''28 Jan 2013'''<br />
* Speaker: Eugene M. Izhikevich<br />
* Affiliation: Brain Corporation<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: Spikes<br />
* Abstract: Most communication in the brain is via spikes. While we understand the spike-generation mechanism of individual neurons, we fail to appreciate the spike-timing code and its role in neural computations. The speaker starts with simple models of neuronal spiking and bursting, describes small neuronal circuits that learn spike-timing code via spike-timing dependent plasticity (STDP), and finishes with biologically detailed and anatomically accurate large-scale brain models.<br />
<br />
'''29 Jan 2013'''<br />
* Speaker: Goren Gordon<br />
* Affiliation: Weizmann Institute<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: Hierarchical Curiosity Loops – Model, Behavior and Robotics<br />
* Abstract: Autonomously learning about one's own body and its interaction with the environment is a formidable challenge, yet it is ubiquitous in biology: every animal’s pup and every human infant accomplish this task in their first few months of life. Furthermore, biological agents’ curiosity actively drives them to explore and experiment in order to expedite their learning progress. To bridge the gap between biological and artificial agents, a formal mathematical theory of curiosity was developed that attempts to explain observed biological behaviors and enable curiosity emergence in robots. In the talk, I will present the hierarchical curiosity loops model, its application to rodent’s exploratory behavior and its implementation in a fully autonomously learning and behaving reaching robot.<br />
<br />
'''29 Jan 2013'''<br />
* Speaker: Jenny Read<br />
* Affiliation: Institute of Neuroscience, Newcastle University<br />
* Host: Sarah<br />
* Status: confirmed<br />
* Title: Stereoscopic vision<br />
* Abstract: [To be written]<br />
<br />
'''7 Feb 2013'''<br />
* Speaker: Valero Laparra<br />
* Affiliation: University of Valencia<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Empirical statistical analysis of phases in Gabor filtered natural images<br />
* Abstract:<br />
<br />
'''20 Feb 2013'''<br />
* Speaker: Dolores Bozovic<br />
* Affiliation: UCLA<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: Bifurcations and phase-locking dynamics in the auditory system<br />
* Abstract: The inner ear constitutes a remarkable biological sensor that exhibits nanometer-scale sensitivity of mechanical detection. The first step in auditory processing is performed by hair cells, which convert movement into electrical signals via opening of mechanically gated ion channels. These cells operate in a viscous medium, but can nevertheless sustain oscillations, amplify incoming signals, and even exhibit spontaneous motility, indicating the presence of an underlying active amplification system. Theoretical models have proposed that a hair cell constitutes a nonlinear system with an internal feedback mechanism that can drive it across a bifurcation and into an unstable regime. Our experiments explore the nonlinear response as well as feedback mechanisms that enable self-tuning already at the peripheral level, as measured in vitro on sensory tissue. A simple dynamic systems framework will be discussed that captures the main features of the experimentally observed behavior in the form of an Arnold Tongue.<br />
<br />
'''27 March 2013'''<br />
* Speaker: Dale Purves<br />
* Affiliation: Duke<br />
* Host: Sarah<br />
* Status: confirmed<br />
* Title: How Visual Evolution Determines What We See<br />
* Abstract: Information about the physical world is excluded from visual stimuli by the nature of biological vision (the inverse optics problem). Nonetheless, humans and other visual animals routinely succeed in their environments. The talk will explain how the assignment of perceptual values to visual stimuli according to the frequency of occurrence of stimulus patterns resolves the inverse problem and determines the basic visual qualities we see. This interpretation of vision implies that the best (and perhaps the only) way to understand visual system circuitry is to evolve it, an idea supported by recent work.<br />
<br />
'''9 April 2013'''<br />
* Speaker: Mounya Elhilali<br />
* Affiliation: Johns Hopkins<br />
* Host: Tyler<br />
* Status: confirmed<br />
* Title: Attention at the cocktail party: Neural bases and computational strategies for auditory scene analysis<br />
* Abstract: The perceptual organization of sounds in the environment into coherent objects is a feat constantly facing the auditory system. It manifests itself in the everyday challenge faced by humans and animals alike to parse complex acoustic information arising from multiple sound sources into separate auditory streams. While seemingly effortless, uncovering the neural mechanisms and computational principles underlying this remarkable ability remain a challenge for both the experimental and theoretical neuroscience communities. In this talk, I discuss the potential role of neuronal tuning in mammalian primary auditory cortex in mediating this process. I also examine the role of mechanisms of attention in adapting this neural representation to reflect both the sensory content and the changing behavioral context of complex acoustic scenes.<br />
<br />
'''17 April 2013'''<br />
* Speaker: Wiktor Młynarski<br />
* Affiliation: Max Planck Institute for Mathematics in the Sciences<br />
* Host: Urs<br />
* Status: confirmed<br />
* Title: Statistical Models of Binaural Sounds<br />
* Abstract: The auditory system exploits disparities in the sounds arriving at the left and right ear to extract information about the spatial configuration of sound sources. According to the widely acknowledged Duplex Theory, sounds of low frequency are localized based on Interaural Time Differences (ITDs) and localization of high frequency sources relies on Interaural Level Differences (ILDs). Natural sounds, however, possess a rich structure and contain multiple frequency components. This leads to the question: what are the contributions of different cues to sound position identification in the natural environment and how much information do they carry about its spatial structure? In this talk, I will present my attempts to answer the above questions using statistical, generative models of naturalistic (simulated) and fully natural binaural sounds.<br />
<br />
'''15 May 2013'''<br />
* Speaker: Byron Yu<br />
* Affiliation: CMU<br />
* Host: Bruno/Jose (jointly sponsored with CNEP)<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
'''22 May 2013'''<br />
* Speaker: Bijan Pesaran<br />
* Affiliation: NYU<br />
* Host: Bruno/Jose (jointly sponsored with CNEP)<br />
* Status: confirmed <br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
=== 2011/12 academic year ===<br />
<br />
'''15 Sep 2011 (Thursday, at noon)'''<br />
* Speaker: Kathrin Berkner<br />
* Affiliation: Ricoh Innovations Inc.<br />
* Host: Ivana Tosic<br />
* Status: Confirmed<br />
* Title: TBD<br />
* Abstract: TBD<br />
<br />
'''21 Sep 2011'''<br />
* Speaker: Mike Kilgard<br />
* Affiliation: UT Dallas<br />
* Host: Michael Silver<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''27 Sep 2011'''<br />
* Speaker: Moshe Gur<br />
* Affiliation: Dept. of Biomedical Engineering, Technion, Israel Institute of Technology<br />
* Host: Bruno/Stan<br />
* Status: Confirmed<br />
* Title: On the unity of perception: How does the brain integrate activity evoked at different cortical loci?<br />
* Abstract: Any physical device we know, including computers, when comparing A to B must send the information to point C. I have done experiments in three modalities, somato-sensory, auditory, and visual, in which two different loci in the primary cortex are stimulated, and I argue that the "machine" converging hypothesis cannot explain the perceptual results. Thus we must assume a non-converging mechanism whereby the brain, at times, can compare (integrate, process) events that take place at different loci without sending the information to a common target. Once we allow for such a mechanism, many phenomena can be viewed differently. Take for example the question of how and where multi-sensory integration takes place; we perceive a synchronized talking face yet detailed visual and auditory information are represented at very different brain loci.<br />
<br />
'''5 Oct 2011'''<br />
* Speaker: Susanne Still<br />
* Affiliation: University of Hawaii at Manoa<br />
* Host: Jascha<br />
* Status: confirmed<br />
* Title: Predictive power, memory and dissipation in learning systems operating far from thermodynamic equilibrium<br />
* Abstract: Understanding the physical processes that underlie the functioning of biological computing machinery often requires describing processes that occur far from thermodynamic equilibrium. In recent years significant progress has been made in this area, most notably Jarzynski’s work relation and Crooks’ fluctuation theorem. In this talk I will explore how dissipation of energy is related to a system's information processing inefficiency. The focus is on driven systems that are embedded in a stochastic operating environment. If we describe the system as a state machine, then we can interpret the stochastic dynamics as performing a computation that results in an (implicit) model of the stochastic driving signal. I will show that instantaneous non-predictive information, which serves as a measure of model inefficiency, provides a lower bound on the average dissipated work. This implies that learning systems with larger predictive power can operate more energetically efficiently. We may speculate that biological systems have evolved to reflect this kind of adaptation. One interesting insight here is that this purely physical requirement is perfectly in line with the general belief that a useful model must be predictive (at fixed model complexity). Our result thereby ties together ideas from learning theory with basic non-equilibrium thermodynamics.<br />
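The bound described in the abstract can be written out; this is a paraphrase from the published version of this work (Still et al., "Thermodynamics of prediction"), with notation chosen here rather than quoted:<br />

```latex
% Hedged paraphrase: the average work dissipated between times t and t+1
% is bounded below by the instantaneous non-predictive information,
% i.e. memory about the driving signal minus predictive power.
\beta \,\langle W_{\mathrm{diss}}(t) \rangle \;\ge\;
  \underbrace{I[s_t; x_t]}_{\text{memory}} \;-\;
  \underbrace{I[s_t; x_{t+1}]}_{\text{predictive power}}
```

Here \(s_t\) is the system's state (its implicit model) and \(x_t\) the stochastic driving signal, so information retained about the past that does not predict the future sets a floor on dissipation.<br />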
<br />
'''19 Oct 2011'''<br />
* Speaker: Graham Cummins<br />
* Affiliation: WSU<br />
* Host: Jeff Teeters<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''26 Oct 2011'''<br />
* Speaker: Shinji Nishimoto<br />
* Affiliation: Gallant lab, UC Berkeley<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''14 Dec 2011'''<br />
* Speaker: Austin Roorda<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: How the unstable eye sees a stable and moving world<br />
* Abstract:<br />
<br />
'''11 Jan 2012'''<br />
* Speaker: Ken Nakayama<br />
* Affiliation: Harvard University<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Subjective Contours<br />
* Abstract: The concept of the receptive field in visual science has been transformative. It fueled great discoveries of the second half of the 20th century, providing the dominant understanding of how the visual system works at its early stages. Its reign has been extended to the field of object recognition where, in the form of a linear classifier, it provides a framework to understand visual object recognition (DiCarlo and Cox, 2007).<br />
Untamed, however, are areas of visual perception, now more or less ignored, dubbed variously as the 2.5 D sketch, mid-level vision, surface representations. Here, neurons with their receptive fields seem unable to bridge the gap, to supply us with even a plausible speculative framework to understand amodal completion, subjective contours and other surface phenomena. Correspondingly, these areas have become a backwater, ignored, leapt over.<br />
Subjective contours, however, remain as vivid as ever, even more so.<br />
Every day, our visual system makes countless visual inferences as to the layout of the world's surfaces and objects. What’s remarkable is that subjective contours visibly reveal these inferences.<br />
<br />
'''Tuesday, 24 Jan 2012'''<br />
* Speaker: Aniruddha Das<br />
* Affiliation: Columbia University<br />
* Host: Fritz<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''22 Feb 2012'''<br />
* Speaker: Elad Schneidman <br />
* Affiliation: Department of Neurobiology, Weizmann Institute of Science<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Sparse high order interaction networks underlie learnable neural population codes<br />
* Abstract:<br />
<br />
'''29 Feb 2012 (at noon as usual)'''<br />
* Speaker: Heather Read<br />
* Affiliation: U. Connecticut<br />
* Host: Mike DeWeese<br />
* Status: confirmed<br />
* Title: "Transformation of sparse temporal coding from auditory colliculus and cortex"<br />
* Abstract: TBD<br />
<br />
'''1 Mar 2012 (note: Thurs)'''<br />
* Speaker: Daniel Zoran<br />
* Affiliation: Hebrew University, Jerusalem<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''7 Mar 2012'''<br />
* Speaker: David Sivak<br />
* Affiliation: UCB<br />
* Host: Mike DeWeese<br />
* Status: Confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''8 Mar 2012'''<br />
* Speaker: Ivan Schwab<br />
* Affiliation: UC Davis<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Evolution's Witness: How Eyes Evolved<br />
* Abstract:<br />
<br />
'''14 Mar 2012'''<br />
* Speaker: David Sussillo<br />
* Affiliation:<br />
* Host: Jascha<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''18 April 2012'''<br />
* Speaker: Kristofer Bouchard<br />
* Affiliation: UCSF<br />
* Host: Bruno<br />
* Status: confirmed<br />
* Title: Cortical Foundations of Human Speech Production<br />
* Abstract:<br />
<br />
'''23 May 2012''' (rescheduled from April 11)<br />
* Speaker: Logan Grosenick<br />
* Affiliation: Stanford, Deisseroth & Suppes Labs<br />
* Host: Jascha<br />
* Status: confirmed<br />
* Title: Acquisition, creation, & analysis of 4D light fields with applications to calcium imaging & optogenetics<br />
* Abstract: In Light Field Microscopy (LFM), images can be computationally refocused after they are captured [1]. This permits acquiring focal stacks and reconstructing volumes from a single camera frame. In Light Field Illumination (LFI), the same ideas can be used to create an illumination system that can deliver focused light to any position in a volume without moving optics, and these two devices (LFM/LFI) can be used together in the same system [2]. So far, these imaging and illumination systems have largely been used independently in proof-of-concept experiments [1,2]. In this talk I will discuss applications of a combined scanless volumetric imaging and volumetric illumination system applied to 4D calcium imaging and photostimulation of neurons in vivo and in vitro. The volumes resulting from these methods are large (>500,000 voxels per time point), collected at 10-100 frames per second, and highly correlated in space and time. Analyzing such data has required the development and application of machine learning methods appropriate to large, sparse, nonnegative data, as well as the estimation of neural graphical models from calcium transients. This talk will cover the reconstruction and creation of volumes in a microscope using Light Fields [1,2], and the current state-of-the-art for analyzing these large volumes in the context of calcium imaging and optogenetics. <br />
<br />
[1] M. Levoy, R. Ng, A. Adams, M. Footer, and M. Horowitz. Light Field Microscopy. ACM Transactions on Graphics 25(3), Proceedings of SIGGRAPH 2006.<br />
[2] M. Levoy, Z. Zhang, and I. McDowall. Recording and controlling the 4D light field in a microscope. Journal of Microscopy, Volume 235, Part 2, 2009, pp. 144-162. Cover article.<br />
<br />
BIO: Logan received bachelor's degrees with honors in Biology and Psychology from Stanford, and a master's degree in Statistics from Stanford. He is a Ph.D. candidate in the Neurosciences Program working in the labs of Karl Deisseroth and Patrick Suppes, and a trainee at the Stanford Center for Mind, Brain, and Computation. He is interested in developing and applying novel computational imaging and machine learning techniques in order to observe, control, and understand neuronal circuit dynamics.<br />
<br />
'''7 June 2012''' (Thursday)<br />
* Speaker: Mitya Chklovskii<br />
* Affiliation: janelia<br />
* Host: Bruno<br />
* Status:<br />
* Title:<br />
* Abstract:<br />
<br />
'''27 June 2012''' <br />
* Speaker: Jerry Feldman<br />
* Affiliation:<br />
* Host: Bruno<br />
* Status:<br />
* Title:<br />
* Abstract:<br />
<br />
'''30 July 2012''' <br />
* Speaker: Lucas Theis<br />
* Affiliation: Matthias Bethge lab, Werner Reichardt Centre for Integrative Neuroscience, Tübingen<br />
* Host: Jascha<br />
* Status: Confirmed<br />
* Title: Hierarchical models of natural images<br />
* Abstract: Probabilistic models of natural images have been used to solve a variety of computer vision tasks, as well as serving as a means to better understand the computations performed by the visual system in the brain. Many theoretical considerations and biological observations point to the fact that natural image models should be hierarchically organized, yet to date, the best known models are still based on what is better described as shallow representations. In this talk, I will present two image models. One is based on the idea of Gaussianization for greedily constructing hierarchical generative models. I will show that when combined with independent subspace analysis, it is able to compete with the state of the art for modeling image patches. The other model combines mixtures of Gaussian scale mixtures with a directed graphical model and multiscale image representations and is able to generate highly structured images of arbitrary size. Evaluating the model's likelihood and comparing it to a large number of other image models shows that it might well be the best model for natural images yet.<br />
<br />
(joint work with Reshad Hosseini and Matthias Bethge)<br />
<br />
=== 2010/11 academic year ===<br />
<br />
'''02 Sep 2010'''<br />
* Speaker: Johannes Burge<br />
* Affiliation: University of Texas at Austin<br />
* Host: Jimmy<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''8 Sep 2010'''<br />
* Speaker: Tobi Szuts<br />
* Affiliation: Meister Lab/ Harvard U.<br />
* Host: Mike DeWeese<br />
* Status: Confirmed<br />
* Title: Wireless recording of neural activity in the visual cortex of a freely moving rat.<br />
* Abstract: Conventional neural recording systems restrict behavioral experiments to a flat indoor environment compatible with the cable that tethers the subject to the recording instruments. To overcome these constraints, we developed a wireless multi-channel system for recording neural signals from a freely moving animal the size of a rat or larger. The device takes up to 64 voltage signals from implanted electrodes, samples each at 20 kHz, time-division multiplexes them onto a single output line, and transmits that output by radio frequency to a receiver and recording computer up to >60 m away. The system introduces less than 4 µV RMS of electrode-referred noise, comparable to wired recording systems and considerably less than biological noise. The system has greater channel count or transmission distance than existing telemetry systems. The wireless system has been used to record from the visual cortex of a rat during unconstrained conditions. Outdoor recordings show V1 activity is modulated by nest-building activity. During unguided behavior indoors, neurons responded rapidly and consistently to changes in light level, suppressive effects were prominent in response to an illuminant transition, and firing rate was strongly modulated by locomotion. Neural firing in the visual cortex is relatively sparse and moderate correlations are observed over large distances, suggesting that synchrony is driven by global processes.<br />
<br />
'''29 Sep 2010'''<br />
* Speaker: Vikash Gilja<br />
* Affiliation: Stanford University<br />
* Host: Charles<br />
* Status: Confirmed<br />
* Title: Towards Clinically Viable Neural Prosthetic Systems.<br />
* Abstract:<br />
<br />
'''20 Oct 2010'''<br />
* Speaker: Alexandre Francois<br />
* Affiliation: USC<br />
* Host: <br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''3 Nov 2010'''<br />
* Speaker: Eric Jonas and Vikash Mansinghka<br />
* Affiliation: Navia Systems<br />
* Host: Jascha<br />
* Status: Confirmed<br />
* Title: Natively Probabilistic Computation: Principles, Artifacts, Architectures and Applications<br />
* Abstract: Complex probabilistic models and Bayesian inference are becoming<br />
increasingly critical across science and industry, especially in<br />
large-scale data analysis. They are also central to our best<br />
computational accounts of human cognition, perception and action.<br />
However, all these efforts struggle with the infamous curse of<br />
dimensionality. Rich probabilistic models can seem hard to write and<br />
even harder to solve, as specifying and calculating probabilities<br />
often appears to require the manipulation of exponentially (and<br />
sometimes infinitely) large tables of numbers.<br />
<br />
We argue that these difficulties reflect a basic mismatch between the<br />
needs of probabilistic reasoning and the deterministic, functional<br />
orientation of our current hardware, programming languages and CS<br />
theory. To mitigate these issues, we have been developing a stack of<br />
abstractions for natively probabilistic computation, based around<br />
stochastic simulators (or samplers) for distributions, rather than<br />
evaluators for deterministic functions. Ultimately, our aim is to<br />
produce a model of computation and the associated hardware and<br />
programming tools that are as suited for uncertain inference and<br />
decision-making as our current computers are for precise arithmetic.<br />
<br />
In this talk, we will give an overview of the entire stack of<br />
abstractions supporting natively probabilistic computation, with<br />
technical detail on several hardware and software artifacts we have<br />
implemented so far. We will also touch on some new theoretical results<br />
regarding the computational complexity of probabilistic programs.<br />
Throughout, we will motivate and connect this work to some current<br />
applications in biomedical data analysis and computer vision, as well<br />
as potential hypotheses regarding the implementation of probabilistic<br />
computation in the brain.<br />
<br />
This talk includes joint work with Keith Bonawitz, Beau Cronin,<br />
Cameron Freer, Daniel Roy and Joshua Tenenbaum.<br />
<br />
BRIEF BIOGRAPHY<br />
<br />
Vikash Mansinghka is a co-founder and the CTO of Navia Systems, a<br />
venture-funded startup company building natively probabilistic<br />
computing machines. He spent 10 years at MIT, eventually earning an<br />
S.B. in Mathematics, an S.B. in Computer Science, an MEng in Computer<br />
Science, and a PhD in Computation. He held graduate fellowships from<br />
the NSF and MIT's Lincoln Laboratories, and his PhD dissertation won<br />
the 2009 MIT George M. Sprowls award for best dissertation in computer<br />
science. He currently serves on DARPA's Information Science and<br />
Technology (ISAT) Study Group.<br />
<br />
Eric Jonas is a co-founder of Navia Systems, responsible for in-house<br />
accelerated inference research and development. He spent ten years at<br />
MIT, where he earned SB degrees in electrical engineering and computer<br />
science and neurobiology, an MEng in EECS, with a neurobiology PhD<br />
expected really soon. He’s passionate about biological applications<br />
of probabilistic reasoning and hopes to use Navia’s capabilities to<br />
combine data from biological science, clinical histories, and patient<br />
outcomes into seamless models.<br />
<br />
'''8 Nov 2010'''<br />
* Speaker: Patrick Ruther<br />
* Affiliation: Imtek, University of Freiburg<br />
* Host: Tim<br />
* Status: Confirmed<br />
* Title: TBD<br />
* Abstract: TBD<br />
<br />
'''10 Nov 2010'''<br />
* Speaker: Aurel Lazar<br />
* Affiliation: Department of Electrical Engineering, Columbia University<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Encoding Visual Stimuli with a Population of Hodgkin-Huxley Neurons<br />
* Abstract: We first present a general framework for the reconstruction of natural video<br />
scenes encoded with a population of spiking neural circuits with random thresholds.<br />
The visual encoding system consists of a bank of filters, modeling the visual<br />
receptive fields, in cascade with a population of neural circuits, modeling encoding<br />
with spikes in the early visual system.<br />
The neuron models considered include integrate-and-fire neurons and ON-OFF<br />
neuron pairs with threshold-and-fire spiking mechanisms. All thresholds are assumed<br />
to be random. We show that for both time-varying and space-time-varying stimuli neural<br />
spike encoding is akin to taking noisy measurements on the stimulus.<br />
Second, we formulate the reconstruction problem as the minimization of a<br />
suitable cost functional in a finite-dimensional vector space and provide an explicit<br />
algorithm for stimulus recovery. We also present a general solution using the theory of<br />
smoothing splines in Reproducing Kernel Hilbert Spaces. We provide examples of both<br />
synthetic video as well as for natural scenes and show that the quality of the<br />
reconstruction degrades gracefully as the threshold variability of the neurons increases.<br />
Third, we demonstrate a number of simple operations on the original visual stimulus<br />
including translations, rotations and zooming. All these operations are natively executed<br />
in the spike domain. The processed spike trains are decoded for the faithful recovery<br />
of the stimulus and its transformations.<br />
Finally, we extend the above results to neural encoding circuits built with Hodgkin-Huxley<br />
neurons.<br />
References:<br />
Aurel A. Lazar, Eftychios A. Pnevmatikakis and Yiyin Zhou,<br />
Encoding Natural Scenes with Neural Circuits with Random Thresholds, Vision Research, 2010,<br />
Special Issue on Mathematical Models of Visual Coding,<br />
http://dx.doi.org/10.1016/j.visres.2010.03.015<br />
Aurel A. Lazar,<br />
Population Encoding with Hodgkin-Huxley Neurons,<br />
IEEE Transactions on Information Theory, Volume 56, Number 2, pp. 821-837, February, 2010,<br />
Special Issue on Molecular Biology and Neuroscience,<br />
http://dx.doi.org/10.1109/TIT.2009.2037040<br />
<br />
'''11 Nov 2010''' (UCB holiday)<br />
* Speaker: Martha Nari Havenith<br />
* Affiliation: UCL<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: Finding spike timing in the visual cortex - Oscillations as the internal clock of vision?<br />
* Abstract:<br />
<br />
'''19 Nov 2010''' (note: on Friday because of SFN)<br />
* Speaker: Dan Butts<br />
* Affiliation: UMD<br />
* Host: Tim<br />
* Status: Confirmed<br />
* Title: Common roles of inhibition in visual and auditory processing.<br />
* Abstract: The role of inhibition in sensory processing is often obscured in extracellular recordings, because the absence of a neuronal response associated with inhibition might also be explained by a simple lack of excitation. However, increasingly, evidence from intracellular recordings demonstrates important roles of inhibition in shaping the stimulus selectivity of sensory neurons in both the visual and auditory systems. We have developed a nonlinear modeling approach that can identify putative excitatory and inhibitory inputs to a neuron using standard extracellular recordings, and have applied these techniques to understand the role of inhibition in shaping sensory processing in visual and auditory areas. In pre-cortical visual areas (retina and LGN), we find that inhibition likely plays a role in generating temporally precise responses, and mediates adaptation to changing contrast. In an auditory pre-cortical area (inferior colliculus) identified inhibition has nearly identical appearance and functions in temporal processing and adaptation. Thus, we predict common roles of inhibition in these sensory areas, and more generally demonstrate general methods for characterizing the nonlinear computations that comprise sensory processing.<br />
<br />
'''24 Nov 2010'''<br />
* Speaker: Eizaburo Doi<br />
* Affiliation: NYU<br />
* Host: Jimmy<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
<br />
'''29 Nov 2010 - informal talk'''<br />
* Speaker: Eero Lehtonen<br />
* Affiliation: UTU Finland<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Memristors<br />
* Abstract:<br />
<br />
'''1 Dec 2010'''<br />
* Speaker: Gadi Geiger<br />
* Affiliation: MIT<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: Visual and Auditory Perceptual Modes that Characterize Dyslexics<br />
* Abstract: I will describe how dyslexics’ visual and auditory perception is wider and more diffuse than that of typical readers. This suggests wider neural tuning in dyslexics. In addition I will describe how this processing relates to difficulties in reading. Strengthening the argument, and more so helping dyslexics, I will describe a regimen of practice that results in improved reading in dyslexics while narrowing perception.<br />
<br />
<br />
'''13 Dec 2010'''<br />
* Speaker: Jorg Lueke<br />
* Affiliation: FIAS<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Linear and Non-linear Approaches to Component Extraction and Their Applications to Visual Data<br />
* Abstract: In the nervous system of humans and animals, sensory data are represented as combinations of elementary data components. While for data such as sound waveforms the elementary components combine linearly, other data can better be modeled by non-linear forms of component superpositions. I motivate and discuss two models with binary latent variables: one using standard linear superpositions of basis functions and one using non-linear superpositions. Crucial for the applicability of both models are efficient learning procedures. I briefly introduce a novel training scheme (ET) and show how it can be applied to probabilistic generative models. For linear and non-linear models the scheme efficiently infers the basis functions as well as the level of sparseness and data noise. In large-scale applications to image patches, we show results on the statistics of inferred model parameters. Differences between the linear and non-linear models are discussed, and both models are compared to results of standard approaches in the literature and to experimental findings. Finally, I briefly discuss learning in a recent model that takes explicit component occlusions into account.<br />
<br />
'''15 Dec 2010'''<br />
* Speaker: Claudia Clopath<br />
* Affiliation: Université Paris Descartes<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
<br />
'''18 Jan 2011'''<br />
* Speaker: Siwei Lyu<br />
* Affiliation: Computer Science Department, University at Albany, SUNY<br />
* Host: Bruno<br />
* Status: confirmed <br />
* Title: Divisive Normalization as an Efficient Coding Transform: Justification and Evaluation<br />
* Abstract:<br />
<br />
'''19 Jan 2011'''<br />
* Speaker: David Field (informal talk)<br />
* Affiliation: <br />
* Host: Bruno<br />
* Status: Tentative<br />
* Title: <br />
* Abstract:<br />
<br />
'''25 Jan 2011'''<br />
* Speaker: Ruth Rosenholtz<br />
* Affiliation: Dept. of Brain & Cognitive Sciences, Computer Science and AI Lab, MIT<br />
* Host: Bruno<br />
* Status: Confirmed <br />
* Title: What your visual system sees where you are not looking<br />
* Abstract:<br />
<br />
'''26 Jan 2011'''<br />
* Speaker: Ernst Niebur<br />
* Affiliation: Johns Hopkins U<br />
* Host: Fritz<br />
* Status: Confirmed <br />
* Title: <br />
* Abstract:<br />
<br />
'''16 March 2011'''<br />
* Speaker: Vladimir Itskov<br />
* Affiliation: University of Nebraska-Lincoln<br />
* Host: Chris<br />
* Status: Confirmed <br />
* Title: <br />
* Abstract:<br />
<br />
'''23 March 2011'''<br />
* Speaker: Bruce Cumming<br />
* Affiliation: National Institutes of Health<br />
* Host: Ivana<br />
* Status: Confirmed<br />
* Title: TBD<br />
* Abstract:<br />
<br />
'''27 April 2011'''<br />
* Speaker: Lubomir Bourdev<br />
* Affiliation: Computer Science, UC Berkeley<br />
* Host:Bruno<br />
* Status: Confirmed<br />
* Title: "Poselets and Their Applications in High-Level Computer Vision Problems"<br />
* Abstract:<br />
<br />
'''12 May 2011 (note: Thursday)'''<br />
* Speaker: Jack Culpepper<br />
* Affiliation: Redwood Center/EECS<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''26 May 2011'''<br />
* Speaker: Ian Stevenson<br />
* Affiliation: Northwestern University<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Explaining tuning curves by estimating interactions between neurons<br />
* Abstract: One of the central tenets of systems neuroscience is that tuning curves are a byproduct of the interactions between neurons. Using multi-electrode recordings and recently developed inference techniques we can begin to examine this idea in detail and study how well we can explain the functional properties of neurons using the activity of other simultaneously recorded neurons. Here we examine datasets from 6 different brain areas recorded during typical sensorimotor tasks each with ~100 simultaneously recorded neurons. Using these datasets we measured the extent to which interactions between neurons can explain the tuning properties of individual neurons. We found that, in almost all areas, modeling interactions between 30-50 neurons allows more accurate spike prediction than tuning curves. This suggests that tuning can, in some sense, be explained by interactions between neurons in a variety of brain areas, even when recordings consist of relatively small numbers of neurons.<br />
<br />
'''1 June 2011'''<br />
* Speaker: Michael Oliver<br />
* Affiliation: Gallant lab<br />
* Host: Bruno<br />
* Status: Tentative <br />
* Title: <br />
* Abstract:<br />
<br />
'''8 June 2011'''<br />
* Speaker: Alyson Fletcher<br />
* Affiliation: UC Berkeley<br />
* Host: Bruno<br />
* Status: tentative<br />
* Title: Generalized Approximate Message Passing for Neural Receptive Field Estimation and Connectivity<br />
* Abstract: Fundamental to understanding sensory encoding and connectivity of neurons are effective tools for developing and validating complex mathematical models from experimental data. In this talk, I present a graphical models approach to the problems of neural connectivity reconstruction under multi-neuron excitation and to receptive field estimation of sensory neurons in response to stimuli. I describe a new class of Generalized Approximate Message Passing (GAMP) algorithms for a general class of inference problems on graphical models, based on Gaussian approximations of loopy belief propagation. The GAMP framework is extremely general, provides a systematic procedure for incorporating a rich class of nonlinearities, and is computationally tractable with large amounts of data. In addition, for both the connectivity reconstruction and parameter estimation problems, I show that GAMP-based estimation can naturally incorporate sparsity constraints in the model that arise from the fact that only a small fraction of the potential inputs have any influence on the output of a particular neuron. A simulation of reconstruction of cortical neural mapping under multi-neuron excitation shows that GAMP offers improvement over previous compressed sensing methods. The GAMP method is also validated on estimation of linear nonlinear Poisson (LNP) cascade models for neural responses of salamander retinal ganglion cells.<br />
<br />
=== 2009/10 academic year ===<br />
<br />
'''2 September 2009''' <br />
* Speaker: Keith Godfrey<br />
* Affiliation: University of Cambridge<br />
* Host: Tim<br />
* Status: Confirmed<br />
* Title: TBA<br />
* Abstract:<br />
<br />
'''7 October 2009'''<br />
* Speaker: Anita Schmid<br />
* Affiliation: Cornell University<br />
* Host: Kilian<br />
* Status: Confirmed<br />
* Title: Subpopulations of neurons in visual area V2 perform differentiation and integration operations in space and time<br />
* Abstract: The interconnected areas of the visual system work together to find object boundaries in visual scenes. Primary visual cortex (V1) mainly extracts oriented luminance boundaries, while secondary visual cortex (V2) also detects boundaries defined by differences in texture. How the outputs of V1 neurons are combined to allow for the extraction of these more complex boundaries in V2 is as of yet unclear. To address this question, we probed the processing of orientation signals in single neurons in V1 and V2, focusing on response dynamics of neurons to patches of oriented gratings and to combinations of gratings in neighboring patches and sequential time frames. We found two kinds of response dynamics in V2, both of which are different from those of V1 neurons. While V1 neurons in general prefer one orientation, one subpopulation of V2 neurons (“transient”) shows a temporally dynamic preference, resulting in a preference for changes in orientation. The second subpopulation of V2 neurons (“sustained”) responds similarly to V1 neurons, but with a delay. The dynamics of nonlinear responses to combinations of gratings reinforce these distinctions: the dynamics enhance the preference of V1 neurons for continuous orientations, and enhance the preference of V2 transient neurons for discontinuous ones. We propose that transient neurons in V2 perform a differentiation operation on the V1 input, both spatially and temporally, while the sustained neurons perform an integration operation. We show that a simple feedforward network with delayed inhibition can account for the temporal but not for the spatial differentiation operation.<br />
<br />
'''28 October 2009'''<br />
* Speaker: Andrea Benucci<br />
* Affiliation: Institute of Ophthalmology, University College London<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Stimulus dependence of the functional connectivity between neurons in primary visual cortex<br />
* Abstract: It is known that visual stimuli are encoded by the concerted activity of large populations of neurons in visual cortical areas. However, it is only recently that recording techniques have been made available to study such activations from large ensembles of neurons simultaneously, with millisecond temporal precision and tens of microns spatial resolution. I will present data from voltage-sensitive dye (VSD) imaging and multi-electrode recordings (“Utah” probes) from the primary visual cortex of the cat (V1). I will discuss the relationship between two fundamental cortical maps of the visual system: the map of retinotopy and the map of orientation. Using spatially localized and full-field oriented stimuli, we studied the functional interdependency of these maps. I will describe traveling and standing waves of cortical activity and their key role as a dynamical substrate for the spatio-temporal coding of visual information. I will further discuss the properties of the spatio-temporal code in the context of continuous visual stimulation. While recording population responses to a sequence of oriented stimuli, we asked how responses to individual stimuli summate over time. We found that such rules are mostly linear, supporting the idea that spatial and temporal codes in area V1 operate largely independently. However, these linear rules of summation fail when the visual drive is removed, suggesting that the visual cortex can readily switch between a dynamical regime where either feed-forward or intra-cortical inputs determine the response properties of the network.<br />
<br />
'''12 November 2009 (Thursday)'''<br />
* Speaker: Song-Chun Zhu<br />
* Affiliation: UCLA<br />
* Host: Jimmy<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''18 November 2009'''<br />
* Speaker: Dan Graham<br />
* Affiliation: Dept. of Mathematics, Dartmouth College<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: The Packet-Switching Brain: A Hypothesis<br />
* Abstract: Despite great advances in our understanding of neural responses to natural stimuli, the basic structure of the neural code remains elusive. In this talk, I will describe a novel hypothesis regarding the fundamental structure of neural coding in mammals. In particular, I propose that an internet-like routing architecture (specifically packet-switching) underlies neocortical processing, and I propose means of testing this hypothesis via neural response sparseness measurements. I will synthesize a host of suggestive evidence that supports this notion and will, more generally, argue in favor of a large scale shift from the now dominant “computer metaphor,” to the “internet metaphor.” This shift is intended to spur new thinking with regard to neural coding, and its main contribution is to privilege communication over computation as the prime goal of neural systems.<br />
<br />
'''16 December 2009'''<br />
* Speaker: Pietro Berkes<br />
* Affiliation: Volen Center for Complex Systems, Brandeis University<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Generative models of vision: from sparse coding toward structured models<br />
* Abstract: From a computational perspective, one can think of visual perception as the problem of analyzing the light patterns detected by the retina to recover their external causes. This process requires combining the incoming sensory evidence with internal prior knowledge about general properties of visual elements and the way they interact, and can be formalized in a class of models known as causal generative models. In the first part of the talk, I will discuss the first and most established generative model, namely the sparse coding model. Sparse coding has been largely successful in showing how the main characteristics of simple cells receptive fields can be accounted for based uniquely on the statistics of natural images. I will briefly review the evidence supporting this model, and contrast it with recent data from the primary visual cortex of ferrets and rats showing that the sparseness of neural activity over development and anesthesia seems to follow trends opposite to those predicted by sparse coding. In the second part, I will argue that the generative point of view calls for models of natural images that take into account more of the structure of the visual environment. I will present a model that takes a first step in this direction by incorporating the fundamental distinction between identity and attributes of visual elements. After learning, the model mirrors several aspects of the organization of V1, and results in a novel interpretation of complex and simple cells as parallel population of cells, coding for different aspects of the visual input. Further steps toward more structured generative models might thus lead to the development of a more comprehensive account of visual processing in the visual cortex.<br />
<br />
'''6 January 2010'''<br />
* Speaker: Susanne Still<br />
* Affiliation: U of Hawaii<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''20 January 2010'''<br />
* Speaker: Tom Dean<br />
* Affiliation: Google<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: Accelerating Computer Vision and Machine Learning Algorithms with Graphics Processors<br />
* Abstract: Graphics processors (GPUs) and massively-multi-core architectures are becoming more powerful, less costly and more energy efficient, and the related programming language issues are beginning to sort themselves out. That said most researchers don’t want to be writing code that depends on any particular architecture or parallel programming model. Linear algebra, Fourier analysis and image processing have standard libraries that are being ported to exploit SIMD parallelism in GPUs. We can depend on the massively-multiple-core machines du jour to support these libraries and on the high-performance-computing (HPC) community to do the porting for us or with us. These libraries can significantly accelerate important applications in image processing, data analysis and information retrieval. We can develop APIs and the necessary run-time support so that code relying on these libraries will run on any machine in a cluster of computers but exploit GPUs whenever available. This strategy allows us to move toward hybrid computing models that enable a wider range of opportunities for parallelism without requiring the special training of programmers or the disadvantages of developing code that depends on specialized hardware or programming models. This talk summarizes the state of the art in massively-multi-core architectures, presents experimental results that demonstrate the potential for significant performance gains in the two general areas of image processing and machine learning, provides examples of the proposed programming interface, and some more detailed experimental results on one particular problem involving video-content analysis.<br />
<br />
'''27 January 2010'''<br />
* Speaker: David Philiponna<br />
* Affiliation: Paris<br />
* Host: Bruno<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''24 February 2010'''<br />
* Speaker: Gordon Pipa<br />
* Affiliation: U Osnabrueck/MPI Frankfurt<br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''3 March 2010'''<br />
* Speaker: Gaute Einevoll<br />
* Affiliation: UMB, Norway<br />
* Host: Amir<br />
* Status: Confirmed<br />
* Title: TBA<br />
* Abstract: TBA<br />
<br />
<br />
'''4 March 2010'''<br />
* Speaker: Harvey Swadlow<br />
* Affiliation: <br />
* Host: Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''8 April 2010'''<br />
* Speaker: Alan Yuille <br />
* Affiliation: UCLA<br />
* Host: Amir<br />
* Status: Confirmed (for 1pm)<br />
* Title: <br />
* Abstract:<br />
<br />
'''28 April 2010'''<br />
* Speaker: Dharmendra Modha - cancelled<br />
* Affiliation: IBM<br />
* Host:Fritz<br />
* Status: Confirmed<br />
* Title: <br />
* Abstract:<br />
<br />
'''5 May 2010'''<br />
* Speaker: David Zipser<br />
* Affiliation: UCB<br />
* Host: Daniel Little<br />
* Status: Tentative<br />
* Title: Brytes 2:<br />
* Abstract:<br />
<br />
Brytes are little brains that can be assembled into larger, smarter brains. In my first talk I presented a biologically plausible, computationally tractable model of brytes and described how they can be used as subunits to build brains with interesting behaviors.<br />
<br />
In this talk I will first show how large numbers of brytes can cooperate to perform complicated actions such as arm and hand manipulations in the presence of obstacles. Then I describe a strategy for a higher level of control that informs each bryte what role it should play in accomplishing the current task. These results could have considerable significance for understanding the brain and possibly be applicable to robotics and BMI.<br />
<br />
'''12 May 2010'''<br />
* Speaker: Frank Werblin (Redwood group meeting - internal only)<br />
* Affiliation: Berkeley<br />
* Host: Bruno<br />
* Status: Tentative<br />
* Title: <br />
* Abstract:<br />
<br />
'''19 May 2010'''<br />
* Speaker: Anna Judith<br />
* Affiliation: UCB<br />
* Host: Daniel Little (Redwood Lab Meeting - internal only)<br />
* Status: confirmed<br />
* Title: <br />
* Abstract:</div>Jesselivezeyhttps://rctn.org/w/index.php?title=ClusterAdmin&diff=8464ClusterAdmin2016-02-04T20:05:06Z<p>Jesselivezey: /* Matlab Installation */</p>
<hr />
<div>= Cluster Administration =<br />
<br />
This page documents the details of different aspects of administration of the cluster.<br />
<br />
== Managing Local Modules ==<br />
<br />
Cortex manages its own group of modules locally. You can ask Bruno about getting added to the admin team. Currently installed modules can be seen by running <br />
module avail<br />
The bottom group of modules are the ones that we are managing ourselves.<br />
<br />
=== Installing New Modules ===<br />
<br />
There are two steps to create a new module.<br />
First, the package needs to be downloaded and built locally. Second, the package needs to be added to the list of modules.<br />
<br />
Package installers and executable files belong in<br />
<br />
/clusterfs/cortex/software/<br />
<br />
To add the package as a module, a folder/file needs to be created in<br />
/global/home/groups/cortex/modulefiles/centos-5.x86_64/<br />
You should look at some of the other module files as examples.<br />
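For illustration, a minimal environment modulefile for a hypothetical package <code>foo/1.0</code> might look like the sketch below. The package name, version, and subdirectory layout are made up for this example; only the two directories are the real paths given above. Check the existing module files before copying this verbatim.

```tcl
#%Module1.0
## Sketch of a modulefile for a hypothetical package "foo", version 1.0.
## Assumes the package was built under /clusterfs/cortex/software/foo-1.0.

module-whatis "foo 1.0: example locally managed package"

set root /clusterfs/cortex/software/foo-1.0

# Make the package's binaries, libraries, and man pages visible.
prepend-path PATH            $root/bin
prepend-path LD_LIBRARY_PATH $root/lib
prepend-path MANPATH         $root/share/man
```

Saved as, e.g., /global/home/groups/cortex/modulefiles/centos-5.x86_64/foo/1.0, the package would then appear under <code>module avail</code> and load with <code>module load foo/1.0</code>.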
<br />
=== Module ownership ===<br />
<br />
It is easiest to administer the modules if everyone on the admin team (cortexsw group) has read, write, and execute privileges for the package files and module file.<br />
You can do this by hand with chmod, or you can run<br />
bash /clusterfs/cortex/software/fix_permissions.sh <br />
which will add the correct permissions for the module files.<br />
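As a rough sketch of what such a permissions-fixing script does (the actual <code>fix_permissions.sh</code> on the cluster may do more; the function name here is hypothetical, and the group name <code>cortexsw</code> is taken from the text above):

```shell
#!/usr/bin/env bash
# Sketch only: give the admin group read/write access (plus execute
# where something is already executable, e.g. directories) over a
# software tree. Not the actual contents of fix_permissions.sh.

fix_permissions() {
    local dir="$1"
    local group="${2:-cortexsw}"
    # Set group ownership; this may require admin rights, so failures
    # are tolerated here for the sake of the sketch.
    chgrp -R "$group" "$dir" 2>/dev/null || true
    # g+rwX: group read/write everywhere, execute only where some
    # execute bit is already set (directories and executables).
    chmod -R g+rwX "$dir"
}

# Usage: fix_permissions /clusterfs/cortex/software
```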
<br />
== Matlab Installation ==<br />
<br />
Matlab versions and licenses are updated roughly annually. If possible, the current version and the second most recent version will be maintained. The rough steps are listed here with more detailed instructions below.<br />
<br />
* Request licenses from campus. We are asking for ~30.<br />
* Download and move files to /clusterfs/cortex/software/matlab<version><br />
* Install matlab on the cluster<br />
* Setup licenses correctly<br />
* Create new matlab module<br />
<br />
=== 2016 ===</div>Jesselivezeyhttps://rctn.org/w/index.php?title=ClusterAdmin&diff=8463ClusterAdmin2016-02-04T20:04:35Z<p>Jesselivezey: /* Cluster Administration */</p>
<hr />
<div>= Cluster Administration =<br />
<br />
This page documents the details of different aspects of administration of the cluster.<br />
<br />
== Managing Local Modules ==<br />
<br />
Cortex manages its own group of modules locally. You can ask Bruno about getting added to the admin team. Currently installed modules can be seen by running <br />
module avail<br />
The bottom group of modules are the ones that we are managing ourselves.<br />
<br />
=== Installing New Modules ===<br />
<br />
There are two steps to create a new module.<br />
First, the package needs to be downloaded and built locally. Second, the package needs to be added to the list of modules.<br />
<br />
Package installers and executable files belong in<br />
<br />
/clusterfs/cortex/software/<br />
<br />
To add the package as a module, a folder/file needs to be created in<br />
/global/home/groups/cortex/modulefiles/centos-5.x86_64/<br />
You should look at some of the other module files as examples.<br />
<br />
=== Module ownership ===<br />
<br />
It is easiest to administer the modules if everyone on the admin team (cortexsw group) has read, write, and execute privileges for the package files and module file.<br />
You can do this by hand with chmod, or you can run<br />
bash /clusterfs/cortex/software/fix_permissions.sh <br />
which will add the correct permissions for the module files.<br />
<br />
== Matlab Installation ==<br />
<br />
Matlab versions and licenses are updated roughly annually. If possible, the current version and the second most recent version will be maintained. The rough steps are listed here with more detailed instructions below.<br />
<br />
* Request licenses from campus. We are asking for ~30.<br />
* Download and move files to /clusterfs/cortex/software/matlab<version><br />
* Install matlab on the cluster<br />
* Setup licenses correctly<br />
* Create new matlab module<br />
<br />
=== 2016 ===</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8462Cluster2016-02-04T19:47:44Z<p>Jesselivezey: /* General Information */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are: you have independent jobs that can run in parallel, so having several machines completes the task faster, even though any one machine might not be faster than your own laptop; you have a long-running job which may take a day, and you don't want to have to leave your laptop on at all times and be unable to use it; your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem; or you want to run long GPU computations. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs to the queue (see '''SLURM''' further down on this page for the details). A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
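As an illustration, a minimal SLURM batch script might look like the following. The partition name and resource limits here are made up for the example; check the cluster's actual partition names with <code>sinfo</code> before using them.

```shell
#!/bin/bash
#SBATCH --job-name=example        # name shown in the queue
#SBATCH --partition=cortex        # hypothetical partition name
#SBATCH --time=24:00:00           # wall-clock limit (1 day)
#SBATCH --cpus-per-task=4
#SBATCH --output=example_%j.log   # %j expands to the job ID

# The commands below run on the allocated worker node once the
# scheduler starts the job.
echo "Job started on $(hostname) at $(date)"
# e.g. load a module and run your actual computation here
echo "Job finished at $(date)"
```

The script is submitted with <code>sbatch example.sh</code>, and <code>squeue -u $USER</code> shows its place in the queue.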
<br />
== Cluster Administration ==<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a 4TB NetApp file server,<br />
which is mounted as scratch space.<br />
<br />
In brief, we have 14 nodes with over 60 cores and 4 GPUs.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so the example above would be truncated to '''desiredu'''.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive, which is shared by everyone at the Redwood Center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add the -C flag to the command above to compress data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
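The two conventions above can be sketched as a pair of dotfiles. This is a minimal illustration only; the PATH line and alias stand in for your own customizations.<br />
<br />
```shell
# ~/.bash_profile -- read by login shells; delegate everything to ~/.bashrc
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi

# ~/.bashrc -- put all customizations (modules, aliases, PATH) here, e.g.:
#   export PATH="$HOME/bin:$PATH"
#   alias ll='ls -l'
```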
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software as a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in '''ssh to a login node''' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM ===<br />
<br />
SLURM is our scheduler. It is very important to understand SLURM well in order to have a good time doing research on the cluster. SLURM administers the cluster: it helps you find resources for your job, and it helps others do the same, so we are not stepping on each others' toes. There are some do's and don'ts when using SLURM.<br />
<br />
* Logging in -- when you log in to the cluster, you land on the login node. We do not own the login node and share it with other users of Berkeley Research Computing. So, it is important not to run anything here *at all*<br />
<br />
* Information on Submitting, Monitoring, Reviewing Jobs can be found here. You can do many simple BASH tricks to submit a large number of embarrassingly parallel jobs on the cluster. This is great for parameter sweeps. <br />
<br />
* Storage -- every user gets a 10 GB quota gratis from the BRC. This is your home folder or where you land when you login. In addition to this there's a 20TB scratch space (/clusterfs/cortex/scratch) shared by all members of the Redwood Center. We have a log of how much space is being used by each member who writes into the scratch folder at (TODO)<br />
<br />
* We have 4 GPU nodes and information on requesting and using them can be found here. When you request a GPU as a resource, you get the whole node along with it. <br />
<br />
* We have a debug queue that can be requested for research here<br />
<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time flag defines the walltime of the job, which is an upper bound on the estimated runtime; the job will be killed after this time has elapsed. --mem-per-cpu specifies how much memory the job requires per CPU; the default is 1GB. <br />
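When the same script must run over many parameter settings, SLURM job arrays let a single sbatch call queue many tasks. A hedged sketch follows (the partition and resource flags are copied from the example above; the script name, array size, and parameter list are illustrative). The script is only written here, not submitted:<br />
<br />
```shell
# Write a minimal array-job script. Each task in the array runs this
# script with a distinct SLURM_ARRAY_TASK_ID, used here to pick a parameter.
cat > array_job.sh <<'EOF'
#!/bin/bash -l
#SBATCH -p cortex
#SBATCH --time=03:30:00
#SBATCH --mem-per-cpu=2G
#SBATCH --array=0-9
params=(0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0)
echo "task $SLURM_ARRAY_TASK_ID: parameter ${params[$SLURM_ARRAY_TASK_ID]}"
EOF

# On the cluster, submit all ten tasks at once with:
#   sbatch array_job.sh
echo "wrote array_job.sh"
```
<br />
By default each array task writes its own output file (slurm-&lt;jobid&gt;_&lt;taskid&gt;.out), which keeps per-task logs separate.<br />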
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be piped to outputfile.txt, and any errors to errorfile.txt if the job crashes.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names, the job descriptor passed to sbatch, runtimes, and nodes.<br />
<br />
<br />
To start an interactive session on the cluster (this requires specifying the partition and walltime, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users on a particular node by ssh'ing into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Job Management =<br />
<br />
In order to coordinate our cluster usage patterns fairly, our cluster uses a job manager known as SLURM. If you are planning to run jobs on the cluster you should be using SLURM! Learn how [http://redwood.berkeley.edu/wiki/Cluster_Job_Management here].<br />
<br />
<br />
= Software =<br />
Information on what software is installed on the cluster and how to access it is [http://redwood.berkeley.edu/wiki/Cluster-Software here].<br />
<br />
== Matlab ==<br />
Matlab instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Matlab here].<br />
<br />
== Python ==<br />
Python instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Python here].<br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
# Upper bounds for the Epsilon and Beta sweeps<br />
param2=1.2<br />
param3=.75<br />
# LeapSize<br />
for i in 14 15 16; do<br />
    # Epsilon<br />
    for j in $(seq .8 .1 $param2); do<br />
        # Beta<br />
        for k in $(seq .65 .01 $param3); do<br />
            echo $i,$j,$k<br />
            qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
        done<br />
    done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
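These scripts still use the pre-migration qsub/PBS syntax. A hedged SLURM adaptation is sketched below (the paths, partition, and matlab call are carried over from the original scripts; sbatch --export replaces qsub -v). The sbatch command is only echoed so the loop can be dry-run; drop the echo to actually submit:<br />
<br />
```shell
# SLURM version of param_test.sh: LeapSize/Epsilon/Beta are read from the
# environment, which sbatch --export passes through to the job. The quoted
# 'EOF' keeps $LeapSize etc. unexpanded until the job actually runs.
cat > param_test_slurm.sh <<'EOF'
#!/bin/bash -l
#SBATCH -p cortex
#SBATCH --time=10:35:00
cd /global/home/users/mayur/HMC_reducedflip/
module load matlab
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"
EOF

# SLURM version of iterate.sh: same sweep, one sbatch call per setting.
for i in 14 15 16; do
    for j in $(seq .8 .1 1.2); do
        for k in $(seq .65 .01 .75); do
            echo sbatch --export="LeapSize=$i,Epsilon=$j,Beta=$k" param_test_slurm.sh
        done
    done
done
```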
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list. Please always cc our email list as well. Or visit their website[https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8450Cluster2016-01-30T02:27:41Z<p>Jesselivezey: /* Software */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are: you have independent jobs that can run in parallel, so several machines will complete the task faster even though any one machine might not be faster than your own laptop; you have a long-running job which may take a day, and you don't want to leave your laptop on at all times and be unable to use it; your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem; or you want to run long GPU computations. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs to the queue (see '''SLURM''' further down on this page for the details). A job may not start right away, but will be run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a 4TB NetApp file server,<br />
which is mounted as scratch space.<br />
<br />
In brief, we have 14 nodes with over 60 cores and 4 GPUs.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so the example above would be truncated to '''desiredu'''.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive, which is shared by everyone at the Redwood Center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add the -C flag to the command above to compress data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software as a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in '''ssh to a login node''' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM ===<br />
<br />
SLURM is our scheduler. It is very important to understand SLURM well in order to have a good time doing research on the cluster. SLURM administers the cluster: it helps you find resources for your job, and it helps others do the same, so we are not stepping on each others' toes. There are some do's and don'ts when using SLURM.<br />
<br />
* Logging in -- when you log in to the cluster, you land on the login node. We do not own the login node and share it with other users of Berkeley Research Computing. So, it is important not to run anything here *at all*<br />
<br />
* Information on Submitting, Monitoring, Reviewing Jobs can be found here. You can do many simple BASH tricks to submit a large number of embarrassingly parallel jobs on the cluster. This is great for parameter sweeps. <br />
<br />
* Storage -- every user gets a 10 GB quota gratis from the BRC. This is your home folder or where you land when you login. In addition to this there's a 20TB scratch space (/clusterfs/cortex/scratch) shared by all members of the Redwood Center. We have a log of how much space is being used by each member who writes into the scratch folder at (TODO)<br />
<br />
* We have 4 GPU nodes and information on requesting and using them can be found here. When you request a GPU as a resource, you get the whole node along with it. <br />
<br />
* We have a debug queue that can be requested for research here<br />
<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time flag defines the walltime of the job, which is an upper bound on the estimated runtime; the job will be killed after this time has elapsed. --mem-per-cpu specifies how much memory the job requires per CPU; the default is 1GB. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be piped to outputfile.txt, and any errors to errorfile.txt if the job crashes.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names, the job descriptor passed to sbatch, runtimes, and nodes.<br />
<br />
<br />
To start an interactive session on the cluster (this requires specifying the partition and walltime, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users on a particular node by ssh'ing into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
== Cluster Administration ==<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
= Job Management =<br />
<br />
In order to coordinate our cluster usage patterns fairly, our cluster uses a job manager known as SLURM. If you are planning to run jobs on the cluster you should be using SLURM! Learn how [http://redwood.berkeley.edu/wiki/Cluster_Job_Management here].<br />
<br />
<br />
= Software =<br />
Information on what software is installed on the cluster and how to access it is [http://redwood.berkeley.edu/wiki/Cluster-Software here].<br />
<br />
== Matlab ==<br />
Matlab instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Matlab here].<br />
<br />
== Python ==<br />
Python instructions are [http://redwood.berkeley.edu/wiki/Cluster-Software#Python here].<br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
# Upper bounds for the Epsilon and Beta sweeps<br />
param2=1.2<br />
param3=.75<br />
# LeapSize<br />
for i in 14 15 16; do<br />
    # Epsilon<br />
    for j in $(seq .8 .1 $param2); do<br />
        # Beta<br />
        for k in $(seq .65 .01 $param3); do<br />
            echo $i,$j,$k<br />
            qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
        done<br />
    done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
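These scripts still use the pre-migration qsub/PBS syntax. A hedged SLURM adaptation is sketched below (the paths, partition, and matlab call are carried over from the original scripts; sbatch --export replaces qsub -v). The sbatch command is only echoed so the loop can be dry-run; drop the echo to actually submit:<br />
<br />
```shell
# SLURM version of param_test.sh: LeapSize/Epsilon/Beta are read from the
# environment, which sbatch --export passes through to the job. The quoted
# 'EOF' keeps $LeapSize etc. unexpanded until the job actually runs.
cat > param_test_slurm.sh <<'EOF'
#!/bin/bash -l
#SBATCH -p cortex
#SBATCH --time=10:35:00
cd /global/home/users/mayur/HMC_reducedflip/
module load matlab
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"
EOF

# SLURM version of iterate.sh: same sweep, one sbatch call per setting.
for i in 14 15 16; do
    for j in $(seq .8 .1 1.2); do
        for k in $(seq .65 .01 .75); do
            echo sbatch --export="LeapSize=$i,Epsilon=$j,Beta=$k" param_test_slurm.sh
        done
    done
done
```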
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list. Please always cc our email list as well. Or visit their website[https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8449Cluster2016-01-30T02:25:28Z<p>Jesselivezey: /* Software */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are: you have independent jobs that can run in parallel, so several machines will complete the task faster even though any one machine might not be faster than your own laptop; you have a long-running job which may take a day, and you don't want to leave your laptop on at all times and be unable to use it; your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem; or you want to run long GPU computations. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs to the queue (see '''SLURM''' further down on this page for the details). A job may not start right away, but will be run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a 4TB NetApp file server,<br />
which is mounted as scratch space.<br />
<br />
In brief, we have 14 nodes with over 60 cores and 4 GPUs.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so the example above would be truncated to '''desiredu'''.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive, which is shared by everyone at the Redwood Center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add the -C flag to the command above to compress data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software as a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in '''ssh to a login node''' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM ===<br />
<br />
SLURM is our scheduler. It is very important to understand SLURM well in order to have a good time doing research on the cluster. SLURM administers the cluster: it helps you find resources for your job, and it helps others do the same, so we are not stepping on each others' toes. There are some do's and don'ts when using SLURM.<br />
<br />
* Logging in -- when you log in to the cluster, you land on the login node. We do not own the login node and share it with other users of Berkeley Research Computing. So, it is important not to run anything here *at all*<br />
<br />
* Information on Submitting, Monitoring, Reviewing Jobs can be found here. You can do many simple BASH tricks to submit a large number of embarrassingly parallel jobs on the cluster. This is great for parameter sweeps. <br />
<br />
* Storage -- every user gets a 10 GB quota gratis from the BRC. This is your home folder or where you land when you login. In addition to this there's a 20TB scratch space (/clusterfs/cortex/scratch) shared by all members of the Redwood Center. We have a log of how much space is being used by each member who writes into the scratch folder at (TODO)<br />
<br />
* We have 4 GPU nodes and information on requesting and using them can be found here. When you request a GPU as a resource, you get the whole node along with it. <br />
<br />
* We have a debug queue that can be requested for research here<br />
<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time flag defines the walltime of the job, which is an upper bound on the estimated runtime; the job will be killed after this time has elapsed. --mem-per-cpu specifies how much memory the job requires per CPU; the default is 1GB. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be piped to outputfile.txt, and any errors to errorfile.txt if the job crashes.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names, the job descriptor passed to sbatch, runtimes, and nodes.<br />
<br />
<br />
To start an interactive session on the cluster (this requires specifying the partition and walltime, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users on a particular node by ssh-ing into it, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send them a friendly reminder.<br />
<br />
== Cluster Administration ==<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
= Job Management =<br />
<br />
In order to coordinate our cluster usage patterns fairly, our cluster uses a job manager known as SLURM. If you are planning to run jobs on the cluster, you should be using SLURM! Learn how [http://redwood.berkeley.edu/wiki/Cluster_Job_Management here].<br />
<br />
<br />
= Software =<br />
Information on what software is installed on the cluster and how to access it is [http://redwood.berkeley.edu/wiki/Cluster-Software here].<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
 qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
 echo "Epsilon = $Epsilon"<br />
 echo "Leap Size = $LeapSize"<br />
 echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
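The two scripts above use the legacy qsub/PBS syntax. Under SLURM, a sketch of the same sweep could use sbatch's --export flag, which passes variables into the job's environment much as qsub -v does (the leading echo makes this a dry run that prints one submission command per parameter combination):<br />

```shell
#!/bin/bash
# SLURM sketch of the parameter sweep above. --export passes the three
# variables into the job's environment, mirroring qsub -v. The leading
# "echo" prints each sbatch command instead of running it; remove it
# to actually submit.
for LeapSize in 14 15 16; do
  for Epsilon in $(seq 0.8 0.1 1.2); do
    for Beta in $(seq 0.65 0.01 0.75); do
      echo sbatch --export="LeapSize=$LeapSize,Epsilon=$Epsilon,Beta=$Beta" param_test.sh
    done
  done
done
```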
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list. Please always cc our email list as well. Or visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8329Cluster2015-09-12T00:25:28Z<p>Jesselivezey: /* General Information */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. Lastly, if you want to do long GPU computations. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''SLURM''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
=== Cluster Administration ===<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
In brief, we have 14 nodes with over 60 cores and 4 GPUs.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so it would have been truncated to '''desiredu''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
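To see what is actually using the space, the standard du tool works as well (a sketch; which subdirectories matter will vary):<br />

```shell
#!/bin/bash
# Summarize total home-directory usage, then list the five largest
# entries -- usually the first places to look when the quota fills up.
du -sh "$HOME"
du -sh "$HOME"/* 2>/dev/null | sort -h | tail -n 5
```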
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7-digit one-time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add a -C flag to the command above to compress data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
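A minimal ~/.bash_profile following this convention might look like (a sketch; keep whatever else you already need in yours):<br />

```shell
# Minimal ~/.bash_profile: login shells read this file, which in turn
# loads ~/.bashrc, where the actual customizations live.
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi
```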
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ''ssh to a login node'' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM ===<br />
<br />
SLURM is our scheduler, and it is important to understand it well to have a good time doing research on the cluster. SLURM allocates resources for your job, and it does the same for everyone else, so we are not stepping on each other's toes. There are some dos and don'ts when using SLURM.<br />
<br />
* Logging in -- when you log in to the cluster, you land on the login node. We do not own the login node and share it with other members of the Berkeley Research Consortium, so it is important not to run anything here *at all*.<br />
<br />
* Information on submitting, monitoring, and reviewing jobs can be found here. You can use many simple BASH tricks to submit a large number of embarrassingly parallel jobs on the cluster. This is great for parameter sweeps.<br />
<br />
* Storage -- every user gets a 10 GB quota gratis from the BRC. This is your home folder, where you land when you log in. In addition, there is a 20TB scratch space (/clusterfs/cortex/scratch) shared by all members of the Redwood Center. We have a log of how much space is being used by each member who writes into the scratch folder at (TODO)<br />
<br />
* We have 4 GPU nodes and information on requesting and using them can be found here. When you request a GPU as a resource, you get the whole node along with it. <br />
<br />
* We have a debug queue that can be requested for research here<br />
<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, which is an upper bound on the estimated runtime; the job will be killed after this time has elapsed. --mem-per-cpu specifies how much memory the job requires per CPU; the default is 1GB.<br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
 sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be written to outputfile.txt, and any errors (for example, if the job crashes) to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows the user name, the job descriptor passed to sbatch, the runtime, and the nodes.<br />
<br />
<br />
To start an interactive session on the cluster (this requires specifying the partition and walltime, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to the cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users on a particular node by ssh-ing into it, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send them a friendly reminder.<br />
<br />
= Software =<br />
For information on what software is installed on the cluster and how to access it, head [http://redwood.berkeley.edu/wiki/Cluster-Software here]<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
 qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
 echo "Epsilon = $Epsilon"<br />
 echo "Leap Size = $LeapSize"<br />
 echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list. Please always cc our email list as well. Or visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster-Software&diff=8317Cluster-Software2015-09-11T23:11:41Z<p>Jesselivezey: /* Anaconda Python Distribution */</p>
<hr />
<div>= Software =<br />
<br />
== Matlab ==<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
 matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs the matlab script scriptname, passing in the two variables $variable1 and $variable2.<br />
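Quoting the string passed to -r is a common stumbling block; one approach (a sketch, with an illustrative function name) is to build the command string in the shell first:<br />

```shell
#!/bin/bash
# Build the matlab -r command string from shell variables before the
# call. "mymatlabfunction" is a placeholder; the leading echo makes
# this a dry run that only prints the matlab invocation.
variable1=3
variable2=0.5
cmd="mymatlabfunction($variable1, $variable2); exit"
echo matlab -nodisplay -r "$cmd"
```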
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2<br />
or<br />
module load python/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name username@hpc.brc.berkeley.edu:/global/home/users/username/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own python distribution, Anaconda is a very good choice. To get it, go to the [http://continuum.io/downloads Continuum downloads] page and select the Linux distribution (penguin).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda-version_info.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint option (with value cortex_k40 or cortex_fermi) must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using Theano ===<br />
By default, Theano expects the default compiler to be gcc, so you'll need to unload the intel compiler.<br />
<br />
module unload intel<br />
<br />
Theano caches certain compiled libraries and these will sometimes cause errors when Theano gets updated. If you are experiencing problems with Theano, you can try clearing the cache with<br />
theano-cache clear<br />
and if you still have problems you can delete the .theano folder from your home directory.<br />
<br />
==== Using the GPU ====<br />
<br />
You must request a GPU node. The Anaconda Python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository], installed locally, and added to your PYTHONPATH if you are using the preinstalled Python versions. If you have a local python install, you can install theano with<br />
python setup.py develop<br />
from the repository folder.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
==== Using the CPU ====<br />
<br />
Theano can also run on the CPU. Any of the CPU nodes will work. You will want to have Theano built against the MKL BLAS library that comes with Anaconda, so your .theanorc might look like<br />
<br />
[global]<br />
device = cpu<br />
floatX = float32<br />
ldflags = -lmkl_rt<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock() throws an error when all gpu cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 in case there<br />
is no card available and you can then write your own code to deal with that<br />
fact.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8294Cluster2015-08-14T17:15:02Z<p>Jesselivezey: /* Using Theano */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''SLURM usage''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so it would have been truncated to '''desiredu''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7-digit one-time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add a -C flag to the command above to compress data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ''ssh to a login node'' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, which is an upper bound on the estimated runtime; the job will be killed after this time has elapsed. --mem-per-cpu specifies how much memory the job requires per CPU; the default is 1GB.<br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
 sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be written to outputfile.txt, and any errors (for example, if the job crashes) to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows the user name, the job descriptor passed to sbatch, the runtime, and the nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to the cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users on a particular node by ssh-ing into it, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send them a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (this requires specifying the partition and walltime, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
 matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs the matlab script scriptname, passing in the two variables $variable1 and $variable2.<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name username@hpc.brc.berkeley.edu:/global/home/users/username/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own python distribution, Anaconda is a very good choice. To get it, go to the [http://continuum.io/downloads Continuum downloads] page and select the Linux distribution (penguin).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda-version_info.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint option (with value cortex_k40 or cortex_fermi) must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using Theano ===<br />
By default, Theano expects the default compiler to be gcc, so you'll need to unload the intel compiler.<br />
<br />
module unload intel<br />
<br />
Theano caches certain compiled libraries and these will sometimes cause errors when Theano gets updated. If you are experiencing problems with Theano, you can try clearing the cache with<br />
theano-cache clear<br />
and if you still have problems you can delete the .theano folder from your home directory.<br />
<br />
==== Using the GPU ====<br />
<br />
You must request a GPU node. The Anaconda Python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository], installed locally, and added to your PYTHONPATH if you are using the preinstalled Python versions. If you have a local python install, you can install theano with<br />
python setup.py develop<br />
from the repository folder.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
==== Using the CPU ====<br />
<br />
Theano can also run on the CPU. Any of the CPU nodes will work. You will want to have Theano built against the MKL BLAS library that comes with Anaconda, so your .theanorc might look like<br />
<br />
[global]<br />
device = cpu<br />
floatX = float32<br />
ldflags = -lmkl_rt<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
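The returned id can then be used to select the device for the rest of your job. Below is a minimal sketch of that pattern; the device_for helper is illustrative and not part of the gpu_lock module:<br />

```python
def device_for(lock_id):
    """Map the result of gpu_lock.obtain_lock_id() to a device name.

    obtain_lock_id() returns the free card number (0 or 1),
    or -1 when both cards are already locked.
    """
    if lock_id < 0:
        raise RuntimeError("both GPU cards are in use; try again later")
    return "gpu%d" % lock_id

# For example, with Theano you might then set
# THEANO_FLAGS="device=gpu0" before importing theano.
```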
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all GPU cards are taken.<br />
Alternatively, obtain_gpu_lock_id(true) returns -1 when no card is available,<br />
and you can then write your own code to deal with that case.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an example script for doing embarrassingly parallel submissions on the cluster. Note that these scripts use the older PBS/qsub syntax rather than the SLURM sbatch syntax described above.<br />
<br />
iterate.sh<br />
 #!/bin/sh<br />
 # sweep upper bounds<br />
 param2=1.2   # Epsilon<br />
 param3=.75   # Beta<br />
 # LeapSize<br />
 for i in 14 15 16<br />
 do<br />
     # Epsilon<br />
     for j in $(seq .8 .1 $param2);<br />
     do<br />
         # Beta<br />
         for k in $(seq .65 .01 $param3);<br />
         do<br />
             echo $i,$j,$k<br />
             qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
         done<br />
     done<br />
 done<br />
<br />
param_test.sh<br />
 #!/bin/bash<br />
 #PBS -q cortex<br />
 #PBS -l nodes=1:ppn=2:gpu<br />
 #PBS -l walltime=10:35:00<br />
 #PBS -o /global/home/users/mayur/Logs<br />
 #PBS -e /global/home/users/mayur/Errors<br />
 cd /global/home/users/mayur/HMC_reducedflip/<br />
 module load matlab<br />
 echo "Epsilon = $Epsilon"<br />
 echo "Leap Size = $LeapSize"<br />
 echo "Beta = $Beta"<br />
 matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta; exit"<br />
<br />
Now run ./iterate.sh<br />
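Since the cluster has migrated to SLURM, the same sweep can be expressed with sbatch, passing the parameters through --export instead of qsub -v. A sketch is below; the submit command is only echoed so the loop can be dry-run safely, and param_test.sh would also need its #PBS headers converted to #SBATCH:<br />

```shell
#!/bin/sh
# Emit one sbatch command per (LeapSize, Epsilon, Beta) combination.
submit_all () {
    for LeapSize in 14 15 16; do
        for Epsilon in $(seq .8 .1 1.2); do
            for Beta in $(seq .65 .01 .75); do
                # drop the leading "echo" to actually submit the jobs
                echo "sbatch --export=ALL,LeapSize=$LeapSize,Epsilon=$Epsilon,Beta=$Beta param_test.sh"
            done
        done
    done
}
submit_all
```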
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely lets you access files on the cluster easily and use local programs to edit code or examine simulation outputs (very useful). For example, you can edit remote code with a text editor running on your local machine, getting the niceties of a native editor without copying code back and forth before each run on the cluster.<br />
<br />
On Linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their website [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8293Cluster2015-08-14T16:59:27Z<p>Jesselivezey: /* Using Theano */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''qsub''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session you can add a -C flag to the command above to enable compression data to be sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where the myscript.sh is an shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
the --time defines the walltime of the job, which is an upper bound on the estimated runtime. The job will be killed after this time is elapsed. --mem specifies how much memory the job requires, the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errofile.txt -J jobdescriptor myscript.sh<br />
<br />
the output of the job will be piped to outputfile.txt and any errors if the job crashes to errofile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names jobdescriptor passed to sbatch, runtime and nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname. $variable1 $variable2"<br />
<br />
The above script takes a matlab job with scriptname = scriptname and accepts two variables $variable1 and $variable2<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name username@hpc.brc.berkeley.edu:/global/home/users/username/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own python distribution the Anaconda Python is a very good distribution. To get it, go the the [http://continuum.io/downloads Continuum downloads] page and select the linux distribution (penguin).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda-version_info.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint={cortex_k40, cortex_fermi} option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using Theano ===<br />
By default, Theano expects the default compiler to be gcc, so you'll need to unload the intel compiler.<br />
<br />
module unload intel<br />
<br />
==== Using the GPU ====<br />
<br />
You must request a GPU node. The Anaconda Python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository], installed locally, and added to your PYTHONPATH if you are using the preinstalled Python verions. If you have a local python install you can install theano with<br />
python setup.py develop<br />
from the repository folder.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
==== Using the CPU ====<br />
<br />
Theano can also run on the CPU. Any of the CPU nodes will work. You will want to have Theano build against the MKL BLAS library that comes with Anaconda and so your .theanorc might look like<br />
<br />
[global]<br />
device = cpu<br />
floatX = float32<br />
ldflags = -lmkl_rt<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock() throws an error when all gpu cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 in case there<br />
is no card available and you can then write your own code to deal with that<br />
fact.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = ",$Epsilon<br />
echo "Leap Size = ",$LeapSize<br />
echo "Beta = ",$Beta<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely allows you to easily access files on the cluster, and allows you to use local programs to edit code or examine simulation outputs locally (very useful). I often edit the remote code using a text editor running on my local machine. This allows you to take advantage of the niceties of a native editor without having to copy code back and forth before you run a simulation on the cluster.<br />
<br />
On linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list. Please always cc our email list as well. Or visit their website[https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8257Cluster2015-07-03T20:36:00Z<p>Jesselivezey: /* General Information */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''qsub''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session you can add a -C flag to the command above to enable compression data to be sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where the myscript.sh is an shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
the --time defines the walltime of the job, which is an upper bound on the estimated runtime. The job will be killed after this time is elapsed. --mem specifies how much memory the job requires, the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errofile.txt -J jobdescriptor myscript.sh<br />
<br />
the output of the job will be piped to outputfile.txt and any errors if the job crashes to errofile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names jobdescriptor passed to sbatch, runtime and nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname. $variable1 $variable2"<br />
<br />
The above script takes a matlab job with scriptname = scriptname and accepts two variables $variable1 and $variable2<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name username@hpc.brc.berkeley.edu:/global/home/users/username/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own python distribution the Anaconda Python is a very good distribution. To get it, go the the [http://continuum.io/downloads Continuum downloads] page and select the linux distribution (penguin).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda-version_info.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint={cortex_k40, cortex_fermi} option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using Theano ===<br />
<br />
==== Using the GPU ====<br />
<br />
You must request a GPU node. The Anaconda Python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository], installed locally, and added to your PYTHONPATH if you are using the preinstalled Python verions. If you have a local python install you can install theano with<br />
python setup.py develop<br />
from the repository folder.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
==== Using the CPU ====<br />
<br />
Theano can also run on the CPU. Any of the CPU nodes will work. You will want to have Theano build against the MKL BLAS library that comes with Anaconda and so your .theanorc might look like<br />
<br />
[global]<br />
device = cpu<br />
floatX = float32<br />
ldflags = -lmkl_rt<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
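A defensive pattern is to fall back to the CPU when no card is free. The helper below is a sketch: gpu_lock and its obtain_lock_id() are the cluster-provided module described above, and the device strings follow the .theanorc convention:<br />

```python
def pick_device(obtain_lock_id):
    """Map a gpu_lock result (0, 1, or -1) to a Theano-style device string."""
    gpu_id = obtain_lock_id()
    if gpu_id < 0:
        return 'cpu'           # both cards in use: fall back to the CPU
    return 'gpu%d' % gpu_id    # 'gpu0' or 'gpu1'

# On the cluster this would be used as:
#   import gpu_lock
#   device = pick_device(gpu_lock.obtain_lock_id)
```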
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all gpu cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 in case there<br />
is no card available and you can then write your own code to deal with that<br />
fact.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
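The scripts above use the older PBS/qsub interface. Under SLURM the same sweep can be expressed with sbatch --export; the sketch below only echoes the submissions (and counts them) so it can be dry-run — replace the echo with a real sbatch call on the cluster, and swap param_test.sh's #PBS directives for the equivalent #SBATCH ones:<br />

```shell
#!/bin/sh
# Dry-run sweep over LeapSize, Epsilon, Beta. 'submit' only echoes the
# command; on the cluster, drop the echo to actually submit with sbatch.
submit() {
    echo "sbatch --export=LeapSize=$1,Epsilon=$2,Beta=$3 param_test.sh"
}
count=0
for leap in 14 15 16; do
    for eps in 0.8 0.9 1.0 1.1 1.2; do
        for beta in 0.65 0.66 0.67 0.68 0.69 0.70 0.71 0.72 0.73 0.74 0.75; do
            submit "$leap" "$eps" "$beta"
            count=$((count + 1))
        done
    done
done
echo "prepared $count submissions"
```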
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely lets you access files on the cluster with local programs, so you can edit code or examine simulation outputs without copying files back and forth. I often edit remote code in a text editor running on my local machine; this gives you the niceties of a native editor without having to copy code over before running a simulation on the cluster.<br />
<br />
On Linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their website [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). Typical use cases: you have independent jobs that can run in parallel, so several machines complete the task faster even though no single machine is faster than your own laptop; you have a long-running job that may take a day, and you don't want to keep your laptop on at all times and unusable; or your code uses a communication scheme (such as MPI) to have multiple machines work cooperatively on a problem.<br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs to the queue (see '''SLURM''' further down on this page for the details). A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
[[ClusterAdmin]] has information about cluster administration.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a 4TB NetApp file server<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the username must be 3-8 characters long, so '''desiredusername''' would have been truncated to '''desiredu''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add the -C flag to the command above to enable compression of the data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in '''ssh to a login node''' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, which is an upper bound on the estimated runtime; the job will be killed once this time has elapsed. The --mem-per-cpu option specifies how much memory the job requires; the default is 1GB. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The job's standard output will be written to outputfile.txt and its standard error (including any crash messages) to errorfile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows user names, the job descriptors passed to sbatch, runtimes, and nodes.<br />
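When scripting around sbatch it is handy to capture the job id from its "Submitted batch job NNN" confirmation message. The parsing helper below is plain shell; the sbatch/squeue/scancel lines, shown commented, are how it would be used on the cluster (myjob and myscript.sh are placeholders):<br />

```shell
# Extract the trailing job id from sbatch's confirmation message.
parse_jobid() {
    echo "$1" | awk '{print $NF}'
}

jid=$(parse_jobid "Submitted batch job 12345")   # example message
echo "job id: $jid"

# On the cluster:
#   jobid=$(sbatch -J myjob myscript.sh | awk '{print $NF}')
#   squeue -j "$jobid"    # check its state
#   scancel "$jobid"      # cancel it if something is wrong
```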
<br />
=== Perceus commands ===<br />
<br />
The Perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out which users are on a particular node by sshing into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the partition and walltime, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs the matlab script scriptname and passes it the two variables $variable1 and $variable2<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name username@hpc.brc.berkeley.edu:/global/home/users/username/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own python distribution, Anaconda Python is a very good choice. To get it, go to the [http://continuum.io/downloads Continuum downloads] page and select the linux distribution (penguin).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda-version_info.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is NVIDIA's platform for doing general-purpose computing on the graphics processing units (GPUs) of the graphics card. We have a separate wiki page that collects information on general-purpose GPU computing: [[GPGPU]].<br />
A --constraint option (with one of the GPU node features, e.g. --constraint=cortex_k40 or --constraint=cortex_fermi) must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using Theano ===<br />
<br />
==== Using the GPU ====<br />
<br />
You must request a GPU node. The Anaconda Python distribution comes with a version of Theano that should work. If you need newer Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository], installed locally, and added to your PYTHONPATH if you are using the preinstalled Python versions. If you have a local Python install, you can install Theano with<br />
python setup.py develop<br />
from the repository folder.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
==== Using the CPU ====<br />
<br />
Theano can also run on the CPU, and any of the CPU nodes will work. You will want Theano built against the MKL BLAS library that comes with Anaconda, so your .theanorc might look like<br />
<br />
[global]<br />
device = cpu<br />
floatX = float32<br />
ldflags = -lmkl_rt<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all gpu cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 in case there<br />
is no card available and you can then write your own code to deal with that<br />
fact.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = ",$Epsilon<br />
echo "Leap Size = ",$LeapSize<br />
echo "Beta = ",$Beta<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
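Since the cluster has migrated to SLURM, the same sweep can be expressed with sbatch's --export option instead of qsub -v. A hedged Python sketch that only prints the submission commands (pipe the output to sh to actually submit; the script name param_test.sh and the variable names mirror the example above):<br />

```python
import itertools

# Same parameter grid as iterate.sh (rounded to avoid float noise).
leap_sizes = [14, 15, 16]
epsilons = [round(0.8 + 0.1 * n, 1) for n in range(5)]    # 0.8 .. 1.2
betas = [round(0.65 + 0.01 * n, 2) for n in range(11)]    # 0.65 .. 0.75

def sweep_commands():
    # One sbatch call per parameter combination, mirroring the nested loops.
    for i, j, k in itertools.product(leap_sizes, epsilons, betas):
        yield f"sbatch --export=ALL,LeapSize={i},Epsilon={j},Beta={k} param_test.sh"

for cmd in sweep_commands():
    print(cmd)
```

Printing rather than submitting makes it easy to sanity-check the generated combinations (3 x 5 x 11 = 165 jobs here) before queueing them.<br />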
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely lets you access files on the cluster easily and use local programs to edit code or examine simulation outputs (very useful). I often edit remote code using a text editor running on my local machine. This lets you take advantage of the niceties of a native editor without having to copy code back and forth before you run a simulation on the cluster.<br />
<br />
On linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list. Please always cc our email list as well. Or visit their website [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=ClusterAdmin&diff=8254ClusterAdmin2015-07-03T20:34:28Z<p>Jesselivezey: Created page with "= Cluster Administration = This page documents the details of different aspects of administration of the cluster. == Managing Local Modules == Cortex manages its own group..."</p>
<hr />
<div>= Cluster Administration =<br />
<br />
This page documents the details of different aspects of administration of the cluster.<br />
<br />
== Managing Local Modules ==<br />
<br />
Cortex manages its own group of modules locally. You can ask Bruno about getting added to the admin team. Currently installed modules can be seen by running<br />
module avail<br />
The bottom group of modules are the ones that we are managing ourselves.<br />
<br />
=== Installing New Modules ===<br />
<br />
There are two steps to create a new module.<br />
First, the package needs to be downloaded and built locally. Second, the package needs to be added to the list of modules.<br />
<br />
Package installers and executable files belong in<br />
<br />
/clusterfs/cortex/software/<br />
<br />
To add the package as a module, a folder/file needs to be created in<br />
/global/home/groups/cortex/modulefiles/centos-5.x86_64/<br />
You should look at some of the other module files as examples.<br />
<br />
=== Module ownership ===<br />
<br />
It is easiest to administer the modules if everyone on the admin team (cortexsw group) has read, write, and execute privileges for the package files and module file.<br />
You can do this by hand with chmod, or you can run<br />
bash /clusterfs/cortex/software/fix_permissions.sh <br />
which will add the correct permissions for the module files.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8253Cluster2015-07-03T19:51:59Z<p>Jesselivezey: /* Local Install of Anaconda Python Distribution */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster: you have independent jobs that can run in parallel, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop; you have a long-running job which may take a day, and you don't want to have to leave your laptop on at all times and be unable to use it; or your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''SLURM''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so it would have been truncated to '''desiredu''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add a -C flag to the command above to enable compression of data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ''ssh to a login node'' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, which is an upper bound on the estimated runtime; the job will be killed after this time has elapsed. --mem-per-cpu specifies how much memory the job requires per CPU; the default is 1GB. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be piped to outputfile.txt and, if the job crashes, any errors to errorfile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names, the jobdescriptor passed to sbatch, runtime, and nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs the matlab script ''scriptname'', passing it the two variables $variable1 and $variable2<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name username@hpc.brc.berkeley.edu:/global/home/users/username/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own python distribution, Anaconda Python is a very good distribution. To get it, go to the [http://continuum.io/downloads Continuum downloads] page and select the linux distribution (penguin).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda-version_info.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint={cortex_k40, cortex_fermi} option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using Theano ===<br />
<br />
==== Using the GPU ====<br />
<br />
You must request a GPU node. The Anaconda Python distribution comes with a version of Theano that should work. If you need newer Theano features, the development version can be obtained from the [https://github.com/Theano/Theano github repository], installed locally, and added to your PYTHONPATH if you are using the preinstalled Python versions. If you have a local python install, you can install theano with<br />
python setup.py develop<br />
from the repository folder.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
==== Using the CPU ====<br />
<br />
Theano can also run on the CPU. Any of the CPU nodes will work. You will want Theano to build against the MKL BLAS library that comes with Anaconda, so your .theanorc might look like<br />
<br />
[global]<br />
device = cpu<br />
floatX = float32<br />
ldflags = -lmkl_rt<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
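Because obtain_lock_id() signals failure with -1 rather than an exception, it is easy to forget the check and silently run on a busy card. Below is a minimal, hedged sketch of a guard; the wrapper name pick_gpu is ours, and the lock function is passed in so the pattern can be read and tested off-cluster.<br />

```python
def pick_gpu(obtain_lock_id):
    """Return a usable card id (0 or 1), or raise if both cards are busy.

    On the cluster you would call pick_gpu(gpu_lock.obtain_lock_id),
    where gpu_lock is the site-provided module described above.
    """
    card = obtain_lock_id()
    if card < 0:
        # Both cards taken: fail fast instead of sharing a busy GPU.
        raise RuntimeError("both GPU cards are in use; try again later")
    return card
```

Failing fast this way means a queued job exits immediately with a clear error instead of competing with another user's computation.<br />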
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all gpu cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 in case there<br />
is no card available and you can then write your own code to deal with that<br />
fact.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = ",$Epsilon<br />
echo "Leap Size = ",$LeapSize<br />
echo "Beta = ",$Beta<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
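Since the cluster has migrated to SLURM, the same sweep can be expressed with sbatch's --export option instead of qsub -v. A hedged Python sketch that only prints the submission commands (pipe the output to sh to actually submit; the script name param_test.sh and the variable names mirror the example above):<br />

```python
import itertools

# Same parameter grid as iterate.sh (rounded to avoid float noise).
leap_sizes = [14, 15, 16]
epsilons = [round(0.8 + 0.1 * n, 1) for n in range(5)]    # 0.8 .. 1.2
betas = [round(0.65 + 0.01 * n, 2) for n in range(11)]    # 0.65 .. 0.75

def sweep_commands():
    # One sbatch call per parameter combination, mirroring the nested loops.
    for i, j, k in itertools.product(leap_sizes, epsilons, betas):
        yield f"sbatch --export=ALL,LeapSize={i},Epsilon={j},Beta={k} param_test.sh"

for cmd in sweep_commands():
    print(cmd)
```

Printing rather than submitting makes it easy to sanity-check the generated combinations (3 x 5 x 11 = 165 jobs here) before queueing them.<br />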
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely lets you access files on the cluster easily and use local programs to edit code or examine simulation outputs (very useful). I often edit remote code using a text editor running on my local machine. This lets you take advantage of the niceties of a native editor without having to copy code back and forth before you run a simulation on the cluster.<br />
<br />
On linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list. Please always cc our email list as well. Or visit their website [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8252Cluster2015-07-03T19:51:37Z<p>Jesselivezey: /* Anaconda Python Distribution */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster: you have independent jobs that can run in parallel, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop; you have a long-running job which may take a day, and you don't want to have to leave your laptop on at all times and be unable to use it; or your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''SLURM''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so it would have been truncated to '''desiredu''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add a -C flag to the command above to enable compression of data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ''ssh to a login node'' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, which is an upper bound on the estimated runtime; the job will be killed after this time has elapsed. --mem-per-cpu specifies how much memory the job requires per CPU; the default is 1GB. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be piped to outputfile.txt and, if the job crashes, any errors to errorfile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names, the jobdescriptor passed to sbatch, runtime, and nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs the matlab script ''scriptname'', passing it the two variables $variable1 and $variable2<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name username@hpc.brc.berkeley.edu:/global/home/users/username/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own python distribution, Anaconda Python is a very good distribution. To get it, go to the [http://continuum.io/downloads Continuum downloads] page and select the linux distribution (penguin).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda<version info>.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint={cortex_k40, cortex_fermi} option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using Theano ===<br />
<br />
==== Using the GPU ====<br />
<br />
You must request a GPU node. The Anaconda Python distribution comes with a version of Theano that should work. If you need newer Theano features, the development version can be obtained from the [https://github.com/Theano/Theano github repository], installed locally, and added to your PYTHONPATH if you are using the preinstalled Python versions. If you have a local python install, you can install theano with<br />
python setup.py develop<br />
from the repository folder.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
==== Using the CPU ====<br />
<br />
Theano can also run on the CPU. Any of the CPU nodes will work. You will want to have Theano built against the MKL BLAS library that comes with Anaconda, so your .theanorc might look like<br />
<br />
[global]<br />
device = cpu<br />
floatX = float32<br />
ldflags = -lmkl_rt<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
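A minimal sketch of handling that return value (the gpu_lock module exists only on the GPU nodes, so the import is guarded; the device_for_lock helper name is hypothetical, not part of the cluster software):<br />

```python
# Sketch: pick a device string based on the GPU lock, falling back to the
# CPU when no card is free.  gpu_lock exists only on the GPU nodes, so the
# import is guarded; device_for_lock is a hypothetical helper name.

def device_for_lock(lock_id):
    """Map the id returned by gpu_lock.obtain_lock_id() to a device name:
    card 0 or 1 if one is free, otherwise fall back to the CPU."""
    return "gpu%d" % lock_id if lock_id >= 0 else "cpu"

try:
    import gpu_lock  # available on n0000/n0001
    device = device_for_lock(gpu_lock.obtain_lock_id())
except ImportError:
    device = "cpu"  # not on a GPU node
print(device)
```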
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all GPU cards are taken. Alternatively, obtain_gpu_lock_id(true) will return -1 when no card is available, and you can then write your own code to handle that case.<br />
<br />
ginfo tells you which GPU card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
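The same sweep can also be generated programmatically. The Python sketch below is not part of the original scripts; it builds the identical parameter grid with itertools.product and prints the qsub commands, which is handy for dry runs before submitting:<br />

```python
import itertools

# Sketch of iterate.sh in Python: build the same parameter grid and emit
# one qsub command per combination.  Values mirror the shell seq calls;
# rounding avoids floating-point drift in the printed parameters.
leap_sizes = [14, 15, 16]
epsilons = [round(0.8 + 0.1 * n, 2) for n in range(5)]    # 0.8 .. 1.2
betas = [round(0.65 + 0.01 * n, 2) for n in range(11)]    # 0.65 .. 0.75

def qsub_commands():
    cmds = []
    for leap, eps, beta in itertools.product(leap_sizes, epsilons, betas):
        cmds.append(["qsub", "param_test.sh", "-v",
                     "LeapSize=%d,Epsilon=%s,Beta=%s" % (leap, eps, beta)])
    return cmds

for cmd in qsub_commands():
    print(" ".join(cmd))  # or call subprocess.check_call(cmd) on the cluster
```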
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely lets you easily access files on the cluster and use local programs to edit code or examine simulation outputs (very useful). For example, you can edit remote code with a text editor running on your local machine, taking advantage of the niceties of a native editor without copying code back and forth before each run on the cluster.<br />
<br />
On Linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8251Cluster2015-07-03T19:49:50Z<p>Jesselivezey: /* = Using the CPU */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases are: you have independent jobs that can run in parallel, so several machines complete the task faster even though no single machine may be faster than your own laptop; you have a long-running job that may take a day, and you don't want to have to leave your laptop on at all times and be unable to use it; or your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''qsub''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so it would have been truncated to '''desiredu''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add a -C flag to the command above to compress data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software as a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh"; Editors -> "vim" is also recommended. Then you can follow the ssh instructions above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a Matlab job, it would look like:<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, an upper bound on the estimated runtime; the job will be killed once this time has elapsed. The --mem-per-cpu option specifies how much memory the job requires; the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be written to outputfile.txt, and any errors (if the job crashes) to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows user names, the job descriptors passed to sbatch, runtimes, and nodes.<br />
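When submitting many jobs, the sbatch options above can be assembled in a small helper. The sketch below is illustrative only; the function name and log-file naming convention are assumptions, not an existing tool:<br />

```python
# Sketch: build the sbatch invocation shown above, so a batch of jobs can
# share a log-file naming convention.  Function name and paths are
# illustrative; run the returned command with subprocess on a login node.
def sbatch_command(script, job_name, log_dir="logs"):
    return ["sbatch",
            "-o", "%s/%s.out.txt" % (log_dir, job_name),
            "-e", "%s/%s.err.txt" % (log_dir, job_name),
            "-J", job_name,
            script]

print(" ".join(sbatch_command("myscript.sh", "myjob")))
```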
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* You can find out which users are on a particular node by sshing into it, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs a Matlab job with the given scriptname and passes it two variables, $variable1 and $variable2.<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name <username>@hpc.brc.berkeley.edu:/global/home/users/<username>/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own Python distribution, Anaconda Python is a very good choice. To get it, go to the [http://continuum.io/downloads Continuum downloads] page and select the Linux distribution (penguin).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda<version info>.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library for doing general-purpose computing on the graphics processing unit (GPU). We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint={cortex_k40, cortex_fermi} option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using Theano ===<br />
<br />
==== Using the GPU ====<br />
<br />
You must request a GPU node. The Anaconda Python distribution comes with a version of Theano that should work. If you need newer Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository], installed locally, and added to your PYTHONPATH if you are using the preinstalled Python versions. If you have a local Python install, you can install Theano with<br />
python setup.py develop<br />
from the repository folder.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
==== Using the CPU ====<br />
<br />
Theano can also run on the CPU. Any of the CPU nodes will work. You will want to have Theano built against the MKL BLAS library that comes with Anaconda, so your .theanorc might look like<br />
<br />
[global]<br />
device = cpu<br />
floatX = float32<br />
ldflags = -lmkl_rt<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all GPU cards are taken. Alternatively, obtain_gpu_lock_id(true) will return -1 when no card is available, and you can then write your own code to handle that case.<br />
<br />
ginfo tells you which GPU card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely lets you easily access files on the cluster and use local programs to edit code or examine simulation outputs (very useful). For example, you can edit remote code with a text editor running on your local machine, taking advantage of the niceties of a native editor without copying code back and forth before each run on the cluster.<br />
<br />
On Linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8250Cluster2015-07-03T19:49:41Z<p>Jesselivezey: /* = Using the GPU */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases are: you have independent jobs that can run in parallel, so several machines complete the task faster even though no single machine may be faster than your own laptop; you have a long-running job that may take a day, and you don't want to have to leave your laptop on at all times and be unable to use it; or your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''qsub''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so it would have been truncated to '''desiredu''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add a -C flag to the command above to compress data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software as a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh"; Editors -> "vim" is also recommended. Then you can follow the ssh instructions above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a Matlab job, it would look like:<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, an upper bound on the estimated runtime; the job will be killed once this time has elapsed. The --mem-per-cpu option specifies how much memory the job requires; the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be written to outputfile.txt, and any errors (if the job crashes) to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows user names, the job descriptors passed to sbatch, runtimes, and nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* You can find out which users are on a particular node by sshing into it, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs a Matlab job with the given scriptname and passes it two variables, $variable1 and $variable2.<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name <username>@hpc.brc.berkeley.edu:/global/home/users/<username>/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own Python distribution, Anaconda Python is a very good choice. To get it, go to the [http://continuum.io/downloads Continuum downloads] page and select the Linux distribution (penguin).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda<version info>.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library for doing general-purpose computing on the graphics processing unit (GPU). We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint={cortex_k40, cortex_fermi} option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using Theano ===<br />
<br />
==== Using the GPU ====<br />
<br />
You must request a GPU node. The Anaconda Python distribution comes with a version of Theano that should work. If you need newer Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository], installed locally, and added to your PYTHONPATH if you are using the preinstalled Python versions. If you have a local Python install, you can install Theano with<br />
python setup.py develop<br />
from the repository folder.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
==== Using the CPU ====<br />
<br />
Theano can also run on the CPU. Any of the CPU nodes will work. You will want to have Theano built against the MKL BLAS library that comes with Anaconda, so your .theanorc might look like<br />
<br />
[global]<br />
device = cpu<br />
floatX = float32<br />
ldflags = -lmkl_rt<br />
<br />
=== Obtain GPU lock in Python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
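The site-specific gpu_lock module only exists on the cluster, so as a minimal sketch, the returned id might be handled with a small hypothetical helper like the one below (theano_device_for is not part of gpu_lock, and the device-string convention assumes the old Theano gpu0/gpu1 naming):

```python
def theano_device_for(lock_id):
    """Map an id from gpu_lock.obtain_lock_id() to a Theano device
    string, failing loudly when both cards are in use (-1)."""
    if lock_id < 0:
        raise RuntimeError("no free GPU card; try again later")
    return "gpu%d" % lock_id  # card 0 or 1

# On the cluster this would be used as:
#   import gpu_lock
#   device = theano_device_for(gpu_lock.obtain_lock_id())
```

Failing immediately when no card is free is usually preferable to silently falling back to a card someone else has locked.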
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all GPU cards are taken. Alternatively, obtain_gpu_lock_id(true) returns -1 when no card is available, and you can then handle that case in your own code.<br />
<br />
The ginfo command tells you which GPU card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
# Sweep upper bounds for Epsilon and Beta (param1 is unused; the LeapSize values are hardcoded in the loop below)<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
# LeapSize<br />
for i in 14 15 16<br />
do<br />
    # Epsilon<br />
    for j in $(seq .8 .1 $param2)<br />
    do<br />
        # Beta<br />
        for k in $(seq .65 .01 $param3)<br />
        do<br />
            echo $i,$j,$k<br />
            qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
        done<br />
    done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
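Since the scheduler sections of this page now use SLURM, the qsub -v calls above can be replaced by sbatch --export. A sketch in Python (the parameter grid mirrors iterate.sh; param_test.sh is assumed to have its #PBS directives ported to #SBATCH equivalents, and nothing is actually submitted unless dry_run=False):

```python
import itertools
import subprocess

def sweep_commands(script="param_test.sh"):
    """Build one sbatch command per (LeapSize, Epsilon, Beta) triple,
    mirroring the nested loops of iterate.sh above."""
    leap_sizes = [14, 15, 16]
    epsilons = [0.8, 0.9, 1.0, 1.1, 1.2]                    # seq .8 .1 1.2
    betas = [round(0.65 + 0.01 * n, 2) for n in range(11)]  # seq .65 .01 .75
    cmds = []
    for ls, eps, beta in itertools.product(leap_sizes, epsilons, betas):
        export = "ALL,LeapSize=%s,Epsilon=%s,Beta=%s" % (ls, eps, beta)
        cmds.append(["sbatch", "--export=" + export, script])
    return cmds

def submit(dry_run=True):
    """Print the commands (dry run) or hand them to sbatch on the cluster."""
    for cmd in sweep_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.check_call(cmd)
```

Run submit() once as a dry run on a login node to check the generated commands before submitting the full sweep.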
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely allows you to easily access files on the cluster, and allows you to use local programs to edit code or examine simulation outputs locally (very useful). I often edit the remote code using a text editor running on my local machine. This allows you to take advantage of the niceties of a native editor without having to copy code back and forth before you run a simulation on the cluster.<br />
<br />
On Linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]:<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their website [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''qsub''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session you can add a -C flag to the command above to enable compression data to be sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where the myscript.sh is an shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
the --time defines the walltime of the job, which is an upper bound on the estimated runtime. The job will be killed after this time is elapsed. --mem specifies how much memory the job requires, the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errofile.txt -J jobdescriptor myscript.sh<br />
<br />
the output of the job will be piped to outputfile.txt and any errors if the job crashes to errofile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names jobdescriptor passed to sbatch, runtime and nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname. $variable1 $variable2"<br />
<br />
The above script takes a matlab job with scriptname = scriptname and accepts two variables $variable1 and $variable2<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name <username>@hpc.brc.berkeley.edu:/global/home/users/<username>/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own python distribution the Anaconda Python is a very good distribution. To get it, go the the [http://continuum.io/downloads Continuum downloads] page and select the linux distribution (penguin).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda<version info>.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint={cortex_k40, cortex_fermi} option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using the GPU with Theano ===<br />
<br />
The Anaconda Python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository], installed locally, and added to your PYTHONPATH if you are using the preinstalled Python verions. If you have a local python install you can install theano with<br />
python setup.py develop<br />
from the repository folder.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all GPU cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 when no card is<br />
available, and you can then write your own code to deal with that case.<br />
<br />
ginfo tells you which GPU card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is a pair of scripts for doing embarrassingly parallel submissions on the cluster. Note that they use the older qsub/PBS syntax rather than SLURM's sbatch.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
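Under SLURM, the same sweep can be submitted as a single job array instead of one submission per parameter combination. Below is a sketch assuming the same 3&times;5&times;11 grid as iterate.sh above (the partition, walltime, script name param_array.sh, and the commented matlab call are placeholders):<br />

```shell
#!/bin/bash
#SBATCH -p cortex
#SBATCH --time=10:35:00
# Submit with:  sbatch --array=0-164 param_array.sh
# Grid: 3 LeapSize values x 5 Epsilon values x 11 Beta values = 165 tasks.
idx=${SLURM_ARRAY_TASK_ID:-0}
i=$(( idx / 55 ))        # 0..2  -> LeapSize 14,15,16
j=$(( (idx / 11) % 5 ))  # 0..4  -> Epsilon 0.80..1.20
k=$(( idx % 11 ))        # 0..10 -> Beta    0.65..0.75
LeapSize=$(( 14 + i ))
Epsilon=$(awk -v j="$j" 'BEGIN { printf "%.2f", 0.8 + j * 0.1 }')
Beta=$(awk -v k="$k" 'BEGIN { printf "%.2f", 0.65 + k * 0.01 }')
echo "LeapSize=$LeapSize Epsilon=$Epsilon Beta=$Beta"
# module load matlab
# matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"
```

Each array task is scheduled independently, so squeue shows one entry per pending combination while the whole sweep remains a single sbatch call.<br />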
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely lets you access files on the cluster directly and use local programs to edit code or examine simulation outputs (very useful). Editing remote code in a native text editor on your own machine gives you all its niceties without having to copy code back and forth before running a simulation on the cluster.<br />
<br />
On Linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8247Cluster2015-07-03T19:42:08Z<p>Jesselivezey: /* CUDA SDK (Outdated since version change to 3.0) */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''qsub''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so it would be truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add a -C flag to the command above to compress the data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ''ssh to a login node'' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, an upper bound on the estimated runtime; the job will be killed once this time has elapsed. --mem-per-cpu specifies how much memory the job requires per core; the default is 1GB.<br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be written to outputfile.txt, and any errors if the job crashes to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows user names, the job descriptor passed to sbatch, the runtime, and the nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users on a particular node by ssh-ing into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs the matlab script scriptname, passing it the two variables $variable1 and $variable2.<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name <username>@hpc.brc.berkeley.edu:/global/home/users/<username>/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own Python distribution, Anaconda is a very good choice. To get it, go to the [http://continuum.io/downloads Continuum downloads] page and select the Linux distribution (penguin icon).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda<version info>.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
Either the --constraint=cortex_k40 or the --constraint=cortex_fermi option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using the GPU with Theano ===<br />
<br />
The Anaconda Python distribution comes with a version of Theano that should work. If you need newer Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository] and installed locally.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all GPU cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 when no card is<br />
available, and you can then write your own code to deal with that case.<br />
<br />
ginfo tells you which GPU card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
= Usage Tips TODO =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is a pair of scripts for doing embarrassingly parallel submissions on the cluster. Note that they use the older qsub/PBS syntax rather than SLURM's sbatch.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely lets you access files on the cluster directly and use local programs to edit code or examine simulation outputs (very useful). Editing remote code in a native text editor on your own machine gives you all its niceties without having to copy code back and forth before running a simulation on the cluster.<br />
<br />
On Linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8246Cluster2015-07-03T19:41:23Z<p>Jesselivezey: /* Usage Tips */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''qsub''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so it would be truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to a login node ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add a -C flag to the command above to compress the data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the login nodes for computations (e.g. matlab, python)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ''ssh to a login node'' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, an upper bound on the estimated runtime; the job will be killed once this time has elapsed. --mem-per-cpu specifies how much memory the job requires per core; the default is 1GB.<br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be written to outputfile.txt, and any errors if the job crashes to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows user names, the job descriptor passed to sbatch, the runtime, and the nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users on a particular node by ssh-ing into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs the matlab script scriptname, passing it the two variables $variable1 and $variable2.<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name <username>@hpc.brc.berkeley.edu:/global/home/users/<username>/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own Python distribution, Anaconda is a very good choice. To get it, go to the [http://continuum.io/downloads Continuum downloads] page and select the Linux distribution (penguin icon).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda<version info>.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
Either the --constraint=cortex_k40 or the --constraint=cortex_fermi option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using the GPU with Theano ===<br />
<br />
The Anaconda Python distribution comes with a version of Theano that should work. If you need newer Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository] and installed locally.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all GPU cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 when no card is<br />
available, and you can then write your own code to deal with that case.<br />
<br />
ginfo tells you which GPU card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
=== CUDA SDK (Outdated since version change to 3.0) ===<br />
<br />
You can install the CUDA SDK by running<br />
<br />
bash /clusterfs/cortex/software/cuda-2.3/src/cudasdk_2.3_linux.run<br />
<br />
You can compile all the code examples by running<br />
<br />
module load X11<br />
module load Mesa/7.4.4<br />
cd ~/NVIDIA_GPU_Computing_SDK/C<br />
make<br />
<br />
The compiled examples can be found in the directory<br />
<br />
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release<br />
<br />
'''note:''' The examples using graphics with OpenGL don't seem to run on a remote X server. In order to make them work, we probably need to install something like [http://www.virtualgl.org/ virtualgl].<br />
<br />
<!--<br />
=== PyCuda ===<br />
<br />
PyCuda 0.93 is installed as part of the Source Python Distribution (SPD). This is how you run all unit tests:<br />
<br />
module load python/spd<br />
cd /clusterfs/cortex/software/src/pycuda-0.93/test/<br />
nosetests<br />
<br />
If you are having trouble installing PyCuda, please note the following:<br />
<br />
* gcc 4.1.2 related issues with boost [http://tinyurl.com/28zrjnv]<br />
* also, gcc 4.1.2 related [http://tinyurl.com/25obx6g]<br />
--><br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an example pair of scripts for embarrassingly parallel submissions on the cluster: iterate.sh loops over a parameter grid and submits one job for each parameter combination via param_test.sh. Note that these scripts use the older Torque/PBS syntax (qsub and #PBS directives) rather than SLURM's sbatch.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
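Since the cluster now uses SLURM, an equivalent of param_test.sh would replace the #PBS directives with #SBATCH ones. The sketch below is an untested translation, not taken from the original page: the partition, walltime, and paths are carried over from the PBS version, while the %j log-file patterns are an assumption you should adjust. It writes the script to a file so the header can be inspected; on the cluster you would submit it with sbatch --export=ALL,LeapSize=14,Epsilon=0.8,Beta=0.65 param_test_slurm.sh, using sbatch --export=... in place of qsub -v inside iterate.sh.<br />

```shell
# Hypothetical SLURM translation of param_test.sh (a sketch, not verified
# on the cluster); written to a file here so the header can be checked.
cat > param_test_slurm.sh <<'EOF'
#!/bin/bash -l
#SBATCH -p cortex
#SBATCH --time=10:35:00
#SBATCH --mem-per-cpu=2G
#SBATCH -o /global/home/users/mayur/Logs/%j.out
#SBATCH -e /global/home/users/mayur/Errors/%j.err
cd /global/home/users/mayur/HMC_reducedflip/
module load matlab
echo "Epsilon = $Epsilon, Leap Size = $LeapSize, Beta = $Beta"
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"
EOF
echo "wrote param_test_slurm.sh"
```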
<br />
Now run ./iterate.sh<br />
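Before submitting a large sweep, it can help to preview exactly which parameter combinations iterate.sh will generate. This dry run reproduces its loops with the submission command replaced by echo:<br />

```shell
# Dry run of the iterate.sh loops: print each LeapSize,Epsilon,Beta
# triple instead of submitting a job for it.
param2=1.2
param3=.75
sweep() {
    for i in 14 15 16; do                        # LeapSize
        for j in $(seq .8 .1 $param2); do        # Epsilon
            for k in $(seq .65 .01 $param3); do  # Beta
                echo "$i,$j,$k"
            done
        done
    done
}
sweep
```

Each printed line corresponds to one job submission, so the line count tells you how many jobs the sweep would enqueue.<br />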
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system on your own machine makes it easy to access files on the cluster and to use local programs to edit code or examine simulation outputs. For example, you can edit remote code in a text editor running on your local machine, taking advantage of the niceties of a native editor without copying code back and forth each time you run a simulation on the cluster.<br />
<br />
On Linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]:<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
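For example, on Linux you could mount your home directory at a local mount point and later detach it (the ~/cluster mount point is just an example name):<br />

```
mkdir -p ~/cluster
sshfs username@hadley.berkeley.edu: ~/cluster
# work with the files under ~/cluster, then detach:
fusermount -u ~/cluster      # on Linux
umount ~/cluster             # on macOS
```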
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their list below, and please always cc our list as well. You can also visit their website [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''qsub''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== Home Directory Quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== Data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== Pledge App (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to the gateway computer ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session you can add a -C flag to the command above to enable compression data to be sent through the ssh tunnel.<br />
<br />
''' note: please don't use the gateway for computations (e.g. matlab)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where the myscript.sh is an shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
the --time defines the walltime of the job, which is an upper bound on the estimated runtime. The job will be killed after this time is elapsed. --mem specifies how much memory the job requires, the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errofile.txt -J jobdescriptor myscript.sh<br />
<br />
the output of the job will be piped to outputfile.txt and any errors if the job crashes to errofile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names jobdescriptor passed to sbatch, runtime and nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname. $variable1 $variable2"<br />
<br />
The above script takes a matlab job with scriptname = scriptname and accepts two variables $variable1 and $variable2<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name <username>@hpc.brc.berkeley.edu:/global/home/users/<username>/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own python distribution the Anaconda Python is a very good distribution. To get it, go the the [http://continuum.io/downloads Continuum downloads] page and select the linux distribution (penguin).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda<version info>.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint={cortex_k40, cortex_fermi} option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using the GPU with Theano ===<br />
<br />
The Anaconda python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository] and installed locally.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock() throws an error when all gpu cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 in case there<br />
is no card available and you can then write your own code to deal with that<br />
fact.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
=== CUDA SDK (Outdated since version change to 3.0) ===<br />
<br />
You can install the CUDA SDK by running<br />
<br />
bash /clusterfs/cortex/software/cuda-2.3/src/cudasdk_2.3_linux.run<br />
<br />
You can compile all the code examples by running<br />
<br />
module load X11<br />
module load Mesa/7.4.4<br />
cd ~/NVIDIA_GPU_Computing_SDK/C<br />
make<br />
<br />
The compiled examples can be found in the directory<br />
<br />
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release<br />
<br />
'''note:''' The examples using graphics with OpenGL don't seem to run on a remote X server. In order to make them work, we probably need to install something like [http://www.virtualgl.org/ virtualgl].<br />
<br />
<!--<br />
=== PyCuda ===<br />
<br />
PyCuda 0.93 is installed as part of the Source Python Distribution (SPD). This is how you run all unit tests:<br />
<br />
module load python/spd<br />
cd /clusterfs/cortex/software/src/pycuda-0.93/test/<br />
nosetests<br />
<br />
If you are having trouble installing PyCuda, please note the following:<br />
<br />
* gcc 4.1.2 related issues with boost [http://tinyurl.com/28zrjnv]<br />
* also, gcc 4.1.2 related [http://tinyurl.com/25obx6g]<br />
--><br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script for embarrassingly parallel submissions on the cluster. Note that these examples use the older Torque (qsub/PBS) syntax; under the current SLURM scheduler, sbatch and #SBATCH directives play the same roles.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Epsilon and Beta upper bounds<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
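<br />
The sweep in iterate.sh can also be generated programmatically, which makes it easy to count or inspect the jobs before submitting anything. A Python sketch of the same parameter grid (the command strings mirror iterate.sh; nothing is actually submitted here):<br />

```python
from itertools import product

# same ranges as iterate.sh: LeapSize 14-16, Epsilon .8-1.2, Beta .65-.75
leap_sizes = [14, 15, 16]
epsilons = [round(0.8 + 0.1 * i, 2) for i in range(5)]    # 0.8 .. 1.2
betas = [round(0.65 + 0.01 * i, 2) for i in range(11)]    # 0.65 .. 0.75

commands = [
    f'qsub -v "LeapSize={i},Epsilon={j},Beta={k}" param_test.sh'
    for i, j, k in product(leap_sizes, epsilons, betas)
]
print(len(commands))  # 165 jobs in the sweep
```

Under SLURM, each command would instead look like sbatch --export=LeapSize=14,Epsilon=0.8,Beta=0.65 param_test.sh, assuming param_test.sh is converted to use #SBATCH directives.<br />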
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely lets you access files on the cluster easily and use local programs to edit code or examine simulation outputs (very useful). For example, you can edit remote code with a text editor running on your local machine, taking advantage of the niceties of a native editor without having to copy code back and forth before each run on the cluster.<br />
<br />
On Linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]:<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their mailing list (please always cc our email list as well), or visit their website [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
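The nested loops above can also be driven from Python. This sketch assumes the same param_test.sh and parameter ranges as the shell script; with dry_run=True it only builds the qsub command lines rather than submitting them:<br />

```python
import itertools
import subprocess

leap_sizes = [14, 15, 16]
epsilons = [round(0.8 + 0.1 * i, 1) for i in range(5)]   # 0.8 .. 1.2
betas = [round(0.65 + 0.01 * i, 2) for i in range(11)]   # 0.65 .. 0.75

def sweep_commands(dry_run=True):
    """Build (and optionally submit) one qsub call per parameter triple."""
    commands = []
    for leap, eps, beta in itertools.product(leap_sizes, epsilons, betas):
        cmd = ["qsub", "param_test.sh", "-v",
               "LeapSize=%d,Epsilon=%s,Beta=%s" % (leap, eps, beta)]
        commands.append(cmd)
        if not dry_run:
            subprocess.check_call(cmd)
    return commands
```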
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely allows you to easily access files on the cluster, and allows you to use local programs to edit code or examine simulation outputs locally (very useful). I often edit the remote code using a text editor running on my local machine. This allows you to take advantage of the niceties of a native editor without having to copy code back and forth before you run a simulation on the cluster.<br />
<br />
On linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8241Cluster2015-07-03T19:28:51Z<p>Jesselivezey: /* Python */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs (see '''SLURM''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a 4TB NetApp file server,<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so it would have been truncated to '''desiredu''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== home directory quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== pledge app (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to the gateway computer ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add a -C flag to the command above to compress the data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the gateway for computations (e.g. matlab)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like:<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, which is an upper bound on the estimated runtime; the job will be killed once this time has elapsed. The --mem-per-cpu option specifies how much memory per core the job requires; the default is 1GB.<br />
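If you submit many similar jobs, the #SBATCH header can be templated. A minimal sketch (sbatch_script is a hypothetical helper; the partition name and option values are the examples above):<br />

```python
def sbatch_script(command, partition="cortex", walltime="03:30:00",
                  mem_per_cpu="2G"):
    """Render a minimal SLURM batch script like the example above."""
    lines = [
        "#!/bin/bash -l",
        "#SBATCH -p %s" % partition,
        "#SBATCH --time=%s" % walltime,
        "#SBATCH --mem-per-cpu=%s" % mem_per_cpu,
        command,
        "exit",
    ]
    return "\n".join(lines) + "\n"
```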
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
 sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be written to outputfile.txt, and any errors (if the job crashes) to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows each job's user name, the job descriptor passed to sbatch, its runtime, and its nodes.<br />
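Since squeue output is whitespace-separated it is easy to post-process. A hedged sketch assuming the default squeue column order (JOBID, PARTITION, NAME, USER, ST, TIME, NODES, NODELIST):<br />

```python
def parse_squeue_line(line):
    """Split one line of default squeue output into a dict.

    Assumes the default columns:
    JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
    """
    keys = ["jobid", "partition", "name", "user",
            "state", "time", "nodes", "nodelist"]
    return dict(zip(keys, line.split()))
```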
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find the list of users on a particular node by sshing into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
 matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs the matlab script ''scriptname'', passing it the two variables $variable1 and $variable2.<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name <username>@hpc.brc.berkeley.edu:/global/home/users/<username>/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own python distribution, Anaconda is a very good choice. To get it, go to the [http://continuum.io/downloads Continuum downloads] page and select the linux distribution (penguin).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda<version info>.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint={cortex_k40, cortex_fermi} option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using the GPU with Theano ===<br />
<br />
The Anaconda python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository] and installed locally.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all gpu cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 in case there<br />
is no card available and you can then write your own code to deal with that<br />
fact.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
=== CUDA SDK (Outdated since version change to 3.0) ===<br />
<br />
You can install the CUDA SDK by running<br />
<br />
bash /clusterfs/cortex/software/cuda-2.3/src/cudasdk_2.3_linux.run<br />
<br />
You can compile all the code examples by running<br />
<br />
module load X11<br />
module load Mesa/7.4.4<br />
cd ~/NVIDIA_GPU_Computing_SDK/C<br />
make<br />
<br />
The compiled examples can be found in the directory<br />
<br />
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release<br />
<br />
'''note:''' The examples using graphics with OpenGL don't seem to run on a remote X server. In order to make them work, we probably need to install something like [http://www.virtualgl.org/ virtualgl].<br />
<br />
<!--<br />
=== PyCuda ===<br />
<br />
PyCuda 0.93 is installed as part of the Source Python Distribution (SPD). This is how you run all unit tests:<br />
<br />
module load python/spd<br />
cd /clusterfs/cortex/software/src/pycuda-0.93/test/<br />
nosetests<br />
<br />
If you are having trouble installing PyCuda, please note the following:<br />
<br />
* gcc 4.1.2 related issues with boost [http://tinyurl.com/28zrjnv]<br />
* also, gcc 4.1.2 related [http://tinyurl.com/25obx6g]<br />
--><br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely allows you to easily access files on the cluster, and allows you to use local programs to edit code or examine simulation outputs locally (very useful). I often edit the remote code using a text editor running on my local machine. This allows you to take advantage of the niceties of a native editor without having to copy code back and forth before you run a simulation on the cluster.<br />
<br />
On linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8240Cluster2015-07-03T19:28:29Z<p>Jesselivezey: /* Python */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs (see '''SLURM''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a 4TB NetApp file server,<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so it would have been truncated to '''desiredu''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== home directory quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== pledge app (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to the gateway computer ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add a -C flag to the command above to compress the data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the gateway for computations (e.g. matlab)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like:<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, which is an upper bound on the estimated runtime; the job will be killed once this time has elapsed. The --mem-per-cpu option specifies how much memory per core the job requires; the default is 1GB.<br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
 sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be written to outputfile.txt, and any errors (if the job crashes) to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows each job's user name, the job descriptor passed to sbatch, its runtime, and its nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find the list of users on a particular node by sshing into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
 matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs the matlab script ''scriptname'', passing it the two variables $variable1 and $variable2.<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name <username>@hpc.brc.berkeley.edu:/global/home/users/<username>/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own python distribution, Anaconda is a very good choice. To get it, go to the [http://continuum.io/downloads Continuum downloads] page and select the linux distribution (penguin).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
This should download a .sh file that can be run with<br />
bash Anaconda<version info>.sh<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint={cortex_k40, cortex_fermi} option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using the GPU with Theano ===<br />
<br />
The Anaconda python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository] and installed locally.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
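The gpu_lock module itself is site-specific and its internals are not shown here; the sketch below illustrates the general idea with a simple file-based lock (the lock directory, card count, and function names are assumptions for illustration, not the real module's API):<br />

```python
import errno
import os

LOCK_DIR = "/tmp/gpu_locks"  # assumed location; the real module uses its own
NUM_CARDS = 2                # n0000 and n0001 each have two GPU cards

def obtain_lock_id(lock_dir=LOCK_DIR, num_cards=NUM_CARDS):
    """Return a free card id (0..num_cards-1), or -1 if all cards are taken.

    O_CREAT|O_EXCL makes the check-and-create step atomic, so two jobs
    cannot grab the same card at once.
    """
    os.makedirs(lock_dir, exist_ok=True)
    for card in range(num_cards):
        path = os.path.join(lock_dir, "card%d.lock" % card)
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except OSError as e:
            if e.errno == errno.EEXIST:
                continue  # card already locked by someone else
            raise
        os.write(fd, str(os.getpid()).encode())  # record who holds the lock
        os.close(fd)
        return card
    return -1

def release_lock(card, lock_dir=LOCK_DIR):
    """Release a previously obtained card so others can use it."""
    os.unlink(os.path.join(lock_dir, "card%d.lock" % card))
```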
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all GPU cards are taken. Alternatively, obtain_gpu_lock_id(true) returns -1 when no card is available, and you can then write your own code to deal with that case.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
=== CUDA SDK (Outdated since version change to 3.0) ===<br />
<br />
You can install the CUDA SDK by running<br />
<br />
bash /clusterfs/cortex/software/cuda-2.3/src/cudasdk_2.3_linux.run<br />
<br />
You can compile all the code examples by running<br />
<br />
module load X11<br />
module load Mesa/7.4.4<br />
cd ~/NVIDIA_GPU_Computing_SDK/C<br />
make<br />
<br />
The compiled examples can be found in the directory<br />
<br />
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release<br />
<br />
'''note:''' The examples using graphics with OpenGL don't seem to run on a remote X server. In order to make them work, we probably need to install something like [http://www.virtualgl.org/ virtualgl].<br />
<br />
<!--<br />
=== PyCuda ===<br />
<br />
PyCuda 0.93 is installed as part of the Source Python Distribution (SPD). This is how you run all unit tests:<br />
<br />
module load python/spd<br />
cd /clusterfs/cortex/software/src/pycuda-0.93/test/<br />
nosetests<br />
<br />
If you are having trouble installing PyCuda, please note the following:<br />
<br />
* gcc 4.1.2 related issues with boost [http://tinyurl.com/28zrjnv]<br />
* also, gcc 4.1.2 related [http://tinyurl.com/25obx6g]<br />
--><br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = ",$Epsilon<br />
echo "Leap Size = ",$LeapSize<br />
echo "Beta = ",$Beta<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
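The same sweep can be driven from Python. The sketch below builds one submission command per parameter triple, mirroring the loops in iterate.sh; it uses SLURM's sbatch --export as a stand-in for the older qsub -v shown above, and by default only returns the commands rather than submitting them:<br />

```python
import subprocess

def sweep_commands(dry_run=True):
    """Build one sbatch command per (LeapSize, Epsilon, Beta) triple.

    Integer loop counters avoid floating-point drift in the seq-style
    ranges (.8 to 1.2 step .1, and .65 to .75 step .01).
    """
    cmds = []
    for leap in (14, 15, 16):
        for eps10 in range(8, 13):          # Epsilon: 0.8 .. 1.2
            for beta100 in range(65, 76):   # Beta:    0.65 .. 0.75
                eps, beta = eps10 / 10.0, beta100 / 100.0
                cmds.append([
                    "sbatch",
                    "--export=LeapSize=%d,Epsilon=%.1f,Beta=%.2f"
                    % (leap, eps, beta),
                    "param_test.sh",
                ])
    if not dry_run:
        for cmd in cmds:
            subprocess.check_call(cmd)
    return cmds
```

Running sweep_commands(dry_run=False) would actually call sbatch once per triple, i.e. 3 x 5 x 11 = 165 jobs in total.<br />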
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely gives you easy access to files on the cluster and lets you use local programs to edit code or examine simulation outputs (very useful). I often edit remote code using a text editor running on my local machine, which provides the niceties of a native editor without copying code back and forth before running a simulation on the cluster.<br />
<br />
On linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8239Cluster2015-07-03T19:24:18Z<p>Jesselivezey: /* Python */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs (see '''SLURM''' further down on this page for details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a 4TB NetApp file server,<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so it would have been truncated to '''desiredu''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== home directory quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== pledge app (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to the gateway computer ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add the -C flag to the command above to enable compression of the data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the gateway for computations (e.g. matlab)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, which is an upper bound on the estimated runtime; the job will be killed after this time has elapsed. The --mem-per-cpu option specifies how much memory the job requires; the default is 1GB per job. <br />
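Since walltime strings are easy to get wrong, here is a tiny hypothetical helper (not part of SLURM) that formats a duration in minutes as the HH:MM:SS string that --time expects:<br />

```python
def walltime(minutes):
    """Format a duration in minutes as the HH:MM:SS string --time expects."""
    hours, mins = divmod(int(minutes), 60)
    return "%02d:%02d:00" % (hours, mins)
```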
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be piped to outputfile.txt, and any errors (if the job crashes) to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows user names, the job descriptors passed to sbatch, runtimes, and nodes.<br />
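If you want to post-process the queue listing in a script, the default squeue columns can be split as below; the column layout here is assumed from SLURM's default output format:<br />

```python
def parse_squeue(text):
    """Parse default-format squeue output into a list of dicts.

    Assumes the default columns:
    JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
    """
    lines = text.strip().splitlines()
    header = lines[0].split()
    # Limit splitting so a NODELIST(REASON) field containing spaces
    # stays in one piece.
    return [dict(zip(header, line.split(None, len(header) - 1)))
            for line in lines[1:]]
```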
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users on a particular node by sshing into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (this requires specifying the partition and walltime, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs the matlab script '''scriptname''' and passes it the two variables $variable1 and $variable2.<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 or 3.4 Distributions can be loaded through<br />
module load python/anaconda2/anaconda2<br />
or<br />
module load python/anaconda3/anaconda3<br />
respectively. This distribution has NumPy and SciPy built against the Intel MKL BLAS library (multicore BLAS). You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name <username>@hpc.brc.berkeley.edu:/global/home/users/<username>/.continuum/.<br />
<br />
=== Local Install of Anaconda Python Distribution ===<br />
If you want to manage your own python distribution, Anaconda Python is a very good distribution. To get it, go to the [http://continuum.io/downloads Continuum downloads] page and select the Linux distribution (penguin).<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint option, with value cortex_k40 or cortex_fermi, must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using the GPU with Theano ===<br />
<br />
The Anaconda python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository] and installed locally.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working configuration (as of June 2015) is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all GPU cards are taken. Alternatively, obtain_gpu_lock_id(true) returns -1 when no card is available, and you can then write your own code to deal with that case.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
=== CUDA SDK (Outdated since version change to 3.0) ===<br />
<br />
You can install the CUDA SDK by running<br />
<br />
bash /clusterfs/cortex/software/cuda-2.3/src/cudasdk_2.3_linux.run<br />
<br />
You can compile all the code examples by running<br />
<br />
module load X11<br />
module load Mesa/7.4.4<br />
cd ~/NVIDIA_GPU_Computing_SDK/C<br />
make<br />
<br />
The compiled examples can be found in the directory<br />
<br />
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release<br />
<br />
'''note:''' The examples using graphics with OpenGL don't seem to run on a remote X server. In order to make them work, we probably need to install something like [http://www.virtualgl.org/ virtualgl].<br />
<br />
<!--<br />
=== PyCuda ===<br />
<br />
PyCuda 0.93 is installed as part of the Source Python Distribution (SPD). This is how you run all unit tests:<br />
<br />
module load python/spd<br />
cd /clusterfs/cortex/software/src/pycuda-0.93/test/<br />
nosetests<br />
<br />
If you are having trouble installing PyCuda, please note the following:<br />
<br />
* gcc 4.1.2 related issues with boost [http://tinyurl.com/28zrjnv]<br />
* also, gcc 4.1.2 related [http://tinyurl.com/25obx6g]<br />
--><br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = ",$Epsilon<br />
echo "Leap Size = ",$LeapSize<br />
echo "Beta = ",$Beta<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely gives you easy access to files on the cluster and lets you use local programs to edit code or examine simulation outputs (very useful). I often edit remote code using a text editor running on my local machine, which provides the niceties of a native editor without copying code back and forth before running a simulation on the cluster.<br />
<br />
On linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8238Cluster2015-07-03T19:19:54Z<p>Jesselivezey: /* Python */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs (see '''SLURM''' further down on this page for details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a 4TB NetApp file server,<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so it would have been truncated to '''desiredu''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== home directory quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== pledge app (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to the gateway computer ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add the -C flag to the command above to enable compression of the data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the gateway for computations (e.g. matlab)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, which is an upper bound on the estimated runtime; the job will be killed after this time has elapsed. The --mem-per-cpu option specifies how much memory the job requires; the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be piped to outputfile.txt, and any errors (if the job crashes) to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows user names, the job descriptors passed to sbatch, runtimes, and nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users on a particular node by sshing into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (this requires specifying the partition and walltime, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs the matlab script '''scriptname''' and passes it the two variables $variable1 and $variable2.<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
=== Anaconda Python Distribution ===<br />
<br />
The Anaconda Python 2.7 Distribution can be loaded through<br />
module load python/anaconda2/anaconda2<br />
This distribution has NumPy and SciPy built against the Intel MKL BLAS library. You will need to get an [https://store.continuum.io/cshop/academicanaconda academic license] from Continuum and copy it to the cluster.<br />
<br />
On the cluster<br />
cd<br />
mkdir .continuum<br />
<br />
On the machine where you downloaded the license file<br />
scp file_name <username>@hpc.brc.berkeley.edu:/global/home/users/<username>/.continuum/.<br />
<br />
It is currently (June 2015) easiest for you to install a local version of Python. Anaconda Python is a very good distribution. To get it, go to the Continuum downloads page and select the Linux distribution (penguin).<br />
http://continuum.io/downloads<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
<br />
We have several Python Distributions installed: The Enthought Python Distribution (EPD), the Source Python Distribution (SPD) and Sage. The easiest way to get started is probably to use EPD (see below).<br />
<br />
=== Enthought Python Distribution (EPD) ===<br />
<br />
We have the Enthought Python Distribution 7.2.0 installed ([http://www.enthought.com/products/epd.php EPD]). In order to use it, you have to follow these steps:<br />
<br />
* login to the gateway server using "ssh -Y" (see above)<br />
* start an interactive session using "srun -u -p cortex -t 2:0:0 bash -i" (see above)<br />
* load the python environment module:<br />
<br />
module load python/epd<br />
<br />
* start ipython:<br />
<br />
ipython -pylab<br />
<br />
* run the following commands inside ipython to test the setup:<br />
<br />
from enthought.mayavi import mlab<br />
mlab.test_contour3d()<br />
<br />
<!--<br />
=== Source Python Distribution (SPD) ===<br />
<br />
We have the Source Python Distribution installed [[http://code.google.com/p/spdproject/ SPD]]. In order to use it, you have to first load the python environment module:<br />
<br />
module load python/spd<br />
<br />
Afterwards, you can run ipython<br />
<br />
% ipython -pylab<br />
<br />
At the moment, we have numpy, scipy, and matplotlib installed. If you would like to have additional modules installed, let me know [[mailto:kilian@berkeley.edu kilian]]<br />
<br />
=== Sage ===<br />
<br />
Sage is [http://sagemath.org http://sagemath.org]. In order to use sage, you have to first load the sage environment module<br />
<br />
module load python/sage<br />
<br />
After loading the sage module, if you want to have a scipy environment (run ipython, etc) in your interactive session, first do:<br />
<br />
% sage -sh<br />
<br />
then you can run:<br />
<br />
% ipython<br />
<br />
or you can just do:<br />
<br />
% sage -ipython<br />
<br />
This is a temporary solution for people wanting use scipy with mpi on the cluster. It was built against the default openmpi (1.2.8) (icc) and mpi4py 1.1.0. For those using hdf5, I also built hdf5 1.8.3 (gcc) and h5py 1.2.<br />
<br />
Sample pbs and mpi script is here:<br />
<br />
~amirk/test<br />
<br />
You can run it as:<br />
<br />
% mkdir -p ~/jobs<br />
% cd ~amirk/test<br />
% qsub pbs<br />
<br />
--Amir<br />
--><br />
<br />
== CUDA ==<br />
<br />
CUDA is NVIDIA's platform for general-purpose computing on graphics processing units (GPUs). We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
To be scheduled onto a node with a GPU, you must pass a constraint option to the scheduler: --constraint=cortex_k40 or --constraint=cortex_fermi.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
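Putting these pieces together, a GPU job script might look like the following sketch (the time limit and program name are placeholders):<br />

```shell
#!/bin/bash -l
#SBATCH -p cortex
#SBATCH --time=01:00:00
#SBATCH --constraint=cortex_fermi   # or cortex_k40, depending on which GPU you need

module load cuda
./my_cuda_program                   # placeholder for your CUDA executable
```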
<br />
=== Using the GPU with Theano ===<br />
<br />
The Anaconda python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository] and installed locally.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working configuration (as of June 2015) is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
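If you prefer not to keep a .theanorc, the same device settings can be supplied per run through Theano's THEANO_FLAGS environment variable (my_script.py is a placeholder):<br />

```shell
THEANO_FLAGS='device=gpu,floatX=float32,force_device=True' python my_script.py
```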
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock first; this ensures the card is not already in use and reserves it so that no one else uses it while you do.<br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all GPU cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 when no card is<br />
available, and you can then write your own code to deal with that case.<br />
<br />
ginfo tells you which GPU card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
=== CUDA SDK (Outdated since version change to 3.0) ===<br />
<br />
You can install the CUDA SDK by running<br />
<br />
bash /clusterfs/cortex/software/cuda-2.3/src/cudasdk_2.3_linux.run<br />
<br />
You can compile all the code examples by running<br />
<br />
module load X11<br />
module load Mesa/7.4.4<br />
cd ~/NVIDIA_GPU_Computing_SDK/C<br />
make<br />
<br />
The compiled examples can be found in the directory<br />
<br />
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release<br />
<br />
'''note:''' The examples using graphics with OpenGL don't seem to run on a remote X server. In order to make them work, we probably need to install something like [http://www.virtualgl.org/ virtualgl].<br />
<br />
<!--<br />
=== PyCuda ===<br />
<br />
PyCuda 0.93 is installed as part of the Source Python Distribution (SPD). This is how you run all unit tests:<br />
<br />
module load python/spd<br />
cd /clusterfs/cortex/software/src/pycuda-0.93/test/<br />
nosetests<br />
<br />
If you are having trouble installing PyCuda, please note the following:<br />
<br />
* gcc 4.1.2 related issues with boost [http://tinyurl.com/28zrjnv]<br />
* also, gcc 4.1.2 related [http://tinyurl.com/25obx6g]<br />
--><br />
<br />
= Usage Tips =<br />
Here are some tips on how to use the cluster effectively.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternative pair of scripts for doing embarrassingly parallel submissions on the cluster. Note that these scripts use the older PBS (qsub) syntax.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Sweep upper bounds<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
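The cluster has since moved to SLURM (see above), where the same kind of sweep can be driven with sbatch --export instead of qsub -v. A sketch, with illustrative parameter lists written out explicitly and the submissions echoed so the loop can be dry-run first (param_test.sh would need matching #SBATCH directives; remove the echo to actually submit):<br />

```shell
# generate_jobs prints one sbatch command per parameter combination;
# pipe the output to "sh" (or drop the echo) to actually submit.
generate_jobs () {
  for LeapSize in 14 15 16; do
    for Epsilon in 0.8 0.9 1.0 1.1 1.2; do
      for Beta in 0.65 0.70 0.75; do
        echo "sbatch --export=ALL,LeapSize=$LeapSize,Epsilon=$Epsilon,Beta=$Beta param_test.sh"
      done
    done
  done
}
generate_jobs
```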
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely lets you access files on the cluster directly and use local programs to edit code or examine simulation outputs (very useful). For example, you can edit remote code with a text editor running on your own machine, taking advantage of a native editor without having to copy code back and forth before you run a simulation on the cluster.<br />
<br />
On linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
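For example, on Linux (the mount point name is arbitrary):<br />

```shell
mkdir -p ~/cluster                      # local mount point
sshfs username@hadley.berkeley.edu: ~/cluster
# ... edit and browse your cluster files under ~/cluster ...
fusermount -u ~/cluster                 # unmount when finished
```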
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8232Cluster2015-07-02T23:51:36Z<p>Jesselivezey: /* home directory quota */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''qsub''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== home directory quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command: TODO<br />
<br />
quota -s<br />
<br />
=== data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== pledge app (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to the gateway computer ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session you can add a -C flag to the command above to enable compression data to be sent through the ssh tunnel.<br />
<br />
''' note: please don't use the gateway for computations (e.g. matlab)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where the myscript.sh is an shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
the --time defines the walltime of the job, which is an upper bound on the estimated runtime. The job will be killed after this time is elapsed. --mem specifies how much memory the job requires, the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errofile.txt -J jobdescriptor myscript.sh<br />
<br />
the output of the job will be piped to outputfile.txt and any errors if the job crashes to errofile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names jobdescriptor passed to sbatch, runtime and nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname. $variable1 $variable2"<br />
<br />
The above script takes a matlab job with scriptname = scriptname and accepts two variables $variable1 and $variable2<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
<br />
It is currently (June 2015) easiest for you to install a local version of Python. Anaconda Python is a very good distribution. To get it, go the the Continuum downloads page and select the linux distribution (penguin).<br />
http://continuum.io/downloads<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
<br />
We have several Python Distributions installed: The Enthought Python Distribution (EPD), the Source Python Distribution (SPD) and Sage. The easiest way to get started is probably to use EPD (see below).<br />
<br />
=== Enthought Python Distribution (EPD) ===<br />
<br />
We have the Enthought Python Distribution 7.2.0 installed [[http://www.enthought.com/products/epd.php EPD]]. In order to use it, you have to follow the following steps:<br />
<br />
* login to the gateway server using "ssh -Y" (see above)<br />
* start an interactive session using "srun -u -p cortex -t 2:0:0 bash -i" (see above)<br />
* load the python environment module:<br />
<br />
module load python/epd<br />
<br />
* start ipython:<br />
<br />
ipython -pylab<br />
<br />
* run the following commands inside ipython to test the setup:<br />
<br />
from enthought.mayavi import mlab<br />
mlab.test_contour3d()<br />
<br />
<!--<br />
=== Source Python Distribution (SPD) ===<br />
<br />
We have the Source Python Distribution installed [[http://code.google.com/p/spdproject/ SPD]]. In order to use it, you have to first load the python environment module:<br />
<br />
module load python/spd<br />
<br />
Afterwards, you can run ipython<br />
<br />
% ipython -pylab<br />
<br />
At the moment, we have numpy, scipy, and matplotlib installed. If you would like to have additional modules installed, let me know [[mailto:kilian@berkeley.edu kilian]]<br />
<br />
=== Sage ===<br />
<br />
Sage is [http://sagemath.org http://sagemath.org]. In order to use sage, you have to first load the sage environment module<br />
<br />
module load python/sage<br />
<br />
After loading the sage module, if you want to have a scipy environment (run ipython, etc) in your interactive session, first do:<br />
<br />
% sage -sh<br />
<br />
then you can run:<br />
<br />
% ipython<br />
<br />
or you can just do:<br />
<br />
% sage -ipython<br />
<br />
This is a temporary solution for people wanting use scipy with mpi on the cluster. It was built against the default openmpi (1.2.8) (icc) and mpi4py 1.1.0. For those using hdf5, I also built hdf5 1.8.3 (gcc) and h5py 1.2.<br />
<br />
Sample pbs and mpi script is here:<br />
<br />
~amirk/test<br />
<br />
You can run it as:<br />
<br />
% mkdir -p ~/jobs<br />
% cd ~amirk/test<br />
% qsub pbs<br />
<br />
--Amir<br />
--><br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint={cortex_k40, cortex_fermi} option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using the GPU with Theano ===<br />
<br />
The Anaconda python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository] and installed locally.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock() throws an error when all gpu cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 in case there<br />
is no card available and you can then write your own code to deal with that<br />
fact.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
=== CUDA SDK (Outdated since version change to 3.0) ===<br />
<br />
You can install the CUDA SDK by running<br />
<br />
bash /clusterfs/cortex/software/cuda-2.3/src/cudasdk_2.3_linux.run<br />
<br />
You can compile all the code examples by running<br />
<br />
module load X11<br />
module load Mesa/7.4.4<br />
cd ~/NVIDIA_GPU_Computing_SDK/C<br />
make<br />
<br />
The compiled examples can be found in the directory<br />
<br />
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release<br />
<br />
'''note:''' The examples using graphics with OpenGL don't seem to run on a remote X server. In order to make them work, we probably need to install something like [http://www.virtualgl.org/ virtualgl].<br />
<br />
<!--<br />
=== PyCuda ===<br />
<br />
PyCuda 0.93 is installed as part of the Source Python Distribution (SPD). This is how you run all unit tests:<br />
<br />
module load python/spd<br />
cd /clusterfs/cortex/software/src/pycuda-0.93/test/<br />
nosetests<br />
<br />
If you are having trouble installing PyCuda, please note the following:<br />
<br />
* gcc 4.1.2 related issues with boost [http://tinyurl.com/28zrjnv]<br />
* also, gcc 4.1.2 related [http://tinyurl.com/25obx6g]<br />
--><br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = ",$Epsilon<br />
echo "Leap Size = ",$LeapSize<br />
echo "Beta = ",$Beta<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely allows you to easily access files on the cluster, and allows you to use local programs to edit code or examine simulation outputs locally (very useful). I often edit the remote code using a text editor running on my local machine. This allows you to take advantage of the niceties of a native editor without having to copy code back and forth before you run a simulation on the cluster.<br />
<br />
On linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list. Please always cc our email list as well. Or visit their website[https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8231Cluster2015-07-02T23:50:31Z<p>Jesselivezey: /* Hardware Overview */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''qsub''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server TODO<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== home directory quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== pledge app (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to the gateway computer ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session you can add a -C flag to the command above to enable compression data to be sent through the ssh tunnel.<br />
<br />
''' note: please don't use the gateway for computations (e.g. matlab)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where the myscript.sh is an shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
the --time defines the walltime of the job, which is an upper bound on the estimated runtime. The job will be killed after this time is elapsed. --mem specifies how much memory the job requires, the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errofile.txt -J jobdescriptor myscript.sh<br />
<br />
the output of the job will be piped to outputfile.txt and any errors if the job crashes to errofile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows each job's user name, the job descriptor passed to sbatch, its runtime, and its assigned nodes.<br />
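For day-to-day monitoring, a few further invocations may be useful (a sketch; the job ID below is a placeholder):<br />

```shell
# Show only your own jobs
squeue -u $USER

# Show job ID, name, state, elapsed time and nodes in a wider format
squeue -u $USER -o "%.10i %.20j %.8T %.10M %R"

# Cancel a single job by ID, or all of your jobs
scancel 123456
scancel -u $USER
```

These are standard SLURM commands, so they should work unchanged on the cortex partition.<br />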
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to the cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list: show the environment modules currently loaded in your session<br />
* module avail: list all modules available on the cluster<br />
* module help: print help for the module command<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users on a particular node by sshing into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
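If you only need a quick snapshot rather than an interactive view, top can also be run in batch mode over ssh (the node name is an example):<br />

```shell
# Print one snapshot of the busiest processes on the node, then exit
ssh n0000.cortex top -b -n 1 | head -n 20
```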
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the partition and walltime, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs the matlab function scriptname, passing the shell variables $variable1 and $variable2 to it as arguments (matlab command syntax, so they arrive as strings).<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
<br />
It is currently (June 2015) easiest to install a local version of Python. Anaconda Python is a very good distribution. To get it, go to the Continuum downloads page and select the Linux distribution (penguin).<br />
http://continuum.io/downloads<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
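Once wget has fetched the installer, run it in batch mode and add it to your PATH. A sketch, assuming a hypothetical installer filename (use the name of the file you actually downloaded):<br />

```shell
# -b: batch (non-interactive) install, -p: install prefix
bash Anaconda-x.x.x-Linux-x86_64.sh -b -p $HOME/anaconda

# Put Anaconda first on your PATH; also append this line to ~/.bashrc
export PATH=$HOME/anaconda/bin:$PATH
```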
<br />
We have several Python Distributions installed: The Enthought Python Distribution (EPD), the Source Python Distribution (SPD) and Sage. The easiest way to get started is probably to use EPD (see below).<br />
<br />
=== Enthought Python Distribution (EPD) ===<br />
<br />
We have the Enthought Python Distribution 7.2.0 installed [[http://www.enthought.com/products/epd.php EPD]]. In order to use it, follow these steps:<br />
<br />
* login to the gateway server using "ssh -Y" (see above)<br />
* start an interactive session using "srun -u -p cortex -t 2:0:0 bash -i" (see above)<br />
* load the python environment module:<br />
<br />
module load python/epd<br />
<br />
* start ipython:<br />
<br />
ipython -pylab<br />
<br />
* run the following commands inside ipython to test the setup:<br />
<br />
from enthought.mayavi import mlab<br />
mlab.test_contour3d()<br />
<br />
<!--<br />
=== Source Python Distribution (SPD) ===<br />
<br />
We have the Source Python Distribution installed [[http://code.google.com/p/spdproject/ SPD]]. In order to use it, you have to first load the python environment module:<br />
<br />
module load python/spd<br />
<br />
Afterwards, you can run ipython<br />
<br />
% ipython -pylab<br />
<br />
At the moment, we have numpy, scipy, and matplotlib installed. If you would like to have additional modules installed, let me know [[mailto:kilian@berkeley.edu kilian]]<br />
<br />
=== Sage ===<br />
<br />
Sage is [http://sagemath.org http://sagemath.org]. In order to use sage, you have to first load the sage environment module<br />
<br />
module load python/sage<br />
<br />
After loading the sage module, if you want to have a scipy environment (run ipython, etc) in your interactive session, first do:<br />
<br />
% sage -sh<br />
<br />
then you can run:<br />
<br />
% ipython<br />
<br />
or you can just do:<br />
<br />
% sage -ipython<br />
<br />
This is a temporary solution for people wanting use scipy with mpi on the cluster. It was built against the default openmpi (1.2.8) (icc) and mpi4py 1.1.0. For those using hdf5, I also built hdf5 1.8.3 (gcc) and h5py 1.2.<br />
<br />
Sample pbs and mpi script is here:<br />
<br />
~amirk/test<br />
<br />
You can run it as:<br />
<br />
% mkdir -p ~/jobs<br />
% cd ~amirk/test<br />
% qsub pbs<br />
<br />
--Amir<br />
--><br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
A constraint option (either --constraint=cortex_k40 or --constraint=cortex_fermi) must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
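To actually be scheduled on a GPU node, the constraint has to be part of the resource request. A sketch for both interactive and batch use (the walltime and the choice of cortex_k40 are examples):<br />

```shell
# Interactive session on a K40 GPU node:
srun -u -p cortex --constraint=cortex_k40 -t 1:0:0 --pty bash -i

# Equivalent lines for the header of an sbatch script:
#SBATCH -p cortex
#SBATCH --constraint=cortex_k40
#SBATCH --time=01:00:00
```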
<br />
=== Using the GPU with Theano ===<br />
<br />
The Anaconda python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository] and installed locally.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation]; a configuration that works (as of June 2015) is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all GPU cards are taken. Alternatively, obtain_gpu_lock_id(true) returns -1 when no card is available, so you can handle that case in your own code.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
=== CUDA SDK (Outdated since version change to 3.0) ===<br />
<br />
You can install the CUDA SDK by running<br />
<br />
bash /clusterfs/cortex/software/cuda-2.3/src/cudasdk_2.3_linux.run<br />
<br />
You can compile all the code examples by running<br />
<br />
module load X11<br />
module load Mesa/7.4.4<br />
cd ~/NVIDIA_GPU_Computing_SDK/C<br />
make<br />
<br />
The compiled examples can be found in the directory<br />
<br />
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release<br />
<br />
'''note:''' The examples using graphics with OpenGL don't seem to run on a remote X server. In order to make them work, we probably need to install something like [http://www.virtualgl.org/ virtualgl].<br />
<br />
<!--<br />
=== PyCuda ===<br />
<br />
PyCuda 0.93 is installed as part of the Source Python Distribution (SPD). This is how you run all unit tests:<br />
<br />
module load python/spd<br />
cd /clusterfs/cortex/software/src/pycuda-0.93/test/<br />
nosetests<br />
<br />
If you are having trouble installing PyCuda, please note the following:<br />
<br />
* gcc 4.1.2 related issues with boost [http://tinyurl.com/28zrjnv]<br />
* also, gcc 4.1.2 related [http://tinyurl.com/25obx6g]<br />
--><br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script for embarrassingly parallel submissions on the cluster. (Note: these examples use the older PBS/Torque qsub syntax; see the SLURM section above for the current scheduler.)<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Epsilon upper bound<br />
param2=1.2<br />
#Beta upper bound<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
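Since the cluster has migrated to SLURM, the same sweep can be written with sbatch --export instead of qsub -v. The sketch below is a dry run that only prints the submission commands; pipe the result to sh to actually submit (the script name and variable names follow the PBS example above):<br />

```shell
# Dry run: build one sbatch command per parameter combination.
cmds=$(
  for LeapSize in 14 15 16; do
    for Epsilon in $(seq 0.8 0.1 1.2); do
      for Beta in $(seq 0.65 0.01 0.75); do
        echo "sbatch --export=ALL,LeapSize=$LeapSize,Epsilon=$Epsilon,Beta=$Beta param_test.sh"
      done
    done
  done
)
# Inspect the first few commands before submitting for real.
echo "$cmds" | head -n 3
```

With --export=ALL, each job inherits your environment plus the listed variables, which param_test.sh can then read as $LeapSize, $Epsilon and $Beta.<br />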
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely allows you to access files on the cluster with programs running on your local machine, such as editing code in a native text editor or examining simulation outputs (very useful). This way you get the niceties of a local editor without having to copy code back and forth before running a simulation on the cluster.<br />
<br />
On Linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
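When you are done, the mount can be detached again (sshfs mounts are FUSE mounts; the mount point is whatever directory you chose above):<br />

```shell
fusermount -u <mount-dir>
```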
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their mailing list (please always cc our list as well), or visit their [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ website].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8230Cluster2015-07-02T23:49:52Z<p>Jesselivezey: /* General Information */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs TODO (see '''qsub''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== home directory quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== pledge app (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to the gateway computer ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session you can add a -C flag to the command above to enable compression data to be sent through the ssh tunnel.<br />
<br />
''' note: please don't use the gateway for computations (e.g. matlab)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where the myscript.sh is an shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
the --time defines the walltime of the job, which is an upper bound on the estimated runtime. The job will be killed after this time is elapsed. --mem specifies how much memory the job requires, the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errofile.txt -J jobdescriptor myscript.sh<br />
<br />
the output of the job will be piped to outputfile.txt and any errors if the job crashes to errofile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names jobdescriptor passed to sbatch, runtime and nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname. $variable1 $variable2"<br />
<br />
The above script takes a matlab job with scriptname = scriptname and accepts two variables $variable1 and $variable2<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
<br />
It is currently (June 2015) easiest for you to install a local version of Python. Anaconda Python is a very good distribution. To get it, go the the Continuum downloads page and select the linux distribution (penguin).<br />
http://continuum.io/downloads<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
<br />
We have several Python Distributions installed: The Enthought Python Distribution (EPD), the Source Python Distribution (SPD) and Sage. The easiest way to get started is probably to use EPD (see below).<br />
<br />
=== Enthought Python Distribution (EPD) ===<br />
<br />
We have the Enthought Python Distribution 7.2.0 installed [[http://www.enthought.com/products/epd.php EPD]]. In order to use it, you have to follow the following steps:<br />
<br />
* login to the gateway server using "ssh -Y" (see above)<br />
* start an interactive session using "srun -u -p cortex -t 2:0:0 bash -i" (see above)<br />
* load the python environment module:<br />
<br />
module load python/epd<br />
<br />
* start ipython:<br />
<br />
ipython -pylab<br />
<br />
* run the following commands inside ipython to test the setup:<br />
<br />
from enthought.mayavi import mlab<br />
mlab.test_contour3d()<br />
<br />
<!--<br />
=== Source Python Distribution (SPD) ===<br />
<br />
We have the Source Python Distribution installed [[http://code.google.com/p/spdproject/ SPD]]. In order to use it, you have to first load the python environment module:<br />
<br />
module load python/spd<br />
<br />
Afterwards, you can run ipython<br />
<br />
% ipython -pylab<br />
<br />
At the moment, we have numpy, scipy, and matplotlib installed. If you would like to have additional modules installed, let me know [[mailto:kilian@berkeley.edu kilian]]<br />
<br />
=== Sage ===<br />
<br />
Sage is [http://sagemath.org http://sagemath.org]. In order to use sage, you have to first load the sage environment module<br />
<br />
module load python/sage<br />
<br />
After loading the sage module, if you want to have a scipy environment (run ipython, etc) in your interactive session, first do:<br />
<br />
% sage -sh<br />
<br />
then you can run:<br />
<br />
% ipython<br />
<br />
or you can just do:<br />
<br />
% sage -ipython<br />
<br />
This is a temporary solution for people wanting use scipy with mpi on the cluster. It was built against the default openmpi (1.2.8) (icc) and mpi4py 1.1.0. For those using hdf5, I also built hdf5 1.8.3 (gcc) and h5py 1.2.<br />
<br />
Sample pbs and mpi script is here:<br />
<br />
~amirk/test<br />
<br />
You can run it as:<br />
<br />
% mkdir -p ~/jobs<br />
% cd ~amirk/test<br />
% qsub pbs<br />
<br />
--Amir<br />
--><br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint={cortex_k40, cortex_fermi} option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using the GPU with Theano ===<br />
<br />
The Anaconda python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository] and installed locally.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device=True<br />
<br />
[nvcc]<br />
fastmath = True<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock() throws an error when all gpu cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 in case there<br />
is no card available and you can then write your own code to deal with that<br />
fact.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
=== CUDA SDK (Outdated since version change to 3.0) ===<br />
<br />
You can install the CUDA SDK by running<br />
<br />
bash /clusterfs/cortex/software/cuda-2.3/src/cudasdk_2.3_linux.run<br />
<br />
You can compile all the code examples by running<br />
<br />
module load X11<br />
module load Mesa/7.4.4<br />
cd ~/NVIDIA_GPU_Computing_SDK/C<br />
make<br />
<br />
The compiled examples can be found in the directory<br />
<br />
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release<br />
<br />
'''note:''' The examples using graphics with OpenGL don't seem to run on a remote X server. In order to make them work, we probably need to install something like [http://www.virtualgl.org/ virtualgl].<br />
<br />
<!--<br />
=== PyCuda ===<br />
<br />
PyCuda 0.93 is installed as part of the Source Python Distribution (SPD). This is how you run all unit tests:<br />
<br />
module load python/spd<br />
cd /clusterfs/cortex/software/src/pycuda-0.93/test/<br />
nosetests<br />
<br />
If you are having trouble installing PyCuda, please note the following:<br />
<br />
* gcc 4.1.2 related issues with boost [http://tinyurl.com/28zrjnv]<br />
* also, gcc 4.1.2 related [http://tinyurl.com/25obx6g]<br />
--><br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = ",$Epsilon<br />
echo "Leap Size = ",$LeapSize<br />
echo "Beta = ",$Beta<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely allows you to easily access files on the cluster, and allows you to use local programs to edit code or examine simulation outputs locally (very useful). I often edit the remote code using a text editor running on my local machine. This allows you to take advantage of the niceties of a native editor without having to copy code back and forth before you run a simulation on the cluster.<br />
<br />
On linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list. Please always cc our email list as well. Or visit their website[https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezeyhttps://rctn.org/w/index.php?title=Cluster&diff=8200Cluster2015-06-27T00:06:10Z<p>Jesselivezey: /* Using the GPU with Theano */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs (see '''qsub''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== home directory quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
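For example, the steps above can be sketched as a small shell helper ('''make_scratch''' is a hypothetical name for illustration, not a cluster command):<br />

```shell
# make_scratch DIR -- create a scratch directory and keep it private (sketch)
make_scratch () {
    mkdir -p "$1" &&        # -p: create parents too, no error if it exists
    chmod 700 "$1" &&       # readable/writable only by you
    echo "scratch space ready: $1"
}

# on the cluster you would run:
#   make_scratch /clusterfs/cortex/scratch/$USER
```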
<br />
== Connect ==<br />
<br />
==== pledge app (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to the gateway computer ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add the -C flag to the command above to compress the data sent through the ssh tunnel.<br />
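To save typing, you can put these options in your local ~/.ssh/config (the alias name '''brc''' below is just an example):<br />

```text
# ~/.ssh/config on your local machine; the Host alias is arbitrary
Host brc
    HostName hpc.brc.berkeley.edu
    User username
    ForwardX11 yes
    ForwardX11Trusted yes   # together equivalent to ssh -Y
    Compression yes         # equivalent to ssh -C
```

After this, `ssh brc` connects to the gateway with the same settings.<br />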
<br />
''' note: please don't use the gateway for computations (e.g. matlab)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
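A minimal sketch of this arrangement (a standard bash convention, not specific to the cluster):<br />

```shell
# ~/.bash_profile -- read by login shells; hand off to ~/.bashrc so both
# login and non-login interactive shells get the same customizations.
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi
```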
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software as a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ''ssh to the gateway computer'' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. For a matlab job, it would typically look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, an upper bound on the estimated runtime; the job will be killed once this time has elapsed. --mem-per-cpu specifies how much memory the job requires per CPU; the default is 1GB. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be piped to outputfile.txt, and any errors, if the job crashes, to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows user names, the job descriptor passed to sbatch, runtime, and nodes.<br />
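squeue also accepts the usual SLURM filtering options, for example:<br />

```text
squeue -u $USER     # show only your own jobs
squeue -p cortex    # show only jobs in the cortex partition
```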
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the partition and walltime, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs the matlab script '''scriptname''' and passes it two variables, $variable1 and $variable2, taken from the job's environment.<br />
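With SLURM, such variables are typically injected into the job's environment at submission time via sbatch's --export option (the values here are just placeholders):<br />

```text
sbatch --export=ALL,variable1=3,variable2=5 myscript.sh
```

--export=ALL copies your current environment into the job and adds the listed variables on top.<br />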
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
<br />
It is currently (June 2015) easiest to install a local version of Python. Anaconda Python is a very good distribution. To get it, go to the Continuum downloads page and select the linux distribution (penguin).<br />
http://continuum.io/downloads<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
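and then run the installer. The installer filename below is an example; use whatever file the download link actually gives you:<br />

```text
# installer name is an example -- use the file wget actually downloaded
bash Anaconda-Linux-x86_64.sh -b -p $HOME/anaconda   # -b: batch mode, -p: install prefix
export PATH=$HOME/anaconda/bin:$PATH                 # add this line to ~/.bashrc to persist
```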
<br />
We also have several older Python distributions installed: the Enthought Python Distribution (EPD), the Source Python Distribution (SPD), and Sage. Of these, the easiest way to get started is probably EPD (see below).<br />
<br />
=== Enthought Python Distribution (EPD) ===<br />
<br />
We have the Enthought Python Distribution 7.2.0 installed [[http://www.enthought.com/products/epd.php EPD]]. In order to use it, you have to follow the following steps:<br />
<br />
* login to the gateway server using "ssh -Y" (see above)<br />
* start an interactive session using "srun -u -p cortex -t 2:0:0 --pty bash -i" (see above)<br />
* load the python environment module:<br />
<br />
module load python/epd<br />
<br />
* start ipython:<br />
<br />
ipython -pylab<br />
<br />
* run the following commands inside ipython to test the setup:<br />
<br />
from enthought.mayavi import mlab<br />
mlab.test_contour3d()<br />
<br />
<!--<br />
=== Source Python Distribution (SPD) ===<br />
<br />
We have the Source Python Distribution installed [[http://code.google.com/p/spdproject/ SPD]]. In order to use it, you have to first load the python environment module:<br />
<br />
module load python/spd<br />
<br />
Afterwards, you can run ipython<br />
<br />
% ipython -pylab<br />
<br />
At the moment, we have numpy, scipy, and matplotlib installed. If you would like to have additional modules installed, let me know [[mailto:kilian@berkeley.edu kilian]]<br />
<br />
=== Sage ===<br />
<br />
Sage is [http://sagemath.org http://sagemath.org]. In order to use sage, you have to first load the sage environment module<br />
<br />
module load python/sage<br />
<br />
After loading the sage module, if you want to have a scipy environment (run ipython, etc) in your interactive session, first do:<br />
<br />
% sage -sh<br />
<br />
then you can run:<br />
<br />
% ipython<br />
<br />
or you can just do:<br />
<br />
% sage -ipython<br />
<br />
This is a temporary solution for people wanting use scipy with mpi on the cluster. It was built against the default openmpi (1.2.8) (icc) and mpi4py 1.1.0. For those using hdf5, I also built hdf5 1.8.3 (gcc) and h5py 1.2.<br />
<br />
Sample pbs and mpi script is here:<br />
<br />
~amirk/test<br />
<br />
You can run it as:<br />
<br />
% mkdir -p ~/jobs<br />
% cd ~amirk/test<br />
% qsub pbs<br />
<br />
--Amir<br />
--><br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
To schedule a node with a GPU, you must request one with the --constraint option, i.e. --constraint=cortex_k40 or --constraint=cortex_fermi.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using the GPU with Theano ===<br />
<br />
The Anaconda python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository] and installed locally.<br />
Theano must be configured to use the GPU. General information can be found in the [http://deeplearning.net/software/theano/library/config.html Theano documentation], but a working (June 2015) version is to create a .theanorc file in your HOME directory with the contents:<br />
<br />
[global]<br />
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/<br />
device = gpu<br />
floatX = float32<br />
force_device = True<br />
<br />
[nvcc]<br />
fastmath = True<br />
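As a quick sanity check (a hypothetical helper, not part of the cluster setup), you can parse the file with Python's standard library and confirm the settings before launching jobs:<br />

```python
# Parse a .theanorc-style config and verify the GPU settings above.
import configparser

SAMPLE = """\
[global]
root = /global/software/sl-6.x86_64/modules/langs/cuda/6.5/
device = gpu
floatX = float32
force_device = True

[nvcc]
fastmath = True
"""

cfg = configparser.ConfigParser()
cfg.read_string(SAMPLE)  # for the real file: cfg.read(os.path.expanduser("~/.theanorc"))
assert cfg["global"]["device"] == "gpu"
assert cfg["global"]["floatX"] == "float32"
print("theanorc looks good")
```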
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
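The gpu_lock module itself lives on the cluster and its internals are not shown here. Purely as an illustration (not the actual implementation), a minimal lock-file scheme along the same lines might look like:<br />

```python
# Illustrative sketch only -- NOT the real gpu_lock module.
# One lock file per GPU id; os.O_EXCL makes creation atomic, so two
# processes racing for the same card cannot both succeed.
import os
import tempfile

LOCK_DIR = os.path.join(tempfile.gettempdir(), "gpu_locks")  # the real module uses a shared path
os.makedirs(LOCK_DIR, exist_ok=True)

def obtain_lock_id(n_gpus=2):
    """Return a free GPU id (0 .. n_gpus-1), or -1 if all cards are taken."""
    for gpu in range(n_gpus):
        path = os.path.join(LOCK_DIR, "gpu_lock_%d" % gpu)
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue                      # card already locked by someone else
        os.write(fd, str(os.getpid()).encode())  # record who holds the lock
        os.close(fd)
        return gpu
    return -1

def release_lock_id(gpu):
    """Free a previously obtained card."""
    os.unlink(os.path.join(LOCK_DIR, "gpu_lock_%d" % gpu))
```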
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all gpu cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 in case there<br />
is no card available and you can then write your own code to deal with that<br />
fact.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
=== CUDA SDK (Outdated since version change to 3.0) ===<br />
<br />
You can install the CUDA SDK by running<br />
<br />
bash /clusterfs/cortex/software/cuda-2.3/src/cudasdk_2.3_linux.run<br />
<br />
You can compile all the code examples by running<br />
<br />
module load X11<br />
module load Mesa/7.4.4<br />
cd ~/NVIDIA_GPU_Computing_SDK/C<br />
make<br />
<br />
The compiled examples can be found in the directory<br />
<br />
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release<br />
<br />
'''note:''' The examples using graphics with OpenGL don't seem to run on a remote X server. In order to make them work, we probably need to install something like [http://www.virtualgl.org/ virtualgl].<br />
<br />
<!--<br />
=== PyCuda ===<br />
<br />
PyCuda 0.93 is installed as part of the Source Python Distribution (SPD). This is how you run all unit tests:<br />
<br />
module load python/spd<br />
cd /clusterfs/cortex/software/src/pycuda-0.93/test/<br />
nosetests<br />
<br />
If you are having trouble installing PyCuda, please note the following:<br />
<br />
* gcc 4.1.2 related issues with boost [http://tinyurl.com/28zrjnv]<br />
* also, gcc 4.1.2 related [http://tinyurl.com/25obx6g]<br />
--><br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an example script for embarrassingly parallel submissions on the cluster (note that it uses the older PBS-style qsub syntax).<br />
<br />
iterate.sh<br />
 #!/bin/sh<br />
 # sweep upper bounds<br />
 param2=1.2   # Epsilon<br />
 param3=.75   # Beta<br />
 # LeapSize<br />
 for i in 14 15 16; do<br />
   # Epsilon<br />
   for j in $(seq .8 .1 $param2); do<br />
     # Beta<br />
     for k in $(seq .65 .01 $param3); do<br />
       echo $i,$j,$k<br />
       qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
     done<br />
   done<br />
 done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = ",$Epsilon<br />
echo "Leap Size = ",$LeapSize<br />
echo "Beta = ",$Beta<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
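Since the cluster now uses SLURM, the same sweep can be sketched with sbatch. This version only prints the submission commands (pipe the output to sh to actually submit); --export is assumed to forward the named variables into param_test.sh's environment:<br />

```shell
# Print one sbatch command per parameter combination (dry run).
for LeapSize in 14 15 16; do
    for Epsilon in $(seq .8 .1 1.2); do
        for Beta in $(seq .65 .01 .75); do
            echo "sbatch --export=ALL,LeapSize=$LeapSize,Epsilon=$Epsilon,Beta=$Beta param_test.sh"
        done
    done
done
```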
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely allows you to easily access files on the cluster, and allows you to use local programs to edit code or examine simulation outputs locally (very useful). I often edit the remote code using a text editor running on my local machine. This allows you to take advantage of the niceties of a native editor without having to copy code back and forth before you run a simulation on the cluster.<br />
<br />
On linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their website [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). The typical use cases for the cluster are that you have jobs that run in parallel which are independent, so having several machines will complete the task faster, even though any one machine might not be faster than your own laptop. Or you have a long running job which may take a day, and you don't want to worry about having to leave your laptop on at all times and not be able to use it. Another reason is that your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs (see '''qsub''' further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server<br />
NetOp 4TB<br />
which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desireusername''' must be 3-8 characters long, so it would have been truncated to '''desireus''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== home directory quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== pledge app (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7 digit one time password<br />
<br />
=== ssh to the gateway computer ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend on working with a remote GUI session you can add a -C flag to the command above to enable compression data to be sent through the ssh tunnel.<br />
<br />
''' note: please don't use the gateway for computations (e.g. matlab)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the 2 following pieces of software to create a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in ssh to gateway above<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where the myscript.sh is an shell script containing the call to the executable to be submitted to the cluster. Typically, for a matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
the --time defines the walltime of the job, which is an upper bound on the estimated runtime. The job will be killed after this time is elapsed. --mem specifies how much memory the job requires, the default is 1GB per job. <br />
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errofile.txt -J jobdescriptor myscript.sh<br />
<br />
the output of the job will be piped to outputfile.txt and any errors if the job crashes to errofile.txt<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It will show user names jobdescriptor passed to sbatch, runtime and nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send him/her a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (requires specifying the cluster and walltime as is shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname. $variable1 $variable2"<br />
<br />
The above script takes a matlab job with scriptname = scriptname and accepts two variables $variable1 and $variable2<br />
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
<br />
It is currently (June 2015) easiest for you to install a local version of Python. Anaconda Python is a very good distribution. To get it, go the the Continuum downloads page and select the linux distribution (penguin).<br />
http://continuum.io/downloads<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
<br />
We have several Python Distributions installed: The Enthought Python Distribution (EPD), the Source Python Distribution (SPD) and Sage. The easiest way to get started is probably to use EPD (see below).<br />
<br />
=== Enthought Python Distribution (EPD) ===<br />
<br />
We have the Enthought Python Distribution 7.2.0 installed [[http://www.enthought.com/products/epd.php EPD]]. In order to use it, you have to follow the following steps:<br />
<br />
* login to the gateway server using "ssh -Y" (see above)<br />
* start an interactive session using "srun -u -p cortex -t 2:0:0 bash -i" (see above)<br />
* load the python environment module:<br />
<br />
module load python/epd<br />
<br />
* start ipython:<br />
<br />
ipython -pylab<br />
<br />
* run the following commands inside ipython to test the setup:<br />
<br />
from enthought.mayavi import mlab<br />
mlab.test_contour3d()<br />
<br />
<!--<br />
=== Source Python Distribution (SPD) ===<br />
<br />
We have the Source Python Distribution installed [[http://code.google.com/p/spdproject/ SPD]]. In order to use it, you have to first load the python environment module:<br />
<br />
module load python/spd<br />
<br />
Afterwards, you can run ipython<br />
<br />
% ipython -pylab<br />
<br />
At the moment, we have numpy, scipy, and matplotlib installed. If you would like to have additional modules installed, let me know [[mailto:kilian@berkeley.edu kilian]]<br />
<br />
=== Sage ===<br />
<br />
Sage is [http://sagemath.org http://sagemath.org]. In order to use sage, you have to first load the sage environment module<br />
<br />
module load python/sage<br />
<br />
After loading the sage module, if you want to have a scipy environment (run ipython, etc) in your interactive session, first do:<br />
<br />
% sage -sh<br />
<br />
then you can run:<br />
<br />
% ipython<br />
<br />
or you can just do:<br />
<br />
% sage -ipython<br />
<br />
This is a temporary solution for people wanting use scipy with mpi on the cluster. It was built against the default openmpi (1.2.8) (icc) and mpi4py 1.1.0. For those using hdf5, I also built hdf5 1.8.3 (gcc) and h5py 1.2.<br />
<br />
Sample pbs and mpi script is here:<br />
<br />
~amirk/test<br />
<br />
You can run it as:<br />
<br />
% mkdir -p ~/jobs<br />
% cd ~amirk/test<br />
% qsub pbs<br />
<br />
--Amir<br />
--><br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
The --constraint={cortex_k40, cortex_fermi} option must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Using the GPU with Theano ===<br />
<br />
The Anaconda python distribution comes with a version of Theano that should work. If you need new Theano features, the development version of Theano can be obtained from the [https://github.com/Theano/Theano github repository] and installed locally<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using the card. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock() throws an error when all gpu cards are taken.<br />
There is another option: obtain_gpu_lock_id(true) will return -1 in case there<br />
is no card available and you can then write your own code to deal with that<br />
fact.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
=== CUDA SDK (Outdated since version change to 3.0) ===<br />
<br />
You can install the CUDA SDK by running<br />
<br />
bash /clusterfs/cortex/software/cuda-2.3/src/cudasdk_2.3_linux.run<br />
<br />
You can compile all the code examples by running<br />
<br />
module load X11<br />
module load Mesa/7.4.4<br />
cd ~/NVIDIA_GPU_Computing_SDK/C<br />
make<br />
<br />
The compiled examples can be found in the directory<br />
<br />
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release<br />
<br />
'''note:''' The examples using graphics with OpenGL don't seem to run on a remote X server. In order to make them work, we probably need to install something like [http://www.virtualgl.org/ virtualgl].<br />
<br />
<!--<br />
=== PyCuda ===<br />
<br />
PyCuda 0.93 is installed as part of the Source Python Distribution (SPD). This is how you run all unit tests:<br />
<br />
module load python/spd<br />
cd /clusterfs/cortex/software/src/pycuda-0.93/test/<br />
nosetests<br />
<br />
If you are having trouble installing PyCuda, please note the following:<br />
<br />
* gcc 4.1.2 related issues with boost [http://tinyurl.com/28zrjnv]<br />
* also, gcc 4.1.2 related [http://tinyurl.com/25obx6g]<br />
--><br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
#Leap Size<br />
param1=11<br />
param2=1.2<br />
param3=.75<br />
#LeapSize<br />
for i in 14 15 16<br />
do<br />
#Epsilon<br />
for j in $(seq .8 .1 $param2);<br />
do<br />
#Beta<br />
for k in $(seq .65 .01 $param3);<br />
do<br />
echo $i,$j,$k<br />
qsub param_test.sh -v "LeapSize=$i,Epsilon=$j,Beta=$k"<br />
done<br />
done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = ",$Epsilon<br />
echo "Leap Size = ",$LeapSize<br />
echo "Beta = ",$Beta<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now run ./iterate.sh<br />
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely allows you to easily access files on the cluster, and allows you to use local programs to edit code or examine simulation outputs locally (very useful). I often edit the remote code using a text editor running on my local machine. This allows you to take advantage of the niceties of a native editor without having to copy code back and forth before you run a simulation on the cluster.<br />
<br />
On linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their website [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezey
https://rctn.org/w/index.php?title=Cluster&diff=8198
Cluster
2015-06-26T23:59:42Z
<p>Jesselivezey: /* CUDA */</p>
<hr />
<div>= General Information =<br />
<br />
The Redwood computing cluster consists of about a dozen somewhat heterogeneous machines, some with graphics cards (GPUs). Typical use cases: you have many independent jobs that can run in parallel, so several machines finish the task sooner even if no single machine is faster than your own laptop; you have a long-running job that may take a day, and you don't want to have to leave your laptop on the whole time; or your code leverages a communication scheme (such as MPI) to have multiple machines cooperatively work on a problem. <br />
<br />
In order for the cluster to be useful and well-utilized, it works best for everyone to submit jobs (see '''sbatch''' under SLURM usage further down on this page for the details) to the queue. A job may not start right away, but will get run once its turn comes. Please do not run extended interactive sessions or ssh directly to worker nodes for performing computation.<br />
<br />
== Hardware Overview == <br />
<br />
The current hardware and node configuration is listed [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/ucb-supercluster/cortex here].<br />
<br />
In addition to the compute nodes we own a file server (NetOp, 4 TB), which is mounted as scratch space.<br />
<br />
== Getting an account and one-time password service == <br />
In order to get an account on the cluster, please send an email to Bruno (baolshausen AT berk...edu) with the following information:<br />
<br />
Full Name <emailaddress> desiredusername<br />
<br />
Please also include a note about which PI you are working with. Note: the '''desiredusername''' must be 3-8 characters long, so it would have been truncated to '''desiredu''' in this case.<br />
<br />
'''OTP (One Time Password) Service'''<br />
<br />
Once you have a username, you will need to follow the instructions found [https://commons.lbl.gov/display/itfaq/OTP+%28One+Time+Password%29+Service here] to set up the Pledge application, which gives you a one-time password for logging into the cluster (see '''Installing and Configuring the OTP Token''').<br />
<br />
== Directory setup ==<br />
<br />
=== home directory quota ===<br />
<br />
There is a 10GB quota limit enforced on $HOME directory<br />
(/global/home/users/username) usage. Please keep your usage below<br />
this limit. There will be NETAPP snapshots in place in this file<br />
system so we suggest you store only your source code and scripts<br />
in this area and store all your data under /clusterfs/cortex<br />
(see below).<br />
<br />
In order to see your current quota and usage, use the following command:<br />
<br />
quota -s<br />
<br />
=== data ===<br />
<br />
For large amounts of data, please create a directory<br />
<br />
/clusterfs/cortex/scratch/username<br />
<br />
and store the data inside that directory. Note that unlike the home directory, scratch space is not backed up and permanence of your data is not guaranteed. There is a total limit of 4 TB for this drive that is shared by everyone at the Redwood center.<br />
<br />
== Connect ==<br />
<br />
==== pledge app (get a password) ====<br />
<br />
* Run the pledge app and click "Generate one-time password"<br />
* Enter your PIN and press "Enter"<br />
* The application will present your 7-digit one-time password<br />
<br />
=== ssh to the gateway computer ===<br />
<br />
ssh -Y username@hpc.brc.berkeley.edu<br />
<br />
and use your one-time password.<br />
<br />
If you intend to work with a remote GUI session, you can add the -C flag to the command above to enable compression of the data sent through the ssh tunnel.<br />
<br />
''' note: please don't use the gateway for computations (e.g. matlab)! '''<br />
<br />
=== Setup environment ===<br />
<br />
* put all your customizations into your .bashrc <br />
* for login shells, .bash_profile is used, which in turn loads .bashrc<br />
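A common way to satisfy both points is to have ~/.bash_profile simply source ~/.bashrc, so login and non-login shells share one setup. A minimal sketch of the relevant fragment:<br />

```shell
# ~/.bash_profile -- read by login shells only.
# Delegate to ~/.bashrc so login and non-login shells share one setup.
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi
```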
<br />
=== Using a Windows machine ===<br />
Windows is not a Unix-based operating system and as a result does not natively interface with a Unix environment. Download the following two pieces of software as a workaround:<br />
* Install a Unix environment emulator to interface directly with the cluster. Cygwin [http://www.cygwin.com] seems to work well. During installation make sure to install Net -> "openssh". Editors -> "vim" is also recommended. Then you can use the instructions detailed in '''ssh to the gateway computer''' above.<br />
* Install an SFTP/SCP/FTP client to allow for file sharing between the cluster and your local machine. WinSCP [http://www.winscp.net] is recommended. ExpanDrive can also be used to create a cluster-based network drive on your local machine.<br />
<br />
== Useful commands ==<br />
<br />
See https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/scheduler/ucb-supercluster-slurm-migration for a detailed FAQ on the SLURM job manager. <br />
<br />
Full description of our system by the LBL folks is at http://go.lbl.gov/hpcs-user-svcs/ucb-supercluster/cortex<br />
<br />
=== SLURM usage ===<br />
<br />
* Submitting a Job<br />
<br />
From the login node, you can submit jobs to the compute nodes using the syntax<br />
<br />
sbatch myscript.sh<br />
<br />
where myscript.sh is a shell script containing the call to the executable to be submitted to the cluster. Typically, for a Matlab job, it would look like<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
cd /clusterfs/cortex/scratch/working/dir/for/your/code<br />
module load matlab/R2013a<br />
matlab -nodisplay -nojvm -r "mymatlabfunction( parameters); exit"<br />
exit<br />
<br />
The --time option defines the walltime of the job, an upper bound on the expected runtime; the job will be killed once this time has elapsed. --mem-per-cpu specifies how much memory per CPU the job requires; the default is 1 GB. <br />
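For repeated submissions it can be convenient to generate such job scripts from a small helper. This is only a sketch (the helper name and the placeholder job body are illustrative, not part of the cluster setup):<br />

```shell
# Write a minimal SLURM job script with the given walltime and memory request.
make_job () {
    local walltime=$1 mem=$2 out=$3
    cat > "$out" <<EOF
#!/bin/bash -l
#SBATCH -p cortex
#SBATCH --time=$walltime
#SBATCH --mem-per-cpu=$mem
echo "job body goes here"
EOF
}

make_job 03:30:00 2G job.sh   # then: sbatch job.sh
```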
<br />
* Monitoring Jobs <br />
<br />
Additional options can be passed to sbatch to monitor outputs from the running jobs<br />
<br />
sbatch -o outputfile.txt -e errorfile.txt -J jobdescriptor myscript.sh<br />
<br />
The output of the job will be piped to outputfile.txt, and any errors (if the job crashes) to errorfile.txt.<br />
<br />
* Cluster usage<br />
<br />
Use<br />
squeue<br />
to get a list of pending and running jobs on the cluster. It shows the user name, the job descriptor passed to sbatch, the runtime, and the nodes.<br />
<br />
=== Perceus commands ===<br />
<br />
The perceus manual is [http://www.warewulf-cluster.org/portal/book/export/html/7 here]<br />
<br />
* listing available cluster nodes:<br />
<br />
wwstats<br />
wwnodes<br />
<br />
* list cluster usage<br />
<br />
wwtop<br />
<br />
* to restrict the scope of these commands to cortex cluster, add the following line to your .bashrc<br />
<br />
export NODES='*cortex'<br />
<br />
* module list<br />
* module avail<br />
* module help<br />
<br />
* help pages are [http://lrc.lbl.gov/html/guide.html here]<br />
<br />
=== Finding out the list of occupants on each cluster node ===<br />
<br />
* One can find out the list of users using a particular node by ssh into the node, e.g.<br />
<br />
ssh n0000.cortex<br />
<br />
* After logging into the node, type<br />
<br />
top<br />
<br />
* This is useful if you believe someone is abusing the machine and would like to send them a friendly reminder.<br />
<br />
= Software =<br />
<br />
== Matlab ==<br />
<br />
Start an interactive session on the cluster (this requires specifying the partition and walltime, as shown here):<br />
<br />
srun -u -p cortex -t 2:0:0 --pty bash -i<br />
<br />
In order to use matlab, you have to load the matlab environment:<br />
<br />
module load matlab/R2013a<br />
<br />
Once the matlab environment is loaded, you can start a matlab session by running<br />
<br />
matlab -nodesktop<br />
<br />
An example SLURM script for running matlab code is<br />
<br />
#!/bin/bash -l<br />
#SBATCH -p cortex<br />
#SBATCH --time=03:30:00<br />
#SBATCH --mem-per-cpu=2G<br />
module load matlab/R2013a<br />
matlab -nodesktop -r "scriptname $variable1 $variable2"<br />
<br />
The above script runs a Matlab script named scriptname and passes it the two variables $variable1 and $variable2.<br />
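When the Matlab parameters come from shell variables, the quoting of the -r string is easy to get wrong. One way to build the command is with a small helper (a sketch; the script name is illustrative):<br />

```shell
# Build the matlab invocation for a script plus space-separated parameters.
matlab_cmd () {
    local script=$1; shift
    echo "matlab -nodesktop -r \"$script $*; exit\""
}

matlab_cmd myscript 3 0.5   # prints: matlab -nodesktop -r "myscript 3 0.5; exit"
```

Appending ; exit makes Matlab quit when the script finishes, so the job releases its node (and license) instead of idling until the walltime expires.<br />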
<br />
If you would like to see who is using matlab licenses, enter<br />
<br />
lmstat<br />
<br />
== Python ==<br />
<br />
It is currently (June 2015) easiest for you to install a local version of Python. Anaconda Python is a very good distribution. To get it, go to the Continuum downloads page and select the Linux distribution (penguin icon).<br />
http://continuum.io/downloads<br />
Copy the download link address, and then in a terminal on the cluster run:<br />
<br />
wget paste_link_here<br />
<br />
We have several Python Distributions installed: The Enthought Python Distribution (EPD), the Source Python Distribution (SPD) and Sage. The easiest way to get started is probably to use EPD (see below).<br />
<br />
=== Enthought Python Distribution (EPD) ===<br />
<br />
We have the Enthought Python Distribution 7.2.0 installed [[http://www.enthought.com/products/epd.php EPD]]. In order to use it, follow these steps:<br />
<br />
* login to the gateway server using "ssh -Y" (see above)<br />
* start an interactive session using "srun -u -p cortex -t 2:0:0 --pty bash -i" (see above)<br />
* load the python environment module:<br />
<br />
module load python/epd<br />
<br />
* start ipython:<br />
<br />
ipython -pylab<br />
<br />
* run the following commands inside ipython to test the setup:<br />
<br />
from enthought.mayavi import mlab<br />
mlab.test_contour3d()<br />
<br />
<!--<br />
=== Source Python Distribution (SPD) ===<br />
<br />
We have the Source Python Distribution installed [[http://code.google.com/p/spdproject/ SPD]]. In order to use it, you have to first load the python environment module:<br />
<br />
module load python/spd<br />
<br />
Afterwards, you can run ipython<br />
<br />
% ipython -pylab<br />
<br />
At the moment, we have numpy, scipy, and matplotlib installed. If you would like to have additional modules installed, let me know [[mailto:kilian@berkeley.edu kilian]]<br />
<br />
=== Sage ===<br />
<br />
Sage is [http://sagemath.org http://sagemath.org]. In order to use sage, you have to first load the sage environment module<br />
<br />
module load python/sage<br />
<br />
After loading the sage module, if you want to have a scipy environment (run ipython, etc) in your interactive session, first do:<br />
<br />
% sage -sh<br />
<br />
then you can run:<br />
<br />
% ipython<br />
<br />
or you can just do:<br />
<br />
% sage -ipython<br />
<br />
This is a temporary solution for people wanting use scipy with mpi on the cluster. It was built against the default openmpi (1.2.8) (icc) and mpi4py 1.1.0. For those using hdf5, I also built hdf5 1.8.3 (gcc) and h5py 1.2.<br />
<br />
Sample pbs and mpi script is here:<br />
<br />
~amirk/test<br />
<br />
You can run it as:<br />
<br />
% mkdir -p ~/jobs<br />
% cd ~amirk/test<br />
% qsub pbs<br />
<br />
--Amir<br />
--><br />
<br />
== CUDA ==<br />
<br />
CUDA is a library to use the graphics processing units (GPU) on the graphics card for general-purpose computing. We have a separate wiki page to collect information on how to do general-purpose computing on the GPU: [[GPGPU]].<br />
A constraint flag (either --constraint=cortex_k40 or --constraint=cortex_fermi) must be used in order to schedule a node with a GPU.<br />
We have installed the CUDA 6.5 driver and toolkit.<br />
<br />
In order to use CUDA, you have to load the CUDA environment:<br />
<br />
module load cuda<br />
<br />
=== Obtain GPU lock in python ===<br />
<br />
If you would like to use one of the GPU cards on node n0000 or n0001, please obtain a GPU lock to make sure the card is not in use and that no one else will be using it. <br />
<br />
If you are using Python, you can obtain a GPU lock by running<br />
<br />
import gpu_lock<br />
gpu_lock.obtain_lock_id()<br />
<br />
The function either returns the number of the card you can use (0 or 1) or -1 if both cards are in use.<br />
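A typical pattern is to fall back gracefully when no card is free. gpu_lock is the site-specific module above, so this sketch replaces it with a stub (the stub and the helper name are assumptions for illustration, not the real API):<br />

```python
def obtain_lock_id():
    """Stand-in for gpu_lock.obtain_lock_id(): returns the id of a free
    card (0 or 1), or -1 when both cards are already locked."""
    return -1  # pretend both cards are taken

def pick_device():
    """Choose a compute device based on the lock result."""
    card = obtain_lock_id()
    if card < 0:
        return "cpu"            # no free GPU: fall back to the CPU
    return "gpu%d" % card

print(pick_device())  # prints: cpu
```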
<br />
=== Obtain GPU lock for Jacket in Matlab ===<br />
<br />
If you are using Matlab, you can obtain a GPU lock by running<br />
<br />
addpath('/clusterfs/cortex/software/gpu_lock');<br />
addpath('/clusterfs/cortex/software/jacket/engine');<br />
gpu_id = obtain_gpu_lock_id();<br />
gselect(gpu_id);<br />
<br />
By default, obtain_gpu_lock_id() throws an error when all GPU cards are taken. There is another option: obtain_gpu_lock_id(true) will return -1 when no card is available, and you can then write your own code to deal with that case.<br />
<br />
ginfo tells you which gpu card you are using.<br />
<br />
The following lines should also be in your .bashrc<br />
<br />
## jacket stuff!<br />
module load cuda<br />
export LD_LIBRARY_PATH=/clusterfs/cortex/software/jacket/engine/lib64:$LD_LIBRARY_PATH<br />
<br />
=== CUDA SDK (Outdated since version change to 3.0) ===<br />
<br />
You can install the CUDA SDK by running<br />
<br />
bash /clusterfs/cortex/software/cuda-2.3/src/cudasdk_2.3_linux.run<br />
<br />
You can compile all the code examples by running<br />
<br />
module load X11<br />
module load Mesa/7.4.4<br />
cd ~/NVIDIA_GPU_Computing_SDK/C<br />
make<br />
<br />
The compiled examples can be found in the directory<br />
<br />
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release<br />
<br />
'''note:''' The examples using graphics with OpenGL don't seem to run on a remote X server. In order to make them work, we probably need to install something like [http://www.virtualgl.org/ virtualgl].<br />
<br />
<!--<br />
=== PyCuda ===<br />
<br />
PyCuda 0.93 is installed as part of the Source Python Distribution (SPD). This is how you run all unit tests:<br />
<br />
module load python/spd<br />
cd /clusterfs/cortex/software/src/pycuda-0.93/test/<br />
nosetests<br />
<br />
If you are having trouble installing PyCuda, please note the following:<br />
<br />
* gcc 4.1.2 related issues with boost [http://tinyurl.com/28zrjnv]<br />
* also, gcc 4.1.2 related [http://tinyurl.com/25obx6g]<br />
--><br />
<br />
= Usage Tips =<br />
Here are some tips on how to effectively use the cluster.<br />
<br />
== Embarrassingly Parallel Submissions ==<br />
<br />
Here is an alternate script to do embarrassingly parallel submissions on the cluster.<br />
<br />
iterate.sh<br />
#!/bin/sh<br />
# Parameter sweep over LeapSize, Epsilon, and Beta<br />
param1=11     # unused: LeapSize values are hard-coded in the loop below<br />
param2=1.2    # upper bound for Epsilon<br />
param3=.75    # upper bound for Beta<br />
# LeapSize<br />
for i in 14 15 16<br />
do<br />
    # Epsilon<br />
    for j in $(seq .8 .1 $param2)<br />
    do<br />
        # Beta<br />
        for k in $(seq .65 .01 $param3)<br />
        do<br />
            echo $i,$j,$k<br />
            qsub -v "LeapSize=$i,Epsilon=$j,Beta=$k" param_test.sh<br />
        done<br />
    done<br />
done<br />
<br />
param_test.sh<br />
#!/bin/bash<br />
#PBS -q cortex<br />
#PBS -l nodes=1:ppn=2:gpu<br />
#PBS -l walltime=10:35:00<br />
#PBS -o /global/home/users/mayur/Logs<br />
#PBS -e /global/home/users/mayur/Errors<br />
cd /global/home/users/mayur/HMC_reducedflip/<br />
module load matlab<br />
echo "Epsilon = $Epsilon"<br />
echo "Leap Size = $LeapSize"<br />
echo "Beta = $Beta"<br />
matlab -nodisplay -nojvm -r "make_figures_fneval_cluster $LeapSize $Epsilon $Beta"<br />
<br />
Now make iterate.sh executable (chmod +x iterate.sh) and run ./iterate.sh<br />
<br />
== Mounting Cluster File System ==<br />
Mounting the cluster file system remotely gives you easy access to files on the cluster and lets you edit code or examine simulation outputs with programs on your local machine (very useful). For example, you can edit remote code in a native local text editor, taking advantage of its niceties without copying code back and forth before each run on the cluster.<br />
<br />
On Linux distributions you can mount your cluster home directory locally using sshfs [http://fuse.sourceforge.net/sshfs.html]<br />
<br />
sshfs hadley.berkeley.edu: <mount-dir><br />
<br />
On Mac and Windows machines the program ExpanDrive works well (uses Fuse under the hood): [http://www.expandrive.com]<br />
<br />
= Support Requests =<br />
<br />
* If you have a problem that is not covered on this page, you can send an email to our user list:<br />
<br />
[mailto:redwood_cluster@lists.berkeley.edu redwood_cluster@lists.berkeley.edu]<br />
<br />
* If you need additional help from the LBL group, send an email to their email list (please always cc our email list as well), or visit their website [https://sites.google.com/a/lbl.gov/high-performance-computing-services-group/].<br />
<br />
[mailto:hpcshelp@lbl.gov hpcshelp@lbl.gov]<br />
<br />
* In urgent cases, you can also email [mailto:kmuriki@lbl.gov Krishna Muriki] (LBL User Services) directly.</div>Jesselivezey