Difference between revisions of "Minnesota Supercomputing Institute"


Latest revision as of 15:27, 6 June 2019

MSI Tutorial Resources

Link to the semester-long LATIS Tutorial

http://latis.umn.edu/services-and-programs/research-support/2016-research-workshop-series

Link to broader list of MSI Tutorials

https://www.msi.umn.edu/tutorials/current

Stratus

How to add a second user to a given instance:

# Add the user:
sudo useradd -m -d /home/<username> -s /bin/bash <username>

# Now set up the SSH key for the new user:
sudo mkdir -p /home/<username>/.ssh
sudo chmod 700 /home/<username>/.ssh
sudo vi /home/<username>/.ssh/authorized_keys
#         [... paste in your PUBLIC (*.pub) key, save and quit ...]
sudo chown -R <username>:<username> /home/<username>/.ssh
sudo chmod 600 /home/<username>/.ssh/authorized_keys

# If desired, give the user sudo privileges
sudo passwd <username> #give the user a password, maybe the same as their username
sudo adduser <username> sudo
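
The `sudo vi` step above is interactive. As a minimal sketch, assuming the public key is already available as a string, the same directory layout and permissions can be created without an editor. `install_pubkey` is a hypothetical helper name, not part of any Stratus tooling; on a real instance it would need to run as root against /home/<username> and be followed by the `chown` shown above.

```bash
# Hypothetical helper: create <home>/.ssh with the permissions used above
# and append one public key to authorized_keys.
install_pubkey() (
    home_dir=$1
    pubkey=$2
    mkdir -p "$home_dir/.ssh"
    chmod 700 "$home_dir/.ssh"
    printf '%s\n' "$pubkey" >> "$home_dir/.ssh/authorized_keys"
    chmod 600 "$home_dir/.ssh/authorized_keys"
)

# Example (run as root, then chown as shown above):
#   install_pubkey /home/<username> "$(cat id_rsa.pub)"
```

The function body runs in a subshell, so its variables do not leak into the calling shell.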

Mesabi

Common PBS commands

MSI uses a PBS scheduler. A lot of information on how to use PBS is available online, but here are some commonly used commands.

qsub mypbsscript.pbs         # submit a PBS script to schedule a job
qsub -t 1-22 arrayscript.sh  # submit an array of jobs
qstat                        # check on job status
showstart <jobid>            # check when your scheduled job is due to begin (note this is always
                             #   an overestimate, as it depends on walltimes for all running
                             #   and queued jobs)
acctinfo                     # see total account service unit allocation, service units used by
                             #   each person, and fairshare status
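
The job IDs reported by the scheduler carry a server suffix (for example 4560955.mesabim3.msi.umn.edu in the qstat output on this page), while the showstart example uses the bare number. A small sketch of trimming the suffix follows; `job_number` is a hypothetical helper name, and the commented-out lines assume that `qsub` prints the full job ID on submission.

```bash
# Hypothetical helper: strip the server suffix from a full PBS job ID.
job_number() {
    printf '%s\n' "${1%%.*}"
}

job_number 4560955.mesabim3.msi.umn.edu   # prints 4560955

# Typical use on a login node (requires the scheduler, so left commented out):
#   jobid=$(qsub mypbsscript.pbs)
#   showstart "$(job_number "$jobid")"
```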

Example code to submit jobs on Mesabi

Submitting a job is easiest with a PBS batch script. The following script, called kinship.sh, runs a program called vcf2kinship to create a kinship matrix from MCTFR genotypes. It requests a single node with 12 processors and 25 GB of memory, with a walltime limit of 10 hours.

#!/bin/bash -l
#PBS -l walltime=10:00:00,nodes=1:ppn=12,mem=25gb
/home/vrie0006/hyoung/software/rvtests/executable/vcf2kinship --inVcf /home/vrie0006/hyoung/genotypes.vcf.gz \
      --bn \
      --thread 10 \
      --out /home/vrie0006/hyoung/kinship

That script can then be submitted to the Mesabi short queue at MSI (see the MSI list of queues) as follows:

qsub -q short kinship.sh
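
The kinship script requests ppn=12 but passes --thread 10, so the two numbers are maintained by hand. One way to keep them in step, sketched here under the assumption that the scheduler exports `PBS_NUM_PPN` inside a running job (Torque-style PBS does this, but it is worth confirming on Mesabi), is to derive the thread count from the environment. `thread_count` is a hypothetical helper name.

```bash
# Assumption: PBS_NUM_PPN is exported inside a running job.
# Fall back to 1 thread when the variable is unset (e.g. on a login node).
thread_count() {
    printf '%s\n' "${PBS_NUM_PPN:-1}"
}

echo "would pass: --thread $(thread_count)"
```

Inside the batch script, the vcf2kinship call would then use `--thread "$(thread_count)"` instead of a fixed number.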

You can then check the status of this job by running various commands.

To see when your job might start running use showstart <jobid> like this:

[ln0006:hyoung] showstart 4560951
job 4560951 requires 10 procs for 10:00:00

Estimated Rsv based start in                 2:10:44 on Wed Nov 29 23:58:00
Estimated Rsv based completion in           12:10:44 on Thu Nov 30 09:58:00

Best Partition: mesabipar
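
In this output the estimated completion is simply the estimated start plus the full requested walltime, which is why a job that finishes early ends well before the reported time. The arithmetic can be checked with a short sketch; `hms_to_seconds` is a hypothetical helper, and the three times are copied from the output above.

```bash
# Convert H:MM:SS to seconds. Assumes at most one leading zero per field.
hms_to_seconds() (
    IFS=:
    set -- $1
    h=${1#0}; m=${2#0}; s=${3#0}
    echo $(( ${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0} ))
)

start=$(hms_to_seconds 2:10:44)      # estimated wait from showstart
wall=$(hms_to_seconds 10:00:00)      # requested walltime
done_at=$(hms_to_seconds 12:10:44)   # estimated completion from showstart

[ $(( start + wall )) -eq "$done_at" ] && echo "completion = start + walltime"
```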

To check status use qstat:

[ln0006:hyoung] qstat
Job ID                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
4560955.mesabim3.msi.umn.edu kinship.sh       vrie0006        01:02:54 R small

Here is an example of an array of 22 jobs that run rvtests, with one job per chromosome. In the script, ${PBS_ARRAYID} holds the index of each job within the array (here 1 to 22) and can be used to differentiate jobs.

#!/bin/bash

#PBS -l walltime=24:00:00,nodes=1:ppn=1,mem=2gb
#PBS -m abe
#PBS -M datt0019@umn.edu

# ${PBS_ARRAYID} is this job's array index and selects the chromosome.
# Each line of the command ends with a backslash so that the shell
# treats the whole call as a single command.
/home/vrie0006/datt0019/tools/rvtests/executable/rvtest \
    --inVcf /home/vrie0006/datt0019/mctfr/vcf_files/chr${PBS_ARRAYID}.withRS.filtered.PASS.beagled.MZadded.vcf.gz \
    --boltPlink /home/vrie0006/datt0019/mctfr/bed/cpd_merged \
    --pheno /home/vrie0006/datt0019/mctfr/phenotypes/residualized_phenotypes.ped \
    --pheno-name cpd \
    --meta bolt \
    --inverseNormal --qtl \
    --out /home/vrie0006/datt0019/mctfr/gwas/cpd/chr${PBS_ARRAYID} \
    --boltPlinkNoCheck \
    --siteMACMin 10

That script can then be submitted for scheduling to Mesabi as follows:

qsub -t 1-22 rvtests.sh
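
Because each array job reads a different per-chromosome VCF, a missing input only shows up once that particular job starts. A possible pre-flight check, sketched below, lists the expected inputs that do not exist before anything is submitted. `missing_inputs` is a hypothetical helper; the file-name pattern is copied from the --inVcf argument in the script.

```bash
# Hypothetical pre-flight check: print the per-chromosome inputs that do not exist.
missing_inputs() (
    vcf_dir=$1
    first=$2
    last=$3
    i=$first
    while [ "$i" -le "$last" ]; do
        # Same naming pattern as the --inVcf argument in rvtests.sh.
        f="$vcf_dir/chr${i}.withRS.filtered.PASS.beagled.MZadded.vcf.gz"
        [ -e "$f" ] || printf '%s\n' "$f"
        i=$(( i + 1 ))
    done
)

# Example: submit only when nothing is missing (qsub left commented out).
#   [ -z "$(missing_inputs /home/vrie0006/datt0019/mctfr/vcf_files 1 22)" ] && qsub -t 1-22 rvtests.sh
```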

Interactive job

qsub -I -l walltime=5:00:00,mem=10gb,nodes=1:ppn=1