UK Biobank/Downloading the data
Revision as of 18:24, 29 February 2016
These procedures were all derived from the documentation at the UK Biobank. This information is here as a record and reference. Researchers should not have to repeat these steps.
Phenotypic data
- The phenotype file was downloaded from UK Biobank by the project PI as instructed in the data accessibility email.
- All of the utilities from the UK Biobank download page were retrieved.
- The key, k1234.key, was saved from the PI's email.
- This command was run to decrypt the downloaded phenotype file, producing the file ukb1234.enc_ukb
$ ./ukb_unpack ukb1234.enc k1234.key
- Once decrypted, the following commands were run to extract the data into useful formats
$ ./ukb_conv ukb1234.enc_ukb bulk -eencoding.ukb
$ ./ukb_conv ukb1234.enc_ukb docs -eencoding.ukb
$ ./ukb_conv ukb1234.enc_ukb r -eencoding.ukb
- bulk produces a list of IDs for use with the ukbfetch utility
- docs produces an HTML file containing documentation of the variables in this dataset
- r produces a tab-delimited file and an R script for labeling the variables and setting their levels.
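The decrypt and convert steps above can be collected into one place. The following is a minimal sketch, not part of the UK Biobank utilities: `phenotype_commands` is a hypothetical helper name, and it assumes the decrypted file is named by appending `_ukb` to the encrypted file name, as happened in the step above. It only prints the command lines so they can be reviewed before anything is run.

```shell
# Sketch only: print (do not run) the decrypt and convert commands for one dataset.
# phenotype_commands is a hypothetical helper; the ukb_unpack and ukb_conv
# command lines are copied from the steps above.
phenotype_commands() {
  local enc_file="$1"   # encrypted download, e.g. ukb1234.enc
  local key_file="$2"   # key saved from the PI's email, e.g. k1234.key
  local fmt
  echo "./ukb_unpack ${enc_file} ${key_file}"
  # Assumption: the decrypted file is the encrypted file name plus "_ukb".
  for fmt in bulk docs r; do
    echo "./ukb_conv ${enc_file}_ukb ${fmt} -eencoding.ukb"
  done
}

phenotype_commands ukb1234.enc k1234.key
```

Piping the printed lines to `sh` from the directory holding the utilities would run them in order.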
Genotypic data
- Genetic data was downloaded following the instructions at the UK Biobank site.
- Scripted downloads of all chromosomes were done using a command such as
$ seq 1 26 | parallel -j1 ./gfetch cal {}
$ seq 1 26 | parallel -j1 ./gfetch imp {}
- A single sample map (impv1.sample) for the imputed data also was downloaded
$ ./gfetch imp 1 -m
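Because `parallel -j1` runs one job at a time, the same downloads can be written as a plain loop where GNU parallel is not installed. This is a sketch under that assumption: `gfetch_commands` is a hypothetical helper name, and it prints the command lines rather than running them.

```shell
# Sketch only: print the gfetch command lines for chromosome codes 1 to 26.
# gfetch_commands is a hypothetical helper; it prints instead of executing.
gfetch_commands() {
  local kind chrom
  for kind in cal imp; do
    for chrom in $(seq 1 26); do
      echo "./gfetch ${kind} ${chrom}"
    done
  done
  # Sample map for the imputed data, as in the last step above.
  echo "./gfetch imp 1 -m"
}

gfetch_commands
```

The output is 26 `cal` lines, 26 `imp` lines, and the sample-map line, in the same order as the commands above.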