UK Biobank/Downloading the data: Difference between revisions

Revision as of 22:53, 19 February 2016

These procedures were all derived from the documentation at the UK Biobank.

The phenotype file was downloaded from UK Biobank by the project PI as instructed in the data accessibility email.
All of the utilities from the UK Biobank download page were retrieved.
The key, k1234.key, was saved from the PI's email.
This command was run to decrypt the downloaded phenotype file:
```
$ ./ukb_unpack ukb1234.enc k1234.key
```
which produced the file ukb1234.enc_ukb.
Once the file was decrypted, the following commands were run to extract the data into useful formats:
```
$ ./ukb_conv ukb1234.enc_ukb bulk -eencoding.ukb
$ ./ukb_conv ukb1234.enc_ukb docs -eencoding.ukb
$ ./ukb_conv ukb1234.enc_ukb r -eencoding.ukb
```
1. bulk produces a list of IDs for use with the ukbfetch utility.
2. docs produces an HTML file documenting the variables in this dataset.
3. r produces a tab-delimited file and an R script that labels the variables and assigns their factor levels.

Genetic data was downloaded following the instructions at the UK Biobank site.
Scripted downloads of all chromosomes were done using a command such as:
```
$ seq 1 26 | parallel -j1 ./gfetch cal {}
```
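
If GNU parallel is not installed, the same sequential download can be sketched as a plain shell loop. This is only an illustration under stated assumptions: the function name `fetch_all` and the `GFETCH` override variable are hypothetical and not part of the UK Biobank utilities, and the loop assumes `gfetch` returns a non-zero exit status when a download fails.

```shell
#!/bin/sh
# Sketch: fetch chromosomes 1..26 one at a time, stopping at the first failure.
# GFETCH is a hypothetical override (not a UK Biobank setting); it defaults to
# the ./gfetch binary used in the command above.
fetch_all() {
    gfetch_cmd="${GFETCH:-./gfetch}"
    for chr in $(seq 1 26); do
        # Same invocation as the parallel command: "cal" plus the chromosome number.
        if ! "$gfetch_cmd" cal "$chr"; then
            echo "gfetch failed for chromosome $chr" >&2
            return 1
        fi
    done
}
```

Calling `fetch_all` from the directory that holds `gfetch` runs the downloads in the same order as `parallel -j1`. Stopping at the first failure makes it clear which chromosome needs to be retried.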

@@ Line 29: / Line 29: @@
 == Genotypic data ==
 <ol>
+<li> Genetic data was downloaded following the instructions at [http://biobank.ctsu.ox.ac.uk/showcase/exinfo.cgi?src=AccessingGeneticData the UK Biobank site].
+<li> Scripted downloads of all chromosomes were done using a command such as:
+<pre>
+$ seq 1 26 | parallel -j1 ./gfetch cal {}
+</pre>
 </ol>