BG Lab 2
=== Reading a vcf file ===
To walk you through how to read a vcf, I ran the following command. It prints the first few lines of the file without the "##" header lines and without columns 6 through 8 (QUAL, FILTER, and INFO, the last of which holds the annotation). This just cleans things up by removing parts of the file that we don't need.

<syntaxhighlight lang="bash">
zgrep -v '##' hu916767_20170324191934.1kgALTallele.withHeader.snpEff.vcf.gz | cut -f-5,9- | head
</syntaxhighlight>

The output is here:

<pre>
#CHROM  POS     ID          REF  ALT  FORMAT  hu916767_20170324191934
1       82154   rs4477212   A    .    GT      0/0
1       752566  rs3094315   G    A    GT      1/0
1       752721  rs3131972   A    G    GT      0/1
1       768448  rs12562034  G    A    GT      0/0
1       776546  rs12124819  A    G    GT      0/1
1       798959  rs11240777  G    A    GT      1/0
1       800007  rs6681049   T    C    GT      1/1
1       838555  rs4970383   C    A    GT      0/0
1       846808  rs4475691   C    T    GT      0/1
</pre>

The columns are:
# chromosome
# position
# unique identifier (rs ID)
# the reference allele (the allele found in the human reference genome)
# the alternate allele (an allele discovered in other individuals)
# the FORMAT the individual's genotypes are in (in this case they are coded in the "GT" format, which is 0/0, 0/1, 1/0, or 1/1; believe it or not there are other useful formats). <span style="color:#ff0000"> THIS COLUMN DOES NOT PROVIDE THE INDIVIDUAL'S GENOTYPES AND CAN BE SAFELY IGNORED! </span>
# the genotype of this 23andme individual

To decode the genotype, you must combine the last column with the REF and ALT allele information. In the GT code, each number describes one of the individual's two chromosomes: 0 means that chromosome carries the REF allele, and 1 means it carries the ALT allele.

Take the last line, rs4475691. The REF allele is "C", and the ALT allele is "T". The genotype is 0/1, which tells you that one chromosome of this individual carries 0 ALT alleles (i.e., it carries the reference allele, C) and the other chromosome carries 1 ALT allele (i.e., a T). So the genotype is C/T.

For another example, take rs6681049. The REF allele is "T" and the ALT allele is "C". The genotype is 1/1. That means that one chromosome of this individual carries 1 ALT allele (i.e., a C) and the other chromosome also carries 1 ALT allele (i.e., a C). So the genotype for this individual at that site is C/C.
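
The decoding rule above can also be written as a small shell function. This is only a minimal sketch: <code>decode_gt</code> is a hypothetical helper name (it is not part of any VCF tool), and it assumes a single ALT allele and a genotype written with a "/" separator, as in the output shown above. Any code other than 0 or 1 is printed as "?".

<syntaxhighlight lang="bash">
# decode_gt REF ALT GT
# Hypothetical helper: turn a GT code such as 0/1 into alleles such as C/T.
# Assumes one ALT allele and a "/"-separated, two-chromosome genotype.
decode_gt() {
    ref=$1
    alt=$2
    gt=$3
    out=""
    # Look at the code before the "/" and then the code after it.
    for code in "${gt%%/*}" "${gt##*/}"; do
        case $code in
            0) allele=$ref ;;   # 0 means the REF allele
            1) allele=$alt ;;   # 1 means the ALT allele
            *) allele="?" ;;    # anything else is not handled in this sketch
        esac
        out="${out:+$out/}$allele"
    done
    echo "$out"
}

decode_gt C T 0/1   # rs4475691, prints C/T
decode_gt T C 1/1   # rs6681049, prints C/C
</syntaxhighlight>

The two calls at the end reproduce the worked examples for rs4475691 and rs6681049.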