GSCAN dbGaP
==Stroke==
===Phenotypes===
<syntaxhighlight lang="rsplus">
### Date: Feb 13 2017
### Author: Scott Vrieze

options(stringsAsFactors = FALSE)

### Load in dataset ###
ninds <- read.table("/work/KellerLab/GSCAN/dbGaP/CIDR_StrokeGenetics/PhenoGenotypeFiles/RootStudyConsentSet_phs000615.CIDR_International_StrokeGenetics.v1.p1.c3.GRU-NPU/PhenotypeFiles/phs000615.v1.pht003307.v1.p1.c3.IschemicStroke_Subject_Phenotypes.GRU-NPU.txt.gz", header = TRUE, sep = "\t")

### The file reads into R incorrectly because of a stray trailing tab
### character in the data file, so use the code below to shift the
### column names onto the correct columns.
names(ninds)[1] <- "XXX"
ninds$dbGaP_Subject_ID <- row.names(ninds)
ninds$smokingStatus <- NULL
names(ninds) <- c(names(ninds)[2:17], "smokingStatus", "dbGaP_Subject_ID")

### Subset to only the variables needed
pheno <- subset(ninds, select = c("subject_id", "smokingStatus", "age", "gender", "AFFECTION_STATUS"))

###--------------------###
### SMOKING INITIATION ###
###--------------------###
### NINDS variable is "smokingStatus"
### Values are "NEVER", "FORMER", "CURRENT", "NO", "UNKNOWN", and blank.
###
### 'CURRENT' defined as cigarette smoking within last 30 days,
### 'FORMER' defined as more than 100 cigarettes in one's lifetime
### but no smoking within the last 30 days; 'NEVER' defined as
### less than 100 cigarettes smoked in one's lifetime.
###
### table(pheno$smokingStatus)
###
###         CURRENT  FORMER   NEVER      NO UNKNOWN
###      63     727    1137    2397       4     251

### Coding: 2 = ever smoker (CURRENT or FORMER), 1 = never smoker
### (NEVER or NO); everything else (UNKNOWN, blank) becomes NA.
si <- pheno$smokingStatus
si[si == "CURRENT" | si == "FORMER"] <- 2
si[si == "NEVER" | si == "NO"] <- 1
si[si != "1" & si != "2"] <- NA
si <- as.numeric(si)

### table(si)
###
###    1    2
### 2401 1864

###-------------------###
### SMOKING CESSATION ###
###-------------------###
### NINDS variable is "smokingStatus"
### Values are "NEVER", "FORMER", "CURRENT", "NO", "UNKNOWN", and blank.
###
### 'CURRENT' defined as cigarette smoking within last 30 days,
### 'FORMER' defined as more than 100 cigarettes in one's lifetime but
### no smoking within the last 30 days; 'NEVER' defined as less than
### 100 cigarettes smoked in one's lifetime.
###
### table(pheno$smokingStatus)
###
###         CURRENT  FORMER   NEVER      NO UNKNOWN
###      63     727    1137    2397       4     251

### Coding: 2 = current smoker, 1 = former smoker; everything else
### (never smokers, UNKNOWN, blank) becomes NA.
sc <- pheno$smokingStatus
sc[sc == "CURRENT"] <- 2
sc[sc == "FORMER"] <- 1
sc[sc != "1" & sc != "2"] <- NA
sc <- as.numeric(sc)

### table(sc)
###
###    1    2
### 1137  727

###--------###
### GENDER ###
###--------###
### NINDS variable is "gender"
### Values are "F" and "M"
###
### table(pheno$gender)
###
###    F    M
### 2627 1952

### Coding: 1 = male, 2 = female
sex <- pheno$gender
sex[sex == "M"] <- 1
sex[sex == "F"] <- 2
sex <- as.numeric(sex)

###----------------###
### Write to files ###
###----------------###
### This study uses the "subject_id" ID field in the genotype files,
### so use that here, instead of the dbGaP_Subject_ID

phenotypes <- data.frame(fid = pheno$subject_id,
                         iid = pheno$subject_id,
                         patid = "x",
                         matid = "x",
                         sex = sex,
                         si = si,
                         sc = sc)

write.table(phenotypes, "/work/KellerLab/vrieze/GSCAN/GWAS/summary_stats_generated_internally/Stroke/GSCAN_STROKE_phenotypes.ped", row.names = FALSE, quote = FALSE, sep = "\t")

covariates <- data.frame(fid = pheno$subject_id,
                         iid = pheno$subject_id,
                         patid = "x",
                         matid = "x",
                         sex = sex,
                         age = pheno$age,
                         age2 = pheno$age^2,
                         AFFECTION_STATUS = pheno$AFFECTION_STATUS)

write.table(covariates, "/work/KellerLab/vrieze/GSCAN/GWAS/summary_stats_generated_internally/Stroke/GSCAN_STROKE_covariates.ped", row.names = FALSE, quote = FALSE, sep = "\t")
### save table
</syntaxhighlight>
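The script above recodes smokingStatus by overwriting strings in place and then converting to numeric. As a possible cross-check, the same mapping can be sketched with <code>%in%</code>, which treats missing values as non-matches and never passes through character-to-numeric coercion. This is only an illustrative sketch: the helper name <code>recode_smoking</code> and the toy input vector are hypothetical and are not part of the original script or the dbGaP data.

<syntaxhighlight lang="rsplus">
### Hypothetical helper (not in the original script): map a smokingStatus
### vector to smoking initiation (si) and smoking cessation (sc) codes.
recode_smoking <- function(status) {
  ### si: 1 = never smoker (NEVER or NO), 2 = ever smoker (CURRENT or FORMER)
  si <- rep(NA_real_, length(status))
  si[status %in% c("NEVER", "NO")] <- 1
  si[status %in% c("CURRENT", "FORMER")] <- 2

  ### sc: 1 = former smoker, 2 = current smoker; never smokers stay NA
  sc <- rep(NA_real_, length(status))
  sc[status %in% "FORMER"] <- 1
  sc[status %in% "CURRENT"] <- 2

  data.frame(si = si, sc = sc)
}

### Toy input, made up for illustration only
toy <- c("CURRENT", "FORMER", "NEVER", "NO", "UNKNOWN", "", NA)
recoded <- recode_smoking(toy)

### Cross-tabulate raw levels against the codes to confirm the mapping
table(toy, recoded$si, useNA = "ifany")
table(toy, recoded$sc, useNA = "ifany")
</syntaxhighlight>

Running <code>recode_smoking(pheno$smokingStatus)</code> and comparing the result against <code>si</code> and <code>sc</code> would be one way to confirm that the two approaches agree on the real data.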