xlink is an R package, available on GitHub, that implements a unified partial likelihood approach for X-chromosome association analysis of time-to-event, continuous, and binary outcomes. Expression on the X-chromosome can follow one of three biological processes: X-chromosome inactivation (XCI), escape from X-chromosome inactivation (XCI-E), and skewed X-chromosome inactivation (XCI-S). Although the X-chromosome is covered by various predesigned genetic variation chip platforms, it has generally been excluded from most genome-wide association study analyses, most likely because there is no standardized method for handling X-chromosomal genotype data. To analyze X-linked genetic association for time-to-event outcomes when the actual process is unknown, we propose a unified approach that maximizes the partial likelihood over all of the potential biological processes. The proposed method can be used to infer the true biological process and to derive unbiased estimates of the genetic association parameters.
Xu, W., & Hao, M. (2018). A unified partial likelihood approach for X-chromosome association on time-to-event outcomes. Genetic Epidemiology, 42(1), 80-94.

Han, D., Hao, M., Qu, L., & Xu, W. (2019). A novel model for the X-chromosome inactivation association on survival data. Statistical Methods in Medical Research.
You can install xlink from [GitHub](https://github.com/qiuanzhu/xlink).
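A typical way to install an R package from a GitHub repository is sketched below. It assumes the `remotes` package as the installer helper, which the text above does not name; `devtools::install_github()` accepts the same repository string.

```r
# Assumption: remotes is used as the installer helper (not named in this README).
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# Install xlink from the GitHub repository linked above, then load it.
remotes::install_github("qiuanzhu/xlink")
library(xlink)
```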
# 3. Examples
The following examples use the "survival" model. The same call also applies with model = "linear" for a continuous response and with model = "binary" for a logistic regression model.

The sample data contain 10 SNPs and 4 clinical covariates (gender, Age, Smoking, Treatment), together with the survival status OS and the survival time OS_time:
ID | OS | OS_time | gender | Age | Smoking | Treatment | snp_1 | snp_2 | snp_3 | snp_4 | snp_5 | snp_6 | snp_7 | snp_8 | snp_9 | snp_10 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0 | 0.0335 | 1 | 44.3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 1 | 0.0424 | 1 | 76.9 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
3 | 0 | 0.6435 | 1 | 53.7 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
4 | 1 | 0.3548 | 0 | 63.1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
5 | 1 | 0.0306 | 0 | 29.2 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
6 | 1 | 0.3050 | 1 | 77.5 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
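To make the expected layout concrete, the sketch below rebuilds the six rows shown above as a data frame and checks that the columns used in the later calls are present. The name `toy` is illustrative, only the first three SNP columns are copied to keep it short, and the column roles are inferred from the argument names in the examples rather than from the package documentation.

```r
# Illustrative only: the six rows shown above (first three SNPs), not the full sample data.
toy <- read.table(header = TRUE, text = "
ID OS OS_time gender Age  Smoking Treatment snp_1 snp_2 snp_3
1  0  0.0335  1      44.3 0       0         0     0     0
2  1  0.0424  1      76.9 1       0         0     0     1
3  0  0.6435  1      53.7 0       1         0     0     0
4  1  0.3548  0      63.1 0       1         0     0     1
5  1  0.0306  0      29.2 0       0         0     1     0
6  1  0.3050  1      77.5 0       1         1     0     0
")

# Columns referenced by name in the xlink_fit calls of this section.
required <- c("OS", "OS_time", "gender", "Age", "Smoking", "Treatment",
              "snp_1", "snp_2", "snp_3")
missing_cols <- setdiff(required, names(toy))
stopifnot(length(missing_cols) == 0)

str(toy)
```

Six rows are far too few to fit a model; the point is only that outcome, gender, covariates, and SNPs are all columns of one data frame and are passed to the fitting function by name.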
With the model type set to "XCI" and the MAF threshold MAF_v set to 0.05, the output for snp_1 reports hazard ratios, P values, and log-likelihood information:

    library(xlink)

    # Clinical covariates and the SNPs to test
    Covars <- c("Age", "Smoking", "Treatment")
    SNPs <- c("snp_1", "snp_2")

    # Fit the survival model under the XCI assumption
    output <- xlink_fit(os = "OS", ostime = "OS_time", snps = SNPs, gender = "gender",
                        covars = Covars, option = list(type = "XCI", MAF_v = 0.05),
                        model = "survival", data = Rdata)

Variable | Hazard Ratio | Confidence Interval (95%) | P Value | MAF |
---|---|---|---|---|
snp_1 | 1.6392 | [1.3895,1.9339] | 0.0000000 | 0.2062 |
gender | 0.9529 | [0.7557,1.2015] | 0.6834041 | NA |
Age | 1.0228 | [1.0155,1.0301] | 0.0000000 | NA |
Smoking | 1.3001 | [1.0172,1.6617] | 0.0360414 | NA |
Treatment | 1.2356 | [0.9781,1.561] | 0.0760446 | NA |
Baseline | Full model | Loglik ratio |
---|---|---|
-1493.808 | -1478.774 | 15.03336 |
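As a reading aid for the hazard ratio table above, the sketch below recovers the log hazard ratio, its standard error, and a two-sided P value from the reported hazard ratio and 95% interval. It assumes the interval is a Wald interval on the log scale, which this README does not state; the numbers are copied from the snp_1 and Smoking rows.

```r
# Values copied from the snp_1 and Smoking rows of the XCI table above.
hr    <- c(snp_1 = 1.6392, Smoking = 1.3001)
lower <- c(snp_1 = 1.3895, Smoking = 1.0172)
upper <- c(snp_1 = 1.9339, Smoking = 1.6617)

# Assumption: the interval is exp(beta +/- z_crit * se), a Wald interval.
z_crit <- qnorm(0.975)
beta <- log(hr)                                   # log hazard ratio
se   <- (log(upper) - log(lower)) / (2 * z_crit)  # standard error implied by the interval
z    <- beta / se                                 # Wald statistic
p    <- 2 * pnorm(-abs(z))                        # two-sided P value

round(data.frame(beta, se, z, p), 4)
```

Under this assumption the Smoking P value comes out at about 0.036, in line with the reported 0.0360414, and the snp_1 P value is small enough to print as zero at seven decimal places.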
With the model type set to "all" and the MAF threshold MAF_v set to 0.05, the output for snp_1 reports hazard ratios, P values, and log-likelihood information for the XCI-E, XCI, and XCI-S models in turn:

    Covars <- c("Age", "Smoking", "Treatment")
    SNPs <- c("snp_1", "snp_2")

    # Fit all three models (XCI-E, XCI, XCI-S) for each SNP
    output <- xlink_fit(os = "OS", ostime = "OS_time", snps = SNPs, gender = "gender",
                        covars = Covars, option = list(type = "all", MAF_v = 0.05),
                        model = "survival", data = Rdata)

For the XCI-E model, the snp_1 output with hazard ratios, P values, and log-likelihood information is:

Variable | Hazard Ratio | Confidence Interval (95%) | P Value | MAF |
---|---|---|---|---|
snp_1 | 1.8293 | [1.4411,2.322] | 0.0000007 | 0.2062 |
gender | 1.0885 | [0.8477,1.3978] | 0.5061374 | NA |
Age | 1.0232 | [1.0157,1.0306] | 0.0000000 | NA |
Smoking | 1.3058 | [1.0211,1.67] | 0.0334504 | NA |
Treatment | 1.2063 | [0.9543,1.5249] | 0.1167569 | NA |
Baseline | Full model | Loglik ratio |
---|---|---|
-1493.808 | -1482.332 | 11.47576 |
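The "Loglik ratio" column can be checked by hand. The sketch below uses the XCI-E values from the table above and shows that the reported number matches the difference between the full-model and baseline log-likelihoods, up to rounding of the displayed values. The doubled value is the conventional likelihood-ratio statistic, added here for comparison only; it is not a column of the output.

```r
# Values copied from the XCI-E log-likelihood table above.
loglik_baseline <- -1493.808
loglik_full     <- -1482.332

# The reported "Loglik ratio" (11.47576) matches this difference up to rounding.
loglik_diff <- loglik_full - loglik_baseline

# Conventional likelihood-ratio statistic: twice the log-likelihood difference.
lr_stat <- 2 * loglik_diff

c(loglik_diff = loglik_diff, lr_stat = lr_stat)
```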
For the XCI model, the snp_1 output with hazard ratios, P values, and log-likelihood information is:

Variable | Hazard Ratio | Confidence Interval (95%) | P Value | MAF |
---|---|---|---|---|
snp_1 | 1.6392 | [1.3895,1.9339] | 0.0000000 | 0.2062 |
gender | 0.9529 | [0.7557,1.2015] | 0.6834041 | NA |
Age | 1.0228 | [1.0155,1.0301] | 0.0000000 | NA |
Smoking | 1.3001 | [1.0172,1.6617] | 0.0360414 | NA |
Treatment | 1.2356 | [0.9781,1.561] | 0.0760446 | NA |
Baseline | Full model | Loglik ratio |
---|---|---|
-1493.808 | -1478.774 | 15.03336 |
For the XCI-S model, the snp_1 output with hazard ratios, P values, log-likelihood information, and the gamma estimate is:

Variable | Hazard Ratio | Confidence Interval (95%) | P Value | MAF |
---|---|---|---|---|
snp_1 | 1.6596 | [1.4031,1.9629] | 0.0000000 | 0.2062 |
gender | 0.9277 | [0.7361,1.1692] | 0.5250786 | NA |
Age | 1.0228 | [1.0155,1.0301] | 0.0000000 | NA |
Smoking | 1.2989 | [1.0162,1.6602] | 0.0367447 | NA |
Treatment | 1.2374 | [0.9795,1.5632] | 0.0741144 | NA |
Baseline | Full model | Loglik ratio |
---|---|---|
-1493.808 | -1478.709 | 15.09833 |
Gamma |
---|
0.8707407 |
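One way to read the gamma estimate is through the genotype coding usually associated with the three processes. The sketch below is an assumption based on the approach described in the cited papers, not code taken from xlink: males are coded 0/2 under XCI and XCI-S and 0/1 under XCI-E, females are coded 0/1/2, and under XCI-S a heterozygous female is coded as gamma, a value between 0 and 2. The helper `code_genotype` and its arguments are hypothetical.

```r
# Hypothetical helper, not part of xlink: assumed genotype coding per process.
code_genotype <- function(minor_alleles, is_male,
                          model = c("XCI", "XCI-E", "XCI-S"), gamma = 1) {
  model <- match.arg(model)
  coded <- as.numeric(minor_alleles)
  if (model %in% c("XCI", "XCI-S")) {
    # Males carry one X copy; under inactivation it is weighted like two copies.
    coded[is_male] <- 2 * coded[is_male]
  }
  if (model == "XCI-S") {
    # Heterozygous females take the skewness parameter gamma (assumed in [0, 2]).
    het_female <- !is_male & minor_alleles == 1
    coded[het_female] <- gamma
  }
  coded
}

# Two males (0 or 1 minor allele) and three females (0, 1, or 2 minor alleles).
minor_alleles <- c(0, 1, 0, 1, 2)
is_male       <- c(TRUE, TRUE, FALSE, FALSE, FALSE)

code_genotype(minor_alleles, is_male, "XCI-E")
code_genotype(minor_alleles, is_male, "XCI")
code_genotype(minor_alleles, is_male, "XCI-S", gamma = 0.8707407)
```

Under this coding, gamma = 1 reproduces the XCI coding, so an estimate of 0.87 for snp_1 sits close to the XCI model.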
The best model for snp_1 among XCI-E, XCI, and XCI-S, as selected by AIC, is:
Best model by AIC |
---|
XCI |
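The AIC comparison can be reproduced from the full-model log-likelihoods reported above. The parameter counts in this sketch are an assumption: five regression coefficients per model (snp_1, gender, Age, Smoking, Treatment), plus one extra parameter, gamma, for XCI-S. How xlink counts parameters internally is not stated here.

```r
# Full-model log-likelihoods copied from the three tables above.
loglik <- c("XCI-E" = -1482.332, "XCI" = -1478.774, "XCI-S" = -1478.709)

# Assumption: five coefficients per model, plus gamma for XCI-S.
n_params <- c("XCI-E" = 5, "XCI" = 5, "XCI-S" = 6)

# AIC = 2k - 2 * log-likelihood; the smallest value wins.
aic <- 2 * n_params - 2 * loglik
aic

names(which.min(aic))
```

With these counts XCI has the lowest AIC, which agrees with the table above: XCI-S fits slightly better, but not by enough to offset its extra parameter.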
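Passing the fitted result to select_output with a P value threshold pv_thold gives the following summary:

    Covars <- c("Age", "Smoking", "Treatment")
    SNPs <- c("snp_1", "snp_2", "snp_3")

    # Fit all three models for each SNP
    result <- xlink_fit(os = "OS", ostime = "OS_time", snps = SNPs, gender = "gender",
                        covars = Covars, option = list(type = "all", MAF_v = 0.05),
                        model = "survival", data = Rdata)

    # Summarize the fit using a P value threshold of 10^-5
    select_output(input = result, pv_thold = 10^-5)
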
SNP | Hazard Ratio | Confidence Interval (95%) | P Value | MAF | Best model | Gamma |
---|---|---|---|---|---|---|
snp_1 | 1.6392 | [1.3895,1.9339] | 0 | 0.2062 | XCI | NA |
snp_3 | 1.5596 | [1.3661,1.7805] | 0 | 0.3638 | XCI-S | 1.538163 |