Genotyping, characterization, and imputation of known and novel CYP2A6 structural variants using SNP array data

Langlois, Alec W R and El-Boraie, Ahmed and Pouget, Jennie G and Cox, Lisa Sanderson and Ahluwalia, Jasjit S and Fukunaga, Koya and Mushiroda, Taisei and Knight, Jo and Chenoweth, Meghan J and Tyndale, Rachel F (2023) Genotyping, characterization, and imputation of known and novel CYP2A6 structural variants using SNP array data. Journal of Human Genetics, 68. pp. 533-541. ISSN 1434-5161

Text (CYP2A6SV_text_clean_mar22_AL)
CYP2A6SV_text_clean_mar22_AL.pdf - Accepted Version
Available under License Creative Commons Attribution.


CYP2A6 metabolically inactivates nicotine. Faster CYP2A6 activity is associated with heavier smoking and higher lung cancer risk. The CYP2A6 gene is polymorphic, including functional structural variants (SVs) such as gene deletions (CYP2A6*4), duplications (CYP2A6*1×2), and hybrids with the CYP2A7 pseudogene (CYP2A6*12, CYP2A6*34). SVs are challenging to genotype due to their complex genetic architecture. Our aims were to develop a reliable protocol for SV genotyping, functionally phenotype known and novel SVs, and investigate the feasibility of CYP2A6 SV imputation from SNP array data in two ancestry populations. European- (EUR; n = 935) and African- (AFR; n = 964) ancestry individuals from smoking cessation trials were genotyped for SNPs using an Illumina array and for CYP2A6 SVs using TaqMan copy number (CN) assays. SV-specific PCR amplification and Sanger sequencing were used to characterize a novel SV. Individuals with SVs were phenotyped using the nicotine metabolite ratio, a biomarker of CYP2A6 activity. SV diplotype and SNP array data were integrated and phased to generate ancestry-specific SV reference panels. Leave-one-out cross-validation was used to investigate the feasibility of CYP2A6 SV imputation. A minimal protocol requiring three TaqMan CN assays for CYP2A6 SV genotyping was developed, and known SV associations with activity were replicated. The first domain swap CYP2A6-CYP2A7 hybrid SV, CYP2A6*53, was identified, sequenced, and associated with lower CYP2A6 activity. In both EURs and AFRs, most SV alleles were identified using imputation (>70% and >60%, respectively); importantly, false positive rates were <1%. These results confirm that CYP2A6 SV imputation can identify most SV alleles, including a novel SV.
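
The abstract reports imputation performance as the share of SV alleles identified and a false positive rate. As a minimal sketch of how such allele-level rates could be tallied once held-out individuals have been imputed, the Python below compares assay-genotyped diplotypes with imputed ones. The function name, the allele labels, and the exact rate definitions are illustrative assumptions, not the authors' pipeline; the phasing and imputation steps themselves are not shown.

```python
from collections import Counter


def sv_imputation_rates(truth, imputed, sv_alleles):
    """Tally allele-level imputation rates (hypothetical helper).

    truth, imputed: equal-length sequences of diplotypes, each a pair of
        allele labels such as ("*1", "*4"); order within a pair is ignored.
    sv_alleles: set of labels counted as structural variants.

    Assumed definitions (not taken from the paper):
        share identified    = correctly imputed SV alleles / true SV alleles
        false positive rate = unmatched imputed SV calls / true non-SV alleles
    """
    tp = fp = n_sv = n_non_sv = 0
    for true_pair, imp_pair in zip(truth, imputed):
        true_sv = Counter(a for a in true_pair if a in sv_alleles)
        imp_sv = Counter(a for a in imp_pair if a in sv_alleles)
        # Multiset intersection: SV alleles present in both truth and imputation.
        matched = sum((true_sv & imp_sv).values())
        tp += matched
        fp += sum(imp_sv.values()) - matched
        n_sv += sum(true_sv.values())
        n_non_sv += len(true_pair) - sum(true_sv.values())
    identified = tp / n_sv if n_sv else float("nan")
    false_positive_rate = fp / n_non_sv if n_non_sv else float("nan")
    return identified, false_positive_rate


# Toy data, invented purely for illustration.
truth = [("*1", "*4"), ("*1", "*1"), ("*12", "*4"), ("*1", "*1")]
imputed = [("*1", "*4"), ("*1", "*1"), ("*1", "*4"), ("*1", "*4")]
identified, fpr = sv_imputation_rates(truth, imputed, {"*4", "*12"})
```

Note that this sketch counts an SV imputed as a different SV type as both a miss and a false positive, which is one of several reasonable conventions.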

Item Type:
Journal Article
Journal or Publication Title:
Journal of Human Genetics
Uncontrolled Keywords:
Research Output Funding/yes_externally_funded
yes - externally funded; genetics; genetics (clinical)
ID Code:
Deposited By:
Deposited On:
21 Apr 2023 14:00
Last Modified:
29 May 2024 01:46