T1K for PGx #23

nbiesot · 2023-11-20T11:02:47Z

Hi,

I am trying to use T1K for PGx, following the step-by-step plan described in the vcf_database. Unfortunately, I am not getting the expected results for my samples (for example, for the CYP2D6 gene I get *4/*86 as output, where I expect *1/*4).

This is the case for both the reference file I created for CYP2D6 according to the step-by-step plan and the reference files in the cyp2d6_idx folder on Git.

What could be possible reasons for not getting the expected outputs?

(The data I am using is from the Genetic Testing Reference Material Coordination Program (GeT-RM). These reference materials contain mutations of clinical importance that have been confirmed by multiple volunteer laboratories using different testing platforms, including for the CYP2D6 gene.)

mourisl · 2023-11-20T16:13:13Z

Do you mean you did not get CYP2D6*1 series in the output? Could you please share the .dat generated from the procedure? Thank you.

nbiesot · 2023-11-21T07:50:22Z

Yes, indeed.
cyp2d6.txt

(I couldn't upload the .dat file; the format was not supported.)

mourisl · 2023-11-23T02:43:29Z

The txt file looks fine, and I can generate the reference FASTA files containing CYP2D6*1 or CYP2D6*1.XXX. So are *4/*86 and *1/*4 the genotyping results?

One possible reason is that CYP2D6 is highly homologous to CYP2D7, and you may need to put in some CYP2D7 gene sequences in the reference.

nbiesot · 2023-11-23T07:36:53Z

Thank you for looking into the file!
CYP2D6 is not the only gene I have looked at; I have also examined CYP2C9, CYP2C19, CYP3A5, and CYP4F2. For these genes as well, I do not get the expected output for the 16 samples I tested. If the .dat file looks good, is there another possibility for why I am not getting the expected output for these other genes?

mourisl · 2023-11-23T14:04:00Z

Can you show me your running commands and your genotype.tsv file? Is your data RNA-seq, or from another sequencing platform?

nbiesot · 2023-11-23T14:18:46Z

The WGS files are available at: https://www.ebi.ac.uk/ena/browser/view/ERR1955327
The command I am using is: run-t1k -f T1K/vcf_database/cyp2d6_idx/cyp2d6_dna_seq.fa -1 ERR1955327_1.fastq.gz -2 ERR1955327_2.fastq.gz --od ERR1955327/cyp2d6 --alleleDigitUnits 1 --alleleDelimiter . -t 16
The output that results from this is:
T1K_ERR1955327_1_genotype.ods

Thank you very much for your effort.

mourisl · 2023-11-23T15:04:09Z

I would recommend concatenating all the dna_seq.fa files from the CYP genes into one combined FASTA file. This may help resolve reads that align to multiple CYP genes. Another parameter to tune is the "-s" option; the default of 0.8 might be too lenient. You may consider trying values like 0.9 and 0.97.
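
A minimal sketch of that suggestion, under stated assumptions: the helper names `combine_fasta` and `run_sweep`, the combined file name, and the output directory layout are hypothetical, and the glob in the usage comment assumes the other CYP genes follow the same `<gene>_idx/<gene>_dna_seq.fa` layout as cyp2d6_idx. The run-t1k options are the ones already used in this thread, plus "-s".

```shell
#!/bin/sh
set -eu

# Concatenate several reference FASTA files into one combined file.
# `awk 1` prints every line followed by a newline, so a file that lacks a
# trailing newline cannot fuse its last sequence line with the next header.
# Usage: combine_fasta OUT.fa IN1.fa [IN2.fa ...]
combine_fasta() {
    out=$1
    shift
    awk 1 "$@" > "$out"
}

# Run T1K once per similarity threshold so the genotype calls can be compared.
# The thresholds are the default (0.8) and the two values suggested above.
# Usage: run_sweep REF.fa READS_1.fastq.gz READS_2.fastq.gz OUT_ROOT
run_sweep() {
    ref=$1
    r1=$2
    r2=$3
    out_root=$4
    for s in 0.8 0.9 0.97; do
        run-t1k -f "$ref" -1 "$r1" -2 "$r2" \
            --od "$out_root/s$s" -s "$s" \
            --alleleDigitUnits 1 --alleleDelimiter . -t 16
    done
}

# Example invocation (paths are assumptions, adjust to your checkout):
#   combine_fasta cyp_all_dna_seq.fa T1K/vcf_database/*_idx/*_dna_seq.fa
#   run_sweep cyp_all_dna_seq.fa ERR1955327_1.fastq.gz ERR1955327_2.fastq.gz ERR1955327/cyp_all
```

Comparing the genotype files written under each `s<value>` directory should show whether a stricter threshold changes the call, for example for CYP2D6.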
