-
Notifications
You must be signed in to change notification settings - Fork 602
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Miscalling of SNPs and GQ value of 0 #9086
Comments
Have you tried with a recent version such as 4.6.1.0? 4.1 is quite old and it has been a long time since then. |
Hello, |
Hello, We tried with the new version 4.6.1.0 using haplotyepcaller and GenomicsDBImport and got the same results. It still failed to call some of the heterozygous genotype at high read depth (15x), like we have showed in the previous figure. This time, it assigned the site as ./.. Thank you for your help, Samuel |
Can you post the GVCF entry of that particular site for that particular sample alone? It is possible that the site could have been pruned during reassembly or could have fallen below Standard Call Confidence threshold which is 30 for GenotypeGVCFs therefore emitted as NO_CALL. |
Yes. Please note that this is another site than the one I posted first, because we have plenty of miscallings and only ran the new version for some of our species. GVCF entry : Thanks! |
So GQ is 15 which falls way below the Standard Calling Confidence Threshold of 30. |
Sorry yes my bad. GQ is 0. This is most likely a assembly failure in this region.
parameters as well to get a better pileup detection from HaplotypeCaller. This method has a downside of not being compatible with GVCF output so it is up to you to decide how to proceed. I hope this helps. Regards. |
Thank you for digging through, we appreciate a lot! We will try those suggestions right away. Thanks again for the help! Best regards, Samuel |
Problem report
Affected tool(s) or class(es)
HaplotypeCaller
Affected version : 4.1
Description
We used gatk HaplotypeCaller to find SNPs in our samples, following this command line:
gatk HaplotypeCaller --java-options "-Xmx20g -XX:ParallelGCThreads=10" -R ref_genome.fa -I sample.bam -ERC GVCF -O output_sample.g.vcf.gz
Afterward, we also used GenomicsDBImport, GenotypeGVCFs and VariantFiltration.
However, we noticed quickly some genotypes had a GQ tag equal to 0 despite a very good DP. Looking at the PL, it seemed like gatk could not decide between a homozygous and a heterozygous site, but still called a homozygous site and displayed all reads mapping to the reference allele. Looking at the read mapping in IGV, we noticed that those sites were in fact heterozygous, and therefore miscalled.
Here is an example (we could provide more if needed) :
<style> </style>Site :
Here, genotype of the individual ZCh210C is said to be homozygous (GT 0/0) and have all reads mapping the reference allele (AD 35:0). The GQ is 0, and PL is 0 for both homozygous and heterozygous genotypes.
The reads in IGV for this individual map as follows:
Which clearly shows reads mapping on the alternative allele.
Why is gatk having this behavior?
Thank you for your anwsers,
Best,
Samuel
The text was updated successfully, but these errors were encountered: