HiC-Pro does not perform deduplication even when RMDUP=1 is set in the config file unless the 'merge-persample' option is used, but this step is not run by default and the option to do so is not documented on the GitHub page
#307
Closed
gbonora opened this issue
Jan 27, 2020
· 2 comments
I recently realized that HiC-Pro (v2.11.1) does not deduplicate read pairs even when RMDUP=1 is set in the config file unless the 'merge-persample' option is also used. However, the 'merge-persample' analysis step is not run by default, and the option to run it is not documented in the 'How to use it ?' section on the GitHub page, although it is described in HiC-Pro's help (see attached slide).
I think it would be helpful to describe the 'merge-persample' analysis step option under the 'How to use it ?' section on your GitHub page and to make it clear that this analysis step is necessary for deduplication.
Hi,
Indeed, duplicate removal is performed at the merge-persample step of the pipeline.
But you're right, there is a mistake in the help page.
I will change that for the next version.
I thought I was having this same issue because the Valid_interaction_pairs number in the .Rstat file appears to include duplicate reads in the total. However, the .mergestat file reports both valid_interaction and valid_interaction_rmdup totals, which reflect the counts before and after duplicates are removed. If you run wc -l on the .allValidPairs file, the line count should match the valid_interaction_rmdup total if duplicates were removed. I just ran with -s merge_persample to double-check, and the resulting .allValidPairs file has the same number of reads as both the rmdup total and the original .allValidPairs file.
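The check described above can be scripted. The following is only a minimal sketch, not part of HiC-Pro: it assumes the .mergestat file holds one whitespace-separated `name count` pair per line (the layout is an assumption here, so inspect your own file first), and the function names are hypothetical.

```python
import sys
from pathlib import Path


def read_mergestat(path):
    """Parse 'name count' lines into a dict (assumed .mergestat layout)."""
    stats = {}
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        # Skip blank lines, comments, and anything that is not 'name count'.
        if len(fields) != 2 or fields[0].startswith("#"):
            continue
        try:
            stats[fields[0]] = int(fields[1])
        except ValueError:
            continue
    return stats


def count_lines(path):
    """Count lines; agrees with `wc -l` when the file ends with a newline."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def duplicates_removed(mergestat_path, valid_pairs_path):
    """True if the .allValidPairs line count equals valid_interaction_rmdup."""
    stats = read_mergestat(mergestat_path)
    return count_lines(valid_pairs_path) == stats["valid_interaction_rmdup"]


if __name__ == "__main__":
    # Usage: python check_rmdup.py sample.mergestat sample.allValidPairs
    mergestat, valid_pairs = sys.argv[1], sys.argv[2]
    stats = read_mergestat(mergestat)
    print("valid_interaction:      ", stats.get("valid_interaction"))
    print("valid_interaction_rmdup:", stats.get("valid_interaction_rmdup"))
    print("allValidPairs lines:    ", count_lines(valid_pairs))
    print("duplicates removed:     ", duplicates_removed(mergestat, valid_pairs))
```

If the last line prints `False`, the .allValidPairs file does not match the deduplicated total, which would be consistent with the behaviour reported in this issue.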
gb_20200123.pdf