names of R1/R2 #153

Closed
nservant opened this issue Apr 27, 2018 · 18 comments

Comments

@nservant
Owner

It seems that the read names in the R1 and R2 files are sometimes not exactly the same.
When that happens, HiC-Pro currently raises an error and stops.

@milkcookie

Do you have any suggestion for solving it? Or is there a version that avoids it?

@nservant
Owner Author

I have not fixed it yet.
Could you first check by looking at the first row of each of your fastq files?
If that is the case, I would suggest modifying the read names in your fastq files.

@milkcookie

OK, here are the names of the first sequence in the two files:
@E00492:288:HJLVLCCXY:2:1101:2991:1854 1:N:0:GGCTAC
@E00492:288:HJLVLCCXY:2:1101:2991:1854 2:N:0:GGCTAC
They contain '1' and '2' respectively.
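
To make the difference concrete, here is a minimal Python sketch that compares the two headers above with and without the trailing comment field. The helper name `read_id` is hypothetical and is not part of HiC-Pro; it assumes the identifier is everything before the first whitespace, optionally followed by an older-style `/1` or `/2` suffix.

```python
def read_id(header):
    """Return the bare read identifier from a FASTQ header line (hypothetical helper)."""
    name = header.strip()
    if name.startswith("@"):
        name = name[1:]
    # Drop the comment after the first whitespace, e.g. "1:N:0:GGCTAC".
    name = name.split(maxsplit=1)[0] if name else ""
    # Some older pipelines append "/1" or "/2" instead of a separate comment.
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


r1 = "@E00492:288:HJLVLCCXY:2:1101:2991:1854 1:N:0:GGCTAC"
r2 = "@E00492:288:HJLVLCCXY:2:1101:2991:1854 2:N:0:GGCTAC"

print(r1 == r2)                    # False: the full headers differ in the comment
print(read_id(r1) == read_id(r2))  # True: the identifiers themselves match
```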

@milkcookie

In the merged BAM file, here are the names:
E00492:288:HJLVLCCXY:2:1101:1763:5282
E00492:288:HJLVLCCXY:2:1101:1763:5282
I think that is OK.
But I ran this data successfully a few days ago; now it fails with a newly assembled genome.

@nservant
Owner Author

Indeed.
I would suggest first saving a copy of your files somewhere :)
Then something like
perl -pi -e 's/2:N/1:N/' will transform every 2:N into 1:N ...
Or simply remove the last part of the read name, starting at 1:N ...
N
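
A possible variant of the second option, as a Python sketch rather than a tested recipe: the one-liner above substitutes on every line, so it could in principle also alter a quality string that happens to contain `2:N`. The function below touches only header lines. It assumes strict 4-line FASTQ records; `strip_header_comments` and the file names in the usage comment are hypothetical.

```python
import io


def strip_header_comments(src, dst):
    """Copy FASTQ lines from src to dst, trimming each header to its read ID.

    Assumes 4 lines per record, so only lines 0, 4, 8, ... are headers.
    Sequence, '+' and quality lines are written back unchanged.
    """
    for i, line in enumerate(src):
        if i % 4 == 0:
            fields = line.split()
            line = (fields[0] if fields else "") + "\n"
        dst.write(line)


# Small in-memory demonstration; the quality line deliberately contains "2:N".
src = io.StringIO(
    "@E00492:288:HJLVLCCXY:2:1101:2991:1854 2:N:0:GGCTAC\nACGT\n+\nF2:N\n"
)
dst = io.StringIO()
strip_header_comments(src, dst)
print(dst.getvalue(), end="")

# With real files (hypothetical names), writing to a new file keeps the original intact:
# with open("sample_R2.fastq") as src, open("sample_R2.renamed.fastq", "w") as dst:
#     strip_header_comments(src, dst)
```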

@nservant
Owner Author

By the way, can I ask what type of sequencer it was?
So far, I do not know in which cases the sequencer adds this flag.

@milkcookie

The SAM merge terminates after processing part of the two SAM files because of this problem.

@milkcookie

It was sequenced by a company, in order to assemble the genome.

@nservant
Owner Author

Yes, because the merge expects the reads in the R1 and R2 files to be in the same order.
To verify that, it checks whether the read names are the same, which is not the case here.
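
The kind of lockstep check described here can be sketched as follows. This is only an illustration and not HiC-Pro's actual merge code; `first_mismatch` is a hypothetical name, and the inputs are assumed to be two sequences of read names in file order.

```python
from itertools import zip_longest


def first_mismatch(names_r1, names_r2):
    """Return (index, name_r1, name_r2) at the first position where the two
    name streams disagree, or None if they agree to the end.

    If one stream ends early, its missing name is reported as None.
    """
    for i, (n1, n2) in enumerate(zip_longest(names_r1, names_r2)):
        if n1 != n2:
            return i, n1, n2
    return None


print(first_mismatch(["a", "b", "c"], ["a", "b", "c"]))  # None
print(first_mismatch(["a", "b", "c"], ["a", "c"]))       # (1, 'b', 'c')
print(first_mismatch(["a", "b"], ["a"]))                 # (1, 'b', None)
```

Note that a single missing read in one file shifts every later read, so a check like this fails at the first gap even when all the names are otherwise well formed.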

@milkcookie

I think it is a NovaSeq or a HiSeq 2500, but I am not sure.
I can run successfully with other data sequenced by the same company, in the same format.
I suggest adding some code to skip the offending lines in the BAM.

@milkcookie

Hi,
I found the problem: the number of reads in the two fq files is different. Any suggestion for solving it?

@milkcookie

Hi,
I ran some tests today. Does this problem occur whenever the fastq files contain different numbers of reads?
How can I keep this difference out of the BAM files produced by bowtie2? Is there an argument I can add to skip such reads?

@nservant
Owner Author

What do you mean? The fastq files must have the same number of lines, as this is PE sequencing.
Then you just have to fix the issue with the read names. That should not affect the number of lines in your fastq files.
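
A quick way to test that expectation is to count the records in each file. The sketch below assumes strict 4-line FASTQ records; `count_records` and the paths in the usage comment are hypothetical.

```python
def count_records(lines):
    """Count FASTQ records in an iterable of lines, assuming 4 lines per record."""
    n_lines = sum(1 for _ in lines)
    if n_lines % 4 != 0:
        raise ValueError(f"{n_lines} lines is not a multiple of 4; file may be truncated")
    return n_lines // 4


# In-memory demonstration: R1 holds two reads, R2 only one.
r1_lines = ["@a 1:N:0:X", "ACGT", "+", "FFFF", "@b 1:N:0:X", "TTGA", "+", "FFFF"]
r2_lines = ["@a 2:N:0:X", "GGCA", "+", "FFFF"]
print(count_records(r1_lines), count_records(r2_lines))  # 2 1

# With real files (hypothetical paths):
# with open("sample_R1.fastq") as f1, open("sample_R2.fastq") as f2:
#     print(count_records(f1), count_records(f2))
```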

@nservant
Owner Author

Hi,
I think I fixed it.
To test it, would you mind sending me two small SAM files with a few reads?
Thanks

@milkcookie

Hi,
I have some clean PE data in which the two files contain different numbers of reads.
I don't know why that happened.
Can you give me some suggestions?
Does the problem come from the different number of reads rather than from the sequence names?

@nservant
Owner Author

Difficult to say. What do you mean by 'clean'? Raw PE data should always be paired, with the same number of reads in the two files. Do you have access to the files from before cleaning?

@milkcookie

Yes, I have the raw data.
The clean data have had adapters trimmed and low-quality reads discarded. I don't know exactly what happened, because the company did it.
So I would like to know whether there is a tool to correct the reads so that R1 and R2 contain the same number of reads.
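
For illustration, re-pairing can be sketched in Python by keeping only the reads whose identifier appears in both files. This is a sketch under assumptions, not a recommendation from the thread: it assumes 4-line records and headers like those shown earlier, it holds all of R2 in memory (so it suits small test files only), and `parse_fastq` and `repair_pairs` are hypothetical names. Dedicated utilities such as `repair.sh` from BBTools or `seqkit pair` target full-size data.

```python
def parse_fastq(lines):
    """Yield (read_id, record) for each 4-line FASTQ record.

    read_id is the header without '@' and without the comment after the
    first whitespace; record is the list of the 4 original lines.
    """
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            yield record[0][1:].split()[0], record
            record = []


def repair_pairs(r1_lines, r2_lines):
    """Return (kept_r1, kept_r2): records present in both inputs, in R1 order."""
    r2_by_id = dict(parse_fastq(r2_lines))
    kept_r1, kept_r2 = [], []
    for rid, rec in parse_fastq(r1_lines):
        mate = r2_by_id.get(rid)
        if mate is not None:  # drop R1 reads whose mate was filtered out
            kept_r1.append(rec)
            kept_r2.append(mate)
    return kept_r1, kept_r2


# Read "b" lost its mate during cleaning, so it is dropped from R1.
r1 = ["@a 1:N:0:X", "AAAA", "+", "FFFF",
      "@b 1:N:0:X", "CCCC", "+", "FFFF",
      "@c 1:N:0:X", "GGGG", "+", "FFFF"]
r2 = ["@a 2:N:0:X", "TTTT", "+", "FFFF",
      "@c 2:N:0:X", "ACGT", "+", "FFFF"]
kept_r1, kept_r2 = repair_pairs(r1, r2)
print([rec[0] for rec in kept_r1])  # ['@a 1:N:0:X', '@c 1:N:0:X']
print([rec[0] for rec in kept_r2])  # ['@a 2:N:0:X', '@c 2:N:0:X']
```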

@nservant
Owner Author

This should now be fixed in devel (v2.11.0).
