The EMU pipeline requires substantial computational resources, coming primarily from the initial mapping step based on minimap2.
Based on some rough tests on 10 cores of my laptop (12th Gen Intel(R) Core(TM) i7-1255U), I measured:
Ca 375 CPU core ms per read (median read length of ca 1400 bp)
Ca 2.5 CPU core minutes per "chunked" fastq file with 4000 reads from the instrument
A typical sample with hundreds of such files seems to easily take 10-20 CPU core hours (note that spreading the work across multiple cores will cut the wall-clock time significantly).
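To make the per-sample estimate concrete, here is a minimal sketch of the arithmetic. It assumes the 2.5 CPU core minutes per chunked file measured above, a hypothetical sample of 300 files standing in for "hundreds", and perfect scaling across cores (which real runs will not quite reach). The function name and defaults are illustrative only.

```python
def estimate_runtime(n_files, cpu_minutes_per_file=2.5, cores=10):
    """Return (cpu_core_hours, wall_clock_hours) for one sample.

    Assumes perfectly linear scaling across cores, which is optimistic.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    # Total CPU time: files times per-file cost, converted to hours.
    cpu_core_hours = n_files * cpu_minutes_per_file / 60.0
    # Wall-clock time if the work is spread evenly over all cores.
    wall_clock_hours = cpu_core_hours / cores
    return cpu_core_hours, wall_clock_hours


# Hypothetical sample of 300 chunked fastq files.
cpu_h, wall_h = estimate_runtime(n_files=300)
print(f"{cpu_h:.1f} CPU core hours, about {wall_h:.2f} h wall-clock on 10 cores")
```

With these assumptions, 300 files come to 12.5 CPU core hours, which falls inside the 10-20 hour range above, and roughly 1.25 hours of wall-clock time on 10 cores.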
Creating this issue to summarize ideas and things we've tried in order to optimize the resource usage of the pipeline.
Ideas
Run only on the forward strand in the database
Pass this flag to emu abundance:
--mm2-forward-only force minimap2 to consider the forward transcript strand only
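As a rough illustration of how the flag could be wired into a wrapper script, the sketch below assembles an emu abundance command line. The helper name, the input file name and the use of --threads are assumptions made for illustration; check emu abundance --help for the options your installed version actually supports.

```python
import shlex


def build_emu_command(fastq_path, threads=10, forward_only=True):
    """Build an argument list for `emu abundance` (hypothetical helper)."""
    # --threads is assumed here to control the minimap2 thread count.
    cmd = ["emu", "abundance", "--threads", str(threads)]
    if forward_only:
        # The flag discussed above: map against the forward strand only.
        cmd.append("--mm2-forward-only")
    cmd.append(fastq_path)
    return cmd


# Hypothetical input file name, used only to show the resulting command.
cmd = build_emu_command("sample_chunk.fastq")
print(shlex.join(cmd))
```

To actually run it, the list could be passed to subprocess.run(cmd, check=True), assuming emu is installed and its database is configured.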
@ryanjameskennedy and @LordRust feel free to fill in here, as I understand you've been looking at this too!