Carolyn's feedback #2

Open
0x644BE25 opened this issue Sep 28, 2018 · 0 comments

0x644BE25 commented Sep 28, 2018

I think your idea of pre-processing index reads to correct a certain number of mismatches is interesting! You could potentially avoid throwing out a lot of reads that way, but you'd need to ensure that no index read ever gets "corrected" to the wrong index. For example, you could add a check that an ambiguous index read is within the allowed number of mismatches of exactly one known index, and leave it uncorrected otherwise.
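To make the "only one right index" idea concrete, here is a minimal sketch, assuming the known indexes all have the same length and that mismatches are counted by Hamming distance. The names `hamming_distance`, `correct_index`, and `max_mismatches` are hypothetical, not taken from your code.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b))


def correct_index(observed, known_indexes, max_mismatches=1):
    """Return the one known index within max_mismatches of observed, else None."""
    candidates = [
        known for known in known_indexes
        if hamming_distance(observed, known) <= max_mismatches
    ]
    # Correct only when exactly one known index is close enough.
    # Zero candidates (unknown) or several (ambiguous) both return None.
    return candidates[0] if len(candidates) == 1 else None
```

If every pair of known indexes differs in at least `2 * max_mismatches + 1` positions, an observed index can never be within range of two of them, so the ambiguous case cannot occur. It may be worth checking that once up front for your index set.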

I'm somewhat concerned that you may be building some untenably large dictionaries. If you run into memory errors while building them, you may want to switch to reading the files line by line, so that you don't have to hold everything in working memory at once.
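A rough sketch of what line-by-line reading could look like, assuming the input is FASTQ-style with four lines per record (that format is my assumption, and `read_records` is a hypothetical name). It is a generator, so only one record is held in memory at a time.

```python
from itertools import islice


def read_records(handle, lines_per_record=4):
    """Yield one record at a time as a tuple of lines without line endings."""
    lines = iter(handle)  # works for open files and for any iterable of lines
    while True:
        record = tuple(
            line.rstrip("\r\n") for line in islice(lines, lines_per_record)
        )
        if not record:
            return  # clean end of input
        if len(record) < lines_per_record:
            raise ValueError("input ended in the middle of a record")
        yield record
```

You would then loop over `read_records(fh)` inside a `with open(...) as fh:` block and write or count each record as you go, rather than storing the records in a dictionary first.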

I like how you track the different kinds of bad reads and report them separately at the end!

I think it would be good to flesh out your sub-functions a little more thoroughly: specify their inputs, return values, and side effects (file I/O, etc.), and give them names so that you can reference them more easily in your larger algorithm. That would improve clarity, and it would also let you write some quick tests specific to each function to ensure that it's doing what you want.
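As an illustration of the level of detail I mean, here is a sketch of one named sub-function with its inputs, return value, and effects written out, plus a quick test. The function `classify_index_pair` and its three category labels are hypothetical, and it assumes the two index reads can be compared directly as strings.

```python
def classify_index_pair(index1, index2, known_indexes):
    """Classify a pair of index reads.

    Inputs:  index1, index2 -- index sequences (str)
             known_indexes  -- set of expected index sequences (str)
    Returns: "matched"    if both are known and identical,
             "mismatched" if both are known but differ,
             "unknown"    if either is not in known_indexes.
    Effects: none (no file I/O, no global state).
    """
    if index1 not in known_indexes or index2 not in known_indexes:
        return "unknown"
    return "matched" if index1 == index2 else "mismatched"


def test_classify_index_pair():
    known = {"AAAA", "TTTT"}
    assert classify_index_pair("AAAA", "AAAA", known) == "matched"
    assert classify_index_pair("AAAA", "TTTT", known) == "mismatched"
    assert classify_index_pair("AAAA", "ACGN", known) == "unknown"


test_classify_index_pair()
```

Because the function has no side effects, the test needs no input files, which keeps it quick to run while you iterate on the larger algorithm.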
