A new transcriptome assembler has been released: Shannon (http://biorxiv.org/content/early/2016/02/09/039230 and http://sreeramkannan.github.io/Shannon/). It looks quite interesting from an algorithmic standpoint and claims to be better than the existing suite of assemblers. Good deal, I’m all ears! I hope to provide a full review (with the lab) in a bit, but I have a few immediate suggestions after kicking the tires yesterday.
A list of software ‘issues’ and suggestions that I have wondered about:
- Help menu: List defaults along with options. E.g., what are the default values for `K`, `partition_size`, etc.?
- Why Quorum? The paper claims it is better, but how was this evaluated? Did you test other error correction software packages, especially those designed for RNAseq? I test them here: http://biorxiv.org/content/early/2015/12/30/035642
- Allow people to feed in previously error-corrected reads. I like (per the manuscripts linked above) bfc and Rcorrector, and I should be able to use those tools with Shannon. Maybe `--left_corr` and `--right_corr` to pass in corrected reads? EDIT – you can do this by passing in reads in fasta format.
- Checkpoints! Restart a failed run at the last known ‘good’ checkpoint. For instance, I had a run fail at 13 hours yesterday – surely I should not have to start from the beginning of the assembly process.
- Any tips for increasing speed? For instance, what if I set the partition size smaller/larger? EDIT – Right now Shannon is completely unusable for anything but the smallest datasets (IME). For me, it takes in excess of 120 hours for a 20M read assembly, WAY too long to be useful. The developers are working on this issue.
- How sensitive are results to kmer size? Should we be optimizing kmer length?
- Have you evaluated Shannon assemblies using `DETONATE`/`TransRate`/`BUSCO`? It would be informative to see how they compare to the Trinity assemblies. EDIT – with a very small test (1M reads) the numbers are not as good as Trinity, but let’s test a larger dataset before we say too much here.
- Unless I missed it, how is Shannon licensed? Might I suggest some version of an MIT/BSD license? EDIT – the license is GPL v3.
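
To make the checkpoint request concrete, here is a minimal sketch of the kind of restart logic I have in mind. To be clear, this is not how Shannon works internally: the function name `run_with_checkpoints`, the `checkpoint.json` file, and the stage names in the demo are all hypothetical, and a real assembler would also need to validate each stage's output files before trusting a checkpoint.

```python
import json
import tempfile
from pathlib import Path


def run_with_checkpoints(stages, workdir):
    """Run (name, func) stages in order, skipping stages finished in an earlier run.

    Hypothetical helper for illustration; not part of Shannon.
    Returns the names of the stages that were actually executed this time.
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    state_file = workdir / "checkpoint.json"  # assumed file name

    # Load the list of completed stages, if a previous run left one behind.
    done = json.loads(state_file.read_text()) if state_file.exists() else []

    executed = []
    for name, func in stages:
        if name in done:
            continue  # already completed in an earlier run
        func()  # an exception here aborts the run but keeps earlier checkpoints
        done.append(name)
        # Record progress only after the stage has succeeded.
        state_file.write_text(json.dumps(done))
        executed.append(name)
    return executed


if __name__ == "__main__":
    # Stage names below are made up for the demo.
    demo_stages = [
        ("error_correction", lambda: None),
        ("partitioning", lambda: None),
        ("assembly", lambda: None),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        print(run_with_checkpoints(demo_stages, tmp))  # first run: all stages
        print(run_with_checkpoints(demo_stages, tmp))  # rerun: nothing left to do
```

With something like this, a run that dies at hour 13 would pick up at the failed stage on the next invocation rather than redoing error correction and partitioning from scratch.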