Aedes and Anopheles mosquito EvidentialGene sets
EvidentialGene is a genome informatics project/pipeline for gene set construction that has measurably higher accuracy and completeness than other gene informatics methods used for animals and plants. See http://eugenes.org/EvidentialGene/ or https://sourceforge.net/projects/evidentialgene/
Gene orthology accuracy and completeness are summarized here for EvidentialGene gene sets of two species of the malaria vector mosquito Anopheles, and of the yellow fever/Zika virus vector mosquito Aedes aegypti. Both are measured by protein homology to reference species genes, and compared with recently published gene sets built with now popular gene prediction/assembly methods (MAKER, Trinity and related methods).
For each species, the EvidentialGene method took the published RNA-seq data, assembled it with 4 gene assemblers, then reduced the assemblies to a concise and accurate gene set of loci and alternates. In these 3 tests, Evigene produced the more accurate gene sets, with minimal time and effort. The RNA data sets used were smaller than recommended for complete gene set reconstruction, and additional effort and data will improve these genes.
The software pipeline pair of MAKER and Trinity is now a common recipe for genome biologists. I suspect those scientists don't realize that greater accuracy is possible and easier to obtain.
Aedes aegypti yellow fever vector mosquito
Aedes_aegypti x REFERENCE Highly conserved (BUSCO drosmel, nr=3055)
         Evigene   PubTrinVb3   Vecbase3
found    99.5%     98.6%        98.3%
align    91.3%     86.5%        85.1%
best     42.3%     5.2%         3.0%
equal    52%

Aedes_aegypti x REFERENCE Drosophila mel. model (nr=11146)
         Evigene   PubTrinVb3   Vecbase3
found    99.0%     97.5%        97.1%
align    86.4%     82.4%        81.1%
best     44.0%     9.2%         6.1%
equal    47%

Aedes_aegypti x REFERENCE Anopheles gambiae/AGAP
         Evigene   PubTrinVb3   Vecbase3
found    99.1%     97.3%        96.6%
align    94.3%     89.7%        87.2%
best     44.3%     10.4%        8.5%
equal    45%

PubTrinVb3 = https://doi.org/10.1186/s12864-015-2239-0 ; Matthews et al. BMC Genomics (2016) 17:32, The neurotranscriptome of the Aedes_aegypti mosquito
Vecbase3 = Aedes-aegypti-Liverpool_PEPTIDES_AaegL3.3 gene set of vectorbase.org
The Aedes PubTrinVb3 reference set was built with the Trinity de novo RNA assembler, the Cufflinks genome RNA assembler, and the PASA EST-gene construction pipeline. The RNA-seq of this paper is the source for EVm gene construction. Evigene version evg12aedes, 2016.04.08, improves a subset of ortholog genes with the evg2aedes assembly, data of SRP047470, SRP046160 (doi:10.1126/science.aaa2850).
Anopheles species malaria vector mosquito
Highly conserved REFERENCE (BUSCO drosmel, nr=3041)
         Anopheles_funestus             Anopheles_albimanus
         Evigene   MAKER    Trinity     Evigene   MAKER    Trinity
found    99.8%     98.9%    98.7%       98.6%     98.7%    97.4%
align    89.0%     85.1%    83.7%       87.2%     84.9%    82.4%
best     33.4%     6.9%     3.1%        39.7%     11.3%    4.6%
equal    60%                            49%

Drosophila mel. model REFERENCE (nr=11043)
         Anopheles_funestus             Anopheles_albimanus
         Evigene   MAKER    Trinity     Evigene   MAKER    Trinity
found    98.8%     97.8%    97.2%       96.5%     97.8%    95.6%
align    83.9%     80.5%    79.3%       81.3%     80.6%    78.5%
best     38.6%     10.8%    4.0%        40.3%     17.4%    5.7%
equal    50%                            42%

Anopheles gambiae REFERENCE (tr total=14870, locus total=12994)
         Anopheles_funestus             Anopheles_albimanus
         Evigene   MAKER    Trinity     Evigene   MAKER    Trinity
found    98.9%     98.6%    97.7%       96.4%     98.2%    96.0%
align    96.9%     93.1%    90.2%       91.0%     91.2%    85.0%
best     39.9%     12.4%    3.7%        41.5%     19.5%    6.4%
equal    48%                            39%

Anopheles ref = source of the MAKER gene sets. It used Trinity, but that assembly is not public, so I redid the Trinity assembly.
https://doi.org/10.1126/science.1258522 ; Highly evolvable malaria vectors: the genomes of 16 Anopheles mosquitoes
Statistics:
found = % of reference proteins with significant alignment to test gene sets
align = % alignment of target protein sets to reference proteins
best = % pairwise count of best alignment of two target gene sets to reference
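
As a rough illustration of how statistics like these could be tallied, here is a minimal Python sketch. It is not the Evigene code. It assumes each gene set has already been reduced to one significant top hit per reference protein (for example from a BLASTP table), that "align" is the mean fraction of each found reference protein covered by its hit, and that "equal" counts reference proteins where two gene sets tie on score. The names Hit, found_pct, align_pct and best_pct, and all the toy values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    bits: float   # alignment bit score of the top significant hit
    aligned: int  # reference residues covered by that hit


def found_pct(ref_lengths, hits):
    """Percent of reference proteins that have a significant hit."""
    n_found = sum(1 for ref in ref_lengths if ref in hits)
    return 100.0 * n_found / len(ref_lengths)


def align_pct(ref_lengths, hits):
    """Mean percent of reference length aligned, over found references.

    Assumption: coverage is averaged per protein, not pooled over residues.
    """
    fractions = [hits[ref].aligned / ref_lengths[ref]
                 for ref in ref_lengths if ref in hits]
    return 100.0 * sum(fractions) / len(fractions)


def best_pct(ref_lengths, hits_a, hits_b):
    """Pairwise comparison: percent of references where set A scores higher,
    where set B scores higher, and where the two are equal."""
    a_best = b_best = equal = 0
    for ref in ref_lengths:
        score_a = hits_a[ref].bits if ref in hits_a else 0.0
        score_b = hits_b[ref].bits if ref in hits_b else 0.0
        if score_a > score_b:
            a_best += 1
        elif score_b > score_a:
            b_best += 1
        else:
            equal += 1
    n = len(ref_lengths)
    return 100.0 * a_best / n, 100.0 * b_best / n, 100.0 * equal / n


# Toy data: four reference proteins (id -> length) and two gene sets.
ref_lengths = {"r1": 100, "r2": 200, "r3": 400, "r4": 100}
hits_a = {"r1": Hit(180, 90), "r2": Hit(300, 200),
          "r3": Hit(500, 300), "r4": Hit(50, 50)}
hits_b = {"r1": Hit(150, 80), "r2": Hit(300, 200), "r3": Hit(400, 200)}

for name, hits in (("A", hits_a), ("B", hits_b)):
    print(name, "found", found_pct(ref_lengths, hits),
          "align", round(align_pct(ref_lengths, hits), 1))
print("best A, best B, equal:", best_pct(ref_lengths, hits_a, hits_b))
```

A reference protein missing from one set is scored as 0 for that set, so the other set wins the pairwise comparison. Whether the original tallies treat missing hits this way is not stated above.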
Evigene ref: Gilbert, Donald (2013) Gene-omes built from mRNA seq not genome DNA. 7th Annual Arthropod Genomics Symposium, Notre Dame.