EvidentialGene gene assembly for Zea mays corn plant

version 5, Evigene merged assemblies of these three Illumina RNA-seq sets:
  1. JGI-2014, NCBI PRJNA168080, from maize seedling, 250 M Illumina pairs, doi:10.1038/srep04519
  2. CSHL-2016, NCBI PRJEB10406, from several maize tissues
  3. UCBerkeley-2016, NCBI PRJNA306885, from several maize tissues
Name                                           Last modified       Size

Parent Directory                               26-Jan-2017 21:02   -
evg5corn161012.aa.gz                           10-Oct-2016 14:28   71.1M
evg5corn161012.aa.qual                         10-Oct-2016 15:05   35.4M
evg5corn161012.cds.gz                          10-Oct-2016 14:10   110M
evg5corn161012.cds.qual                        11-Oct-2016 12:35   45.6M
evg5corn161012.genetable.txt                   11-Oct-2016 13:07   61.5M
evg5corn161012.mrna.gz                         10-Oct-2016 15:01   149M
evg5corn161012.mrna.qual                       10-Oct-2016 15:07   35.6M
evg5corn161012.readme.txt                      12-Oct-2016 15:13   4k
evg5corn161012_other_refsorgum16_aascore.txt   12-Oct-2016 14:08   6.1M
evg5corn161012_v34chr.mapeq.txt                11-Oct-2016 21:31   13.9M
evg5corn161012_v34chr.mapinfo.txt              12-Oct-2016 13:55   1k
evg5corn161012_v3chr.gff3.gz                   12-Oct-2016 12:11   57.5M
evg5corn161012_v3chr.maploc.txt                11-Oct-2016 21:28   21.9M
evg5corn161012_v4chr.gff3.gz                   12-Oct-2016 12:29   56.9M
evg5corn161012_v4chr.maploc.txt                11-Oct-2016 21:26   35.0M
evg5corn_examples/                             11-Nov-2016 14:13   -
evg5corn_subset_info.txt                       12-Oct-2016 13:35   4k


evg5corn, 2016.09.09
  Zea mays gene set from EvidentialGene, merge of 3 separate Evigene assemblies of 3 RNA sets, 
  each de-novo assembled with four gene assemblers, from 3 Illumina RNA-seq sets 
  (JGI-2014 PRJNA168080, CSHL-2016 PRJEB10406, UCBerkeley-2016 PRJNA306885)

subsets: 
  Zeamay4EVm, evg4corn, data of JGI-2014 root tissue, high identity paralog (ohnolog) loci resolved with chr mapping
  Zmcshl5EVm, evg5corncsh, data of CSHL-2016 PRJEB10406, Illumina RNA samples of 6 tissues, 
     from Gramene Pac-Bio gene assembly, 
  Zmucb5EVm,  evg5cornucb, data of UCBerkeley-2016 PRJNA306885, Illumina RNA samples of 4 tissues

evg5corn is a merge that extends evg4corn, filling in missed or partial genes from the 2 added tissue sample experiments
  (all Zea_mays B73 cultivar)

evg5corn161012 is a first-release data set, dated 2016.10.12; it may be updated after further checks.
Gene sequences, with suffix 
  aa = protein, cds = coding seq, mrna = transcript seq, 
  aa.qual = table of protein IDs, sizes
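
A quick sanity check after download is to count the records in each sequence file.
The sketch below assumes the .aa.gz, .cds.gz and .mrna.gz files are gzip-compressed
FASTA (one ">" header line per sequence); the readme does not state the format
explicitly, and the function name is only illustrative.

```python
import gzip
import sys


def count_fasta_records(path):
    """Count sequences in a gzip-compressed FASTA file.

    Assumption: each sequence starts with one header line beginning with '>'.
    """
    n_records = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                n_records += 1
    return n_records


if __name__ == "__main__":
    # Example: python count_fasta.py evg5corn161012.aa.gz evg5corn161012.cds.gz
    for fasta_path in sys.argv[1:]:
        print(fasta_path, count_fasta_records(fasta_path))
```

If the three files describe the same transcripts, their counts would be expected to
agree, though that is not guaranteed by anything stated here.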

genetable.txt = gene class and attribute table, per transcript, with public ids
Gene class numbers, in genetable.txt
  45689 main, 6315 noclass  == loci, noclass have no alternates
  168285 alt, 10889 altfrag == alternates of main class
Some of these are likely non-coding (4436 loci, 12037 transcripts), based on CDS/UTR properties.
  not included: 11877 dropalt, 1105 dropaltfrag, 5857 dropmain, 1127 dropnoclass

Gene locations
  gff3 = gene locations on chromosomes, GFF format tables, on v3chr and v4chr
  maploc.txt = tabular location summary per transcript (gene span, coverage, exons)
  v34chr.mapeq.txt = match of locations on V3/V4 chromosomes, equal and unequal mappings
  v3chr = Version 3 chromosome assembly, maize B73_RefGen_v3 from NCBI (2013?)
  v4chr = Version 4 chromosome assembly, Zea_mays.AGPv4.32 from Ensembl/Gramene (2016)
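
To get a first look at a gff3 file, one option is to tally the feature types it
contains. This is a minimal sketch that relies only on the standard GFF3 layout
(nine tab-separated columns, feature type in column 3); which feature types
actually appear in these files is not stated in this readme, and the function
name is hypothetical.

```python
import gzip
import sys
from collections import Counter


def count_gff3_feature_types(path):
    """Tally column 3 (feature type) of a GFF3 file, plain or gzip-compressed."""
    counts = Counter()
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip comments, directives and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            cols = line.rstrip("\n").split("\t")
            # Feature lines have 9 columns; anything else (e.g. embedded
            # sequence lines) is ignored.
            if len(cols) >= 9:
                counts[cols[2]] += 1
    return counts


if __name__ == "__main__":
    # Example: python gff_types.py evg5corn161012_v4chr.gff3.gz
    for gff_path in sys.argv[1:]:
        for feature_type, n in count_gff3_feature_types(gff_path).most_common():
            print(gff_path, feature_type, n, sep="\t")
```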

Mapped genes
  Gene loci n=48478, mRNA n=225332; Split-mapped n=2715; alts=176854;
  ave 4.6 mRNA/locus; approx. 29200 loci, 60%, have 2+ mRNA
There remain some 4000-odd mRNA unmapped in this data set, but I have alternate mappings
for many of those, to be updated.
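
The mapped-gene figures above are internally consistent, which can be checked
with a few lines of arithmetic. The sketch assumes that "alts" counts every
mapped mRNA beyond one main transcript per locus; that reading is inferred from
the numbers, not stated in the readme.

```python
def summarize_mapped(n_loci, n_mrna, n_multi_loci):
    """Derive alternate count, mean mRNA per locus, and percent multi-mRNA loci."""
    n_alts = n_mrna - n_loci              # assumes one main transcript per locus
    mean_per_locus = n_mrna / n_loci
    pct_multi = 100.0 * n_multi_loci / n_loci
    return n_alts, mean_per_locus, pct_multi


if __name__ == "__main__":
    # Values copied from the "Mapped genes" summary.
    alts, mean_per_locus, pct_multi = summarize_mapped(48478, 225332, 29200)
    print("alts =", alts)                                    # 176854
    print("mRNA/locus = %.1f" % mean_per_locus)              # 4.6
    print("loci with 2+ mRNA = %.0f%%" % pct_multi)          # 60%
```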

Protein alignment summary

  Zea_mays x REFERENCE Sorghum (Sbicolor_313 v3.1 of JGI Phytozome, ngene=31054)
  Summary per method, min 25 %Align, for 28951/34211 84.6% found ref transcripts
Source  nHits   pHits   nTids   pIdenH  pAlgnH  AlignH  TlenH   pAlgnT  AlignT
z45EVm  28500   83.3    21993   81.4    94.5    431.3   470.1   93.0    424.6
z4Gram  28048   82.0    21471   80.4    93.9    421.8   464.1   91.0    408.7
z3NCBI  27374   80.0    20711   79.8    94.1    423.8   455.8   89.0    400.7
  evg5corn161012_other_refsorgum16_aascore.txt = per gene alignment scores of this summary
-----------------------

  Zea_mays x REFERENCE Arabidopsis thal. model (Araport 2015 version, ngene=28902)
  Summary per method, min 25 %Align, for 23415/28598 81.9% found ref transcripts
Source  nHits   pHits   nTids   pIdenH  pAlgnH  AlignH  TlenH   pAlgnT  AlignT 
z45EVm  23148   80.9    15318   57.1    92.8    433.2   498.9   91.7    428.3  
z4Gram  23007   80.4    13413   53.3    90.9    419.3   497.5   89.3    412.0  
z3NCBI  23063   80.6    13601   51.7    90.3    411.4   490.6   89.0    405.2 
 .. z45EVm subsets ..
z4EVm   22890   80.0    13930   55.6    91.8    427.9   502.3   89.7    418.3  
z5uEVm  22728   79.5    14731   53.8    91.3    424.2   506.7   88.7    411.8  
z5cEVm  22716   79.4    13949   52.9    90.4    416.6   488.5   87.7    404.2  
-------
  z45EVm = evg5corn merged gene set,
     z4EVm, evg4corn of JGI14 root rna, 
     z5cEVm,evg5corncsh = CSHL16 rna 6 tissues, z5uEVm,evg5cornucb = UCB16 rna 3 tissues
  z4Gram = gene set maize_v4 from Gramene/Ensembl (Zea_mays.AGPv4.32, MAKER modelled on chr assembly)
  z3NCBI = gene set maize_b73refgen3v from NCBI reference genomes (Zea_mays/GCF_000005005.1_B73_RefGen_v3)
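
In both summary tables, pHits appears to be nHits expressed as a percentage of
the total reference transcripts (the denominator in the "found ref transcripts"
line: 34211 for Sorghum, 28598 for Arabidopsis). That definition is inferred from
the numbers rather than stated here; the sketch below recomputes the column under
that assumption and reports any row that disagrees.

```python
# Total reference transcripts, taken from the "found ref transcripts" lines.
REF_TRANSCRIPTS = {"sorghum": 34211, "arabidopsis": 28598}

# (reference, source, nHits, pHits as printed in the tables)
ROWS = [
    ("sorghum", "z45EVm", 28500, 83.3),
    ("sorghum", "z4Gram", 28048, 82.0),
    ("sorghum", "z3NCBI", 27374, 80.0),
    ("arabidopsis", "z45EVm", 23148, 80.9),
    ("arabidopsis", "z4Gram", 23007, 80.4),
    ("arabidopsis", "z3NCBI", 23063, 80.6),
    ("arabidopsis", "z4EVm", 22890, 80.0),
    ("arabidopsis", "z5uEVm", 22728, 79.5),
    ("arabidopsis", "z5cEVm", 22716, 79.4),
]


def percent_hits(n_hits, n_ref):
    """Assumed definition: hits as a percentage of reference transcripts."""
    return 100.0 * n_hits / n_ref


def find_mismatches(rows, tolerance=0.05):
    """Return rows whose printed pHits differs from the recomputed value."""
    bad = []
    for ref, source, n_hits, printed in rows:
        computed = percent_hits(n_hits, REF_TRANSCRIPTS[ref])
        if abs(computed - printed) > tolerance + 1e-9:
            bad.append((ref, source, printed, round(computed, 2)))
    return bad


if __name__ == "__main__":
    mismatches = find_mismatches(ROWS)
    print("rows that disagree with the assumed definition:", mismatches)
```

The same denominators also reproduce the header percentages (28951/34211 and
23415/28598), so the assumed definition fits every figure given in this readme.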


Developed at the Genome Informatics Lab of Indiana University Biology Department