Musa balbisiana Pisang Klutuk Wulung
This Musa balbisiana ‘Pisang Klutuk Wulung’ resequencing project is a collaboration between (1) the Laboratory of Fruit Breeding and Biotechnology, Department of Biosystems, Katholieke Universiteit Leuven, Belgium, and (2) the Centre for Research in Biotechnology for Agriculture and the Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia.
Results have been published as: Davey MW (1), Gudimella R (2), Harikrishna JA (2), Sin LW (2), Khalid N (2) and Keulemans W (1) (2013) A draft Musa balbisiana genome sequence for molecular genetics in polyploid, inter- and intra-specific Musa hybrids. BMC Genomics 14:683. doi:10.1186/1471-2164-14-683
Access to the genome annotation:
Musa balbisiana PKW v1
- Gene structure and function information in GFF3 format (version 1, gz file, 5.3MB)
- Nucleotide FASTA format file of all gene coding sequences (version 1, gz file, 14.4MB)
- Amino acid FASTA format file of all gene coding sequences (version 1, gz file, 9.7MB)
- Genome assemblies (version 1, gz file, 103MB)
- de novo assembled gDNA contigs (version 1, gz file, 102MB)
Musa balbisiana PKW v1 filtered CDS
- Nucleotide FASTA format file of all gene coding sequences (version 1, TE-filtered, gz file, 14MB)
- Amino acid FASTA format file of all gene coding sequences (version 1, TE-filtered, gz file, 9.2MB)
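The FASTA downloads listed above are gzip-compressed, so they can be read without unpacking them first. The following is a minimal sketch in Python using only the standard library; the function names read_fasta_gz and summarize are illustrative assumptions, and no particular file name or header layout of the PKW files is assumed.

```python
import gzip
from typing import Iterator, Tuple


def read_fasta_gz(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a gzip-compressed FASTA file."""
    header, chunks = None, []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(path: str) -> Tuple[int, int]:
    """Return (number of records, total residues) for a gzipped FASTA file."""
    count = total = 0
    for _, seq in read_fasta_gz(path):
        count += 1
        total += len(seq)
    return count, total
```

Calling summarize() on the path of a downloaded CDS or protein file (whatever name it was saved under) gives a quick record count to compare against the gene numbers reported in the summary below.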
Data & methods summary:
- gDNA was isolated from sterile plantlets of the wild diploid M. balbisiana ‘Pisang Klutuk Wulung’ (‘PKW’, BB genome), obtained from the International Transit Centre, KU Leuven (gene bank number ITC1587).
- 281 million 100 bp paired-end Illumina reads were aligned to the doubled-haploid ‘Pahang’ (‘DH Pahang’) reference genome, and the consensus sequences were extracted for annotation and characterization. The mean read coverage is 41.4x, resulting in a consensus PKW genome size of 402.5 Mb, which is 74.9% of the size of the reference ‘DH Pahang’ genome.
- 96.4% of the reads were also de novo assembled into 180K gDNA contigs, with an N50 of 7,884 bp, an average contig length of 1,883 bp, and a maximum contig length of 152,268 bp.
- PKW gene prediction identified 39,914 unique gene models on the consensus genome. Following functional annotation, 3,276 transposable elements were identified, leaving a total of 36,638 protein-coding gene sequences, nearly identical to the 36,542 gene models of ‘DH Pahang’.
- Using 11 small RNA libraries, 270 known miRNA precursors in 42 families were predicted for the B-genome, and 266 in 47 families for the A-genome. All of the known miRNA families detected in the B-genome were also found to be present in the A-genome. In addition, 32 miRNA precursors in 28 new families specific to Musa were predicted.
- In total, 20,657,389 variants relative to ‘DH Pahang’ were detected. Of these, 8,738,760 were homozygous SNPs (sequence divergence from ‘Pahang’) and 10,130,236 were heterozygous SNPs, representing the degree of heterozygosity in PKW. There were 4,880,516 SNPs detected in coding regions.
- A total of 30,559 SSRs were identified in PKW, corresponding to a frequency of 5.7 SSRs/kb genome. Of all SSRs detected, dimeric repeats were the most abundant class.
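Two of the figures above lend themselves to a short worked example: the N50 statistic reported for the de novo contigs, and the gene-model bookkeeping. The sketch below uses one common definition of N50 (the contig length at which the running sum of lengths, sorted from longest to shortest, first reaches half the total assembly length); whether the published value was computed with exactly this convention is an assumption. The toy contig lengths are made up for illustration and are not PKW data.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths (0 for empty input)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


# Toy contig lengths, invented for illustration only.
toy_contigs = [10, 8, 6, 4, 2]
print(n50(toy_contigs))  # 10 + 8 = 18 >= 15, so this prints 8

# Gene-model bookkeeping using the counts stated in the summary.
gene_models = 39_914
transposable_elements = 3_276
protein_coding = gene_models - transposable_elements
print(protein_coding)  # prints 36638
```

Applied to the sequence lengths in the de novo contig download, the same function gives a figure that can be compared with the reported N50 of 7,884 bp.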
Full details can be found in Davey et al. (2013), BMC Genomics.
For questions about genome assembly, alignment, annotation and variant detection please contact Mark W. Davey, email@example.com
For enquiries relating to miRNA prediction and repeat annotation, please contact Jennifer Ann Harikrishna (J. A. Harikrishna), firstname.lastname@example.org