Supporting data for "Finding Nemo: Hybrid assembly with Oxford Nanopore and Illumina reads greatly improves the Clownfish (Amphiprion ocellaris) genome assembly"

Dataset type: Genomic, Transcriptomic
Data released on December 22, 2017

Tan MH; Austin CM; Hammer MP; Lee YP; Croft LJ; Gan HM (2017): Supporting data for "Finding Nemo: Hybrid assembly with Oxford Nanopore and Illumina reads greatly improves the Clownfish (Amphiprion ocellaris) genome assembly" GigaScience Database. http://dx.doi.org/10.5524/100397

DOI: 10.5524/100397

Some of the most widely recognised coral-reef fishes are clownfish or anemonefishes, members of the family Pomacentridae (subfamily: Amphiprioninae). They are popular aquarium species due to their bright colours, adaptability to captivity and fascinating behaviour. Their breeding biology (sequential hermaphroditism) and symbiotic mutualism with sea anemones have attracted much scientific interest. Moreover, there are some curious geographic-based phenotypes which warrant investigation. Leveraging advances in Nanopore long-read technology, we report the first hybrid assembly of the clown anemonefish (Amphiprion ocellaris) genome utilizing Illumina and Nanopore reads, further demonstrating the substantial impact of modest long-read sequencing data sets on improving genome assembly statistics. We generated 43 Gb of short Illumina reads and 9 Gb of long Nanopore reads, representing an approximate genome coverage of 54× and 11×, respectively, based on k-mer-predicted genome size estimates ranging from 791 to 967 Mbp. The final assembly comprises 6,404 scaffolds with a total length of 880 Mb (96.3% BUSCO-calculated genome completeness). Compared to the Illumina-only assembly, the hybrid approach generated 94% fewer scaffolds with an 18-fold increase in N50 length (401 kb) and increased the genome completeness by an additional 16%. A total of 27,240 high-quality protein-coding genes were predicted from the clown anemonefish, 26,211 (96%) of which were annotated functionally with information from either sequence homology or protein signature searches. We present the first genome of any anemonefish and demonstrate the value of low-coverage (~11×) long-read Nanopore sequencing in improving both genome assembly contiguity and completeness. The near-complete assembly of the A. ocellaris genome will be an invaluable molecular resource for supporting a range of genetic, genomic and phylogenetic studies, specifically for clownfish and more generally for other related fish species of the family Pomacentridae.
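The coverage and N50 figures quoted in the abstract follow from simple arithmetic, which can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function names `coverage` and `n50` are hypothetical, 1 Gb is assumed to equal 1,000 Mb, and the scaffold lengths passed to `n50` are made-up toy values rather than data from this assembly. Under these assumptions, the reported 54× and 11× correspond to the lower genome size estimate (791 Mbp); the upper estimate (967 Mbp) gives somewhat lower values.

```python
def coverage(total_bases_gb: float, genome_size_mbp: float) -> float:
    """Approximate fold coverage: sequenced bases divided by genome size.

    Assumes decimal units (1 Gb = 1,000 Mb).
    """
    return (total_bases_gb * 1000.0) / genome_size_mbp


def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


if __name__ == "__main__":
    # Read totals and genome size estimates as quoted in the abstract.
    for label, gb in (("Illumina", 43), ("Nanopore", 9)):
        low = coverage(gb, 967)   # upper genome size estimate
        high = coverage(gb, 791)  # lower genome size estimate
        print(f"{label}: {low:.1f}x to {high:.1f}x")

    # Share of predicted genes with a functional annotation.
    print(f"Annotated: {26211 / 27240:.1%}")

    # Toy scaffold lengths (hypothetical, for illustration only).
    print("Toy N50:", n50([2, 3, 4, 5, 6]))
```

Applying `n50` to the scaffold lengths read from the assembly FASTA would be the way to check the reported 401 kb value; that step is omitted here because it depends on the downloaded file.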

Additional details

Read the peer-reviewed publication(s):

(PubMed: 29342277)

Accessions (data generated as part of this study):

GENBANK: NXFZ00000000
BioProject: PRJNA407816

Accessions (data referenced by this study):

BioProject: PRJNA374650





| Sample ID | Taxonomic ID | Common Name | Genbank Name | Scientific Name | Sample Attributes |
| SAMN06329912 | 80972 | | clown anemonefish | Amphiprion ocellaris | Description: Male Amphiprion ocellaris; Sex: male; Age: 1 year; ... |
| SAMN06329913 | 80972 | | clown anemonefish | Amphiprion ocellaris | Description: Female Amphiprion ocellaris; Sex: female; Age: 1 year; ... |
| SAMN07664765 | 80972 | | clown anemonefish | Amphiprion ocellaris | Description: Amphiprion ocellaris black and white c...; Sex: not determined; Age: not provided; ... |
| SAMN07664766 | 80972 | | clown anemonefish | Amphiprion ocellaris | Description: Amphiprion ocellaris black and white c...; Sex: not determined; Age: not provided; ... |
| SAMN07664767 | 80972 | | clown anemonefish | Amphiprion ocellaris | Description: Amphiprion ocellaris black and white c...; Sex: not determined; Age: not provided; ... |
| SAMN07664767 | 80972 | | clown anemonefish | Amphiprion ocellaris | Description: Amphiprion ocellaris black and white c...; Sex: not determined; Age: not provided; ... |
Displaying 1-6 of 6 Sample(s).




| File Name | Sample ID | Data Type | File Format | Size | Release Date |
| | | Annotation | UNKNOWN | 640.17 MB | 2017-12-19 |
| | | Coding sequence | FASTA | 17.96 MB | 2017-12-19 |
| | | Sequence assembly | FASTA | 227.48 MB | 2017-12-19 |
| | | Sequence assembly | UNKNOWN | 360.45 MB | 2017-12-19 |
| | | Annotation | archive | 135.18 MB | 2017-12-19 |
| | | Annotation | archive | 102.78 MB | 2017-12-19 |
| | | Sequence assembly | FASTA | 283.46 MB | 2017-12-19 |
| | | Annotation | TSV | 9.28 MB | 2017-12-19 |
| | | Annotation | TAR | 48.29 KB | 2017-12-19 |
| | | Protein sequence | FASTA | 8.32 MB | 2017-12-19 |
Displaying 1-10 of 13 File(s).
| Funding body | Awardee | Award ID | Comments |
| Monash University | Christopher M. Austin and Han Ming Gan | | Monash University Malaysia |
| Deakin University | Christopher M. Austin and Han Ming Gan | | |

| Date | Action |
| December 20, 2017 | Dataset publish |
| December 20, 2017 | Link updated: BioProject:PRJNA407816 |
| December 21, 2017 | Title updated from: Finding Nemo: Hybrid assembly with Oxford Nanopore and Illumina reads greatly improves the Clownfish (Amphiprion ocellaris) genome assembly |
| March 30, 2018 | Manuscript Link added: 10.1093/gigascience/gix137 |
| November 10, 2022 | Manuscript Link updated: 10.1093/gigascience/gix137 |