Shotgun Genome Sequencing of the Furrow Spider Larinioides cornutus Using Nanopore Long Reads
John Loaisiga Mora
Co-Presenters: Sam Charlie
College: Dorothy and George Hennings College of Science, Mathematics and Technology
Major: Biology
Faculty Research Mentor: Jesus Ballesteros Chavez
Abstract:
The furrow spider ("Larinioides cornutus") is a conspicuous orb weaver with a Holarctic distribution. In North America, it is widespread across the United States and Canada but is most common in the eastern United States. This relatively large spider is commonly found around human habitations, often close to bodies of water. Our team recently surveyed the spider diversity at the Kean Skylands campus, where "L. cornutus" was one of the most abundant, accessible and easily identifiable species. Whole-genome sequencing projects for diverse eukaryotes typically require significant investment and resources, relying mostly on a combination of short-read (Illumina) and long-read (PacBio) sequencing together with Hi-C libraries, which preserve chromatin structure and help reduce the complexity of the genome assembly step. Additionally, several RNA sequencing experiments are carried out for annotation purposes. While sequencing costs are relatively low, the complexity of such large-scale projects remains prohibitive, restricting their wider use in non-model organisms. Here, we investigate the characteristics, quality, coverage and accuracy of reads obtained with Oxford Nanopore Technologies (ONT) long-read sequencing, using "L. cornutus" as our test subject. The primary goal is the recovery of the mitochondrial genome, which is expected to be more abundant than the nuclear genome in total DNA extractions obtained with conventional protocols. Additionally, the mitochondrial genome is much smaller, less complex and better characterized in terms of synteny and composition than the nuclear genome. First, we identified potential mitochondrial reads using a custom Python pipeline and assembled them with hifiasm. We then compared the completeness and accuracy of our assembly with reference sequences for this species available in public databases.
The overarching goal of this project is to test and evaluate the utility of our general protocol, from sample collection to sequencing and bioinformatics, thus providing the basis for a scalable approach to documenting the genomes of diverse non-model organisms.
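The abstract does not describe how the custom Python pipeline identifies mitochondrial reads, so the following is only a minimal sketch of one plausible approach: keeping reads that share a sufficient fraction of k-mers with a reference mitochondrial sequence on either strand. All function names, the k-mer size and the threshold are hypothetical choices made for illustration, and reads are assumed to be already parsed into (name, sequence) pairs.

```python
# Hypothetical sketch: flag candidate mitochondrial reads by k-mer sharing
# with a reference mitochondrial sequence. Not the authors' actual pipeline.

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (unknown bases become N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))


def kmer_set(seq, k):
    """Return the set of all k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_reference_index(reference, k=15):
    """Collect reference k-mers from both strands, since read orientation is unknown."""
    return kmer_set(reference, k) | kmer_set(reverse_complement(reference), k)


def shared_kmer_fraction(read, ref_index, k=15):
    """Fraction of the read's k-mers that also occur in the reference index."""
    n_kmers = len(read) - k + 1
    if n_kmers <= 0:
        return 0.0  # read shorter than k: no evidence either way
    read = read.upper()
    hits = sum(1 for i in range(n_kmers) if read[i:i + k] in ref_index)
    return hits / n_kmers


def select_candidate_reads(reads, reference, k=15, min_fraction=0.3):
    """Return names of reads whose shared k-mer fraction meets the threshold.

    `reads` is an iterable of (name, sequence) pairs; k and min_fraction are
    illustrative values that would need tuning for real nanopore error rates.
    """
    ref_index = build_reference_index(reference, k)
    return [
        name
        for name, seq in reads
        if shared_kmer_fraction(seq, ref_index, k) >= min_fraction
    ]
```

In practice, the selected read names would be used to write a reduced FASTQ file that is then passed to the assembler. A dedicated long-read aligner would likely be more sensitive than exact k-mer matching, but the sketch shows the basic filtering idea.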