Scaffolding algorithm using multiple reference genomes: a case study of the Rhizobium ecuadorense CNPSo 671T
Main author: | Mercado, Hugo Mauricio Pena |
---|---|
Format: | Dissertation |
Language: | English |
Published in: | Universidade Tecnológica Federal do Paraná, 2020 |
Subjects: | |
Online access: | http://repositorio.utfpr.edu.br/jspui/handle/1/5439 |
Abstract: |
Recently, we have begun to realize the long-term consequences of artificial fertilizers. In addition, understanding the relationships between plants and soil microorganisms (such as mycorrhizal fungi and rhizobacteria) has become the focus of numerous studies that aim to feed a world of 9.8 billion people [1]. One approach to studying these organisms further is to sequence their DNA. However, when sequencing technologies can generate only short reads, this becomes a challenging computational problem (due to the presence of repeated sequences and non-uniform coverage). Here we present a scaffolding algorithm that uses multiple reference genomes, can discriminate misassemblies, and generates putative plasmids and chromosomes. Although many scaffolding algorithms already exist [2], we found that none of them accepts genomes at the contig stage as input, even though these genomes might also contain useful information. Furthermore, these scaffolders focus only on assembling scaffolds and neglect the misassemblies that their graphs and heuristics may introduce. Our algorithm offers an alternative for more advanced analysis of genomes and makes it possible to customize the output scaffolds according to specific needs. We hope our algorithm can help identify symbiotic plasmids within genomes by finding homologs in reference genomes. Moreover, the generalization of scaffolding can be extended not only to prokaryotes but also to larger genomes, such as those of eukaryotes. |
---|