Dataset Details
United States
Genome Sequence Data Set01
The fasta files (Genome_Set01.zip) contain the reference-assisted de novo assemblies (as contigs) of four Campylobacter spp. isolates. The table contains rows as isolates (yellow) and columns as attributes (green) for each individual genome. This dataset is associated with the following publication: Gomez-Alvarez, V., N. Ashbolt, J. Griffith, J. Santo Domingo, and J. Lu. Whole-Genome Sequencing of Four Campylobacter strains Isolated from Gull Excreta collected from Hobie Beach (Oxnard, CA, USA). Microbiology Resource Announcements. American Society for Microbiology, Washington, DC, USA, 8(32): e00560-19, (2019).
Data Information
Related Data
Genome Sequence Data Set01
Public Data Portal
The fasta files (Genome_Set01.zip) contain the reference-assisted de novo assemblies (as contigs) of three Escherichia coli isolates. The table contains rows as isolates (yellow) and columns as attributes (green) for each individual genome. This dataset is associated with the following publication: Gomez-Alvarez, V., and J. Hoelle-Schwalbach. Draft Genome Sequences of Antibiotic-Resistant Escherichia coli Isolates from U.S. Wastewater Treatment Plants. Microbiology Resource Announcements. American Society for Microbiology, Washington, DC, USA, 8(23): e00351-19, (2019).
Genome Sequence Data Set01
Public Data Portal
The fasta files (Genome_Set01.zip) contain the reference-assisted de novo assemblies (as contigs) of seven Legionella pneumophila subsp. pneumophila isolates. The table contains rows as isolates (yellow) and columns as attributes (green) for each individual genome. This dataset is associated with the following publication: Gomez-Alvarez, V., L. Boczek, D. King, A. Pemberton, S. Pfaller, M. Rodgers, J. Santodomingo, and R. Revetta. Draft Genome Sequences of Seven Legionella pneumophila Isolates from a Hot Water System of a Large Building. Microbiology Resource Announcements. American Society for Microbiology, Washington, DC, USA, 8(18): e00384-19, (2019).
FastGroup: A program to dereplicate libraries of 16S rDNA sequences
Public Data Portal
Background: Ribosomal 16S DNA sequences are an essential tool for identifying and classifying microbes. High-throughput DNA sequencing now makes it economically possible to produce very large datasets of 16S rDNA sequences in short time periods, necessitating new computer tools for analyses. Here we describe FastGroup, a Java program designed to dereplicate libraries of 16S rDNA sequences. By dereplication we mean to: 1) compare all the sequences in a data set to each other, 2) group similar sequences together, and 3) output a representative sequence from each group. In this way, duplicate sequences are removed from a library. Results: FastGroup was tested using a library of single-pass, bacterial 16S rDNA sequences cloned from coral-associated bacteria. We found that the optimal strategy for dereplicating these sequences was to: 1) trim ambiguous bases from the 5' end of the sequences and all sequence 3' of the conserved Bact517 site, 2) match the sequences from the 3' end, and 3) group sequences >=97% identical to each other. Conclusions: The FastGroup program simplifies the dereplication of 16S rDNA sequence libraries and prepares the raw sequences for subsequent analyses.
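The three dereplication steps described above (compare, group, output a representative) can be illustrated with a minimal Python sketch. This is not FastGroup's actual implementation, which is a Java program whose internals are not given here. The function names `identity_from_3prime` and `dereplicate`, the greedy first-match grouping, and the ungapped position-by-position comparison from the 3' end are all assumptions made for illustration, and the trimming step is assumed to have been done beforehand.

```python
def identity_from_3prime(a: str, b: str) -> float:
    """Fraction of matching bases when a and b are compared from the 3' end.

    Assumption: an ungapped, right-aligned comparison over the shorter
    sequence. A real tool may align the sequences differently.
    """
    overlap = min(len(a), len(b))
    if overlap == 0:
        return 0.0
    # zip stops at the shorter sequence, so exactly `overlap` pairs are compared.
    matches = sum(1 for x, y in zip(reversed(a), reversed(b)) if x == y)
    return matches / overlap


def dereplicate(seqs: dict[str, str], threshold: float = 0.97) -> dict[str, list[str]]:
    """Greedily group sequences; the first member of each group is its representative.

    Returns a mapping of representative ID -> list of member IDs.
    """
    groups: dict[str, list[str]] = {}
    for name, seq in seqs.items():
        for rep in groups:
            if identity_from_3prime(seq, seqs[rep]) >= threshold:
                groups[rep].append(name)
                break
        else:
            # No existing representative is similar enough: start a new group.
            groups[name] = [name]
    return groups


if __name__ == "__main__":
    base = "ACGT" * 25                                # 100 bases
    near = "C" + base[1:50] + "T" + base[51:]         # 2 mismatches, 98% identical
    far = "T" * 100                                   # 25% identical to base
    library = {"clone_a": base, "clone_b": near, "clone_c": far}
    for rep, members in dereplicate(library).items():
        print(rep, members)
```

Greedy grouping depends on input order, so a different ordering of the library can yield different representatives; that trade-off is accepted here to keep the sketch short.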
Data from: Draft genome sequences of eight streptogramin-resistant Enterococcus species isolates from animal and environmental sources in the United States
Public Data Portal
Draft genome sequences of five Enterococcus faecium, two Enterococcus hirae, and one Enterococcus gallinarum isolates from environmental sources and chicken carcass rinsates. Isolates were selected for their resistance to the streptogramin antibiotic quinupristin-dalfopristin, and all were collected in the United States between 2001 and 2004. Antimicrobial resistance genes were identified that confer resistance to macrolide-lincosamide-streptogramins, aminoglycosides, tetracycline, beta-lactams, and glycopeptides.
Data from: A High-Quality Genome Assembly from a Single, Field-collected Spotted Lanternfly (Lycorma delicatula) using the PacBio Sequel II System
Public Data Portal
A high-quality reference genome is an essential tool for applied and basic research on arthropods. Long-read sequencing technologies may be used to generate more complete and contiguous genome assemblies than alternate technologies; however, long-read methods have historically had greater input DNA requirements and higher costs than next-generation sequencing, which are barriers to their use on many samples. Here, we present a 2.3 Gb de novo genome assembly of a field-collected adult female Spotted Lanternfly (Lycorma delicatula) using a single PacBio SMRT Cell. The Spotted Lanternfly is an invasive species recently discovered in the northeastern United States, threatening to damage economically important crop plants in the region. The DNA from one individual female specimen collected in Reading, Berks County, Pennsylvania was used to make one standard, size-selected library with an average DNA fragment size of ~20 kb. The library was run on one Sequel II SMRT Cell 8M, generating a total of 132 Gb of long-read sequences, of which 82 Gb were from unique library molecules, representing approximately 38x coverage of the genome. The assembly had high contiguity (contig N50 length = 1.5 Mb), completeness, and sequence-level accuracy as estimated by conserved gene set analysis (96.8% of conserved genes both complete and without frame shift errors). Further, it was possible to segregate more than half of the diploid genome into the two separate haplotypes. The assembly also recovered two microbial symbiont genomes known to be associated with L. delicatula, each microbial genome being assembled into a single contig. We demonstrate that field-collected arthropods can be used for the rapid generation of high-quality genome assemblies, an attractive approach for projects on emerging invasive species, disease vectors, or conservation efforts of endangered species.
Supporting files for the manuscript "A High-Quality Genome Assembly from a Single, Field-collected Spotted Lanternfly (Lycorma delicatula) using the PacBio Sequel II System" include several intermediate versions of the assembly (raw output from Falcon, raw output from Falcon unzip, etc.) as well as the final assembly primary contigs and haplotigs (for the regions of the genome that were phased).
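The contig N50 quoted for this assembly is a standard contiguity statistic: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly size. A minimal Python sketch of that calculation follows. The function name `n50` and the example lengths are illustrative assumptions and are not taken from the dataset.

```python
def n50(contig_lengths: list[int]) -> int:
    """Return the N50 of a set of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half the assembly size.
    """
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


if __name__ == "__main__":
    # Hypothetical contig lengths, for illustration only.
    lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50(lengths))  # total 54, half 27; 10 + 9 + 8 = 27, so N50 is 8
```

In practice the lengths would be read from the assembly FASTA file, one value per contig record.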
MG1 dataset
Public Data Portal
Genome sequence, PCR clone sequences and qPCR data. This dataset is associated with the following publication: Linz, D., K. McIntosh, I. Struewing, S. Klemm, B. McMinn, R. Haugland, E. Villegas, and J. Lu. Genomic Characterization and Wetland Occurrence of a Novel Campylobacter Isolate from Canada Geese. Microorganisms. MDPI, Basel, SWITZERLAND, 11(3): 648, (2023).
Data from: Survey of CRISPR spacers among greater than 35,000 Campylobacter spp. genomes both recognizes known bacteriophages and suggests novel bacteriophages
Public Data Portal
Data from the study "Survey of CRISPR spacers among greater than 35,000 Campylobacter spp. genomes both recognizes known bacteriophages and suggests novel bacteriophages". Extended dataset 1: DA10 bacteriophage gene prevalence in targets. Extended dataset 2: Fletchervirus bacteriophage gene prevalence in targets.
Data from: Agile Genetics: Single gene resolution without the fuss
Public Data Portal
These files are 250 bp Illumina MiSeq paired-end sequencing reads in fastq format. Libraries were prepared from DNA fragments amplified from tomato bulk (heterogeneous) samples around the fruit weight locus.
Sequencing Data for Hospital Metagenomes
Public Data Portal
FASTA files containing the sequence data for assembled contigs (FASTA), predicted genes (FASTA), predicted proteins (FASTA), and gene predictions (GFF v2). This dataset is not publicly accessible because the sequences have already been deposited in publicly available databases, so replication can be avoided. The data are also quite large, and numerous files are associated with these entries, which are included in the links below. The data can be accessed through the following web links: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA299404, https://trace.ncbi.nlm.nih.gov/Traces/sra/?study=SRP065069, and http://enve-omics.ce.gatech.edu/data/showerheads. Format: The data represent genome sequencing and assembly of 180 different contigs. This dataset is associated with the following publication: Soto-Giron, M.J., L. Rodriguez, C. Luo, M. Elk, H. Ryu, J. Santodomingo, and K. Konstantinidis. Biofilms on Hospital Shower Hoses: Characterization and Implications for Nosocomial Infections. Applied and Environmental Microbiology. American Society for Microbiology, Washington, DC, USA, 82(9): 2872-2883, (2016).