Education Data Utilization and Support Service

Dataset Details

United States

Sequence Read Archive (SRA)

The Sequence Read Archive (SRA) stores raw sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome Analyzer®, Applied Biosystems SOLiD® System, Helicos Heliscope®, Complete Genomics®, and Pacific Biosciences SMRT®.

Data Information

Data portal
United States
META URL
https://catalog.data.gov/dataset/sequence-read-archive-sra
License
other-license-specified
Cost
Provider
U.S. Department of Health & Human Services
Managing department
Data
- Sequence Read Archive (SRA)
- Landing page

Related Data

Public Data Portal

A repository of DNA sequence chromatograms (traces), base calls, and quality estimates for single-pass reads from various large-scale sequencing projects.

Public Data Portal

The BioProject database links to data that have been or will be deposited into archival databases maintained at members of the International Nucleotide Sequence Database Consortium (INSDC, which comprises the DNA DataBank of Japan (DDBJ), the European Nucleotide Archive at European Molecular Biology Laboratory (ENA), and GenBank at the National Center for Biotechnology Information (NCBI)).

NIST test dataset for assessing baseline nucleic acid sequence screening

Public Data Portal

This repository contains the dataset used in the manuscript "Inter-tool analysis of a NIST dataset for assessing baseline nucleic acid sequence screening". NIST constructed the test dataset based on the current screening recommendations from HHS. The dataset is a FASTA formatted file with blinded numerical sequence headers. The dataset was sent to sequence screening tool developers for initial testing and to obtain feedback about its utility for assessing baseline sequence screening. An additional metadata file provides the NIST-assigned label for each sequence, along with a more detailed description derived from the source database.
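
As an illustration of how a FASTA file with blinded numerical headers might be paired with a separate label file, here is a minimal Python sketch. The sequences, the labels, and the names `parse_fasta` and `attach_labels` are hypothetical and are not taken from the NIST dataset; the actual layout of the metadata file is not described here, so a simple header-to-label mapping is assumed.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence data may span several lines; collect them all.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def attach_labels(
    records: Iterable[Tuple[str, str]], labels: Dict[str, str]
) -> List[Tuple[str, str, Optional[str]]]:
    """Pair each record with its label; None when the header has no entry."""
    return [(h, s, labels.get(h)) for h, s in records]


# Hypothetical example data (assumption: headers are plain integers).
example_fasta = """>1
ACGTAC
GTTA
>2
GGGCCC
""".splitlines()
example_labels = {"1": "label_a", "2": "label_b"}

labeled = attach_labels(parse_fasta(example_fasta), example_labels)
```

Keeping the labels in a separate mapping mirrors the blinded design: the sequence file alone reveals nothing about the assigned category.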

Genome Workbench

Public Data Portal

An integrated application for viewing and analyzing sequence data. With Genome Workbench, you can view data in publicly available sequence databases at NCBI, and mix these data with your own data.

Public Data Portal

Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes, and links to genome-, phenotype-, and locus-specific resources worldwide.

Sequencing Data for Hospital Metagenomes

Public Data Portal

FASTA files containing the sequence data for assembled contigs (FASTA), predicted genes (FASTA), and predicted proteins (FASTA), plus gene predictions (GFF v2). This dataset is not publicly accessible because these sequences have already been deposited in publicly available databases, so replication can be avoided. The data are also quite large, and numerous files are associated with these entries, which are included in the links below. It can be accessed through the following web links: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA299404, https://trace.ncbi.nlm.nih.gov/Traces/sra/?study=SRP065069, and http://enve-omics.ce.gatech.edu/data/showerheads. Format: The data represent genome sequencing and assembly of 180 different contigs. This dataset is associated with the following publication: Soto-Giron, M.J., L. Rodriguez, C. Luo, M. Elk, H. Ryu, J. Santodomingo, and K. Konstantinidis. Biofilms on Hospital Shower Hoses: Characterization and Implications for Nosocomial Infections. APPLIED AND ENVIRONMENTAL MICROBIOLOGY. American Society for Microbiology, Washington, DC, USA, 82(9): 2872-2883, (2016).

Sequencing Data Set of Sediment Layers

Public Data Portal

A table (DP_SRA.xlsx) lists one sample per row, with columns for the BioSample accession number (NCBI), collection (date), library strategy, target (source), and sequencing (technology) of each individual sample. The zip file (Genome_Set01.zip) contains nine (9) FASTA files (DP_bin_02.fasta, DP_bin_04.fasta, DP_bin_09.fasta, DP_bin_10.fasta, DP_bin_14.fasta, DP_bin_15.fasta, DP_bin_16a.fasta, DP_bin_20.fasta, DP_bin_23.fasta) with the contig sequences (i.e. binning) for each metagenome-assembled genome (MAG). These data are available from the NCBI Sequence Read Archive (SRA) under the BioProject (https://www.ncbi.nlm.nih.gov/bioproject) with accession number PRJNA646252 and the following BioSample numbers: SAMN15536103 to SAMN15536108. This dataset is associated with the following publication: Gomez-Alvarez, V., H. Liu, J. Pressman, and D. Wahman. Metagenomic Profile of Microbial Communities in a Drinking Water Storage Tank Sediment after Sequential Exposure to Monochloramine, Free Chlorine, and Monochloramine. ENVIRONMENTAL SCIENCE & TECHNOLOGY. American Chemical Society, Washington, DC, USA, 1(5): 1283-1294, (2021).

Public Data Portal

This resource organizes NCBI information, resources, data, and tools and utilities on genomes, including sequences, maps, chromosomes, assemblies, and annotations. It covers sequence and map data from the whole genomes of organisms. The genomes represent both completely sequenced organisms and those for which sequencing is in progress.
