Dataset Details
United States
Genome In A Bottle - v2.0 Genome Stratifications (Deprecated)
These v2.0 stratification BED files from the Global Alliance for Genomics and Health (GA4GH) Benchmarking Team and the Genome in a Bottle Consortium are intended as a standard resource of BED files for use in stratifying true positive, false positive, and false negative variant calls. The v2.0 stratifications have been deprecated and replaced by the v3.0 genome stratifications.
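As a rough illustration of what "stratifying" calls with a BED file means, the sketch below counts true positive, false positive, and false negative calls that fall inside one stratification's intervals and derives precision and recall for that stratum. It is a minimal sketch, not GIAB or GA4GH tooling: the function names, the (chrom, pos, label) call tuples, and the sample intervals are all hypothetical, and it assumes 1-based VCF positions checked against 0-based, half-open BED intervals.

```python
from collections import Counter


def parse_bed(lines):
    """Parse BED text lines into (chrom, start, end) tuples (0-based, half-open)."""
    intervals = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def in_stratum(chrom, pos, intervals):
    """Return True if a 1-based position lies inside any BED interval."""
    zero_based = pos - 1
    return any(c == chrom and s <= zero_based < e for c, s, e in intervals)


def stratified_metrics(calls, intervals):
    """Count TP/FP/FN labels inside the stratum and compute precision and recall.

    `calls` is a hypothetical list of (chrom, pos, label) tuples where label
    is "TP", "FP", or "FN"; a linear scan is used for clarity, not speed.
    """
    counts = Counter(
        label for chrom, pos, label in calls if in_stratum(chrom, pos, intervals)
    )
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    precision = tp / (tp + fp) if tp + fp else None
    recall = tp / (tp + fn) if tp + fn else None
    return {"TP": tp, "FP": fp, "FN": fn, "precision": precision, "recall": recall}


# Made-up stratum and calls, for illustration only.
example_bed = ["chr1\t100\t200", "chr2\t0\t50"]
example_calls = [
    ("chr1", 101, "TP"),
    ("chr1", 150, "FP"),
    ("chr1", 201, "FN"),  # outside: 0-based position 200 is past the interval end
    ("chr1", 200, "TP"),
    ("chr2", 10, "FN"),
    ("chr3", 5, "TP"),    # outside: chromosome not in the stratum
]
example_result = stratified_metrics(example_calls, parse_bed(example_bed))
```

Running the same counts once per stratification file gives a per-region breakdown of performance, which is the purpose these BED files are described as serving.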
Data Information
Related Data
Genome In A Bottle - v3.0 Genome Stratifications
Public Data Portal
These v3.0 stratification BED files from the Global Alliance for Genomics and Health (GA4GH) Benchmarking Team, the Genome in a Bottle Consortium, and the Telomere-to-Telomere Consortium are intended as a standard resource of BED files for use in stratifying true positive, false positive, and false negative variant calls in challenging and targeted regions of the genome. The v3.0 stratifications contain new and revised stratification files and replace the v2.0 stratifications.
Challenging Medically-Relevant Genes Benchmark Set
Public Data Portal
CMRG v1.00 is a small variant benchmark and structural variant benchmark focused on 273 challenging medically relevant genes for the Genome in a Bottle (GIAB) sample HG002 (aka Ashkenazi son). These benchmarks were generated from a trio-based hifiasm v0.11 (https://doi.org/10.1038/s41592-020-01056-5) diploid assembly of HG002, using PacBio HiFi reads from HG002 for assembly and Illumina reads from the parents, HG003 and HG004, for partitioning into phased haplotypes. This benchmark contains VCFs for small and structural variants along with corresponding benchmark BED files; regions covered by a BED file are homozygous reference wherever the VCF has no variant. We extensively curated the variant calls, excluding any found to be questionable or erroneous. This benchmark helps measure performance in important challenging regions, including challenging segmental duplications, regions with complex variants, regions with structural variants, and regions affected by false duplications in GRCh37 or GRCh38. This benchmark is described in https://doi.org/10.1101/2021.06.07.444885.
GIAB Benchmarking of HG002 Assemblies from HPRC Year 1 Bakeoff
Public Data Portal
The Human Pangenome Reference Consortium (HPRC) tested which combination of current genome sequencing and automated assembly approaches yields the most complete, accurate, and cost-effective diploid genome assemblies with minimal manual curation. Assemblies were generated for GIAB HG002. Variant calls from twenty-nine assemblies were evaluated by NIST using dipcall v0.3 (https://github.com/lh3/dipcall) to produce variant calls when aligned to GRCh38. Benchmarking of small variant calls was then performed against GIAB benchmark v4.2.1 using hap.py v3.12 (https://github.com/Illumina/hap.py).
Genome Assembly Data
Public Data Portal
A database providing information on the structure of assembled genomes, assembly names and other metadata, statistical reports, and links to genomic sequence data. The Genomes FTP site FAQ is at https://www.ncbi.nlm.nih.gov/genome/doc/ftpfaq/
Genome
Public Data Portal
This resource organizes NCBI information, resources, data, tools, and utilities on genomes, including sequences, maps, chromosomes, assemblies, and annotations. It provides sequence and map data from the whole genomes of organisms. The genomes represent both completely sequenced organisms and those for which sequencing is in progress.
Sequence Set Browser
Public Data Portal
This site is for browsing WGS (Whole Genome Shotgun) genomes, TSA (Transcriptome Shotgun Assemblies) and TLS (Targeted Locus Study) sets. WGS sequences are incomplete genomes that have been sequenced by a whole genome shotgun strategy. TSA sequences are transcript sequences that have been computationally assembled from primary RNA sequence data. TLS sequences are large-scale marker gene sequencing studies. Please consult WGS Submission or TSA Submission pages for more details. https://www.ncbi.nlm.nih.gov/genbank/wgs https://www.ncbi.nlm.nih.gov/genbank/tsa
Database of Genotype and Phenotype (dbGaP)
Public Data Portal
The Database of Genotype and Phenotype (dbGaP) was developed to archive and distribute the data and results from studies that have investigated the interaction of genotype and phenotype in humans.