Database of Genomic Structural Variation (dbVar)
Public Data Portal
Database of Genomic Structural Variation (dbVar) is NCBI's database of human genomic structural variation: large variants (>50 bp), including insertions, deletions, duplications, inversions, mobile elements, translocations, and complex variants.
High-throughput genotyping of single nucleotide polymorphisms with rolling circle amplification
Public Data Portal
Background: Single nucleotide polymorphisms (SNPs) are the foundation of powerful complex trait and pharmacogenomic analyses. The availability of large SNP databases, however, has emphasized a need for inexpensive SNP genotyping methods of commensurate simplicity, robustness, and scalability. We describe a solution-based, microtiter plate method for SNP genotyping of human genomic DNA. The method is based upon allele discrimination by ligation of open circle probes followed by rolling circle amplification of the signal using fluorescent primers. Only the probe with a 3' base complementary to the SNP is circularized by ligation. Results: SNP scoring by ligation was optimized to a 100,000-fold discrimination against probes mismatched to the SNP. The assay was used to genotype 10 SNPs from a set of 192 genomic DNA samples in a high-throughput format. Assaying directly from genomic DNA eliminates the need to preamplify the target, as is done in many other genotyping methods. The sensitivity of the assay was demonstrated by genotyping from 1 ng of genomic DNA. We demonstrate that the assay can detect a single molecule of the circularized probe. Conclusions: Compatibility with homogeneous formats and the ability to assay small amounts of genomic DNA meet the exacting requirements of automated, high-throughput SNP scoring.
Data from: Development of a versatile resource from 1500 diverse genomes for post-genomics research
Public Data Portal
This data set contains 32 million annotated SNPs, with an average density of 30 SNPs per kb and 12 non-synonymous SNPs per gene model. These SNPs were identified from a genetically diverse, worldwide collection of soybean germplasm representing wild, landrace, and improved cultivars. A combination of new and publicly available re-sequencing data was used in this analysis. The accession genotypes and their annotations are described in the manuscript titled "Analysis and characterization of 1500 diverse genome sequences as a versatile resource for post-genomics research".
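As a rough sanity check on the two figures quoted above, the SNP count and the mean density together imply a sequence span. The sketch below assumes the density is simply total SNPs divided by the span over which they were called; the description does not state how the authors computed it, and the function name is hypothetical.

```python
def implied_span_kb(total_snps: int, snps_per_kb: float) -> float:
    """Sequence length (in kb) implied by a SNP count and a mean SNP density."""
    return total_snps / snps_per_kb


TOTAL_SNPS = 32_000_000  # "32 million annotated SNPs" (from the description)
SNPS_PER_KB = 30         # "30 SNPs per kb" (from the description)

span_kb = implied_span_kb(TOTAL_SNPS, SNPS_PER_KB)
span_mb = span_kb / 1_000  # 1 Mb = 1,000 kb

print(f"Implied span: {span_mb:,.0f} Mb")
```

Under that assumption the implied span is on the order of 1 Gb; treat this only as a consistency check, not as a statement about the reference assembly used.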
Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes, and links to genome-, phenotype-, and locus-specific resources worldwide.
Evaluation of DNA yield from various sources for use in single nucleotide polymorphism panels
Public Data Portal
Wildlife managers and researchers use genetic studies to draw inferences about populations of a species of interest. Microsatellites have been the primary method for gaining these insights, but there is currently a shift from microsatellites to single nucleotide polymorphisms (SNPs). Because the DNA requirements differ, an investigation into which samples can provide adequate DNA yield is warranted. Using samples collected during previous genetic projects from regions in the USA from 2014 to 2021, we investigated the DNA yield of eight sample categories to determine which provided adequate DNA for use in various panels. We found four sample categories that met the DNA requirements for all three panels, three sample categories that met the DNA requirements for only two panels, and one sample category that did not meet the requirements of any of the three panels. Additionally, we used linear random-effects models to determine which covariates would have the greatest influence on DNA yield. We determined that all covariates (tissue type, storage method, preservative, DNA quality, time until DNA extraction, and time after DNA extraction) could influence DNA yield.
Data from: A High-Quality Genome Assembly from a Single, Field-collected Spotted Lanternfly (Lycorma delicatula) using the PacBio Sequel II System
Public Data Portal
A high-quality reference genome is an essential tool for applied and basic research on arthropods. Long-read sequencing technologies may be used to generate more complete and contiguous genome assemblies than alternative technologies. However, long-read methods have historically had greater input DNA requirements and higher costs than next-generation sequencing, which are barriers to their use on many samples. Here, we present a 2.3 Gb de novo genome assembly of a field-collected adult female Spotted Lanternfly (Lycorma delicatula) using a single PacBio SMRT Cell. The Spotted Lanternfly is an invasive species recently discovered in the northeastern United States, threatening to damage economically important crop plants in the region. The DNA from one individual female specimen collected in Reading, Berks County, Pennsylvania was used to make one standard, size-selected library with an average DNA fragment size of ~20 kb. The library was run on one Sequel II SMRT Cell 8M, generating a total of 132 Gb of long-read sequences, of which 82 Gb were from unique library molecules, representing approximately 38x coverage of the genome. The assembly had high contiguity (contig N50 length = 1.5 Mb), completeness, and sequence-level accuracy as estimated by conserved gene set analysis (96.8% of conserved genes both complete and without frame shift errors). Further, it was possible to segregate more than half of the diploid genome into the two separate haplotypes. The assembly also recovered two microbial symbiont genomes known to be associated with L. delicatula, each microbial genome being assembled into a single contig.
We demonstrate that field-collected arthropods can be used for the rapid generation of high-quality genome assemblies, an attractive approach for projects on emerging invasive species, disease vectors, or conservation efforts of endangered species.
Supporting files for the manuscript "A High-Quality Genome Assembly from a Single, Field-collected Spotted Lanternfly (Lycorma delicatula) using the PacBio Sequel II System" include several intermediate versions of the assembly (raw output from Falcon, raw output from Falcon unzip, etc.) as well as the final assembly primary contigs and haplotigs (for the regions of the genome that were phased).
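The abstract above reports a contig N50 and a coverage figure. A minimal sketch of how these two summary statistics are conventionally computed follows; it is not the authors' pipeline. The contig lengths in `toy_contigs` are hypothetical, and dividing unique bases by the 2.3 Gb assembly size is an assumption, since the description does not say which genome-size estimate underlies its "approximately 38x".

```python
def n50(lengths):
    """Largest length L such that contigs of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


def coverage(total_bases, genome_size):
    """Mean depth of coverage: sequenced bases divided by genome size."""
    return total_bases / genome_size


# Hypothetical toy assembly, for illustration only.
toy_contigs = [8, 6, 4, 2]
toy_n50 = n50(toy_contigs)

# 82 Gb of unique-molecule sequence over the 2.3 Gb assembly (assumed denominator).
depth = coverage(82e9, 2.3e9)

print(toy_n50, round(depth, 1))
```

With the assembly size as the denominator the depth comes out slightly below the quoted value, which suggests the description used a somewhat smaller genome-size estimate; the text does not specify it.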
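Infoboss - Native Species Protein Gene Expression Probability Data
Public Data Portal
● Data keywords - genome, gene, NGS, DNA ● Data product information - This product provides gene-family expression probability information for genes obtained through genome analysis of native species. - Functional domains are classified and processed into gene families by functional utility, enzyme, protein, and disease resistance. - Expression probabilities and phylogenetic probabilities for each gene family are calculated through comparative analysis of the data. ● Column information - fasta format ● Usage examples - Using this data product, users can obtain the following information. 1) Baseline data for the development of new drugs, functional foods, and cosmetics ● Period and scope - July 2019 to December 2019. The [original data](https://www.bigdata-forest.kr/product/GNM201201) can be downloaded after logging in and purchasing.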
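Infoboss - Native Plant Species Raw Genome Data
Public Data Portal
● Data keywords - genome, NGS, DNA ● Data product information - This product provides raw genome sequencing data for species identical to plant species native to Korea (baseline NGS data produced in-house by Infoboss). - Produced by next-generation sequencing of DNA and RNA. - Data processed for de novo assembly, gene prediction, functional domain discovery, and phylogenetic analysis. ● Column (attribute) information - fasta format ● Usage examples - Using this data product, users can obtain the following information. 1) Baseline data for the development of new drugs, functional foods, and cosmetics ● Period and scope - July 2019 to December 2019. The [original data](https://www.bigdata-forest.kr/product/GNM100401) can be downloaded after logging in and purchasing.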
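Infoboss - Native Species Protein Gene Data
Public Data Portal
● Data keywords - genome, gene, NGS, DNA ● Data product information - This product provides gene-family information for genes obtained through genome analysis of native species. - Functional domains are classified and processed into gene families by functional utility, enzyme, protein, and disease resistance. - Expression probabilities and phylogenetic probabilities for each gene family are calculated through comparative analysis of the data. ● Column information - fasta format ● Usage examples - Using this data product, users can obtain the following information. 1) Baseline data for the development of new drugs, functional foods, and cosmetics ● Data and period - July 2019 to December 2019. The [original data](https://www.bigdata-forest.kr/product/GNM201001) can be downloaded after logging in and purchasing.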
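The Infoboss products above list FASTA as their only column format. As a hedged illustration of what reading such a file involves, here is a minimal standard-library parser. The record names and sequences in `example` are hypothetical; the actual header layout of the purchased files is not described in the listings.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header = None
    chunks = []
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)  # emit the previous record
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence may wrap across several lines
    if header is not None:
        yield header, "".join(chunks)  # emit the final record


# Hypothetical two-record input, for illustration only.
example = io.StringIO(">gene_1 hypothetical record\nATGGCC\nTTAA\n>gene_2\nMKV\n")
records = list(read_fasta(example))

for name, seq in records:
    print(name, len(seq))
```

For real files, pass an opened file object instead of the `io.StringIO` stand-in; a streaming generator is used so that large genome files need not be held in memory at once.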