Dataset details
United States
Analysis of a human brain transcriptome map
Background: Genome-wide transcriptome maps can provide tools to identify candidate genes that are over-expressed or silenced in certain diseased tissues, and can increase our understanding of the structure and organization of the genome. Expressed Sequence Tags (ESTs) from the public dbEST and proprietary Incyte LifeSeq databases were used to derive a transcript map in conjunction with the working draft assembly of the human genome sequence. Results: Examination of ESTs derived from brain tissues (excluding brain tumor tissues) suggests that the corresponding genes are distributed on chromosomes in a non-random fashion. Some regions of the genome are dense with brain-enriched genes while others lack them, suggesting a significant correlation between the distribution of genes along the chromosome and tissue type. ESTs from brain tumor tissues have also been mapped to the human genome working draft. We show that some regions enriched in brain genes have significantly decreased gene expression in brain tumors and, conversely, that some regions lacking brain genes have increased gene expression in brain tumors. Conclusions: This report demonstrates a novel approach for tissue-specific transcriptome mapping using EST-based quantitative assessment.
Data information
Related data
Survey of transcripts in the adult
Public Data Portal
Background: Classic methods of identifying genes involved in neural function include the laborious process of behavioral screening of mutagenized flies and then rescreening candidate lines for pleiotropic effects due to developmental defects. To accelerate the molecular analysis of brain function in Drosophila we constructed a cDNA library exclusively from adult brains. Our goal was to begin to develop a catalog of transcripts expressed in the brain. These transcripts are expected to contain a higher proportion of clones that are involved in neuronal function. Results: The library contains approximately 6.75 million independent clones. From our initial characterization of 271 randomly chosen clones, we expect that approximately 11% of the clones in this library will identify transcribed sequences not found in expressed sequence tag databases. Furthermore, 15% of these 271 clones are not among the 13,601 predicted Drosophila genes. Conclusions: Our analysis of this unique Drosophila brain library suggests that the number of genes may be underestimated in this organism. This work complements the Drosophila genome project by providing information that facilitates more complete annotation of the genomic sequence. This library should be a useful resource that will help in determining how basic brain functions operate at the molecular level.
A rapid method to map mutations in
Public Data Portal
Background: Genetic screens in Drosophila have provided a wealth of information about a variety of cellular and developmental processes. It is now possible to screen for mutant phenotypes in virtually any cell at any stage of development by performing clonal screens using the flp/FRT system. However, the rate-limiting step in the analysis of these mutants is often the identification of the mutated gene, because traditional mapping strategies rely mainly on genetic and cytological markers that are not easily linked to the molecular map. Results: Here we describe the development of a single-nucleotide polymorphism (SNP) map for chromosome arm 3R. The map contains 73 polymorphisms between the standard FRT chromosome and a mapping chromosome that carries several visible markers (rucuca), at an average density of one SNP per 370 kilobases (kb). Using this collection, we show that mutants can be mapped to a 400 kb interval in a single meiotic mapping cross, with only a few hundred SNP detection reactions. Discovery of further SNPs in the region of interest allows the mutation to be mapped with the same recombinants to a region of about 50 kb. Conclusion: The combined use of standard visible markers and molecular polymorphisms in a single mapping strategy greatly reduces both the time and cost of mapping mutations, because it requires at least four times fewer SNP detection reactions than a standard approach. The use of this map, or others developed along the same lines, will greatly facilitate the identification of the molecular lesions in mutants from clonal screens.
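As a rough consistency check on the figures quoted in the abstract above, the following sketch multiplies the number of SNPs by the average spacing to estimate the span the map covers, and compares the first-pass and refined mapping intervals. The variable and function names are illustrative and do not come from the source, and the calculation assumes evenly spaced markers, which a real map will only approximate.

```python
# Figures taken from the abstract; all names here are illustrative only.
SNP_COUNT = 73                  # polymorphisms in the 3R map
AVG_SPACING_KB = 370            # average density: one SNP per 370 kb
FIRST_PASS_INTERVAL_KB = 400    # resolution from a single meiotic mapping cross
REFINED_INTERVAL_KB = 50        # resolution after discovering further SNPs


def approx_span_mb(snp_count: int, spacing_kb: float) -> float:
    """Approximate span covered by evenly spaced markers, in megabases."""
    return snp_count * spacing_kb / 1000


span_mb = approx_span_mb(SNP_COUNT, AVG_SPACING_KB)
refinement_factor = FIRST_PASS_INTERVAL_KB / REFINED_INTERVAL_KB

print(f"Approximate span covered by the map: {span_mb:.1f} Mb")
print(f"Interval narrows {refinement_factor:.0f}-fold after further SNP discovery")
```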
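Hallym University Industry-Academic Cooperation Foundation - Mild Cognitive Impairment Cohort Genomic Test Data (2022)
Public Data Portal
Data from whole genome sequencing of blood samples from patients with mild cognitive impairment, analyzing genetic disease risk in relation to the onset of specific diseases (e.g., the probability of developing a particular cancer).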
A draft annotation and overview of the human genome
Public Data Portal
Background: The recent draft assembly of the human genome provides a unified basis for describing genomic structure and function. The draft is sufficiently accurate to provide useful annotation, enabling direct observations of previously inferred biological phenomena. Results: We report here a functionally annotated human gene index placed directly on the genome. The index is based on the integration of public transcript, protein, and mapping information, supplemented with computational prediction. We describe numerous global features of the genome and examine the relationship of various genetic maps with the assembly. In addition, initial sequence analysis reveals highly ordered chromosomal landscapes associated with paralogous gene clusters and distinct functional compartments. Finally, these annotation data were synthesized to produce observations of gene density and number that accord well with historical estimates. Such a global approach had previously been described only for chromosomes 21 and 22, which together account for 2.2% of the genome. Conclusions: We estimate that the genome contains 65,000-75,000 transcriptional units, with exon sequences comprising 4%. The creation of a comprehensive gene index requires the synthesis of all available computational and experimental evidence.
GenBank
Public Data Portal
NIH Genetic sequence database; an annotated collection of all publicly available DNA sequences.
Full-length messenger RNA sequences greatly improve genome annotation
Public Data Portal
Background: Annotation of eukaryotic genomes is a complex endeavor that requires the integration of evidence from multiple, often contradictory, sources. With the ever-increasing amount of genome sequence data now available, methods for accurate identification of large numbers of genes have become urgently needed. In an effort to create a set of very high-quality gene models, we used the sequence of 5,000 full-length gene transcripts from Arabidopsis to re-annotate its genome. We have mapped these transcripts to their exact chromosomal locations and, using alignment programs, have created gene models that provide a reference set for this organism. Results: Approximately 35% of the transcripts indicated that previously annotated genes needed modification, and 5% of the transcripts represented newly discovered genes. We also discovered that multiple transcription initiation sites appear to be much more common than previously known, and we report numerous cases of alternative mRNA splicing. We include a comparison of different alignment software and an analysis of how the transcript data improved the previously published annotation. Conclusions: Our results demonstrate that sequencing of large numbers of full-length transcripts followed by computational mapping greatly improves identification of the complete exon structures of eukaryotic genes. In addition, we are able to find numerous introns in the untranslated regions of the genes.
Improved analytical methods for microarray-based genome-composition analysis
Public Data Portal
Genome-composition analysis using microarrays can be used to categorize genes into 'present' and 'divergent' categories. This involves selecting a signal value that is used as a cutoff to discriminate present and divergent genes, but this can result in the misclassification of many genes. A method is described that depends on the shape of the signal-ratio distribution and does not require empirical determination of a cutoff. Many genes previously classified as present using static methods are in fact divergent on the basis of microarray signal; this is corrected by our algorithm.
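To illustrate why a fixed cutoff can misclassify genes, here is a minimal sketch of the static approach only; it does not implement the distribution-shape method the abstract describes, whose details are not given here. The gene names and log-ratio values are invented for illustration, and the convention that a lower test/reference ratio means 'divergent' is an assumption.

```python
def classify_static(log_ratios, cutoff):
    """Label a gene 'present' if its log ratio is >= cutoff, else 'divergent'.

    Assumes (hypothetically) that lower test/reference signal means divergence.
    """
    return {
        gene: ("present" if value >= cutoff else "divergent")
        for gene, value in log_ratios.items()
    }


# Hypothetical log2(test/reference) signal ratios, invented for illustration.
log_ratios = {
    "geneA": 0.1,
    "geneB": -0.4,
    "geneC": -0.9,
    "geneD": -1.6,
    "geneE": -2.5,
}

# The same data yields different calls depending on the cutoff chosen.
for cutoff in (-0.5, -1.0, -2.0):
    labels = classify_static(log_ratios, cutoff)
    divergent = sorted(g for g, label in labels.items() if label == "divergent")
    print(f"cutoff={cutoff}: divergent={divergent}")
```

Genes with borderline ratios change category as the cutoff moves, which is the sensitivity that a cutoff-free method aims to avoid.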
Expression profiling of
Public Data Portal
A combination of linear RNA amplification and DNA microarray hybridization has allowed the determination of expression profiles of individual imaginal discs and larval tissues and the identification of genes expressed in tissue-specific patterns.