Education Data Utilization and Support Service

Dataset Details

United States

Estimation of genetic distances from human and mouse introns

Background: Using genetic distances measured from exons, it has been observed that the mutation rate is not constant along mammalian chromosomes. Exons constitute only 1% of the human genome, however, and thus they cannot provide a complete picture of the mutational variation in the genome.
Results: I calculated genetic distances between 504 human introns and their orthologous mouse counterparts from a set of 63 pairs of human and mouse genes scattered through the genome using a recently developed method that can extract reliably aligned regions from the introns in an objective manner. I found a significant correlation between the genetic distance measured in the conserved intron segments and the synonymous and nonsynonymous distances measured in the corresponding coding exons, indicating that genes with fast-evolving exons tend to have fast-evolving introns, and vice versa.
Conclusions: These results indicate that introns, which extend over almost a quarter of the human genome, contain useful information for fully understanding the mutational dynamics of human and mouse genomes. This work also supports the idea that there is a mutational force that fluctuates nonrandomly along the genome, and shows for the first time that this force affects the introns and the synonymous and nonsynonymous positions in the exons of the genes simultaneously.
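As a rough illustration of what a genetic distance between aligned orthologous segments means, the sketch below computes the proportion of differing sites and applies the Jukes-Cantor correction. This is an assumption made for illustration: the description above does not say which substitution model or alignment-filtering method the dataset used, and the two short sequences are made-up placeholders, not data from the dataset.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, counting only columns where both are A/C/G/T."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"  # skip gaps and ambiguous bases
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - 4p/3)."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical aligned fragments (placeholders, not real intron sequence).
human = "ACGTACGTAC-GTACGTTAGC"
mouse = "ACGTACCTACAGTACGTTGGC"

p = p_distance(human, mouse)
d = jukes_cantor(p)
print(f"p-distance = {p:.3f}, Jukes-Cantor distance = {d:.4f}")
```

The corrected distance is always at least as large as the raw proportion because it accounts for multiple substitutions at the same site.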

Data Information

Data portal: United States
META URL: https://catalog.data.gov/dataset/estimation-of-genetic-distances-from-human-and-mouse-introns
License: notspecified
Cost: (none listed)
Provider: U.S. Department of Health & Human Services
Managing department: (none listed)
Data:
- Official Government Data Source
- Landing page

Related Data

A hierarchical statistical model for estimating population properties of quantitative genes

Public Data Portal

Background: Earlier methods for detecting major genes responsible for a quantitative trait rely critically upon a well-structured pedigree in which the segregation pattern of genes exactly follows Mendelian inheritance laws. However, for many outcrossing species, such pedigrees are not available and genes also display population properties.
Results: In this paper, a hierarchical statistical model is proposed to monitor the existence of a major gene based on its segregation and transmission across two successive generations. The model is implemented with an EM algorithm to provide maximum likelihood estimates for genetic parameters of the major locus. This new method is successfully applied to identify an additive gene having a large effect on stem height growth of aspen trees. The estimates of population genetic parameters for this major gene can be generalized to the original breeding population from which the parents were sampled. A simulation study is presented to evaluate finite sample properties of the model.
Conclusions: A hierarchical model was derived for detecting major genes affecting a quantitative trait based on progeny tests of outcrossing species. The new model takes into account the population genetic properties of genes and is expected to enhance the accuracy, precision and power of gene detection.

High correlation between the turnover of nucleotides under mutational pressure and the DNA composition

Public Data Portal

Background: Any DNA sequence is the result of a compromise between the selection and mutation pressures exerted on it during evolution. It is difficult to estimate the relative influence of each of these pressures on the rate of accumulation of substitutions. However, it is important to discriminate between the effect of mutations and the effect of selection when studying the phylogenetic relations between taxa.
Results: We have tested in computer simulations, and analytically, the available substitution matrices for many genomes, and we have found that DNA strands in equilibrium under mutational pressure have a unique feature: the fraction of each type of nucleotide is linearly dependent on the time needed for substitution of half of the nucleotides of a given type, with a correlation coefficient close to 1. Substitution matrices found for sequences under selection pressure do not have this property. A substitution matrix for the leading strand of the Borrelia burgdorferi genome, having reached equilibrium in computer simulation, gives a DNA sequence with nucleotide composition and asymmetry corresponding precisely to the third positions in codons of protein coding genes located on the leading strand.
Conclusions: Parameters of mutational pressure allow us to calculate the DNA composition in equilibrium with this mutational pressure. By comparing any real DNA sequence with the sequence in equilibrium, it is possible to estimate the distance between these sequences, which could be used as a measure of the selection pressure. Furthermore, the parameters of the mutational pressure enable direct estimation of the relative mutation rates in any DNA sequence in the studied genome.
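The idea of a DNA composition "in equilibrium with" a substitution matrix can be sketched as the stationary distribution of a four-state Markov chain. The code below is a minimal sketch under stated assumptions: the rate values are hypothetical (not taken from the dataset or from Borrelia burgdorferi), and the half-substitution time is interpreted here as the number of steps after which half of the nucleotides of a type have been replaced, which may differ from the authors' exact definition. A made-up matrix need not reproduce the near-linear relation the abstract reports.

```python
import math

NUCS = "ACGT"

# Hypothetical per-step substitution probabilities (outer key = from, inner key = to).
RATES = {
    "A": {"C": 0.010, "G": 0.030, "T": 0.015},
    "C": {"A": 0.020, "G": 0.010, "T": 0.045},
    "G": {"A": 0.040, "C": 0.010, "T": 0.020},
    "T": {"A": 0.015, "C": 0.025, "G": 0.010},
}


def transition_matrix(rates):
    """Build a row-stochastic matrix; the diagonal holds the 'no change' probability."""
    matrix = []
    for src in NUCS:
        row = [rates[src].get(dst, 0.0) for dst in NUCS]
        row[NUCS.index(src)] = 1.0 - sum(row)
        matrix.append(row)
    return matrix


def equilibrium(matrix, steps=5000):
    """Approximate the stationary composition by repeatedly applying the matrix."""
    freqs = [0.25] * 4
    for _ in range(steps):
        freqs = [sum(freqs[i] * matrix[i][j] for i in range(4)) for j in range(4)]
    return freqs


def half_times(matrix):
    """Steps t with (stay probability)^t = 0.5 for each nucleotide type."""
    return [math.log(0.5) / math.log(matrix[i][i]) for i in range(4)]


P = transition_matrix(RATES)
pi = equilibrium(P)
t_half = half_times(P)
for nuc, freq, t in zip(NUCS, pi, t_half):
    print(f"{nuc}: equilibrium fraction = {freq:.4f}, half-substitution time = {t:.1f} steps")
```

Plotting the equilibrium fractions against the half-substitution times for a real mutational matrix is one way to check the linear dependence described above.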

Major genomic mitochondrial lineages delineate early human expansions

Public Data Portal

Background: The phylogeographic distribution of human mitochondrial DNA variations allows a genetic approach to the study of modern Homo sapiens dispersals throughout the world from a female perspective. As a new contribution to this study we have phylogenetically analysed complete mitochondrial DNA (mtDNA) sequences from 42 human lineages, representing major clades with known geographic assignation.
Results: We show the relative relationships among the 42 lineages and present more accurate temporal calibrations than have been previously possible to give new perspectives on how modern humans spread in the Old World.
Conclusions: The first detectable expansion occurred around 59,000–69,000 years ago from Africa, independently colonizing western Asia and India and, following this southern route, swiftly reaching east Asia. Within Africa, this expansion did not replace but mixed with older lineages detectable today only in Africa. Around 39,000–52,000 years ago, the western Asian branch spread radially, bringing Caucasians to North Africa and Europe, also reaching India, and expanding to north and east Asia. More recent migrations have entangled but not completely erased these primitive footprints of modern human expansions.

Sample collection information and microsatellite data for Gunnison sage-grouse pre and post translocation

Public Data Portal

Maintenance of genetic diversity is important for conserving species, especially those with fragmented habitats and/or ranges. In the absence of natural dispersal, translocation can be used to achieve this goal. However, the long-term impacts from translocation can be expensive and difficult to evaluate. This dataset is used to evaluate genetic change as a result of translocation and represents samples collected before and after translocations were conducted.

The genetic structure of recombinant inbred mice: high-resolution consensus maps for complex trait analysis

Public Data Portal

Background: Recombinant inbred (RI) strains of mice are an important resource used to map and analyze complex traits. They have proved particularly effective in multidisciplinary genetic studies. Widespread use of RI strains has been hampered by their modest numbers and by the difficulty of combining results derived from different RI sets.
Results: We have increased the density of typed microsatellite markers two- to five-fold in each of several major RI sets that share C57BL/6 as a parental strain (AXB, BXA, BXD, BXH and CXB). A common set of 490 markers was genotyped in just over 100 RI strains. Genotypes of around 1,100 additional microsatellites in one or more RI sets were generated, collected and checked for errors. Consensus RI maps that integrate genotypes of approximately 1,600 microsatellite loci were assembled. The genomes of individual strains typically incorporate 45-55 recombination breakpoints. The collected RI set - termed the BXN set - contains approximately 5,000 breakpoints. The distribution of recombinations approximates a Poisson distribution and distances between breakpoints average about 0.5 centimorgans (cM). Locations of most breakpoints have been defined with a precision of < 2 cM. Genotypes deviate from Hardy-Weinberg equilibrium in only a small number of intervals.
Conclusions: Consensus maps derived from RI strains conform almost exactly to theoretical expectation and are close to the length predicted by the Haldane-Waddington equation (×3.6 for a 2-3 cM interval between markers). Non-syntenic associations between different chromosomes introduce predictable distortions in quantitative trait locus (QTL) datasets that can be partly corrected using two-locus correlation matrices.
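The map expansion mentioned in the conclusions can be illustrated with the Haldane-Waddington relation for RI strains produced by sib mating, R = 4r / (1 + 6r), where r is the per-meiosis recombination fraction and R is the fraction of recombinant RI strains. The sketch below assumes that relation applies and treats a small interval in cM as a recombination fraction (2 cM taken as r = 0.02); the function names are illustrative, not from the dataset.

```python
def ri_recombination_fraction(r):
    """Haldane-Waddington relation for sib-mated RI strains: R = 4r / (1 + 6r)."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    return 4.0 * r / (1.0 + 6.0 * r)


def map_expansion(r):
    """Ratio R / r: how much an interval is stretched on an RI map."""
    if r <= 0.0:
        raise ValueError("expansion is undefined for r <= 0")
    return ri_recombination_fraction(r) / r


# Intervals of 2 cM and 3 cM, approximated as r = 0.02 and r = 0.03.
for r in (0.02, 0.03):
    print(f"r = {r:.2f}: R = {ri_recombination_fraction(r):.4f}, "
          f"expansion = {map_expansion(r):.2f}x")
```

For very small intervals the expansion approaches 4, and it shrinks as r grows, which is why the expected factor for a 2-3 cM spacing falls somewhat below 4.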

A simple method for statistical analysis of intensity differences in microarray-derived gene expression data

Public Data Portal

Background: Microarray experiments offer a potent solution to the problem of making and comparing large numbers of gene expression measurements either in different cell types or in the same cell type under different conditions. Inferences about the biological relevance of observed changes in expression depend on the statistical significance of the changes. In lieu of many replicates with which to determine accurate intensity means and variances, reliable estimates of statistical significance remain problematic. Without such estimates, overly conservative choices for significance must be enforced.
Results: A simple statistical method for estimating variances from microarray control data which does not require multiple replicates is presented. Comparison of datasets from two commercial entities using this difference-averaging method demonstrates that the standard deviation of the signal scales at a level intermediate between the signal intensity and its square root. Application of the method to a dataset related to the β-catenin pathway yields a larger number of biologically reasonable genes whose expression is altered than the ratio method.
Conclusions: The difference-averaging method enables determination of variances as a function of signal intensities by averaging over the entire dataset. The method also provides a platform-independent view of important statistical properties of microarray data.

Comparison of complete nuclear receptor sets from the human,

Public Data Portal

Background: The availability of complete genome sequences enables all the members of a gene family to be identified without limitations imposed by temporal, spatial or quantitative aspects of mRNA expression. Using the nearly completed human genome sequence, we combined in silico and experimental approaches to define the complete human nuclear receptor (NR) set. This information was used to carry out a comparative genomic study of the NR superfamily.
Results: Our analysis of the human genome identified two novel NR sequences. Both these contained stop codons within the coding regions, indicating that both are pseudogenes. One (HNF4 γ-related) contained no introns and expressed no detectable mRNA, whereas the other (FXR-related) produced mRNA at relatively high levels in testis. If translated, the latter is predicted to encode a short, non-functional protein. Our analysis indicates that there are fewer than 50 functional human NRs, dramatically fewer than in Caenorhabditis elegans and about twice as many as in Drosophila. Using the complete human NR set we made comparisons with the NR sets of C. elegans and Drosophila. Searches for the >200 NRs unique to C. elegans revealed no human homologs. The comparative analysis also revealed a Drosophila member of NR subfamily NR3, confirming an ancient metazoan origin for this subfamily.
Conclusions: This work provides the basis for new insights into the evolution and functional relationships of NR superfamily members.

A genomic timescale for the origin of eukaryotes

Public Data Portal

Background: Genomic sequence analyses have shown that horizontal gene transfer occurred during the origin of eukaryotes as a consequence of symbiosis. However, details of the timing and number of symbiotic events are unclear. A timescale for the early evolution of eukaryotes would help to better understand the relationship between these biological events and changes in Earth's environment, such as the rise in oxygen. We used refined methods of sequence alignment, site selection, and time estimation to address these questions with protein sequences from complete genomes of prokaryotes and eukaryotes.
Results: Eukaryotes were found to evolve faster than prokaryotes, with those eukaryotes derived from eubacteria evolving faster than those derived from archaebacteria. We found an early time of divergence (~4 billion years ago, Ga) for archaebacteria and the archaebacterial genes in eukaryotes. Our analyses support at least two horizontal gene transfer events in the origin of eukaryotes, at 2.7 Ga and 1.8 Ga. Time estimates for the origin of cyanobacteria (2.6 Ga) and the divergence of an early-branching eukaryote that lacks mitochondria (Giardia) (2.2 Ga) fall between those two events.
Conclusions: We find support for two symbiotic events in the origin of eukaryotes: one premitochondrial and a later mitochondrial event. The appearance of cyanobacteria immediately prior to the earliest undisputed evidence for the presence of oxygen (2.4–2.2 Ga) suggests that the innovation of oxygenic photosynthesis had a relatively rapid impact on the environment as it set the stage for further evolution of the eukaryotic cell.

Conservation of long-range synteny and microsynteny between the genomes of two distantly related nematodes

Public Data Portal

To assess whether the pattern of high rates of genome rearrangement, with a bias towards within-chromosome events, is true of nematodes in general, genome sequence was used to compare the model Caenorhabditis elegans and the filarial parasite Brugia malayi. It is suggested that intrachromosomal rearrangement is a major force driving chromosomal organization in nematodes.

Exploring the Effects of Experimental Parameters and Data Modeling Approaches on In Vitro Transcriptomic Point-of-Departure Estimates

Public Data Portal

Dataset for 'Exploring the Effects of Experimental Parameters and Data Modeling Approaches on In Vitro Transcriptomic Point-of-Departure Estimates' published in Toxicology December 2023, DOI https://doi.org/10.1016/j.tox.2023.153694. This dataset is associated with the following publication: Harrill, J., L. Everett, D. Haggard, J. Bundy, C. Willis, I. Shah, K. Friedman, D. Basili, A. Middleton, and R. Judson. Exploring the Effects of Experimental Parameters and Data Modeling Approaches on In Vitro Transcriptomic Point-of-Departure Estimates. TOXICOLOGY. Elsevier Science Ltd, New York, NY, USA, 501: 153694, (2024).
