Bioinformatics and Genomics: October 2010

A few thoughts on TEs

1) Perhaps ecological risk assessment could be performed by looking at the transcriptome or the proteome of a population / sample. If the transcriptome has many non-coding RNAs that resemble TEs, perhaps the organism is under stress. Similarly, if an organism is intensively expressing enzymes specific to retrotransposition, that may be an indicator of stress.

Side note: I believe the human body is highly resilient; it is a very intelligent machine. Yes, I am anthropomorphizing, and even though it was made through "tinkering" as Dr. King Jordan would say, I still believe that we live in the "best of all possible worlds."

2) Perhaps (and I know that I have suggested this before), the natural history of an organism could be reconstructed by looking at the type I TEs to see how many of each kind exist, and how long each TE has been a part of the genome. This approximate age could be determined by counting the number of mutations that separate each copy, in its present form, from a functional TE. However, it must be taken into account that mutations are almost certainly NOT random. Which brings me to a third idea...

3) Investigating why and how mutations are not random. Does it have to do with the 3-dimensional structure of the DNA (folds and everything)? Does the organism (cell) control if, when, and/or where mutagens can act?

4) Investigate the relative importance of the different means by which organisms deal with stress.

Excerpt from Madlung and Comai, 2004:

Stress, in any form, exerts strong evolutionary pressure on
all organisms. To survive, any organism must develop tolerance,
resistance or avoidance mechanisms. Tolerance
allows the organism to withstand the assault unharmed.
Resistance involves active countermeasures, while avoidance
prevents exposure to the stress.

I believe that it is easy to understand that plants would probably rely much more heavily upon tolerance and resistance, whereas motile organisms like animals can take much greater advantage of avoidance. However, if people (humans) engage in behavior that causes constant and continuous stress, tolerance might have to work overtime. Furthermore, if tolerance (namely the detoxifying mechanisms in the body, e.g. the liver) can no longer prevent the degradation of the integrity of the (genomic) individual, then resistance (or other forms of tolerance) might be induced. As a side note, I think that it is very comical that humans consider themselves the smartest of all creation, and yet we are one of the only organisms that (across the board) finds the most harmful chemicals, toxicants, etc. and makes them habits and lifestyle choices. Most other organisms would "listen" to their bodies and realize that the action that they are taking is harmful and should be discontinued. With that being said, I would expect the people who put their bodies under greater stress to have greater numbers of transpositional events occurring in their DNA.

4b) With this in mind, I note that twins can be identical, even in their DNA, but have vastly different expression patterns for certain (very important) genes. (As a side note for myself, I have found it helpful to think of identical twins when thinking about gene expression, TEs, stressors, and DNA's 3-D structure.) Even if a scientist had the economic resources to perform a complete sequencing of the two genomes, it may be found that not a single mutation occurs in a gene-coding region. And still, the gene expression would be vastly different, perhaps because of DNA's 3-D structure, that is, epigenetic control.

5) Furthermore, I feel as though I have just had an epiphany! Oh happy day! I have just synthesized a postulate that flies in the face of modern genetics. Genetics 101 claims that inheritance of acquired traits is utter nonsense and only happens in the rarest of cases. But what if inheritance of acquired traits is essential and very beneficial? I have an idea of how it could happen, but suffice it to say that expression of most (if not all) genes is affected by the structural formation taken by the DNA. The DNA takes its 3-D, tightly packaged shape from many smaller structures whose location and structure are intimately associated with the precise base pair sequence that the DNA has. What if stressful events modified the DNA (a mutation occurs)? As a side note, the mutation could be exogenously (biotic or abiotic) induced, or there may be some internal controls within the cell that 'cause,' 'select for,' or 'allow for' mutations to occur in a very specific place. The said mutation occurs not within any coded gene; rather, the change in nucleotide leads to conformational changes of the 3-D structure of the DNA, leading to epigenetic up- (or down-) regulation of specified genes.
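To make idea 2 a little more concrete, here is a minimal sketch of how the age of a single TE copy might be estimated from its divergence from a functional (consensus) element. Everything in it is my assumption rather than an established protocol: the function names are made up, the Jukes-Cantor correction assumes that all substitutions are equally likely and neutral (exactly the "mutations are random" premise questioned in idea 3), and the substitution rate is a placeholder that would have to come from the literature for the organism in question.

```python
import math


def jukes_cantor_distance(p):
    """Correct an observed mismatch fraction p for multiple hits at a site.

    Jukes-Cantor model: K = -(3/4) * ln(1 - (4/3) * p).
    Only defined for 0 <= p < 0.75.
    """
    if not 0 <= p < 0.75:
        raise ValueError("mismatch fraction must be in [0, 0.75)")
    return -0.75 * math.log(1 - (4.0 / 3.0) * p)


def estimate_te_age(mismatches, aligned_length, subs_per_site_per_year):
    """Rough age (in years) of one TE copy relative to its consensus.

    mismatches             -- differences between the copy and the consensus
    aligned_length         -- number of aligned positions compared
    subs_per_site_per_year -- ASSUMED neutral substitution rate (placeholder)
    """
    p = mismatches / aligned_length
    k = jukes_cantor_distance(p)
    # The consensus stands in for the ancestral sequence, so age = K / rate.
    return k / subs_per_site_per_year


if __name__ == "__main__":
    # Hypothetical copy: 100 mismatches over 1000 aligned bases,
    # with an illustrative (not measured) rate of 2e-9 per site per year.
    age = estimate_te_age(100, 1000, 2e-9)
    print(f"Estimated age: {age:,.0f} years")
```

If mutations really are non-random, as I suspect, the rate would vary by region and the estimate would be biased, so this is at best a first approximation.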

Just a rant - there needs to be a faster and cheaper way to extract the entire transcriptome from a cell. Why don't you work on that, Kevin?

Comment on TEs

From Madlung and Comai:

To summarize, abiotic stress can result not only in well-programmed
physiological stress responses but also in
genome-wide changes. Stress-induced genomic responses
include transposon activation, transposition, and structural
genome changes. Like other stress responses transposon-mediated
alterations in transcriptional activity of affected
genes might lead to avoidance or tolerance of the stress.
Unlike many other stress responses, however, transpositional
activation appears to be a reaction not directly targeting
an evolutionarily developed physiological pathway but
is a hit-or-miss approach to finding an appropriate way of
handling an unusual challenge.

This is more or less what I was trying to say. It seems it has already been said, and much more eloquently. However, I disagree with their conclusion. They say that "transpositional activation appears to be a reaction not directly targeting..." I believe instead that, while transposition could be a 'random' and disorganized process, perhaps it has evolved a complex regulatory pathway, and that certain parts of the 'junk' DNA (all 98% of it) somehow contain the information necessary to create an appropriate way of handling an unusual challenge. Considering that human life has been around for many hundreds of thousands of years, not to mention all the inherited genetic history, I believe that the human body has faced many of the same challenges that it is faced with today (even allowing for new xenobiotic chemicals and anthropogenic pollution). With this "experience" in our genetic "memory", the cellular machinery "knows" how to respond in a way that is at least a little better than hit-or-miss.

Here's a story: metabolites are accumulating in the cell because the protein that is supposed to process them is more affected than average by the stress. The increasing levels of the metabolite increase the expression of the affected protein (by some feedback loop). The high concentration of the metabolite in the cell and/or the stressor molecule induces the activation of DNA-affecting systems (e.g. TEs), which target the regions where high levels of transcription are taking place. RTs have been shown to use the machinery of cellular division to insert themselves into the genome. Would it be too far of a leap to assume that they (or mutating, inserting elements) could utilize the machinery of transcription as well?

Hints of hidden heritability in GWAS

Gibson, 2010

Although susceptibility loci identified through genome-wide association studies (GWAS) typically explain only a small proportion of the heritability, a classical quantitative genetic analysis now argues that considering together all common SNPs can explain a large proportion of the heritability of these complex traits. A related study provides recommendations for the sample sizes needed in future GWAS to identify additional susceptibility loci.

While GWAS have helped us identify genetic variants associated with many different types of diseases, these associations only explain a few percent of the heritability of complex disease. As I have addressed before, there are two possible reasons that SNPs in GWAS are not capturing the heritability of disease: 1) we are improperly estimating the heritability of the disease (or phenotype), or 2) the common variants of GWAS are not capable of (statistically significantly) capturing the genetic heritability of these phenotypes. It is important to note that these two explanations are not mutually exclusive.

If the former is true, we need to go back and refine our protocol and understanding of the problem of inheritance and more accurately estimate the proportion of phenotypic variation explained by inheritance.

If the latter is true, it would be an interesting question to examine why it might be true. Researchers have proposed many different hypotheses, and generally they are not mutually exclusive. Rare variants, epistasis, epigenetics, and genotype-environment interactions are listed by Greg Gibson as potential sources of heritability. Also noted is the possibility that complex traits emerge from the interaction of thousands of (common) variants with small effects.

So let the great debate begin: what is the reason that GWAS do not identify causal variants (in most cases)? Is it that 1) a few rare variants of high impact, or 2) many common variants of small impact are affecting phenotypes? Both of these scenarios would lead to a situation of low statistical significance. The former because if a variant is rare it will only be present in a few people in the population. For instance, in a study of 5000 people, if the allele has a frequency of only 0.2% in the population, then (assuming Hardy-Weinberg proportions) no subjects would be expected to be homozygous and ~20 subjects are expected to be heterozygous. The stronger the effect, the more likely it would be for a statistical analysis to pick up the association. But just a few phenotypic outliers could dramatically alter the p-value for the association between the rare variant and the phenotype.
The latter is hard to decipher (i.e. to find statistically significant associations for) because if each common variant only explains a small percentage of the variance, a few outliers could also change the results. Another problem is that if there are 5000 people in the study and 1000 common variants are affecting the trait, there is the potential problem of multicollinearity and of not having enough data to fit all of the parameters (one for each of the 1000 variants).
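The rare-variant numbers above (5000 subjects, 0.2%) can be checked with a few lines of Python. This is only a sketch under my own assumptions: I treat the 0.2% as an allele frequency, I assume Hardy-Weinberg proportions (random mating, no selection), and the function name is made up for illustration.

```python
def hardy_weinberg_counts(n_subjects, rare_allele_freq):
    """Expected genotype counts under Hardy-Weinberg equilibrium.

    n_subjects       -- number of people in the study
    rare_allele_freq -- frequency q of the rare allele (ASSUMED to be what
                        the 0.2% in the text refers to)
    """
    q = rare_allele_freq
    p = 1.0 - q
    return {
        "homozygous_common": n_subjects * p * p,    # p^2
        "heterozygous": n_subjects * 2 * p * q,     # 2pq
        "homozygous_rare": n_subjects * q * q,      # q^2
    }


if __name__ == "__main__":
    counts = hardy_weinberg_counts(5000, 0.002)
    for genotype, expected in counts.items():
        print(f"{genotype}: {expected:.2f}")
```

With these inputs the heterozygote term is 5000 x 2 x 0.998 x 0.002 = 19.96 and the rare-homozygote term is 5000 x 0.002^2 = 0.02, which matches the "~20 heterozygous, no homozygous" figures. With only about 20 carriers, it is easy to see how a handful of outliers could swing the association.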

Gibson states, "It is unlikely that GWAS will ever be sufficiently powered to uncover even the majority of the heritability" of complex disease. My reply to that is: what, then, should we be doing to increase our explanatory power? That is, what tests should we be running to elucidate and assign heritability to genes, regulatory elements, or networks of these component parts?

"This [paper by Yang et al] presents an elegant argument that most of the heritability [of height] is hidden rather than missing and hence, that there is no pressing need to invoke more complex genetic mechanisms to explain height."
If that is the case, then I want to know how to uncover the hidden heritability, because that is going to yield causal variants, which will lead to molecular mechanisms for the condition, be it height or a complex disease like T2D or CVD.

Sunday, October 24, 2010
