Functional genomics

First, remember what we have covered so far. 
                                                                                                      
Whole genome sequencing – Can give us information about

What is the level we tend to think about when we have whole genome sequence?

So, once you know the genomic sequence, what are your other questions and how do you address them?

 

Think about the levels of organization we discussed.  Each molecule in these levels has a “life” that can be described.  Good webpage for basic structures, information

DNA 

RNA

Protein

Lipid

Metabolite

 

What is gene expression?  This seems like an easy question.  How do you think of it and can you defend your definition?

Relative concentrations of macromolecules
Amino acids (1 and 2)

To really think deeply about this, we need some review:

Weak interactions (1 and 2)

Delta G

ΔG°' = -RT ln Keq
ΔG = ΔG°' + RT ln([P]/[R]) = -RT ln Keq + RT ln([P]/[R])
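
As a quick check on the two free-energy relations above, here is a minimal Python sketch. The equilibrium constant (1e9) and the temperature (298.15 K) are illustrative assumptions, not values from the lecture or the paper, and the function names are made up for this example.

```python
import math

R = 8.314  # gas constant in J/(mol*K)

def standard_delta_g(keq, temp_k=298.15):
    """Standard free energy change: dG0' = -RT ln Keq (J/mol)."""
    return -R * temp_k * math.log(keq)

def delta_g(keq, products, reactants, temp_k=298.15):
    """Actual free energy change: dG = -RT ln Keq + RT ln([P]/[R]) (J/mol)."""
    return standard_delta_g(keq, temp_k) + R * temp_k * math.log(products / reactants)

# Hypothetical equilibrium constant, chosen only for illustration.
KEQ = 1e9

dg0 = standard_delta_g(KEQ)
# When [P]/[R] equals Keq the system is at equilibrium, so dG should be zero.
dg_at_equilibrium = delta_g(KEQ, products=1e9, reactants=1.0)
# When [P]/[R] is below Keq, dG is negative and the reaction proceeds forward.
dg_far_from_equilibrium = delta_g(KEQ, products=1.0, reactants=1.0)

print(f"dG0' = {dg0 / 1000:.1f} kJ/mol")
print(f"dG at equilibrium = {dg_at_equilibrium:.3f} J/mol")
```

Note how a large Keq gives a strongly negative ΔG°', while ΔG itself depends on how far the actual concentration ratio is from Keq.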

 

DNA base pairs

 

But let’s think about how proteins bind DNA

Functional Specificity of a Hox Protein Mediated by the Recognition of Minor Groove Structure
Rohit Joshi,1 Jonathan M. Passner,4,5 Remo Rohs,1,2,5 Rinku Jain,4,5 Alona Sosinsky,1,2,5
Michael A. Crickmore,1,3 Vinitha Jacob,4 Aneel K. Aggarwal,4,* Barry Honig,1,2,* and Richard S. Mann1,*
1Department of Biochemistry and Molecular Biophysics
2Howard Hughes Medical Institute
3Department of Biological Sciences
Columbia University, 701 W. 168th St. HHSC 1104, New York, NY 10032, USA
4Department of Structural and Chemical Biology, Mount Sinai School of Medicine, 1425 Madison Avenue, New York,
NY 10029, USA
5These authors contributed equally to this work.
*Correspondence: aneel.aggarwal@

 

SUMMARY
The recognition of specific DNA-binding sites by transcription factors is a critical yet poorly
understood step in the control of gene expression. Members of the Hox family of transcription
factors bind DNA by making nearly identical major groove contacts via the recognition helices
of their homeodomains. In vivo specificity, however, often depends on extended and unstructured
regions that link Hox homeodomains to a DNA-bound cofactor, Extradenticle (Exd).
Using a combination of structure determination, computational analysis, and in vitro and in vivo
assays, we show that Hox proteins recognize specific Hox-Exd binding sites via residues located
in these extended regions that insert into the minor groove but only when presented with the correct DNA sequence. Our results suggest that these residues, which are conserved
in a paralog-specific manner, confer specificity by recognizing a sequence-dependent DNA structure instead of directly reading a specific DNA sequence.

 

Read the introduction: How many HOX genes are there in flies? How will you find out? What is a HOX paralog?

 

The authors say that HOX N-terminal arms are required for specificity but are mostly disordered. Why do you think this presents a challenge to model builders?

from paper

 

single letter AA code
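
For reading the sequence alignments in the paper, the single-letter code can be kept handy as a small lookup table. The sketch below uses the standard 20 amino acids; the helper name `expand` and the example peptide are hypothetical and are not taken from the paper.

```python
# Standard single-letter to three-letter amino acid codes.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

def expand(peptide):
    """Spell out a single-letter peptide string using three-letter codes."""
    return "-".join(ONE_TO_THREE[aa] for aa in peptide.upper())

# Hypothetical three-residue peptide, for illustration only.
print(expand("HRK"))
```

His (H), Arg (R), and Lys (K) are the basic residues, which is worth keeping in mind when thinking about contacts with the negatively charged DNA backbone.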

In the figure below, discuss the challenges you see, based on the amino acids that appear to be important. What do they tell you about this binding?
from paper

What does the figure below tell you about stabilization of DNA-TF interactions?  These interactions are highly conserved among Scr TFs in both vertebrates and invertebrates. 

from paper
But, remember, these are crystallized structures.  AT-rich regions often lead to narrow minor grooves due in large part to negative propeller twisting (Crothers and Shakked, 1999).
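
To get a feel for where AT-rich stretches fall in a sequence, a sliding-window scan is a reasonable first pass. This is only a sketch under strong assumptions: AT fraction is a crude proxy and does not model propeller twist or groove width, and the window size, threshold, and example sequence below are all hypothetical.

```python
def at_fraction_windows(seq, window=4):
    """Return (start, subsequence, AT fraction) for every window in seq."""
    seq = seq.upper()
    results = []
    for i in range(len(seq) - window + 1):
        chunk = seq[i:i + window]
        at_count = sum(1 for base in chunk if base in "AT")
        results.append((i, chunk, at_count / window))
    return results

def at_rich_positions(seq, window=4, threshold=1.0):
    """Start positions (0-based) of windows whose AT fraction meets the threshold."""
    return [i for i, _, frac in at_fraction_windows(seq, window) if frac >= threshold]

# Hypothetical sequence, for illustration only.
example = "GGCAATTTCCGG"
print(at_rich_positions(example))
```

Windows flagged this way are only candidates for a narrow minor groove; whether the groove is actually narrow, or merely easy to narrow, is the question the structures and the review raise.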

from paper
Now they want to see if there are real phenotypic effects of changes.  How important is the minor groove binding?

What does the figure below say about the role of the His and Arg in Scr function?  What does it say about the dynamics of this binding?
from paper
From the review:

As in the present example of two homeodomains, inspection of the protein-DNA structures alone does not distinguish between a preformed narrow groove and a groove that can be narrowed more readily, that is, between recognition of DNA conformation or of DNA conformability.

There was once a view that base sequence-specific DNA-binding proteins might have relatively simple ‘read-out’ properties, and much speculation went into considering potential transcription-factor ‘codes’. Early studies of phage repressors more or less dispelled such notions (although they have popped up from time to time since), and true ‘altered-specificity’ mutants have proved exceptionally hard to isolate or design. Various recently determined examples of multiple proteins bound to DNA (for example, the interferon-β enhanceosome [10]) should help dispel the further illusion that the logic of transcriptional regulation can be extracted from genomic sequences by matching consensus transcription-factor sites. The actual encoding of regulatory information is far more intricate, and contingent specificities such as the one analyzed by Joshi et al. [2] are probably common features. The conformational characteristics of DNA and of its protein partners do far more than merely create a framework within which projecting side chains interrogate base-pair functional groups.
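
To see what "matching consensus transcription-factor sites" looks like in practice, and why the review calls it insufficient, here is a minimal sketch of a naive consensus scan. The consensus pattern TGATNNAT and the test sequence are illustrative assumptions; check the paper for the actual binding sites it analyzes.

```python
import re

# Subset of IUPAC nucleotide codes translated to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
}

def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[c] for c in consensus.upper()))

def find_sites(seq, consensus):
    """Return (start, matched sequence) for every match, including overlapping ones."""
    pattern = consensus_to_regex(consensus)
    seq = seq.upper()
    hits = []
    for i in range(len(seq)):
        m = pattern.match(seq, i)  # anchored at position i
        if m:
            hits.append((i, m.group(0)))
    return hits

# Hypothetical sequence containing two sites that fit the same consensus.
example = "CCTGATTTATGGTGATTAATCC"
print(find_sites(example, "TGATNNAT"))
```

Both hits satisfy the same consensus, yet they differ at the NN positions. A sequence-only scan treats them as equivalent, whereas a protein that reads minor groove shape may not.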

As you move into functional genomics, you do not have to be paralyzed by what you don’t know. Instead, identify the path to the information you need.

 

So, we can measure RNA (hybridization-based and sequence-based technologies), proteins (MS/MS, electrophoretic, and live-cell technologies), and metabolites (HPLC, LC, and other separation and enzymatic identification and quantification technologies).

 

But how do we integrate these data so we can detect other inter-level emergent properties? What are the limits of the data we can get?

 

From Bruggeman and Westerhoff:

Top-down systems biology identifies molecular interaction networks on the basis of correlated molecular behavior observed in genome-wide ‘omics’ studies. Bottom-up systems biology examines the mechanisms through which functional properties arise in the interactions of known components.
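
The top-down idea of inferring a network from correlated behavior can be sketched in a few lines. This is a toy illustration, not a method from Bruggeman and Westerhoff: the gene names, expression values, and the 0.9 threshold are all made up, and real analyses need many more samples and a correction for multiple testing.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def correlation_edges(profiles, threshold=0.9):
    """Link every pair of genes whose absolute correlation meets the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, round(r, 3)))
    return edges

# Hypothetical expression profiles across four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 1.0, 3.5, 2.0],
}
edges = correlation_edges(profiles)
print(edges)
```

An edge here says only that two profiles move together. It says nothing about mechanism, which is where the bottom-up approach comes in.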

The goal is to link high-throughput sequencing, bioinformatics, and genome-wide experimentation.