Preliminary II Examination, Aug 04, 2008, 10:00AM - 12:00PM, Wachman 447
Identify biological interfaces from crystal structures of homologous proteins
Qifang Xu
Committee:
Dr. Zoran Obradovic (Advisor)
Dr. Roland L. Dunbrack, Jr.
Dr. Slobodan Vucetic
Dr. Longin Jan Latecki
Many proteins function as homooligomers and are regulated via their
oligomeric state. For some proteins, the stoichiometry of homooligomeric
states under various conditions has been studied using gel filtration or
analytical ultracentrifugation experiments. The interfaces involved in
these assemblies may be identified using crosslinking and mass
spectrometry, solution-state NMR, and other experiments. But for most
proteins, the actual interfaces that are involved in oligomerization are
inferred from X-ray crystallographic structures using assumptions about
interface surface areas and physical properties. The PDB, PQS, and PISA
provide biological units and hence the interfaces they imply. Our study
showed significant inconsistency both within and between these databases.
Most of the biological units are inferred from individual entries.
Examination of
interfaces across different PDB entries in a protein family reveals
several important features. First, similarity of space group, asymmetric
unit size, and cell dimensions and angles (within 1%) does not guarantee
that two crystals are actually the same crystal form, that is, containing
similar relative orientations and interactions within the crystal.
Conversely, two crystals in different space groups may be quite similar in
terms of all of the interfaces within each crystal. Second, NMR structures
and an existing benchmark of PDB crystallographic entries, consisting of
126 dimers and larger structures and 132 monomers, were used to determine
whether the presence or absence of common interfaces across multiple
crystal forms can be used to predict whether a protein is an oligomer.
Monomeric proteins tend to have common interfaces across
only a minority of crystal forms, while higher order structures exhibit
common interfaces across a majority of available crystal forms. The data
can be used to estimate the probability that an interface is biological if
two or more crystal forms are available. Third, evolutionary information
was used to evaluate interfaces observed in more than one crystal form. An
interface shared in two different crystal forms by divergent proteins is
very likely to be biologically important, while some interfaces are
restricted to one branch of a family, indicating the evolution of an
interface in one branch of the family and/or loss in another. Finally, the
PISA database available from the EBI is more consistent in identifying
interfaces observed in many crystal forms than either the PDB or the EBI's
Protein Quaternary Structure (PQS) server. The PDB in particular is missing
interfaces that are highly likely to be biological from its biological unit
files for about 10% of entries.
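
The observation that monomers share interfaces across only a minority of crystal forms, while oligomers share them across a majority, can be illustrated with a minimal sketch. It is not the method used in the study: the function names, the crystal-form labels, the interface labels, and the strict "more than half" cutoff are all assumptions made for illustration, and the sketch presumes that interfaces have already been clustered so that equivalent interfaces in different crystal forms carry the same label.

```python
from collections import Counter


def interface_form_fractions(forms):
    """Return {interface_id: fraction of crystal forms that contain it}.

    `forms` maps a crystal-form label to the set of interface labels
    observed in that form (hypothetical input layout).
    """
    if not forms:
        return {}
    counts = Counter()
    for interfaces in forms.values():
        # Count each interface at most once per crystal form.
        counts.update(set(interfaces))
    n_forms = len(forms)
    return {iface: n / n_forms for iface, n in counts.items()}


def majority_interfaces(forms):
    """Interfaces seen in more than half of the crystal forms.

    The 0.5 cutoff is an illustrative assumption, not a value from the study.
    """
    fractions = interface_form_fractions(forms)
    return sorted(i for i, f in fractions.items() if f > 0.5)


# Hypothetical family with three crystal forms; all labels are made up.
example_forms = {
    "CF1": {"A", "B"},
    "CF2": {"A"},
    "CF3": {"A", "C"},
}

fractions = interface_form_fractions(example_forms)
candidates = majority_interfaces(example_forms)
```

In this made-up example, interface "A" appears in all three crystal forms and would be flagged as a candidate biological interface, while "B" and "C" each appear in only one form. Turning such fractions into a probability would require calibration against a benchmark such as the one described above.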

