Structure of Hemoglobin
This is still part of the “Sebuah Tulisan Bioinformatika” series, which was
previously written in Bahasa Indonesia. Now I would like to push my luck a
little further by writing in English, starting with this article. This time
I’d like to write about the basics of Structural Bioinformatics. Hopefully you
will enjoy my story :)
The story starts with a definition. Structural Bioinformatics is the branch of
bioinformatics that deals with the three-dimensional structures of biological
macromolecules: DNA, RNA, and protein. Today, protein structures far outnumber
DNA and RNA structures in the structural databases. One likely reason is
historical: the field began with proteins, whose first structures were solved
by X-ray crystallography in the late 1950s, and the Protein Data Bank was set
up in 1971 as a protein archive. Therefore, I will focus on protein structural
bioinformatics in this story.
As you might already know, protein structure is described at four hierarchical
levels, from primary to quaternary structure. The primary structure is the
sequence of amino acids that makes up the protein, written as a string. So if
you see the sequence WHYGARTFED, for example, that is a primary structure
composed of tryptophan, histidine, tyrosine, glycine, alanine, arginine, and so
on. The one-letter amino acid codes are easy to look up online. The secondary
structure consists of local structures formed by hydrogen bonds between amino
acids that are close together in the sequence. Eight types of secondary
structure are commonly used, as defined in the Dictionary of Protein Secondary
Structure (DSSP) by Kabsch and Sander in 1983. They are the 3-10 helix (G),
alpha helix (H), pi helix (I), beta bridge (B), extended beta strand (E), turn
(T), bend (S), and loop or coil (C). To reduce the complexity, these are often
grouped into three larger classes: helix (G, H, and I), strand (B and E), and
loop (T, S, and C). The tertiary structure is the overall three-dimensional
fold of a single polypeptide chain. It is formed by interactions between
residues that may be far apart in the sequence, including Van der Waals and
hydrophobic interactions. Sometimes the fold is further stabilized by covalent
disulfide bridges between two cysteines. Thanks to its native tertiary
structure, the protein can function properly. Some large protein complexes,
however, need a higher-order structure in order to function, which we call the
quaternary structure. This structure is an assembly of several subunits, each
with its own tertiary structure. Hemoglobin, a classic example, is a protein
with a quaternary structure: it is composed of two alpha globin and two beta
globin subunits.
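To make the letter codes above a bit more concrete, here is a minimal Python
sketch. It spells out a one-letter sequence and collapses the eight secondary
structure letters into the three classes mentioned above. The names
`AMINO_ACIDS`, `DSSP_TO_CLASS`, `spell_out`, and `reduce_dssp` are my own for
illustration and do not come from DSSP or any library. Writing the three
classes as H, E, and C is also just an assumption about notation.

```python
# One-letter codes of the 20 standard amino acids mapped to residue names.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartate",
    "C": "cysteine", "Q": "glutamine", "E": "glutamate", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}

# Eight secondary structure letters reduced to three classes,
# following the grouping described in the text.
DSSP_TO_CLASS = {
    "G": "helix", "H": "helix", "I": "helix",
    "B": "strand", "E": "strand",
    "T": "loop", "S": "loop", "C": "loop",
}

# Illustrative one-letter labels for the three classes (an assumption).
CLASS_LETTER = {"helix": "H", "strand": "E", "loop": "C"}


def spell_out(sequence):
    """Return the residue names for a one-letter primary sequence."""
    return [AMINO_ACIDS[letter] for letter in sequence.upper()]


def reduce_dssp(states):
    """Collapse a string of eight-state letters into three-class letters."""
    # Any character outside the eight letters (for example a blank or "-")
    # is treated as loop in this sketch.
    return "".join(CLASS_LETTER[DSSP_TO_CLASS.get(s, "loop")] for s in states)
```

With these definitions, `spell_out("WHYGARTFED")` starts with tryptophan,
histidine, tyrosine, and `reduce_dssp("CCHHHHGGGTTEEEEBSC")` gives
`"CCHHHHHHHCCEEEEECC"`.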
To date, three main methods have been used to determine protein structure.
Ordered from the earliest to the most recent, they are X-ray crystallography,
nuclear magnetic resonance (NMR) spectroscopy, and electron microscopy (EM).
Each of these methods has its advantages and drawbacks. X-ray crystallography
directs an X-ray beam at a crystallized protein. The X-rays are diffracted by
the electrons in the crystal, and the diffraction pattern is converted into an
electron-density map, which is then used to build a model structure of the
protein. Working from a crystal and an electron-density map allows a model to
be built at high resolution. However, crystallization is often the bottleneck,
since not every protein can be crystallized. NMR avoids this problem simply
because it does not require crystals. In NMR spectroscopy, a purified protein
solution is placed in a very strong magnetic field and exposed to
radio-frequency pulses. The resulting resonance signals are analyzed to find
out which atomic nuclei lie close to each other. The model structure is then
built from the positions of these nuclei relative to one another. NMR
generally gives structures of intermediate resolution, lower than X-ray
crystallography, but because it does not depend on crystallization it is used
to model proteins that are hard to crystallize, such as transmembrane
proteins. EM is the most recently developed of the three methods. Here,
electron beams are projected onto the protein complex, and images taken from
many angles are combined into a 3D image, similar to the visualization of cell
structures. EM can model large or even huge protein complexes that the other
two methods cannot handle. However, when it comes to resolution, EM-generated
models have traditionally had the lowest resolution of the three methods.
After generating a model, what comes next? Well, in this digital era, where
uploading pictures is a prominent part of what it means to “exist”, something
similar happens to protein structures. The Worldwide Protein Data Bank (wwPDB)
is the primary archive to which researchers all over the world submit their
model structures. It is run through three member sites located in three
different countries:
1. Research Collaboratory for Structural Bioinformatics Protein Data Bank
(RCSB PDB) in the USA. URL: http://www.rcsb.org/pdb/home/home.do
2. Protein Data Bank in Europe (PDBe) in the UK. URL: http://www.ebi.ac.uk/pdbe/node/1
3. Protein Data Bank Japan (PDBj) in Japan. URL: http://pdbj.org/
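As a small illustration of how a deposited structure can be pulled back out,
here is a rough Python sketch. It rests on a few assumptions that are not part
of the list above: that RCSB serves PDB-format files under
`https://files.rcsb.org/download/<ID>.pdb`, and that `4HHB` is a suitable
hemoglobin entry to try. The function names `pdb_url`, `fetch_pdb`, and
`chain_ids` are made up for this example.

```python
import urllib.request

# Assumed download pattern for PDB-format files at RCSB (not from the list above).
RCSB_DOWNLOAD = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_url(pdb_id):
    """Build the assumed download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD.format(pdb_id=pdb_id)


def fetch_pdb(pdb_id, timeout=30):
    """Download one entry and return its text (needs network access)."""
    with urllib.request.urlopen(pdb_url(pdb_id), timeout=timeout) as response:
        return response.read().decode("utf-8")


def chain_ids(pdb_text):
    """Collect the chain identifiers found on ATOM records.

    In the PDB file format the chain identifier sits in column 22,
    which is index 21 of the line.
    """
    return sorted({
        line[21]
        for line in pdb_text.splitlines()
        if line.startswith("ATOM") and len(line) > 21
    })
```

With network access, `chain_ids(fetch_pdb("4HHB"))` should list one chain per
subunit, which for hemoglobin would match the two alpha and two beta globins
described earlier.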
These three sites also accept structures of other biological macromolecules,
such as DNA and RNA. But as the data grew, a new database was built
specifically for DNA and RNA structures. It is called the Nucleic Acid Database
(NDB), and you can access it at: http://ndbserver.rutgers.edu/.
Together, these databases help researchers all around the world to deposit and
exchange structural data, so that they can take their research one step
further. Well, I think that’s enough for the first part of the story. In the
next part, I will tell more about the structural databases as well as the file
formats.
Victor