Amino acids: building blocks of proteins
- Amino acids are the building blocks of proteins. Twenty amino acids, in different combinations, make up proteins ranging from small peptides to huge polymers, each with a unique function. These building blocks give rise to products as diverse as enzymes, hormones, antibodies, transporters, muscle fibres, the lens protein of the eye, feathers, spider webs, animal horns, milk proteins and keratin
- The first amino acid to be discovered was asparagine, in 1806; the last of the twenty, threonine, was not identified until 1938
- As mentioned before, an amino acid has five components: a central (alpha) carbon atom bonded to a hydrogen atom, a carboxyl group, an amino group and an R group (side chain)
- All the amino acids are chiral except glycine, whose alpha carbon is bonded to two hydrogen atoms
- Enantiomers: owing to the tetrahedral arrangement around the chiral alpha carbon, two isomers, denoted D and L enantiomers, are possible for each chiral amino acid, but the amino acids in biological proteins take the L form
Classification of amino acids based on side chains
- Nonpolar aliphatic side chains are hydrophobic, and these amino acids tend to cluster together in the interior of the protein, away from the aqueous solution
- Aromatic side chains are relatively nonpolar (hydrophobic). Tyrosine and tryptophan are more polar than phenylalanine, as tyrosine has a hydroxyl group and tryptophan has a nitrogen as part of its indole ring
- Polar uncharged side chains are, as the name suggests, more hydrophilic, as they have functional groups that can form hydrogen bonds with water. For instance, serine and threonine have hydroxyl groups, cysteine has a sulfhydryl group, and asparagine and glutamine have amide groups. Cysteine readily oxidises to cystine, a dimeric amino acid consisting of two cysteines joined by a disulphide bond
- Positively charged side chains are hydrophilic due to the following groups:
- Lysine: a second primary amino group on its aliphatic chain
- Arginine: a guanidinium group
- Histidine: an imidazole group
- Negatively charged side chains are hydrophilic: both aspartate and glutamate have a second carboxyl group
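The grouping above can be sketched as a small lookup table. This is only an illustration: the text names the members of most classes, but the membership of the nonpolar aliphatic class is an assumption taken from one common textbook grouping (other textbooks place glycine or methionine differently), and the names `SIDE_CHAIN_CLASSES` and `classify` are made up for this example.

```python
# Side-chain classes keyed by class name, using three-letter residue codes.
# The nonpolar aliphatic members are an assumption (common textbook grouping).
SIDE_CHAIN_CLASSES = {
    "nonpolar aliphatic": ["Gly", "Ala", "Pro", "Val", "Leu", "Ile", "Met"],
    "aromatic": ["Phe", "Tyr", "Trp"],
    "polar uncharged": ["Ser", "Thr", "Cys", "Asn", "Gln"],
    "positively charged": ["Lys", "Arg", "His"],
    "negatively charged": ["Asp", "Glu"],
}


def classify(residue):
    """Return the side-chain class of a three-letter residue code."""
    for class_name, members in SIDE_CHAIN_CLASSES.items():
        if residue in members:
            return class_name
    raise ValueError(f"unknown residue: {residue}")


# Print each class with the number of amino acids it contains.
for class_name, members in SIDE_CHAIN_CLASSES.items():
    print(class_name, len(members))
```

With this grouping the five classes add up to the twenty common amino acids.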
Special amino acids
- Histidine is physiologically very important, as it is the only common amino acid with an ionisable side chain whose pKa is near neutrality; it can hence be positively charged or neutral at pH 7. This property proves useful in many enzyme-catalysed reactions
- In addition to the twenty common amino acids, uncommon or derived amino acids are important too. For instance, 4-hydroxyproline is found in plant cell wall proteins and in collagen, and collagen, a fibrous protein, also contains 5-hydroxylysine
- Another interesting amino acid is selenocysteine, also known as the 21st amino acid. What makes it unique is that it is introduced during protein synthesis rather than created by a post-translational modification
- Ornithine and citrulline deserve special attention, as both are key intermediates in the biosynthesis of arginine and in the urea cycle
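To see why a side chain pKa near neutrality matters for histidine, here is a minimal sketch based on the Henderson-Hasselbalch relationship. The pKa values (6.0 for the imidazole group of histidine, 10.5 for the lysine side chain) are assumed textbook values, not figures from this post, and `fraction_protonated` is a name invented for the example.

```python
def fraction_protonated(pH, pKa):
    """Fraction of an ionisable group that carries its proton at a given pH."""
    return 1.0 / (1.0 + 10 ** (pH - pKa))


HIS_SIDE_CHAIN_PKA = 6.0   # assumed textbook value for the imidazole group
LYS_SIDE_CHAIN_PKA = 10.5  # assumed textbook value, shown for comparison

# Histidine shifts between charged and neutral across the physiological range,
# while lysine stays almost fully protonated.
for pH in (5.0, 6.0, 7.0, 8.0):
    his = fraction_protonated(pH, HIS_SIDE_CHAIN_PKA)
    lys = fraction_protonated(pH, LYS_SIDE_CHAIN_PKA)
    print(f"pH {pH}: His {his:.3f}, Lys {lys:.3f}")
```

Under these assumptions a meaningful share of histidine side chains is protonated at pH 7 and the share changes quickly with small pH shifts, which is what lets histidine act as a proton donor or acceptor in enzyme active sites.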
Properties of amino acids
- At neutral pH, amino acids exist as zwitterions (dipolar ions). A zwitterion can act as either an acid or a base, so amino acids are amphoteric.
- All amino acids have characteristic titration curves. A simple example is the titration curve of glycine
Titration curve of Glycine
- Glycine has two ionisable groups: the carboxyl group and the amino group. When titrated with a strong base such as NaOH, the plot has two distinct stages, corresponding to the deprotonation of the two ionisable groups. Each stage resembles the titration curve of a monoprotic acid such as acetic acid
- First stage of titration: at very low pH, glycine is in its fully protonated form. At the midpoint of this stage, where the COOH group is losing its proton, equimolar concentrations of the proton donor and proton acceptor are present, and the pH equals the pKa of the protonated group (COOH). For glycine, this pH, and therefore the pKa, is 2.34 (pKa measures the tendency of a group to give up a proton)
- Upon further titration, a crucial point is reached at pH 5.97, where removal of the first proton is essentially complete and removal of the second has just begun. At this point, glycine exists largely as a zwitterion
- Second stage of the titration: the NH3+ group loses its proton. The midpoint of this stage is at pH 9.60 for glycine, which is the pKa of the NH3+ group
- What can we learn from the titration curve of glycine?
- It gives a quantitative measure of the pKa of the two ionisable groups, i.e. 2.34 for the COOH group and 9.60 for the NH3+ group
- The COOH group of glycine is more than 100 times more acidic than that of acetic acid (pKa 4.76), owing to repulsion between the departing proton and the nearby positively charged amino group
- The pKa of a functional group largely depends on its chemical environment
- Titration curves predict the electric charge of amino acids. The isoelectric point or isoelectric pH (pI) is the pH at which the net electric charge is zero. For glycine, pI is the mean of the two pKa values: (2.34 + 9.60) / 2 = 5.97
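The numbers from the glycine titration curve can be checked with a short calculation. This is a sketch that treats the two groups as independent and applies the Henderson-Hasselbalch relationship to each; the two glycine pKa values are from the text, while the acetic acid pKa of 4.76 is an assumed textbook value and the function names are made up for the example.

```python
PKA_COOH = 2.34  # carboxyl group of glycine (from the text)
PKA_NH3 = 9.60   # amino group of glycine (from the text)
ACETIC_ACID_PKA = 4.76  # assumed textbook value, used only for comparison


def net_charge(pH):
    """Average net charge of glycine at a given pH."""
    # The amino group contributes +1 while it is protonated.
    positive = 1.0 / (1.0 + 10 ** (pH - PKA_NH3))
    # The carboxyl group contributes -1 once it is deprotonated.
    negative = 1.0 / (1.0 + 10 ** (PKA_COOH - pH))
    return positive - negative


def isoelectric_point():
    """pI of an amino acid with two ionisable groups: mean of the two pKa values."""
    return (PKA_COOH + PKA_NH3) / 2


# How much more acidic is the glycine carboxyl group than acetic acid?
acidity_ratio = 10 ** (ACETIC_ACID_PKA - PKA_COOH)

for pH in (1.0, PKA_COOH, isoelectric_point(), PKA_NH3, 13.0):
    print(f"pH {pH:5.2f}: net charge {net_charge(pH):+.3f}")
```

The net charge runs from close to +1 at very low pH, through +0.5 at the first midpoint and zero at the pI, to close to -1 at very high pH, which is the shape the two-stage curve describes.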
Biologically important peptides
Peptides and proteins, as you have read earlier, are polymers of amino acids connected to each other through peptide bonds
- Two amino acids: dipeptide; a few amino acids: oligopeptide; many amino acids: polypeptide
- Proteins and polypeptides: what’s the difference? By convention, proteins generally have a molecular weight of more than 10,000, while molecules called polypeptides have a molecular weight of less than 10,000
- In a polypeptide, the end with a free amino group is the N terminal (amino terminal), and the end with a free carboxyl group is the C terminal (carboxy terminal)
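As a rough illustration of the 10,000 cut-off, the molecular weight of a chain can be estimated from its length. The average residue mass of 110 daltons is an assumed rule of thumb, not a figure from this post, and since the text gives no boundary between "few" and "many" residues, the sketch does not try to separate oligopeptides from polypeptides. The residue counts for oxytocin and myoglobin are the ones quoted elsewhere in this post.

```python
AVERAGE_RESIDUE_MASS = 110  # daltons per residue; assumed rule of thumb
PROTEIN_CUTOFF = 10_000     # molecular weight cut-off quoted in the text


def estimated_mass(n_residues):
    """Rough molecular weight of a chain of n_residues amino acids."""
    return n_residues * AVERAGE_RESIDUE_MASS


def size_label(n_residues):
    """Name a chain by its length and estimated molecular weight."""
    if n_residues < 2:
        raise ValueError("a peptide needs at least two residues")
    if n_residues == 2:
        return "dipeptide"
    if estimated_mass(n_residues) > PROTEIN_CUTOFF:
        return "protein"
    return "oligopeptide or polypeptide"


for name, n in [("oxytocin", 9), ("myoglobin", 153)]:
    print(name, estimated_mass(n), size_label(n))
```

With this estimate the cut-off falls at roughly 90 residues.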
Examples of biologically important peptides
- Biologically important peptides come in a wide range of sizes, and they can exert their effects even at low concentrations. Let’s look at some examples:
- Oxytocin: 9 amino acids, secreted by the posterior pituitary, stimulates uterine contractions
- Amanitin: an extremely toxic mushroom poison. Many antibiotics are also small peptides
- Thyrotropin releasing factor: 3 amino acids, secreted by the hypothalamus, stimulates the release of thyrotropin from the anterior pituitary
- Some proteins have a single polypeptide chain and some are multi-subunit proteins. Oligomeric proteins are multi-subunit proteins with at least two identical subunits (each identical unit is a protomer)
- Haemoglobin, a tetramer, for instance has 2 alpha subunits and 2 beta subunits held together by non-covalent bonds. Polypeptide chains can also be covalently linked, as in insulin, where disulphide bonds connect two polypeptide chains
- Conjugated proteins are proteins that contain chemical components other than amino acids; the non-amino-acid part is called the prosthetic group
- Examples of conjugated proteins: lipoproteins, glycoproteins, metalloproteins
Proteins- levels of organization
Proteins have four different levels of organization: primary, secondary, tertiary and quaternary. Let’s look at each of them in detail
Primary structure of proteins
- This is the sequence of amino acids, together with the covalent bonds that connect them: peptide bonds and disulphide bonds
- The function of a protein depends on its amino acid sequence. There are several examples where a change in a single amino acid has resulted in a genetic disease. For instance, the replacement of a glutamic acid by valine in the beta chain of haemoglobin results in sickle cell anaemia
- The conservation of amino acid sequences among different species of animals is noteworthy
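The sickle cell example can be written out as a single substitution in a sequence string. This is a sketch under stated assumptions: the eight-residue fragment is the start of the mature haemoglobin beta chain in one-letter code, and the substitution is placed at position 6 of that chain; neither detail is given in this post, and the helper names are invented for the example.

```python
# First eight residues of the mature haemoglobin beta chain (one-letter code).
# Assumption: numbering starts at the N-terminal valine of the mature chain.
BETA_GLOBIN_START = "VHLTPEEK"


def substitute(sequence, position, new_residue):
    """Return a copy of sequence with the residue at a 1-based position replaced."""
    index = position - 1
    return sequence[:index] + new_residue + sequence[index + 1:]


def differences(a, b):
    """List (position, residue in a, residue in b) wherever two sequences differ."""
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Glutamic acid (E) at position 6 replaced by valine (V).
sickle_start = substitute(BETA_GLOBIN_START, 6, "V")
print(BETA_GLOBIN_START, sickle_start, differences(BETA_GLOBIN_START, sickle_start))
```

The two fragments differ at one position only, yet the change swaps a negatively charged side chain for a hydrophobic one.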
Secondary structure of proteins
- The polypeptide chain folds locally into recurring structural patterns
- The two most common patterns are the alpha helix and the beta sheet
- You can read more on the characteristics of alpha and beta structures in my previous blog post
- The alpha helix forms more readily than many other conformations, as it makes optimal use of internal hydrogen bonds. It can form in polypeptides consisting of either L or D amino acids, provided all the residues belong to the same series
- Beta turns are very common in globular proteins, where they reverse the direction of the polypeptide chain; glycine and proline residues occur frequently in them
Tertiary structure of proteins
- This comprises the overall 3 dimensional structure of the protein
- It involves longer range aspects of the amino acid sequence
- Amino acids which are far apart in the polypeptide chain and which may even be present in different secondary structures can interact in the tertiary structure while folding into a 3 dimensional shape
- Weak interactions and covalent bonds such as disulphide bonds hold together the different segments of the polypeptide chains in a tertiary structure
Motifs and domains
- To understand the complete three dimensional structure of a protein, one needs to understand motifs and domains
- Motifs are recognisable folding patterns consisting of two or more elements of secondary structure and the connections between them. They can be very simple, like beta alpha beta loops, or elaborate, like the beta barrel
- Domains are independently stable parts of the polypeptide chain. A domain from a large protein will often retain its 3D structure even when separated from the rest of the chain. Examples of domains include the alpha/beta barrels
Quaternary structure of proteins
- When the 3D structure has two or more subunits, it is a quaternary structure
- The different subunits may be identical or non identical. For instance, haemoglobin is a tetramer consisting of two alpha subunits and two beta subunits. The four subunits coordinate and act together to carry out their function of carrying oxygen from lungs to tissues
Ramachandran plot
- As discussed earlier, the peptide bond is rigid and planar. Its partial double bond character prevents free rotation around the bond, which limits the range of conformations available to a peptide
- The Ramachandran plot is a map that shows the allowed and disallowed conformations of a residue in terms of two dihedral angles: phi (rotation about the N to alpha carbon bond) and psi (rotation about the alpha carbon to carbonyl carbon bond). In principle, phi and psi can have any value between -180 and +180 degrees; in practice, many combinations are ruled out by steric hindrance
- The allowed regions are calculated from the van der Waals radii of the atoms and the dihedral angles
- The allowed regions vary between amino acids. For instance, glycine, the smallest and simplest amino acid, has many allowed conformations, as its single hydrogen side chain causes little steric hindrance. Proline, in contrast, has many disallowed regions because its cyclic pyrrolidine side chain restricts rotation
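Each point on a Ramachandran plot is a pair of dihedral angles, and a dihedral angle can be computed from the coordinates of four consecutive atoms. The sketch below uses the standard atan2 formula with plain tuples; it assumes the usual backbone definitions (phi from the atoms C of the previous residue, N, alpha carbon, C; psi from N, alpha carbon, C, N of the next residue), and the toy coordinates are invented for the example rather than taken from a real structure.

```python
import math


def sub(u, v):
    """Vector difference u - v."""
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    """Dot product of two 3D vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    """Cross product of two 3D vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees, between -180 and +180, about the p1-p2 bond."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Toy geometry: the central bond lies along the z axis.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # cis arrangement
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))   # quarter turn
```

For a real protein, phi would be `dihedral(C_prev, N, CA, C)` and psi would be `dihedral(N, CA, C, N_next)`, with the atom coordinates read from a structure file.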
Myoglobin
- It is the oxygen binding protein present in the muscles. It has a dual function: it helps store oxygen in the muscle, and it facilitates oxygen distribution in rapidly contracting muscle tissue
structure and location
- It is a single polypeptide chain of 153 amino acids with a single iron protoporphyrin (heme) group
- It is abundant in the muscles of diving mammals like whales and seals, which use muscle myoglobin to store and distribute oxygen. This is why they can stay submerged for long periods of time
- The structure of myoglobin consists of 8 alpha helical segments interrupted by bends, some of which are beta turns. The longest alpha helix has 23 amino acids and the shortest has 7
stability
- Myoglobin receives much of its stability from hydrophobic interactions. Most of the hydrophobic side chains are in the interior of the molecule, hidden from water. All but 2 of the polar side chains are on the exterior
- It has a dense hydrophobic core, as is typical of globular proteins, so the weak interactions strengthen and reinforce each other
- The heme group is present in a crevice in the myoglobin. The iron atom at its centre has two bonding positions perpendicular to the plane of the heme. One of these is bound to the histidine residue at position 93 and the other is the site at which oxygen binds
- The presence of the heme group in a pocket is essential, as it restricts access by the solvent. If heme is exposed to the solvent, its ferrous iron is readily oxidised to the ferric form, which does not bind oxygen
Haemoglobin
- It is an iron containing globular protein found in the red blood cells (erythrocytes) of humans. Its main function is to transport oxygen from lungs to the tissues
- It transfers oxygen by forming an unstable reversible bond with oxygen. In the oxygen bound state, it is called oxyhemoglobin and appears bright red in colour. In the unbound state, it appears purplish blue
- Haemoglobin is made in developing red blood cells in the bone marrow. When red blood cells die, haemoglobin is broken down and the iron is transported back to the bone marrow. The remaining portion of the haemoglobin forms the basis of bilirubin
- In a healthy individual, haemoglobin levels range from 12 to 20 g/dL, and they are slightly higher in males (13.5 to 17.5 g/dL) than in females
structure
- Structurally, it is made of four subunits (2 alpha and 2 beta) in a roughly tetrahedral arrangement. Each subunit carries one heme group, which consists of an iron atom held at the centre of a porphyrin ring. Each molecule of haemoglobin therefore has four iron atoms and can bind four oxygen molecules
- Haemoglobin can reversibly bind to oxygen in a process known as oxygenation. The four subunits interact so that the binding of oxygen to one subunit raises the oxygen affinity of the others; this is called cooperativity
- Alpha subunit: 141 amino acids, Beta subunit: 146 amino acids
- The subunits are held together by hydrophobic interactions, hydrogen bonding and salt bridges
- Foetuses and newborn infants have haemoglobin with two alpha chains and two gamma chains; the gamma chains are eventually replaced by beta chains
- Haemoglobin exists in 2 conformations: tense (T state, favoured when no oxygen is bound) and relaxed (R state, favoured when oxygen is bound)
- The affinity of haemoglobin for oxygen is affected by several factors: pH, carbon dioxide, 2,3-bisphosphoglyceric acid (2,3-BPG) and the partial pressure of oxygen
- Low pH, high BPG, high carbon dioxide in the blood and low partial pressure of oxygen favour the T state, and oxygen is released.
- High pH, low BPG, low CO2 in the blood (as in the alveoli) and high partial pressure of oxygen favour the R state, and oxygen is bound
- Because oxygen binding is cooperative (allosteric), the oxygen binding curve of haemoglobin is sigmoid
- Types of haemoglobin: Hb A (most common in adults, with 2 alpha and 2 beta chains), Hb A2 (a minor adult Hb with 2 alpha chains and 2 delta chains) and Hb F (foetal haemoglobin with 2 alpha and 2 gamma chains)
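The sigmoid binding curve mentioned above is often approximated with the Hill equation, which is not derived in this post. The sketch below compares haemoglobin with a non-cooperative carrier such as myoglobin. All numbers are assumptions chosen as typical textbook values: a half-saturation pressure of 26 torr and a Hill coefficient of 2.8 for haemoglobin, 2 torr and 1.0 for myoglobin, and oxygen pressures of 100 torr in the lungs and 20 torr in working tissue.

```python
def saturation(pO2, p50, n):
    """Fractional oxygen saturation from the Hill equation."""
    return pO2 ** n / (p50 ** n + pO2 ** n)


# Assumed typical values: p50 in torr, n is the Hill coefficient.
HAEMOGLOBIN = {"p50": 26.0, "n": 2.8}
MYOGLOBIN = {"p50": 2.0, "n": 1.0}


def released(carrier, p_lungs=100.0, p_tissue=20.0):
    """Fraction of binding sites that give up oxygen between lungs and tissue."""
    return saturation(p_lungs, **carrier) - saturation(p_tissue, **carrier)


for name, carrier in [("haemoglobin", HAEMOGLOBIN), ("myoglobin", MYOGLOBIN)]:
    print(f"{name}: releases {released(carrier):.2f} of its oxygen")
```

Under these assumptions the cooperative carrier unloads most of its oxygen between lungs and tissue, while the non-cooperative one holds on to almost all of it, which is why a sigmoid curve suits a transport protein and a hyperbolic one suits a storage protein.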
Structure of collagen
- Collagen is a protein found in connective tissues like tendons and cartilage. It is also found in the matrix of bone and the cornea of the eye
- Although the structure of collagen looks like an alpha helix, it is a unique secondary structure with phi = -51 degrees and psi = +153 degrees
- The collagen helix is left handed and has 3 amino acid residues per turn. Collagen is a coiled coil made of 3 separate polypeptides called alpha chains (not alpha helices), and the superhelical twisting is right handed
- Collagen mainly contains 4 amino acids: glycine, alanine, proline and 4-hydroxyproline. The amino acid content of collagen is unique, and this is due to structural constraints
- Collagen is basically a repeating sequence of Gly-X-Y, where X is often proline and Y is often 4-hydroxyproline. Proline and hydroxyproline permit the sharp twisting of the collagen helix, and only glycine is small enough to fit where the three chains meet. The tight wrapping of the 3 alpha chains provides tensile strength
- The alpha chains and fibrils of the collagen molecules are cross linked by covalent bonds involving lysine, hydroxy lysine or histidine. These are present at a few of the X or Y positions
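The Gly-X-Y repeat is easy to test for in a sequence string. This is a minimal sketch using one-letter codes, with "O" standing for 4-hydroxyproline; that letter is a convention assumed here, the example sequences are invented, and a real collagen chain would also have non-repeating ends that this check does not handle.

```python
def is_gly_x_y(sequence):
    """True if the sequence is made of triplets that each start with glycine."""
    if len(sequence) == 0 or len(sequence) % 3 != 0:
        return False
    return all(sequence[i] == "G" for i in range(0, len(sequence), 3))


def composition(sequence):
    """Fraction of each residue type in the sequence."""
    return {res: sequence.count(res) / len(sequence) for res in sorted(set(sequence))}


model_peptide = "GPOGPOGPO"   # invented Gly-Pro-Hyp repeat
print(is_gly_x_y(model_peptide), composition(model_peptide))
print(is_gly_x_y("GPOAPOGPO"))  # one glycine replaced by alanine
```

In a perfect repeat glycine makes up exactly one third of the residues, which matches the unusually high glycine content of collagen.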
Biosynthesis of amino acids
- Biosynthesis of amino acids requires nitrogen. Nitrogen is abundant in the atmosphere; however, most organisms cannot convert it to a useful form. The process in which different species use and reuse biologically available nitrogen is called the nitrogen cycle.
- The steps of the nitrogen cycle include fixation (atmospheric nitrogen converted to ammonia) and nitrification (ammonia converted to nitrite and ultimately nitrate). Fixation is done by specific nitrogen-fixing bacteria, such as Rhizobium species that live in symbiotic association with the root nodules of legumes. The ammonia so formed is used by plants to make amino acids, and animals get nitrogen from plants when they eat them. When organisms die, their nitrogen is returned to the soil as ammonia and the cycle continues. We will see this in detail shortly
- Ammonia is the source of nitrogen for all amino acids. The carbon backbones come from the glycolytic pathway, pentose phosphate pathway or citric acid cycle.
- Stereochemical control is an important issue in the biosynthesis of amino acids, i.e. the pathways must generate the correct isomer. The stereochemistry at the alpha carbon is established by transamination reactions, which require pyridoxal phosphate
- The ammonia obtained from the nitrogen cycle is used to synthesise glutamate and glutamine. Most other amino acids obtain their alpha amino group from glutamate
How is atmospheric nitrogen converted to ammonia, and how does ammonia enter the biological system?
- It all starts with the nitrogen cycle. Nitrogen is required for the biosynthesis of amino acids, nucleotides and other biomolecules, and it ultimately comes from atmospheric nitrogen. The biosynthetic pathway starts with the reduction of nitrogen to ammonia. This is called nitrogen fixation
Nitrogen Fixation
- Not all organisms can fix nitrogen. It is done by certain bacteria, for example Rhizobium, which live in a symbiotic relationship with leguminous plants inside their root nodules. Nitrogen fixation requires a complex enzyme called the nitrogenase complex, which consists of a reductase and a nitrogenase. The actual site of nitrogen fixation is the iron molybdenum cofactor of the nitrogenase, which is crucial for the conversion of nitrogen to ammonia
Transamination
- Now that the ammonium ion is formed, let us see how it gets incorporated into amino acids through glutamate and glutamine. The alpha amino group of most amino acids comes from the alpha amino group of glutamate by a process called transamination. Glutamine, the other important nitrogen donor, contributes its side chain nitrogen to the biosynthesis of tryptophan and histidine
Glutamate synthesis
- How is glutamate synthesised? Glutamate comes from ammonia and alpha ketoglutarate (a citric acid cycle intermediate) by the action of glutamate dehydrogenase
- The reaction proceeds in 2 steps. First, a Schiff base forms between ammonia and alpha ketoglutarate. The protonated Schiff base is then reduced to form glutamate. The formation of a Schiff base is a common reaction in the biosynthesis and degradation of many amino acids
- Formation of glutamine: a second ammonium ion is added to glutamate to form glutamine. Glutamine synthetase catalyses this reaction, which proceeds through an acyl phosphate intermediate
- So far, we have seen the biosynthesis of glutamine and glutamate. How are the rest of the amino acids synthesised?
- Out of the 20 amino acids, 9 are essential amino acids and need to be supplied through our diet
- Some non essential amino acids are synthesised in a single step through the addition of an amino group from glutamate (transamination)
Oxaloacetate + glutamate → aspartate + alpha ketoglutarate
Pyruvate + glutamate → alanine + alpha ketoglutarate
These 2 reactions are catalysed by pyridoxal phosphate dependent transaminases
- Aspartate, upon amidation, gives asparagine
- This reaction is similar to the formation of glutamine from glutamate. They are both amidation reactions driven by ATP hydrolysis.
- In bacteria, aspartate is activated by adenylation and ammonia is the nitrogen donor. In mammals, it is slightly different, as the nitrogen donor for asparagine is glutamine instead of ammonia. Hydrolysis of the side chain of glutamine generates ammonia, which is directly transferred to the activated aspartate. This reaction proceeds through an acyl adenylate intermediate
Formation of proline and arginine
- Glutamate is the precursor for proline and arginine (both non essential amino acids)
- The gamma carboxyl group of glutamate reacts with ATP to form an acyl phosphate, which is then reduced by NADPH to an aldehyde, glutamic gamma semialdehyde. This semialdehyde cyclises to give pyrroline-5-carboxylate, which upon reduction by NADPH gives proline. Alternatively, the semialdehyde can be transaminated to give ornithine, which after several steps gives rise to arginine
Formation of serine, cysteine and glycine from 3-phosphoglycerate
- 3-phosphoglycerate is an intermediate in glycolysis
- Formation of serine: 3-phosphoglycerate is oxidised to 3-phosphohydroxypyruvate, which is then transaminated to 3-phosphoserine and hydrolysed to serine
- Formation of cysteine and glycine: serine is the precursor of both cysteine and glycine
- Side chain methylene group of serine transfers to tetrahydrofolate (reaction catalysed by serine transhydroxymethylase) to form glycine and methylenetetrahydrofolate
- Formation of cysteine is similar, but it requires the substitution of the side chain oxygen atom of serine by a sulphur atom derived from methionine. Cystathionine beta synthase catalyses the condensation of serine and homocysteine to form cystathionine. Cystathioninase then deaminates and cleaves cystathionine to give cysteine and alpha ketobutyrate
- Homocysteine comes from methionine by way of S-adenosylmethionine (SAM), a major donor of methyl groups. SAM is synthesised by the transfer of an adenosyl group from ATP to the sulphur atom of methionine. The fate of the triphosphate group of ATP is unusual, as it is split into pyrophosphate and orthophosphate. When SAM donates its methyl group to an acceptor, S-adenosylhomocysteine is formed, which hydrolyses to homocysteine and adenosine
- Methionine synthase regenerates methionine from homocysteine
- How are aromatic amino acids formed?
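- Phenylalanine and tryptophan are essential amino acids in humans (tyrosine can be made from phenylalanine). In E. coli, the aromatic amino acids are synthesised by a common pathway with shikimate and chorismate as intermediates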
Urea cycle
Upon utilisation of proteins and breakdown of amino acids, excess nitrogen needs removal from the body. There are three ways to do this:
- Aquatic animals: they excrete ammonia directly into the water
- Most terrestrial vertebrates: they convert ammonia to urea, a less toxic product
- Birds and terrestrial reptiles: they convert nitrogen to uric acid
Let us focus on urea cycle:
- Where is urea produced? In the liver, by the enzymes of the urea cycle. It is then released into the bloodstream, and the kidneys filter it out into the urine
- The reaction:
Ammonia + HCO3- + aspartate → urea + fumarate
- This process utilises ATP (3 ATP → 2 ADP + 2 Pi + AMP + PPi)
Enzymes that carry out urea cycle:
- Carbamoyl phosphate synthetase (CPS): it catalyses the activation and condensation of ammonia and HCO3- to give carbamoyl phosphate (an ATP driven reaction). Carbamoyl phosphate is a substrate in the urea cycle
- CPS exists in 2 forms: I and II.
- CPS I: ammonia is the nitrogen donor
- CPS II: glutamine is the nitrogen donor
- CPS I catalyses a crucial rate limiting step of the urea cycle, i.e. the formation of carbamoyl phosphate. This takes place in 3 steps:
- activation of HCO3- to carboxyphosphate by ATP
- attack on carboxyphosphate by ammonia. This removes the phosphate giving carbamate
- A second ATP phosphorylates carbamate to form carbamoyl phosphate
- Ornithine transcarbamoylase: it transfers the carbamoyl group of carbamoyl phosphate to ornithine (carbamoylation) to give citrulline
- Argininosuccinate synthetase: it catalyses the condensation of citrulline (ureido group) and aspartate (amino group) to form argininosuccinate
- Argininosuccinase: the previous step assembles all the components of the urea molecule, but the amino group donated by aspartate is still attached to the aspartate carbon skeleton. This is where argininosuccinase comes in. It catalyses the elimination of fumarate, leaving arginine, urea’s immediate precursor.
- Arginase: it catalyses the hydrolysis of arginine to urea and ornithine; the ornithine is transported back into the mitochondria to start another cycle
Enzyme (step) | Process | Substrate | Product |
CPS (step 1, uses ATP) | Activation | HCO3- | Carboxyphosphate |
CPS (step 2, uses ammonia) | Attack | Carboxyphosphate | Carbamate |
CPS (step 3, uses ATP) | Phosphorylation | Carbamate | Carbamoyl phosphate |
Ornithine transcarbamoylase | Carbamoylation | Carbamoyl phosphate and ornithine | Citrulline |
Argininosuccinate synthetase | Condensation | Citrulline and aspartate | Argininosuccinate |
Argininosuccinase | Elimination | Argininosuccinate | Arginine and fumarate |
Arginase | Hydrolysis | Arginine | Urea and ornithine |
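The steps in the table can be chained together to check that the intermediates cancel and only the overall reaction remains. This is a simplified sketch: it tracks which compounds appear, not how many of each, it leaves out ATP, water and phosphate, and it collapses the three carbamoyl phosphate synthetase steps into one. The names `UREA_CYCLE` and `net_reaction` are invented for the example.

```python
# Each step: (enzyme, substrates, products). Cofactors and water are omitted.
UREA_CYCLE = [
    ("carbamoyl phosphate synthetase", {"NH3", "HCO3-"}, {"carbamoyl phosphate"}),
    ("ornithine transcarbamoylase", {"carbamoyl phosphate", "ornithine"}, {"citrulline"}),
    ("argininosuccinate synthetase", {"citrulline", "aspartate"}, {"argininosuccinate"}),
    ("argininosuccinase", {"argininosuccinate"}, {"arginine", "fumarate"}),
    ("arginase", {"arginine"}, {"urea", "ornithine"}),
]


def net_reaction(steps):
    """Cancel intermediates and return (net inputs, net outputs) of a pathway."""
    consumed, produced = set(), set()
    for _, substrates, products in steps:
        consumed |= substrates
        produced |= products
    # Anything both made and used inside the pathway is an intermediate.
    return consumed - produced, produced - consumed


inputs, outputs = net_reaction(UREA_CYCLE)
print(sorted(inputs), "->", sorted(outputs))
```

Ornithine drops out of the net reaction because it is consumed in one step and regenerated in the last, which is what makes the pathway a cycle.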
Protein folding
- A protein must fold in order to attain its native conformation. A loss of protein structure can result in a loss of function
- The folding of a protein is rapid and occurs in a stepwise manner. Although it is quick, the folding pathway is complicated and has yet to be fully worked out. Several models have been proposed to explain it, but the exact mechanism remains unknown
Models of protein folding:
- Hierarchical folding: local secondary structure forms first. Certain amino acid sequences readily form alpha helices or beta sheets, based on their propensity. Ionic interactions play a key role in guiding protein folding at this early stage. Longer range interactions, e.g. the interaction between 2 alpha helices to form a stable supersecondary structure, are seen at a later stage. This process continues until the protein has folded completely
- Molten globule model: folding starts with the spontaneous collapse of the polypeptide into a compact state called a molten globule. The collapse is driven by hydrophobic interactions.
Molecular chaperones:
- Some proteins undergo assisted folding via molecular chaperones. They bind to partially folded or incorrectly folded proteins and guide them towards the right folding pattern
- Two well known classes of chaperones are Hsp 70 and the chaperonins. Hsp stands for heat shock protein; these proteins are more abundant in cells stressed by high temperatures
- Hsp 70: molecular weight about 70,000. They generally bind to regions rich in hydrophobic residues and thereby prevent aggregation. They act on denatured proteins, newly synthesised proteins and proteins which need to remain unfolded. They require ATP for their function and work along with another class of Hsp called Hsp 40. Their homologues in E. coli are DnaK and DnaJ
- Chaperonins: they act on proteins which cannot fold spontaneously. In E. coli they exist as a chaperonin system called GroEL/GroES. Unfolded proteins bind within pockets in the GroEL complex, and GroES acts as the lid of the pocket. Substantial conformational changes (which require ATP), together with the binding and release of GroES, help the polypeptide to fold
- The folding process also involves two key enzymes that catalyse isomerisation reactions: protein disulphide isomerase (PDI) and peptide prolyl cis-trans isomerase (PPI)
Protein misfolding and associated diseases
- Classic examples of diseases arising from protein misfolding include Alzheimer’s disease, Huntington’s disease and Parkinson’s disease. These are collectively called amyloidoses
- When a soluble protein that is normally secreted from a cell is secreted in a misfolded state, it can form insoluble amyloid fibres. The fibres are highly ordered and unbranched and mostly made of beta sheets, and most of them have a core rich in aromatic amino acids
- An amyloid fibril forms by a combination of 2 beta sheets which form around an incorrectly folded protein
- The onset of symptoms in amyloidosis is very slow, except in genetic forms, where there may be an early onset of disease symptoms
- Primary systemic amyloidosis is due to the deposition of misfolded immunoglobulin light chains. Onset of symptoms is at around 65 years, and it mainly affects the kidneys and heart. Symptoms include fatigue, hoarseness, swelling and weight loss
- Secondary systemic amyloidosis affects patients with chronic infections or inflammatory diseases like rheumatoid arthritis, tuberculosis and cystic fibrosis, who exhibit a sharp increase in the secretion of serum amyloid A (SAA). This protein deposits in the spleen, kidney, liver and heart, giving a wide range of symptoms based on the primary organ affected
- Genetic forms arise from inherited mutations in proteins like lysozyme, fibrinogen A and apolipoproteins.
- Some amyloid diseases are organ specific, e.g. deposition of amyloid around the islet cells of the pancreas, which is associated with diabetes mellitus