1Department of Zoology, Darrang College, Tezpur, Assam, India
2Department of Zoology, Gauhati University, Guwahati, Assam, India
Received date: 25/05/2016; Accepted date: 24/12/2016; Published date: 29/12/2016
Visit for more related articles at Research & Reviews: Journal of Zoological Sciences
Retinol-binding proteins (RBP) are a family of carrier proteins with diverse functions that bind retinol. An in silico study was performed for 3D structure prediction and molecular phylogenetic analysis of RBP from two freshwater fishes, Danio rerio and Cyprinus carpio. The analyses used sequence data for the rbp gene and the retinol-binding protein, extracted from the GenBank and UniProtKB/Swiss-Prot databases respectively. The study applies comparative modelling to predict the 3D structure of RBP, and the evolutionary analyses of the rbp gene were conducted in MEGA5 by two methods, Maximum Likelihood and Maximum Parsimony. The predicted structures were deposited in the Protein Model Database (PMDB) after statistical verification. The study shows that the rbp gene of Cyprinus carpio is the closest ortholog of the rbp gene of Danio rerio. The structures of RBP can be helpful in structural biology for further investigations of active sites, molecular mechanisms of function and structure-based phylogeny. They will also help in understanding how the RBP-retinol complex interacts with transthyretin in freshwater fishes. The phylogenetic tree revealed the evolutionary relationships of the RBP family.
Comparative modelling, RBP, Vitamin A, Phylogeny
Freshwater fish having both retinol and dehydroretinol showed the capability of cleaving provitamin A carotenoids into both forms of vitamin A, through either central or terminal cleavage [1-13]. Retinol-binding protein (RBP), which delivers retinol from the liver to the target tissues in vertebrates, acts as a transport protein for plasma and cellular retinoids and as a transporter of small hydrophobic molecules such as lipids, bilins and steroids; it belongs to the lipocalin family. Piscine RBP has already been documented [15,16]. The fish RBP is a monomeric protein of 21 kDa with a single retinol-binding site. Moreover, the transport of RBP is associated with transthyretin (TTR), which prevents the filtration of RBP through the kidney glomerulus.
The structure of RBP is highly conserved and there is considerable similarity among the RBPs of various species, yet RBP is not yet fully understood. It is established that the RBP molecule binds to a TTR tetramer, and the interaction domain with TTR in human has been shown [20,21]. The mechanism of retention of RBP in the absence of TTR is not known in fishes, although Folli recorded the distinct binding site and structural properties of TTR in fish.
Lipocalins are a group of proteins with a wide range of functions that share a closely similar fold (Flower 1995). RBP attains a stable condition at low pH. RBP consists of a single globular domain of ~40 Å, made up of an N-terminal coil, a beta-sheet core, an alpha-helix and a C-terminal coil. The core is an eight-stranded up-and-down beta-barrel in which the retinol is sandwiched.
Although sequence information is available for retinol-binding protein (RBP), structural information is still lacking. Consequently, the biochemistry and molecular mechanism of its function in freshwater fish are not well understood, owing to the lack of species-specific structural information. An attempt was therefore made to predict the 3D folding pattern of RBP and to analyse its sequence. The computational model of RBP can be helpful in structural biology for revealing functional information and for characterization. It is reported that a protein model predicted by an in silico method can be around 95% similar to its natural structure. The key to successful homology modelling is usually not the server or software used to predict the 3D model; the skill of designing a good alignment to a template structure is far more critical.
The study was extended to data mining and sequence analyses of RBP from Danio rerio and Cyprinus carpio (UniProtKB/Swiss-Prot accession numbers Q9PT95 and Q9DET6 respectively). The statistical analyses were performed using CLC Genomics Workbench (CLC Bio, Hyderabad). The sequence analysis covered amino acid composition, atomic composition, theoretical pI, molecular weight, molecular formula, extinction coefficients, half-life, instability index, aliphatic index, hydrophobicity and charge vs. pH.
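Several of the sequence statistics listed above have simple closed forms. The sketch below illustrates three of them: the grand average of hydropathicity (GRAVY, using the Kyte-Doolittle scale), the aliphatic index, and the counts of charged residues. It is an illustration only, not the CLC Genomics Workbench implementation; the function names and the toy peptide are hypothetical and do not come from the study.

```python
# Kyte-Doolittle hydropathy value for each of the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def charged_counts(seq):
    """Return (negative, positive) residue counts as (Asp+Glu, Arg+Lys)."""
    negative = seq.count("D") + seq.count("E")
    positive = seq.count("R") + seq.count("K")
    return negative, positive


# Hypothetical toy peptide, NOT a sequence from the study.
toy_peptide = "AVILDEKR"
toy_gravy = gravy(toy_peptide)
toy_aliphatic = aliphatic_index(toy_peptide)
toy_charges = charged_counts(toy_peptide)
```

A negative GRAVY value indicates a sequence that is hydrophilic on average, which is how the hydropathy values reported for RBP are interpreted later in the paper.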
Molecular Phylogenetic Analysis of RBP Gene and RBP
The sequences for the rbp gene were aligned separately using ClustalW 1.6 integrated in the software MEGA5, with default parameters. The rbp dataset was subjected to phylogenetic analyses, and the evolutionary analyses were conducted in MEGA5. The evolutionary history of the rbp gene and the RBP protein was inferred using Maximum Likelihood estimates. The nucleotide and amino acid substitution models that best fit each dataset, and the model parameters, were estimated using the Akaike information criterion implemented in the program MODELTEST version 3.7. The evolutionary analysis of the rbp gene involved 11 nucleotide sequences and the phylogenetic analysis of RBP involved 8 amino acid sequences.
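Model selection by the Akaike information criterion can be sketched as follows: each candidate model is scored as AIC = 2k - 2 ln L, where k is the number of free parameters and ln L the maximised log-likelihood, and the model with the lowest score is preferred. The log-likelihood values below are hypothetical placeholders, not results from this study, and only the free substitution-rate and base-frequency parameters are counted (branch lengths are the same for every model on a fixed tree).

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (log-likelihood, free parameters) pairs for three models.
# JC69 has 0 free substitution parameters, K80 (Kimura 2-parameter) has 1,
# and GTR has 8 (5 relative rates + 3 base frequencies).
candidates = {
    "JC69": (-1250.0, 0),
    "K80": (-1200.0, 1),
    "GTR": (-1199.5, 8),
}

scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
best_model = min(scores, key=scores.get)
```

With these placeholder values the small likelihood gain of GTR does not offset its seven extra parameters, so the simpler K80 model scores best.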
Three-dimensional Structure Prediction
WU BLAST and FASTA [27-29] searches were performed independently against PDB [30,31] to obtain a suitable template. The significance of the BLAST results was assessed by the expect values (e-values) generated by the BLAST family of search algorithms. The target-template alignment was carried out using ClustalW version 2.1 and Modeller 9.10. Comparative (homology) modelling was conducted with Modeller version 9.10. The final 3D structures with all coordinates for RBP for both targets were obtained by optimization of the molecular probability density function (pdf) of Modeller. The molecular pdf for homology modelling was optimized with the variable target function procedure in Cartesian space, which employed the method of conjugate gradients and molecular dynamics with simulated annealing.
The 3D structures of RBP were evaluated with the ERRAT and ProCheck programs. After successful verification, the coordinate files were deposited in PMDB. All graphic presentations of the 3D structures were prepared using the Chimera and RasMol programs.
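Because the quality of the target-template alignment largely determines the quality of a homology model, a quick check on a candidate template is its percent identity to the target over the aligned positions. The sketch below is one common way to compute it, assuming two gapped strings of equal length with "-" as the gap character; the function name and the toy alignment are hypothetical, and other tools may use a different denominator (for example the full alignment length).

```python
def percent_identity(aligned_target, aligned_template):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns in which both sequences carry a residue.
    columns = [
        (a, b)
        for a, b in zip(aligned_target, aligned_template)
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment, NOT the RBP/1QAB alignment from the study.
toy_identity = percent_identity("ACD-EF", "ACDGQF")
```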
The rbp gene (cDNA) sequences of the present study were 891 and 1006 nucleotides long in Cyprinus carpio and Danio rerio, with molecular weights of 288.258 kDa and 325.597 kDa respectively. The melting temperature ranged from 81.85 °C to 83.86 °C at 0.1 M salt concentration (Table 1). The nucleotide sequence analysis based on the homologous rbp mRNA (cDNA) sequences showed a predominance of A+T in the rbp gene (Table 1; Figure 1). The frequencies of A+T in Cyprinus carpio and Danio rerio were 0.538 and 0.586 respectively (Table 1).
|Information||Danio rerio||Cyprinus carpio|
|Length||1006 nuc||891 nuc|
|Molecular weight||325.597 kDa||288.258 kDa|
|Frequencies of A+T||0.586||0.538|
|Frequencies of C+G||0.414||0.462|
|Melting temperature (°C) at [salt]=0.1M||81.85||83.86|
Table 1: Nucleotide sequence statistics.
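The A+T and C+G frequencies in Table 1 are complementary fractions of the base composition, so each column sums to 1 (0.586 + 0.414 and 0.538 + 0.462). A minimal sketch of how such frequencies can be computed from a sequence is given below; the function name and the toy sequence are hypothetical, and ambiguous bases are simply ignored here, which may differ from the treatment in CLC Genomics Workbench.

```python
from collections import Counter


def base_frequencies(seq):
    """Return the A+T and C+G fractions of a nucleotide sequence."""
    # Treat RNA as DNA and count only unambiguous bases.
    normalized = seq.upper().replace("U", "T")
    counts = Counter(base for base in normalized if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    at_fraction = (counts["A"] + counts["T"]) / total
    return {"AT": at_fraction, "GC": 1.0 - at_fraction}


# Hypothetical toy sequence, NOT the rbp cDNA from the study.
toy_frequencies = base_frequencies("AATTGC")
```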
The primary structures of RBP comprise 192 and 213 amino acid residues in D. rerio and C. carpio respectively. The amino acids alanine (A) and aspartic acid (D) were found to be predominant in the RBP of these two fish species (Figure 2A). Sequence analysis of RBP revealed negative hydropathy on average (-0.228 to -0.235) (Table 2; Figure 3). The molecular weight of RBP in the fishes of the present study ranged from 21970.8 Da (in D. rerio) to 24522.8 Da (in C. carpio). The isoelectric point of RBP ranged from 4.83 (in D. rerio) to 4.98 (in C. carpio) (Table 2; Figure 2B). The extinction coefficients for RBP are 42900 [Abs 0.1% (=1 g/l) 0.758] and 50015 [Abs 0.1% (=1 g/l) 0.748] in D. rerio and C. carpio respectively. The instability index (II) of RBP was computed to be 33.52 to 36.50. There were 27 to 29 negatively charged and 18 to 22 positively charged amino acid residues in the RBP sequences. The aliphatic index for RBP was computed in the range of 73.23 to 74.65 (Table 2). Multiple sequence alignment of the RBP protein showed that Danio rerio has a deletion of ‘FLESNTT VKQDCALGTCWAQDCL’ from the 19th to the 39th position (Figure 4).
|Protein statistics||Danio rerio||Cyprinus carpio|
|Number of amino acids||192||213|
|SWISS PROT AC number||Q9PT95||Q9DET6|
|Molecular weight||21970.8 Da||24522.8 Da|
|Total number of negatively charged residues (Asp+Glu)||27||29|
|Total number of positively charged residues (Arg+Lys)||18||22|
|Total number of atoms||3013||3372|
|Ext. coefficient (assuming all Cys residues appear as half cystines)||42900 (Abs 0.1% (=1 g/l) 0.758)||50015 (Abs 0.1% (=1 g/l) 0.748)|
|Grand average of hydropathicity (GRAVY)||-0.235||-0.228|
Table 2: Protein statistics.
Phylogenetic Profile of rbp Gene and RBP
The evolutionary tree of the rbp gene among eleven (11) fishes supports the view that the rbp genes of Danio rerio and Cyprinus carpio have evolved in parallel; they are represented as sister taxa with 100% bootstrap support, while Salmo salar, Oncorhynchus sp. and Anguilla anguilla clustered together as their successive sister group in both the MP and the ML tree. Oreochromis niloticus and Tetraodon nigroviridis were represented as the outgroup in the rbp gene phylogeny (Figures 5 and 6).
Figure 5: Molecular phylogenetic analysis of rbp gene using Maximum Likelihood estimates. A. rbp gene, B. RBP protein. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches (Felsenstein, 1985). The scale bars represent the branch lengths measured in the number of substitutions per site over the whole sequence.
The evolutionary history of the rbp gene was inferred using the Maximum Likelihood method based on the Kimura 2-parameter model [43-49]. The bootstrap consensus tree inferred from 1000 replicates is taken to represent the evolutionary history of the taxa analyzed [51-55]. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically as follows. When the number of common sites was <100 or less than one fourth of the total number of sites, the maximum parsimony method was used; otherwise the BIONJ method with an MCL distance matrix was used. A discrete Gamma distribution was used to model evolutionary rate differences among sites [5 categories (+G, parameter=0.4797)]. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. There were a total of 530 positions in the final dataset.
The evolutionary history of the RBP protein was inferred using the Maximum Likelihood method based on the Whelan and Goldman model [57-60]. The tree with the highest log likelihood (-1108.0087) is shown. Initial tree(s) for the heuristic search were obtained automatically as follows. When the number of common sites was <100 or less than one fourth of the total number of sites, the maximum parsimony method was used; otherwise the BIONJ method with an MCL distance matrix was used. A discrete Gamma distribution was used to model evolutionary rate differences among sites [5 categories (+G, parameter=0.6365)]. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. All positions containing gaps and missing data were eliminated. There were a total of 175 positions in the final dataset.
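The Kimura 2-parameter model distinguishes transitions (purine to purine, or pyrimidine to pyrimidine) from transversions. Its effect is easiest to see in the closed-form pairwise distance d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the observed proportions of transitional and transversional differences. The sketch below computes that distance for two aligned sequences; it illustrates the substitution model only and is not the Maximum Likelihood tree search performed in MEGA5. The function name and toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Skip gaps and ambiguous bases.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical toy alignment: one transition (A/G) and one transversion (A/C)
# in ten sites. NOT data from the study.
toy_distance = k2p_distance("AAAAAAAAAA", "GAAAAAAAAC")
```

For the toy pair the corrected distance is larger than the raw proportion of differences (0.2), because the model accounts for multiple substitutions at the same site.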
The Tertiary Structures of RBP
The resultant 3D structures of RBP are based on the coordinates of PDB ID 1QAB, selected on the basis of the WU BLAST, EBI FASTA and target-template alignment results. The tertiary structure of RBP in Danio rerio has 4 sheets, 5 beta hairpins, 4 beta bulges, 14 strands, 1 helix, 24 beta turns and 3 gamma turns, and the RBP structure of Cyprinus carpio has 4 sheets, 7 beta hairpins, 5 beta bulges, 16 strands, 2 helices, 29 beta turns and 3 gamma turns. ProCheck verification indicated that the models are of good quality (88.8% to 96.3% of amino acids in theoretically allowed regions) as judged by the Ramachandran plot [61,62]. The structures were found to be statistically significant by the structure verification programs. After successful verification, the tertiary structures of RBP were deposited in the Protein Model Database. The tertiary structures of RBP of Danio rerio and Cyprinus carpio have been assigned PMDB IDs PM0076072 and PM0078117 respectively.
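A Ramachandran plot is built from the backbone dihedral angles of each residue: phi is the dihedral defined by the atoms C(i-1), N, CA, C and psi by N, CA, C, N(i+1). The sketch below shows how a single dihedral angle can be computed from four Cartesian coordinates; it is a generic geometric illustration, not the ProCheck implementation, and the function name and test points are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond.
    norm = math.sqrt(_dot(b1, b1))
    axis = tuple(x / norm for x in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, axis) * x for x in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * x for x in axis))
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


# Hypothetical test points with the central bond along the z axis.
cis_angle = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 1.0))
trans_angle = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 1.0))
right_angle = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
```

Applying such a function to the backbone atoms of consecutive residues yields the (phi, psi) pairs that verification programs compare against the allowed regions of the plot.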
Based on the WU BLAST and EBI FASTA results, 1QAB (Chain E, human retinol-binding protein) was the best template for modelling the target proteins. The computation was carried out on the complete sequence of both proteins. The sequence analysis showed that the molecular formulae for the RBP of Danio rerio and Cyprinus carpio are C980H1470N258O291S14 and C1089H1653N289O324S17 respectively. Sequence analysis of RBP revealed negative hydropathy on average (Table 2; Figure 3), which signifies the polar and hydrophilic nature of RBP. The instability index (II) for Danio rerio and Cyprinus carpio (33.52 to 36.50, below the threshold of 40) classifies RBP as stable. ProCheck analyses show that the fish RBP structures are in conformity with the human serum retinol-binding protein (PDB ID 1RBP), which has 175 amino acid residues.
The models presented here can serve as a guide for the allocation of the amino acid residues involved in each fold, which is important for further investigations of the molecular mechanism of function. The present study was performed for sequence analysis and prediction of the 3D structure of RBP from Danio rerio and Cyprinus carpio using homology modelling. A series of molecular modelling and computational methods were combined to gain insight into the 3D structure. Further wet-lab study of the role of other factors in RBP biosynthesis is in progress in our laboratory and could add important information to the overall understanding of the fish model.
The paper is dedicated with deep reverence to the late Professor UC Goswami, former Professor and Head of the Department of Zoology, Gauhati University, Assam, India, who inspired the authors to undertake molecular and in silico analysis of retinoids and carotenoids.
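A molecular formula can be cross-checked against the atom count and molecular weight reported in Table 2. The sketch below parses the formula given for the Danio rerio RBP and sums the atoms and average atomic masses; the atomic masses are standard rounded values, and the function names are hypothetical. The same check can be applied to any other formula of this form.

```python
import re

# Average atomic masses (rounded standard values) for the elements in proteins.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def parse_formula(formula):
    """Parse a formula such as 'C2H6O' into an {element: count} mapping."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def total_atoms(formula):
    """Total number of atoms in the formula."""
    return sum(parse_formula(formula).values())


def molecular_mass(formula):
    """Average molecular mass in Da from the summed atomic masses."""
    return sum(ATOMIC_MASS[el] * n for el, n in parse_formula(formula).items())


# Formula reported in the text for the RBP of Danio rerio.
danio_formula = "C980H1470N258O291S14"
danio_atoms = total_atoms(danio_formula)
danio_mass = molecular_mass(danio_formula)
```

For this formula the atom count is 3013 and the mass is about 21971 Da, in agreement with the values for Danio rerio in Table 2.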