Received: 22/10/2015 Accepted: 20/11/2015 Published: 27/11/2015
Visit for more related articles at Research & Reviews: Journal of Ecology and Environmental Sciences
Rapid industrialization and population growth generate toxic chemicals that pollute the environment. Microbial degradation is an environmentally and economically sound way to restore polluted environments. Databases and computer programs now greatly assist the development and implementation of bioremediation. Before microbial degradation is applied in a particular niche, an in silico study that predicts possible degradation pathways with computational tools can save time and reduce the number of basic laboratory experiments. This review focuses on how computational techniques can be applied at the various levels of the degradation process, including analysis of the physical, chemical and functional properties of toxins, toxicity prediction, genomic and proteomic approaches to microbial enzymes, pathway prediction, and prediction of degradation rate. Furthermore, online databases and tools that may be applied in the field of biodegradation are compiled and presented as a webpage named Silico Degradation.
Toxic chemicals, Waste management, Biodegradation, Microbes, In silico
Human activities, driven by rapid industrialization, continually generate toxic chemical wastes that pollute the environment. A number of serious and highly publicized pollution incidents associated with incorrect waste management practices have led to public concern about the lack of controls, inadequate legislation, and environmental and human health impacts. In general, toxic and hazardous xenobiotics have structures that differ from those of naturally occurring compounds and are more difficult to degrade. Nevertheless, microorganisms can use xenobiotics as a source of energy for their survival. The biological destruction of toxins is based on the principles that support all ecosystems. Microorganisms secrete enzymes and convert complex organic compounds to CO2 or other simple organic compounds through metabolic routes. This oxidation process yields energy and reducing equivalents that are used to convert part of the intermediates to cell mass (assimilation), enabling the organisms to grow and carry out the degradation process. An overview of the biodegradation process is represented graphically in Figure 1.
Bioremediation is an umbrella concept that covers the various layers of multistage complexity involved in removing toxic waste from polluted sites. Several databases and tools are available to assist this process at every layer. They provide or predict information on chemicals, toxicity, risk assessment, environmental properties of the chemicals, microbial enzymes, metabolic pathways and the degree of degradation. Users can query these databases according to their research interests. They can retrieve toxicity information from chemical databases, or predict the toxicity of chemicals using quantitative structure-activity relationships (QSAR). In addition, several pathway prediction systems are available for chemicals whose degradation pathways are not known in the literature. With these systems, users can not only predict the degradation pathways but also identify the enzymes involved in them. However, most researchers in the field of bioremediation are unaware of these user-friendly computational techniques. Others have basic knowledge of bioinformatics and the ability to use the resources, but lack the background on how bioinformatics can be applied at the different levels of the biodegradation process. Numerous reviews in this field do an excellent job of listing the available bioinformatics resources [4-8]. However, quite a few questions arise when an integrated approach of bioinformatics and biodegradation is implemented in detoxification experiments. The present effort describes the different bioinformatics techniques and how they link to the experimental procedures. This is represented schematically in Figure 2.
This review also provides a comprehensive list of databases and tools used in microbial degradation. Besides the tools listed under each approach, the database BioRadBase holds information exclusively on the bioremediation of radioactive waste. A web page was created that may be used as an online directory of in silico bioremediation technology.
Essential aspects of enzymatic degradation are efficiency, reaction kinetics and selectivity, which are represented by physicochemical parameters. Chemical characterization involves the compilation of data on physical and chemical properties, uses, environmental surveillance, fate and transport, and properties that relate to the potential for exposure, bioaccumulation and toxicity. The available chemical databases include ChemIDplus, PubChem and ECHA. Most of these databases provide the name, synonyms, SMILES code, molecular weight, chemical formula, an image of the chemical structure, a canonical three-dimensional structure in PDB format, density, evaporation rate, melting point, boiling point, water solubility and links to other related databases. In addition, the online tools TerraQSAR and SAMFA predict molecular properties from structure.
The catabolism in the degradation process takes place through the interaction of microbial enzymes with the functional groups of the chemicals to be degraded. The concept has been explained in terms of general molecular interactions. One approach to combinatorial ligand design begins by determining optimal locations. Before the possibility of a molecular interaction is predicted, it is essential to know the functional groups involved in the process. Specific databases, including ChemIDplus, Chemogenesis and the Mitishamba database, provide this information for each chemical.
Many technologies have been developed for the environmental cleanup of toxic compounds by microorganisms. Without knowledge of the toxicity level of the compounds, however, cleanup cannot be fully successful, because toxicity affects the survival of the degradative strains. Thus, before much effort is invested in bioremediation technology for any chemical, its toxicity level should be predicted by in silico approaches. Many investigators have created databases of chemicals with a toxicity index. These include ACuteTox, Chemical Effects in Biological Systems (CEBS), Terra-Base, GENE-TOX (Genetic Toxicity Data Bank), Hazardous Substances Data Bank (HSDB), SuperToxic, Aggregated Computational Toxicology Resource (ACToR), Comparative Toxicogenomics Database (CTD), Carcinogenic Potency Database (CPDB), Toxicity Literature Online (TOXLINE), Chemical Carcinogenesis Research Information System (CCRIS), Developmental and Reproductive Toxicology Database (DART), Integrated Risk Information System, Actoxbase, International Uniform Chemical Information Database, Haz-Map, TOXMAP, Toxics Release Inventory, the Household Products Database, European Chemical Substances Information System, eChemPortal, EPA Human Health Benchmarks for Pesticides, EPA Office of Pesticide Programs' Aquatic Life Benchmarks (OPPALB), Chemical Safety Information from Intergovernmental Organizations (INCHEM), Japan Existing Chemical Database (JECDB), Substances in Preparations in the Nordic Countries (SPIN) and US EPA Substance Registry Services. In addition, databases such as ECOSAR, TOPKAT, EnviChem and ECOTOX specifically cover environmental effects.
Predicting the toxicity of a compound by in silico toxicological methods is a developing field. The tools available so far include PBT Profiler, Derek (Lhasa Ltd), HazardExpert, ACD/Tox Suite, ADMET Predictor, OncoLogic, Toxtree, MolCode Toolbox, VirtualToxLab, Search Nexus, Toxicity Estimation Software Tool (TEST), CAESAR, ToxiPred and the ToxCast program.
Microbial degradation is the major and ultimate natural mechanism by which the polluted environment can be cleaned up. The major principle of the biodegradation process is illustrated in Figure 3. The Biodegradative Strain Database is a public repository holding a collection of microbes that degrade toxic substances.
The degradation of chemicals can be mediated by specific enzyme systems. The initial intracellular attack on organic pollutants is primarily an oxidative process, catalyzed by oxygenases and peroxidases. Other mechanisms involved are the attachment of microbial cells to the substrates and the production of biosurfactants. The peripheral degradation pathways convert organic pollutants step by step into intermediates of the central intermediary metabolism. Databases that hold information on these enzymes include ExPASy, OxDBase (Database of Biodegradative Oxygenases), KEGG, Bionemo (Molecular Information on Biodegradation Metabolism) and the National Center for Biotechnology Information (NCBI). These databases allow users to retrieve the list of enzymes present in a particular microbe.
Sequence Assembly: In gene sequencing, multiple experimental copies are needed to identify repeats and chimeras. After shotgun sequencing, the joining of fragments into contigs can be modeled as a weighted graph in which the nodes are fragments and the edge weights are the numbers of overlapping nucleotides; the fragments are joined on the basis of maximum overlap using a greedy algorithm. Notable tools developed for sequence assembly include SSAKE, SOAPdenovo, ABySS and Velvet.
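The greedy maximum-overlap idea described above can be illustrated with a minimal sketch. It is a toy version under simplifying assumptions (error-free reads, a single strand, no repeats) and is not how any of the named assemblers is implemented; the function names `overlap` and `greedy_assemble` and the example reads are hypothetical.

```python
def overlap(a, b):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments):
    """Repeatedly merge the pair of fragments with the largest overlap.

    Toy assumptions: reads are error-free and come from one strand.
    Returns the list of contigs left when no overlaps remain.
    """
    # Drop exact duplicates and fragments fully contained in another one.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        # Edge weight between two nodes = number of overlapping nucleotides.
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Hypothetical reads sampled from one short sequence.
reads = ["ATTAGACCTG", "CCTGCCGGAA", "AGACCTGCCG", "GCCGGAATAC"]
print(greedy_assemble(reads))
```

Because the greedy choice is only locally optimal, repeats in real genomes can mislead it, which is one reason practical assemblers add further heuristics.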
Gene Prediction: After the contigs are joined, the next step is to identify the protein-coding regions, or open reading frames (ORFs), in the genome. Genes can be identified by searching known gene databases such as GenBank or by using Hidden Markov Model (HMM) based techniques. Tools that use HMMs include GLIMMER, AUGUSTUS, BGF, FGENESH and GeneMark.
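As a rough illustration of what an ORF is, the sketch below scans the three forward reading frames for stretches that start with ATG and end at a stop codon. This naive scan is a swapped-in simplification, not the HMM-based approach used by the tools named above; it ignores the reverse strand and alternative start codons, and the function name `find_orfs` and the `min_codons` threshold are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, sequence) for forward-strand ORFs.

    An ORF here runs from the first ATG in a frame to the next in-frame
    stop codon (inclusive). `min_codons` counts codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, seq[start:pos + 3]))
                start = None  # close it and keep scanning
    return orfs


# Hypothetical toy sequence with one ORF in the third frame.
print(find_orfs("CCATGAAATTTGGGTAACC"))
```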
Bacterial Annotation: Once a final gene set has been obtained, a number of post-processing procedures are applied to filter and annotate the predicted genes. Genome annotation is the description of an individual gene and its protein (or RNA) product, and of the function assigned to the gene product. The record may also include a brief description of the evidence for this assigned function. A schematic sketch is given in Figure 4. Annotation can be done using the online tools RAST, BASys and WeGAS.
Proteomics: The proteomics techniques used in biodegradation include sequencing, comparative studies and functional analysis; the same techniques may also be applied to gene sequences. Proteomics also covers the identification and characterization of protein-related properties, and the reconstruction of metabolic and regulatory pathways. BLAST, ALIGN Query, LALIGN, FFAS, FASTA, GeneWise, SIM and SSEA are used for pairwise sequence alignment. For multiple sequence alignment, ClustalW, MAFFT, Clustal Omega, DbClustal, PROBCONS, webPRANK, GUIDANCE, SALIGN, AlignMe and PRALINE are widely used. One study identified the potential amino acids of glutathione transferases (GSTs) that are involved in the degradation of toxic pollutants, including polychlorinated biphenyls. The authors made a multiple sequence alignment of the bacterial GSTs and identified the identical amino acids. Through in vitro site-directed mutagenesis studies, they subsequently showed that these amino acids play a role in determining catalytic activity.
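The step of reading identical amino acids off a multiple sequence alignment can be sketched as follows. The sketch assumes the alignment has already been produced by one of the tools above and is supplied as equal-length gapped strings; the sequences shown are invented toy strings, not real GST sequences, and the function name `identical_columns` is hypothetical.

```python
def identical_columns(aligned):
    """Return (column_index, residue) for fully conserved alignment columns.

    `aligned` is a list of equal-length strings in which '-' marks a gap.
    Columns containing a gap are never reported as conserved.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must all have the same length")

    conserved = []
    for i in range(length):
        column = {s[i] for s in aligned}
        if len(column) == 1 and "-" not in column:
            conserved.append((i, aligned[0][i]))
    return conserved


# Invented toy alignment (not real protein sequences).
toy_alignment = ["MKT-LS", "MRTALS", "MKTGLS"]
print(identical_columns(toy_alignment))
```

Residues that come out as identical across all sequences are only candidates for functional importance; as in the study described above, their role still has to be confirmed experimentally.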
The ancestors of the organisms are then determined by phylogenetic analysis using Phylip, T-Rex, Phylogeny.fr, PHYML, ProtTest, Phylemon2, POWER, Phylodendron, Phylogenetic tree prediction, CVTree, webPRANK, Replacement Matrix and DIVEIN. Phylogenetic analysis has been used to identify novel microbes related to known microbes involved in degradation processes, notably in cyclic nitramine degradation, geosmin degradation, alkane degradation and textile dye decolorization. Likewise, in a study of polyethylene biodegradation, the degrading bacterium was identified by 16S rRNA sequencing. The bacterium was Pseudomonas citronellolis, and the BLAST result gave 96% similarity (a non-significant score) with the database; the authors therefore concluded that their bacterium was a novel strain.
Once the organisms and the specific enzymes involved in the catabolic process are known, the pathway can be traced using databases. Metabolic pathway databases include KEGG, UM-BBD, EMP, BioCyc, MetaRouter, PANTHER, Pathway Commons, MetaCyc, BRENDA, Roche Biochemical Pathways, BioCarta and WIT. Among these, the KEGG system is the most highly praised for user-friendliness and for the amount of data it incorporates. The EcoCyc system has the most information on its specific organism. It also incorporates direct links to the literature and to the experimental data used to set up the system. PathDB, ExPASy and UM-BBD are the other most useful systems; beginners, however, will prefer the KEGG graphics.
Protein-ligand docking tools can be used to screen pollutants for their susceptibility to degradation by an already characterized enzyme. On the one hand, docking algorithms make it easier to predict possible interactions and identify the underlying mechanisms. On the other hand, they fit the generated poses into the target protein under investigation and can thus be used to develop new metabolites. Based on this information, metabolically engineered organisms may be produced that could enhance degradation [10,23]. Before the pollutant is docked, the binding site of the enzyme should be analyzed thoroughly, using tools such as PROCAT, Active Site Prediction, PAR-3D, Q-SiteFinder and Pocket-Finder.
A number of tools are freely or commercially available for docking studies, including AutoDock, DOCK, GOLD, Glide, SCIGRESS, GlamDock, GEMDOCK, iGEMDOCK, HomDock, ICM, FlexX, Flex-Ensemble (FlexE), Fleksy, FITTED, VLifeDock, ParaDockS, Molegro Virtual Docker, eHiTS, DAIM-SEED-FFLD, AutoDock Vina, VinaMPI, OEDocking, idock, Rosetta Ligand, rDock, MOE, Lead Finder, YASARA Structure, GalaxyDock, FINDSITE-LHM, ADAM, hint!, PLANTS, HADDOCK, Computer-Aided Drug-Design, GriDock, DockoMatic and BDT. One comparative study of the most widely used tools reported that AutoDock reproduced better results than DOCK, FlexX and GOLD. This standard in silico method has aided many degradation studies, notably on azo dye degradation and pyrene degradation. Another study used a different approach for prediction: the authors collected pollutants from the PubChem database and docked them with laccase, a well-studied enzyme used for bioremediation of xenobiotics such as phenols and anilines, and thereby suggested that laccase might be able to oxidize the selected pollutants.
Bacteria that dwell in polluted environments are often capable of evolving new pathways from pre-existing ones that cope with natural compounds. Promiscuous enzymes can evolve to become more effective catalysts as a result of selective pressure for detoxification of a toxic compound or for use of a novel source of carbon, nitrogen or phosphorus. These novel enzymes are regulators for the degradation of xenobiotic analogues [8,11]. A wide variety of in vitro and in vivo techniques have been developed to identify critical toxicity pathways. Computational approaches are also being used to design novel catabolic pathways. Tools used to predict catabolic pathways include UM-BBD, METEOR, MetabolExpert, MetaRouter, MultiCASE/META, UM-PPS, PathPred, Biochemical Network Integrated Computational Explorer (BNICE), DESHARKY, From Metabolite to Metabolite (FMM), CarbonSearch, OptStrain and Metabolic Tinker. Among them, tools such as OptStrain are primarily intended for predicting pathways for the biosynthesis of valuable end products, whereas others, such as DESHARKY, MetaRouter and UM-BBD, are useful for predicting degradative pathways as well. Some algorithms generate pathways using known reactions drawn from metabolic pathway databases such as MetaCyc and KEGG. Others generate pathways based on known transformations of functional groups, regardless of whether an enzyme is known to catalyze a specific reaction.
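The first family of algorithms, which assembles pathways from known reactions, can be pictured as a search over a reaction graph. The sketch below uses a breadth-first search to find the shortest chain of reactions from a pollutant to a central metabolite. It is only a schematic illustration under the assumption that reactions are given as a simple substrate-to-products mapping; it does not reproduce the method of any tool named above, and the compound labels and the function name `find_pathway` are placeholders.

```python
from collections import deque


def find_pathway(reactions, start, target):
    """Shortest chain of compounds from `start` to `target`, or None.

    `reactions` maps a substrate to the list of products reachable in
    one known reaction step (a hypothetical, simplified representation).
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for product in reactions.get(path[-1], []):
            if product not in seen:
                seen.add(product)
                queue.append(path + [product])
    return None


# Placeholder network; the labels do not refer to real compounds.
toy_reactions = {
    "pollutant": ["intermediate_1", "intermediate_2"],
    "intermediate_1": ["intermediate_3"],
    "intermediate_2": ["central_metabolite"],
    "intermediate_3": ["central_metabolite"],
}
print(find_pathway(toy_reactions, "pollutant", "central_metabolite"))
```

Rule-based predictors differ mainly in how the edges are generated: instead of looking reactions up, they apply functional-group transformation rules to each compound to produce candidate products.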
Using this computational approach, one study examined how novel biodegradation pathways influence the existing metabolism of a host organism and proposed genetically engineered microorganisms that could be used for bioremediation. Another used computational tools such as UMBBD-PPS and the eMolecules database to identify potent bacteria that can be used to accomplish bioremediation of 1-naphthyl methylcarbamate (carbaryl). Such work will pave the way for the forward engineering of bacteria as efficient biocatalysts for the bioremediation of chemical waste.
The use of computational tools to predict the biodegradation of xenobiotics can help identify the reactions needed to degrade these compounds, providing insight into the fate of xenobiotic compounds in the environment. Current biodegradation prediction methods are rule based and rely on extensive databases. Online tools such as BIOWIN, OECD Toolbox, START, VEGA, Predict-BT and CATABOL have been shown to be useful in designing and evaluating novel xenobiotic biodegradation pathways and in identifying degradation routes that are feasible within the cell. One study made a comparative analysis of VEGA, TOPKAT (Toxicity Prediction by Komputer Assisted Technology, commercial software by Accelrys, Inc.), BIOWIN and START. The authors compared the performance of the tools on the basis of accuracy, sensitivity, specificity and the Matthews correlation coefficient (MCC). According to their results, VEGA gave the best accuracy (99%) for the tested chemicals and the START model gave poor results. Since the BIOWIN, TOPKAT and VEGA models performed well, they may be used to predict ready biodegradability. Another study attempted to predict the degradation rate of aromatic hydrocarbons using BIOWIN; the predictions were in significant agreement with experimental results.
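The four performance measures used in such comparisons follow directly from a confusion matrix of predicted versus experimentally observed biodegradability. The sketch below computes them from standard definitions; the counts in the example are invented for illustration and are not taken from the study described above, and the function name `classification_metrics` is hypothetical.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and MCC from confusion counts.

    tp/tn: correctly predicted positives/negatives
    fp/fn: wrongly predicted positives/negatives
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Invented counts for a hypothetical tool tested on 100 chemicals.
print(classification_metrics(tp=40, tn=45, fp=5, fn=10))
```

Unlike accuracy, MCC stays informative when readily biodegradable and non-biodegradable chemicals are unevenly represented in the test set, which is why it is often reported alongside the other three measures.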
Combining in silico techniques allows enhanced optimization of complex bioremediation processes, especially because the internet provides a wealth of tools that help beginners understand and learn quickly. Although user-friendly computational techniques are available, a key bottleneck lies in choosing tools according to the available inputs and the output of interest, and in analyzing the output quickly enough to draw meaningful conclusions. Moreover, some servers can be down for months, and a list of URLs cannot be exhaustive because of the large number of tools and databases available and because they change almost daily. Nevertheless, these approaches enable the development of strains that efficiently degrade recalcitrant anthropogenic compounds and thereby provide eco-friendly and relatively low-cost methods for the clean-up of contaminated sites and the treatment of industrial wastes. This review lists the resources, and the authors created the webpage Silico Degradation exclusively for this purpose. If fully exploited, this hub of in silico resources could greatly assist the process of microbial degradation.