Recent trends in in silico drug discovery

Received Aug 4th, 2015; Revised Nov 20th, 2015; Accepted Dec 26th, 2015

Drug designing is a process in which new leads (potential drugs) with therapeutic benefits in diseased conditions are discovered. With the development of various computational tools and the availability of databases (holding information about the 3D structure of various molecules), drug discovery has become a comparatively faster process. The two major drug development methods are structure based drug designing and ligand based drug designing. Structure based methods try to make predictions based on the three dimensional structure of the target molecules. The major approach of structure based drug designing is molecular docking, a method based on several sampling algorithms and scoring functions. Docking can be performed in several ways depending upon whether the ligand and the receptor are treated as rigid or flexible. Hotspot grafting is another method of drug designing. It is preferred when the structure of a native binding protein and target protein complex is available and the hotspots on the interface are known. In the absence of information on the three dimensional structure of the target molecule, ligand based methods are used. Two common methods used in ligand based drug designing are pharmacophore modelling and QSAR. Pharmacophore modelling explains only the essential features of an active ligand, whereas a QSAR model determines the effect of a certain property on the activity of the ligand. Fragment based drug designing is a de novo approach of building new lead compounds using fragments within the active site of the protein. All the candidate leads obtained by the various drug designing methods need to satisfy ADMET properties for their development as drugs. In silico ADMET prediction tools have made ADMET profiling an easier and faster process. In this review, various software packages available for drug designing and ADMET property prediction are also listed.


INTRODUCTION
A drug is a small organic molecule that has therapeutic benefits because it activates or inhibits the function of a biomolecule, which can be an enzyme, a receptor (circulating messengers), a target involved in cell replication and protein synthesis (DNA, RNA) or a transport system (ion channels). A drug molecule can be developed by a process known as drug designing, in which knowledge of the biological target is used. Traditionally, drug development was based on random trial and error methods, which were highly time consuming and very expensive and had an extremely low yield (1 in 100,000) [1]. The process has now been facilitated by the development of computational tools and methodologies. This computer based drug designing is target specific and structure based, comparatively fast, and has a low cost and a high success rate. There are two major types of drug designing approaches: structure based drug design and ligand based drug design.

STRUCTURE BASED DRUG DESIGNING:
Structure based drug design [4] relies on three dimensional structural information obtained by modern biophysical techniques such as NMR and X-ray crystallography. If the three dimensional structure is not known, homology modelling is performed to predict the structure of the target based on the structures of related proteins. The general methodology of structure based drug designing [5] (Figure 1) begins with the 3D structure of the target (available in structural databases or generated through homology modelling), which is then docked against compounds (ligands) in the databases. The ligated structures with the best scores are then determined, refined and analysed to find the sites on the ligand involved in binding to the target. These sites can be further optimised to increase the potency of the ligand. This is followed by cell based or biochemical assays and clinical trials. Structure based drug designing is an iterative process: if the potency of a ligand is poor, the next ligand in the list is tested through all the steps discussed above. While choosing a drug target, function prediction, pathway information, disease association and structural data must be considered. Similarly, ligand selection must also be accurate, so that the ligand binds specifically to the target and produces a well-defined physiological effect while minimising undesired side effects. Ligands used in drug designing must follow the Lipinski rules to be successful drug candidates:
1. five or fewer hydrogen-bond donors
2. ten or fewer hydrogen-bond acceptors
3. molecular weight less than or equal to 500
4. calculated logP less than or equal to 5

The most common method used in structure based drug designing is molecular docking [6]. Molecular docking is a method to characterize the behaviour of small molecules in the binding site of the target protein by modelling the interaction between a small molecule and the protein. The two basic steps involved in docking are: (i) use of a sampling method to predict the ligand conformation as well as the pose (position and orientation) of the small molecule (ligand) within the binding site, and (ii) use of scoring schemes to assess the binding affinity [6]. In most cases the position of the binding site in the target molecule is known, which increases docking efficiency. In the absence of such prior information, binding site prediction programs or online servers such as GRID [7][8], POCKET [9], SurfNet [10][11], PASS [12] and MMC [13] can be used. Various sampling algorithms used in molecular docking are [6]:

Matching algorithms (MA):
Matching algorithms [14][15][16] mainly take into account the overlap between two molecules to identify possible binding sites of a protein by a molecular surface search [17]. In this algorithm, docking is guided by generating pharmacophores that represent the protein and the (initial conformation of the) ligand. These ligand and protein pharmacophores are then checked for a match by comparing the distances between each of the pharmacophoric points on them.

[Figure 1 flowchart: general methodology of structure based drug designing]
1. The 3D structure of the target molecule is used if it is known; otherwise homology modelling can be used to generate the 3D structure of the target molecule based on the structure of a closely related protein.
2. Detect active sites.
3. Dock the targets against the compounds in the database.
4. Ligated structures at the top of the list are determined and refined.
5. Each of the ligated structures is analysed to find the sites in the ligand involved in binding to the target.
6. These sites on the ligand can be optimised to increase the potency (involves redesigning of the ligand).
7. Ligands that have been designed are then synthesized and tested through cell based or biochemical assays.
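The distance comparison used by matching algorithms can be illustrated with a small sketch. It is not the implementation of any particular docking program: the point format `(feature_type, (x, y, z))`, the function names and the 0.5 Å tolerance are all assumptions made for illustration, and the site points are taken to be already expressed as the feature types required of the ligand.

```python
import math
from itertools import permutations

def pairwise_distances(points):
    """Distances between every pair of 3D points, in a fixed index order."""
    return [math.dist(points[i], points[j])
            for i in range(len(points)) for j in range(i + 1, len(points))]

def match_pharmacophores(site_points, ligand_points, tolerance=0.5):
    """Find ligand points whose types and internal distances match the site.

    Each point is (feature_type, (x, y, z)). Returns the matching ligand
    points in site order, or None when no assignment fits (hypothetical API).
    """
    site_types = [t for t, _ in site_points]
    site_dists = pairwise_distances([xyz for _, xyz in site_points])
    # Brute-force search over assignments; fine for a handful of points.
    for perm in permutations(ligand_points, len(site_points)):
        if [t for t, _ in perm] != site_types:
            continue  # feature types must correspond one-to-one
        lig_dists = pairwise_distances([xyz for _, xyz in perm])
        if all(abs(a - b) <= tolerance for a, b in zip(site_dists, lig_dists)):
            return list(perm)
    return None

# Made-up example: the ligand is the site pattern translated by (10, 10, 5).
site = [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("hydrophobe", (0, 4, 0))]
ligand = [("hydrophobe", (10, 14, 5)), ("donor", (10, 10, 5)), ("acceptor", (13, 10, 5))]
stretched = [("hydrophobe", (10, 20, 5)), ("donor", (10, 10, 5)), ("acceptor", (13, 10, 5))]
```

Because only internal distances are compared, the match is independent of where the ligand sits in space, which is what allows a subsequent superposition step to place it in the site.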

Fragment based method:
• Incremental construction
The incremental construction algorithm uses a greedy strategy along with an overlap detection method to find new interactions [26]. This algorithm divides the ligand into fragments at its rotatable bonds; the fragments are then docked separately into receptor sites in an incremental fashion and are later fused together [17]. Usually in this method, the base fragment chosen is the largest fragment having some significant role in the interaction with the protein [27]. Software packages using this algorithm include DOCK 4.0 [28], FlexX [29][30], FLOG [31], Surflex [32][33], Hammerhead [34], SLIDE [35] and eHiTS [36].

• Multiple Copy Simultaneous Search (MCSS)
In the MCSS approach [37][38], energetically favourable positions and orientations of different functional groups in the binding site are identified by first generating 1000 to 5000 copies of a functional group, which are placed randomly in the binding site and subjected to energy minimization or molecular dynamics. This process yields functionality maps of the binding site [6][39]. New ligands complementary to the binding site are then constructed by linking the functional groups positioned at these energetically favourable positions.
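The idea of placing many copies at random and minimizing each one independently can be sketched as follows. This is only a toy illustration: the two-dimensional double-well energy, the gradient descent minimizer and all function names are assumptions, standing in for a real force field and a real binding site.

```python
import random
from collections import Counter

def toy_energy_gradient(x, y):
    """Gradient of a hypothetical energy E = (x^2 - 1)^2 + y^2.

    The wells at (-1, 0) and (+1, 0) stand in for two favourable
    positions of a functional group inside a binding site.
    """
    return 4.0 * x * (x * x - 1.0), 2.0 * y

def minimize(x, y, step=0.05, iterations=200):
    """Plain gradient descent from a single starting position."""
    for _ in range(iterations):
        gx, gy = toy_energy_gradient(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y

def mcss_map(n_copies=1000, box=2.0, seed=0):
    """Minimize many randomly placed copies and count where they end up."""
    rng = random.Random(seed)
    minima = Counter()
    for _ in range(n_copies):
        start = rng.uniform(-box, box), rng.uniform(-box, box)
        x, y = minimize(*start)
        # Round to merge copies that reached the same minimum;
        # adding 0.0 turns -0.0 into 0.0 so keys print consistently.
        minima[(round(x, 2) + 0.0, round(y, 2) + 0.0)] += 1
    return minima

functionality_map = mcss_map()
```

The resulting counter plays the role of a very small functionality map: its keys are the favourable positions and its values show how many copies converged to each one.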

• LUDI
The LUDI algorithm is a molecular-fragment based approach that takes into account hydrogen bond and hydrophobic interactions between ligand and protein. It is used for the identification of potential de novo leads [6]. LUDI follows three steps: identification of interaction sites between protein and ligand; identification of fragments (from a fragment library) that form hydrogen bonds and fill hydrophobic pockets on the target; and finally connection of these fragments with linkers to make a single molecule. The capability of LUDI for designing new ligands was demonstrated by the development of HIV protease and DHFR inhibitors [40][41][42][43].

Stochastic methods:
• Monte Carlo (MC)
This method [44][45] is used to produce different poses of the ligand (from initial configurations) inside a receptor binding site through bond rotation, rigid body translation or rotation. Each conformation generated is subjected to a force field based energy minimization [6]. The minimized conformation obtained is taken as the parent conformation (if it satisfies the selection criteria) and is further modified to generate the next conformation. This process is repeated until a specified number of conformations has been generated. The main advantage of this method is that the ligand may cross energy barriers on the potential energy surface owing to the large and random changes. Software packages using this algorithm include AutoDock [46], ICM [47], QXP [48] and Affinity [Accelrys Inc., San Diego, CA, USA] [6].
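A minimal sketch of this perturb, minimize and select loop is given below. The energy function is a hypothetical stand-in for a force field, the pose is reduced to three numbers (x, y, angle), and the Metropolis test is assumed here as the selection criterion; the source does not specify which criterion individual programs use.

```python
import math
import random

def toy_energy(pose):
    """Hypothetical stand-in for a force-field energy; minimum at x=2, y=-1, angle=0."""
    x, y, angle = pose
    return (x - 2.0) ** 2 + (y + 1.0) ** 2 + 1.0 - math.cos(angle)

def local_minimize(pose, energy, step=0.05, iterations=50):
    """Crude coordinate descent: keep any small single-coordinate move that lowers the energy."""
    pose = list(pose)
    best = energy(pose)
    for _ in range(iterations):
        for i in range(len(pose)):
            for delta in (step, -step):
                trial = pose.copy()
                trial[i] += delta
                e = energy(trial)
                if e < best:
                    pose, best = trial, e
    return tuple(pose), best

def monte_carlo_search(energy, start, n_steps=200, max_move=1.0, kT=0.6, seed=0):
    """Random perturbation + minimization, with Metropolis acceptance (an assumption)."""
    rng = random.Random(seed)
    parent, parent_e = local_minimize(start, energy)
    best, best_e = parent, parent_e
    for _ in range(n_steps):
        # Large random change to every degree of freedom of the parent pose.
        trial = tuple(c + rng.uniform(-max_move, max_move) for c in parent)
        trial, trial_e = local_minimize(trial, energy)
        # Always accept improvements; accept worse poses with Boltzmann probability.
        if trial_e <= parent_e or rng.random() < math.exp(-(trial_e - parent_e) / kT):
            parent, parent_e = trial, trial_e
            if trial_e < best_e:
                best, best_e = trial, trial_e
    return best, best_e

best_pose, best_energy = monte_carlo_search(toy_energy, (8.0, 6.0, 2.5))
```

Accepting some uphill moves is what lets the search leave a local minimum, which corresponds to the barrier-crossing advantage described above.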

• Genetic algorithms
The genetic algorithm [49][50][51] is based on Darwinian evolution theory. It is an evolutionary programming (EP) algorithm and performs function optimization [52][53][54]. It has two types of genetic operators: crossover and mutation. Mutation refers to random changes to the genes, and crossover refers to the exchange of genes between two chromosomes [6]. Either of these two results in a new ligand structure with the best characteristics of each parent. Genotypic mutations, which work as a local search operator in the traditional genetic algorithm, have a different role in the Lamarckian genetic algorithm (LGA). The LGA uses an explicit local search operator and swaps between genotypic and phenotypic space. Software packages using this algorithm include AutoDock [49], GOLD [55], DIVALI [56] and DARWIN [57] [6].
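The two genetic operators can be sketched on a toy problem. In this illustration a chromosome is simply a list of numbers (for example translations, rotations and torsion angles of a pose), the score is a hypothetical function rather than a docking score, and the selection scheme (keep the fitter half, carry the best individual forward) is an assumption chosen for brevity.

```python
import random

def toy_score(genes):
    """Hypothetical score (lower is better); optimum when every gene equals 1.0."""
    return sum((g - 1.0) ** 2 for g in genes)

def crossover(a, b, rng):
    """One-point crossover: head of parent a joined to the tail of parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(genes, rng, rate=0.2, scale=0.5):
    """Randomly perturb each gene with probability `rate`."""
    return [g + rng.gauss(0.0, scale) if rng.random() < rate else g for g in genes]

def genetic_search(score, n_genes=6, pop_size=40, generations=60, seed=1):
    rng = random.Random(seed)
    population = [[rng.uniform(-5, 5) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score)
        parents = population[: pop_size // 2]   # selection: keep the fitter half
        children = [population[0]]              # elitism: best individual survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = children
    return min(population, key=score)

best_chromosome = genetic_search(toy_score)
```

A Lamarckian variant would additionally run a local minimization on each child and write the improved values back into its genes; that step is omitted here.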

Molecular dynamics (MD):
Molecular dynamics is a powerful simulation method widely used in many fields of molecular modelling. An MD simulation considers the motion of individual particles as a function of time, providing a physical basis for the structure and thus helping in understanding biomolecular function. Because molecular dynamics progresses in very small steps, it has difficulty stepping over high energy conformational barriers, which leads to inadequate sampling [6]. In MD simulations, individual factors can be included or removed in order to determine their effect on the system. Molecular dynamics simulations along with simulated annealing can be used to sample configuration space by refining structures from experimental data, which further helps to understand the structural thermodynamics and motional properties of the system at equilibrium [58]. Programs based on molecular dynamics include CHARMM [59], AMBER [60] and GROMOS [61].
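How a trajectory is advanced in small time steps can be shown with the velocity Verlet scheme, a commonly used MD integrator. The sketch below is an illustration only: it moves a single particle in a one-dimensional harmonic potential, and the function names, unit mass and force constant are assumptions rather than anything taken from CHARMM, AMBER or GROMOS.

```python
def harmonic_force(x, k=1.0):
    """Force from a hypothetical harmonic potential U = 0.5 * k * x^2."""
    return -k * x

def velocity_verlet(x, v, force, mass=1.0, dt=0.01, n_steps=1000):
    """Integrate Newton's equations of motion and return the (x, v) trajectory."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory

def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy for the harmonic toy system."""
    return 0.5 * mass * v * v + 0.5 * k * x * x

trajectory = velocity_verlet(1.0, 0.0, harmonic_force)
energy_drift = abs(total_energy(*trajectory[-1]) - total_energy(*trajectory[0]))
```

The small time step `dt` keeps the total energy nearly constant, and it is also the reason why crossing high barriers takes many steps, as noted above.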

SCORING FUNCTIONS:
The accuracy of docking hits depends on the quality of the scoring function, which estimates the binding affinity between protein and ligand. Scoring functions are of three main types [6]:

Force field based: [62][63][64]
This type measures the binding energy by summing the non-bonded (electrostatic and van der Waals) interactions. A drawback of this scoring method is its slow computational speed. DOCK [19][20][21][22], GOLD [55], FlexiDock, Tripos, CHARMM [59], AMBER [60] and AutoDock [46] are force field based software packages.
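The summation of non-bonded terms can be written out as a short sketch. It assumes a 12-6 Lennard-Jones form for the van der Waals term and a Coulomb term with a constant dielectric; the single `epsilon` and `sigma` used for every atom pair and the dielectric value of 4 are simplifications made for illustration, not parameters of any named force field.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal*Angstrom/(mol*e^2), converts charge products to kcal/mol

def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy for two atoms separated by r (Angstrom)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def coulomb(r, q1, q2, dielectric=4.0):
    """Coulomb energy between two partial charges with a constant dielectric."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)

def force_field_score(ligand_atoms, protein_atoms, epsilon=0.1, sigma=3.5):
    """Sum van der Waals and electrostatic terms over all ligand-protein atom pairs.

    Each atom is (x, y, z, partial_charge). Lower scores mean stronger binding.
    """
    total = 0.0
    for lx, ly, lz, lq in ligand_atoms:
        for px, py, pz, pq in protein_atoms:
            r = math.dist((lx, ly, lz), (px, py, pz))
            total += lennard_jones(r, epsilon, sigma) + coulomb(r, lq, pq)
    return total

# Made-up two-atom example: opposite partial charges 4 Angstrom apart.
example_score = force_field_score([(0, 0, 0, 0.5)], [(4, 0, 0, -0.5)])
```

The double loop over all atom pairs is the reason this kind of scoring is comparatively slow; practical programs usually precompute the protein contribution on a grid.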

Empirical based: [65][66][67][68][69]
In this scoring method each component of the decomposed binding energy, such as hydrogen bonds, ionic interactions, the hydrophobic effect and binding entropy, is multiplied by a coefficient before being summed up to give a final score. LUDI, AutoDock, PLP [66][67][70] and ChemScore [71] are examples of empirical scoring methods.

Knowledge based: [77]
In this method, interatomic contact frequencies between ligand and protein are calculated using statistical analysis of ligand-protein complexes.

Ensemble docking:
This method makes use of an ensemble of different conformations of a protein structure. The ligand is docked separately to each rigid protein conformation in the ensemble. Finally, the results are merged [6][87].

HOTSPOT GRAFTING
Hotspot grafting is a new method for designing protein drugs [88]. Residues on protein interfaces that dominate the binding energy are known as hotspots [89]. If the structure of a native binding protein and target protein complex is available and the hotspots on the interface are known, then the hotspot grafting strategy is a good choice. In this method, scaffold proteins that can accommodate the hotspots are searched for [88]. Sites accommodating hotspot patterns are detected using pharmacophoric pattern matching approaches such as a graph theory based approach or a set reduction algorithm [90]. The hotspot patterns found are then transferred onto these scaffold proteins. The scaffolds are docked to the target protein on the basis of hotspot superposition, and from these superposed complexes the ones with the highest scores are finally selected. Protein drugs may be designed using the hotspot grafting strategy. For example, using the above algorithm, rat PLCδ1-PH (the pleckstrin homology domain of phospholipase C-δ1) was identified as a scaffold protein by scanning the PDB. This scaffold protein was made to bind to the human erythropoietin receptor (EPOR) by grafting onto it the key residues involved in the interaction with EPOR. The designed rat PLCδ1-PH showed significant biological activity in a cell-based assay and good binding affinity for EPOR in vitro.

LIGAND BASED DRUG DESIGNING:
Ligand based methods such as pharmacophore modelling and quantitative structure activity relationship (QSAR) methods are used when active ligand molecules are known but little or no structural information is available for the target. Pharmacophore modelling can be used to find the features essential for the biological activity of active ligands, which can then be used to screen molecules, and QSAR is used to build models for predicting the activity of novel molecules [91]. Natural products or substrate analogues that give the desired pharmacological effect on interaction with the target molecule can also be used in ligand based drug designing [92].

PHARMACOPHORE:
A pharmacophore can be defined as a molecular framework that specifies the features an active ligand must possess for a drug's biological activity [93][94].

Pharmacophore Modelling:
Pharmacophore modelling is a process in which pharmacophores can be queried against structural databases to:
• retrieve potential leads (lead discovery)
• design molecules having specific desired attributes (lead optimization)
• assess the similarity and diversity of molecules using pharmacophore fingerprints.
A pharmacophore model is generated by first performing literature searches and database queries (of both in-house and public databases) to assemble sets of active compounds. While collecting active compounds from these multiple sources, a consistent threshold for defining compound activity should be applied, and only molecules exerting their biological effects via the same mechanism should be taken into consideration. If the chemical structure of a molecule is not available, software is used to sketch its structure manually. In the next step the pharmacophore model is built using the common features found among the ligands. The ligands are aligned and superimposed with the help of either a conformer database of the relevant ligands or an on-the-fly conformation generator to find common features [95][91]. After the alignment step is completed, a pharmacophore elucidation algorithm is used to generate pharmacophore models. More than one output model is generated, and selection of the best one is assisted by the scoring function contained within the pharmacophore building software. Generally, the model with the highest score is selected and subjected to validation.

QSAR
QSAR stands for "quantitative structure-activity relationship" and relates chemical structure to biological or chemical activity using mathematical models. If the activity of a set of ligands is known, a model can be generated to describe this relationship. A pharmacophore model explains only the essential features of an active ligand, whereas a QSAR model determines the effect of a certain property on the activity of the ligand [96]. This set of properties (descriptors) is computed from the structure and used to quantify it. By using the structure descriptors as independent variables and the activity as the dependent variable, a model can be built to describe the relationship between the two. After a QSAR model has been built and validated, the biological activity of novel molecules can be predicted from their structural properties [97]. The general methodology consists of the steps described in [92].

Comparative Molecular Field Analysis (CoMFA):
CoMFA [101] was the first QSAR method to correlate the 3D shape dependent steric and electrostatic properties of a molecule with its biological activity. CoMFA is based on the assumption that the minimum energy conformer is the bioactive conformer. In this method the aligned molecules (based on their 3D structure) are placed in a 3D grid and their steric and electrostatic potential energies are determined at each grid point [92].
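The field calculation at each grid point can be sketched as follows. Several details are assumptions made for illustration and are not taken from the source: the probe is modelled as a single atom of charge +1, the steric term uses a 12-6 Lennard-Jones form with one `epsilon` and `sigma` for all atoms, and energies are truncated at a cap of 30 kcal/mol.

```python
import math
from itertools import product

def comfa_fields(atoms, grid_min, grid_max, spacing=2.0, cap=30.0,
                 epsilon=0.1, sigma=3.5, probe_charge=1.0):
    """Steric and electrostatic probe energies on a regular 3D grid.

    atoms: list of (x, y, z, partial_charge) for one aligned molecule.
    Returns {grid_point: (steric, electrostatic)}; all parameters are illustrative.
    """
    axes = []
    for lo, hi in zip(grid_min, grid_max):
        n = int(round((hi - lo) / spacing)) + 1
        axes.append([lo + i * spacing for i in range(n)])
    fields = {}
    for point in product(*axes):
        steric = electrostatic = 0.0
        for x, y, z, q in atoms:
            r = max(math.dist(point, (x, y, z)), 0.1)  # avoid division by zero on an atom
            sr6 = (sigma / r) ** 6
            steric += 4.0 * epsilon * (sr6 * sr6 - sr6)
            electrostatic += 332.06 * q * probe_charge / r
        # Truncate very large values close to atoms.
        fields[point] = (min(steric, cap), max(-cap, min(electrostatic, cap)))
    return fields

# Made-up single-atom "molecule" with a partial charge of -0.1 at the origin.
fields = comfa_fields([(0.0, 0.0, 0.0, -0.1)], (-4, -4, -4), (4, 4, 4), spacing=4.0)
```

For a set of aligned molecules, the values at all grid points form one row of descriptors per molecule, which can then be related to activity by a regression method.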

Comparative Molecular Similarity Indices (CoMSIA):
This method is similar to CoMFA, but it considers hydrophobic, hydrogen bond donor and hydrogen bond acceptor properties along with steric and coulombic properties [102][92]. It also calculates similarity indices by comparing each ligand molecule with a common probe that has a radius of 1 Å and charge, hydrophobicity and hydrogen bond properties equal to 1 [100].

CATALYST
CATALYST [Catalyst. Accelrys Inc.; San Diego, CA: 2002] is 3D QSAR software that considers conformational flexibility during model development [92]. To sample the conformational space of the ligand it uses a poling algorithm [103]. Several conformers are generated with a default cut-off value of 20 kcal/mol above the energy of the global minimum conformation [92]. Pharmacophore hypotheses are then developed using the spatial orientations of the functional groups. QSAR models are evaluated by comparing the estimated and observed activity values.

FRAGMENT BASED DRUG DESIGNING
Fragment based drug designing is an approach of constructing lead compounds using small, low molecular weight molecules known as fragments. This technique has several advantages over high throughput screening (HTS) [104]:
• The low complexity of fragments increases the probability of matching a target protein binding site, so that only a hundred to a few thousand compounds need to be screened.
• More hits with better binding efficiency are obtained in comparison to HTS.
• The lead compound obtained has a low molecular weight, as desired for lead likeness.
The fragment based strategy starts with the selection of fragments, which is mainly based on the "Rule of three" [104,105]. In addition, it has been found that fragments should on average have no more than 3 rotatable bonds and a polar surface area of 60 Å² [104]. Fragment selection is followed by screening. The screening methods available are nuclear magnetic resonance (NMR), X-ray crystallography, functional screening and mass spectrometry (both tethering and non-covalent). All these methods have their own advantages and disadvantages. In the last step, the screened fragments are grown into potential drug leads. This can be done by four different methods [104][105].

Fragment Evolution
Direct binding techniques are used to identify an initial fragment, which is then grown into a larger molecule that targets additional interactions in the active site of the protein. This approach leads to tighter binding molecules.

Fragment Linking
In this approach, fragments that bind to separate sites close enough to be chemically linked are identified and joined, thereby generating a larger molecule with high affinity.

Fragment Self-Assembly
This method relies on fragments undergoing self-assembly catalyzed by a template (the target protein). Fragments having complementary functional groups can thus be assembled into a larger and more potent molecule in the presence of the target protein.

Fragment Optimization
In this method lead molecules are re-engineered so that particular properties such as selectivity, cell based activity, oral activity or efficacy can be optimized along with binding affinity. Although the fragment based approach is powerful for drug lead generation, fragment identification, linking and merging remain difficult tasks.

ADME

Absorption:
The process by which a drug crosses the body membranes and enters the tissue or the blood plasma is called absorption. The major factors [106][107][108] that influence the absorption of a drug are:
a) Ionization constant (pKa) - It indicates the degree to which a drug ionizes in an aqueous medium (e.g. water). The more strongly a drug is ionized (strong acids/strong bases), the lower its absorption.
b) Solubility - It is an important factor in determining the oral bioavailability of the drug. Lower solubility leads to lower absorption and thereby lower bioavailability of the drug.
c) Lipophilicity - It describes the tendency of the drug to prefer a lipidic environment to an aqueous one. A higher value indicates high lipid solubility of the drug.
d) Drug particle size - A larger particle size hinders diffusion of the drug across compartments, leading to poor absorption. Therefore, the size of the drug particles must be optimal.
e) Route of administration - Depending upon the nature of the drug, it can be administered via one of several routes such as intravenous, oral, intramuscular, subcutaneous, inhalation, intraperitoneal, etc.

Distribution:
Distribution of a drug is important for it to reach the site of action. It is mainly affected by the extent of absorption and the factors affecting absorption, for example the ability of a drug to diffuse through membranes and its solubility in lipid/water media. Other factors that influence distribution are blood flow rate, tissue storage, metabolism and excretion of the compound, and various physiological barriers. Binding of drug molecules to plasma proteins is another factor that decreases the distribution rate, as it reduces the amount of free drug in plasma [107][108].

Metabolism:
Metabolism is the action of the body on a drug molecule to convert it, through various biochemical processes, into an active agent. The rate and extent of metabolism of a drug molecule depend on its chemical structure, physicochemical properties, the enzymes involved and the products formed. Metabolism greatly affects drug dosage. Both the initial and the maintenance dose of a drug depend on whether the drug is metabolized rapidly or not. Frequent maintenance doses are required for rapidly metabolized drugs (to maintain their therapeutic index), but not for highly stable drugs [107][108].

Excretion:
Drugs are metabolized and excreted either in the form of metabolites or as the parent drug. Excretion can occur via routes such as breath, saliva, urine, faeces, milk, bile, perspiration and hair. The kidney plays the major role in the elimination of drugs from the body. Factors affecting excretion are the molecular weight of the drug and its metabolites and their lipid solubility. Compounds with low lipid solubility (polar) are eliminated faster than those with high lipid solubility (non-polar). Compounds with a low molecular weight tend to be eliminated through urine, whereas those with a high molecular weight are eliminated via bile [107][108].

Toxicity:
A drug is said to be toxic in any of three cases: when it accumulates in the body in higher amounts, when it is not metabolized properly, or when it causes severe side effects (such as when the drug does not bind specifically to the target molecule). Toxicity is a major factor that causes drugs to fail to reach the market or to be withdrawn from it. It can be either acute or chronic, depending upon the period of time taken to cause a toxic effect or damage to the body. Acute toxicity occurs when damage is caused by a single dose within a short time, whereas chronic toxicity occurs when the body is repeatedly exposed to the drug and toxic effects appear after a prolonged period. Toxicity depends on various factors such as the patient's age, genetic composition, drug dosage and other medications taken by the patient.
Traditionally, drugs were synthesized first and then their pharmacokinetic properties, metabolism and toxicity were studied. With the use of combinatorial chemistry and high throughput screening, a large number of new lead compounds are screened, and many of these compounds fail because of inappropriate ADME properties. With the advancement of technology, these studies are now carried out through in silico methods much earlier, before compounds are evaluated in the clinic, saving both time and cost. In ADME studies, pharmacokinetic parameters are evaluated (Table 3). A large number of these approaches have been developed to predict ADME properties [107][108].

Biological effect of drug (E): Effect observed for a drug.
Bound drug (B): Amount of drug in bound form.
Volume of distribution (Vd): Ratio of the amount of drug in the body to the concentration of drug in blood or plasma.
Rate of elimination (Ke): The elimination rate constant is the fraction of drug in an animal that is eliminated per unit of time, e.g. fraction/h.
Clearance (CL): Factor that predicts the rate of elimination in relation to the drug concentration.
Half life (t1/2): Time required for the amount (or concentration) of drug in the body to decrease by one half during elimination.
Bioavailability (F): Actual amount of drug available for action in plasma.
Accumulation: Amount of drug accumulated in the body.
Dosing rate: Amount of drug administered per unit time.
Initial dose: Amount of drug initially administered.
Maintenance dose: Amount of drug administered to maintain the therapeutic level of drug in plasma.
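Several of the parameters listed above are linked by simple formulas when elimination is assumed to be first order in a one-compartment model. That model, the function names and the example numbers below are assumptions added for illustration; they are standard textbook relations rather than something stated in this review.

```python
import math

def volume_of_distribution(amount_in_body_mg, plasma_conc_mg_per_l):
    """Vd = amount of drug in the body / plasma concentration (litres)."""
    return amount_in_body_mg / plasma_conc_mg_per_l

def half_life(ke_per_h):
    """t1/2 = ln(2) / Ke for first-order elimination (hours)."""
    return math.log(2) / ke_per_h

def clearance(ke_per_h, vd_l):
    """CL = Ke * Vd (litres per hour)."""
    return ke_per_h * vd_l

def concentration(c0_mg_per_l, ke_per_h, t_h):
    """Plasma concentration after time t: C(t) = C0 * exp(-Ke * t)."""
    return c0_mg_per_l * math.exp(-ke_per_h * t_h)

def maintenance_dosing_rate(cl_l_per_h, target_conc_mg_per_l, bioavailability=1.0):
    """Dosing rate that holds a target steady-state concentration: CL * Css / F (mg/h)."""
    return cl_l_per_h * target_conc_mg_per_l / bioavailability

# Made-up numbers: 500 mg in the body, 10 mg/L in plasma, Ke = 0.1 per hour.
vd = volume_of_distribution(500.0, 10.0)
t_half = half_life(0.1)
cl = clearance(0.1, vd)
```

With these made-up inputs the concentration falls to half its initial value after one half life, which is a quick consistency check between Ke and t1/2.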

CONCLUSION:
Drug designing is a vast and ever-growing field that involves knowledge of bioinformatics, proteomics, biochemistry and computer modelling. The structural interaction of a ligand with its receptor or target molecule decides the effectiveness of a drug. Structure based drug designing is one of the most widely used drug designing techniques, and within it docking is the most widely employed method. Besides this, pharmacophore modelling and QSAR are other important methods, which are based on three dimensional structural information of ligands. The fragment based approach is a de novo approach in which fragment libraries are explored for lead identification, and it has several advantages over high throughput screening. Inappropriate ADMET properties are the major cause of failure of lead compounds; thus there is an increasing need for in silico tools for ADMET prediction to reduce the rate of late stage attrition and further speed up the drug discovery process. Numerous algorithms, software packages and methods are emerging in this field day by day. These advances in technology have made the drug designing process much easier and faster.

Figure 1. General methodology of structure based drug designing


Absorption, Distribution, Metabolism, Excretion and Toxicity (ADMET) are major causes of the failure of candidate leads in the drug discovery process; these properties must therefore be considered during the development of a drug.

The "Rule of three" used for fragment selection includes:
• Molecular weight < 300
• cLogP ≤ 3 (lipophilicity)
• Number of hydrogen bond donors and acceptors ≤ 3
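The "Rule of three" criteria for fragments (molecular weight, cLogP, and hydrogen bond donor and acceptor counts) translate directly into a filter. The sketch below assumes that the descriptor values have already been computed by some other tool, reads the cLogP and hydrogen bond limits as "at most 3", and uses invented fragment names and values purely as examples.

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    """Precomputed descriptors for one fragment (values supplied by the user)."""
    name: str
    mol_weight: float
    clogp: float
    h_bond_donors: int
    h_bond_acceptors: int

def passes_rule_of_three(f):
    """True when the fragment satisfies every Rule of three threshold."""
    return (f.mol_weight < 300
            and f.clogp <= 3
            and f.h_bond_donors <= 3
            and f.h_bond_acceptors <= 3)

# Hypothetical library entries; the numbers are illustrative, not measured data.
library = [
    Fragment("fragment_A", 180.2, 1.4, 1, 2),
    Fragment("fragment_B", 342.0, 3.8, 2, 5),
]
selected = [f.name for f in library if passes_rule_of_three(f)]
```

The same pattern extends to further criteria, such as limits on rotatable bonds or polar surface area, by adding fields and conditions.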

ISSN: 2278-8115 IJCB Vol. 5, No. 1, August 2016, 54 -76 http
Several unusual interactions such as sulphur-aromatic or cation-pi interactions can be modelled through this function. PMF, DrugScore [78], SMoG [79] and Bleep are knowledge-based functions.

1. Rigid Ligand and Rigid Receptor Docking:
In this case both ligand and receptor are treated as rigid bodies, i.e. flexibility is not considered in either case. This method considers only three translational and three rotational degrees of freedom, and thus the search space is limited [6].

2. Rigid Receptor and Flexible Ligand Docking:
In this case ligand flexibility is taken into consideration while the receptor is treated as a rigid body [6].

3. Flexible Receptor and Flexible Ligand Docking:
To study interactions between the receptor (protein) and the ligand in a better way, it is important to consider flexibility in the receptor, as binding of the ligand leads to conformational changes in the protein. There are four methods that account for protein flexibility in docking [6].
Soft Docking: It works by allowing a small degree of overlap between the ligand and the receptor, which is achieved by decreasing the van der Waals repulsion energy in docking calculations. It is the simplest method that implements receptor flexibility, but it considers only small conformational changes [81][82][83].
Side-Chain Flexibility: This method makes use of rotamer libraries, which contain sets of different conformations of the side chains. In this case the backbone is fixed and only side chain flexibility is considered, i.e. different side chain conformations are sampled [81][84][85].
Molecular Relaxation: This method takes into account both backbone and side chain flexibility. The ligand is first docked to the binding site of the rigid receptor, and then the receptor-ligand complex is minimized (relaxed) by molecular dynamics, Monte Carlo simulations or other methods [81][86].

Partial Least Squares Analysis (PLS): Combines MLR and principal component analysis (PCA) techniques. Takes dependent variables into consideration as well. PLS is useful in systems that possess more than one dependent variable.
Multivariable Linear Regression Analysis (MLR): The simplest method to describe the correlation between molecular descriptors and activity. Involves the addition or removal of descriptors to generate the best model. Time consuming (especially when a large number of descriptors is available).
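A multivariable linear regression of activity on descriptors can be sketched with ordinary least squares. The code below solves the normal equations with a small Gauss-Jordan routine so that it needs no external library; the function names and the synthetic data (generated from activity = 1 + 2·d1 − 0.5·d2) are assumptions for illustration, not results from any study.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting.

    Assumes the system is non-singular (no check is made for a zero pivot).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_mlr(descriptors, activities):
    """Least-squares fit of activity = b0 + b1*d1 + b2*d2 + ...; returns [b0, b1, ...]."""
    rows = [[1.0] + list(d) for d in descriptors]   # leading 1.0 gives the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, activities)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coeffs, descriptor):
    """Predicted activity of a new molecule from its descriptor values."""
    return coeffs[0] + sum(c * d for c, d in zip(coeffs[1:], descriptor))

# Synthetic training set with two descriptors per molecule.
train_x = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
train_y = [1 + 2 * d1 - 0.5 * d2 for d1, d2 in train_x]
coefficients = fit_mlr(train_x, train_y)
```

Adding or removing a descriptor means refitting with a different set of columns, which is why the search for the best model becomes slow when many descriptors are available.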

Table 2: List of drug designing software.
A drug is said to be toxic when it: accumulates in the body in higher amounts; is not metabolized properly; or causes severe side effects (such as when the drug does not bind specifically to the target molecule).

Table 3: List of some important pharmacokinetic parameters