Considering the many recent publications deploring various inadequacies of current QSAR practicesFootnote 1 and outcomes [18], as well as QSAR’s relative antiquity, the reader is forgiven for any skeptical reaction to the title. Yet several CADD groups at prestigious major and medium pharmas whom I have recently encountered began discussions by stating their intent to increase their proportion of QSAR activities (while decreasing structure-based design). What is going on?

The major cause of this sudden renewal of interest in QSAR seems to be a major yet poorly met need—practical CADD guidance for lead optimization. As the longest, most expensive, and success-determining phase in most drug discovery projects [9],Footnote 2 one might suppose lead optimization to have special need for guidance from CADD. Yet current CADD methodology development activities tend much more to address the earlier hit discovery and hit-to-lead phases. One reason may be the much higher IP-based barrier between methodology development and usage experience during lead optimization. The data sets needed to validate any new methodology’s value in practical LO situations are far less available.

But probably a greater reason is that lead optimization also poses a couple of demanding challenges to CADD methodologies. The first is the nature of the candidate structures. Earlier in drug discovery, the candidate structures are rather dissimilar from one another, typically exhibiting at least three orders of magnitude of differences among experimental potencies. However, during lead optimization, the candidates are much more similar, usually varying by only one or two R-groups attached to a shared core. Accordingly the variation in measured potencies is much less, seldom much more than an order of magnitude. The second challenge is the much greater number and variety of the biological observables during lead optimization, all of which need to be considered if a ranking of candidates is to be meaningful. At the same time the pace of compound synthesis is much greater. A synthetic chemist whose career is currently dedicated to a lead optimization project does not idly await lengthy computations.

Furthermore, opportunities for usefully applying QSAR approaches across all phases of drug discovery are also mushrooming, thanks to the plummeting costs of acquiring and recording data points in multiplexed experiments, our rapidly expanding appreciation of biological complexity, and the inherent methodological symbioses between QSAR and bioinformatics. To cite a current example of this symbiosis, the most advanced approaches to mission-critical “off-target predictions” [10] routinely consider similarities in both ligand and biological pathway properties. Will we start regarding QSAR simply as that branch of bioinformatics that addresses chemical structure selection, in particular among ligands?

Promising methodological advances may also be a factor in this renewed QSAR interest. While the pA50 [11] predictions of structurally “local” 3D-QSAR models seem relatively reliable [2, 12],Footnote 3 sufficient to help guide lead optimization, the tedious and unavoidably subjective aspects of aligning each structure to be predicted has severely limited 3D-QSAR’s practical applicability. However, the new method of topomer CoMFA (whereby rule-generated alignments are applied individually to fragments rather than complete structures) [12] is surely among the simplest, quickest, and most objective tools in today’s CADD kit. And fortunately, the almost unprecedentedly accurate predictions so far reported in lead optimization projects using “topomer CoMFA”, specifically a standard deviation of 0.6 between predicted and found pA50’s, over 144 made-and-tested compounds from four different organizations [1316], if continued, should further encourage its widespread application. The exceptional speed and objectivity of the topomer CoMFA protocol is also inspiring new methodological opportunities, such as virtual screening for R-groups (see footnote 3), with hits being accompanied by potency predictions, and “QSEA” [15, 17], which simplifies exploration of a so far oft-neglected issue, how a specific QSAR varies with its training set composition. Also emergent at this writing is “template CoMFA” [18], which provides the CADD expert with control over the conformation(s) used to generate a 3D-QSAR, for example in the form of a receptor-bound conformation, while retaining the desirable attributes of topomer CoMFA.

Yet another massive development that favors usage of QSAR, with its need for experimental results to drive its hypothesis generation, is the growing capabilities of most drug discovery organizations for making experimental data, both public [19] and private, completely and readily available, and in as many comparative formats as possible.

But what about those publications deploring QSAR deficiencies? For the most part it is agreed that these disappointments are caused more by faulty practice or unrealistic expectations than by fundamental deficiencies in the QSAR approach. However it would be much harder to reach a consensus on which practices are faulty, and the recurring empirical dilemma—is my latest QSAR prediction trustworthy enough to guide a critical project decision?—remains problematic. With these caveats, here are some hints for the future QSAR practitioner, based on over 40 years of experience.

  • A QSAR is simply a coherent structurally-based summary of a particular set of biological observations, hopefully revealing a pattern which helps to successfully guide a discovery team to a therapeutic goal. Yet the still barely understood complexities of biological processes, at molecular, cellular, and organism levels, surely set unknowable boundaries—aka “activity cliffs” [20]—on the extrapolability of any QSAR. (And the highly multidimensional character of most QSARs dictates that almost any worthwhile prediction is an extrapolation [21].)

  • Nevertheless, it is an empirical fact, as indicated above, that QSAR predictions are often accurate enough to benefit discovery. I believe this tendency to be evidence for undiscovered regularities in biological phenomena that physical and systems biology modeling will eventually reveal. Yet meanwhile—if no drugs are found, the entire discovery process is jeopardized. If QSAR may help and is conveniently available, shouldn’t it be tried?

  • Each discovery team member is already considering the same observations to seek the same goal, often generating intuitive SARs similar to the current QSAR. However, the now overwhelming quantity of such observations suggests that a QSAR exercise may also be increasingly useful in calling attention to outliers and/or activity cliffs, to be scrutinized with the hope of exploitation. Having available such a single and relatively formal QSAR expression of the team’s intuitive and varied SAR models makes the recognition of an outlier more likely.

  • Any particular QSAR is probably only one of many statistically acceptable and perhaps equally plausible alternative QSARs, considering for example the thousands of possibly explanatory structural descriptors that are available. Rapid detection and inclusion of an “activity cliff” observation, though perhaps immediately disappointing, may highlight a more productive QSAR hypothesis.

  • Statistical parameters have marginal relevance when assessing the soundness of a QSAR prediction. For example, choosing as “the model” simply the one having the greatest q2 (or r2) value is little better than a coin flip, considering the other uncertainties discussed in previous hints.

  • Also underappreciated is the strong dependence of q2/r2 on the spread in the underlying biological potencies. More informative is the entirely model –dependent standard error in the potency predictions/fits which accompanies a q2/r2. This consideration further implies that for QSAR derivation a wide spread among the biological observations, though desirable, is far from mandatory.

  • Furthermore, excessive dependence on q2 can seriously impede the discovery of useful new hypotheses, particularly when first encountering an “activity cliff”. The q2 “leave-out-and-predict” philosophy necessarily suppresses any new hypothesis that is supported by only one newly tested structure. And it would seem, the higher a cliff, the more likely its q2 rejection.

  • “Leave-one-out” cross-validation is also a misnomer in most QSAR derivations, because most structural series contain multiple instances of the more successful R-groups, whose influence therefore cannot be erased by omitting and predicting individual structures (see footnote 3).

  • Discovery projects seek better structures. Identifying a structure which is then found to be superior, even if its potency prediction is numerically relatively inaccurate, is far more valuable than many accurate pA50 predictions of potencies already achieved (see footnote 3). Therefore judicious extrapolation of a QSAR seems desirable rather than “dangerous”.

  • In general classical statistics is far too optimistic when validating a QSAR, because its underlying assumptions about data distributions are contradicted by the extraordinarily heterogeneous nature of chemical structures and mechanisms of biological response. Restricting the structural scope of a QSAR should help, but the distribution of “local” structural variations, within a series undergoing lead optimization, is also unlikely to be uniform.

  • One tactic to consider for prediction validation is to seek multiple QSAR models encompassing different “radii” of structural variation, with the goal of detecting as many activity cliffs as possible. A prediction that is reproduced by such a varying scope of QSARs is more likely to be robust.

  • More sophisticated means of “data mining”—neural nets, pattern recognition, support vector machines, even mere non-linear regression—have been repeatedly tried [22, 23] and abandoned, as too costly and/or unreliable. The major exceptions, PLS [24], cross-validation/bootstrapping [25], and recursive partitioning [26], are relatively minor extensions of QSAR’s original multiple linear regression [2729]. It seems that biological data are too fuzzy and alternative explanatory hypotheses too numerous for QSAR to benefit from increased model complexity.

  • When selecting candidate explanatory structural descriptors for use in a QSAR:

    • Physics-based descriptors [30], for example 3D-QSAR’s fields, are likely to produce the most robust and useful models, for example capable of actively generating and selecting among quite varied structural ideas. Other classes of descriptor are more or less limited to passive discrimination among less structurally varied ideas, generated by some external process.

    • Substructural descriptors, for example “2D fingerprints”, would seem the least reliable, because biological receptors are affected only by the fields and mechanical behaviors presented by a candidate ligand, not by the underlying atomic connectivities [31]. Yet, substructural descriptor similarity has been a better predictor of biological similarity than many 3D similarity metrics [32].

    • Whenever the underlying biological observables may depend on transport as well as receptor fit, such as passive membrane penetration, “1D” descriptors, such as log P, polar surface area, and pKa, should be considered. Perhaps obvious, yet for example 3D-QSAR models almost universally omit them [33].

Unfortunately QSAR’s relatively long history also suggests that many of its future practitioners will be variously hampered in learning from past experiences. The motivations to “turn off the brain and turn on the computer” will not disappear and indeed are probably increasing. Thus it is also my somewhat biased suggestion that in such hampered situations topomer CoMFA may nevertheless be of some benefit to a discovery process.