Most analytical chemists working in Drug Discovery intuitively know that ASIC/ASV (Automatic Structure Integrity Confirmation / Automatic Structure Verification, see previous articles in the series) makes sense. The challenges posed by large volumes of data and limited analytical staff are glaringly obvious, as are the consequences of getting Structure Integrity wrong. However, when it comes to outlining the economic argument for acquiring such a system, things are less clear, and the many questions that arise can be summarized in one: how much is this system worth to our organization in financial terms?

Phil Keyes, Michael Messinger and Gonzalo Hernandez have written an excellent article, published in American Laboratory, analyzing precisely this question on the basis of their own experience at Lexicon Pharmaceuticals, Inc. [1]. This has made my job of explaining it extremely easy, and I will lean heavily on their work, presenting a summarized version here. I recommend you also read the original article.

How much do my incorrect structures cost me?

The first thing to do is to compare the cost of the ASIC/ASV system with the cost it is meant to reduce: the cost of incorrect compounds in the organization. The first term is simple, whilst the second is more complicated, requiring more thinking and possibly some experience (or leaning on the experience of Phil, Michael and Gonzalo). I quote them here:

"…an incorrect structure, whether active or not, will provide a misleading result. Either the SAR will not be supported or will be misdirected, or resources will be wasted running tests on a compound that otherwise would not have been tested. The worst-case scenario is that the correct compound (the one the chemist wanted to make but did not) would have answered the SAR questions correctly."

But how do we turn this qualitative understanding of the many costs of an incorrect structure in a Discovery operation into a dollar figure? First of all, we need to know how many incorrect structures are being generated, i.e., how big the problem is. In their article, after extensive analysis, the authors report the following error rates:

  • Incorrect structure: 2%
  • Unacceptable purity: 2%
  • Registration error: 1.2%
  • Total error: 5.2%

All structures were manually analyzed and 2% were found to be incorrect. Unacceptable purity was set at a 70% level, which is very much on the limit of the data quality our system can deal with (more on that in other articles in this series). If we ignore the purity percentage, we are left with 3.2% of structures with the wrong identity in our registration system, either because the structure was wrong (2%) or because an error was made at the point of registration (1.2%). It is worth noting that these structures had been analyzed and passed by LCMS and that, since they came out of the chemistry program, it must be surmised that the chemist had reviewed the analytical data and decided to register the compounds.

Costs can be attributed to these errors by understanding parameters such as chemist costs, the consumables and equipment costs of synthesizing the compounds, and any other costs associated with them. One very simple and very conservative approach is to take the cost of a chemistry FTE (Full Time Equivalent), which has been reported in the US and Western Europe as being in the region of $250,000 to $325,000 [3][4], and multiply it by the percentage of errors made in the organization (that is, the percentage of that FTE resource devoted to making erroneous compounds), as shown in Eq. 1.

Annual FTE cost x % of errors = Annual cost of erroneous compounds per FTE (Eq. 1)

The total annual cost of erroneous compounds for an organization can therefore be arrived at by Eq. 2:

Number of FTEs x Annual FTE cost x % of errors = Annual cost of erroneous compounds (Eq. 2)
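As a minimal sketch, Eq. 1 and Eq. 2 can be written as two small Python functions. The function names and the input values below are illustrative assumptions (20 FTEs, the low end of the reported FTE cost range, and the 3.2% wrong-identity rate discussed above); they are not the figures of any worked example from the original article.

```python
def annual_error_cost_per_fte(annual_fte_cost, error_rate):
    """Eq. 1: annual cost of erroneous compounds per FTE."""
    return annual_fte_cost * error_rate


def annual_error_cost(num_ftes, annual_fte_cost, error_rate):
    """Eq. 2: annual cost of erroneous compounds for the organization."""
    return num_ftes * annual_error_cost_per_fte(annual_fte_cost, error_rate)


# Illustrative inputs (assumptions, not the article's own example):
# 20 chemistry FTEs, $250,000 per FTE (low end of the reported range),
# 3.2% wrong-identity rate (2% incorrect structure + 1.2% registration error).
cost = annual_error_cost(num_ftes=20, annual_fte_cost=250_000, error_rate=0.032)
print(f"Annual cost of erroneous compounds: ${cost:,.0f}")
```

Substituting your own headcount, FTE cost and error rate gives the figure that the rest of the analysis builds on.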

This model is simplistic and very conservative, as it ignores a number of additional costs of the erroneous compounds, which would be either more difficult to estimate or which may have a larger but less frequently occurring impact, such as:

  • Cost of associated biology
  • HTS (High Throughput Screening) Resources
  • Cost of intensive follow-up investigation when an erroneous compound is active but the activity cannot be reproduced upon resynthesis because the right compound is made the second time round, i.e. the cost of in vivo and in vitro pharmacology on the 'right' compound after the 'wrong' compound has hit
  • Cost of missing out on active compounds because the wrong one was made
  • Cost of misleading SAR information

Once we have arrived at this Annual Cost of Erroneous compounds for an organization, calculating the return and therefore establishing the economic argument for an Invest / Not Invest decision on an ASIC / ASV system can be done in several ways.

Financial Analysis – Simple Error Cost (SEC)

Once we have Eq. 2, it is easy to compare the cost of the ASIC/ASV system with the cost of erroneous compounds. Let's look at some examples:

This raises another interesting point, which is how good the system needs to be. In this example, a system which identified only 50% of the erroneous compounds would break even in just over 12 months. The system therefore does not have to achieve perfect results to represent a very worthwhile investment.

A second example, focused on a large pharmaceutical company, is even more enlightening:

In this case, break-even is reached in an amazing 5 weeks, assuming a system which picks up 100% of errors. Even if the system picked up only 10% of errors, break-even would be reached within 1 year.
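The break-even reasoning used in both examples can be sketched in a few lines of Python. The annual error cost of 96,000 is the small-pharma figure quoted later in this article; the system cost of 50,000 is purely a placeholder assumption, chosen only so that the output is in line with the "just over 12 months" statement above.

```python
def breakeven_months(system_cost, annual_error_cost, detection_rate):
    """Months until the savings from detected errors equal the system cost.

    detection_rate is the fraction of erroneous compounds the system catches.
    """
    annual_savings = annual_error_cost * detection_rate
    if annual_savings <= 0:
        raise ValueError("the system must detect some errors to break even")
    return 12 * system_cost / annual_savings


# 96,000: small-pharma annual error cost quoted in the article.
# 50,000: hypothetical system cost (assumption for illustration only).
for rate in (1.0, 0.5, 0.25):
    months = breakeven_months(50_000, 96_000, rate)
    print(f"detection rate {rate:.0%}: break-even after {months:.1f} months")
```

The sketch treats savings as accruing evenly over the year and ignores running costs, which is the same simplification the Simple Error Cost comparison makes.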

Financial Analysis – Return On Investment (ROI)

ROI is a common finance measure used to compare different investments and decide which is the most advantageous. It is generic and simple, which is why it is so widely used in accounting and finance. ROI is calculated as:

ROI = (Gain from investment – Cost of investment) / Cost of investment

The gain from the investment can be the proceeds of selling the investment or, in our case, the savings generated by it. If we were to apply this equation to our 2 examples above, these would be our ROI results:

These are very high ROIs and, of course, you can use different assumptions to make the case for your organization (a different percentage of negatives picked up, a lower percentage of errors in your chemistry, etc.). This model also ignores the cost of the analytical resource needed to review the negative results produced by the system, and this should be included in the ROI calculation. Although I do not have the relevant information, to give a basic illustration I will work on the basis that 10% of the time of one analyst must be devoted to looking at the negatives for every 20 chemists (this is most likely hugely overstated, as only negatives are being checked and therefore only 20% or so of the analytical data generated by the chemists would be reviewed).
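The ROI calculation, including the analytical review cost just described, can be sketched as follows. Several things here are assumptions made for illustration: the function names, the system cost of 50,000, the analyst cost of 250,000 per year, and the choice to add the review cost to the cost of the investment rather than subtract it from the gain. Only the 96,000 annual error cost and the "10% of an analyst per 20 chemists" rule come from the text.

```python
def roi(gain, cost):
    """ROI = (gain from investment - cost of investment) / cost of investment."""
    if cost <= 0:
        raise ValueError("cost of investment must be positive")
    return (gain - cost) / cost


def asv_roi(annual_error_cost, detection_rate, system_cost, chemists,
            analyst_annual_cost, years=1, analyst_share_per_20_chemists=0.10):
    """ROI of an ASIC/ASV system over a horizon of `years`.

    Gain: savings from erroneous compounds the system catches.
    Cost: one-off system cost plus the analyst time spent reviewing negatives
    (assumed here to scale linearly with the number of chemists).
    """
    savings = annual_error_cost * detection_rate * years
    review_cost = (analyst_annual_cost * analyst_share_per_20_chemists
                   * (chemists / 20) * years)
    return roi(savings, system_cost + review_cost)


# Hypothetical inputs: 50,000 system cost and 250,000 analyst cost are
# placeholders; 96,000 is the small-pharma error cost from the article.
for horizon in (1, 3, 5):
    value = asv_roi(96_000, 0.5, 50_000, chemists=20,
                    analyst_annual_cost=250_000, years=horizon)
    print(f"{horizon}-year ROI: {value:.1%}")
```

With these placeholder inputs the ROI is negative in the first year and turns positive over longer horizons, which illustrates why the horizon matters when quoting an ROI figure.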

Even with these highly overestimated analytical costs, the case for a system such as this is compelling. Typically, when making financial decisions, the ROI is compared to the Opportunity Cost of the money invested, which can be defined, in the absence of an alternative project, by the interest which can be achieved in the financial markets for a deposit of the money invested. Clearly, an ASIC/ASV system yields an ROI which is greatly superior to such interest, particularly on 3 or 5 year horizons, which are the shortest horizons normally used for ROI calculations.

In fact, should an organization not have enough analytical resource available to review negative results, there is a clear case for investing in such resource and therefore increasing analytical headcount.

A further analysis – System quality

In the previous sections we introduced one additional parameter into our decision making: the quality of the ASIC/ASV system, defined as the ability of the system to pick out compounds of wrong identity. The interesting thing about this financial analysis is that it allows us to compare 2 ASIC/ASV systems, based on our particular situation and on this quality parameter. For example, what would be the difference in value between a system which picks out 40% of wrong compounds and a system which picks out 80% of wrong compounds?

In our first example, the small pharma with 20 FTEs, the answer is now simple: the first system would be worth 40% of €96,000, i.e. €38,400 per annum, whilst the second system would be worth €76,800 per annum.
In our second example, the large pharma, the first system would be worth €792,000 per annum, whilst the second system would be worth €1,584,000 per annum.

Even if both systems had different prices, it would be easy to decide, based on our particular situation, which system we should invest in. Of course, given the excellent return on investment from this type of solution, there is a very strong argument to invest in more than one system, should they pick out a somewhat different set of negatives.
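The comparison above can be sketched as follows. The 96,000 annual error cost is stated in the text; the 1,980,000 figure for the large pharma is back-calculated from the quoted 792,000 at a 40% detection rate. The system names and the annual prices used in `best_system` are hypothetical and serve only to show how differently priced systems could be ranked.

```python
def annual_value(annual_error_cost, detection_rate):
    """Annual value of a system = error cost avoided at its detection rate."""
    return annual_error_cost * detection_rate


def best_system(annual_error_cost, candidates):
    """Pick the system with the highest net annual value.

    candidates maps a system name to (detection_rate, annual_price).
    """
    def net_value(name):
        rate, price = candidates[name]
        return annual_value(annual_error_cost, rate) - price
    return max(candidates, key=net_value)


# 96,000 is from the text; 1,980,000 is derived as 792,000 / 0.40.
organizations = {"small pharma": 96_000, "large pharma": 1_980_000}
for org, cost in organizations.items():
    for rate in (0.40, 0.80):
        print(f"{org}, {rate:.0%} detection: "
              f"{annual_value(cost, rate):,.0f} per annum")

# Hypothetical annual prices, for illustration only.
candidates = {"system A": (0.40, 10_000), "system B": (0.80, 30_000)}
print("Best choice for small pharma:", best_system(96_000, candidates))
```

Ranking by net annual value is one simple design choice; an organization could equally rank by ROI or break-even time using the same inputs.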

So, are false positives a problem? And what about false negatives?

Another very interesting point arises from this analysis. Those readers who have been involved in ASV discussions over the years will be well aware of the different thinking about the impact of false positives and false negatives when evaluating such a system. The common, received wisdom is that false positives (incorrect structures passed as correct by the system) are bad, as we are letting wrong compounds go through the filter, and wrong compounds are costly and become more costly as they go further down the Discovery and Development pipeline. However, this clearly depends on where we are deploying the system.

If we are going to use the system to replace our analytical efforts altogether, so that ASIC/ASV is run on all chemistry and nobody looks at the analytical results for compounds which pass at any point along the way, then false positives are indeed a problem. We can now even calculate the cost of such an approach, by applying Eq. 2 above with the percentage of false positives generated by our system. In such a setting, the system needs to minimize false positives. But even here there is one caveat. If our chemistry department is generating 2%-3% incorrect structures (let's work with 3%), this is the maximum percentage of false positives any system can generate. I will write more about the testing of ASV systems and about false positive and false negative rates in a separate article, but evaluating how prone a system is to false positives relies on generating structures, called negative controls, which are incorrect but which could be mistaken for the correct structure, and then running those against the system as alternatives to the correct structure. If the structures are very similar from an analytical viewpoint, a high percentage of false positives may ensue. However, this is an academic exercise: in real life, if our chemists are making the right structures in a high percentage of cases, the percentage of wrong structures which actually exist becomes the upper limit on the real false positives the system can let through. Therefore, a false positive rate of 10% in the ASV/ASIC system would mean a real false positive rate of 10% x 3% (the percentage of wrong structures in my chemistry department, and therefore the maximum percentage of false positives I could have), i.e., 0.3%.
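The arithmetic in the last step, and the cost it implies through Eq. 2, can be checked with a short sketch. The 10% and 3% rates are the ones used in the text; the headcount and FTE cost in the cost calculation are illustrative assumptions.

```python
def real_false_positive_rate(system_fp_rate, wrong_structure_rate):
    """Fraction of all compounds that are wrong AND passed by the system.

    system_fp_rate: share of wrong structures the system passes as correct.
    wrong_structure_rate: share of wrong structures produced by chemistry.
    """
    return system_fp_rate * wrong_structure_rate


rate = real_false_positive_rate(system_fp_rate=0.10, wrong_structure_rate=0.03)
print(f"Real false positive rate: {rate:.1%}")

# Cost of those false positives via Eq. 2, using assumed inputs
# (20 FTEs at $250,000 each; both values are illustrative only).
num_ftes, annual_fte_cost = 20, 250_000
print(f"Annual cost of false positives: ${num_ftes * annual_fte_cost * rate:,.0f}")
```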

If we are going to use the system to support our current analytical efforts, by deploying it at the point of registration as a redundant system that picks up errors and continuously improves the quality of our compound libraries, then false positives are not a big problem, as our objective is to reduce the number of false positives generated by internal or external chemistry groups (wrong compounds which the chemists passed as correct), and as the number of wrong structures is so low in the first place. So, a positive result on ASIC/ASV for a compound passed as positive by chemistry has no effect, the final result being exactly the same as if ASIC/ASV had not been used. On the other hand, a negative result on ASIC/ASV for a compound passed as positive by chemistry, which turns out to be a true negative, has an immediate effect, with a value which can be calculated using Eq. 2. The same applies to a deployment of the system in Open Access to analyze the data before passing it to the chemist. The likely consequence there is that chemists will pay more attention to the analysis of compounds failed by the system, or even refer them to analytical, again with the additional value arrived at by Eq. 2 for true negatives.

In this second case, we have a further consideration. Compounds which are failed by the ASIC/ASV system will need to be reviewed. The business must decide whether they should just be automatically de-registered, whether the analytical group should confirm the failure, which would require some analytical resource, or whether the analytical group should confirm the failure and elucidate the new structure, which would require more analytical resource (this third approach may be used, for example, if the 'wrong' compound had shown activity in screening efforts). There is, therefore, an argument for basing the selectivity of the system on the analytical resources available. If I only have the resource to look at 10% of the compounds, and bearing in mind that a considerable number of the failures will be false negatives, the system should be tailored to fail around 10% of compounds, the most clear-cut failures. If I have the resource to analyze 30% of the compounds, then I want a system which gives me around 30% of failures. Of course, a less selective system which gives me fewer negatives, and therefore fewer false negatives, will give more false positives but, as discussed above, this is not a big problem when using the ASIC/ASV system in certain contexts.
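One way to picture tailoring the failure rate to the review capacity is to rank compounds by a verification score and fail only as many as the analysts can review. The sketch below assumes, hypothetically, that the system gives each compound a numeric score where a lower value means a poorer match between spectrum and structure; the score scale, the compound IDs and the function name are all invented for illustration and do not describe how any particular ASIC/ASV product works.

```python
import math


def select_for_review(scores, review_capacity):
    """Return the compounds to fail, worst score first.

    scores: dict mapping compound ID to a hypothetical verification score
            (lower = poorer match, i.e. a more clear-cut failure).
    review_capacity: fraction of all compounds the analysts can review.
    """
    if not 0 <= review_capacity <= 1:
        raise ValueError("review_capacity must be between 0 and 1")
    n_fail = math.floor(len(scores) * review_capacity)
    ranked = sorted(scores, key=scores.get)  # poorest match first
    return ranked[:n_fail]


# Hypothetical batch of 10 registered compounds with invented scores.
batch = {
    "cmpd-01": 0.95, "cmpd-02": 0.12, "cmpd-03": 0.88, "cmpd-04": 0.40,
    "cmpd-05": 0.91, "cmpd-06": 0.35, "cmpd-07": 0.77, "cmpd-08": 0.99,
    "cmpd-09": 0.83, "cmpd-10": 0.69,
}
print("10% capacity:", select_for_review(batch, 0.10))
print("30% capacity:", select_for_review(batch, 0.30))
```

Raising the capacity from 10% to 30% simply moves the cut-off further up the ranking, which is the trade-off between false negatives and false positives described above.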

Other considerations

Earlier in this article, we studied the economic case for implementing an ASIC/ASV system. Although the case for it is compelling, it is also worth noting that such a system has additional benefits, which should not be ignored, such as:

  • Automatic purity calculation by NMR as well as MS, giving more reliable purity information
  • Automatic atom-to-peak assignment suggestions, which provide a starting point for chemists and therefore help them do their analytical work faster, so that they can focus on their chemistry work
  • Possibility of coupling the system to a Spectral Database, which archives the analytical data associated with structures and allows later searching based on spectral features (peaks, multiplets, etc.) as well as more sophisticated data mining

References

[1] Automated Structure Verification by NMR, Part 1: Lead Optimization Support in Drug Discovery. Philip Keyes, Michael Messinger, Gonzalo Hernandez. Part 1.
[2] Automated Structure Verification by NMR, Part 1: Lead Optimization Support in Drug Discovery. Philip Keyes, Michael Messinger, Gonzalo Hernandez. Part 2.
[3] Outsourcing Jul/Aug 2007, pp 6–7.
[4] McCoy, M. Chem. Eng. News 2005, 83(44), 14–18.
[5] Paul et al., Nature Reviews Drug Discovery 9, 203–214, 2010.
[6] Mistry et al., European Pharmaceutical Review 17-2, 53–56, 2012.

Last modified: June 19, 2014 by