STFC report highlights ADDoPT

SCD annual report picks out ADDoPT to exemplify the development and delivery of cutting edge solutions for academia and industry to advance data intensive science and innovation.

Science Highlights 2017, the newly published annual report of the Science and Technology Facilities Council's Scientific Computing Department (SCD), identifies particular highlights within the context of the first year of the SCD 5-year strategic plan (2017 – 2021). Amid such impressively broad and deep work across national and international collaborations, it is pleasing to report that ADDoPT has been chosen to exemplify the Department's response to the plan's Strategic Theme 3: [delivery of] "a comprehensive programme of computational and data science services, research and development to underpin STFC’s Data Intensive Science ambition: to develop and deliver cutting edge solutions for academia and industry to advance data intensive science and innovation."

According to the report, "SCD is providing vital science and software expertise to enable the STFC Hartree Centre to fulfil its mission to: 'Transform the competitiveness of UK industry by accelerating the adoption of data-centric computing, big data and cognitive technologies.' SCD’s computational expertise and support helps to reduce laboratory time and costs spent on research and development." In this context, the alignment with ADDoPT's mission is clear: to secure the UK’s position at the forefront of pharmaceutical development and manufacture through the establishment of robust manufacturing for current and next generation medicines, built on UK excellence in process modelling, simulation, optimisation and control. ADDoPT's aim is to develop more accurate digital design techniques that will enable more of the early R&D process to be carried out virtually, saving time and improving cost efficiency.

Predicting lattice energy without a crystal structure

The STFC Hartree Centre's team in ADDoPT, straddling Computational Chemistry and Biology, is led by Principal Investigator Chris Morris and is working on:

  • Statistical modelling of active pharmaceutical ingredient (API) powders to predict their suitability for producing tablets
  • Working with ADDoPT partners at the University of Leeds to predict the lattice energy of crystals
  • Enhancing interface codes for the DL_POLY2 molecular dynamics code, to capture analytical detail pertinent to the pharmaceutical industry
  • Text mining of chemical papers, to extract relevant descriptive or numerical data
  • Developing a statistical model for predicting lattice energy even before a crystal structure is available

In the example illustrated, a Naive Bayes classifier for drug-likeness is able to predict ‘drugs to be drugs’ more often, outperforming both a published quantitative estimate of drug-likeness (QED) model and a random model (TPR: true positive rate; FPR: false positive rate).
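For readers unfamiliar with the terms, the following is a minimal sketch of how a Naive Bayes classifier can be scored by true and false positive rates and compared with a random model. It is not the Hartree Centre's model: the eight binary "fingerprint bits", the synthetic data and all function names are assumptions made purely for illustration.

```python
import math
import random


def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model with Laplace smoothing (alpha)."""
    n_features = len(X[0])
    model = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        log_prior = math.log(len(rows) / len(X))
        # Smoothed estimate of P(bit j is set | class)
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[label] = (log_prior, probs)
    return model


def log_odds(model, x):
    """Log posterior odds of class 1 ('drug') against class 0 ('non-drug')."""
    scores = {}
    for label, (log_prior, probs) in model.items():
        s = log_prior
        for bit, p in zip(x, probs):
            s += math.log(p if bit else 1.0 - p)
        scores[label] = s
    return scores[1] - scores[0]


def tpr_fpr(scores, labels, threshold=0.0):
    """True and false positive rates when predicting 'drug' for score > threshold."""
    tp = sum(1 for s, t in zip(scores, labels) if s > threshold and t == 1)
    fp = sum(1 for s, t in zip(scores, labels) if s > threshold and t == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


def make_synthetic(n, rng):
    """Hypothetical data: 'drugs' set each of 8 bits with p=0.7, others with p=0.3."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        p = 0.7 if label else 0.3
        X.append([1 if rng.random() < p else 0 for _ in range(8)])
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic(400, rng)
X_test, y_test = make_synthetic(200, rng)

model = train_bernoulli_nb(X_train, y_train)
nb_scores = [log_odds(model, x) for x in X_test]
random_scores = [rng.random() - 0.5 for _ in X_test]  # baseline: coin-flip scores

nb_tpr, nb_fpr = tpr_fpr(nb_scores, y_test)
rand_tpr, rand_fpr = tpr_fpr(random_scores, y_test)
print(f"Naive Bayes: TPR={nb_tpr:.2f} FPR={nb_fpr:.2f}")
print(f"Random:      TPR={rand_tpr:.2f} FPR={rand_fpr:.2f}")
```

A classifier that "predicts drugs to be drugs more often" has a higher TPR than its competitors at the same FPR. Sweeping the threshold over a range of values, rather than fixing it at zero as here, would trace out the full curve of TPR against FPR.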

The Hartree Centre has also provided computational resource to enable other ADDoPT partners, including Process Systems Enterprise (PSE), the Cambridge Crystallographic Data Centre, and the University of Leeds, to carry out advanced modelling tasks.

Lastly - for now - Hartree is organising a training course in Machine Learning for Cheminformatics at Daresbury for early 2018, about which more information is available here.