A new cancer case repository that can extract information from the wide variety of sources in an electronic healthcare record and make it available to academic researchers and clinicians has been launched as the first major output of the Clinical e-Science Framework (CLEF).

Professor Alan Rector, director of CLEF, an e-Science project funded by the Medical Research Council, said: “The CLEF repository is optimised to treat electronic healthcare records as an interactive knowledge source for academic researchers and clinicians to help them access the latest medical information.

“Once fully deployed, it will lead to previously unthinkable, rapid advances in healthcare research by enabling researchers to analyse data stored in a wide range of geographically-spread databases, online.”

The repository’s developers say sophisticated security systems, also developed by CLEF, ensure secure and ethical access to the databank, which contains records from 22,000 cancer patients. The project has also implemented stringent access control, authentication and encrypted transmission protocols to protect against accidental disclosure.

A team at University College London built the repository using a new method for importing and structuring data so that users can run population queries over longitudinal data sets. The CLEF repository supports the large-scale analysis of patient records in a Grid environment. It can handle complex queries whilst retaining the critical semantic, structural and medico-legal integrity of the data.

The process, developed in part by Professor Rector’s team at Manchester University, structures the source data in multiple steps, enabling users to put complex clinical questions to the repository.

First, data is structured in a longitudinal format, then by clinical context and finally by the actual type of data. Previously, the retrieval of similarly complex data would have required time-consuming manual search and data analysis. Using the work of Professor Rob Gaizauskas’ team from Sheffield University, the CLEF system is able to extract key medical information from clinical records that are in a narrative format, including medical letters, discharge summaries and radiology reports.
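
The three structuring steps described above (longitudinal format, then clinical context, then data type) can be illustrated with a minimal Python sketch. This is not CLEF's implementation, which the article does not describe in detail: the record layout, the sample entries and the names `ENTRIES`, `structure` and `patients_with` are all hypothetical, chosen only to show how nesting data this way makes a simple population query possible.

```python
from collections import defaultdict
from datetime import date

# Hypothetical flat entries (illustrative only, not real CLEF data):
# (patient_id, entry_date, clinical_context, data_type, value)
ENTRIES = [
    ("p1", date(2004, 3, 1), "radiology", "finding", "mass in left lung"),
    ("p1", date(2004, 1, 10), "clinic_letter", "diagnosis", "suspected carcinoma"),
    ("p2", date(2004, 2, 5), "discharge_summary", "diagnosis", "carcinoma"),
    ("p2", date(2004, 2, 20), "radiology", "finding", "no abnormality"),
]


def structure(entries):
    """Nest entries as patient -> clinical context -> data type -> dated values."""
    repo = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    # Step 1: longitudinal order (by patient, then by date).
    for pid, when, context, dtype, value in sorted(entries, key=lambda e: (e[0], e[1])):
        # Steps 2 and 3: group by clinical context, then by type of data.
        repo[pid][context][dtype].append((when, value))
    return repo


def patients_with(repo, context, dtype):
    """Population query: patients with at least one entry of this context and type."""
    return sorted(
        pid
        for pid, contexts in repo.items()
        if contexts.get(context, {}).get(dtype)
    )
```

With this layout, a call such as `patients_with(structure(ENTRIES), "radiology", "finding")` selects patients across the whole population, while each patient's entries remain in date order for longitudinal analysis.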