Tuesday, August 9, 2011

ICBO2011 Reports

In the last week of July, three colleagues and I attended the International Conference on Biomedical Ontology (ICBO) 2011 in Buffalo, NY. Having been a "remote hang-around" on Twitter, following other conferences from a distance (see for example my blog post following the SemTech conference earlier this summer), it was great fun this time to be active on Twitter IRL in Buffalo: My #ICBO2011 tweets


And yes, I did see Niagara Falls again -- this time I got really close to them on a boat tour with the "Maid of the Mist".


Now, after a long journey home and a couple of relaxing days on the Swedish west coast and in central London, it's time to use my tweets, the conference presentations and the proceedings (pdf) to pull together some of my insights and learnings. Here's my first report, with notes and reflections from the conference and a follow-up to my previous blog posts written in preparation for it (part 1 and part 2). See also my fourth blog post from ICBO, published 1 September.


High quality, "true", ontologies 
It was nice to see presentations and read papers on ontologies from a broad spectrum of domains, such as:
  • Genes
    See a recent paper: How the Gene Ontology Evolves, describing the ways in which curators of the Gene Ontology (GO) have incorporated new knowledge. 
  • Protein complex and supra-complex
    See the presentation on this topic in the panel on the first day: From proteins to diseases, by Bill Crosby (Department of Biological Sciences, University of Windsor)
  • Emotions and Chronic pain
    See the presentation and paper on how to represent emotions, based on research in affective disorders such as bipolar disorder, depression and schizoaffective disorder, by Janna Hastings (European Bioinformatics Institute, UK, and Swiss Centre for Affective Sciences, University of Geneva, Switzerland). See also the announcement of the development of an ontology for chronic pain and a nice video: Toward a New Vocabulary of Pain.
  • Demographics
    See the presentation describing how "demographic data in current information systems is ad hoc, and current standards are insufficient to support accurate capture and exchange of demographic data", and the proposed use of the Demographics Application Ontology as a solution.
  • Adverse Events
    In the workshop on representing adverse events we learned about interesting work on adverse event ontologies (see a video of the workshop organizer Mélanie Courtot: Towards an Adverse Event Reporting Ontology). We also learned about the development of ontologies to represent temporal relationships (e.g. the Clinical Narrative Temporal Relation Ontology), which are a key aspect of handling safety issues and regular, ongoing pharmacovigilance in pharmaceutical research and development.
All of these are examples of high-quality, "true"1) and modular ontologies developed under the Basic Formal Ontology (BFO), providing formal definitions for types of entities in reality and for the relationships between such entities (so-called ontological realism). Such ontologies are designed to allow annotations of experimental and clinical data "to be unified through disambiguation of the terms employed in a way that allows complex statistical and other analyses to be performed which lead to the computational discovery of novel insights"2).
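As a concrete (and deliberately simplified) illustration of what such ontology-based annotation can look like in practice, here is a minimal Python sketch using rdflib. The namespaces and the term name are placeholders I made up for this example; real annotations would of course point to term IRIs in ontologies like the ones listed above.

```python
# Minimal sketch of annotating a data point with an ontology term (rdflib).
# The example.org namespaces and the "ChronicPain" term are placeholders,
# not identifiers from any real ontology.
from rdflib import Graph, Literal, Namespace, RDF

DATA = Namespace("http://example.org/data/")       # hypothetical data namespace
ONT = Namespace("http://example.org/ontology/")    # stand-in for a BFO-based ontology

g = Graph()
observation = DATA["observation-001"]
g.add((observation, RDF.type, ONT["ChronicPain"]))            # the ontology annotation
g.add((observation, DATA["reportedIntensity"], Literal(7)))   # the recorded data value

# Data annotated this way from different sources can be pooled and queried
# on the shared ontology terms rather than on local, ambiguous labels.
print(g.serialize(format="turtle"))
```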


My own reflections: 
So far we have seen little or no uptake of such high-quality "true" ontologies for clinical data, something I also highlighted in my earlier blog post on clinical data standards. In a coming blog post I will present a demo using the Demographics Application Ontology, showing how a high-quality "true" ontology can be used to support accurate capture and exchange of demographic data. I will also outline some ideas on how this could be used for clinical study data (CRFs and databases) as well.

"Mapping mania" for the legacy of terminologies
A common theme in several of the presentations, papers and panels was the mappings (matching, alignment) needed between terms and concepts organized as terminologies and coding nomenclatures, such as SNOMED CT, LOINC, ICD, CDISC SDTM CTs (derived from the NCI Thesaurus), and MedDRA. Here are some examples:
  • Extraction of the anatomy value set from SNOMED CT, to be reused for the 11th revision of the International Classification of Diseases (ICD-11). See a presentation on the problems and proposed patterns by some well-known people (Harold Solbrig and Christopher Chute at Mayo Clinic, Kent Spackman working for IHTSDO, and Alan L. Rector at the University of Manchester).
  • The Ontology Alignment Evaluation Initiative (OAEI) was mentioned by several presenters as a forum to discuss the problems of direct matching between different terminological resources.
  • The use of an ontology matching tool called AgreementMaker was presented.
  • In the panel on National Center for Biomedical Ontology (NCBO) Technology in Support of Clinical and Translational Science, basic lexical term mappings were mentioned as an example of a service available both via BioPortal's graphical interface and as REST services (see the sketch below).
These are all examples of a legacy already in use, or in the process of being adopted, for the annotation of EHR, clinical trial and patient safety data, for example in the huge US initiative on meaningful use of EHRs, as highlighted by Roberto Rocha in his keynote on Practical Applications of Ontologies in Clinical Systems.
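As a side note, here is a rough Python sketch of what calling such a mapping service over REST could look like. The base URL, endpoint path, parameters and response structure are my own assumptions for illustration; consult the NCBO BioPortal documentation for the actual API and how to obtain an API key.

```python
# Illustrative only: fetch term mappings for an ontology from a BioPortal-style
# REST service. Endpoint, parameters and response layout are assumed here,
# not taken from the official documentation.
import requests

API_KEY = "YOUR_BIOPORTAL_API_KEY"          # assumed: BioPortal issues per-user API keys
BASE_URL = "https://data.bioontology.org"   # assumed base URL of the REST services

def get_mappings(ontology_acronym, page=1):
    """Return one page of term mappings for the given ontology (illustrative)."""
    response = requests.get(
        f"{BASE_URL}/ontologies/{ontology_acronym}/mappings",
        params={"apikey": API_KEY, "page": page},
    )
    response.raise_for_status()
    return response.json()

# Print the first few mappings from SNOMED CT to other ontologies in the portal
for mapping in get_mappings("SNOMEDCT").get("collection", [])[:5]:
    term_iris = [cls["@id"] for cls in mapping.get("classes", [])]
    print(" <-> ".join(term_iris))
```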


My own reflections:
In my previous blog post preparing for the conference I referred to the mapping problem as comparing "Apples and Oranges", and sometimes I think of it as a "mapping mania". At the conference I heard the comment "Mappings are hard" several times, and also the question "Who will create, validate and maintain all the mappings?"


After some more days of vacation I will get back later in August with more notes and reflections from the conference:
  • I will report from the debate on how to accurately connect data from measurements and questionnaires (information entities) to ontologies (real-world entities). I think this is a key aspect of getting machine-processable clinical data ready for automatic transformation and direct querying, and for inferencing and reasoning.
  • Another theme I would like to cover is referent tracking, i.e. assigning globally unique identifiers to each entity in reality about which information is stored, for example diagnoses, procedures, demographics, encounters, hypersensitivities, and observations as they are reported in EHRs (see the sketch below). This is something I think is a key enabler for accurate secondary use of EHR data.
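To give a flavour of what referent tracking means in code, here is a minimal, purely illustrative Python sketch: the first time an entity is recorded it is assigned a globally unique identifier (here a UUID-based IRI), and every later statement about the same entity reuses that identifier. The in-memory registry, the namespace and the keying on a source system plus local record id are my own simplifying assumptions, not a description of any real referent tracking system.

```python
# Minimal sketch of the referent-tracking idea: one globally unique identifier
# per real-world entity, minted on first encounter and reused ever after.
import uuid

class ReferentRegistry:
    """In-memory stand-in for a referent tracking system (illustration only)."""

    def __init__(self, namespace="http://example.org/iui/"):   # assumed namespace
        self.namespace = namespace
        self._by_local_key = {}   # (source_system, local_record_id) -> unique IRI

    def identifier_for(self, source_system, local_record_id):
        """Return the existing identifier for this entity, or mint a new one."""
        key = (source_system, local_record_id)
        if key not in self._by_local_key:
            self._by_local_key[key] = self.namespace + str(uuid.uuid4())
        return self._by_local_key[key]

registry = ReferentRegistry()
# The same encounter reported twice by the same EHR resolves to one identifier:
first = registry.identifier_for("EHR-A", "encounter-4711")
second = registry.identifier_for("EHR-A", "encounter-4711")
assert first == second
print(first)
```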