Domain knowledge in biology:
There seems to be occasional confusion between concept and instance in
biological knowledge; the subsumption (is-a) relationship must not be
confused with the instantiation (instance-of) relationship.
The GO Perspective:
This confusion does not exist in GO. The relationship used by GO is
"is-a", that is, one of subsumption. This is explicit, e.g. in Ashburner
and Lewis 2002 (In: "In Silico Biology", Novartis Foundation Symposium
247, page 70). The relationships between GO terms and gene products (or
surrogates thereof), as in the GO gene_association tables
(http://www.geneontology.org/doc/GO.current.annotations.shtml), are
indeed instantiation (instance-of) relationships.
We should pay much more attention to distinguishing between part-of and
is-a relationships.
The GO Perspective:
As Jennifer Williams (Ontology Works) pointed out, "part of" has
multiple usages in GO, but GO certainly distinguishes between "part of"
and "is a" relationships.
What hierarchical classifications should we develop in classifying
biological pathways and chemical compounds? What kind of relations do
they require?
The GO Perspective:
It could be argued that whatever classification we need will not be
hierarchical, at least if the strict meaning of this word is meant here.
It will need to be a multi-inheritance graph, such as the directed
acyclic graph (DAG) used by GO. For example, consider the substance
"penicillin": it needs to be classed as both an "antibiotic" and,
chemically, as a "beta-lactam".
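The multiple-inheritance point can be illustrated with a minimal Python sketch. It is not GO's own data model: the dictionary layout, the function names and the extra terms "drug" and "chemical compound" are assumptions made purely for illustration. Only "penicillin", "antibiotic" and "beta-lactam" come from the text above.

```python
# Child term -> list of is_a parents. A term may have several parents,
# which is what makes this a DAG rather than a strict hierarchy (tree).
# "drug" and "chemical compound" are hypothetical filler terms.
IS_A = {
    "penicillin": ["antibiotic", "beta-lactam"],
    "antibiotic": ["drug"],
    "beta-lactam": ["chemical compound"],
    "drug": [],
    "chemical compound": [],
}


def ancestors(term, parents=IS_A):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def is_acyclic(parents=IS_A):
    """A graph is acyclic if no term is its own ancestor."""
    return all(term not in ancestors(term, parents) for term in parents)
```

Here ancestors("penicillin") contains both the functional branch ("antibiotic") and the chemical branch ("beta-lactam"), which a single-parent tree could not express.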
To what depth should we develop ontologies in the biological domain?
The GO Perspective: Ontologies should be pragmatic. The depth depends
on the use to which they are to be put. See, for example, the
distinction between a full GO graph and a slimmed-down version of GO,
termed GO_slim, used for analysis and presentation purposes
(ftp://ftp.geneontology.org/pub/go/GO_slims/README). Different purposes,
different depth. But, in general, for the purposes of the annotation of
objects (e.g. genes, gene products) one must annotate (and hence have an
ontology available for annotation) at the finest granularity consistent
with scientific knowledge. It is then easy to generate shallower views;
the opposite cannot be done if the original annotation is not made.
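As a rough illustration of why fine-grained annotation can always be collapsed into a shallower view, the following Python sketch maps each annotated term to the nearest terms above it that belong to a chosen slim set. The edges are a deliberately simplified, assumed chain of molecular function terms, and the gene name, the slim set and both function names are hypothetical; this is not how the GO_slim files themselves are produced.

```python
# Simplified, assumed is_a edges (child -> parents) for illustration only.
PARENTS = {
    "hexokinase activity": ["kinase activity"],
    "kinase activity": ["transferase activity"],
    "transferase activity": ["catalytic activity"],
    "catalytic activity": ["molecular_function"],
    "molecular_function": [],
}

# Hypothetical slim: the subset of terms kept for the shallow view.
SLIM = {"catalytic activity", "molecular_function"}


def slim_terms(term, parents=PARENTS, slim=SLIM):
    """Map a term to the nearest slim terms at or above it in the graph."""
    if term in slim:
        return {term}
    found = set()
    for parent in parents.get(term, []):
        found |= slim_terms(parent, parents, slim)
    return found


def slim_annotations(annotations, parents=PARENTS, slim=SLIM):
    """Collapse {gene: [fine terms]} into {gene: {slim terms}}."""
    return {
        gene: set().union(*(slim_terms(t, parents, slim) for t in terms))
        for gene, terms in annotations.items()
    }
```

For example, slim_annotations({"geneA": ["hexokinase activity"]}) yields "catalytic activity" for geneA. The reverse direction has no such function: starting from "catalytic activity" alone, nothing in the graph says which specific activity was meant.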
There are two kinds of functions: domain-dependent and
domain-independent.
The GO Perspective:????
Molecular functions are contingent,
dependent on conditions or occurrences not yet established.
Can we extract meta-function from biological knowledge?
The GO Perspective: Meta-function is already being extracted from
biological knowledge/annotation that has been converted into GO
vocabulary. (For examples, browse the "Predictions with GO" section of
the GO Bibliography: http://www.geneontology.org/doc/GO.biblio.html)
A particular reference: King OD, Foulger RE, Dwight SS, White JV, and
Roth FP. 2003. Predicting gene function from patterns of annotation.
Genome Res 13: 896-904.
Can we map phenotypic data from mouse to human? (Particularly for
immune diseases)
The GO Perspective:
Yes we can, given (a) good anatomical ontologies for mouse and man, (b)
a mapping between them and (c) a good phenotype ontology.
Janet Kelso and Winston Hide (eVOC) are developing an open ontology for
Human anatomy and development, Jonathon Bard (GeneX) is developing an
open ontology for Mouse anatomy and development, and there is also a
range of phenotype ontologies under development, to be deposited on the
Open Biological Ontologies (OBO) site: http://obo.sourceforge.net/.
These ontologies and contact addresses are archived on the OBO site to
encourage communication between groups wishing to use the same types of
ontologies. They aim to create a gold standard ontology, involving the
community, that will aid data integration. Of course other groups will
still develop anatomy/phenotype ontologies independently of OBO for
specific needs (such as user interfaces). To make use of a community's
knowledge, however, these may still need to be mapped to the gold
standard; either way, early communication with OBO ontologists will aid
later mappings. As you say: Exchangeability and interoperability should
be considered among ontologies and databases.
Ontology should be rigorous. But at the same time, it should be
practical.
The GO Perspective:
Definitely.
Integration of Ontologies:
Does any core ontology exist, to which all the other ontologies
(should) link? Is it GO?
The GO Perspective:
No, and there probably never will be. The domain (of biology) is simply
too broad.
How to integrate bio-ontologies? Is it possible? Is it
needed?
The GO Perspective:
What do you mean by integrate? One can (as has been done by GO for
several ontologies) semantically map Ontology 'a' onto Ontology 'b'. But
this demands that both are in the same domain and, ideally, means that
both should declare definitions of all of their concepts. The OBO
project (http://obo.sf.net) is an attempt to at least have different
ontologies share concepts (such as relationships) and syntax.
How to compare ontologies? Granularity and depth differ with each
ontology.
The GO Perspective:
And will always be so, we contend, since different ontologies are
designed for different purposes.
Within the GO Consortium, the Gene Ontology vocabulary was developed for
a single shared purpose: members annotate and share data in the same
way. When mapping GO to other ontologies stored in the Unified Medical
Language System (UMLS), some alterations were made to GO in order to make
the mappings to MeSH terms possible (a paper about this mapping was
recently submitted to Comparative Functional Genomics). The goal of
mapping GO to PubMed articles using MeSH seemed to be worth the
alterations.
Common query language (interface) will be required. Will it be
OKBC?
The GO Perspective:
For what purpose?
Do we need ontology for ontology?
The GO Perspective:
We doubt it, but we do need an ontology for concepts such as
relationships.
For the OBO site, Chris Mungall is developing a relationship ontology.
The files rel.ontology and rel.definitions are included below for your
information.
!autogenerated-by: DAG-Edit version 1.310
!saved-by: cjm
!date: Tue Feb 25 16:31:45 PST 2003
!version: $Revision: 1.1 $
!
! This is the core relationships for all OBO ontologies.
!
! Some ontologies may have very specific relationship types,
! but the core ones are shared here.
!
! This ontology contains no biological terms -
! it is purely for defining the logic of the relationship types
! used in the various OBO biological ontologies.
!
! It also defines the "mnemonics" that are currently used in
! OBO flat files (% = is_a, < = part_of, ~ = develops_from).
!
! Note that this ontology is self defining!
! Parsers will at least have to have hard-coded knowledge of
! the is_a relationship to use this.
!
! The definitions contained in this CV provide an informal mapping
! between OBO ontologies and description logic style ontologies;
! all non "is_a" relationships map to restrictions on the property
! in question, and "is_a" maps to the subsumption relationship.
! Synonyms with existing daml terms are specified
!
$OBO_relationship_ontology ; OBO_REL:0000
 %relationship ; OBO_REL:0006 ; daml:property ; synonym:property
  %covered_by ; OBO_REL:0005
   %is_a ; OBO_REL:0002 ; daml:subClassOf ; mnemonic:&pct\; % transitive_relationship ; OBO_REL:0001
   %part_of ; OBO_REL:0003 ; mnemonic:<\; % transitive_relationship ; OBO_REL:0001
  %transitive_relationship ; OBO_REL:0001 ; daml:TransitiveProperty ; synonym:transitive_property
   %develops_from ; OBO_REL:0004 ; mnemonic:~
   %is_a ; OBO_REL:0002 ; daml:subClassOf ; mnemonic:&pct\; % covered_by ; OBO_REL:0005
   %part_of ; OBO_REL:0003 ; mnemonic:<\; % covered_by ; OBO_REL:0005
!version: $Revision: 1.1 $
!date: Tue Feb 25 16:31:45 PST 2003
!saved-by: cjm
!autogenerated-by: DAG-Edit version 1.310
!
!Gene Ontology definitions
!
term: covered_by
goid: OBO_REL:0005
definition: some relationships are "covering" relationships. If an entity is associated with a term X and that term is covered by term Y, then that entity can be associated with term Y. See also the GO "true path" rule - this rule applies to all covering relationships (is_a and part_of, but not, for instance, develops_from). It can be stated more formally as: E annotated_to X, X covered_by Y implies: E annotated_to Y
definition_reference: go:cjm
definition_reference: http://www.geneontology.org/doc/GO.usage.html
term: develops_from
goid: OBO_REL:0004
definition: any kind of temporal relationship (not just in the developmental sense). The subject (child node) of the relationship is the post-state, the object (parent node) is the pre-state. It could also be "produced_by" (for instance, a protein is produced_by an mRNA)
definition_reference: go:cjm
term: is_a
goid: OBO_REL:0002
definition: Represents subsumption relationships. The subject (child) is the more specific term; the object (parent) is the more general term. Corresponds exactly to the daml and rdfs property "subClassOf", which means the following always holds: Entity instance_of TermX, TermX is_a TermY implies: Entity instance_of TermY. For example, if Entity is a specific gene, then if that gene is assigned to class X then it is implicitly a member of class Y (and parents of Y, because is_a is itself a transitive_relationship).
definition_reference: go:cjm
definition_reference: http://www.daml.org
term: part_of
goid: OBO_REL:0003
definition: Used for representing partonomies. The subject (child node) of the relationship is the subpart; the object (parent node) is the superpart\npart_of can be used in various contexts - spatial, compositional, temporal. The context can usually be inferred from the terms it relates (for instance, in a process ontology, it means sub_process_of). We may decide to introduce subclasses of this in future. The GO "true path" rule holds for part_of. For instance, you cannot have "door" be part_of a "car" - it must be "car door" part_of "car"\n\n\n
definition_reference: go:cjm
term: relationship
goid: OBO_REL:0006
definition: root class for all OBO relationships.
definition_reference: go:cjm
term: transitive_relationship
goid: OBO_REL:0001
definition: a relationship is transitive if you can "follow it up the graph"\n\nfor example, if X Y and Z are terms, and R is a transitive_relationship, then the following must hold:\n\nX R Y,\nY R Z\nimplies:\nX R Z
definition_reference: go:cjm
definition_reference: http://www.daml.org
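The covered_by and transitive_relationship definitions above amount to a small inference rule, which the following Python sketch tries to make concrete. It is only one possible reading of those definitions: the edge list, the function names and the terms "door", "vehicle", "butterfly" and "caterpillar" are hypothetical, and the code is not taken from DAG-Edit or any OBO tool.

```python
# Relationship types treated as "covering" in the definitions above.
COVERING = {"is_a", "part_of"}

# Hypothetical edges as (child, relationship, parent) triples.
EDGES = [
    ("car door", "part_of", "car"),
    ("car door", "is_a", "door"),
    ("car", "is_a", "vehicle"),
    ("butterfly", "develops_from", "caterpillar"),
]


def covered_by(term, edges=EDGES):
    """All terms covering `term`, following covering edges transitively."""
    result = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for child, rel, parent in edges:
            if child == current and rel in COVERING and parent not in result:
                result.add(parent)
                frontier.append(parent)
    return result


def implied_annotations(entity_terms, edges=EDGES):
    """Apply: E annotated_to X, X covered_by Y implies E annotated_to Y."""
    implied = set(entity_terms)
    for term in entity_terms:
        implied |= covered_by(term, edges)
    return implied
```

Under these assumptions an entity annotated to "car door" is also implicitly annotated to "car", "door" and "vehicle", whereas an annotation to "butterfly" does not propagate to "caterpillar", because develops_from is not a covering relationship.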
Implementation of ontology in computers
How to validate ontologies?
This question is probably best dealt with by Robert Stevens and your
good self.
GO was created by biologists for biologists, as Robert pointed out in
one of the after-talk discussions. If GO had been created using computer
programs it might have been more consistent, but would it have been
useful straight away in biological annotation? There are inconsistencies
in GO which can be, and are being, corrected using new technologies. The
GO Consortium knew that it was likely to make mistakes but is totally
committed to correcting them. If GO had waited for the ontology to be
perfect it would never have been released and used to the extent that it
has been. GO is dynamic; having unique identifiers means that tracking
changes to correct annotations is relatively easy. Involving communities
from a wide variety of biological backgrounds was also helpful.
Comment on validating annotations to ontologies (not in your list):
GOA receives a lot of requests from groups interested in using our
manually verified GO annotation to validate their automatic extraction
techniques. GOA is involved in the BioCreative competition announced at
ISMB 2003 by the BioLink group. We will be creating the training and
test sets for use in the competition. Each competitor has to mine the
literature for biological knowledge and convert it into GO terms. The
UniProt curators will then validate and rank each prediction. It is
hoped that the winning software could be used to assist in the GO
annotation process at the EBI.