The Semantic Systems Biology group (NTNU, Norway) hosts four APOs (http://www.semantic-systems-biology.org/apo): the Cell Cycle Ontology (CCO), the Gene Expression Ontology (GeXO), the Regulation of Gene Expression Ontology (ReXO), and the Regulation of Transcription Ontology (ReTO).
Unlike domain ontologies, these ontologies also incorporate data pertinent to the domain of discourse.
All four share a small Upper Level Ontology (ULO) built from a limited number of terms taken from well-established ontologies.
Below is the complete list of terms (term_id => [ label, parent_id ]):
'SIO:000000' => [ 'entity', 'SIO:000000' ],
'SIO:000003' => [ 'physical entity', 'SIO:000000' ],
'SIO:000260' => [ 'abstract entity', 'SIO:000000' ],
'SIO:000002' => [ 'processual entity', 'SIO:000003' ],
'SIO:000004' => [ 'material entity', 'SIO:000003' ],
'SIO:000614' => [ 'attribute', 'SIO:000260' ],
'SIO:000006' => [ 'process', 'SIO:000002' ],
'SIO:010004' => [ 'chemical entity', 'SIO:000004' ],
'SIO:010046' => [ 'biological entity', 'SIO:000004' ],
'SIO:000340' => [ 'realizable entity', 'SIO:000614' ],
'SIO:011125' => [ 'molecule', 'SIO:010004' ],
'SIO:010441' => [ 'submolecule', 'SIO:010004' ],
'SIO:000112' => [ 'capability', 'SIO:000340' ],
'SIO:010001' => [ 'cell', 'SIO:010046' ],
'SIO:010074' => [ 'amino acid residue', 'SIO:010441' ],
'SIO:000017' => [ 'function', 'SIO:000112' ],
'SIO:000014' => [ 'disposition', 'SIO:000112' ],
'MI:0190' => [ 'interaction type', 'SIO:000006' ],
'GO:0005575' => [ 'cellular component', 'SIO:010046' ],
'GO:0008150' => [ 'biological process', 'SIO:000006' ],
'GO:0003674' => [ 'molecular function', 'SIO:000017' ],
'SIO:010043' => [ 'protein', 'SIO:011125' ],
'PR:000025513' => [ 'modified amino-acid residue', 'SIO:010074' ],
'SIO:010035' => [ 'gene', 'SIO:010441' ],
'SIO:010000' => [ 'organism', 'SIO:010046' ],
'OGMS:0000031' => [ 'disease', 'SIO:000014' ]
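The parent_id column makes the ULO a tree rooted at 'entity', which is listed as its own parent. As a minimal sketch of how the table can be read, the Python snippet below copies a handful of rows from the list above and walks from a term up to the root. The function name path_to_root and the choice of a dict are illustrative assumptions, not part of the APO tooling.

```python
# A few rows copied from the ULO table: term_id -> (label, parent_id).
ULO = {
    "SIO:000000": ("entity", "SIO:000000"),
    "SIO:000003": ("physical entity", "SIO:000000"),
    "SIO:000004": ("material entity", "SIO:000003"),
    "SIO:010004": ("chemical entity", "SIO:000004"),
    "SIO:011125": ("molecule", "SIO:010004"),
    "SIO:010043": ("protein", "SIO:011125"),
}


def path_to_root(term_id, table=ULO):
    """Return the labels from term_id up to the root.

    The root is recognised as the term that names itself as parent;
    the 'seen' set also guards against accidental cycles.
    """
    path = []
    seen = set()
    while term_id not in seen:
        seen.add(term_id)
        label, parent_id = table[term_id]
        path.append(label)
        term_id = parent_id
    return path


print(" > ".join(path_to_root("SIO:010043")))
```

For 'SIO:010043' the walk passes through molecule, chemical entity, material entity, and physical entity before reaching entity.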
The scope of each APO is determined by the Gene Ontology biological process terms listed below, which are included in the APO together with all their descendants (along all relationship types used in the Gene Ontology).
CCO:
'GO:0007049' => 'cell cycle',
'GO:0051301' => 'cell division',
'GO:0008283' => 'cell proliferation',
'GO:0006261' => 'DNA-dependent DNA replication'
GeXO:
'GO:0010467' => 'gene expression'
ReXO:
'GO:0010468' => 'regulation of gene expression process'
ReTO:
'GO:0006355' => 'regulation of transcription, DNA-dependent'
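Collecting "all descendants along all relationship types" amounts to a graph traversal that ignores the edge label. The following Python sketch illustrates the idea on a hypothetical toy graph; the term IDs (T:1, T:2, ...) and the edge list are placeholders and not real Gene Ontology content, and the function name descendants is an assumption made for illustration.

```python
from collections import defaultdict, deque

# Hypothetical toy edges: (child, relationship, parent).
EDGES = [
    ("T:2", "is_a", "T:1"),
    ("T:3", "part_of", "T:1"),
    ("T:4", "regulates", "T:3"),
    ("T:5", "is_a", "T:9"),  # not reachable from T:1
]


def descendants(roots, edges):
    """Return the roots plus every term reachable downwards from them.

    The relationship type is deliberately ignored, so is_a, part_of,
    regulates, etc. are all followed.
    """
    children = defaultdict(set)
    for child, _relationship, parent in edges:
        children[parent].add(child)

    found = set(roots)
    queue = deque(roots)
    while queue:
        term = queue.popleft()
        for child in children[term]:
            if child not in found:
                found.add(child)
                queue.append(child)
    return found


scope = descendants({"T:1"}, EDGES)
print(sorted(scope))
```

In this toy graph, T:4 enters the scope through a 'regulates' edge followed by a 'part_of' edge, while T:5 stays outside.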
All four APOs include the complete Molecular Function and Cellular Component branches of the Gene Ontology and the Interaction Type branch of the Molecular Interactions Ontology.
Original IDs are re-used throughout instead of minting APO-specific ones.
With respect to data, all four APOs are protein-centric. They import data from the following sources: GOA, IntAct, UniProt, and Entrez.
Additionally, orthology relations are computed with the use of the OrthAgogue utility.
In each case the data are filtered by the scope of the ontology as defined above. The initial set of proteins imported from GOA is further extended by applying the 'guilt-by-association' principle to the IntAct and orthology data.
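The text does not spell out how the 'guilt-by-association' extension is carried out. One plausible reading, shown here purely as a sketch, is a single-step expansion: any protein that is paired with a seed protein (through an interaction or an orthology relation) is added to the set. The function name, the single-step choice, and the protein identifiers P1 to P4 are all assumptions for illustration.

```python
def extend_by_association(seed, pairs):
    """One-step extension of a protein set (assumed procedure).

    seed  : proteins already in scope (e.g. taken from GOA annotations)
    pairs : iterable of (protein_a, protein_b) links, e.g. interaction
            partners or orthologs
    A partner is added only if the other member of the pair is in the
    original seed, so the expansion does not cascade.
    """
    extended = set(seed)
    for a, b in pairs:
        if a in seed:
            extended.add(b)
        if b in seed:
            extended.add(a)
    return extended


# Hypothetical identifiers, not real accessions.
seed = {"P1"}
links = [("P1", "P2"), ("P2", "P3"), ("P4", "P1")]
print(sorted(extend_by_association(seed, links)))
```

Here P2 and P4 are added because they are linked directly to P1; P3 is linked only to P2 and is therefore left out under the single-step assumption.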
The APOs include data for the following biological species (term_id => [term_name, term_def]):
CCO:
'NCBITaxon:559292' => [ 'Saccharomyces cerevisiae', 'An organism of the species Saccharomyces cerevisiae']
'NCBITaxon:284812' => [ 'Schizosaccharomyces pombe', 'An organism of the species Schizosaccharomyces pombe']
'NCBITaxon:3702' => [ 'Arabidopsis thaliana', 'An organism of the species Arabidopsis thaliana']
'NCBITaxon:6239' => [ 'Caenorhabditis elegans', 'An organism of the species Caenorhabditis elegans']
'NCBITaxon:7227' => [ 'Drosophila melanogaster', 'An organism of the species Drosophila melanogaster']
'NCBITaxon:8364' => [ 'Xenopus tropicalis', 'An organism of the species Xenopus tropicalis']
'NCBITaxon:9606' => [ 'Homo sapiens', 'An organism of the species Homo sapiens']
'NCBITaxon:10090' => [ 'Mus musculus', 'An organism of the species Mus musculus']
GeXO, ReXO, ReTO:
'NCBITaxon:9606' => [ 'Homo sapiens', 'An organism of the species Homo sapiens']
'NCBITaxon:10090' => [ 'Mus musculus', 'An organism of the species Mus musculus']
'NCBITaxon:10116' => [ 'Rattus norvegicus', 'An organism of the species Rattus norvegicus']
Original IDs are re-used throughout, with the single exception of modified amino-acid residue terms. The following namespaces are used to form term IDs:
IntAct => interaction terms
NCBIGene => gene terms
NCBITaxon => biological species terms
OMIM => disease terms
SSB => modified amino-acid residue terms
UniProt => protein terms
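As a small illustration of the namespace scheme, the Python sketch below joins a namespace prefix and an original accession with a colon, which is the pattern visible in the species IDs (for example 'NCBITaxon:9606'). That the same separator is used for every namespace is an assumption, as are the dictionary keys and the function name make_term_id.

```python
# Namespace prefixes from the list above, keyed by an illustrative
# (hypothetical) shorthand for the kind of term.
NAMESPACES = {
    "interaction": "IntAct",
    "gene": "NCBIGene",
    "species": "NCBITaxon",
    "disease": "OMIM",
    "modified residue": "SSB",
    "protein": "UniProt",
}


def make_term_id(kind, accession):
    """Build 'Namespace:accession' (colon separator assumed throughout)."""
    return f"{NAMESPACES[kind]}:{accession}"


print(make_term_id("species", "9606"))
```

The example reproduces the Homo sapiens ID used in the species lists; an unknown kind raises a KeyError rather than silently producing a malformed ID.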
Terms are related to each other with the use of properties from RO, BFO, SIO as explained below:
'protein term' => ['RO:0002331', 'involved_in'] => 'GO term, biological_process'
'protein term' => ['BFO:0000050', 'part_of'] => 'GO term, cellular_component'
'protein term' => ['RO:0002327', 'enables'] => 'GO term, molecular_function'
'protein interaction term' => ['SIO:000139', 'has agent'] => 'protein term'
'protein term' => ['SIO:000558', 'is orthologous to'] => 'protein term'
'protein term' => ['SIO:000630', 'is paralogous to'] => 'protein term'
'protein term' => ['RO:0000053', 'bearer of'] => 'modified amino-acid residue term'
'protein term' => ['RO:0002331', 'involved in'] => 'disease term'
'protein term' => ['RO:0000052', 'inheres in'] => 'biological species term'
'gene term' => ['RO:0000052', 'inheres in'] => 'biological species term'
'gene term' => ['SIO:010078', 'encodes'] => 'protein term'
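The list above can be read as a schema of permitted (subject, property, object) combinations. The Python sketch below encodes it as a lookup table and checks whether a given combination is allowed. The short kind names ('protein', 'species', and so on) and the function is_allowed are hypothetical shorthand introduced for this example; only the property IDs come from the list.

```python
# (subject kind, property ID) -> set of permitted object kinds,
# transcribed from the relation list above.
RELATIONS = {
    ("protein", "RO:0002331"): {"biological_process", "disease"},
    ("protein", "BFO:0000050"): {"cellular_component"},
    ("protein", "RO:0002327"): {"molecular_function"},
    ("interaction", "SIO:000139"): {"protein"},
    ("protein", "SIO:000558"): {"protein"},
    ("protein", "SIO:000630"): {"protein"},
    ("protein", "RO:0000053"): {"modified_residue"},
    ("protein", "RO:0000052"): {"species"},
    ("gene", "RO:0000052"): {"species"},
    ("gene", "SIO:010078"): {"protein"},
}


def is_allowed(subject_kind, property_id, object_kind):
    """True if the combination appears in the relation table."""
    return object_kind in RELATIONS.get((subject_kind, property_id), set())


print(is_allowed("gene", "SIO:010078", "protein"))   # gene encodes protein
print(is_allowed("protein", "SIO:010078", "gene"))   # reversed direction
```

Note that 'RO:0002331' appears twice in the list, once with a biological process and once with a disease as object, so both object kinds are accepted for it here.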