A Bridge Between Natural Language- And Ontology-based Biomedical Resources
The Examples of Biomedical Knowledge Expressed by SLHMR
Hanfei Bao, BMKI Lab, Toronto, Canada. Mail: hanfeib@gmail.com
I. The Structured Language Readable To Human And Machine (SLHMR)
The Structured Language Readable To Human And Machine (SLHMR) is a formatted language that the author intends for expressing biomedical knowledge in a way that is both formatted (and therefore machine-readable) and still close to natural text. Resources expressed in SLHMR (SLHMRRs) are meant to serve as a bridge between traditional natural-text biomedical resources (NTBMRs) and ontology-based formatted biomedical resources (OFBMRs). The philosophical basis for SLHMR is the view that NTBMRs are usually only human-readable (rather than readable by a machine or computer program), whereas OFBMRs are generally only machine-readable. Biomedical resources expressed in SLHMR would therefore help human readers understand the corresponding OFBMRs through their natural-language features. They are also expected to be more expressive than ontology techniques, especially for knowledge about super-complex systems in molecular biomedicine, such as the HIV/AIDS area. Additionally, SLHMR is intended to be readable by machines once suitably powerful applications have been developed (see Fig. 19-1).
Fig. 19-1
II. The Examples of Knowledge About HIV/AIDS Expressed By SLHMR
1. (LTR–or–U3–or–R–or–U5)—isA—SequenceOfViralDNA
2. (U3–or–R–or–U5)—isA—RegionOfViralDNA
3. LTR—physicallyContains—(U3–and–R–and–U5)
4. LTR—isSubdividedInto—(U3–and–R–and–U5)
5. LTR—physicallyContains—(Enhancer–and–Promoter–and–Cleavage/Poly(A))
6. U3—isEncodedBy—ViralRNA-SequencesUniquelyAt3′EndOfGenome
7. U5—isEncodedBy—ViralRNA-SequencesUniquelyAt5′EndOfGenome
8. R—isEncodedBy—RepeatedSequenceOfViralRNAAtEitherEndOfGenome
9. SiteOf(CleavageAlsoCalledPolyadenylation)—isLocatedAt—R
10. (Promoter–or–Enhancer)—isLocatedAt—U3
11. (Enhancer–or–Promoter–or–SiteOf(CleavageAlsoCalledPolyadenylation))—isA—TranscriptionalSignal
12. Transcription—isInitiatedAt—BoundaryBetweenU3AndR
13. CleavageAlsoCalledPolyadenylation—takesPlaceAt—BoundaryBetweenU5AndR
14. BoundaryBetweenU3AndR—isLocatedAt—UpstreamLTR
15. BoundaryBetweenU5AndR—isLocatedAt—DownstreamLTR
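The flat statements above follow a regular pattern: subject, relation, and object are separated by a long dash, and grouped terms are joined by –or– or –and– inside parentheses. A minimal Python sketch of how such a line might be read by a program is given here. The function names (wraps_whole_term, split_group, parse_statement) and the dictionary layout are illustrative assumptions, not part of SLHMR itself, and the sketch handles only flat statements, not nested ones.

```python
import re

EM_DASH = "\u2014"  # separates subject, relation and object
EN_DASH = "\u2013"  # surrounds the connectives, as in –or– and –and–


def wraps_whole_term(term):
    """True if the first '(' of term is closed by its last character."""
    depth = 0
    for i, ch in enumerate(term):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i == len(term) - 1
    return False


def split_group(term):
    """Split '(A–or–B)' into (['A', 'B'], 'or'); atoms give ([term], None)."""
    term = term.strip()
    if term.startswith("(") and wraps_whole_term(term):
        inner = term[1:-1]
        for word in ("or", "and"):
            sep = EN_DASH + word + EN_DASH
            if sep in inner:
                return [m.strip() for m in inner.split(sep)], word
    return [term], None


def parse_statement(line):
    """Parse one flat statement 'Subject—relation—Object' into a dict."""
    text = re.sub(r"^\s*\d+\.\s*", "", line)  # drop a leading item number
    parts = [p.strip() for p in text.split(EM_DASH)]
    if len(parts) != 3:
        raise ValueError(f"expected 3 parts, got {len(parts)}: {line!r}")
    subjects, subject_connective = split_group(parts[0])
    objects, object_connective = split_group(parts[2])
    return {
        "subjects": subjects,
        "subject_connective": subject_connective,
        "relation": parts[1],
        "objects": objects,
        "object_connective": object_connective,
    }


# Statements 4 and 9 from the list above, copied as written.
stmt4 = parse_statement("4. LTR —isSubdividedInto—(U3–and–R–and–U5)")
stmt9 = parse_statement("9. SiteOf(CleavageAlsoCalledPolyadenylation)—isLocatedAt—R")
```

Note that a term such as SiteOf(CleavageAlsoCalledPolyadenylation) is kept as a single atom, because its parentheses do not wrap the whole term.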
III. The Examples of Relations Drawn from Particular References, Their Composition, and the Granularity Evolution of Those Relations
The following set of physical relations can be drawn from refs. 2, 3 and 4:
16. RNA-PolymeraseII—regulatesPositively—(TranscriptionOfDNA-ToSynthesizePrecursorsOf-mRNA–or–TranscriptionOfDNA-ToSynthesizePrecursorsOf-snRNA–or–TranscriptionOfDNA-ToSynthesizePrecursorsOf-microRNA)
17. PhosphorylationOfCTD-OfLargerSubUnitOfRNA-PolymeraseII—regulatesPositively—ActivityOfRNA-PolymeraseII
18. The compound form of relations 16 and 17 would be:
(RNA-PolymeraseII—regulatesPositively—(TranscriptionOfDNA-ToSynthesizePrecursorsOf-mRNA–or–TranscriptionOfDNA-ToSynthesizePrecursorsOf-snRNA–or–TranscriptionOfDNA-ToSynthesizePrecursorsOf-microRNA))—byMeansOf—PhosphorylationOfCTD-OfLargerSubUnitOfRNA-PolymeraseII
19. A more detailed form of 18 would be:
(RNA-PolymeraseII—regulatesPositively—(TranscriptionOfDNA-ToSynthesizePrecursorsOf-mRNA–or–(TranscriptionOfDNA-ToSynthesizePrecursorsOf-snRNA—hasDescrPanWeightOfProportionOfPopulation—MostOfPopulation)–or–(TranscriptionOfDNA-ToSynthesizePrecursorsOf-microRNA—hasDescrPanWeightOfProportionOfPopulation—MostOfPopulation)))—byMeansOf—PhosphorylationOfCTD-OfLargerSubUnitOfRNA-PolymeraseII
20. Here, coarseness or granularity evolution means the process of turning a detailed relation into a less detailed, coarser-grained form, or even into a black-box-like form. For relation 19, the steps are as follows:
(RNA-PolymeraseII—regulatesPositively—(TranscriptionOfDNA-ToSynthesizePrecursorsOf-mRNA–or–(TranscriptionOfDNA-ToSynthesizePrecursorsOf-snRNA—hasDescrPanWeightOfProportionOfPopulation—MostOfPopulation)(#st1)–or–(TranscriptionOfDNA-ToSynthesizePrecursorsOf-microRNA—hasDescrPanWeightOfProportionOfPopulation—MostOfPopulation)(#st2)))—byMeansOf—PhosphorylationOfCTD-OfLargerSubUnitOfRNA-PolymeraseII
—>
(RNA-PolymeraseII—regulatesPositively—(TranscriptionOfDNA-ToSynthesizePrecursorsOf-mRNA–or–(#st1)–or–(#st2))(#st3))—byMeansOf—PhosphorylationOfCTD-OfLargerSubUnitOfRNA-PolymeraseII
—>
(RNA-PolymeraseII—regulatesPositively—(#st3))(#st4)—byMeansOf—PhosphorylationOfCTD-OfLargerSubUnitOfRNA-PolymeraseII
—>
(#st4)—byMeansOf—PhosphorylationOfCTD-OfLargerSubUnitOfRNA-PolymeraseII
Thus the final result of the granularity evolution is a “simple” binary relation.
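The stepwise replacement of inner relations by labels can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's implementation: the classes Rel and Or, the functions render, coarsen and to_binary, and the choice to number labels innermost-first and left-to-right are all hypothetical. With that numbering, the labels for relation 19 come out as #st1 to #st4, in the same order as in the steps above.

```python
from dataclasses import dataclass
from typing import Union

EM, EN = "\u2014", "\u2013"  # long dash between parts, short dash around 'or'


@dataclass(frozen=True)
class Rel:
    subject: "Node"
    predicate: str
    obj: "Node"


@dataclass(frozen=True)
class Or:
    members: tuple


Node = Union[str, Rel, Or]


def render(node):
    """Write a node back in SLHMR-like notation."""
    if isinstance(node, str):
        return node
    if isinstance(node, Or):
        return "(" + f"{EN}or{EN}".join(operand(m) for m in node.members) + ")"
    return f"{operand(node.subject)}{EM}{node.predicate}{EM}{operand(node.obj)}"


def operand(node):
    """Nested relations are parenthesised when used as a part."""
    return f"({render(node)})" if isinstance(node, Rel) else render(node)


def coarsen(node, table):
    """Replace a compound node by a label, innermost parts first."""
    if isinstance(node, str):
        return node
    if isinstance(node, Or):
        reduced = Or(tuple(coarsen(m, table) for m in node.members))
    else:
        reduced = Rel(coarsen(node.subject, table), node.predicate,
                      coarsen(node.obj, table))
    label = f"#st{len(table) + 1}"
    table[label] = reduced
    return label


def to_binary(root, table):
    """Keep the outermost relation, label everything nested inside it."""
    return Rel(coarsen(root.subject, table), root.predicate,
               coarsen(root.obj, table))


T = "TranscriptionOfDNA-ToSynthesizePrecursorsOf-"
W = "hasDescrPanWeightOfProportionOfPopulation"
relation_19 = Rel(
    Rel("RNA-PolymeraseII", "regulatesPositively",
        Or((T + "mRNA",
            Rel(T + "snRNA", W, "MostOfPopulation"),
            Rel(T + "microRNA", W, "MostOfPopulation")))),
    "byMeansOf",
    "PhosphorylationOfCTD-OfLargerSubUnitOfRNA-PolymeraseII",
)

labels = {}
binary = to_binary(relation_19, labels)
print(render(binary))
for name, content in labels.items():
    print(name, "stands for", render(content))
```

The labels dictionary plays the role of the "represents" rows in a label table: each label can be expanded again to recover the more detailed form.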
IV. The Network-like Diagram For The Physical Relations Above And Some Necessary Logic Relations
The completed physical and logical relations for the network-like diagram are stored in the data file RNAPolII.sif, which is in the form required by Cytoscape. Its content is as follows:
Column1 Column2 Column3
st14 byMeansOf PhosphorylationOfCTD-OfLargerSubUnitOfRNA-PolymeraseII
RNA-PolymeraseII regulatesPositively TranscriptionOfDNA-ToSynthesizePrecursorsOf-mRNA
RNA-PolymeraseII regulatesPositively st11
RNA-PolymeraseII regulatesPositively st12
st14 represents RNA-PolymeraseII—regulatesPositively—st13
st13 represents RNA-PolymeraseII—regulatesPositively—TranscriptionOfDNA-ToSynthesizePrecursorsOf-mRNA
st13 represents RNA-PolymeraseII—regulatesPositively—st11
st13 represents RNA-PolymeraseII—regulatesPositively—st12
st12 represents TranscriptionOfDNA-ToSynthesizePrecursorsOf-microRNA—hasDescrPanWeightOfProportionOfPopulation—MostOfPopulation
st11 represents TranscriptionOfDNA-ToSynthesizePrecursorsOf-snRNA—hasDescrPanWeightOfProportionOfPopulation—MostOfPopulation
st12 hasSubjectOfRelation TranscriptionOfDNA-ToSynthesizePrecursorsOf-microRNA
st11 hasSubjectOfRelation TranscriptionOfDNA-ToSynthesizePrecursorsOf-snRNA
st13 hasObjectOfRelation st11
st13 hasObjectOfRelation st12
st13 hasObjectOfRelation TranscriptionOfDNA-ToSynthesizePrecursorsOf-mRNA
st14 hasObjectOfRelation st13
The network diagram rendered by Cytoscape is shown in Fig. 19-2.
Fig. 19-2 The diagram made by Cytoscape.
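The three-column listing in this section (source node, interaction type, target node) can also be read outside Cytoscape. The Python sketch below parses a few rows copied from the listing and looks up the targets of one node. The function names parse_sif and targets are hypothetical, and the sketch assumes that the "Column1 Column2 Column3" row is a label for the reader rather than part of the data, so it is skipped.

```python
SIF_TEXT = """\
Column1 Column2 Column3
st14 byMeansOf PhosphorylationOfCTD-OfLargerSubUnitOfRNA-PolymeraseII
RNA-PolymeraseII regulatesPositively st11
st14 represents RNA-PolymeraseII—regulatesPositively—st13
st13 hasObjectOfRelation st11
st13 hasObjectOfRelation st12
st14 hasObjectOfRelation st13
"""


def parse_sif(text, header_first_cell="Column1"):
    """Return (source, interaction, target) tuples from three-column rows."""
    edges = []
    for line in text.splitlines():
        cells = line.split()  # rows are separated by spaces or tabs
        if not cells or cells[0] == header_first_cell:
            continue  # skip blank lines and the assumed header row
        if len(cells) != 3:
            raise ValueError(f"expected 3 columns: {line!r}")
        edges.append(tuple(cells))
    return edges


def targets(edges, source, interaction):
    """All target nodes reached from source by the given interaction type."""
    return [t for s, i, t in edges if s == source and i == interaction]


edges = parse_sif(SIF_TEXT)
print(targets(edges, "st13", "hasObjectOfRelation"))
```

Because a whole nested relation such as RNA-PolymeraseII—regulatesPositively—st13 contains no spaces, it stays a single node name when the row is split.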
V. Realization Of Ontology Based On the Resources Expressed By SLHMR
From references 5 and 6 we obtain the SLHMR-expressed resources listed below. Based on them, the author has built the corresponding ontology HIVGenetics00*.owl, which will be published as an open machine-readable medical resource:
(1)LTR-FullNmLongTerminalRepeatOfHIV1ProViralDNA—isA—SequenceOfDNA;
(2)LTR-FullNmLongTerminalRepeatOfHIV1ProViralDNA—hasNumberOfBasePair—some 634 (see Fig. 19-3)
Fig. 19-3
(3)LTR-FullNmLongTerminalRepeatOfHIV1ProViralDNA—isLocatedInOrAt—EitherEndRegionOfHIV1ProviralDNA
(4)LTR-FullNmLongTerminalRepeatOfHIV1ProViralDNA—hasPhysicalComponentOrPhysicallyContains—(U3RegionFullNmUnique3’Sequence–and–R-RegionFullNmRepeatedSequence–and–U5RegionFullNmUnique5’Sequence) (see Fig. 19-4)
Fig.19-4
(5)U3RegionFullNmUnique3’Sequence—hasNumberOfBasePair—some 450
(6)U3RegionFullNmUnique3’Sequence—physicallyAndFunctionallyContains—CisActingDNA-Element
(7)CisActingDNA-Element—isA—SiteBindingCellularTranscriptionFactor (see Fig. 19-5)
Fig.19-5
(8)R-RegionFullNmRepeatedSequence—hasNumberOfBasePair—some 100
(9)Transcription—startsAtLocation—FirstBaseOfR-RegionFullNmRepeatedSequence(see Fig. 19-6)
Fig. 19-6
(10)Polyadenylation—startsAtLocation—ImmediatelyAfterLastBaseOfR-RegionFullNmRepeatedSequence (see Fig. 19-7)
Fig. 19-7
(11)U5RegionFullNmUnique5’Sequence—hasNumberOfBasePair—some 180
(12)U5RegionFullNmUnique5’Sequence—binds—Tat(Fig. 19-8)
(13)U5RegionFullNmUnique5’Sequence—binds—PackagingSequenceOfHIV(Fig. 19-8)
Fig. 19-8
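As one possible first step from statements like (1) to (13) toward a machine-readable file, each flat statement can be written as a subject-predicate-object line in N-Triples style. The Python sketch below is only an illustration: it is not how HIVGenetics00*.owl was built, the namespace http://example.org/slhmr# is a placeholder, and treating an object of the form "some N" as a plain integer value is a simplifying assumption that drops whatever meaning "some" carries in the ontology.

```python
from urllib.parse import quote

BASE = "http://example.org/slhmr#"  # placeholder namespace, not from the source
XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"
EM_DASH = "\u2014"


def iri(name):
    """Wrap a term as an IRI, percent-encoding characters such as ’."""
    return "<" + BASE + quote(name, safe="") + ">"


def to_ntriple(statement):
    """Turn one flat 'Subject—relation—Object' statement into one line."""
    parts = [p.strip() for p in statement.split(EM_DASH)]
    if len(parts) != 3:
        raise ValueError(f"not a flat statement: {statement!r}")
    subject, relation, obj = parts
    if obj.startswith("some ") and obj[5:].isdigit():
        # Assumption: 'some 100' is stored as the integer 100.
        obj_term = f'"{obj[5:]}"^^<{XSD_INT}>'
    else:
        obj_term = iri(obj)
    return f"{iri(subject)} {iri(relation)} {obj_term} ."


line_1 = to_ntriple(
    "LTR-FullNmLongTerminalRepeatOfHIV1ProViralDNA—isA—SequenceOfDNA")
line_8 = to_ntriple(
    "R-RegionFullNmRepeatedSequence—hasNumberOfBasePair—some 100")
print(line_1)
print(line_8)
```

Grouped objects, as in statement (4), would first have to be split into their members before this function is applied.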
References
1. Warner C. Greene and B. Matija Peterlin: Molecular Insights Into HIV Biology. http://hivinsite.ucsf.edu/InSite?page=kb-00&doc=kb-02-01-01
2. Nguyen V.T., Kiss T., Michels A.A., Bensaude O.: 7SK small nuclear RNA binds to and inhibits the activity of CDK9/cyclin T complexes. http://www.uniprot.org/citations/11713533
3. https://en.wikipedia.org/wiki/RNA_polymerase_II
4. Steven Hahn: Structure and mechanism of the RNA Polymerase II transcription machinery. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1189732/
5. Hope, PhD, Didier Trono, MD: Structure, Expression, and Regulation of the HIV Genome. http://hivinsite.ucsf.edu/InSite?page=kb-02-01-02#S3.1X
6. https://en.wikipedia.org/wiki/Long_terminal_repeat
(Related sites for ontologies and discussions:
https://bioportal.bioontology.org/ontologies/HIVO004
http://bioportal.bioontology.org/ontologies/HCODONONT/
http://bioportal.bioontology.org/ontologies/AIDSCLIN
http://blog.51.ca/u-345129/
http://wp.miforum.net/baohanfei/)
(All rights reserved),(Updates: 2017-01-18,2017-01-19,2017-01-24,2017-01-26,2017-02-10,2017-02-11)