Inquiry Optimization Technique for a Topic Map Database

Yuki Kuribara
    (Graduate School of Engineering, Shibaura Institute of Technology)
Masaomi Kimura
    (Information Engineering, Shibaura Institute of Technology)

Contents
   Background
   Research contents
   Experiment
   Conclusion




2                       Data Engineering Lab   2010/10/6
Topic maps
   Recently, many kinds of topic maps have been created
       For web portal sites
       For application development, and so on


   When we target large topic maps, we need to construct
    databases for them
       since databases can deal with data larger than the size of physical
        memory

   [Figure: data out of memory versus data on memory]

The role of database
       Database systems should take responsibility for managing
        information of topic maps
           Query optimization
           Transaction management
           Physical data structure hiding
   [Figure: a query and the information of a topic map enter the database
    system, which provides query optimization, physical data structure
    hiding, and transaction management]

The physical data model for databases
        We propose to use the object-oriented model for the
                               databases


   There are several options for the data model of the databases
       A relational model (tables) and an object-oriented model are the ones
        mainly used in topic map databases
   When we traverse the topic map to retrieve information, an
    object-oriented model does not need to join tables multiple times,
    unlike a relational model

   [Figure: a relational model (tables) versus an object-oriented model (Object A, Object B)]

The logical data model for databases
   We assumed the topic map data structure defined by the Topic
    Maps Data Model (TMDM)
        since topic maps should follow the TMDM
   The data model consists of seven types of information items
    and 19 types of named properties
        We implemented these items as classes, whose instances hold
         references to the corresponding information item objects

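   [Class diagram: excerpt of the implemented classes and their references]
        TopicMap (+parent, 1)      ---  (+associations, 0..*)  Association
        TopicMap (+parent, 1)      ---  (+topics, 0..*)        Topic
        Association (+parent, 1)   ---  (+roles, 0..*)         AssociationRole
        Topic (+player, 1)         ---  (+roles, 0..*)         AssociationRole
        Topic (+parent, 1)         ---  (+topicNames, 0..*)    TopicName
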
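The reference relationships in the class diagram can be sketched in Python as follows. This is only an illustration under our own assumptions, not the TOME implementation: it covers just the five classes of the diagram, the helper functions add_topic and add_association are hypothetical, and the field `type` (the topic that serves as the association name) is our simplified reading of how an association gets its name.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class TopicName:
    value: str
    parent: Topic


@dataclass(eq=False)
class Topic:
    parent: TopicMap
    topic_names: list[TopicName] = field(default_factory=list)
    roles: list[AssociationRole] = field(default_factory=list)  # roles played by this topic


@dataclass(eq=False)
class AssociationRole:
    parent: Association
    player: Topic


@dataclass(eq=False)
class Association:
    parent: TopicMap
    type: Topic  # assumption: the topic whose name is the association name
    roles: list[AssociationRole] = field(default_factory=list)


@dataclass(eq=False)
class TopicMap:
    topics: list[Topic] = field(default_factory=list)
    associations: list[Association] = field(default_factory=list)


def add_topic(tm: TopicMap, name: str) -> Topic:
    """Hypothetical helper: create a topic with one name and register it."""
    topic = Topic(parent=tm)
    topic.topic_names.append(TopicName(value=name, parent=topic))
    tm.topics.append(topic)
    return topic


def add_association(tm: TopicMap, type_: Topic, *players: Topic) -> Association:
    """Hypothetical helper: connect the players through one association."""
    assoc = Association(parent=tm, type=type_)
    for player in players:
        role = AssociationRole(parent=assoc, player=player)
        assoc.roles.append(role)   # Association -> roles
        player.roles.append(role)  # Topic -> roles
    tm.associations.append(assoc)
    return assoc


tm = TopicMap()
doyle = add_topic(tm, "Conan Doyle")
study = add_topic(tm, "A Study in Scarlet")
write = add_topic(tm, "write")
assoc = add_association(tm, write, doyle, study)
```

Because every object keeps direct references to its neighbours, moving from a topic to its associations is a matter of following `roles` and `parent`, with no join step.
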
The possibility of plural retrieval routes
        The database system needs to select the most suitable
               retrieval route (query optimization)


   When we retrieve the information of a topic map, there may be
    more than one way to retrieve the same objects
   How efficiently we can retrieve the objects depends on the search method

   [Class diagram, same as on the previous slide: Topic objects can be reached
    either from TopicMap via +topics or from Association via +roles and +player]

Query optimization
            The database should take responsibility for query
                             optimization


   Database systems need to estimate a suitable execution plan
       without query optimization, retrieval may take a very long time
   Although some topic map database systems exist, they
    do not seem to take optimization into consideration

Objective
          We propose an optimization technique based on the
                     estimation of execution cost


   In this presentation, we focus on the retrieval of topic objects that
    are related to a particular topic by a specific association
       e.g.) What did Conan Doyle write?

   [Figure: the intended topic 'A Study in Scarlet' is connected by a specific
    association, 'write', to a particular topic specified in the query, 'Conan Doyle']

Retrieval plan - the association route
   e.g.) What did Conan Doyle write?
       1. We search the association objects 'write'
       2. From those associations, we retrieve the topics on both sides:
          we search the topic object 'Conan Doyle' and find the intended
          topic objects (e.g. 'A Study in Scarlet')

Retrieval plan - the topic route
   e.g.) What did Conan Doyle write?
       1. We search the topic object 'Conan Doyle'
       2. We then search the association objects 'write' referred to by
          its association role objects
       3. We find the intended topics (e.g. 'A Study in Scarlet')

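The two retrieval plans can be sketched as follows to show that they reach the same topics by different routes. This is a minimal sketch, not the TOME code: the classes are reduced to the references each route needs, the function names association_route, topic_route and link are hypothetical, and the 'born in' association is illustrative data added only to show the filtering by association name.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Topic:
    name: str
    roles: list = field(default_factory=list)  # AssociationRole objects played by this topic


@dataclass(eq=False)
class Association:
    name: str
    roles: list = field(default_factory=list)


@dataclass(eq=False)
class AssociationRole:
    parent: Association
    player: Topic


def link(assoc_name, *players):
    """Hypothetical helper: create an association and wire up its roles."""
    assoc = Association(assoc_name)
    for player in players:
        role = AssociationRole(assoc, player)
        assoc.roles.append(role)
        player.roles.append(role)
    return assoc


def association_route(associations, assoc_name, topic_name):
    """Step 1: scan the associations by name. Step 2: look at the topics on both sides."""
    found = []
    for assoc in associations:
        if assoc.name != assoc_name:
            continue
        players = [role.player for role in assoc.roles]
        if any(p.name == topic_name for p in players):
            found.extend(p.name for p in players if p.name != topic_name)
    return sorted(found)


def topic_route(topics, assoc_name, topic_name):
    """Step 1: scan the topics by name. Step 2: follow the roles to the
    associations. Step 3: follow matching associations to the other players."""
    found = []
    for topic in topics:
        if topic.name != topic_name:
            continue
        for role in topic.roles:
            assoc = role.parent
            if assoc.name == assoc_name:
                found.extend(r.player.name for r in assoc.roles if r.player is not topic)
    return sorted(found)


doyle = Topic("Conan Doyle")
study = Topic("A Study in Scarlet")
sign = Topic("The Sign of Four")
edinburgh = Topic("Edinburgh")  # illustrative data, not from the slides
topics = [doyle, study, sign, edinburgh]
associations = [
    link("write", doyle, study),
    link("write", doyle, sign),
    link("born in", doyle, edinburgh),
]

via_association = association_route(associations, "write", "Conan Doyle")
via_topic = topic_route(topics, "write", "Conan Doyle")
```

Both calls return the same list of works; what differs is which objects are scanned first, and that difference is what the cost estimation has to capture.
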
Estimation of execution cost
                     We define estimation
             formulae for the retrieval cost of each plan


   Systems have to choose the most suitable plan
   It is necessary to define a cost that can effectively estimate
    the retrieval time (cost estimation)

   [Figure: for a query on the information of a topic map, Route A has cost 10 and Route B has cost 100]

Cost of objects - definition of cost
     We measured the total execution time and the retrieval time
      of objects
     Object retrieval accounts for more than 99% of the processing
      time
     It is therefore sufficient to measure the time to retrieve objects to
      evaluate the cost of query processing

     Route               Execution time (A)   Retrieval time of objects (B)   Ratio of object retrieval
                         (nano sec)           (nano sec)                      time (B/A)
     Association route   6.025×10^8           5.991×10^8                      99.44 (%)
     Topic route         1.035×10^8           1.033×10^8                      99.81 (%)

     [Chart: execution time of retrieval; retrieval time of objects: more than 99%, other time: less than 1%]

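The ratio column can be recomputed from the two measured columns. The short check below uses only the values reported on the slide; the function name retrieval_ratio is ours.

```python
# (A) total execution time and (B) retrieval time of objects, in nanoseconds,
# as reported on the slide.
measurements = {
    "association route": (6.025e8, 5.991e8),
    "topic route": (1.035e8, 1.033e8),
}


def retrieval_ratio(total_ns, retrieval_ns):
    """Share of the execution time spent retrieving objects, in percent (B/A)."""
    return 100.0 * retrieval_ns / total_ns


ratios = {
    route: round(retrieval_ratio(total, retrieval), 2)
    for route, (total, retrieval) in measurements.items()
}
```

Both ratios round to the values on the slide (99.44 and 99.81), so in these measurements less than 1% of the time is spent outside object retrieval.
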
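Cost estimation formula for the association route

   C_{assoc\_route} = C_a \times N + 2 (C_{ar} + C_t) \times \frac{N}{Q}

   (N: the number of association objects, Q: the number of unique association
    names; C_a, C_{ar}, C_t: the retrieval costs of association, association
    role and topic objects)

   Term 1 (step 1), C_a \times N: we need to retrieve all associations,
    since multiple associations may have the same name
   Term 2 (step 2), 2 (C_{ar} + C_t) \times \frac{N}{Q}:
       The cost is doubled since we retrieve the two topics on both sides of
        the association
       We approximate the number of associations with the specified name by
        the average number of associations per unique name, N/Q

   [Figure: 'A Study in Scarlet' - write - 'Conan Doyle', with step 1 at the association and step 2 at both topics]
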
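Cost estimation formula for the topic route

   C_{topic\_route} = C_t \times \frac{M}{2} + (C_{ar} + C_a) \times \frac{2N}{M} + C_{ar} \times \frac{2N}{MQ}

   (M: the number of topic objects, N: the number of association objects,
    Q: the number of unique association names)

   Term 1 (step 1), C_t \times \frac{M}{2}: M/2 is the average number of topic
    retrievals (note that each topic must have a unique name)
   Term 2 (step 2), (C_{ar} + C_a) \times \frac{2N}{M}: 2N/M is the average
    number of associations per topic
   Term 3 (step 3), C_{ar} \times \frac{2N}{MQ}: 2N/(MQ) is the average number of
    associations that have the name specified by the query

   [Figure: 'A Study in Scarlet' - write - 'Conan Doyle', with step 1 at 'Conan Doyle', step 2 at the association and step 3 at 'A Study in Scarlet']
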
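The two cost formulae (association route and topic route) can be evaluated side by side to pick a plan. The sketch below follows our reading of the formulae on the slides; the function names and the sample values of N, M, Q and of the unit costs are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def assoc_route_cost(n, q, c_a, c_ar, c_t):
    """C_assoc_route = C_a * N + 2 * (C_ar + C_t) * N / Q"""
    return c_a * n + 2 * (c_ar + c_t) * n / q


def topic_route_cost(n, m, q, c_a, c_ar, c_t):
    """C_topic_route = C_t * M/2 + (C_ar + C_a) * 2N/M + C_ar * 2N/(M*Q)"""
    return c_t * m / 2 + (c_ar + c_a) * 2 * n / m + c_ar * 2 * n / (m * q)


def choose_plan(n, m, q, c_a, c_ar, c_t):
    """Return the name of the cheaper route together with both estimates."""
    costs = {
        "association route": assoc_route_cost(n, q, c_a, c_ar, c_t),
        "topic route": topic_route_cost(n, m, q, c_a, c_ar, c_t),
    }
    return min(costs, key=costs.get), costs


# Hypothetical topic map: 100 associations, 40 topics, 5 unique association names.
# Hypothetical unit costs: association 3, association role 1, topic 5.
plan, costs = choose_plan(n=100, m=40, q=5, c_a=3, c_ar=1, c_t=5)
# association route: 3*100 + 2*(1+5)*100/5 = 540
# topic route:       5*40/2 + (1+3)*2*100/40 + 1*2*100/(40*5) = 121
```

With these made-up numbers the topic route is cheaper, mainly because the association route pays C_a for every one of the N associations.
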
Experiment
   In order to demonstrate our method, we applied our
    technique to TOME
       TOME is a prototype topic map database developed by the authors


   As target topic maps, we selected the following two, which have
    different sizes
       Rampo Edogawa* topic map
           # of topics: 29 (his name, his works and his hometown)
           # of associations: 15 (his works and his hometown)
       Pokemon topic map
           # of topics: 174 (Pokemon names and their attributes)
           # of associations: 432 (evolutionary and attribute relationships)

                            *Rampo Edogawa is a famous mystery story writer in Japan.

Evaluation of cost estimation formulae
   In order to evaluate our cost estimation formulae, we
    measured the execution time of a query and compared it with the
    tendency of the estimated cost

                     We can see the tendency:
      the lower the estimated cost, the shorter the execution time

                      Average time of query execution      Evaluated cost for each query
                      (nano sec)                           execution plan
   Topic map          Association route    Topic route     Association route    Topic route
   Rampo Edogawa      31              <    157             133.2           <    164.0
   Pokemon            297             >    31              2533            >    697.7

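The tendency in the table can be checked mechanically: for each topic map, the route with the lower evaluated cost should also be the route with the shorter measured time. The snippet below uses only the values from the table; the helper name cheapest is ours.

```python
# Measured average execution time and evaluated cost per route, from the table.
results = {
    "Rampo Edogawa": {
        "time": {"association route": 31, "topic route": 157},
        "cost": {"association route": 133.2, "topic route": 164.0},
    },
    "Pokemon": {
        "time": {"association route": 297, "topic route": 31},
        "cost": {"association route": 2533, "topic route": 697.7},
    },
}


def cheapest(values):
    """Name of the route with the smallest value."""
    return min(values, key=values.get)


# For each topic map: does the plan chosen by cost match the fastest plan?
agreement = {
    name: cheapest(r["cost"]) == cheapest(r["time"])
    for name, r in results.items()
}
chosen = {name: cheapest(r["cost"]) for name, r in results.items()}
```

For both topic maps the cost-based choice matches the faster route: the association route for Rampo Edogawa and the topic route for Pokemon.
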
Conclusion
   We proposed an optimization technique based on the
    estimation of execution cost
       We showed that there may be more than one way to retrieve the
        same objects
       We defined cost estimation formulae for the retrieval cost of each
        plan


   We evaluated our optimization technique
       The result of our experiment shows that the retrieval time tends to be
        proportional to the object size
       We can also see the tendency that the estimated cost is small when
        the execution time is short

Thank you for your kind attention




The effect of buffers
      If an object to be loaded already exists in memory,
       a buffer shortens the retrieval time
          the cost estimated by the formulae needs to be modified (reduced)
           because of the effect of buffers
      In our target query, there are two cases in which the buffer is
       used:
          A topic that already exists in memory is loaded from the buffer
          A topic for an association name that already exists in memory is
           also loaded from the buffer

   [Figure: 'Conan Doyle' connected by 'Write' to 'The Sign of Four' and 'A Study in Scarlet']

The coefficients of buffer
   In our target query, we need two coefficients:
       For retrieval of topics
             \frac{M}{2N} + r \left( 1 - \frac{M}{2N} \right)
           M/2N: the probability that the topic does not exist in the buffer

       For retrieval of topics for the association names
             \frac{Q}{N} + r \left( 1 - \frac{Q}{N} \right)
           Q/N: the probability that the topic for the association names does
            not exist in the buffer

   r: the effective retrieval ratio of cost for buffer
   N: the number of association objects
   M: the number of topic objects
   Q: the number of unique association names

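Both coefficients have the same shape: a full-cost share for objects that are not yet in the buffer plus a share discounted by r for objects that are. A minimal sketch follows, with hypothetical function names; clamping the miss probability to 1 is our own assumption for topic maps where M/2N or Q/N would exceed 1, and r = 0.1 is a made-up value.

```python
def buffer_coefficient(p_miss, r):
    """p_miss + r * (1 - p_miss): full cost on a buffer miss, ratio r on a hit."""
    if not 0.0 <= p_miss <= 1.0:
        raise ValueError("p_miss must be a probability")
    return p_miss + r * (1.0 - p_miss)


def topic_coefficient(m, n, r):
    """Coefficient for retrieval of topics, with miss probability M / 2N."""
    return buffer_coefficient(min(1.0, m / (2 * n)), r)  # clamp is our assumption


def assoc_name_coefficient(q, n, r):
    """Coefficient for topics used as association names, with miss probability Q / N."""
    return buffer_coefficient(min(1.0, q / n), r)  # clamp is our assumption


# Pokemon topic map sizes from the experiment (M = 174, N = 432);
# r = 0.1 is a hypothetical buffer ratio.
pokemon_topic_coeff = topic_coefficient(m=174, n=432, r=0.1)
```

With r = 1 the buffer brings no benefit and the coefficient is 1; the smaller r is, the closer the coefficient gets to the miss probability itself.
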
The modified cost estimation formulae
     Taking the buffering effect into consideration, we modify the
      cost estimation formulae as follows
         The contribution of loading topic name objects is also taken into
          consideration

   C_{assoc\_route} = \{ C_a + \gamma (C_t + C_{tn}) \} N + 2 \{ C_{ar} + \gamma (C_t + C_{tn}) \} \frac{N}{Q}

   C_{topic\_route} = (C_t + C_{tn}) \frac{M}{2} + \{ C_{ar} + C_a + \gamma (C_t + C_{tn}) \} \frac{2N}{M} + \{ C_{ar} + \gamma (C_t + C_{tn}) \} \frac{2N}{MQ}

   (\gamma: the buffer coefficient; C_{tn}: the retrieval cost of topic name objects)

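The modified formulae can be sketched in code as below. Which of the two buffer coefficients each γ stands for is not stated on this slide, so the assignment here is our assumption: the coefficient based on Q/N for topics that serve as association names, and the one based on M/2N for topics that play roles. The function names, the clamp to 1 and all sample values are hypothetical.

```python
def gamma(p_miss, r):
    """Buffer coefficient: p_miss + r * (1 - p_miss), with p_miss clamped to 1."""
    p = min(1.0, p_miss)  # clamp is our assumption
    return p + r * (1.0 - p)


def assoc_route_cost(n, m, q, r, c_a, c_ar, c_t, c_tn):
    g_name = gamma(q / n, r)         # assumption: topics used as association names
    g_topic = gamma(m / (2 * n), r)  # assumption: topics playing roles
    return ((c_a + g_name * (c_t + c_tn)) * n
            + 2 * (c_ar + g_topic * (c_t + c_tn)) * n / q)


def topic_route_cost(n, m, q, r, c_a, c_ar, c_t, c_tn):
    g_name = gamma(q / n, r)
    g_topic = gamma(m / (2 * n), r)
    return ((c_t + c_tn) * m / 2
            + (c_ar + c_a + g_name * (c_t + c_tn)) * 2 * n / m
            + (c_ar + g_topic * (c_t + c_tn)) * 2 * n / (m * q))


# Hypothetical sizes and unit costs, for illustration only.
sizes = dict(n=100, m=40, q=5)
unit = dict(c_a=3, c_ar=1, c_t=5, c_tn=3)

assoc_no_buffer = assoc_route_cost(r=1.0, **sizes, **unit)
topic_no_buffer = topic_route_cost(r=1.0, **sizes, **unit)
assoc_buffered = assoc_route_cost(r=0.5, **sizes, **unit)
topic_buffered = topic_route_cost(r=0.5, **sizes, **unit)
```

Setting r = 1 switches the buffer effect off, which gives a convenient baseline; any r below 1 lowers both estimates.
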
Cost estimation formula for the association route
        We define the cost estimation formula as follows

   C_1 = \{ C_a + \gamma (C_t + C_{tn}) \} N + 2 \{ C_{ar} + \gamma (C_t + C_{tn}) \} \frac{N}{Q}

   Buffer coefficients shown with the formula:
         \frac{Q}{N} + r \left( 1 - \frac{Q}{N} \right)
         \frac{M}{2N} + r \left( 1 - \frac{M}{2N} \right)

   Retrievals covered by the formula:
       Retrieval of TopicMap objects
       Retrieval of Association objects
           Retrieval of Topic objects that are defined as the Association name
           Retrieval of TopicName objects that are defined as the Association name
       Retrieval of AssociationRole objects
       Retrieval of Topic objects
           Retrieval of TopicName objects that are defined as the Topic name

   Notes:
       TMDM permits the redundant existence of multiple associations that
        have the same name
       We assume that the association roles are uniformly assigned to
        associations

   N: the number of association objects
   M: the number of topic objects
   Q: the number of unique association names

The accurate cost estimation formula for the association route

   C_{assoc\_route} = \{ C_a + \gamma (C_t + C_{tn}) \} N + 2 \{ C_{ar} + \gamma (C_t + C_{tn}) \} \frac{N}{Q}

       First term: we have to consider the retrieval cost of topic and topic
        name objects and the effect of the buffer
       Second term: we have to consider the retrieval cost of topic name
        objects and the effect of the buffer

   Formula without these contributions:

   C_{assoc\_route} = C_a \times N + 2 (C_{ar} + C_t) \times \frac{N}{Q}

   Buffer coefficients:
         \frac{Q}{N} + r \left( 1 - \frac{Q}{N} \right)
         \frac{M}{2N} + r \left( 1 - \frac{M}{2N} \right)

   N: the number of association objects
   M: the number of topic objects
   Q: the number of unique association names
   C_a: the retrieval cost of association objects
   C_{ar}: the retrieval cost of association role objects
   C_t: the retrieval cost of topic objects
   C_{tn}: the retrieval cost of topic name objects

Cost estimation formula for the topic route
      We define the cost estimation formula as follows

   C_2 = (C_t + C_{tn}) \frac{M}{2} + \{ C_{ar} + C_a + \gamma (C_t + C_{tn}) \} \frac{2N}{M} + \{ C_{ar} + \gamma (C_t + C_{tn}) \} \frac{2N}{MQ}

   Retrievals covered by the formula:
       Retrieval of TopicMap objects
       Term 1:
           Retrieval of Topic objects
           Retrieval of TopicName objects that are defined as the Topic name
       Term 2:
           Retrieval of AssociationRole objects
           Retrieval of Association objects
           Retrieval of Topic objects that are defined as the Association name
           Retrieval of TopicName objects that are defined as the Association name
       Term 3:
           Retrieval of AssociationRole objects
           Retrieval of Topic objects
           Retrieval of TopicName objects that are defined as the Topic name

   Notes:
       M/2: TMDM permits the existence of only one topic that has the same name
       2N/M: regarding the topic map as a graph, this is equal to the average
        degree
       2N/(MQ): we assume that the association roles are uniformly assigned to
        associations

The accurate cost estimation formula for the topic route

   C_{topic\_route} = (C_t + C_{tn}) \frac{M}{2} + \{ C_{ar} + C_a + \gamma (C_t + C_{tn}) \} \frac{2N}{M} + \{ C_{ar} + \gamma (C_t + C_{tn}) \} \frac{2N}{MQ}

       First term: we have to consider the retrieval cost of topic name objects
       Second term: we have to consider the retrieval cost of topic objects and
        topic name objects and the effect of the buffer
       Third term: we have to consider the retrieval cost of topic name objects
        and the effect of the buffer

   Formula without these contributions:

   C_{topic\_route} = C_t \times \frac{M}{2} + (C_{ar} + C_a) \times \frac{2N}{M} + C_{ar} \times \frac{2N}{MQ}

   Buffer coefficients:
         \frac{Q}{N} + r \left( 1 - \frac{Q}{N} \right)
         \frac{M}{2N} + r \left( 1 - \frac{M}{2N} \right)

   N: the number of association objects
   M: the number of topic objects
   Q: the number of unique association names
   C_a: the retrieval cost of association objects
   C_{ar}: the retrieval cost of association role objects
   C_t: the retrieval cost of topic objects
   C_{tn}: the retrieval cost of topic name objects

Result - Cost estimation of an object of each class
             We can see a similar tendency between the retrieval
                           time and the object size

   Topic map       Object            Retrieval time   Normalized retrieval time   Object size   Normalized object size
                                     (nano sec)       (association role = 1)      (byte)        (association role = 1)
   Rampo Edogawa   topic             969200           3.34                        608           4.75
                   topicname         496700           1.71                        376           2.94
                   associationrole   289900           1                           128           1
                   association       562600           1.94                        376           2.94
   Pokemon         topic             1053000          5.5                         608           4.75
                   topicname         501600           2.62                        376           2.94
                   associationrole   191400           1                           128           1
                   association       577700           3.02                        376           2.94

Retrieval cost of each object
   We measured the retrieval time and the object size of each
    object
       The result tells us that the retrieval time is almost proportional to the
        object size
   Based on this, we define the cost as an object size scale factor
    (the ratio of the object size to that of association role objects)

                    We can see a similar tendency between the
                        retrieval time and the object size

   Topic map   Object                    Normalized retrieval time        Object size scale factor
                                         (association role object = 1)
   Pokemon     Topic object              5.5                              4.75
               Topic name object         2.62                             2.94
               Association role object   1                                1
               Association object        3.02                             2.94

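The scale factors and the normalized retrieval times follow from dividing each value by the value for association role objects. The sketch below recomputes them from the object sizes and the Pokemon retrieval times reported in the results; the function name normalize is ours.

```python
# Object sizes in bytes and Pokemon retrieval times in nanoseconds, as reported.
object_size = {"topic": 608, "topicname": 376, "associationrole": 128, "association": 376}
retrieval_time = {"topic": 1053000, "topicname": 501600,
                  "associationrole": 191400, "association": 577700}


def normalize(values, base="associationrole"):
    """Divide every value by the value of the base object, rounded to 2 places."""
    return {name: round(v / values[base], 2) for name, v in values.items()}


scale_factor = normalize(object_size)        # used as the unit costs
normalized_time = normalize(retrieval_time)  # what the factors should approximate
```

The scale factors come out as 4.75, 2.94, 1 and 2.94, against normalized times of 5.5, 2.62, 1 and 3.02, which is the similar tendency the slide refers to.
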
Future perspective
   We will apply our method to other topic maps that are much
    larger
       Our target topic maps have fewer than 1000 topics
       We need to confirm the universality of the cost estimation formulae by
        evaluating various topic maps


   We will develop a mechanism to measure the size of objects
    in a topic map
       Since the size of objects depends on each topic map, we have to
        measure it to set cost values adequate for evaluating execution plans

Reference
   M. Naito: An Introduction to Topic Maps. Tokyo Denki University
    Press, 2006.
   Yuki Kuribara, Takeshi Hosoya, Masaomi Kimura: TOME: The
    Topic Map Database Extended, 2009.
   Ontopia: tolog Language tutorial.
    http://www.ontopia.net/
   ISO/IEC JTC1/SC34: Topic Maps - Data Model.
    http://www.isotopicmaps.org/sam/sam-model/
   Pokemon Topic Map.
    http://www.ontopia.net/omnigator/models/topicmap_complete.jsp?tm=pokemon.ltm
   Pajek. http://vlado.fmf.uni-lj.si/pub/networks/pajek/

30                          Data Engineering Lab            2010/10/6

More Related Content

Viewers also liked

Financial management project
Financial management projectFinancial management project
Financial management projectsukesh gowda
 
MIS: Project Management Systems
MIS: Project Management SystemsMIS: Project Management Systems
MIS: Project Management SystemsJonathan Coleman
 
Sustainable Infrastructure
Sustainable InfrastructureSustainable Infrastructure
Sustainable InfrastructurePhil Clark
 
Green Buildings & Sustainable Infrastructure
Green Buildings & Sustainable InfrastructureGreen Buildings & Sustainable Infrastructure
Green Buildings & Sustainable InfrastructureOSAEDA
 
Advantages and disadvantages of technology
Advantages and disadvantages of technologyAdvantages and disadvantages of technology
Advantages and disadvantages of technologyregine isabedra
 
Functional information system
Functional  information systemFunctional  information system
Functional information systemamazing19
 
Building a Project Management Information System with SharePoint
Building a Project Management Information System with SharePointBuilding a Project Management Information System with SharePoint
Building a Project Management Information System with SharePointASPE, Inc.
 
MIS Presentation
MIS PresentationMIS Presentation
MIS PresentationDhiren Gala
 
Financial management ppt
Financial management pptFinancial management ppt
Financial management pptRanal Nair
 
Management Information System (Full Notes)
Management Information System (Full Notes)Management Information System (Full Notes)
Management Information System (Full Notes)Harish Chand
 
Environmental Impact Assessment
Environmental Impact AssessmentEnvironmental Impact Assessment
Environmental Impact AssessmentPrithvi Ghag
 
Environmental Impact Assessment
Environmental Impact AssessmentEnvironmental Impact Assessment
Environmental Impact AssessmentNigel Gardner
 
Management Information System (MIS)
Management Information System (MIS)Management Information System (MIS)
Management Information System (MIS)Navneet Jingar
 

Viewers also liked (18)

Financial management project
Financial management projectFinancial management project
Financial management project
 
MIS: Project Management Systems
MIS: Project Management SystemsMIS: Project Management Systems
MIS: Project Management Systems
 
Sustainable Infrastructure
Inquiry Optimization Technique for a Topic Map Database

  • 1. Inquiry Optimization Technique for a Topic Map Database Yuki Kuribara (Graduate School of Engineering, Shibaura Institute of Technology) Masaomi Kimura (Information Engineering, Shibaura Institute of Technology)
  • 2. Contents: Background, Research contents, Experiment, Conclusion. (Data Engineering Lab, 2010/10/6)
  • 3. Topic maps. Many kinds of topic maps have recently been created, for example for web portal sites and for application development. When we target large topic maps, we need to construct databases for them, since a database can deal with data larger than the size of physical memory. [Figure: part of the data is out of memory, the rest is in memory.]
  • 4. The role of the database. Database systems should take responsibility for managing the information in topic maps: query optimization, transaction management, and physical data structure hiding. [Figure: a query reaches the topic map information through the database system, which provides query optimization, transaction management, and physical data structure hiding.]
  • 5. The physical data model for databases. We propose using the object-oriented model for the database. There are several options for the data model; the relational model (tables) and the object-oriented model are the ones mainly used in topic map databases. When we traverse a topic map to retrieve information, an object-oriented model, unlike a relational model, does not need to join tables multiple times. [Figure: a relational model (tables) next to an object-oriented model (Object A referring to Object B).]
  • 6. The logical data model for databases. We assume the topic map data structure defined by the Topic Maps Data Model (TMDM), since topic maps should follow TMDM. The data model consists of seven types of information items and 19 types of named properties. We implemented these items as classes whose instances hold references to the corresponding information item objects. [Class diagram: a TopicMap (parent) has 0..* associations and 0..* topics; an Association (parent) has 0..* roles (AssociationRole); each AssociationRole has exactly one player Topic, and a Topic is referred to by 0..* roles; a Topic (parent) has 0..* topicNames (TopicName).]
  • 7. The possibility of multiple retrieval routes. The database system needs to select the most suitable retrieval route (query optimization). When we retrieve information from a topic map, there may be more than one way to retrieve the same objects, and we can retrieve objects efficiently by choosing an appropriate search method. [Class diagram: the same TopicMap, Association, AssociationRole, Topic and TopicName classes as on slide 6.]
  • 8. Query optimization. The database should take responsibility for query optimization. The database system needs to estimate a suitable execution plan; without query optimization, retrieval may take a very long time. Although some topic map database systems exist, they do not seem to take optimization into consideration.
  • 9. Objective. We propose an optimization technique based on the estimation of execution cost. In this presentation, we focus on retrieving the topic objects that are connected to a particular topic by a specific association, e.g. "What did Conan Doyle write?" [Figure: the intended topic (A Study in Scarlet) is connected to a particular topic (Conan Doyle) by a specific association (write); the particular topic and the association are specified in the query.]
  • 10. Retrieval plan - the association route. Example: "What did Conan Doyle write?" Step 1: we search the association objects 'write'. Step 2: we search the topic object 'Conan Doyle' and find the intended topic objects (such as A Study in Scarlet).
  • 11. Retrieval plan - the topic route. Example: "What did Conan Doyle write?" Step 1: we search the topic object 'Conan Doyle'. Step 2: we then search the association objects 'write'. Step 3: we find the intended topics (such as A Study in Scarlet) referred to by the association role objects.
  • 12. Estimation of execution cost. We define estimation formulae for the retrieval cost of each plan. The system has to choose the most suitable plan, so it is necessary to define a cost that effectively estimates the retrieval time (cost estimation). [Figure: a query can reach the topic map information by Route A (cost: 10) or Route B (cost: 100).]
  • 13. Cost of objects - definition of cost. We measured the total execution time of retrieval and the retrieval time of objects. Object retrieval accounts for more than 99% of the processing time (all other processing takes less than 1%), so measuring the time to retrieve objects is enough to evaluate the cost of query processing.
      Route             | Execution time of retrieval (A) (nano sec) | Retrieval time of objects (B) (nano sec) | Ratio of object retrieval time (B/A)
      Association route | 6.025×10^8                                 | 5.991×10^8                               | 99.44%
      Topic route       | 1.035×10^8                                 | 1.033×10^8                               | 99.81%
  • 14. Cost estimation formula for the association route.
      $$C_{\mathrm{assoc\_route}} = C_a \times N + (2C_{ar} + C_t) \times \frac{N}{Q}$$
      Here C_a, C_ar and C_t are the retrieval costs of association, association role and topic objects, N is the number of association objects, and Q is the number of unique association names. Step 1 (first term): we need to retrieve all N associations, since multiple associations may have the same name. Step 2 (second term): the association role cost is doubled since we retrieve the two topics on both sides of the association. We approximate the number of associations with the specified name by N/Q, the average number of associations per unique name.
  • 15. Cost estimation formula for the topic route.
      $$C_{\mathrm{topic\_route}} = C_t \times \frac{M}{2} + (C_{ar} + C_a) \times \frac{2N}{M} + C_{ar} \times \frac{2N}{MQ}$$
      Here M is the number of topic objects. Step 1 (first term): M/2 is the average number of topic retrievals needed to find the specified topic (note that each topic must have a unique name). Step 2 (second term): 2N/M is the average number of associations per topic. Step 3 (third term): 2N/(MQ) is the average number of associations that have the name specified by the query.
  • 16. Experiment. To demonstrate our method, we applied our technique to TOME, a prototype topic map database developed by the authors. As target topic maps, we selected the following two, which differ in size. Rampo Edogawa topic map: 29 topics (his name, his works and his hometown) and 15 associations (his works and his hometown). Pokemon topic map: 174 topics (Pokemon names and their attributes) and 432 associations (evolutionary and attribute relationships). Rampo Edogawa is a famous mystery writer in Japan.
  • 17. Evaluation of cost estimation formulae. To evaluate our cost estimation formulae, we measured the execution time of a query and compared it with the tendency of the estimated cost. We can see the tendency that the lower the estimated cost is, the shorter the execution time is.
      Topic map               | Average query execution time (nano sec) | Evaluated cost of each query execution plan
                              | Association route | Topic route         | Association route | Topic route
      Rampo Edogawa topic map | 31                | 157                 | 133.2             | 164.0
      Pokemon topic map       | 297               | 31                  | 2533              | 697.7
  • 18. Conclusion. We proposed an optimization technique based on the estimation of execution cost: we showed that there can be more than one way to retrieve the same objects, and we defined cost estimation formulae for the retrieval cost of each plan. We evaluated our optimization technique: the results of our experiment show a proportional tendency between the retrieval time and the object size, and the estimated costs tend to be small when the execution time is short.
  • 19. Thank you for your kind attention.
  • 20. The effect of buffers. If an object that has to be loaded already exists in memory, a buffer shortens the retrieval time, so the cost estimated by the formulae needs to be modified (reduced) to reflect the effect of buffers. In our target query, the buffer is used in two cases: a topic that already exists in memory is loaded from the buffer, and the topic for an association name that already exists in memory is also loaded from the buffer. [Figure: Conan Doyle is connected by Write associations to both The Sign of Four and A Study in Scarlet.]
  • 21. The coefficients of buffer. In our target query, we need two coefficients.
      For retrieval of a topic:
      $$\alpha = \frac{M}{2N} + r\left(1 - \frac{M}{2N}\right)$$
      where M/(2N) is the probability that the topic does not exist in the buffer.
      For retrieval of the topic for an association name:
      $$\beta = \frac{Q}{N} + r\left(1 - \frac{Q}{N}\right)$$
      where Q/N is the probability that the topic for the association name does not exist in the buffer.
      r: the effective retrieval cost ratio for the buffer; N: the number of association objects; M: the number of topic objects; Q: the number of unique association names.
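  • 22. The modified cost estimation formulae. Taking the buffering effect into consideration, we modify the cost estimation formulae as follows. The contribution of loading topic name objects (C_tn) is also taken into consideration. Here α is the buffer coefficient for retrieval of a topic and β is the buffer coefficient for retrieval of the topic for an association name.
      $$C_{\mathrm{assoc\_route}} = \left(C_a + \beta\,(C_t + C_{tn})\right) N + \left(2C_{ar} + \alpha\,(C_t + C_{tn})\right) \frac{N}{Q}$$
      $$C_{\mathrm{topic\_route}} = (C_t + C_{tn})\,\frac{M}{2} + \left(C_{ar} + C_a + \beta\,(C_t + C_{tn})\right) \frac{2N}{M} + \left(C_{ar} + \alpha\,(C_t + C_{tn})\right) \frac{2N}{MQ}$$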
  • 23. Cost estimation formula for the association route
      • We define the cost estimation formula as follows:
            $C_1 = \left(C_a + \beta (C_t + C_{tn})\right) N + 2 \left(C_{ar} + \alpha (C_t + C_{tn})\right) \frac{N}{Q}$
      • The retrieval steps covered by the formula are, in order:
          • Retrieval of TopicMap objects
          • Retrieval of Association objects
          • Retrieval of Topic objects that are defined as the association name
          • Retrieval of TopicName objects that are defined as the association name
          • Retrieval of AssociationRole objects
          • Retrieval of Topic objects
          • Retrieval of TopicName objects that are defined as the topic name
      • TMDM permits multiple associations that have the same name.
      • We assume that the association roles are uniformly assigned to associations.
      • Buffer coefficients: $\alpha = \frac{M}{2N} + r \left(1 - \frac{M}{2N}\right)$, $\beta = \frac{Q}{N} + r \left(1 - \frac{Q}{N}\right)$
      • N: the number of association objects; M: the number of topic objects; Q: the number of unique association names
  • 24. The accurate cost estimation formula for the association route
      • Accurate formula:
            $C_{\mathrm{assoc\_route}} = \left(C_a + \beta (C_t + C_{tn})\right) N + 2 \left(C_{ar} + \alpha (C_t + C_{tn})\right) \frac{N}{Q}$
      • Simplified formula:
            $C_{\mathrm{assoc\_route}} = C_a N + 2 (C_{ar} + C_t) \frac{N}{Q}$
      • Compared with the simplified formula:
          • In the first term, we have to consider the retrieval cost of topic and topic name objects and the effect of the buffer.
          • In the second term, we have to consider the retrieval cost of topic name objects and the effect of the buffer.
      • Buffer coefficients: $\alpha = \frac{M}{2N} + r \left(1 - \frac{M}{2N}\right)$, $\beta = \frac{Q}{N} + r \left(1 - \frac{Q}{N}\right)$
      • Ca: the retrieval cost of association objects
      • Car: the retrieval cost of association role objects
      • Ct: the retrieval cost of topic objects
      • Ctn: the retrieval cost of topic name objects
      • N: the number of association objects; M: the number of topic objects; Q: the number of unique association names
  • 25. Cost estimation formula for the topic route
      • We define the cost estimation formula as follows:
            $C_2 = (C_t + C_{tn}) \frac{M}{2} + \left(C_{ar} + C_a + \beta (C_t + C_{tn})\right) \frac{2N}{M} + \left(C_{ar} + \alpha (C_t + C_{tn})\right) \frac{2N}{MQ}$
      • The retrieval steps covered by the formula are, in order:
          • Retrieval of TopicMap objects
          • Retrieval of Topic objects and of TopicName objects that are defined as the topic name
          • Retrieval of AssociationRole objects
          • Retrieval of Association objects
          • Retrieval of Topic objects and TopicName objects that are defined as the association name
          • Retrieval of AssociationRole objects
          • Retrieval of Topic objects and of TopicName objects that are defined as the topic name
      • TMDM permits only one topic name that has the same name.
      • Regarding the topic map as a graph, $\frac{2N}{M}$ is equal to the average degree.
      • We assume that the association roles are uniformly assigned to associations.
  • 26. The accurate cost estimation formula for the topic route
      • Accurate formula:
            $C_{\mathrm{topic\_route}} = (C_t + C_{tn}) \frac{M}{2} + \left(C_{ar} + C_a + \beta (C_t + C_{tn})\right) \frac{2N}{M} + \left(C_{ar} + \alpha (C_t + C_{tn})\right) \frac{2N}{MQ}$
      • Simplified formula:
            $C_{\mathrm{topic\_route}} = C_t \frac{M}{2} + (C_{ar} + C_a) \frac{2N}{M} + C_{ar} \frac{2N}{MQ}$
      • Compared with the simplified formula:
          • In the first term, we have to consider the retrieval cost of topic name objects.
          • In the second term, we have to consider the retrieval cost of topic objects and topic name objects and the effect of the buffer.
          • In the third term, we have to consider the retrieval cost of topic objects and topic name objects and the effect of the buffer.
      • Buffer coefficients: $\alpha = \frac{M}{2N} + r \left(1 - \frac{M}{2N}\right)$, $\beta = \frac{Q}{N} + r \left(1 - \frac{Q}{N}\right)$
      • Ca: the retrieval cost of association objects
      • Car: the retrieval cost of association role objects
      • Ct: the retrieval cost of topic objects
      • Ctn: the retrieval cost of topic name objects
      • N: the number of association objects; M: the number of topic objects; Q: the number of unique association names
  • 27. Result: cost estimation of an object of each class
      • We can see a similar tendency between the retrieval time and the object size.
      • Normalized values are relative to the association role object (= 1).

      Topic map                Object            Retrieval time (ns)  Normalized time  Object size (bytes)  Normalized size
      Rampo Edogawa Topic Map  topic                          969200             3.34                  608             4.75
      Rampo Edogawa Topic Map  topicname                      496700             1.71                  376             2.94
      Rampo Edogawa Topic Map  associationrole                289900             1                     128             1
      Rampo Edogawa Topic Map  association                    562600             1.94                  376             2.94
      Pokemon Topic Map        topic                         1053000             5.5                   608             4.75
      Pokemon Topic Map        topicname                      501600             2.62                  376             2.94
      Pokemon Topic Map        associationrole                191400             1                     128             1
      Pokemon Topic Map        association                    577700             3.02                  376             2.94
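The normalized retrieval times in this slide can be reproduced by dividing each measured time by the time of the association role object in the same topic map. The sketch below does that for both maps; the measured times come from the slide, while the dictionary layout and the function name `normalize` are illustrative choices.

```python
def normalize(values: dict, baseline: str) -> dict:
    """Divide every value by the value of the baseline entry."""
    base = values[baseline]
    return {name: value / base for name, value in values.items()}


# Measured retrieval times in nanoseconds, as reported on the slide.
retrieval_time_ns = {
    "Rampo Edogawa Topic Map": {
        "topic": 969200,
        "topicname": 496700,
        "associationrole": 289900,
        "association": 562600,
    },
    "Pokemon Topic Map": {
        "topic": 1053000,
        "topicname": 501600,
        "associationrole": 191400,
        "association": 577700,
    },
}

normalized_time = {
    map_name: normalize(times, baseline="associationrole")
    for map_name, times in retrieval_time_ns.items()
}

for map_name, ratios in normalized_time.items():
    for object_name, ratio in ratios.items():
        print(f"{map_name:24s} {object_name:16s} {ratio:.2f}")
```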
  • 28. Retrieval cost of each object
      • We measured the retrieval time and the object size of each object.
      • The result tells us that the retrieval time is almost proportional to the object size.
      • Based on this, we define the cost as an object size scale factor (the ratio of the object size to the size of an association role object).
      • We can see a similar tendency between the retrieval time and the object size.

      Topic map          Object                   Normalized retrieval time (association role = 1)  Object size scale factor
      Pokemon Topic Map  Topic object                                                          5.5                      4.75
      Pokemon Topic Map  Topic name object                                                     2.62                     2.94
      Pokemon Topic Map  Association role object                                               1                        1
      Pokemon Topic Map  Association object                                                    3.02                     2.94
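As a rough illustration of the object size scale factor, the sketch below derives the factors from the object sizes reported for the measured topic maps (608, 376, 128 and 376 bytes) and compares them with the normalized retrieval times of the Pokemon Topic Map. The relative gap is an illustrative measure added here, not one used in the slides, and the variable names are hypothetical.

```python
# Object sizes in bytes, as reported in the measurement results.
object_size_bytes = {
    "topic": 608,
    "topic_name": 376,
    "association_role": 128,
    "association": 376,
}

# Normalized retrieval times of the Pokemon Topic Map (association role = 1).
normalized_time = {
    "topic": 5.5,
    "topic_name": 2.62,
    "association_role": 1.0,
    "association": 3.02,
}


def scale_factors(sizes: dict, unit: str = "association_role") -> dict:
    """Express every object size as a multiple of the unit object's size."""
    return {name: size / sizes[unit] for name, size in sizes.items()}


costs = scale_factors(object_size_bytes)

# Relative gap between the measured time ratio and the size-based cost.
relative_gap = {
    name: (normalized_time[name] - costs[name]) / costs[name]
    for name in costs
}

for name in costs:
    print(f"{name:17s} cost={costs[name]:.2f} gap={relative_gap[name]:+.1%}")
```

The resulting factors can then serve as the values of Ct, Ctn, Car and Ca in the cost estimation formulae.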
  • 29. Future perspective
      • We will apply our method to other, much larger topic maps.
          • The topic maps we targeted have fewer than 1000 topics.
          • We need to confirm the universality of the cost estimation formulae by evaluating them on various topic maps.
      • We will develop a mechanism to measure the size of the objects in a topic map.
          • Since the size of the objects depends on each topic map, we have to measure it to set cost values that are adequate for evaluating an execution plan.
  • 30. Reference
      • M. Naito: An Introduction to Topic Maps. Tokyo Denki University Press, 2006.
      • Yuki Kuribara, Takeshi Hosoya, Masaomi Kimura: TOME: The Topic Map Database Extended, 2009.
      • Ontopia: tolog Language tutorial. http://www.ontopia.net/
      • ISO/IEC JTC1/SC34: Topic Map – Data Model. http://www.isotopicmaps.org/sam/sam-model/
      • Pokemon Topic Map. http://www.ontopia.net/omnigator/models/topicmap_complete.jsp?tm=pokemon.ltm
      • Pajek. http://vlado.fmf.uni-lj.si/pub/networks/pajek/