de.fuberlin.wiwiss.semmf.engine
Class TaxonomicMatcher

java.lang.Object
  extended by de.fuberlin.wiwiss.semmf.engine.TaxonomicMatcher
All Implemented Interfaces:
Matcher

public class TaxonomicMatcher
extends java.lang.Object
implements Matcher

Object providing functionality for calculating the similarity between two given concepts based on their respective positions in an underlying hierarchy.

Version:
1.1 last modified on 29.11.2006
Author:
Radoslaw Oldakowski

Field Summary
private  float[] milestones
          Stores milestone values for all levels of this taxonomy.
private  boolean simInheritance
          If set to true (the default), then sim(queryConcept, resourceConcept) = 1 whenever resourceConcept is a subclass of queryConcept.
private  Taxonomy taxonomy
          Concept taxonomy on which the similarity calculation is based.
 
Constructor Summary
TaxonomicMatcher(Taxonomy t, MilestoneCalculator mc)
          Constructor.
 
Method Summary
private  float calcConSim(BottomConcept qc, BottomConcept rc)
          Calculates concept similarity between two given concepts based on their respective positions in the concept hierarchy.
private  float calcConSim(BottomConcept qc, SuperConcept rc)
          Calculates concept similarity between two given concepts based on their respective positions in the concept hierarchy.
private static float calcConSim(float mC1, float mC2, float mCCP)
          Calculates the similarity of two concepts according to the formula: sim(c1,c2) = 1 - d(c1,c2), where d(c1,c2) = d(c1, ccp) + d(c2, ccp) and ccp is the closest common parent of c1 and c2.
private  float calcConSim(java.lang.String qcURI, java.lang.String rcURI)
          Given URIs of two concepts the similarity between them is calculated based on their respective positions in the concept hierarchy.
private  float calcConSim(SuperConcept qc, BottomConcept rc)
          Calculates concept similarity between two given concepts based on their respective positions in the concept hierarchy.
private  float calcConSim(SuperConcept qc, SuperConcept rc)
          Calculates concept similarity between two given concepts based on their respective positions in the concept hierarchy.
 float calcSim(com.hp.hpl.jena.rdf.model.RDFNode n1, com.hp.hpl.jena.rdf.model.RDFNode n2)
          Calculates the concept similarity of two given RDFNodes from Jena2 framework based on their relative position in the concept hierarchy.
private  float getMilestone(int level)
           
 void setProperty(java.lang.String key, java.lang.String value)
          Sets a matcher parameter that influences the resulting similarity value. Currently supported parameters (with corresponding values): - "simInheritance" (true | false)
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

taxonomy

private Taxonomy taxonomy
Concept taxonomy on which the similarity calculation is based.


milestones

private float[] milestones
Stores milestone values for all levels of this taxonomy. These values are calculated by the MilestoneCalculator passed to the constructor.


simInheritance

private boolean simInheritance
If set to true (the default), then sim(queryConcept, resourceConcept) = 1 whenever resourceConcept is a subclass of queryConcept. This assumption is reasonable because a subclass is always a kind of its superclass. However, there may be cases in which the user wants the actual distance between a subclass and a superclass to be respected in the similarity calculation. In that case, set this parameter to false.
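
The effect of simInheritance can be illustrated with a minimal sketch. The class below is a hypothetical stand-in, not the SemMF implementation: the method name, the flag resourceIsSubclassOfQuery, the sample milestone values, and the assumption that the distance from a concept to its closest common parent equals the difference of their milestone values are all introduced here for illustration only.

```java
/** Hypothetical illustration of the simInheritance switch; not SemMF source code. */
final class SimInheritanceSketch {

    /**
     * Assumption: d(c, ccp) = milestone(ccp) - milestone(c), so that
     * sim = 1 - (d(query, ccp) + d(resource, ccp)).
     */
    static float sim(boolean simInheritance, boolean resourceIsSubclassOfQuery,
                     float mQuery, float mResource, float mCcp) {
        // With inheritance enabled, a subclass counts as a perfect match for its superclass.
        if (simInheritance && resourceIsSubclassOfQuery) {
            return 1.0f;
        }
        return 1.0f - ((mCcp - mQuery) + (mCcp - mResource));
    }

    public static void main(String[] args) {
        float mAnimal = 0.25f; // assumed milestone of the superclass
        float mDog = 0.125f;   // assumed milestone of its subclass

        // Query = Animal, resource = Dog (a subclass of the query concept).
        System.out.println(sim(true, true, mAnimal, mDog, mAnimal));
        // Query = Dog, resource = Animal: the resource is not a subclass of the query.
        System.out.println(sim(true, false, mDog, mAnimal, mAnimal));
        // Inheritance disabled: the actual distance is used in both directions.
        System.out.println(sim(false, true, mAnimal, mDog, mAnimal));
    }
}
```

Under these assumed milestones, the first call yields 1.0 while the other two yield 0.875, which shows why the similarity is asymmetric only while simInheritance is true.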

Constructor Detail

TaxonomicMatcher

public TaxonomicMatcher(Taxonomy t,
                        MilestoneCalculator mc)
Constructor.

Parameters:
t - taxonomy on which the matching will be based
mc - MilestoneCalculator object which determines milestone values for all taxonomy levels
Method Detail

setProperty

public void setProperty(java.lang.String key,
                        java.lang.String value)
                 throws java.lang.IllegalArgumentException
Sets a matcher parameter that influences the resulting similarity value. Currently supported parameters (with corresponding values): - "simInheritance" (true | false)

Parameters:
key - name of the property to be set
value - value of the property to be set
Throws:
java.lang.IllegalArgumentException - if the key or the value is illegal
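
The documented contract of setProperty can be sketched as follows. This is a hypothetical stand-in class written for illustration, not the SemMF source: the class name, the accessor isSimInheritance, the exception messages, and the case-sensitive comparison of keys and values are assumptions.

```java
/** Hypothetical stand-in mirroring the documented setProperty contract; not SemMF source code. */
final class PropertySketch {

    // Documented default: similarity inheritance is switched on.
    private boolean simInheritance = true;

    void setProperty(String key, String value) {
        // "simInheritance" is the only parameter the documentation lists.
        if (!"simInheritance".equals(key)) {
            throw new IllegalArgumentException("Unsupported property: " + key);
        }
        // Assumption: only the exact strings "true" and "false" are accepted.
        if ("true".equals(value)) {
            simInheritance = true;
        } else if ("false".equals(value)) {
            simInheritance = false;
        } else {
            throw new IllegalArgumentException("Illegal value for simInheritance: " + value);
        }
    }

    // Hypothetical accessor, added only so the effect can be observed.
    boolean isSimInheritance() {
        return simInheritance;
    }

    public static void main(String[] args) {
        PropertySketch matcher = new PropertySketch();
        matcher.setProperty("simInheritance", "false");
        System.out.println(matcher.isSimInheritance());
        try {
            matcher.setProperty("simInheritance", "maybe");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```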

calcSim

public float calcSim(com.hp.hpl.jena.rdf.model.RDFNode n1,
                     com.hp.hpl.jena.rdf.model.RDFNode n2)
Calculates the concept similarity of two given RDFNodes from the Jena2 framework based on their relative positions in the concept hierarchy. Both nodes have to be of the Jena type Resource, i.e. have a unique URI. Moreover, they have to be members of the hierarchy tree. WARNING: This method does not check these conditions!

Specified by:
calcSim in interface Matcher
Parameters:
n1 - node from the query graph
n2 - node from the resource graph
Returns:
the similarity of these nodes

calcConSim

private float calcConSim(java.lang.String qcURI,
                         java.lang.String rcURI)
Given the URIs of two concepts, calculates the similarity between them based on their respective positions in the concept hierarchy. Note that sim(queryConcept, resourceConcept) != sim(resourceConcept, queryConcept), unless simInheritance = false. WARNING: This method does not verify whether the given URIs are members of the taxonomy!

Parameters:
qcURI - URI of the query concept
rcURI - URI of the resource concept
Returns:
the similarity of the two concepts

calcConSim

private float calcConSim(BottomConcept qc,
                         BottomConcept rc)
Calculates concept similarity between two given concepts based on their respective positions in the concept hierarchy. Note that sim(queryConcept, resourceConcept) != sim(resourceConcept, queryConcept). Both parameters are BottomConcepts, which may have more than one position in the taxonomy (i.e. have more than one SuperConcept), so this method has to consider all possible positions and return the maximum similarity.

Parameters:
qc - concept from the query graph
rc - concept from the resource graph
Returns:
similarity of the two concepts
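
The "consider all possible positions and return the maximum" step described above can be sketched as a search over every pair of positions. The helper below is hypothetical and not taken from SemMF: the generic type P stands in for a position in the taxonomy (for example a SuperConcept), pairSim is an assumed function that scores a single pair of positions, and similarities are assumed to lie in [0, 1].

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;

/** Hypothetical sketch of taking the maximum similarity over all taxonomy positions. */
final class MaxOverPositionsSketch {

    static <P> float maxSimilarity(List<P> queryPositions,
                                   List<P> resourcePositions,
                                   BiFunction<P, P, Float> pairSim) {
        // Assumption: similarities are never negative, so 0 is a safe starting value.
        float best = 0.0f;
        for (P q : queryPositions) {
            for (P r : resourcePositions) {
                best = Math.max(best, pairSim.apply(q, r));
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy positions: integers whose closeness stands in for closeness in the taxonomy.
        List<Integer> query = Arrays.asList(1, 5);
        List<Integer> resource = Arrays.asList(4, 8);
        float best = maxSimilarity(query, resource,
                (q, r) -> 1.0f - Math.abs(q - r) / 10.0f);
        System.out.println(best); // the closest pair (5, 4) determines the result
    }
}
```

The overloads that take one BottomConcept and one SuperConcept correspond to the case where one of the two lists has a single element.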

calcConSim

private float calcConSim(BottomConcept qc,
                         SuperConcept rc)
Calculates concept similarity between two given concepts based on their respective positions in the concept hierarchy. Note that sim(queryConcept, resourceConcept) != sim(resourceConcept, queryConcept). The first parameter is a BottomConcept, which may have more than one position in the taxonomy (i.e. have more than one SuperConcept), so this method has to consider all possible positions and return the maximum similarity.

Parameters:
qc - concept from the query graph
rc - concept from the resource graph
Returns:
similarity of the two concepts

calcConSim

private float calcConSim(SuperConcept qc,
                         BottomConcept rc)
Calculates concept similarity between two given concepts based on their respective positions in the concept hierarchy. Note that sim(queryConcept, resourceConcept) != sim(resourceConcept, queryConcept). The second parameter is a BottomConcept, which may have more than one position in the taxonomy (i.e. have more than one SuperConcept), so this method has to consider all possible positions and return the maximum similarity.

Parameters:
qc - concept from the query graph
rc - concept from the resource graph
Returns:
similarity of the two concepts

calcConSim

private float calcConSim(SuperConcept qc,
                         SuperConcept rc)
Calculates concept similarity between two given concepts based on their respective positions in the concept hierarchy. Note that sim(queryConcept, resourceConcept) != sim(resourceConcept, queryConcept).

Parameters:
qc - concept from the query graph
rc - concept from the resource graph
Returns:
similarity of the two concepts

calcConSim

private static float calcConSim(float mC1,
                                float mC2,
                                float mCCP)
Calculates the similarity of two concepts according to the formula: sim(c1,c2) = 1 - d(c1,c2), where d(c1,c2) = d(c1, ccp) + d(c2, ccp) and ccp is the closest common parent of c1 and c2.

Parameters:
mC1 - milestone of concept 1
mC2 - milestone of concept 2
mCCP - milestone of the closest common parent of c1 and c2
Returns:
similarity between c1 and c2
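
The formula above can be worked through in a short sketch. The documentation does not state how d(c, ccp) is derived from the milestone parameters; the code assumes d(c, ccp) = milestone(ccp) - milestone(c), which is a common choice in milestone-based similarity measures but is an assumption here, as are the class name and the sample milestone values.

```java
/** Hypothetical sketch of sim(c1,c2) = 1 - (d(c1,ccp) + d(c2,ccp)); not SemMF source code. */
final class MilestoneSimilaritySketch {

    // Assumption: the distance to the closest common parent is the milestone difference.
    static float distanceToParent(float mConcept, float mCcp) {
        return mCcp - mConcept;
    }

    static float calcConSim(float mC1, float mC2, float mCcp) {
        float distance = distanceToParent(mC1, mCcp) + distanceToParent(mC2, mCcp);
        return 1.0f - distance;
    }

    public static void main(String[] args) {
        // Two sibling concepts (milestone 0.125 each) below a common parent (milestone 0.25).
        System.out.println(calcConSim(0.125f, 0.125f, 0.25f));
        // A concept compared with itself: it is its own closest common parent.
        System.out.println(calcConSim(0.125f, 0.125f, 0.125f));
    }
}
```

With these assumed values, the siblings score 1 - (0.125 + 0.125) = 0.75, and a concept compared with itself scores 1.0.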

getMilestone

private float getMilestone(int level)
Parameters:
level - hierarchy level
Returns:
the milestone value for the given hierarchy level of the taxonomy.
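
A per-level milestone table of the kind stored in the milestones field could be filled and read as sketched below. This page does not document how MilestoneCalculator derives its values, so the formula milestone(level) = 0.5 / k^level, the parameter k, the choice of level 0 for the root, and all names in the sketch are assumptions made for illustration.

```java
/** Hypothetical sketch of a per-level milestone table; not SemMF source code. */
final class MilestoneTableSketch {

    // Assumption: milestone(level) = 0.5 / k^level, with the root at level 0 and k > 1.
    static float[] computeMilestones(int levels, double k) {
        if (levels < 1 || k <= 1.0) {
            throw new IllegalArgumentException("levels must be >= 1 and k must be > 1");
        }
        float[] milestones = new float[levels];
        for (int level = 0; level < levels; level++) {
            milestones[level] = (float) (0.5 / Math.pow(k, level));
        }
        return milestones;
    }

    // Lookup with an explicit bounds check instead of an ArrayIndexOutOfBoundsException.
    static float getMilestone(float[] milestones, int level) {
        if (level < 0 || level >= milestones.length) {
            throw new IllegalArgumentException("No milestone for level " + level);
        }
        return milestones[level];
    }

    public static void main(String[] args) {
        float[] milestones = computeMilestones(4, 2.0);
        for (int level = 0; level < milestones.length; level++) {
            System.out.println("level " + level + ": " + getMilestone(milestones, level));
        }
    }
}
```

With k = 2, each level's milestone is half that of the level above it, so deeper concepts lie closer together than concepts near the root.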