Question

G. Li, B. C. Ooi, J. Feng, J. Wang, and L. Zhou. EASE: an effective 3-in-1 keyword search method for unstructured, semi-structured and structured data. In SIGMOD, pages 903-914, 2008. Located in Ares through the Course Reserves on the left-hand navigation.

Please read Section 4 of the following paper, and describe the ranking function of the r-radius Steiner graph. [30 points]

Answer #1

RANKING FUNCTION OF THE R-RADIUS STEINER GRAPH

1. TF.IDF-BASED IR RANKING

Existing systems rank an r-radius graph in two steps:

  • Assign each graph a score using an IR ranking formula
  • Combine the individual scores using an aggregation function

TF.IDF-based IR-style ranking takes into account parameters such as term frequency (tf), inverse document frequency (idf), and normalized document length (ndl). These parameters are calculated as,

To evaluate document relevancy, the three parameters are combined. For a given input keyword ki and a given Steiner graph SG, the equation is given as

where G is r-radius.

The overall score between an input keyword query K and SG is calculated as,

TF.IDF-based ranking is inefficient for semi-structured and structured data, so the paper proposes a ranking function from the DB point of view.
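As a rough illustration of the IR-style score described in this section, here is a minimal Python sketch. The exact formulas are not reproduced in this answer, so the damping of tf, the smoothing of idf, the pivoted length normalization with parameter `s`, the combination `ntf * idf / ndl`, and the summation over keywords are all assumptions in the style of common TF.IDF variants. The function and parameter names (`ir_score`, `ir_score_query`, `n_graphs`, `avg_length`) are hypothetical.

```python
import math

def ir_score(tf, df, n_graphs, length, avg_length, s=0.2):
    """Hypothetical TF.IDF-style score of one keyword in one Steiner graph.

    tf         -- occurrences of the keyword in the graph
    df         -- number of graphs that contain the keyword
    n_graphs   -- total number of graphs
    length     -- number of terms in this graph
    avg_length -- average number of terms per graph
    s          -- length-normalization weight (assumed constant)
    """
    if tf <= 0:
        return 0.0
    ntf = 1.0 + math.log(1.0 + math.log(tf))    # damped term frequency (assumed form)
    idf = math.log((n_graphs + 1) / (df + 1))   # rarer keywords weigh more (assumed form)
    ndl = (1.0 - s) + s * length / avg_length   # longer graphs are penalized (assumed form)
    return ntf * idf / ndl

def ir_score_query(per_keyword_scores):
    """Aggregate per-keyword scores into a query score (summation is an assumption)."""
    return sum(per_keyword_scores)
```

With these assumed forms, a keyword that occurs more often or in fewer graphs scores higher, and the same keyword in a longer-than-average graph scores lower.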

2. STRUCTURAL COMPACTNESS-BASED DB RANKING

The structural compactness score of SG should be large, and it should capture:

  • Structural compactness between content nodes
  • Structural relevancy between input keywords

The longer the path between two content nodes, the lower their relevancy.

All of the paths between two content nodes should be considered. The overall structural compactness between any two content nodes is

Where

  • ni, nj -content nodes
  • - any path between ni, nj
  • -length of

SIM(ni,nj) can be pre-computed and materialized offline, since.

The overall structural compactness can then be calculated by summing up several materialized scores.
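The idea of scoring two content nodes over all of their connecting paths, with longer paths contributing less, can be sketched in Python as follows. The per-path weight `1 / (length + 1) ** 2` is an assumption chosen only because it decreases with path length; the helper names (`simple_path_lengths`, `node_sim`) and the `max_len` cap on path length are hypothetical.

```python
def simple_path_lengths(adj, src, dst, max_len):
    """Yield the edge count of every simple path from src to dst
    that uses at most max_len edges (iterative depth-first search)."""
    stack = [(src, {src}, 0)]
    while stack:
        node, seen, length = stack.pop()
        if node == dst:
            yield length
            continue
        if length == max_len:
            continue
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                stack.append((nxt, seen | {nxt}, length + 1))

def node_sim(adj, ni, nj, max_len):
    """Sum an assumed per-path weight over all simple paths between ni and nj."""
    return sum(1.0 / (length + 1) ** 2
               for length in simple_path_lengths(adj, ni, nj, max_len))
```

For a triangle a-b-c, nodes a and c are linked by a path of length 1 and a path of length 2, so under the assumed weight the score is 1/4 + 1/9. Because the result depends only on the graph and not on the query, it is the kind of value that can be computed once and stored.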

To evaluate the structural relevancy among input keywords, note that a smaller distance between input keywords indicates higher structural relevancy.

So,, where Cki denotes the set of all content nodes that contain ki in SG and |Cki| denotes the number of nodes in Cki.

A larger value of the structural compactness score implies that SG is more relevant and meaningful to K.
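A minimal Python sketch of lifting node-level scores to a keyword pair is given below. It assumes the node-level scores are already materialized in a dictionary, sums them over all pairs of content nodes drawn from Cki and Ckj, and normalizes by the number of distinct content nodes involved. That normalization, the decision to skip a node paired with itself, and the names `keyword_pair_sim` and `node_sim` are assumptions made for illustration.

```python
def keyword_pair_sim(node_sim, c_i, c_j):
    """Structural relevancy of two keywords in one Steiner graph (sketch).

    node_sim -- dict mapping frozenset({ni, nj}) to a precomputed node-level score
    c_i, c_j -- content nodes containing keyword ki and kj respectively
    """
    nodes = set(c_i) | set(c_j)
    if not nodes:
        return 0.0
    total = sum(node_sim.get(frozenset((ni, nj)), 0.0)
                for ni in c_i for nj in c_j
                if ni != nj)             # a node is not paired with itself (assumption)
    return total / len(nodes)            # normalization choice is an assumption
```

Content nodes that are close together produce larger materialized scores, so keywords whose content nodes sit near each other in SG receive a larger value.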

The structural relevancy between any two input keywords can be used to capture the relevancy between all of the input keywords.

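By considering document relevancy from the IR perspective and structural compactness from the DB perspective, a more accurate ranking function for r-radius Steiner graphs can be obtained. Writing Score_IR(ki, SG) for the TF.IDF-based IR score of ki in SG,

Score(<ki, kj>|SG) = Sim(<ki, kj>|SG) * (Score_IR(ki, SG) + Score_IR(kj, SG))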

Score(<ki, kj>|SG) measures the overall relevancy score of <ki, kj> in SG based on the structural compactness/relevancy and IR scores. Note that Sim(<ki, kj>|SG) is taken as the weight of the sum of the two IR scores. A larger Sim(<ki, kj>|SG) means that ki and kj are more relevant w.r.t. SG, and thus the overall score of <ki, kj> in SG is expected to be larger.
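The weighting just described can be sketched in Python. `pair_score` follows the description directly: the structural relevancy of a keyword pair multiplies the sum of the two IR scores. `query_score`, which sums the pair scores over every unordered keyword pair, is an assumption about how the pairwise scores are aggregated for a whole query. All names are hypothetical.

```python
from itertools import combinations

def pair_score(sim, ir_i, ir_j):
    """Structural relevancy used as the weight of the sum of two IR scores."""
    return sim * (ir_i + ir_j)

def query_score(keywords, ir, sim):
    """Sum pair scores over all unordered keyword pairs (aggregation is an assumption).

    ir  -- dict mapping keyword -> IR score in SG
    sim -- dict mapping frozenset({ki, kj}) -> structural relevancy in SG
    """
    return sum(pair_score(sim.get(frozenset(pair), 0.0), ir[pair[0]], ir[pair[1]])
               for pair in combinations(keywords, 2))
```

For example, with IR scores x=1, y=2, z=3 and structural relevancy 0.5 for (x, y) and 0.25 for (x, z), the total is 0.5 * 3 + 0.25 * 4 = 2.5. The pair (y, z) has no recorded relevancy and contributes nothing.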
