[dev] need help to understand content tagger getTags by tag-id.

roman stachura roman at stachura.ch
Thu Aug 18 13:46:34 UTC 2011


On 17.08.2011 19:46, Michael J Rubinsky wrote:
>
> Quoting roman stachura <roman at stachura.ch>:
>
> >> On 16.08.2011 21:49, Michael J Rubinsky wrote:
>>> [...]
>> I am not sure how good an idea it is to limit the inner query.
>> Performance concerns?
>
> Not sure about performance implications of the limit, but this is 
> necessary to enforce the radius. Unless you have some other 
> suggestion? Additionally, I'm not sure if the intention was to use the 
> same value for radius and limit, as it is currently implemented. I see 
> those as two discrete settings. Will look into this as I write new 
> tests for that method.
This inner query needs a LIMIT clause; otherwise its result can grow really
big. It reads from rampage_tagged, and that is where the bulk of the rows are.
In general, keep the temporary table as small as possible.
A bit more information on this:
http://forge.mysql.com/wiki/TagSchemaFAQ#Using_LIMIT_to_Prevent_Wild_Queries

I do not know which limit is appropriate, but the inner and outer limits
can differ.
I tend to set the inner one larger than the outer one.

This limit should be chosen on a statistically sound basis, so that it
yields a solid set of tags given the number of objects and the tag
distribution.

Or we have to look for a "better" solution:

http://dablog.ulcc.ac.uk/wp-content/uploads/2007/12/tagging_folksonomy.pdf

Slide 20:

SELECT t2p2.tag_id, t2.tag_text
FROM (
     SELECT post_id FROM Tags t1
     INNER JOIN Tag2Post
     ON t1.tag_id = Tag2Post.tag_id
     WHERE t1.tag_text = 'beach' LIMIT 10
) AS t2p1
INNER JOIN Tag2Post t2p2
ON t2p1.post_id = t2p2.post_id
INNER JOIN Tags t2
ON t2p2.tag_id = t2.tag_id
GROUP BY t2p2.tag_id LIMIT 10;
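
To make the inner/outer limit idea easier to try out, here is a minimal
sketch that runs the query from the slide with two separate limits. It is
only an illustration, under several assumptions: it uses SQLite through
Python's sqlite3 module as a stand-in for MySQL, the helper names
make_demo_db and related_tags are hypothetical, and the demo rows are made
up. The COUNT(*) column and the two ORDER BY clauses are not on the slide;
they are added here so that the result is deterministic.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory schema with the table names from the slide."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Tags (tag_id INTEGER PRIMARY KEY, tag_text TEXT);
        CREATE TABLE Tag2Post (tag_id INTEGER, post_id INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO Tags VALUES (?, ?)",
        [(1, "beach"), (2, "sun"), (3, "sea"), (4, "sand")],
    )
    # (tag_id, post_id) pairs; made-up demo data.
    conn.executemany(
        "INSERT INTO Tag2Post VALUES (?, ?)",
        [(1, 1), (2, 1), (3, 1),   # post 1: beach, sun, sea
         (1, 2), (2, 2),           # post 2: beach, sun
         (1, 3), (4, 3),           # post 3: beach, sand
         (2, 4)],                  # post 4: sun only
    )
    return conn


def related_tags(conn, tag_text, inner_limit, outer_limit):
    """Tags that share posts with tag_text, most frequent first.

    inner_limit caps how many posts the subquery hands to the join;
    outer_limit caps how many tags are returned. As in the slide's query,
    the searched tag itself is part of the result.
    """
    sql = """
        SELECT t2p2.tag_id, t2.tag_text, COUNT(*) AS uses
        FROM (
            SELECT Tag2Post.post_id
            FROM Tags t1
            INNER JOIN Tag2Post ON t1.tag_id = Tag2Post.tag_id
            WHERE t1.tag_text = ?
            ORDER BY Tag2Post.post_id   -- assumption: added for determinism
            LIMIT ?
        ) AS t2p1
        INNER JOIN Tag2Post t2p2 ON t2p1.post_id = t2p2.post_id
        INNER JOIN Tags t2 ON t2p2.tag_id = t2.tag_id
        GROUP BY t2p2.tag_id, t2.tag_text
        ORDER BY uses DESC, t2p2.tag_id -- assumption: added for determinism
        LIMIT ?
    """
    return conn.execute(sql, (tag_text, inner_limit, outer_limit)).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    # Small inner limit: only the first two 'beach' posts feed the join.
    print(related_tags(conn, "beach", inner_limit=2, outer_limit=10))
    # Larger inner limit, small outer limit: more posts, fewer tags returned.
    print(related_tags(conn, "beach", inner_limit=10, outer_limit=2))
```

With the inner limit at 2, the tag "sand" never shows up, because the only
post carrying it is cut off before the join. That is the trade-off: a small
inner limit keeps the temporary table small, but it also narrows the sample
the tag counts are based on.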


regards roman

>
> Thanks for the feedback.
>


