

Making a Drupal Folksonomy Tag Cloud
Submitted by kentbye on Fri, 2005-06-17 12:44.
Development | Drupal | Folksonomy | Open Source | Tagcloud
I created a tag cloud for my website, and I'd like to see this feature added as a dynamic Drupal module. I thought I'd briefly go through the steps I took, to give a leg up to anyone who wants to code this up in PHP. The hardest part is the algorithm that automatically determines the distribution of font sizes based on the frequency distribution of tags. Below is the distribution that I used to determine the font sizes:
[Figure: tag frequency distribution]
Notice that my tag distribution exhibits some power-law behavior, the Long Tail of the Internet.
The first step to creating a tag cloud in Drupal is to use Morbus Iff's free-tagging patch to go through and tag all of your archived blog posts. This saves your entered folksonomy free tags as regular taxonomy terms.
I then went into phpMyAdmin to dig into two MySQL tables within Drupal: "term_data" & "term_node". I exported all of the data into a CSV file so that I could import it into an Excel spreadsheet.
The data in "term_data" are used to correlate the folksonomy tag "name" with the "tid". The "vid" variable, also in "term_data", is the vocabulary id that can be used to isolate groups of terms into separate tag clouds. In my case, my "Folksonomy Tags" vocabulary "vid" was 4.
The data in "term_node" are used to count the total number of occurrences of a folksonomy tag "tid" across the entire site. I ordered the data according to tid, and created a counter in Excel using four columns in total.
[Figure: Columns]
I imagine these counts could easily be coded up in PHP. I copied the values of 0 (Column D), tid (Column B) & Count (Column D) into a separate column and sorted in the order of 0, tid, & Count. This gives you the total number of occurrences of each folksonomy tag "tid." This frequency will determine the relative font sizes of the tag "name" in the tag cloud.
The next step is to correlate the tid number with the tag "name" using the data from term_data.
The hardest step for dynamically automating this into a Drupal module is determining how to automate the font size distribution based on the frequency of tags. I just plotted my tag distribution in Excel and eyeballed it. I qualitatively determined that I could break up the distribution by dividing the tag frequency total by 10 -- round down -- and then add 1. For example, the New Media tag occurs 53 times: 53 / 10 rounds down to 5, plus 1 gives font size 6. Collaboration = 41 times = font size 5. This algorithm doesn't work in all cases, but it works for now.
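The counting and sizing steps above could be automated roughly as follows. This is a minimal sketch in Python rather than PHP, assuming the term_node rows have already been exported as (nid, tid) pairs and the term_data rows as (tid, vid, name) triples. The function names and the sample rows are hypothetical; only the rule "divide by 10, round down, add 1" and the two example counts come from the post.

```python
from collections import Counter


def count_tags(term_node_rows):
    """Count how many nodes carry each tid (rows are (nid, tid) pairs)."""
    return Counter(tid for _nid, tid in term_node_rows)


def font_size(count):
    """The rule from the post: divide the count by 10, round down, add 1."""
    return count // 10 + 1


def tag_cloud(term_data_rows, term_node_rows, vid):
    """Return sorted (name, count, font size) for each used term in one vocabulary."""
    counts = count_tags(term_node_rows)
    cloud = []
    for tid, term_vid, name in term_data_rows:
        # Keep only terms from the requested vocabulary that are actually used
        if term_vid == vid and counts[tid] > 0:
            cloud.append((name, counts[tid], font_size(counts[tid])))
    return sorted(cloud)


# Hypothetical sample data: vocabulary 4 holds the folksonomy tags
term_data = [(1, 4, "New Media"), (2, 4, "Collaboration"), (3, 2, "Other")]
term_node = [(n, 1) for n in range(53)] + [(n, 2) for n in range(41)] + [(0, 3)]

cloud = tag_cloud(term_data, term_node, vid=4)
for name, count, size in cloud:
    print(name, count, size)
```

With these counts, New Media (53) gets size 6 and Collaboration (41) gets size 5. Any tag with 70 or more occurrences would land above size 7 under this rule, which is one of the cases where it stops working.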
The Drupal font size range seems to be from 1 to 7, which gives 7 possible sections. I'll have to see if any solution comes to mind, but I think I'll pass the baton to someone with a computer science background to figure it out and code it up in PHP for the whole Drupal community.
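One possible way to keep every tag inside the 1-to-7 range is to scale each count relative to the smallest and largest count on the site, instead of dividing by a fixed 10. The Python sketch below is an assumption for illustration only: it is not the algorithm referred to in this post, nor the one used by any Drupal module. The function name and the choice of a logarithmic scale (to flatten the long tail) are hypothetical.

```python
import math


def scaled_font_size(count, min_count, max_count, sizes=7):
    """Map a tag count onto 1..sizes using a log scale (illustrative only).

    Assumes 1 <= min_count <= count <= max_count.
    """
    if max_count <= min_count:
        return 1  # every tag is equally frequent: use the smallest size
    # Position of this count between the rarest and the most common tag (0.0 to 1.0)
    position = (math.log(count) - math.log(min_count)) / (
        math.log(max_count) - math.log(min_count)
    )
    # Split the 0..1 range into `sizes` bands; the most common tag is capped at `sizes`
    return min(sizes, int(position * sizes) + 1)
```

The rarest tag always gets size 1 and the most common tag always gets size 7, however large the counts grow.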
This type of dynamic tag cloud aggregator could provide a very helpful organizational and navigational tool for Drupal sites.
Will Look into Popular Module Too
Submitted by kentbye on Fri, 2005-06-17 17:19.
It appears as though the developer of tagadelic -- another tag cloud-like Drupal module -- also just left a comment above. I'm going to bounce my algorithm off of him after I finish tweaking it. I'll have to look into term_popular.module as well -- it looks as though they have implemented it as a block, which is how I've envisioned having a tag cloud implemented. Back to work on the algorithm.
tagadelic
Submitted by Anonymous (not verified) on Fri, 2005-06-17 15:12.
There already was a module that did what you want. So maybe you want to improve the algorithm for generating font sizes in there? Maybe we should collaborate to get one good module, instead of two less good ones?
tagadelic page on webschuur.com
YES! to tag cloud collaboration
Submitted by kentbye on Fri, 2005-06-17 17:00.
Definitely! Although I don't know any PHP -- I can only come up with the algorithm that does it. I'll pass it along to you and see what we can come up with. I figured out a pretty good algorithm for evenly distributing the font sizes across many more cases -- I'm going to try it out in Microsoft Excel first and then post some updated graphs. I'm definitely willing to pass along the algorithms for you to code up into your tagadelic module. I'll get in touch with you again after I post it. By the way, Morbus Iff pointed me to tagadelic, and I actually updated my previous post with a link and should've updated this one as well. I'll update the post above with tagadelic links too.
I am the "anonymous"
Submitted by Bèr (not verified) on Mon, 2005-07-04 08:04.
Hello. Indeed, I am the anonymous poster of the comment above. I just updated the APIs for tagadelic. It now gives a lot of flexibility for generating tag pages. With URL schemes similar to taxonomy (/tagadelic/chunk/1,2,5) you can compile nice pages. About term popular: can we incorporate our code into one project? I think that if we use some of your algorithms, a block from term popular and my module, we are all settled. And then we have one working module, instead of two modules in some pre-beta state.
Tagcloud Follow-up
Submitted by kentbye on Tue, 2005-07-05 15:19.
Hey Bèr, I did however write an algorithm for the font distribution aspects of the tag cloud -- and I have a few flowcharts for making personalized tag clouds. I'll follow up with an e-mail to you.
That is awesome, Kent! We just did it over on our main blog too: http://www.developmentseed.org/blog/. We are using the 'tag cloud' in the sidebar as our main navigation: http://www.developmentseed.org/blog/poptags. We are using term_popular.module http://cvs.drupal.org/viewcvs/drupal/contributions/sandbox/frjo/term_popular.module... it is great - Ian added the block.