Month: November 2004

<nickname>.id.au

Post author By ricky
Post date Monday, November 15, 2004

According to Enetica, one can use one’s nickname as a domain name in the .id.au 2LD. Here is part of an e-mail that I sent to Enetica enquiring about random.id.au:

You have allowed a company (Random Technologies) to register in the .id.au 2LD (random.id.au). This is in direct breach of Schedule D of auDA’s Domain Name Eligibility and Allocation Rules for the Open 2LDs (2002-07). What action, if any, will you be taking in this matter?

And here is their reply:

Thanks, for your email,

the registrant of random.id.au is Edson Galindo, the contact details he has provided are not relevant to his eligibility to hold that domain.
Edson has warranted that the domain is somehow related to himself, ie his nickname, and as such he is eligible to hold that domain.

Of course, I have no claim to that domain name either. I was just curious to see what their response would be. Enetica’s interpretation of auDA’s rules is very strange, to say the least!

Random observations

Firefox and Thunderbird

Post author By ricky
Post date Sunday, November 14, 2004

I jumped on the Firefox bandwagon a few months ago. I’ve been very happy with it. Its extension mechanism is fantastic, and couldn’t be easier to use. One slightly annoying thing is that search engine plugins can only be installed system-wide, which seems a bit silly. But this has already been listed as a bug by others and is currently being fixed. If you want to make the switch from another browser, especially the evil one, please click the Firefox image in the left margin.

I’ve also switched to Thunderbird from Evolution at home and at work. I found Evolution to be exceedingly slow when my Inbox got large. Thunderbird has not displayed the same problem so far, dealing with exactly the same number of messages. Unfortunately, Thunderbird will not import mail from Evolution, so it has to be done manually. Since Evolution and Thunderbird use the standard mbox format, it’s just a question of copying your inbox and subdirectories to the relevant place in your Thunderbird profile folder. That part takes all of ten seconds. The tedious part is having to set up all your mail filters again. Thunderbird has a great spam detection feature which has been working really well for me so far. Another problem is importing your address book. Thunderbird will only import LDIF, CSV and a few others. Evolution will only export vCard. Therefore, you have to find a way to convert from vCard to LDIF or one of the other text formats. Since my address book wasn’t too large, I found it easy just to recreate my address book by hand. The "Add to Address Book…" item in the context menu that pops up whenever you right-click an e-mail address in Thunderbird expedited this process somewhat.
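For anyone with an address book too large to retype, the vCard-to-CSV step could be scripted. What follows is only a minimal Python sketch, not a tested migration tool: it assumes each card has unfolded `FN` and `EMAIL` lines, ignores every other field, and the column names "Display Name" and "Primary Email" are placeholders to be mapped by hand when importing.

```python
import csv
import io

FIELDS = ["Display Name", "Primary Email"]  # hypothetical column names


def vcards_to_rows(text: str) -> list[dict[str, str]]:
    """Pull the full name and first e-mail address out of each vCard."""
    rows = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        name = key.split(";", 1)[0].upper()  # drop parameters such as ;TYPE=INTERNET
        if name == "BEGIN" and value.upper() == "VCARD":
            current = {field: "" for field in FIELDS}
        elif current is None:
            continue  # ignore anything outside a card
        elif name == "FN":
            current["Display Name"] = value
        elif name == "EMAIL" and not current["Primary Email"]:
            current["Primary Email"] = value
        elif name == "END" and value.upper() == "VCARD":
            rows.append(current)
            current = None
    return rows


def rows_to_csv(rows: list[dict[str, str]]) -> str:
    """Render the rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


SAMPLE = """BEGIN:VCARD
VERSION:3.0
FN:Jane Example
EMAIL;TYPE=INTERNET:jane@example.com
END:VCARD
"""

if __name__ == "__main__":
    print(rows_to_csv(vcards_to_rows(SAMPLE)))
```

Real exports may contain folded lines, encoded values or several addresses per contact, none of which this sketch handles.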

Random observations

ACSC paper

Post author By ricky
Post date Friday, November 12, 2004

It turns out that the reviews for the paper that we (Ryan, Jaga, Audun and I) submitted to the Australian Computer Science Conference were lost, and that our paper was supposed to be accepted. This is quite unbelievable, but our paper got accepted, so I can’t complain too much. The organisers had to contact the reviewers of our paper to double-check that our paper was supposed to be accepted. It must have been some major glitch. Firstly, we received an e-mail saying our paper had been rejected. Secondly, the follow-up e-mail, which was supposed to contain our reviews, did not contain anything of the sort. Anyway, at this stage Ryan will be presenting our paper in Newcastle in a few months’ time.

Random observations

Technorati pinger

Post author By ricky
Post date Thursday, November 11, 2004

I’ve written an exceedingly simple and braindead shell script pinger for use with Technorati. It uses the POST (lwp-request) utility that comes with the libwww-perl library to deliver the ping message. Usage is as follows:

technorati <blogname> <blogurl>

The shell script is invoked whenever I add a new entry to my weblog. It’s been working for me, but I give no guarantees. Use at your own risk and do with it what you will.
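The script itself is not reproduced here, so the following is a rough Python sketch of what such a ping might look like, not the original shell script. It assumes the ping is the common `weblogUpdates.ping` XML-RPC call, and the endpoint URL is an assumption that may well be out of date, which is why the network send is kept in its own function and not run by default.

```python
import urllib.request
import xmlrpc.client

# Assumed endpoint for the ping service; not taken from the post.
PING_URL = "http://rpc.technorati.com/rpc/ping"


def build_ping(blog_name: str, blog_url: str) -> bytes:
    """Build the XML-RPC request body for a weblogUpdates.ping call."""
    body = xmlrpc.client.dumps((blog_name, blog_url), methodname="weblogUpdates.ping")
    return body.encode("utf-8")


def send_ping(blog_name: str, blog_url: str, endpoint: str = PING_URL) -> bytes:
    """POST the ping, much as lwp-request's POST would, and return the raw reply."""
    request = urllib.request.Request(
        endpoint,
        data=build_ping(blog_name, blog_url),
        headers={"Content-Type": "text/xml"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


if __name__ == "__main__":
    # Only prints the payload; call send_ping() to actually deliver it.
    # The URL below is a placeholder, not the blog's real address.
    print(build_ping("The Thin Line", "http://example.com/").decode("utf-8"))
```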

Random observations

Domain Daddy

Post author By ricky
Post date Thursday, November 11, 2004

So I’ve finally acquired my very own domain (rickyrobinson.id.au). I intend to care for it, and treat it with kindness and respect, and I promise it will never be neglected. It will be diligently tended to, and whenever it cries for help, I will come running. I want only the best for my domain so that it may bloom and grow… like Edelweiss.

Random observations

PageRank

Post author By ricky
Post date Tuesday, November 09, 2004

After initially being told that a paper I wrote with Ryan, Jaga and Audun had been rejected from the Australian Computer Science Conference, it now appears that there was a glitch in the system and that our paper, in fact, was supposed to be accepted. We became aware of this today after I sent off a missive to the conference organisers telling them that they’d forgotten to attach the reviews of our paper. Anyway, more on that matter as further details become available.

Now that it appears the paper will be accepted, like a good little researcher, I started updating the paper for the camera-ready version. I came across the following claim in the paper regarding Google’s PageRank algorithm:

The algorithm reduces the rank of web pages with outgoing hyperlinks and increases the rank of web pages with incoming hyperlinks. This means that a page with few outgoing links and many incoming links will have a high rating.

I had previously flagged this description of the PageRank algorithm as being in need of revision. Therefore, I reviewed Brin and Page’s The Anatomy of a Large-Scale Hypertextual Web Search Engine (pdf) in the hope of distilling its contents and deriving a more satisfactory, yet still concise, description of PageRank. Unless I am mistaken, which is a distinct possibility, the original description is not as wildly off target as I thought.

The key point to note is that the PageRanks form a probability distribution over web pages, so the sum of all web pages’ PageRanks will be one. In PageRank, a link from page A to page B is essentially a vote (the weight of this vote depends upon a number of factors, including the number of other links on page A and the PageRank of page A, but these factors are not directly relevant to this discussion). In linking to page B, then, page A has helped to increase page B’s PageRank. But since the sum of the PageRanks must equal one, the increase in B’s PageRank must necessarily come at the expense of a decrease in the PageRanks of other pages. In other words, there’s no such thing as a free lunch: somebody has to pay.

The question is, does A alone incur the cost of linking to B? The answer is no, since the calculation of A’s PageRank does not depend directly on the number of outgoing links from A, as shown by the PageRank formula (which you can find in the manuscript linked above). So where does the extra Googlejuice imbibed by B come from? One way to think about it is this. The addition of a link from A to B slightly reduces the weight of all the other links in the global collection of links. Therefore, it is not A alone that pays the price. Rather, the cost is shared by all existing pages.

Another way to think about this is using the "random surfer" model suggested by Page and Brin. The PageRank of A equates to the probability that a user who starts on a random page and clicks randomly on hyperlinks will visit A before boredom sets in, at which time the surfer will randomly select another page to start from. Notice that adding a link from A to B does not lower the probability that the surfer will visit A any more than it reduces the probability of the surfer visiting any other page. What is clear, however, is that the probability that the surfer will visit B has increased, since B now has more incident edges, and has therefore increased its proportion of the total number of incident edges in the graph.
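The effect of adding a link can be tried out on a toy graph. The Python sketch below is an illustration under stated assumptions, not Google's implementation: it uses the normalised form of the formula, with (1 - d)/N as the base term so that the ranks sum to one, a damping factor of 0.85, and the common convention of spreading the rank of pages without outgoing links evenly over all pages. The three-page graph is made up.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iteratively compute PageRank; links maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is shared among all pages
        # (an assumed convention, so that the total stays at one).
        dangling = sum(rank[page] for page in pages if not links.get(page))
        new_rank = {page: (1.0 - damping) / n + damping * dangling / n for page in pages}
        for page, targets in links.items():
            for target in targets:
                # Each link passes on an equal share of the linking page's rank.
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical graph, before and after A adds a link to B.
before = pagerank({"A": ["C"], "B": ["C"], "C": ["A"]})
after = pagerank({"A": ["C", "B"], "B": ["C"], "C": ["A"]})

if __name__ == "__main__":
    for page in sorted(before):
        print(page, round(before[page], 4), round(after[page], 4))
```

In both runs the ranks total one, and B's rank rises once A links to it. How the ranks of the other pages shift depends on the shape of the graph, so a toy example like this shows the mechanism but settles nothing about the web at large.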

I was not at all surprised to find a great many discussions and (sometimes heated) arguments about the way PageRank works. Just Google it! The really weird stuff shows up when the search is constrained to discussions talking about outgoing links.

This investigation started me thinking about alternative algorithms for ranking results. In Google, the PageRank of page A utilises the PageRanks of all the pages that link to A. It occurred to me that not all pages linking to A would be relevant to the current search. A possible improvement to the PageRank algorithm might be to use only those pages linking to A which also appear in the query results. For instance, if my query is about "fluffy white dogs", and page A is a hit for this query, then only pages which were also a hit for this query and which link to A should be used in the calculation of A’s PageRank for this particular query. Why should a page which links to A for some other reason, say because A also discusses "lazy ginger cats", be included in the ranking of A for this query? Surely this adjustment to the algorithm would improve the ordering of results in Google. The one reason I can think of not to do this is that PageRank would be calculated at the time of query rather than offline, meaning that results would be delayed slightly longer. Mind you, it’s entirely possible that Google already does something like this, because there’s no doubt that the PageRank algorithm must have been updated and modified since 1998!
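The filtering step in that idea is easy to sketch. The Python fragment below is purely illustrative and uses made-up page names: it keeps only the pages that were hits for the query, together with the links running between them, and the resulting graph could then be handed to any PageRank routine at query time.

```python
def restrict_to_hits(links, hits):
    """Keep only the hit pages and the links that run between them."""
    hit_set = set(hits)
    return {
        page: [target for target in links.get(page, []) if target in hit_set]
        for page in sorted(hit_set)
    }


# Hypothetical link graph: cats1 links to A, but is not a hit for the dog query.
links = {
    "A": [],
    "dogs1": ["A"],
    "dogs2": ["A", "dogs1"],
    "cats1": ["A"],
}
hits = ["A", "dogs1", "dogs2"]  # pages matching "fluffy white dogs"

query_graph = restrict_to_hits(links, hits)

if __name__ == "__main__":
    print(query_graph)
```

Here the link from cats1 to A no longer counts towards A for this query, while the links from dogs1 and dogs2 still do.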

The question still remains: how do I go about adjusting our paper?

Random observations

The Thin Line

Post author By ricky
Post date Sunday, November 07, 2004

Welcome to The Thin Line: my new-look weblog. I hope you like it. The new name was chosen for a whole swag of reasons, which I won’t bother outlining here. If you’re interested in the reasons, I’m only an e-mail away.

Random observations

Valid RSS

Post author By ricky
Post date Sunday, November 07, 2004

While I was at it, I decided I may as well add a link to show that my RSS generator conforms to the 2.0 standard. Click the RSS heart in the left margin. I’m not sure why I’m updating the blog template, because I’m about to give it a major overhaul.

Random observations

Standards compliance

Post author By ricky
Post date Saturday, November 06, 2004

You will notice a few little images that I’ve added in the margin on the left-hand side. The first of these relates to copyright. Clicking on it will take you to the Creative Commons license under which the material appearing in this blog is licensed. The second image is a link that allows you to check whether the blog page you are viewing conforms to the XHTML 1.0 standard. The template from which this blog is created conforms to the XHTML 1.0 standard. However, from time to time, I may slip up when creating a blog entry (actually, this will happen regularly; try checking it right now). The third image is a link which validates the style file that this blog uses. These two validation links have been put here to encourage you to use a browser that conforms to these standards (like this one) if this blog is not being rendered properly by your current browser.

Random observations

Johnny Warren, rest in peace

Post author By ricky
Post date Saturday, November 06, 2004

Australian soccer’s favourite son, Johnny Warren, has passed away at the age of 61 after a battle with cancer. Anybody who has any connection to the local soccer scene will know what a sad loss this is. Johnny Warren, a former national captain, played in the only Australian team ever to play in the World Cup, and for the past several years he has featured as a commentator on SBS. It’s hard to come to terms with the fact that The World Game on SBS will be without Johnny Warren from now on. There is nobody, nobody who’s done more for the game in this country than Johnny. He was a tireless champion for bringing about change in the domestic league, and the new national competition is the fruit of his labour. I hope that in the years to come, a successful A-League will be seen as Johnny Warren’s legacy. Johnny Warren lived to witness the launch of the new league. May he rest in peace.