
geoip-0.2.0 released

Finally, GeoIP is available on PECL. It's been a while since it all started, but now the first release is out the door.

First of all, GeoIP is a piece of software that maps an IP address or hostname to a geographic location. It cannot be 100% accurate (your IP address belongs to your ISP, which might not be next door), but it gives you a rough idea of which city the user is coming from. With this data at hand, you can choose to present a local version of your website. It can also be really useful for demographic analysis, where you work out where your users are located based on log files. Anyway, what you do with that piece of software is up to you.

The good news, though, is that it is now easily available from PHP. After long debates and license clashes (the GPL being incompatible with the PHP License), the author of the GeoIP C Library kindly agreed to release the new version under the LGPL. With this change, the GeoIP PECL module could be created, using the skeleton from SourceForge. Since then, many memory leaks have been fixed and documentation has been written. It should now be in good working order; let me know if you find bugs.

By the way, if someone builds a website that uses GeoIP to determine where the user is from, plots the result on a Yahoo! map using their API, and fetches photos taken near the user's location from Flickr using geotags, leave a comment. I want to see this :)

stem-1.4.3 for PHP released!

As of today, you can download a new release of the PHP stem interface to the Snowball API on PECL. This extension was originally written by Jay Smith, and I have since joined him to help with further development.

If you do not know what a stemmer is, the article on Wikipedia is self-explanatory. Basically, it allows a computer program to find a common root for different forms of the same word. While Dr. Porter did a great job creating stemmers for different languages and the Snowball API, they were not available directly from a PHP script.

Now that this limitation is gone, you might want to try using the stemmer to create an intelligent search engine for your website. If you want to give it a try, issue the following command on your favorite UNIX-based machine: pecl install stem. Once the installation has completed, modify your php.ini to load the extension, then try the following example:

  print stem_english('cleaner') ."\n";
  print stem_french('épouses') ."\n";

This would output clean and épous respectively. In some cases, the word output by the stemmer will not exist in a dictionary, but this is rarely a problem. In fact, you should only stem words in order to use them as keywords in some kind of database.

PECL on Gentoo

If you try to install a PECL package without using Portage (thus using the PHP tool pecl), you might encounter an error like this snippet:

bender ~ # pecl install apc
downloading APC-3.0.8.tgz ...
autoconf: Undefined macros:
ERROR: `phpize' failed

The main problem lies with the use of automake v1.9.x. Since Gentoo comes with a bunch of different versions of the autotools, you can choose to use automake v1.8, which will result in a complete build.

bender ~ # WANT_AUTOMAKE="1.8" pecl install apc

As simple as it seems, it took me a while to figure out. Let me know if this helps.

PFE presentation

The slides for my PFE (Projet de Fin d'Études) are available here.

They are in French, but if you are looking for ways to integrate different technologies to build a full-featured email server, you might want to have a look at some of the schematics. I will see if I can post the full report online shortly. The best would be to create a HOWTO, but it would take a long time to write.

Tweaking Webalizer

Looking at Webalizer's reports, I have just noticed that web pages returning a 302 do not make it into the Total URLs listing.

I do not know if this is a bug or not, but a 302 is a Moved Temporarily status code. I do want to count those, since they reflect how many people I am bouncing elsewhere. This simple patch does the trick:

--- webalizer.c 2002-04-16 18:11:31.000000000 -0400
+++     2005-08-11 11:02:52.000000000 -0400
@@ -1080,7 +1080,7 @@
      /* URL/ident hash table (only if valid response code) */
      if ((log_rec.resp_code==RC_OK)||(log_rec.resp_code==RC_NOMOD)||
-         (log_rec.resp_code==RC_PARTIALCONTENT))
+         (log_rec.resp_code==RC_PARTIALCONTENT)||
+         (log_rec.resp_code==RC_MOVEDTEMP))
         /* URL hash table */
         if (put_unode(log_rec.url,OBJ_REG,(u_long)1,
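
As a rough sketch of what the patched condition checks, here is the same test pulled out into a standalone C function. The helper name counts_toward_urls and the local #define values are assumptions made for illustration (the values are simply the standard HTTP status codes); they are not copied from the Webalizer source.

```c
/* Local stand-ins for the constants named in the patch. The values are the
   standard HTTP status codes, assumed here for illustration only. */
#define RC_OK             200
#define RC_PARTIALCONTENT 206
#define RC_MOVEDTEMP      302
#define RC_NOMOD          304

/* Hypothetical helper (not part of Webalizer): returns 1 when a response
   code would pass the patched condition and be added to the URL table,
   0 otherwise. */
int counts_toward_urls(int resp_code)
{
    return resp_code == RC_OK ||
           resp_code == RC_NOMOD ||
           resp_code == RC_PARTIALCONTENT ||
           resp_code == RC_MOVEDTEMP;   /* the case the patch adds */
}
```

With this, a 302 is counted like a 200, 206 or 304, while anything else (a 301 or a 404, for instance) is still left out of the listing.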

What would I become without having access to source code...