stem-1.4.3 for PHP released!

As of today, you can download a new release of the PHP stem interface to the Snowball API from PECL. Although this extension was originally written by Jay Smith, I have since joined him to help with further development.

If you do not know what a stemmer is, the article on Wikipedia is self-explanatory. Basically, it allows a computer program to find a common root for different forms of the same word. While Dr. Porter did a great job creating the Snowball API and stemmers for different languages, they were not directly available from a PHP script.

Now that this limitation is gone, you might want to use the stemmer to build an intelligent search engine for your website. To give it a try, issue the following command on your favorite UNIX-based machine: pecl install stem. Once the installation has completed, modify your php.ini to load the extension, then try the following example:

<?php
  print stem_english('cleaner') ."\n";
  print stem_french('épouses') ."\n";
?>

This would output clean and épous, respectively. In some cases, the word returned by the stemmer will not exist in a dictionary, but this is rarely a problem: stemmed words are only meant to be used as keywords in some kind of database, not shown to readers.
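
To make the keyword idea a bit more concrete, here is a minimal sketch of how a piece of text could be reduced to stemmed keywords before being stored in a search index. Only stem_english() comes from the extension; the stem_keywords() helper, its parameters, and the fallback stemmer are hypothetical names made up for this illustration, and the sketch assumes the mbstring extension is available for lowercasing accented characters.

```php
<?php
// Hypothetical helper (not part of the stem extension): turn free text
// into a list of unique stemmed keywords.
function stem_keywords(string $text, callable $stemmer): array
{
    // Lowercase the text, then split on anything that is not a letter or digit.
    $lower = mb_strtolower($text, 'UTF-8');
    $words = preg_split('/[^\p{L}\p{N}]+/u', $lower, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    // Stem each word and drop duplicates, keeping first-seen order.
    return array_values(array_unique(array_map($stemmer, $words)));
}

// Use stem_english() when the extension is loaded; otherwise fall back to
// a stemmer that returns the word unchanged, so the script still runs.
$stemmer = function_exists('stem_english')
    ? 'stem_english'
    : function (string $word): string { return $word; };

print implode(' ', stem_keywords('Cleaner cleaners cleaning', $stemmer)) . "\n";
```

The same function would be applied to both the indexed documents and the search query, so that different forms of a word end up matching the same keyword. Passing the stemmer as a callable is just one possible design; it makes it easy to swap in stem_french or another language.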

Comments

  1. Simonator says:

    On election day, no less!

  1. Sarp says:

    Hi Olivier,

    Sorry, my English is limited.
    You are my hero :) I love you because you made the stem extension for PHP. It is very good, but I looked at www.snowball.tartaris.org and it has all the languages (English, French, duc... and Turkish), while your extension has no Turkish! Maybe you could add the Turkish language in the latest version... please, please :)
    Whether you add it or not, no problem, I love you :)

  1. Sarp says:

    Hi Olivier,

    You are a very good man... maybe you are a hero for PHP stem :)

    But the Snowball API has added new documentation and new languages...

    Maybe you could update stem-1.4.3 for PHP :)

    Please

  1. Mac says:

    Waiting for a new update.
    The last update was in 2006!?

  1. Olivier says:

    Yes indeed. The code in CVS has been updated, but I need to push a release with additional and updated stemmers.

  1. desfrenes says:

    Thanks for this extension, this is very useful.

    I was able to compile it on Debian with no problem; however, I couldn't manage it on Windows because of some missing development tools (msdev...). pecl4win is down. Do you know of any place where I could find a Windows build?
