Porter Stemmer Algorithm Class in C#

C# 11.5.2009 1 Comment

The Porter Stemmer Algorithm is an algorithm created by Martin Porter to reduce English words to their root stems. For example, the word “forms” would reduce to “form” and the word “connections” would reduce to “connect”. The details of the algorithm can be found here
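To give a feel for how the reduction works, here is a minimal sketch of just the first rule group of the published algorithm (step 1a, which handles plurals). The class and method names are my own for illustration and are not taken from the downloadable class, and the sketch assumes the input is already lowercase.

```csharp
using System;

public static class PluralStepSketch
{
    // Step 1a of the published Porter algorithm.
    // Rules are tried in order and only the first matching rule is applied.
    // Assumes the word is already lowercase.
    public static string Step1a(string word)
    {
        if (word.EndsWith("sses", StringComparison.Ordinal))
            return word.Substring(0, word.Length - 2); // sses -> ss
        if (word.EndsWith("ies", StringComparison.Ordinal))
            return word.Substring(0, word.Length - 2); // ies -> i
        if (word.EndsWith("ss", StringComparison.Ordinal))
            return word;                               // ss -> ss (unchanged)
        if (word.EndsWith("s", StringComparison.Ordinal))
            return word.Substring(0, word.Length - 1); // s -> removed
        return word;
    }
}
```

Calling `PluralStepSketch.Step1a("forms")` gives “form”. For “connections” this step only yields “connection”; the full algorithm needs its later suffix rules to strip the “ion” and arrive at “connect”.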

Typically, you would need this functionality when building your own search engine: if you stem both the content you index and the terms the user searches for, different forms of the same word will match each other.

I came across a well-written class in PHP on Jon Abernathy’s site and decided to port it to C# for a project I’m working on.

I won’t get into the depths of the code, but the basic documentation is this…

Method stem
Description: stems a single string to its root stem
Parameters: string
Returns: string

Method stem_list
Description: takes a comma-, semicolon-, or space-separated string and returns an ArrayList of the stemmed words
Parameters: string
Returns: ArrayList
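
As a rough illustration of the stem_list behaviour described above, the sketch below splits a string on commas, semicolons and spaces and stems each piece. `StemListSketch` and `StemList` are hypothetical names, and the stemmer is passed in as a delegate so the example does not depend on the downloadable class. How the real class handles empty entries or casing is an assumption here.

```csharp
using System;
using System.Collections;

public static class StemListSketch
{
    // Hypothetical stand-in for stem_list: splits the input on commas,
    // semicolons and spaces, then applies the supplied stem function
    // to every non-empty word.
    public static ArrayList StemList(string words, Func<string, string> stem)
    {
        var result = new ArrayList();
        var separators = new[] { ',', ';', ' ' };

        // RemoveEmptyEntries drops the blanks left by sequences like ", "
        // (an assumption; the original class may behave differently).
        foreach (string word in words.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            result.Add(stem(word));
        }

        return result;
    }
}
```

With a real stemmer you would pass its stem method as the second argument. Passing `w => w` simply returns the split words unchanged, which is handy for checking the splitting on its own.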

You can download the class here: Stemmer C# Class

One Response to "Porter Stemmer Algorithm Class in C#"

  1. Some parts of the original Porter stemmer algorithm are implemented incorrectly. I have also tried the PHP implementation by Jon, and it seems to have the same error.

    Try stemming the words “building”, “normalizing”, etc.
