Bayesian Filter Performance in Ruby

A while ago I was in an online discussion regarding “Correct Tools for the Jobs” and how Ruby was not a good language for developing, say, an operating system.

Someone made a comment that it was not a good idea to use Ruby for Bayesian filtering of things like forum posts. (Bayesian filtering is one of the primary algorithms used for determining whether an email or forum post is spam.)

They quoted some performance figures which seemed exceedingly slow, but they also made some statements that made me suspect their application design was not all that it could be, so I thought I would see whether Ruby was the guilty party.

This post is not primarily about Bayesian filtering but about performance testing; it is probably most helpful to beginner to intermediate Ruby developers.
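To give a flavour of the kind of measurement involved, here is a minimal sketch of a naive Bayes classifier timed with Ruby's standard Benchmark module. The class name TinyBayes, its methods, and the training sentences are all made up for illustration; this is not the code from the original discussion, and the timing it prints will vary from machine to machine.

```ruby
require 'benchmark'

# TinyBayes is a hypothetical, deliberately small naive Bayes classifier.
# It exists only so there is something to time.
class TinyBayes
  def initialize
    # category => word => number of times the word was seen in that category
    @word_counts = Hash.new { |hash, key| hash[key] = Hash.new(0) }
    @totals = Hash.new(0) # category => total words seen
    @docs   = Hash.new(0) # category => number of training documents
    @vocab  = {}          # every distinct word seen, used as a set
  end

  def train(category, text)
    @docs[category] += 1
    tokenize(text).each do |word|
      @word_counts[category][word] += 1
      @totals[category] += 1
      @vocab[word] = true
    end
  end

  # Returns the category with the highest log-probability score.
  def classify(text)
    words = tokenize(text)
    total_docs = @docs.values.sum
    scores = @docs.keys.map do |category|
      # Start from the prior, then add one smoothed log-likelihood per word.
      score = Math.log(@docs[category].to_f / total_docs)
      words.each do |word|
        count = @word_counts[category][word]
        score += Math.log((count + 1.0) / (@totals[category] + @vocab.size))
      end
      [category, score]
    end
    scores.max_by { |_, score| score }.first
  end

  private

  def tokenize(text)
    text.downcase.scan(/[a-z']+/)
  end
end

filter = TinyBayes.new
filter.train(:spam, 'buy cheap pills now')
filter.train(:spam, 'cheap pills cheap offer')
filter.train(:ham,  'the meeting is moved to friday')
filter.train(:ham,  'see you at the ruby meetup')

# Time many classifications rather than one, so the result is measurable.
elapsed = Benchmark.realtime do
  10_000.times { filter.classify('cheap pills for the meetup') }
end
puts format('10,000 classifications took %.3f seconds', elapsed)
```

Timing a large batch of calls, rather than a single one, keeps the measurement well above the resolution of the clock. A toy data set like this says little about a real forum, so treat the number as a starting point for comparison and not as a verdict on Ruby.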

Resources for Ruby Developers

As someone (probably a really famous guy whose name I cannot remember) once said, becoming a software developer means having to learn new things for the rest of your life.

This is just as true for Ruby as for any other language, so here are the resources I like to use to keep up to date.

These are in addition to things like API sites and the Ruby on Rails guides.

Setting Up Japanese Input on Kubuntu 12.04

Historically, setting up Japanese input on Linux has been a trial-and-error affair with much forum searching, dark config file incantations, and other black magic.

Luckily, with Kubuntu 12.04, getting set up with Japanese input has become a very simple process.

Textile Filtering With RedCloth

I wanted to use RedCloth to let users of my Rails app make Textile-formatted posts, but I wanted to restrict the markup they were allowed to use. How to do this? I thought it would be simple.

Warning: This article was imported from an old site and is therefore itself rather old. It may not still be accurate for current versions of RedCloth.

How to Ask for Help on IRC

Greetings. If you have arrived at this page of your own volition, then be aware that this is an abridged, slightly more modern, IRC-specific version of Eric S. Raymond’s “How To Ask Questions The Smart Way”, which is excellent, but lengthy, reading.

If, on the other hand, you have been directed to this page by someone else and quickly want to find out what this is about, read on.