Textile filtering with RedCloth
So, I am making a forum over at the Open Source Country.
I wanted to use RedCloth to let members write Textile-formatted posts, but I also wanted to restrict the markup they were allowed to use. How to do this? I thought it would be simple.
RedCloth has a +filter_html+ option, which I thought would do the trick, but it only filters the HTML that users type in directly. It doesn’t filter the HTML that RedCloth itself generates when a user uses Textile tags.
So how do you filter the Textile tags?
First, assuming you are using Rails 2.3, create the following file:
config/initializers/redcloth_extension.rb
This file will hold our override of RedCloth's HTML formatter. Now paste the following code into the file:
module RedCloth::Formatters::HTML
  include RedCloth::Formatters::Base

  def after_transform(text)
    text.chomp!
    clean_html(text, ALLOWED_TAGS)
  end

  ALLOWED_TAGS = {
    'a' => ['href', 'title'],
    'br' => [],
    'i' => nil,
    'u' => nil,
    'b' => nil,
    'pre' => nil,
    'kbd' => nil,
    'code' => ['lang'],
    'cite' => nil,
    'strong' => nil,
    'em' => nil,
    'ins' => nil,
    'sup' => nil,
    'sub' => nil,
    'del' => nil,
    'table' => nil,
    'tr' => nil,
    'td' => ['colspan', 'rowspan'],
    'th' => nil,
    'ol' => ['start'],
    'ul' => nil,
    'li' => nil,
    'p' => nil,
    'h3' => nil,
    'h4' => nil,
    'h5' => nil,
    'h6' => nil,
    'blockquote' => ['cite'],
  }
end
ALLOWED_TAGS is a hash of the tags you want to allow, with each tag name mapped to the attributes permitted on it. You can take RedCloth's BASIC_TAGS as a base and strip the tags you don’t want from the hash.
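If you would rather derive the hash than type it out, something like the following might work. It is only a sketch: BASE_TAGS is a shortened, hypothetical stand-in for BASIC_TAGS so that the example runs without the gem, and UNWANTED and FORUM_TAGS are names made up for illustration.

```ruby
# BASE_TAGS is a hypothetical, shortened stand-in for RedCloth's BASIC_TAGS,
# so this example runs without the gem installed.
BASE_TAGS = {
  'a'          => ['href', 'title'],
  'img'        => ['src', 'alt', 'title'],
  'br'         => [],
  'strong'     => nil,
  'em'         => nil,
  'h1'         => nil,
  'h2'         => nil,
  'h3'         => nil,
  'blockquote' => ['cite']
}

# Tags we do not want forum members to produce (an illustrative choice).
UNWANTED = %w[img h1 h2]

# Build a new hash without the unwanted tags; BASE_TAGS itself is left untouched.
FORUM_TAGS = BASE_TAGS.reject { |tag, _attrs| UNWANTED.include?(tag) }

puts FORUM_TAGS.keys.sort.join(', ')
# prints: a, blockquote, br, em, h3, strong
```

In the real initializer you would presumably start from BASIC_TAGS instead of BASE_TAGS and assign the result to ALLOWED_TAGS.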
So we have defined the tags that we want to allow. Now we need to actually do some stripping (ooo-er). This is where the after_transform method comes in. RedCloth calls it as standard once it has finished transforming the text, so we can override it and tell RedCloth to run clean_html again on the HTML string it has just created. Here are the steps:
- RedCloth is configured with +filter_html+ enabled
- User enters a string (Textile and HTML)
- RedCloth strips HTML tags from the string according to the BASIC_TAGS using the clean_html method
- RedCloth converts the Textile tags in the string to HTML
At this point the HTML’ised string is usually returned; however we do some overriding so that…
- RedCloth strips HTML tags from the above generated HTML string according to our ALLOWED_TAGS using the clean_html method
- RedCloth returns the twice-filtered HTML string.
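To make the second pass a bit more concrete, here is a rough idea of what an allow-list filter does to generated HTML. This is not RedCloth's clean_html; strip_disallowed and ALLOWED are hypothetical names, and the regex approach is a deliberately naive sketch, not something to rely on for real sanitising.

```ruby
# Illustrative allow-list: tag name => permitted attributes (nil or [] means none).
ALLOWED = {
  'a'      => ['href', 'title'],
  'strong' => nil,
  'em'     => nil,
  'p'      => nil
}

# Hypothetical, simplified stand-in for a clean_html-style pass.
# Tags missing from the allow-list are removed (their inner text stays),
# and attributes not listed for a tag are dropped.
def strip_disallowed(html, allowed)
  html.gsub(%r{<(/?)(\w+)([^>]*)>}) do
    closing, name, raw_attrs = $1, $2.downcase, $3
    next '' unless allowed.key?(name)          # tag not allowed: drop it
    next "</#{name}>" unless closing.empty?    # closing tag: keep it bare

    permitted = allowed[name] || []
    kept = raw_attrs.scan(/(\w+)="([^"]*)"/).select { |attr, _| permitted.include?(attr) }
    attrs = kept.map { |attr, value| %( #{attr}="#{value}") }.join
    "<#{name}#{attrs}>"
  end
end

generated = '<p><a href="/faq" onclick="x()">FAQ</a> <img src="a.png" />and <strong>bold</strong></p>'
puts strip_disallowed(generated, ALLOWED)
# prints: <p><a href="/faq">FAQ</a> and <strong>bold</strong></p>
```

The point is only that the filter runs on the finished HTML, so it does not matter whether a tag came from raw user HTML or from a Textile tag.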
Thinking about it, you don’t even need +filter_html+, since everything gets filtered the second time around, but what the hell. I feel a little more secure stripping all the user-generated HTML cruft before stripping our Textile-generated HTML.
Enjoy
Comments
said on 09/11/10 (Tuesday) at 17:12 UTC:
Hi Jeffrey,
this is just what I was looking for – thanks for posting!
Cheers
Dave