Chain Responsibility Pattern to filtering messages

Posted on

Problem

Is this a good implementation of a Chain Responsibility Pattern? This code attempts to filter a message received (for instance, in a chat application).

import urlmarker
import re

bad_words = []
with open("bad-words.txt") as file:
    for entry in file:
        entry = entry.strip()
        bad_words.append(entry)


class Message(object):

    def __init__(self, content):
        self.content = content.split()
        self.can_broadcast = True
        self.construct_word()

    def construct_word(self):
        self.content = ' '.join(self.content)

class Filter():

    def __init__(self, successor = None):
        self.successor = successor

    def has_successor(self):
        if self.successor is not None:
            return True
        return False

    def filter_message(self, message):
        if self.has_successor():
            self.successor.filter_message(message)

class Slang_Filter(Filter):

    def filter_message(self, message):
        has_bad_words = False

        for word in message.content:
                for bad_word in bad_words:
                    if bad_word == word.lower():
                        has_bad_words = True
                        break

        if has_bad_words == False:
            if self.has_successor():
                self.successor.filter_message(message)
        else:
            message.can_broadcast = False

class Website_Filter(Filter):

    def filter_message(self, message):
        has_websites = False

        if re.findall(urlmarker.URL_REGEX, message.content) is None:
            if self.has_successor():
                self.successor.filter_message(message)
        else:
            message.can_broadcast = False



def main():
    new_message = Message("This is full of fucking shit www.google.com")
    slang_filter = Slang_Filter()
    website_filter = Website_Filter(slang_filter)
    website_filter.filter_message(new_message)
    print new_message.can_broadcast

if __name__ == '__main__':
    main()

Solution

Chain of Responsibility

  • Rather than building the chain on instantiation, make the creation of the chain a function.
    This can allow you to chain the creation, which is easier to read and nicer to look at.

  • I’d also make the chain return, rather than mutate a foreign object.
    This removes side effects, which lead to hard to understand code, and potential bugs.

  • has_successor is a poor function, it’s simpler to read and understand using if self._next is not None, rather than using that abstraction.
    And so I’d remove it, I’d favour writing out the non-function way anyway.

And so I’d come to:

class Filter(object):
    _next = None
    def add_filter(self, next_filter):
        self._next = next_filter
        return next_filter

    def filter_message(self, message):
        if self._next is None:
            return True
        return self._next.filter_message(message)

class SlangFilter(Filter):
    def filter_message(self, message):
        for word in message.content:
            for bad_word in bad_words:
                if bad_word == word.lower():
                    return False

        if self._next is None:
            return True
        return self._next.filter_message(message)

class Website_Filter(Filter):
    def filter_message(self, message):
        if re.findall(urlmarker.URL_REGEX, message.content) is not None:
            return False

        if self._next is None:
            return True
        return self._next.filter_message(message)

This then lets you create the chain as:

def main():
    new_message = Message("This is full of fucking shit www.google.com")
    filter = WebsiteFilter()
        .add_filter(SlangFilter())
    print filter.filter_message(new_message)

Going forward, you’d want to change your classes in a couple of other ways too.

  • Rather than relying on a global bad words variable, SlangFilter should be passed these words on instantiation.
  • Rather than manually re-creating in, you can use it to reduce code complexity, and for a performance boost, if you were to change the list of words to a set.
  • Your code shouldn’t rely on an unneeded model, as a string is good enough for this.

And so I’d use:

import urlmarker
import re


class Filter(object):
    _next = None
    def add_filter(self, next_filter):
        self._next = next_filter
        return next_filter

    def filter_message(self, message):
        if self._next is None:
            return True
        return self._next.filter_message(message)


class SlangFilter(Filter):
    def __init__(self, bad_words):
        self._bad_words = {word.lower() for word in bad_words}
    
    def filter_message(self, message):
        for word in message.lower().split():
            if word in self._bad_words:
                return False

        if self._next is None:
            return True
        return self._next.filter_message(message)


class Website_Filter(Filter):
    def filter_message(self, message):
        if re.findall(urlmarker.URL_REGEX, message) is not None:
            return False

        if self._next is None:
            return True
        return self._next.filter_message(message)


def main():
    with open("bad-words.txt") as file:
        bad_words = [word.strip() for word in file]
    
    message = "This is full of fucking shit www.google.com"
    filter = WebsiteFilter()
        .add_filter(SlangFilter(bad_words))
    print filter.filter_message(message)


if __name__ == '__main__':
    main()

After thoroughly analysing my code and @Peilonrayz answer, I came to the conclusion that there is a bug in the WebsiteFilter class. The following code will always return False:

if re.findall(urlmarker.URL_REGEX, message) is not None:
    return False

The problem is that re.findall will return a list, empty or not, and as such it will never be None. It should be changed to:

if not re.findall(urlmarker.URL_REGEX, message):
    return False

Other bug that I found is that in @Peilonrayz answer

filter = WebsiteFilter()
        .add_filter(SlangFilter(bad_words))

will make the filter be initialised with SlangFilter, instead of being initialised with WebsiteFilter, or as @Peilonrayz put it: “we’re using the leaf, rather than the root”.

The correct way to initialise a filter would be:

filter = WebsiteFilter()
filter.add_filter(SlangFilter())

Conclusion
@Peilonrayz answer is still the correct answer. However, if you are seeing this, make sure you correct the above bugs so the code will work as it’s supposed to.

Leave a Reply

Your email address will not be published. Required fields are marked *