Problem
Is this a good implementation of a Chain Responsibility Pattern? This code attempts to filter a message received (for instance, in a chat application).
import urlmarker
import re
bad_words = []
with open("bad-words.txt") as file:
for entry in file:
entry = entry.strip()
bad_words.append(entry)
class Message(object):
def __init__(self, content):
self.content = content.split()
self.can_broadcast = True
self.construct_word()
def construct_word(self):
self.content = ' '.join(self.content)
class Filter():
def __init__(self, successor = None):
self.successor = successor
def has_successor(self):
if self.successor is not None:
return True
return False
def filter_message(self, message):
if self.has_successor():
self.successor.filter_message(message)
class Slang_Filter(Filter):
def filter_message(self, message):
has_bad_words = False
for word in message.content:
for bad_word in bad_words:
if bad_word == word.lower():
has_bad_words = True
break
if has_bad_words == False:
if self.has_successor():
self.successor.filter_message(message)
else:
message.can_broadcast = False
class Website_Filter(Filter):
def filter_message(self, message):
has_websites = False
if re.findall(urlmarker.URL_REGEX, message.content) is None:
if self.has_successor():
self.successor.filter_message(message)
else:
message.can_broadcast = False
def main():
new_message = Message("This is full of fucking shit www.google.com")
slang_filter = Slang_Filter()
website_filter = Website_Filter(slang_filter)
website_filter.filter_message(new_message)
print new_message.can_broadcast
if __name__ == '__main__':
main()
Solution
Chain of Responsibility
-
Rather than building the chain on instantiation, make the creation of the chain a function.
This can allow you to chain the creation, which is easier to read and nicer to look at. -
I’d also make the chain return, rather than mutate a foreign object.
This removes side effects, which lead to hard to understand code, and potential bugs. -
has_successor
is a poor function, it’s simpler to read and understand usingif self._next is not None
, rather than using that abstraction.
And so I’d remove it, I’d favour writing out the non-function way anyway.
And so I’d come to:
class Filter(object):
_next = None
def add_filter(self, next_filter):
self._next = next_filter
return next_filter
def filter_message(self, message):
if self._next is None:
return True
return self._next.filter_message(message)
class SlangFilter(Filter):
def filter_message(self, message):
for word in message.content:
for bad_word in bad_words:
if bad_word == word.lower():
return False
if self._next is None:
return True
return self._next.filter_message(message)
class Website_Filter(Filter):
def filter_message(self, message):
if re.findall(urlmarker.URL_REGEX, message.content) is not None:
return False
if self._next is None:
return True
return self._next.filter_message(message)
This then lets you create the chain as:
def main():
new_message = Message("This is full of fucking shit www.google.com")
filter = WebsiteFilter()
.add_filter(SlangFilter())
print filter.filter_message(new_message)
Going forward, you’d want to change your classes in a couple of other ways too.
- Rather than relying on a global bad words variable,
SlangFilter
should be passed these words on instantiation. - Rather than manually re-creating
in
, you can use it to reduce code complexity, and for a performance boost, if you were to change the list of words to aset
. - Your code shouldn’t rely on an unneeded model, as a string is good enough for this.
And so I’d use:
import urlmarker
import re
class Filter(object):
_next = None
def add_filter(self, next_filter):
self._next = next_filter
return next_filter
def filter_message(self, message):
if self._next is None:
return True
return self._next.filter_message(message)
class SlangFilter(Filter):
def __init__(self, bad_words):
self._bad_words = {word.lower() for word in bad_words}
def filter_message(self, message):
for word in message.lower().split():
if word in self._bad_words:
return False
if self._next is None:
return True
return self._next.filter_message(message)
class Website_Filter(Filter):
def filter_message(self, message):
if re.findall(urlmarker.URL_REGEX, message) is not None:
return False
if self._next is None:
return True
return self._next.filter_message(message)
def main():
with open("bad-words.txt") as file:
bad_words = [word.strip() for word in file]
message = "This is full of fucking shit www.google.com"
filter = WebsiteFilter()
.add_filter(SlangFilter(bad_words))
print filter.filter_message(message)
if __name__ == '__main__':
main()
After thoroughly analysing my code and @Peilonrayz answer, I came to the conclusion that there is a bug in the WebsiteFilter class. The following code will always return False:
if re.findall(urlmarker.URL_REGEX, message) is not None:
return False
The problem is that re.findall
will return a list, empty or not, and as such it will never be None
. It should be changed to:
if not re.findall(urlmarker.URL_REGEX, message):
return False
Other bug that I found is that in @Peilonrayz answer
filter = WebsiteFilter()
.add_filter(SlangFilter(bad_words))
will make the filter be initialised with SlangFilter, instead of being initialised with WebsiteFilter, or as @Peilonrayz put it: “we’re using the leaf, rather than the root”.
The correct way to initialise a filter would be:
filter = WebsiteFilter()
filter.add_filter(SlangFilter())
Conclusion
@Peilonrayz answer is still the correct answer. However, if you are seeing this, make sure you correct the above bugs so the code will work as it’s supposed to.