Problem
- Get names from names.txt
- Get transactions from transactions.txt
- The output of the program should show who owes what like so:
Alice pays $xx.xx to David. Bob pays $xx.xx to Claire.
And so on.
names.txt
Alice Bob Claire David
transactions.txt
Bob paid $55.90 for petrol. Bob paid $70 for house-cleaning. Bob paid $85.60 for lunch. Claire paid $100.10 for phone bill. Claire paid $103.45 for dinner. Claire paid $30.80 for snacks. David paid $170.80 for groceries. David paid $33.40 for breakfast. David paid $63.50 for utilities.
My code works and gets the right result. I’m just wondering if there is a cleaner way.
The code creates a payment object, adds people, then adds payments, then gets liabilities.
import re
from decimal import Decimal
def filer(file):
f = open(file, 'r')
lines = []
for line in f.readlines():
lines.append(line.strip())
return lines
def parse_lines(lines, **kwargs):
compilers = kwargs.get('keys', ['money', 'name', 'bill'])
parses = []
for line in lines:
l1 = Parser(line, keys=compilers)
parses.append(l1.get_dict())
return parses
class Parser(object):
patterns = {'name_pattern':'^([w-]+)','money_pattern':'(d+.?d*)','bill_pattern':"[w'-]+.$"}
keys = ['money', 'name', 'bill']
def __init__(self, line, **kwargs):
if 'patterns' in kwargs:
self.patterns.update(kwargs.get('patterns'))
del kwargs['patterns']
self.__dict__.update(kwargs)
for key in self.keys:
key_str = key+'_pattern'
parser = re.compile(self.patterns[key_str])
value = parser.search(line)
setattr(self, key, value.group(0))
def get_dict(self):
rtn = {}
for key in self.keys:
rtn[key] = getattr(self, key)
return rtn
class Payments(object):
def __init__(self, **kwargs):
self.people = {}
self.formatter = "{from} pays {currency}{amount} to {to}."
self.currency = '$'
self.__dict__.update(kwargs)
def add_person(self, person):
self.people[str(person.name)] = person
def add_payment(self, payment):
person = self.people[payment['name']]
person.add_to(payment)
def get_liabilities(self):
liabilities = []
for name, person in self.people.items():
person.set_share(len(self.people.items()))
for name, person in self.people.items():
"""gets share & sets it against each person in the group"""
person.set_liability(self.people, self.currency)
liabilities += person.liabilities
return liabilities
def print_liabilities(self):
liabilities = []
for liability in self.get_liabilities():
liabilities.append(self.formatter.format(**liability))
return liabilities
class Person(object):
def __init__(self, **kwargs):
self.name = ""
self.items = []
self.paid = 0
self.share = 0
self.liabilities = []
self.__dict__.update(kwargs)
def add_to(self, payment):
""" payment must have keys: 'name', 'money'"""
if payment['name'] == self.name:
self.items.append(payment)
def _get_paid(self):
balance = 0
for item in self.items:
amount = Decimal(item['money'],2)
balance += amount
return Decimal(balance, 2)
def set_share(self, total_people):
"""total amount paid is divided by the number of
people to get the liabilities of each person"""
self.paid = self._get_paid()
self.share = round(self.paid/Decimal(total_people), 2)
def _set_extras(self, liability, keys):
"""sets extra properties"""
for key in keys:
if hasattr(self, key):
liab_dict[key] = getattr(self, key)
def set_liability(self, people, currency):
"""if owed is share is greater than owed, name needs to pay this"""
for name, person in people.items():
if self.name != name:
"""if owed is greater than what I have paid then add to liabilities"""
diff = self.share - person.share
if diff < 0:
liab_dict = {'from':self.name,'to': name,'amount':-diff, 'currency':currency }
self.liabilities.append(liab_dict)
class PaymentBuilder(object):
people = []
def __init__(self, **kwargs):
transactions = kwargs.get('transaction_file','./transactions.txt')
names = kwargs.get('names_file','./names.txt')
self.payments = Payments()
self.trans = parse_lines(filer(transactions))
self.names = filer(names)
for name in self.names:
self.payments.add_person(Person(name=name))
for tran in self.trans:
self.payments.add_payment(tran)
def liabilities(self):
return [ liability for liability in self.payments.print_liabilities()]
To run just init PageBuilder
then get liabilities. Please be sure to place the names.txt and transactions.txt in the root folder otherwise pass locations.
For specific location file use:
PaymentBuilder(transaction_file=’/var/location/filename.txt’)
payments = PaymentBuilder()
for line in payments.liabilities():
print(line)
Output:
Claire pays $8.33 to David.
Bob pays $5.71 to Claire.
Bob pays $14.04 to David.
Alice pays $58.59 to Claire.
Alice pays $66.92 to David.
Alice pays $52.88 to Bob.
Solution
Low hanging fruits
You should read PEP8 to make your code look more like Python code to others. You have a few violations:
- spacing around operators (
+
,,
, and=
mostly); - blank lines around top-level code;
- differences between comments and docstrings.
There is also this **kwargs
thing… I don’t get why you use it in most of your functions and constructors. You don’t need variable number of named arguments, all you need is a default value for some of them: use the more idiomatic
def __init__(self, transactions_file='./transactions.txt', names_file='./names.txt'):
(for PaymentBuilder
) instead.
You should also use a with
statement to manage your files instead of manually closing them; which you forgot to do anyway…
Lastly, the Person._set_extras
method is never called and can be safely removed (especially since it uses an undefined variable).
Stop writing classes
Two of your classes are only an __init__
and a single method. These should be functions instead, as there is not really any state that we need to manipulate.
And as we will see later, the other two can be simplified to a few lines as well so there really is no need into using classes here: the whole point of Person
is only to be a dictionary value, and Payments
is the holding dictionary. Keep it simple, use dictionaries instead.
The whole logic of PaymentBuilder.liabilities()
can be expressed as follow, from your code:
- For each name in the file, initialise a
dict
entry to the emptylist
to store future transactions - For each transaction in the file, store the amount spend into the right name in the aforementioned dictionary
- For each entry in the dictionary, compute the share spent by the person
- For each couple of persons, compute the difference in spent shares and, if the first owe something to the second, print that information.
Each of these steps can be seen as a dict-comprehension or a for-loop over a dictionary content.
Use generators
Instead of building lists and appending to them in order to build other lists by appending to them… You can reduce your memory footprint by using generators.
This will allow you to parse files one line at a time, for instance, without having to read them upfront. Or print liabilities one by one without having to compute them all upfront either.
Proposed improvements
import re
from decimal import Decimal
def filer(filename):
with open(filename, 'r') as f:
yield from map(str.strip, f)
def parse_line(line, patterns):
return {
key: parser.search(line).group(0)
for key, parser in patterns.items()
}
def parse_lines(lines, patterns=None):
if patterns is None:
patterns = {
'name': re.compile('^([w-]+)'),
'money': re.compile('(d+.?d*)'),
'bill': re.compile("[w'-]+.$"),
}
yield from (parse_line(line, patterns) for line in lines)
def liabilities(transactions_file='./transactions.txt', names_file='./names.txt', currency='$'):
persons = {name: [] for name in filer(names_file)}
for transaction in parse_lines(filer(transactions_file)):
persons[transaction['name']].append(transaction['money'])
total_people = Decimal(len(persons))
shares = {
name: round(sum(Decimal(p) for p in paid) / total_people, 2)
for name, paid in persons.items()
}
for name, share in shares.items():
for other, other_share in shares.items():
if name == other:
continue
difference = share - other_share
if difference < 0:
yield '{debitor} pays {currency}{amount} to {creditor}.'.format(
currency=currency,
amount=-difference,
debitor=name, creditor=other)
if __name__ == '__main__':
for line in liabilities():
print(line)
Design
I echo the advice from @MathiasEttinger to stop writing classes. I think that your solution is over-engineered. However, at the same time, your design is fragile and prone to misuse. Some observations about your design:
-
You rely on a specific sequence of mutations to be performed, particularly in
Payments.get_liabilities()
:def get_liabilities(self): liabilities = [] for name, person in self.people.items(): person.set_share(len(self.people.items())) for name, person in self.people.items(): """gets share & sets it against each person in the group""" person.set_liability(self.people, self.currency) liabilities += person.liabilities return liabilities
So, before
get_liabilities()
is called, thepeople
must already have had their expenses added. Then, withinperson.set_liability()
, you expect every other person’s.share
to already have been computed. That correct usage of thePerson
class is not apparent from any of the documentation of thePerson
class.(By the way,
len(self.people)
would suffice in the excerpt above;len(self.people.items())
is overkill.) -
The calculation actually involves a nested for-loop, but the way it works is obscured due to the fact that the outer loop is in
Payments.get_liabilities()
, and the inner loop is inPerson.set_liability()
. -
Payments.print_liabilities()
is misleading. It never prints anything! -
The multi-currency support isn’t effective. It’s basically an exercise in parameter-passing, after setting the hard-coded
self.currency = '$'
in the constructor of thePayments
class. If you had inferred the$
symbol from parsingtransactions.txt
and passed it along, then that would actually be useful. In any case, currency formatting can be complicated. Some currency symbols are, for example, traditionally placed after the number. In summary, this simplistic attempt is not worthwhile. You’d be better off hard-coding the$
sign in theformatter
string. -
Person._set_extras()
is unused (and if it were used, would be a sign of poor class design). -
The
Parser
is needlessly complex. You could accomplish the same thing using just one regular expression with named capture groups. It should just be hard-coded; there is no advantage to making the sub-expressions configurable.
Style
-
A lot of the code could be shortened by writing comprehensions instead of
for
loops. For example,filer()
,Parser.get_dict()
,Payments.print_liabilities()
, andPerson._get_paid()
could each be one-liners (or nearly so).On the other hand, this is a completely redundant list comprehension:
def liabilities(self): return [ liability for liability in self.payments.print_liabilities()]
-
Don’t abuse
**kwargs
. YourPaymentBuilder
constructor should take two parameters (with default values). Yourparse_lines()
function should accept one list. As for yourParser
constructor, I have no idea what magic it’s trying to apply. (Anyway, as I mentioned above, yourParser
should just be one hard-coded regular expression.) -
Use
r'raw strings'
for regular expressions that contain backslashes, to avoid the possibility of the backslashes being misinterpreted. -
Write
"""docstrings"""
for all of your methods. But don’t write docstrings where they are actually# code comments
.
Suggested solution
I’d just write two functions, plus a bit of glue code to call the functions and print out the results. Note that all of the decimal rounding occurs in the formatter.
import re
from decimal import Decimal
def parse_expenses(names_file, transactions_file):
"""
Parse a file containing transactions to obtain a dict listing the total
expenses incurred by each person listed in the names file.
names_file should be a path to a text file, containing one name per line.
transactions_file should be a path to a text file, with each line of the
form "John paid $1.23 for blah".
"""
regex = re.compile(r'^(?P<name>[w-]+).*?$(?P<amount>d+.?d*)')
with open(names_file) as f:
expenses = {line.strip(): Decimal('0') for line in f}
with open(transactions_file) as f:
for line in f:
match = regex.match(line)
if not match or match.group('name') not in expenses:
raise ValueError('Bad transaction: "{0}"'.format(line.strip()))
expenses[match.group('name')] += Decimal(match.group('amount'))
return expenses
def equalization_payments(expenses):
"""
Given a dictionary that maps names to expenses, generate
(payer, payee, amount) tuples that would result in the expenses being
equalized.
"""
n = len(expenses)
for debtor, debt in expenses.items():
for creditor, credit in expenses.items():
diff = credit - debt
if diff > 0:
yield debtor, creditor, diff / n
if __name__ == '__main__':
expenses = parse_expenses('names.txt', 'transactions.txt')
for payer, payee, amount in equalization_payments(expenses):
print('{payer} pays ${amount:0.2f} to {payee}.'.format(
payer=payer, payee=payee, amount=amount
))