Simple text localization system

Posted on

Problem

My first question relates to the first Python module that I published on Python. This is a very small and simple module, the package contains only one class. This is a port of a Perl module Locale::Wolowitz this is a module for managing translations, but uses the serialization format JSON or YAML. Below the code of my Python class, I will like to have an opinion on the code:

import glob
import re
import os.path
from collections import defaultdict


class Pylocwolowitz(object):
    '''Pylocwolitz is a very simple text localization system, meant to be used
by web applications (but can pretty much be used anywhere). Yes, another
localization system.'''

def __init__(self, path, format_deserializer='json'):

    if format_deserializer not in ('json', 'yaml'):
        raise ValueError('FormatNotSupported')

    self.path = path
    self.format_deserializer = format_deserializer
    self.locales = defaultdict(dict)
    self._find_file()

def _find_file(self):
    '''Find all json files'''
    listing = glob.glob(
        os.path.join(self.path, '*.' + self.format_deserializer))
    for infile in listing:
        self._make_loc(infile)

def _make_loc(self, infile):
    '''Store content of the file in a memory'''
    lang = os.path.basename(infile).split('.')[0]
    if self.format_deserializer == 'json':
        import json
        with open(infile) as fil:
            data = json.load(fil)
    else:
        import yaml
        with open(infile) as fil:
            data = yaml.load(fil)
    for key, value in data.items():
        if isinstance(value, dict):
            self.locales[key].update(value)
        else:
            self.locales[key].update({lang: value})

def loc(self, key, lang, values=None):
    '''Return the string key, translated to the requested language (if such
    a translation exists, otherwise no traslation occurs). Any other
    parameters passed to the method are injected to the placeholders in the
    string (if present).'''

    if self.locales[key].get(lang) is None:
        return key

    ret = self.locales[key][lang] if key in self.locales else key
    if values is None:
        return ret
    else:
        return ret % values

I get an error with Python 3.2 on travis.ci but the tests pass on Python 2.x and PyPy, if you have an idea let me know. I’m not sure, but I think as I read it was better to use format or template, rather than % with Python 3 is that I’m wrong.

Solution

import glob
import re
import os.path
from collections import defaultdict


class Pylocwolowitz(object):
    '''Pylocwolitz is a very simple text localization system, meant to be used
by web applications (but can pretty much be used anywhere). Yes, another
localization system.'''

def __init__(self, path, format_deserializer='json'):

Calling it format_deserializer suggests to me that it should be a function. Perhaps it should just be format?

    if format_deserializer not in ('json', 'yaml'):
        raise ValueError('FormatNotSupported')

    self.path = path
    self.format_deserializer = format_deserializer
    self.locales = defaultdict(dict)
    self._find_file()

def _find_file(self):
    '''Find all json files'''

You are finding files, find a file. Consider changing the name.

    listing = glob.glob(
        os.path.join(self.path, '*.' + self.format_deserializer))
    for infile in listing:

I’d do

 path = os.path.join(self.path, '*.' + self.format_deserializer)
 for infile in glob.glob(path):

Which I think comes out a bit cleaner

        self._make_loc(infile)

def _make_loc(self, infile):
    '''Store content of the file in a memory'''

Avoid abbreviations, I have no idea what loc is.

    lang = os.path.basename(infile).split('.')[0]

There is a splitext function which you might find useful here. You also don’t ue this until way latter. Perhaps you should determine it later.

    if self.format_deserializer == 'json':
        import json
        with open(infile) as fil:
            data = json.load(fil)
    else:
        import yaml
        with open(infile) as fil:
            data = yaml.load(fil)

Instead, you could do

 deseralizer = __import__(self.format_deserializer)
 with open(infile) as fil:
    data = deserializer.load(fil)

That would avoid doing the same thing twice.

    for key, value in data.items():
        if isinstance(value, dict):
            self.locales[key].update(value)
        else:
            self.locales[key].update({lang: value})

def loc(self, key, lang, values=None):
    '''Return the string key, translated to the requested language (if such
    a translation exists, otherwise no traslation occurs). Any other
    parameters passed to the method are injected to the placeholders in the
    string (if present).'''

    if self.locales[key].get(lang) is None:
        return key

Do you actually want to skip formatting this case?

    ret = self.locales[key][lang] if key in self.locales else key

Why are you checking if key is in self.locales? It will be because the previous statement will have put it in regardless. And if it’s not there, the first statement would have already returned. Also, its more pythonic to try and fetch it and catch the exception like:

   try:
       ret = self.locales[key][lang]
   except KeyError:
       ret =  key

That way you only lookup the dictionary once. But we can actually do even better by using get with a default

   ret = self.locales[key].get(lang, key)

That, I think, is a very slick approach.

    if values is None:
        return ret
    else:
        return ret % values

Using .format would probably be preferred

Leave a Reply

Your email address will not be published. Required fields are marked *