Convert Persian digits to English numbers

Problem

I use the following code to convert ۱۲۳۴۵ to 12345, effectively transliterating the Persian digits into Latin digits:

String.prototype.toEnglishDigits = function () {
    var num_dic = {
        '۰': '0',
        '۱': '1',
        '۲': '2',
        '۳': '3',
        '۴': '4',
        '۵': '5',
        '۶': '6',
        '۷': '7',
        '۸': '8',
        '۹': '9',
    }

    return parseInt(this.replace(/[۰-۹]/g, function (w) {
        return num_dic[w]
    }));
}

console.log('۱۲۳۴۵'.toEnglishDigits());

Solution

Instead of using a map,
you can get the corresponding digits directly by subtracting the character code of '۰'.
That is, since the character code of '۳' is 1779
and the character code of '۰' is 1776, you can calculate that:

'۳'.charCodeAt(0) - '۰'.charCodeAt(0) = 1779 - 1776 = 3
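
The subtraction only works because the ten Persian digits occupy consecutive code points, U+06F0 through U+06F9. A quick sketch to check that assumption (the variable names are illustrative only):

```javascript
// Check that the Persian digits sit at consecutive code points,
// so that (character code - character code of '۰') gives the digit's value.
var persianDigits = '۰۱۲۳۴۵۶۷۸۹';
var zeroCode = '۰'.charCodeAt(0);   // 1776 (U+06F0)

var values = [];
for (var i = 0; i < persianDigits.length; i++) {
    values.push(persianDigits.charCodeAt(i) - zeroCode);
}

console.log(values.join(''));   // prints "0123456789"
```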

Using the above logic, the function can be written more concisely:

String.prototype.toEnglishDigits = function () {
    var charCodeZero = '۰'.charCodeAt(0);
    // Replace each Persian digit with its numeric value, then parse as base 10.
    return parseInt(this.replace(/[۰-۹]/g, function (w) {
        return w.charCodeAt(0) - charCodeZero;
    }), 10);
}

Patching core prototypes like String to add non-standard functionality is considered bad software engineering practice, especially in a language where code from various libraries may need to coexist on the same webpage. I’d just define a regular function.
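
As a rough sketch of that suggestion, the same logic could live in a standalone function. The name `parsePersianInt` is made up for illustration, and the explicit radix of 10 is an addition for robustness rather than part of the original code:

```javascript
// Standalone alternative to patching String.prototype.
// `parsePersianInt` is a hypothetical name, not an existing API.
function parsePersianInt(str) {
    var charCodeZero = '۰'.charCodeAt(0);
    var latin = String(str).replace(/[۰-۹]/g, function (w) {
        return String(w.charCodeAt(0) - charCodeZero);
    });
    // Radix 10 is passed explicitly so that leading zeros are never read as octal.
    return parseInt(latin, 10);
}

console.log(parsePersianInt('۱۲۳۴۵'));   // prints 12345
```

Because nothing is attached to `String.prototype`, this version cannot collide with another library that happens to define a method of the same name.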

There’s nothing “English” about the output. The result is a Number. It’s not even in base 10. Really, this is an enhancement to parseInt(), and should be named accordingly.

Interestingly, the inverse operation is built into many browsers (notably not WebKit):

(12345).toLocaleString('fa', { useGrouping: false })      // => "۱۲۳۴۵"

What you are trying to do, though, has no built-in browser support. However, the problem is not unique to Persian numerals: many other scripts, such as Arabic-Indic and Devanagari, also have positional base-10 numeral systems whose digits occupy consecutive character codes. Therefore, I suggest a generalization:

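var PERSIAN_NUMERALS = '۰'.charCodeAt(0);

// Parse an integer from str, whose digits belong to the numeral system
// that starts at the character code `zero`.
function numeralParseInt(zero, str) {
    var digits = new Array(str.length);
    for (var i = 0; i < str.length; i++) {
        digits[i] = str.charCodeAt(i);
        if (zero <= digits[i] && digits[i] < zero + 10) {
            digits[i] -= zero - 48;     // '0' = ASCII 48
        }
    }
    // Turn the shifted character codes back into a string, then into a Number.
    return parseInt(String.fromCharCode.apply(null, digits), 10);
}

function persianParseInt(str) {
    return numeralParseInt(PERSIAN_NUMERALS, str);
}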
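
To illustrate how the generalization might be reused, here is a self-contained sketch that applies the same offset trick to other scripts. The `ZERO_CODE_POINTS` table and the `toAsciiDigits` helper are hypothetical names introduced for this example; the values are the Unicode code points of each script's zero digit.

```javascript
// Hypothetical table of zero digits for a few positional base-10 numeral systems.
var ZERO_CODE_POINTS = {
    persian: 0x06F0,      // Extended Arabic-Indic digit zero
    arabic: 0x0660,       // Arabic-Indic digit zero
    devanagari: 0x0966    // Devanagari digit zero
};

// Shift any digit in [zero, zero + 9] to its ASCII counterpart;
// every other character is copied through unchanged.
function toAsciiDigits(zero, str) {
    var out = '';
    for (var i = 0; i < str.length; i++) {
        var code = str.charCodeAt(i);
        if (zero <= code && code < zero + 10) {
            code = code - zero + 48;   // '0' = ASCII 48
        }
        out += String.fromCharCode(code);
    }
    return out;
}

// Digits one through five in each script, written as escapes to avoid ambiguity.
var persian = '\u06F1\u06F2\u06F3\u06F4\u06F5';
var arabic = '\u0661\u0662\u0663\u0664\u0665';
var devanagari = '\u0967\u0968\u0969\u096A\u096B';

console.log(parseInt(toAsciiDigits(ZERO_CODE_POINTS.persian, persian), 10));       // prints 12345
console.log(parseInt(toAsciiDigits(ZERO_CODE_POINTS.arabic, arabic), 10));         // prints 12345
console.log(parseInt(toAsciiDigits(ZERO_CODE_POINTS.devanagari, devanagari), 10)); // prints 12345
```

Keeping the digit conversion separate from `parseInt` means the same helper could also feed `parseFloat` or plain string handling, at the cost of one extra call at each use site.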
