Writing image pixel data to printer code

Problem

I’ve been trying to take image pixel data and write it out as printer code, but my results are rather slow.

Here is a simplified version of what I have so far (image is a PIL Image object of 1200 x 1800 pixels):

# ~36 seconds on beaglebone
f.write(pclhead)
pix = image.getdata()
for y in xrange(1800):
  row = '\xff'*72
  ### vvv slow code vvv ###
  for x in xrange(1200):
    (r,g,b) = pix[y*1200+x]
    row += chr(g)+chr(b)+chr(r)
  ### ^^^ slow code ^^^ ###
  row += '\xff'*72
  f.write(row)
f.write(pclfoot)

I know the loop can be optimized way better, but how?

My code is running on a BeagleBone, so speeds are lower than you’d expect, but composing the complex images takes about 5 seconds; I wouldn’t expect my printer-code function (which just reorders the data) to take much longer than 2 or 3 seconds. My first attempt (with getpixel) took 90 seconds. Now I have it down to 36 seconds. Surely I can make this quite a bit faster yet.

For comparison, just so we can all see where the hold up is, this code runs in 0.1 secs (but, of course, is lacking the important data):

# ~0.1 seconds on beaglebone
f.write(pclhead)
pix = image.getdata()
for y in xrange(1800):
  row = '\xff'*72
  ### vvv substituted vvv ###
  row += '\xff'*3600
  ### ^^^ substituted ^^^ ###
  row += '\xff'*72
  f.write(row)
f.write(pclfoot)

I guess a simplified version of this problem is to rewrite something like the following:

[ (1,2,3), (1,2,3) ... 1200 times ]

into

[ 2, 3, 1, 2, 3, 1, etc... ]

but as a string

"x02x03x01x02x03x01 ... "

Solution

Start by storing the '\xff' * 72 string as a constant; Python strings are immutable, and recreating that string on every iteration is unnecessary.

Next, use a list to collect all the strings, then join them at the end; this is cheaper than repeated string concatenation.
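
If you want to see that difference on your own hardware, here is a quick sketch using the standard timeit module (the numbers vary by platform, and chunks is just a made-up stand-in for the per-pixel strings):

import timeit

setup = "chunks = ['abc'] * 1200"
concat = "s = ''\nfor c in chunks: s += c"  # repeated concatenation
joined = "s = ''.join(chunks)"              # one single join
print timeit.timeit(concat, setup, number=1000)
print timeit.timeit(joined, setup, number=1000)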

Last, avoid attribute lookups in critical sections; assign any attribute lookups done more than once to a local name:

from operator import itemgetter

bgr = itemgetter(1,2,0)

pix = image.getdata()
rowstart = '\xff' * 72
f_write = f.write
empty_join = ''.join
for y in xrange(1800):
    row = [rowstart]
    r_extend = row.extend
    for x in xrange(1200):
        r_extend(map(chr, bgr(pix[y*1200+x])))
    row.append(rowstart)

    f_write(empty_join(row))

You can experiment with joining the whole row (including rowstart) or writing out rowstart values separately; the following version might be faster still depending on write speed versus list concatenation speed:

for y in xrange(1800):
    f_write(rowstart)
    row = []
    r_extend = row.extend
    for x in xrange(1200):
        r_extend(map(chr, bgr(pix[y*1200+x])))
    f_write(empty_join(row))
    f_write(rowstart)

First, profile your code to determine where your bottleneck is.

But generally speaking, there are a few avenues for improvement:

  1. Do not generate the string of high (0xff) values inside the loop.
  2. Use generators.
  3. Process a 1D list as a 1D list rather than indexing it with two loops.

Sample Code

from itertools import islice

f.write(pclhead)
pix = image.getdata()
HIGH_VALUES = '\xff'*72  # built once, outside the loop
#pix_it = (''.join(map(chr, (e[1], e[2], e[0]))) for e in pix)

pix_it = (chr(g)+chr(b)+chr(r) for r,g,b in pix)
while True:
    data = ''.join(islice(pix_it, 1200))  # one row = 1200 pixels = 3600 bytes
    if not data: break
    f.write(HIGH_VALUES)
    f.write(data)
    f.write(HIGH_VALUES)
f.write(pclfoot)

Can you use numpy? If you provide a test image it would help (I know most of this answer works; ndarray.tofile writes the array’s raw bytes in C row-major order, which should be the right thing here):

import numpy

f.write(pclhead)
pix = numpy.array(image).astype(numpy.uint8)  # array shape is (height, width, 3)
pix[:,:,[0,1,2]] = pix[:,:,[1,2,0]]  # swap RGB to GBR in place (the fancy-indexed right side is a copy, not a view, but still fast)
# 72 bytes of 0xff per margin = 24 RGB pixels on each side of every row
to_insert = numpy.ones((pix.shape[0], 24, 3), dtype=numpy.uint8) * 255
pix = numpy.concatenate((to_insert, pix, to_insert), axis=1)
pix.tofile(f)
f.write(pclfoot)

I think you can save even more time by keeping to iterables of integers as long as possible, and switching to bytearray at the last possible moment. Also, as tradeoffs usually go, you can save time by keeping more in RAM and saving it all for one final .write. Exactly how well that works will depend on buffering.

from itertools import chain, imap, izip_longest
from operator import itemgetter

def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks or blocks
    Copied directly from http://docs.python.org/2/library/itertools.html#recipes
    """
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

pix = chain.from_iterable(imap(itemgetter(1, 2, 0), image.getdata()))
rows = grouper(pix, 3600)
# Rows is an iterable of iterables of integers in the range 0-255.

space = bytearray([255] * 72)
img = space + (2 * space).join(map(bytearray, rows)) + space
# You'll need at least ~6.4 MB in RAM, but the BeagleBone has 256 MB! :)
f.write(img)

This is the code I finally arrived at. Not only is it way faster (less than 1 second!), but it still maintains good legibility and doesn’t rely on additional libraries.

This assumes the image format is raw ‘RGB’, as was the case in the original scenario. Instead of image.getdata(), which returns a sequence of tuples, I am now using image.tostring(), which doesn’t need as much coercion to manipulate its data. Also, by preallocating rpix as a bytearray of known size, you save a lot of memory block copying.

# <1 second on beaglebone
f.write(pclhead)

mgn = '\xff'*72
pix = image.tostring()
rpix = bytearray(len(pix))
rpix[0::3] = pix[1::3]
rpix[1::3] = pix[2::3]
rpix[2::3] = pix[0::3]

offs = 0
for y in xrange(1800):
  f.write(mgn)
  f.write(str(rpix[offs:offs+3600]))
  f.write(mgn)
  offs += 3600
f.write(pclfoot)
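
For what it’s worth, the same slice-assignment trick carries over to Python 3 and Pillow, where tostring() has been replaced by tobytes(). A sketch under those assumptions (f must be opened in binary mode):

# Python 3 / Pillow version: image.tobytes() returns the raw 'RGB' data,
# and bytes slices assign directly into bytearray slices.
mgn = b'\xff' * 72
pix = image.tobytes()
rpix = bytearray(len(pix))
rpix[0::3] = pix[1::3]   # G
rpix[1::3] = pix[2::3]   # B
rpix[2::3] = pix[0::3]   # R

f.write(pclhead)
for offs in range(0, len(rpix), 3600):
    f.write(mgn)
    f.write(bytes(rpix[offs:offs + 3600]))
    f.write(mgn)
f.write(pclfoot)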
