Bash script to download sequentially numbered images

Posted on

Problem

This is a pretty basic bash script (3.2 on Mac). I am downloading 584 images from a site in order to create an album.

#!/bin/bash

urlFirst="http://150.216.68.252:8080/adore-djatoka/resolver?url_ver=Z39.88-2004&rft_id=http://150.216.68.252/ncgre000/00000012/00011462/00011462_ac_0"
urlSecond=".jp2&svc_id=info:lanl-repo/svc/getRegion&svc_val_fmt=info:ofi/fmt:kev:mtx:jpeg2000&svc.format=image/jpeg&svc.level=6"

for i in {1..584}
do
    printf -v j "%03d" $i

    url=$urlFirst$j$urlSecond 

    wget $url -O $j".jpeg"

done

What improvements should be made to this?

I understand that I could remove the variables in most cases but for debug purposes it was helpful to have them, as I could add basic echo statements.

I’m not very well versed in Bash scripting though (lots of basic scripts) – any improvements or best practices from Bash I’m missing would be appreciated.

Solution

I would write it this way:

#!/bin/bash

url_template="http://150.216.68.252:8080/adore-djatoka/resolver?url_ver=Z39.88-2004&svc_id=info:lanl-repo/svc/getRegion&svc_val_fmt=info:ofi/fmt:kev:mtx:jpeg2000&svc.format=image/jpeg&svc.level=6&rft_id=http://150.216.68.252/ncgre000/00000012/00011462/00011462_ac_%04d.jp2"

for i in {1..584}; do
    wget -O $(printf '%03d.jpeg' $i) "$(printf "$url_template" $i)"
done

Specifically,

  • At first glance, I thought that $urlFirst and $urlSecond were two URLs, until I figured out what was happening.
  • If there is another leading 0, then %04d is more appropriate than %03d.
  • The web server isn’t going to care about the order of the parameters in the query string. For the benefit of humans, though, it would be nicer to put the parameter that changes either first or last.
  • Fewer variables is nice. I do everything by interpolation. You can get a feel for what the wget command does, without tracing which variable came from where.

Nicely written. I especially like the printf -v trick to write to a variable directly, as opposed to j=$(printf ...). I didn’t know this feature of printf, very cool, so thanks for that.

Speaking of which, j is a poor name, like most single-letter variable names. filename would have been better, and the code would have been instantly easier to read.

ShellCheck gives a warning for the wget $url, suggesting to double quote to prevent globbing and word splitting. For example if the URL contained a space, the script wouldn’t work. Although this is not a real concern in your example, I suggest to take the good advice and double-quote it anyway.

The double-quoting of ".jpeg" is a bit odd. You didn’t need to quote that.

Leave a Reply

Your email address will not be published. Required fields are marked *