I have an XML file with a listing of images that I need for a site. Typically I fire up IPython with Amara and do my XML wrangling there, but this time I also needed the files themselves. I thought about using Python to grab and save the files too, but a little searching led to this post. Ironically, while the tip to use Python is good, there was also a tip to use curl to get the feed, grep to parse out the image URLs, and xargs to feed them to wget for downloading.
Thinking Serious » Using Python to Grab Images From a Web Site
curl -s http://99designs.com/contests/6999/feed | grep -Po "src=\".*(png|jpg)" | grep -o "http.*" | xargs wget -q
My situation is a bit different, though. The files have no extensions, the URLs live inside a tag rather than a src attribute, and I need to rename the downloads with extensions. After a little Googling I used this, which worked very well.
curl -s http://domain.tld/feed | egrep -o "<tag>.*</tag>" | egrep -o "<tag>(http.*)</tag>" | sed -e 's/<[^>]*>//g'
for f in *; do mv ./"$f" "${f}.jpg"; done
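The extract and rename steps above could also be rolled into one pass, since wget's -O flag lets you name each file as you download it. Here is a rough sketch along those lines; the feed URL and the literal &lt;tag&gt; element are placeholders, just as in the one-liner above.

```shell
# extract_urls: pull bare URLs out of <tag>…</tag> elements on stdin.
# "<tag>" stands in for whatever element actually wraps the URLs.
extract_urls() {
  egrep -o "<tag>http[^<]*</tag>" | sed -e 's/<[^>]*>//g'
}

# fetch_with_ext: download each URL on stdin, naming the file after the
# last path component with .jpg appended, so no separate rename loop
# is needed afterwards.
fetch_with_ext() {
  while read -r url; do
    wget -q -O "$(basename "$url").jpg" "$url"
  done
}

# Usage: curl -s http://domain.tld/feed | extract_urls | fetch_with_ext
```

This is just the same pipeline refactored into functions, not what the original post did, but it saves the second trip over the directory.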
I still need to do some XML wrangling with Amara, but now the files just need to be moved to the right directories. Nice.