Pez! PEZ! …PEEEEEZZZZZZ! (How to scrape Google Image photos for fun and profit.)

By North Street, A Creative Studio

So you need to scrape Google Image photos en masse?

Well, Google doesn’t want you to do that, so it’s not as simple as just hitting file > save webpage.

But… there are a bunch of scripts out there to accomplish this task.

Most are broken.

As of December 1, 2016, this script works… sorta:

https://github.com/hardikvasa/google-images-download

Clone the repo, then open up the google-images-download.py file and adjust the ‘search_keyword’ and ‘keywords’ arrays.

I don’t know why there are two arrays for essentially the same thing, but you do need to include at least two words, one in each array, for this puppy to work right.

So, for my quest for Pez images I did the following:

search_keyword = ['Pez']
keywords = [' candy']

That space before ‘ candy’ is required. The script proceeded to download 100 images.
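One plausible reason the space matters: the script appears to glue each entry in the first array onto each entry in the second to build the search phrase, without adding a separator of its own. Here is a minimal sketch of that idea. It is an assumption about the behavior, not the script's actual code, and `build_queries` is a hypothetical helper name.

```python
def build_queries(search_keyword, keywords):
    """Join every base term with every modifier; no separator is added."""
    return [base + extra for base in search_keyword for extra in keywords]


search_keyword = ['Pez']
keywords = [' candy', ' dispensers']

# With the leading spaces, the phrases come out as separate words.
print(build_queries(search_keyword, keywords))
# → ['Pez candy', 'Pez dispensers']

# Without the leading space, the words run together.
print(build_queries(['Pez'], ['candy']))
# → ['Pezcandy']
```

If the script really does work this way, a missing space would send a single mashed-together word to the search instead of two.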

I then re-ran it with ‘ dispensers’ in the second array, as follows:

search_keyword = ['Pez']
keywords = [' dispensers']

Note: This script is pretty rough and it just downloads the images to the script directory. If you run it more than once, the images will get overwritten, so be sure to move the downloaded images out between executions.
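Moving the images out by hand gets old fast, so here is a small sketch of a helper that sweeps them into a named subfolder after each run. It assumes the downloads land directly in the script directory with common image extensions. The function name `stash_images` and the extension list are made up for illustration and are not part of the script.

```python
import shutil
from pathlib import Path

# Assumed set of extensions; adjust to whatever the script actually saves.
IMAGE_SUFFIXES = {'.jpg', '.jpeg', '.png', '.gif'}


def stash_images(script_dir, label):
    """Move image files from script_dir into script_dir/label.

    Returns the number of files moved.
    """
    src = Path(script_dir)
    dest = src / label
    dest.mkdir(exist_ok=True)

    moved = 0
    # Snapshot the listing first so moving files does not disturb iteration.
    for path in list(src.iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            shutil.move(str(path), str(dest / path.name))
            moved += 1
    return moved
```

For example, calling `stash_images('.', 'pez-candy')` after the first run and `stash_images('.', 'pez-dispensers')` after the second would keep the two batches apart.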

Finally, the script is limited to 100 images. I think that’s a limitation of Google and not the script, as I couldn’t find any place in the script where “100” was actually specified.

SO MUCH PEZ!

 

 
