Extracting images from Excel spreadsheets
You have images stuck inside a Microsoft Excel spreadsheet.
You need to save them to your hard drive.
Your problem is:
- There are multiple images. They’re on different worksheets.
- The pictures need to be named after the worksheet they were on.
- There might be multiple images on each worksheet.
Use this Python code. It interfaces with Microsoft Excel, so you’ll need to have Excel installed (I have Excel 2010). You will also need the Python packages pywin32 and PIL. Open the workbook in Excel before running the script, because it works on the active workbook.
- It will save all the images in the Excel workbook to the same folder as the workbook.
- The images will be saved as JPEG files.
- The images will be named after the worksheet they were on: Sheet1.jpg, Sheet2.jpg, and so on.
- If there was more than one image on a worksheet, the images will be numbered: Sheet1.jpg, Sheet1_001.jpg, Sheet1_002.jpg, and so on.
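The naming rule above can be sketched as a small helper. This is only an illustration: the function name picture_filename and its parameters are hypothetical, and picture_index is assumed to count pictures on one worksheet starting from 0.

```python
def picture_filename(sheet_name, picture_index, extension=".jpg"):
    """Build the file name for the picture_index-th picture (0-based) on a sheet."""
    # The first picture gets the bare sheet name; later ones get _001, _002, ...
    suffix = "" if picture_index == 0 else "_%03i" % picture_index
    return sheet_name + suffix + extension


for i in range(3):
    print(picture_filename("Sheet1", i))
# Prints Sheet1.jpg, Sheet1_001.jpg, Sheet1_002.jpg (one per line)
```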
Limitations: When Excel copies an image to the clipboard, it appears to use a fixed DPI, so the saved image may have a lower resolution than the original.
Alternate approaches if this doesn’t suit you:
- Save the Excel file to HTML format; all the images drop out as files with names like image001.png.
- Dive into the Excel file; xlsx files (Excel 2007 and later) are just zip files. The images are stored inside in their original format (JPG or PNG) and original size. The images are helpfully named image112.jpg and so on.
import os

import win32com.client        # From the pywin32 package (pip install pywin32)
from PIL import ImageGrab     # From PIL / Pillow (pip install Pillow)

# Attach to Excel; the workbook to process must be open and active
excel = win32com.client.Dispatch("Excel.Application")
workbook = excel.ActiveWorkbook
wb_folder = workbook.Path
wb_name = workbook.Name
wb_path = os.path.join(wb_folder, wb_name)

print("Extracting images from %s" % wb_path)

image_no = 0
for sheet in workbook.Worksheets:
    # Count pictures only, so other shapes don't skew the numbering
    pictures_on_sheet = 0
    for shape in sheet.Shapes:
        if shape.Name.startswith("Picture"):
            # Some debug output for console
            image_no += 1
            print("---- Image No. %07i ----" % image_no)

            # Sequence number the pictures, if there's more than one
            num = "" if pictures_on_sheet == 0 else "_%03i" % pictures_on_sheet
            pictures_on_sheet += 1
            filename = sheet.Name + num + ".jpg"
            file_path = os.path.join(wb_folder, filename)
            print("Saving as %s" % file_path)    # Debug output

            shape.Copy()    # Copies from Excel to Windows clipboard

            # Use PIL (python imaging library) to save from Windows clipboard
            # to a file
            image = ImageGrab.grabclipboard()
            if image is None:
                print("Clipboard held no image, skipping")
                continue
            # JPEG has no alpha channel, so convert to RGB before saving
            image.convert("RGB").save(file_path, "jpeg")
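
The second alternate approach (treating the xlsx file as a zip archive) can be sketched roughly as follows. This is a minimal sketch, not part of the script above: the function name extract_media is hypothetical, and it assumes the embedded pictures sit under the xl/media/ folder of the archive. It needs neither Excel nor pywin32, but the files keep their internal names (image1.png and so on) rather than being named after their worksheet.

```python
import os
import zipfile


def extract_media(xlsx_path, out_folder):
    """Copy every file under xl/media/ in the workbook archive to out_folder."""
    saved = []
    with zipfile.ZipFile(xlsx_path) as archive:
        for member in archive.namelist():
            # Assumption: embedded pictures live under xl/media/
            if member.startswith("xl/media/") and not member.endswith("/"):
                target = os.path.join(out_folder, os.path.basename(member))
                with archive.open(member) as src, open(target, "wb") as dst:
                    dst.write(src.read())
                saved.append(target)
    return saved
```

Calling extract_media with the path to a workbook and an existing output folder returns the list of files it wrote; the images come out at their original size, so the clipboard DPI limitation does not apply.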