Thursday, March 31, 2011

Export PDF pages to a series of images in Java

I need to export the pages of an arbitrary PDF document into a series of individual images in JPEG/PNG/etc. format. I need to do this in Java.

Although I know about iText, PDFBox and various other Java PDF libraries, I am hoping for a pointer to a working example or a how-to.

Thanks.

From stackoverflow
  • Here is one way to do it, combining some code fragments from around the web.

    How do I draw a PDF into an Image?

    https://pdf-renderer.dev.java.net/examples.html

    Creating a Buffered Image from an Image

    http://www.exampledepot.com/egs/java.awt.image/Image2Buf.html

    Saving a Generated Graphic to a PNG or JPEG File

    http://www.exampledepot.com/egs/javax.imageio/Graphic2File.html

    Combined together into something that works like this to turn all the pages into images:

    import com.sun.pdfview.PDFFile;
    import com.sun.pdfview.PDFPage;
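    import java.awt.Graphics;
    import java.awt.GraphicsConfiguration;
    import java.awt.GraphicsDevice;
    import java.awt.GraphicsEnvironment;
    import java.awt.HeadlessException;
    import java.awt.Image;
    import java.awt.Rectangle;
    import java.awt.Transparency;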
    import java.io.*;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import javax.swing.*;
    import javax.imageio.*;
    import java.awt.image.*;
    
    public class ImageMain {
    
    public static void setup() throws IOException {
    
        //load a pdf from a byte buffer
        File file = new File("test.pdf");
        RandomAccessFile raf = new RandomAccessFile(file, "r");
        FileChannel channel = raf.getChannel();
        ByteBuffer buf = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        PDFFile pdffile = new PDFFile(buf);
    
        int numPgs = pdffile.getNumPages();
    
    
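       // PDFRenderer page numbers start at 1; getPage(0) also returns the
       // first page, so looping from 0 would duplicate it and skip the last one
       for (int i=1; i<=numPgs; i++)
       {
       // draw the current page to an image
        PDFPage page = pdffile.getPage(i);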
    
        //get the width and height for the doc at the default zoom 
        Rectangle rect = new Rectangle(0,0,
                (int)page.getBBox().getWidth(),
                (int)page.getBBox().getHeight());
    
        //generate the image
        Image img = page.getImage(
                rect.width, rect.height, //width & height
                rect, // clip rect
                null, // null for the ImageObserver
                true, // fill background with white
                true  // block until drawing is done
                );
    
    
        //save it as a file
        BufferedImage bImg = toBufferedImage( img );
        File yourImageFile = new File("page_" + i + ".png");
        ImageIO.write( bImg,"png",yourImageFile);
        }
    }
    
    
    // This method returns a buffered image with the contents of an image
    public static BufferedImage toBufferedImage(Image image) {
        if (image instanceof BufferedImage) {
            return (BufferedImage)image;
        }
    
        // This code ensures that all the pixels in the image are loaded
        image = new ImageIcon(image).getImage();
    
        // Determine if the image has transparent pixels; for this method's
        // implementation, see e661 Determining If an Image Has Transparent Pixels
        boolean hasAlpha = hasAlpha(image);
    
        // Create a buffered image with a format that's compatible with the screen
        BufferedImage bimage = null;
        GraphicsEnvironment ge = GraphicsEnvironment.getLocalGraphicsEnvironment();
        try {
            // Determine the type of transparency of the new buffered image
            int transparency = Transparency.OPAQUE;
            if (hasAlpha) {
                transparency = Transparency.BITMASK;
            }
    
            // Create the buffered image
            GraphicsDevice gs = ge.getDefaultScreenDevice();
            GraphicsConfiguration gc = gs.getDefaultConfiguration();
            bimage = gc.createCompatibleImage(
                image.getWidth(null), image.getHeight(null), transparency);
        } catch (HeadlessException e) {
            // The system does not have a screen
        }
    
        if (bimage == null) {
            // Create a buffered image using the default color model
            int type = BufferedImage.TYPE_INT_RGB;
            if (hasAlpha) {
                type = BufferedImage.TYPE_INT_ARGB;
            }
            bimage = new BufferedImage(image.getWidth(null), image.getHeight(null), type);
        }
    
        // Copy image to buffered image
        Graphics g = bimage.createGraphics();
    
        // Paint the image onto the buffered image
        g.drawImage(image, 0, 0, null);
        g.dispose();
    
        return bimage;
    }
    
    public static void main(final String[] args) {
        SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                try {
                    ImageMain.setup();
                } catch (IOException ex) {
                    ex.printStackTrace();
                }
            }
        });
    }
    }
    
    dasp : This is exactly what I was looking for, thank you!
    zaletniy : The missing hasAlpha method can be found here: http://www.biddata.net/joel/photo/final/Photo.java
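The answer's toBufferedImage calls a hasAlpha helper that is not included in the listing. A minimal sketch of such a helper, using only standard java.awt classes, might look like the following. The class name AlphaCheck is made up for illustration; in the answer's code the method would simply be pasted into ImageMain.

```java
import java.awt.Image;
import java.awt.image.BufferedImage;
import java.awt.image.ColorModel;
import java.awt.image.PixelGrabber;

// Hypothetical holder class; only the hasAlpha method matters.
public class AlphaCheck {

    // Returns true if the image's color model carries an alpha channel.
    public static boolean hasAlpha(Image image) {
        // A BufferedImage exposes its color model directly
        if (image instanceof BufferedImage) {
            return ((BufferedImage) image).getColorModel().hasAlpha();
        }

        // For other Image types, grab a single pixel so that the
        // image's ColorModel becomes available
        PixelGrabber pg = new PixelGrabber(image, 0, 0, 1, 1, false);
        try {
            pg.grabPixels();
        } catch (InterruptedException e) {
            // Restore the interrupt flag and fall through
            Thread.currentThread().interrupt();
        }

        ColorModel cm = pg.getColorModel();
        return cm != null && cm.hasAlpha();
    }
}
```

With this in place, the `hasAlpha(image)` call in toBufferedImage resolves, and the method picks TYPE_INT_ARGB or TYPE_INT_RGB accordingly.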
  • If you are willing to consider the JPedal PDF library, this is built in and documented with example source at http://support.idrsolutions.com/default.asp?W22

  • Hi,

    I was looking for a similar solution myself. When I tried the same example with a PDF file of more than one page, every output PNG file showed the first page only. Any idea what's wrong here?

    Thanks, Abhi

  • Hi.

    Just start your for loop from 1 instead of 0 (and run it up to and including the page count), since the page numbers start at 1. Your problem will be solved.

    cheers, Ravie
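The question asks for JPEG as well as PNG output, and the reply above suggests 1-based page numbers. A rough sketch of a save step covering both, using only javax.imageio, could look like this. The class and method names (PageImageWriter, save) are hypothetical and not part of any library mentioned in the thread.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of PDFRenderer or any other library.
public class PageImageWriter {

    // Saves one rendered page as <dir>/page_<pageNumber>.<format>.
    // pageNumber is assumed to be 1-based, as suggested in the reply above.
    public static File save(BufferedImage img, int pageNumber, String format, File dir)
            throws IOException {
        BufferedImage out = img;
        boolean jpeg = format.equalsIgnoreCase("jpg") || format.equalsIgnoreCase("jpeg");

        if (jpeg && img.getColorModel().hasAlpha()) {
            // JPEG has no alpha channel: flatten onto a white RGB canvas first
            out = new BufferedImage(img.getWidth(), img.getHeight(),
                    BufferedImage.TYPE_INT_RGB);
            Graphics2D g = out.createGraphics();
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, out.getWidth(), out.getHeight());
            g.drawImage(img, 0, 0, null);
            g.dispose();
        }

        File target = new File(dir, "page_" + pageNumber + "." + format);
        // ImageIO.write returns false when no writer handles the format
        if (!ImageIO.write(out, format, target)) {
            throw new IOException("No ImageIO writer for format: " + format);
        }
        return target;
    }
}
```

Inside the page loop, the two lines that build the File and call ImageIO.write could then be replaced by something like `PageImageWriter.save(bImg, i, "png", new File("."));`, with "jpg" in place of "png" for JPEG output.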
