Wednesday, April 6, 2011

Removing a page break in a PDF file with low-level code

How can you remove the page break between two pieces of text at the bottom of a PDF file without having the source of the original document?

I use Ubuntu.

--

Just out of curiosity: I included the document with the page break in my LaTeX document and converted it to PDF with pdflatex. pdflatex ignores the second page completely. If anybody knows how to insert the second page with the \includegraphics command or some other command, please let me know.

From stackoverflow
  • There are two pages in your document. That gray space between the two pieces of text at the bottom of the file is not part of the document. It's actually part of Adobe Reader, or whichever PDF viewer you use, and it's meant to indicate a page break -- in this case a separation between page one and page two.

    So if you want to combine these two pages into one page and remove that gray space, then you'll need to find a PDF library (or another non-library tool) that works in a Linux environment, that will allow you to stitch/merge two pages together to create one page.

    Before you go down that path though, I'd recommend that you try to get hold of the original document and re-create the PDF, this time using a larger page size so that all of the content fits onto one page.

    Masi : This is clearly the best practical solution. However, I am interested in file standards and in how the PDF format handles page breaks at a low level. That means you would need to know the exact code for a page break in PDF documents in order to find it in the file. I did not find it with Google.
    Rowan : At a low level there is no code for a page break in PDFs. Each page in a PDF is represented by a page object. The page object is a dictionary which includes references to the page's content and other attributes. The individual page objects are tied together in a structure called the page tree. However, the structure of the page tree is not necessarily related to the logical structure or flow of the document. You can read more about this in the PDF reference - but don't bother modifying the internals of a PDF until you've read the reference. http://www.adobe.com/devnet/pdf/pdf_reference.html
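To make Rowan's point concrete, here is a minimal Python sketch, not a real PDF parser. It assumes a hand-written, uncompressed two-page skeleton (the `PDF_SKELETON` bytes and the `find_page_objects` helper are made up for illustration) and shows that the two pages are simply two page objects listed under `/Kids` in the page tree, with no break marker between them. Real files often store objects in compressed streams, so this regex approach would not work on them in general.

```python
import re

# Hand-written skeleton of a two-page PDF (illustrative only: it has no
# content streams, cross-reference table, or trailer, so a viewer would
# not open it).
PDF_SKELETON = b"""%PDF-1.4
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages /Kids [3 0 R 4 0 R] /Count 2 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>
endobj
4 0 obj
<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>
endobj
"""


def find_page_objects(data):
    """Return the object numbers of dictionaries whose /Type is /Page.

    Assumes uncompressed objects with generation number 0 and /Type inside
    the top-level dictionary, which holds for the skeleton above.
    """
    pattern = re.compile(
        rb"(\d+) 0 obj\s*<<[^>]*?/Type\s*/Page(?![A-Za-z])", re.S
    )
    return [int(match.group(1)) for match in pattern.finditer(data)]


if __name__ == "__main__":
    # Each number printed is one page; nothing in the file encodes a "break".
    print(find_page_objects(PDF_SKELETON))
```

The negative lookahead keeps the `/Pages` node (the page tree itself) from being counted as a page. Merging two pages into one would therefore mean combining the content of two such objects into a single page object with a taller `/MediaBox`, which is the job of a PDF library rather than a search-and-delete on the raw bytes.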
