Pdfinfo prints the contents of the 'Info' dictionary (plus some other useful information) from a Portable Document Format (PDF) file. The 'Info' dictionary contains the following values: title, subject, keywords, author, creator, producer, creation date and modification date.

The rclpdf.py script in Recoll version 1.23.2 and later can extract XMP metadata fields by executing the pdfinfo command (usually found with poppler-utils).

Background: The macro pdfmark is defined within f, which is part of the hyperref package. This will be loaded if you specify the pdfmark option to hyperref (which is an alias for dvips). However, it doesn't work with pdfLaTeX: in that case another driver will be forced instead of f.

Since I am interested in the same kind of job (though not necessarily to OCR the PDF files, but to convert them to DjVu and then OCR them), I found this question and the responses lacking, since I needed to guess the DPI of the images from the number of pixels and then use the size as output by pdfinfo or other tricks, not to mention that the images inside a PDF may have.

Under Windows, the easiest way to do this is to drop to a command prompt (Start >.

Exiftool comes installed with the perl-Image-ExifTool package. Exif in its name stands for Exchangeable Image File Format. It can also be used for PDF documents.

Let's use exiftool to examine example.pdf:

$ exiftool example.pdf
File Modification Date/Time : 2022:10:07 10:15:34+03:00
File Access Date/Time : 2022:12:01 08:28:07+03:00
File Inode Change Date/Time : 2022:10:07 10:15:34+03:00
Web Statement :
Producer : Acrobat Distiller 7.0.5 (Windows)
Metadata Date : 2013:02:04 11:00:11-05:00
Warning : Ignored duplicate Info dictionary
Title : Introduction to Programming Languages
Document ID : uuid:e5db80e2-4fef-4596-b7a4-bcc15ba1a0da

The output of exiftool for example.pdf covers the output of pdfinfo. It also has extra fields such as MIME Type, Document ID, etc.
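To work with the exiftool fields in a script, the text output can be split into tag/value pairs. The following is a minimal sketch, assuming the default one-tag-per-line layout "Tag : Value" seen in the listing for example.pdf; the function name parse_exiftool_text is hypothetical, and the sample text reuses lines from that listing.

```python
def parse_exiftool_text(output):
    """Split exiftool-style 'Tag : Value' lines into a dict.

    Assumption: each line holds one tag, and the first colon separates
    the tag name from its value (values may contain further colons).
    """
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a tag/value separator
        fields[key.strip()] = value.strip()
    return fields


# Sample lines taken from the exiftool listing for example.pdf.
sample = """\
File Modification Date/Time : 2022:10:07 10:15:34+03:00
Title : Introduction to Programming Languages
Producer : Acrobat Distiller 7.0.5 (Windows)
Document ID : uuid:e5db80e2-4fef-4596-b7a4-bcc15ba1a0da
"""

fields = parse_exiftool_text(sample)
print(fields["Document ID"])
```

Splitting on the first colon only keeps date values such as 2022:10:07 10:15:34+03:00 intact, because the tag names in this listing contain no colons themselves.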
Exiftool supports many image, audio and video file types. I'm new to it, but it looks like Exiftool can extract this. For example:

$ exiftool -createdate -ext pdf .

The pdfinfo command has an -isodates flag, which will make the dates a bit easier to parse (ISO 8601 format). Poppler, the library behind pdfinfo, has bindings for many popular languages, including Python (through the python-poppler package).

The page size and page rotation in the output of pdfinfo belong to the first page. The page size is reported in pts; this unit corresponds to a PostScript point. pdfinfo prints only the information of the first page by default if it's called without any options. If we want to see the page information of other pages, we can use the -f and -l options:

$ pdfinfo -f 2 -l 3 example.pdf
Title: Introduction to Programming Languages
Producer: Acrobat Distiller 7.0.5 (Windows)
CreationDate: Mon Feb 4 10:16:29 2013 EST

The -f option specifies the first page to examine, while the -l option specifies the last page to examine. Since we passed 2 for the -f option and 3 for the -l option as arguments, pdfinfo printed the page information for these pages.
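The pdfinfo options discussed here (-f, -l and -isodates) can also be combined from a script. This is a rough sketch, assuming pdfinfo is installed and on the PATH; the helper names build_pdfinfo_command and run_pdfinfo are hypothetical, and the sketch calls the command-line tool directly instead of using the python-poppler bindings.

```python
import subprocess


def build_pdfinfo_command(path, first=None, last=None, iso_dates=True):
    """Build the argument list for a pdfinfo call.

    first/last map to the -f and -l options (first and last page to
    examine); iso_dates adds -isodates so dates come out in ISO 8601.
    """
    cmd = ["pdfinfo"]
    if iso_dates:
        cmd.append("-isodates")
    if first is not None:
        cmd += ["-f", str(first)]
    if last is not None:
        cmd += ["-l", str(last)]
    cmd.append(path)
    return cmd


def run_pdfinfo(path, first=None, last=None, iso_dates=True):
    """Run pdfinfo and return its text output (requires pdfinfo on PATH)."""
    result = subprocess.run(
        build_pdfinfo_command(path, first, last, iso_dates),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if pdfinfo exits non-zero
    )
    return result.stdout


# Only builds the command; nothing is executed here.
print(build_pdfinfo_command("example.pdf", first=2, last=3))
```

Calling run_pdfinfo("example.pdf", first=2, last=3) would then correspond to the pdfinfo -f 2 -l 3 example.pdf call, with -isodates added. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.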