Can each page in a PDF contain its own metadata?

1

Is it possible to have the same field names, but with different metadata values for each page in a PDF?

For example, each page could have the fields 'author', 'document reference', 'location', etc., with different values on each page. Page 1 would have the author "John Smith", page 2 would have the author "Jane Simmons", and so on.

All the examples I've seen of PDF metadata relate to document-wide information; none cover page-only information.

I'm developing in Python.

Thank You. :)

teracow

Posted 2015-06-05T08:56:35.223

Reputation: 13

The easiest way to confirm is to go right to the source, i.e. the ISO 32000-1 Standards document. – Karan – 2015-06-05T09:08:32.980

@Karan, that is slightly difficult as the actual standard is very expensive - CHF200. Fortunately there are legitimate downloads. – Julian Knight – 2015-06-05T09:14:07.937

@JulianKnight: Meh, cost never occurred to me because I've always obtained it for free (including supplemental changes) from http://www.adobe.com/devnet/pdf/pdf_reference.html – Karan – 2015-06-05T09:42:40.363

@Karan - ah, well it was you who mentioned the ISO standard :) It always annoys me that you have to pay so much to get copies of ISO standards. – Julian Knight – 2015-06-05T10:02:37.237

@JulianKnight: I did, but the very first search result for "ISO 32000-1 Standards" is Adobe's PDF so... Re. the cost I'm completely with you. I see no reason why they need to charge so much, especially from individuals and not companies. – Karan – 2015-06-05T10:05:47.493

@Karan - the joys of Google which, as you know, gives personalised results so yours will not be the same as mine. The ISO standards website came up first for me. Probably because I was searching for a new security standard last week. – Julian Knight – 2015-06-05T10:16:21.240

@JulianKnight: Ok, let's not quibble over search positions. :) It does come up on searching and better still you've linked to it and saved the OP the effort. Job well done and dusted. – Karan – 2015-06-05T10:18:57.327

Answers

1

It looks as though the standard supports metadata at more than just the document level:

In general, any PDF stream or dictionary may have metadata attached to it as long as the stream or dictionary represents an actual information resource, as opposed to serving as an implementation artifact. Some PDF constructs are considered implementational, and hence may not have associated metadata.

Clear as mud! Thankfully there are some additional notes. Including:

In addition, metadata may also be associated with marked content within a content stream. This association shall be created by including an entry in the property list dictionary whose key shall be Metadata and whose value shall be the metadata stream dictionary. Because this construct refers to an object outside the content stream, the property list is referred to indirectly as a named resource (see 14.6.2, “Property Lists”).

This means that you can attach metadata to certain objects within your document, but I don't believe that you can attach it to a specific page. You would need an object to attach the data to. An image would be the obvious example, though the standard seems to refer to shadings too.
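Since the question mentions Python, here is a rough sketch of what such a metadata stream could look like when attached to an image. It only builds the raw bytes of the stream object with the standard library; it is not a complete PDF writer. The helper names (`xmp_packet`, `metadata_stream_object`) and the object number 12 are made up for illustration, and only the Dublin Core `dc:creator` field is shown. Fields such as 'document reference' or 'location' would presumably need properties in a namespace of your own.

```python
from xml.sax.saxutils import escape

# Minimal XMP packet holding one Dublin Core creator (author) value.
XMP_TEMPLATE = (
    '<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?>\n'
    '<x:xmpmeta xmlns:x="adobe:ns:meta/">\n'
    ' <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">\n'
    '  <rdf:Description rdf:about=""\n'
    '    xmlns:dc="http://purl.org/dc/elements/1.1/">\n'
    '   <dc:creator><rdf:Seq><rdf:li>{author}</rdf:li></rdf:Seq></dc:creator>\n'
    '  </rdf:Description>\n'
    ' </rdf:RDF>\n'
    '</x:xmpmeta>\n'
    '<?xpacket end="w"?>'
)


def xmp_packet(author: str) -> bytes:
    """Return the XMP packet as UTF-8 bytes (hypothetical helper)."""
    return XMP_TEMPLATE.format(author=escape(author)).encode("utf-8")


def metadata_stream_object(obj_num: int, author: str) -> bytes:
    """Wrap the packet in a PDF metadata stream object (hypothetical helper)."""
    body = xmp_packet(author)
    header = (
        f"{obj_num} 0 obj\n"
        f"<< /Type /Metadata /Subtype /XML /Length {len(body)} >>\n"
        "stream\n"
    ).encode("ascii")
    return header + body + b"\nendstream\nendobj\n"


# Object number 12 is an arbitrary choice for this example.
stream_obj = metadata_stream_object(12, "John Smith")
```

The image dictionary for that page's scan would then carry an entry like `/Metadata 12 0 R` pointing at this object. Whether a given library or viewer reads it back is a separate question.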

Of course, although the standard seems to allow it, that doesn't mean that common PDF handling libraries and applications support it.

Adobe's downloadable version of the Standard (will save you CHF200)

Julian Knight

Posted 2015-06-05T08:56:35.223

Reputation: 13 389

I read this and as you said it's "clear as mud". Still not sure whether all the metadata fields the OP wants can be stored per page. Plus of course you're absolutely right, finding apps that properly conform to the specs is not easy. – Karan – 2015-06-05T09:43:58.663

Thanks @Karan, I've improved the wording to more explicitly answer the question. – Julian Knight – 2015-06-05T10:00:17.910

+1 Based on the official specs I think this is the best we can get, unless some PDF expert can prove otherwise (that per page metadata of the sort wanted by the OP is indeed possible). – Karan – 2015-06-05T10:15:47.630

@JulianKnight - yes, this might be what I'm looking for. Each page consists of a scanned image. I'm hoping to store some easily searchable reference info along with each image. Thanks. :) – teracow – 2015-06-07T02:53:39.523

0

PDF pages can have annotations. The most common type is probably the sticky-note kind, but there are others. They are described in section 8.4 of the Adobe PDF 1.7 reference. You can create text annotations, name them with keys like "author", and set their contents to the corresponding string values. Then set the Hidden flag so the annotation is neither displayed nor able to interact with the user. An annotation must have a rectangle, but since it is not going to be displayed, any rectangle inside the page should work.
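As a rough illustration in Python, this is what those annotation dictionaries could look like in raw PDF syntax. It is only a sketch using the standard library, not a full PDF writer, and the helper names are made up. I am assuming the annotation name goes in the `/NM` entry and the value in `/Contents`; the Hidden flag is bit position 2 of the `/F` entry, which gives the value 2. Strings are assumed to be plain ASCII.

```python
HIDDEN_FLAG = 2  # bit position 2 of the annotation /F entry


def pdf_literal(text: str) -> str:
    """Escape text as a PDF literal string (ASCII-only sketch)."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def hidden_text_annotation(name: str, value: str, rect=(0, 0, 1, 1)) -> str:
    """Build a hidden text annotation dictionary (hypothetical helper)."""
    rect_str = " ".join(str(n) for n in rect)
    return (
        "<< /Type /Annot /Subtype /Text "
        f"/NM {pdf_literal(name)} /Contents {pdf_literal(value)} "
        f"/Rect [{rect_str}] /F {HIDDEN_FLAG} >>"
    )


# Same field names on every page, different values per page.
page_fields = {
    1: {"author": "John Smith"},
    2: {"author": "Jane Simmons"},
}

annots = {
    page: [hidden_text_annotation(key, val) for key, val in fields.items()]
    for page, fields in page_fields.items()
}
```

Each page's `/Annots` array would then list that page's annotation objects. A PDF library would normally write these dictionaries for you; the point here is just which entries are involved.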

etr

Posted 2015-06-05T08:56:35.223

Reputation: 1