
RT 120375 - support of ISO 19005 – archival Document management – PDF/A

*

Offline Phil

  • Global Moderator
  • Hero Member
  • *****
  • 582
    • View Profile
Sat Feb 25 15:20:49 2017 rico.zienke [...] posteo.de - Ticket created
Subject:    [question] support of ISO 19005 – Document management – Electronic document file format for long-term preservation (PDF/A)
Date:    Sat, 25 Feb 2017 21:20:02 +0100
To:    bug-PDF-API2 [...] rt.cpan.org
From:    Rico Zienke <rico.zienke [...] posteo.de>

Hello,
Is it possible to create valid PDF/A documents with the Perl PDF::API2 API? Is there already a way or option by which the API takes complete responsibility for creating valid PDF/A (so that consumers don't have to take care of it)? If not, is it possible today for consumers with the right knowledge to generate such a PDF? Or do you see any gaps on the API side that make it impossible altogether?

If it is not possible, are there any plans to do it?

If you need further information, I can send you an example PDF with veraPDF validation results.

Thanks and Best regards,
Rico Zienke

<formatting cleanup - Mod.>
« Last Edit: May 01, 2017, 10:31:03 AM by Phil »

*

Offline Phil

PDF/A (there are a number of sub-versions) is intended to "future-proof" PDF documents by banning required references to external files (fonts, color profiles, etc.), encryption, and patented compression, along with some other restrictions. A document should be readable all by itself, without anyone having to remember a password.

Some resources:
PDF/A output could certainly be a global flag for PDF production, set by whatever application uses this library.

*

Offline Phil

Tue May 23 22:50:15 2017 steve [...] deefs.net - Correspondence added

Quote
Is it possible to create valid PDF/A documents with the Perl PDF::API2 API?
I have no idea.  Theoretically, yes, but I haven't looked at that spec.

Quote
Is there already a way or option by which the API takes complete responsibility for creating valid PDF/A (so that consumers don't have to take care of it)?
Not built-in.

Quote
If not, is it possible today for consumers with the right knowledge to generate such a PDF? Or do you see any gaps on the API side that make it impossible altogether?

If it is not possible, are there any plans to do it?
I'm focused on just the base PDF specification, but if someone wanted to add support for PDF/A, I wouldn't be opposed.
#
Tue May 23 22:50:16 2017 The RT System itself - Status changed from 'new' to 'open'
#
Tue May 23 22:50:31 2017 steve [...] deefs.net - Status changed from 'open' to 'resolved'

*

Offline Phil

 PhilterPaper commented Dec 29

Just to add to the fun, in addition to PDF/A (long term archival storage), there are

  • PDF/E (interactive exchange of engineering documents, intended to be open and neutral)
  • PDF/VT (variable data printing, or VDP, content extension of PDF/X, for better control over printers and the like)
  • PDF/X (graphic content exchange, particularly for print publishing workflow with stringent color specifications)

*

Offline Phil

Validation tools:


*

Offline Phil

Note that while this request has been rejected on PDF::API2, I will keep it open here in hopes that someone will have the time and inclination to do something about it. It could be an option or setting (in PDF::Builder->new()) to forbid certain PDF features (e.g., encryption), require others (e.g., font embedding), and otherwise modify settings to ensure that no external resources are needed. I don't know whether the file needs to be "marked" or certified in some way, or whether you just create a compliant PDF that happens to pass the inspection process.
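To make the idea of such a setting a bit more concrete, here is a rough, language-neutral sketch (written in Python, not Perl) of the kind of check a "PDF/A mode" might run before writing a file. Everything in it is hypothetical: `DocSettings`, its fields, and `pdfa_violations` are invented for illustration and are not part of PDF::Builder or PDF::API2. The three rules are only the ones mentioned above, not the full ISO 19005 rule set.

```python
from dataclasses import dataclass, field


@dataclass
class DocSettings:
    """Hypothetical document state; NOT a real PDF::Builder structure."""
    encrypted: bool = False
    # font name -> True if the font program is embedded in the file
    fonts: dict = field(default_factory=dict)
    # resources the file would need from outside itself (assumed representation)
    external_refs: list = field(default_factory=list)


def pdfa_violations(doc: DocSettings) -> list:
    """Return human-readable problems for the rules discussed in this thread."""
    problems = []
    if doc.encrypted:
        problems.append("encryption is not allowed")
    for name, embedded in sorted(doc.fonts.items()):
        if not embedded:
            problems.append(f"font {name!r} is not embedded")
    for ref in doc.external_refs:
        problems.append(f"external resource required: {ref}")
    return problems
```

A library could either refuse to save when this list is non-empty or silently adjust the offending settings; which of the two is preferable is an open design question.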

*

Offline Phil

I exported an OpenOffice Writer document as a PDF, with the PDF/A box checked. It includes the following objects:
Code: [Select]
243 0 obj
<</N 3/Length 244 0 R/Filter/FlateDecode>>
stream
.... binary data ....
endstream
endobj

244 0 obj
2644
endobj

245 0 obj
<</Type/OutputIntent/S/GTS_PDFA1/OutputConditionIdentifier(sRGB IEC61966-2.1)/DestOutputProfile 243 0 R>>
endobj

424 0 obj
<</Type/Catalog/Pages 225 0 R
/OpenAction[1 0 R /XYZ null null 0]
/StructTreeRoot 247 0 R
/Lang(en-US)
/MarkInfo<</Marked true>>
/OutputIntents[245 0 R]/Metadata 246 0 R>>
endobj

425 0 obj
<</Author<FEFF005000680069006C002000500065007200720079>
/Creator<FEFF005700720069007400650072>
/Producer<FEFF004F00700065006E004F0066006600690063006500200034002E0031002E0035>
/CreationDate(D:20190216203607-05'00')>>
endobj

and in the trailer
Code: [Select]
...trailer
<</Size 426/Root 424 0 R
/Info 425 0 R
/ID [ <A26A9A1BCCEF28991EF878F211F098A8>
<A26A9A1BCCEF28991EF878F211F098A8> ]
/DocChecksum /8233BDDDED366A1B0D1804B30676CA29
>>
startxref...

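So, it looks like there are entries to put in the PDF to mark it as PDF/A (and related): here, an /OutputIntent object with /S/GTS_PDFA1 pointing at an embedded output profile, plus /OutputIntents and /Metadata entries in the catalog. It is not just a matter of obeying some rules about being self-contained without encryption or patented compression.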
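A quick way to see whether some other file carries the same markers is a byte-level scan. The following Python sketch does that; the function name and the marker list are my own assumptions, drawn only from the one export shown above and not from the ISO 19005 text. It will miss anything stored inside compressed object streams, so it is no substitute for a real validator.

```python
import re

# Patterns for entries seen in the export above. This list is an assumption
# based on that single file, not the PDF/A specification's requirements.
MARKERS = {
    "output_intent": rb"/Type\s*/OutputIntent",
    "pdfa_subtype": rb"/S\s*/GTS_PDFA1",
    "output_intents_ref": rb"/OutputIntents\s*\[",
    "metadata_ref": rb"/Metadata\s+\d+\s+\d+\s+R",
    "encrypt": rb"/Encrypt\b",  # should be absent in a PDF/A file
}


def find_pdfa_markers(data: bytes) -> dict:
    """Report which marker patterns occur anywhere in the raw PDF bytes."""
    return {name: re.search(pattern, data) is not None
            for name, pattern in MARKERS.items()}


def scan_file(path: str) -> dict:
    """Convenience wrapper: read a file and scan it."""
    with open(path, "rb") as handle:
        return find_pdfa_markers(handle.read())
```

Running `scan_file()` on a PDF exported with and without the PDF/A box checked should show which of these entries the checkbox adds.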

There may be more stuff than what's listed here... it's just a first look.
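As an aside, the angle-bracketed values in the /Info dictionary (object 425) are PDF hex strings. The leading FEFF is a UTF-16BE byte order mark, so they can be decoded with a few lines of Python. The helper name below is my own; the fallback branch uses Latin-1 as an approximation of PDFDocEncoding, which is an assumption that holds for plain ASCII text.

```python
def decode_pdf_hex_string(hex_text: str) -> str:
    """Decode the contents of a PDF hex string (the text between < and >)."""
    digits = "".join(hex_text.split())  # whitespace inside hex strings is ignored
    if len(digits) % 2:
        digits += "0"                   # PDF pads an odd final digit with 0
    raw = bytes.fromhex(digits)
    if raw.startswith(b"\xfe\xff"):     # UTF-16BE byte order mark
        return raw[2:].decode("utf-16-be")
    return raw.decode("latin-1")        # rough stand-in for PDFDocEncoding


info = {
    "Author": "FEFF005000680069006C002000500065007200720079",
    "Creator": "FEFF005700720069007400650072",
    "Producer": "FEFF004F00700065006E004F0066006600690063006500200034002E0031002E0035",
}
decoded = {key: decode_pdf_hex_string(value) for key, value in info.items()}
print(decoded)
```

The Producer string decodes to "OpenOffice 4.1.5", which identifies the exporting application and version.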