[RT 131223] corrupted PDF generated

Offline Phil

  • Global Moderator
  • Hero Member
  • *****
  • 706
    • View Profile
[RT 131223] corrupted PDF generated
« December 23, 2019, 07:39:37 PM »
Mon Dec 23 14:20:51 2019 welleozean@googlemail.com - Ticket created
Subject:    corrupted PDF generated
Date:    Mon, 23 Dec 2019 20:23:27 +0100
To:    bug-PDF-API2@rt.cpan.org
From:    welle ozean <welleozean@googlemail.com>

On Windows 10, running the latest PDF::API2 generates corrupted files:
Code:
use strict;
use warnings;
use PDF::API2;
use PDF::API2::Annotation;
use PDF::API2::Basic::PDF::Utils;

my $pdf = PDF::API2->open('C:\\Users\\WC\\Desktop\\original.pdf');
my $page = $pdf->openpage(1);

my $sticky = $page-> annotation;
$sticky-> text( 'Text in pop-up window',
    -rect => [ 100, 500, 100, 500 ], -open => 1 );
$sticky-> { C } = PDFArray( map PDFNum( $_ ), 1, 0.65, 0 );
$pdf->saveas( 'C:\\Users\\WC\\Desktop\\target.pdf' );

For what it's worth, simply opening the file and calling saveas, with no operation in between, also generates a corrupted file. By corrupt I mean that the latest Adobe Reader is unable to open it (Error 14).

Mon Dec 23 19:34:32 2019 PMPERRY@cpan.org - Correspondence added

I just tried your code example, and it worked fine for me. The only change was to switch original.pdf to a local known-good PDF that I had lying around. By latest PDF::API2, do you mean 2.036? Is your original.pdf known to be good (it loads into the reader with no error messages, and the reader does not offer to save it when quitting)? I'm using Adobe Acrobat Reader DC (I think it lives in the Cloud) 19.021.20061, which I just updated yesterday, on Windows 10.

Anyway, do you still get this corruption with a variety of other PDFs?

Re: [RT 131223] corrupted PDF generated
« Reply #1: December 24, 2019, 12:53:37 PM »
Tue Dec 24 09:40:13 2019 welleozean@googlemail.com - Correspondence added

These are my specs:
Windows 10
Perl 5.28.1
PDF::API2 2.036
Adobe Acrobat Reader DC 19.021.20061

All my PDFs open in Adobe Reader with no error message. I extended my tests. All my files have been edited, probably using FoxyReader, and all of them show the same issue after running my script (the original files, as I said, open without problems). Other files downloaded from the Web for testing still open fine after running the script. At this link, you can find a file that fails:
https://filebin.net/2rp3p3xua17twwe1/making_sense_of_NMT.pdf?t=ureuhq16  <too large to attach>

Tue Dec 24 12:40:32 2019 PMPERRY@cpan.org - Correspondence added

Two problems:

  • Your PDF is version 1.5, which is likely to cause problems with PDF::API2. It may have structures or data that PDF::API2 has no idea how to handle.
  • It starts at page 291 and runs to 309 (19 pages). I can't get to any page before 291. It looks like a complete article, but I've never seen this kind of behavior before.
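
For anyone who wants to check the version point on their own files, here is a minimal sketch in core Perl that reads the "%PDF-x.y" header. The sub name pdf_header_version is made up for this illustration. Note that the header is only part of the story: the document catalog may carry a /Version entry that overrides it, and this sketch does not look at that.

```perl
use strict;
use warnings;

# Hypothetical helper: return the version from the "%PDF-x.y" header,
# or undef if no such header is found in the first 1024 bytes.
sub pdf_header_version {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    read $fh, my $head, 1024;
    close $fh;
    return $head =~ /%PDF-(\d+\.\d+)/ ? $1 : undef;
}

# Report the header version of every file named on the command line.
print "$_: ", pdf_header_version($_) // 'no header found', "\n" for @ARGV;
```

A header of 1.5 or later does not prove that the file uses post-1.4 features, only that it is allowed to.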

I tried the same code and PDF file with PDF::Builder, and it seems to work (didn't blow up, at least). PDF::Builder is a little more forgiving of post-1.4 items, but not knowing what PDF::API2 is choking on, I can't guarantee that PDF::Builder is working properly. Anyway, you might want to try PDF::Builder (it can be installed alongside PDF::API2) and see if it works for you.

Re: [RT 131223] corrupted PDF generated
« Reply #2: January 05, 2020, 11:28:34 AM »
Fri Jan 03 08:15:53 2020 welleozean [...] googlemail.com - Correspondence added

Thank you for your feedback.

I was able to annotate the same PDF with PDF::Builder, so for this task on similar PDFs, I will use the suggested module.

Fri Jan 03 09:08:31 2020 PMPERRY [...] cpan.org - Correspondence added

It's good to hear that you have a way forward to do your work. It still would be nice to figure out what's going wrong with PDF::API2 so it could be fixed.

Something I didn't mention before is that PDF::Builder also had extensive rewrites of the Annotation functionality, so it's possible that the difference is in the Annotation code rather than in PDF 1.5+ handling.

Re: [RT 131223] corrupted PDF generated
« Reply #3: February 05, 2020, 08:11:22 PM »
Wed Feb 05 17:00:28 2020 steve [...] deefs.net - Correspondence added

Not having a test case (the filebin link no longer works), I'm going to guess from your description that the original PDF has a cross-reference stream in it.  PDF::API2 can read those as of 2.026, but can't yet write them.  See RT #117184.
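
Since the test file is gone, a rough way to check the guess above on other files is to look at what the last startxref offset points to: the keyword "xref" means a classic cross-reference table, while an object header ("N G obj") means a cross-reference stream. This is only a sketch in core Perl; the sub name xref_kind is invented here, and it does not detect hybrid files, where a classic table's trailer also references a stream via /XRefStm.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);

# Hypothetical helper: returns 'table', 'stream', or undef when the
# cross-reference section cannot be located.
sub xref_kind {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $size     = -s $fh;
    my $tail_len = $size < 2048 ? $size : 2048;

    # Read the end of the file, where "startxref <offset>" lives.
    seek $fh, -$tail_len, SEEK_END or die "seek failed: $!";
    read $fh, my $tail, $tail_len;

    # The greedy .* makes this pick the last startxref in the tail.
    my ($offset) = $tail =~ /.*startxref\s+(\d+)/s;
    return unless defined $offset && $offset < $size;

    # Look at the first bytes of whatever the offset points to.
    seek $fh, $offset, SEEK_SET or die "seek failed: $!";
    read $fh, my $head, 32;
    close $fh;

    return 'table'  if $head =~ /\A\s*xref\b/;
    return 'stream' if $head =~ /\A\s*\d+\s+\d+\s+obj\b/;
    return;
}

printf "%s: %s\n", $_, xref_kind($_) // 'unknown' for @ARGV;
```

If this prints "stream" for the files that break and "table" for the ones that survive, that would be consistent with the cross-reference stream explanation, though it would not prove it.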

You can work around the issue by creating a new PDF and importing the pages from the original file into the new one.
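
A minimal sketch of that workaround, assuming PDF::API2's import_page method and using the same method names as the script in the original report (pages, saveas). The sub name rebuild_pdf and the file paths are placeholders. Whether the rebuilt file then opens cleanly in Adobe Reader has not been verified against the failing PDF, and document-level items such as outlines may not carry over.

```perl
use strict;
use warnings;
use PDF::API2;

# Hypothetical helper: copy every page of an existing PDF into a fresh
# document, so that PDF::API2 writes the cross-reference data itself
# rather than updating the original file's structure.
sub rebuild_pdf {
    my ($source_path, $target_path) = @_;

    my $source = PDF::API2->open($source_path);
    my $target = PDF::API2->new();

    for my $n (1 .. $source->pages()) {
        # A target page number of 0 appends the page at the end.
        $target->import_page($source, $n, 0);
    }

    # Any annotations would be added to the pages of $target here,
    # before saving.
    $target->saveas($target_path);
    return $target_path;
}

# Usage: perl rebuild.pl original.pdf target.pdf
rebuild_pdf(@ARGV) if @ARGV == 2;
```

The source document has to stay in scope until the new file is saved, because the imported pages still refer to objects read from it.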

Wed Feb 05 17:00:29 2020 steve [...] deefs.net - Status changed from 'open' to 'stalled'

Re: [RT 131223] corrupted PDF generated
« Reply #4: September 01, 2020, 06:59:42 PM »
I'm going to go ahead and close (reject) this one, as there is no sign of trouble with PDF::Builder. There is nothing really happening on PDF::API2, either.