[RT 130805] Problem with particular PDF

Phil (Global Moderator)
« October 24, 2019, 11:17:06 AM »
Wed Oct 23 10:04:11 2019 apache2@netcasters.com - Ticket created
Subject:    Problem with particular PDF
Date:    Wed, 23 Oct 2019 09:25:01 -0400
To:    bug-pdf-api2@rt.cpan.org
From:    apache2 <apache2@netcasters.com>

Hi,

I have a particular PDF (actually there are a few of them) that Adobe cannot render once it has been written to via PDF::API2.  This happens in Internet Explorer, Edge and Firefox.  Chrome renders the file fine.

I did a simple offline test:
Code: [Select]
- Call ghostscript to make a copy of the file.  SUCCESS
- Add "Hello World" text to file.  SUCCESS
- Save file.  SUCCESS
- Call ghostscript to make a copy of the file.  ERROR
So, in this case ghostscript has an issue with the file.  Maybe the issue is with ghostscript and Adobe.

It seems you will need to see the PDF and possibly my test code, but I'd rather not make it public since it is a client file.  What can we do to resolve this?

Let me know if you need any further information.

Thank you very much

Ted

Thu Oct 24 11:11:10 2019 PMPERRY@cpan.org - Correspondence added

Well, yes, we ARE going to need an example PDF that shows this problem. If I understand your explanation, you have a working PDF, modify it with GS, and then Adobe doesn't like it? Was this PDF originally produced with PDF::API2? Which "Adobe" are you talking about -- Reader, Acrobat, or something else? First of all, PDF::API2 produces PDF 1.4, which, while old, should work OK with GS or Adobe. If GS takes a working PDF (that Reader doesn't ask to SAVE after displaying), and breaks it, that sounds like a problem with GS. Without seeing an example, though, it's hard to say for sure whether there was something odd about the original PDF.

If the PDF was produced with PDF::API2, please supply the Perl code for PDF::API2, rather than just the PDF output. Cut down the code to the minimum that shows the problem, while removing proprietary information. Just for giggles, you might try it with PDF::Builder too, and see if the same problem shows up. Note that if Adobe Reader asks to save a PDF when exiting, that means it had to fix up a defective PDF when it loaded it.

Thu Oct 24 11:11:10 2019 The RT System itself - Status changed from 'new' to 'open'

« Reply #1: October 26, 2019, 09:26:32 AM »
Sat Oct 26 09:21:16 2019 PMPERRY@cpan.org - Correspondence added

(received via email from OP)

Hi,

Sorry to contact you directly instead of through this page

https://rt.cpan.org/Public/Bug/Display.html?id=130805

But, I couldn't figure out how to post a reply there.

==================================================
To add a comment to this thread, just email bug-PDF-API2 [at] rt.cpan.org with subject line [rt.cpan.org #130805]. Note 1 space between org and #, and the [ ] around the whole subject. Nothing else. If you don't follow this format carefully, you will end up creating a new bug report! HTML formatting within the body does not work.
==================================================
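
Since a small slip in the subject creates a new ticket, a quick check before sending can help. This is a minimal Perl sketch that only tests the format described in the note above. The helper name rt_subject_ticket is hypothetical and not part of RT or any CPAN module.

```perl
use strict;
use warnings;

# Hypothetical helper: return the ticket number if the subject is exactly
# "[rt.cpan.org #NNN]" (one space before '#', brackets around the whole
# subject, nothing else), or undef otherwise.
sub rt_subject_ticket {
    my ($subject) = @_;
    return $subject =~ /\A\[rt\.cpan\.org #(\d+)\]\z/ ? $1 : undef;
}

for my $subject ('[rt.cpan.org #130805]', '[rt.cpan.org#130805]', 'rt.cpan.org #130805') {
    my $ticket = rt_subject_ticket($subject);
    printf "%-24s %s\n", $subject,
        defined $ticket ? "matches, ticket $ticket" : 'does NOT match';
}
```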

-------------------------------------------------------------

Comments inserted below ....

Quote
Well, yes, we ARE going to need an example PDF that shows this problem. If I understand your explanation, you have a working PDF, modify it with GS, and then Adobe doesn't like it?

- I have an existing PDF (v1.4).  I added "Hello World" to it via PDF::API2, then saved it.  I then tried to make a copy with GS, and GS failed to make the copy.

Code: [Select]
use PDF::API2 ();

$input_file = '66.pdf';
$output_file = '67.pdf';

system "/usr/bin/gs", "-sDEVICE=pdfwrite", "-o", "${output_file}GS", "$input_file";

print "------------------------------------------------------\n";

my $pdf = PDF::API2->new();
my $template_pdf = PDF::API2->open($input_file);
$pdf->import_page($template_pdf, 1);

$page = $pdf->openpage(1);

$font = $pdf->corefont('Helvetica-Bold');

$text = $page->text();
$text->font($font, 20);
$text->translate(200, 700);
$text->text('Hello World!');

$pdf->saveas($output_file);

system "/usr/bin/gs", "-sDEVICE=pdfwrite", "-o", "${output_file}GS", "$output_file";

Code: [Select]
GPL Ghostscript 9.25 (2018-09-13)
Copyright (C) 2018 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
   **** Error reading a content stream. The page may be incomplete.
               Output may be incorrect.
   **** Error: Form stream has unbalanced q/Q operators (too many q's)
               Output may be incorrect.
   **** Error reading a content stream. The page may be incomplete.
               Output may be incorrect.
Loading NimbusSans-Bold font from /usr/share/ghostscript/Resource/Font/NimbusSans-Bold... 5460108 4049994 3001452 1570283 3 done.
   **** Error: File did not complete the page properly and may be damaged.
               Output may be incorrect.

Quote
Was this PDF originally produced with PDF::API2?

- Don't know, it was provided by a client and is a few years old

Quote
Which "Adobe" are you talking about -- Reader, Acrobat, or something else?

- The file can't be read via Explorer/Firefox either, so it's the browser plugin (Acrobat Reader DC?).  It's OK in Chrome.

Quote
First of all, PDF::API2 produces PDF 1.4, which, while old, should work OK with GS or Adobe. If GS takes a working PDF (that Reader doesn't ask to SAVE after displaying), and breaks it, that sounds like a problem with GS. Without seeing an example, though, it's hard to say for sure whether there was something odd about the original PDF.

« Reply #2: October 26, 2019, 10:13:57 AM »
Sat Oct 26 10:08:15 2019 PMPERRY@cpan.org - Correspondence added

==================================================
You can also reply if you have a CPAN account registered, but usually only developers who are uploading a package to CPAN will bother with applying for an account.
==================================================

OK, so the only involvement with PDF::API2 was to add a line to the PDF? Can I assume that you tried the GS run ON THE ORIGINAL PDF (without the PDF::API2 change) and it did NOT get error(s)? What exactly are you using GS for in this chain? Are you changing the format, just printing it, or what? Maybe you can do what you want without involving GS. It's certainly not needed just to make a simple copy of a PDF. Does the ORIGINAL PDF load correctly into a PDF reader (such as Adobe Reader) without error messages or asking to save it after viewing it?

Check the logic of your GS usage. In the first GS run, are you just showing that reading 66.pdf and writing 67.pdfGS does (or does not) run correctly? Then you use PDF::API2 to write 67.pdf as a modified 66.pdf. Finally, does the second GS run read 67.pdf (just created) and write 67.pdfGS to show that the PDF::API2 output is (or is not) correct? Are the GS error messages from only the second run?
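
One way to answer that last question is to capture the output of each GS run separately and count the error lines in each. This is a rough sketch, assuming GS flags problems with lines starting with "**** Error", as in the log quoted earlier in this thread. The helper names gs_error_lines and run_gs are made up for illustration. Whether the GS exit status also reflects these errors is something to check on your own system.

```perl
use strict;
use warnings;

# Hypothetical helper: pick out the "**** Error" lines from captured GS output.
sub gs_error_lines {
    my ($output) = @_;
    return grep { /^\s*\*{4} Error/ } split /\n/, $output;
}

# Hypothetical helper: run GS on one input file and return its combined
# stdout/stderr. The gs path and flags are the ones from the reporter's
# test script; file names with shell metacharacters are not handled.
sub run_gs {
    my ($in, $out) = @_;
    my $output = qx{/usr/bin/gs -sDEVICE=pdfwrite -o "$out" "$in" 2>&1};
    return defined $output ? $output : '';
}

# Usage (needs gs and both PDFs, so it is left commented out):
# for my $in ('66.pdf', '67.pdf') {
#     my @errors = gs_error_lines(run_gs($in, "$in.gscopy"));
#     printf "%s: %d error line(s)\n", $in, scalar @errors;
# }
```

Running it once on the original and once on the modified file shows which of the two inputs triggers the messages.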

In the PDF::API2 code, I see that you open the PDF, import_page, and then openpage. If you're trying to extract one page out of the original PDF, you might try the example code for import_page:
Code: [Select]
    use PDF::API2;

    my $pdf = PDF::API2->new();
    my $old = PDF::API2->open('our/old.pdf');

    # Add page 2 from the old PDF as page 1 of the new PDF
    my $page = $pdf->import_page($old, 2);

    $pdf->saveas('our/new.pdf');
and see if that makes any difference.

Finally, the complaint from GS is about unbalanced q/Q (save and restore graphics state). If nothing else works, try installing PDF::Builder and doing the same steps (instead of with PDF::API2). I rewrote some of the code in that area, and it might make a difference. At least it would show that there IS a bug in PDF::API2, and narrow down the area.
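
To illustrate what GS is complaining about: in a content stream, q saves the graphics state and Q restores it, so the two should pair up. This is a naive sketch, assuming the content stream has already been extracted and decompressed into a string. It splits on whitespace only, so a q or Q inside a string literal or inline image would be miscounted. The helper name q_balance and the sample stream are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: count the q (save) and Q (restore) operators in a
# decompressed content stream, using whitespace-separated tokens only.
sub q_balance {
    my ($stream) = @_;
    my ($saves, $restores) = (0, 0);
    for my $token (split ' ', $stream) {
        $saves++    if $token eq 'q';
        $restores++ if $token eq 'Q';
    }
    return ($saves, $restores);
}

# A made-up stream with one q that is never restored, similar in spirit to
# the "too many q's" message.
my $stream = "q 1 0 0 1 50 50 cm q BT /F1 20 Tf (Hello World!) Tj ET Q";
my ($saves, $restores) = q_balance($stream);
print "q: $saves, Q: $restores, difference: ", $saves - $restores, "\n";
```

A nonzero difference on a page or Form stream would be consistent with the GS message, though a proper check needs a real content-stream parser.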

« Reply #3: October 30, 2019, 11:37:23 AM »
Tue Oct 29 21:37:44 2019 PMPERRY [...] cpan.org - Correspondence added

I received your email (see below), but the mail system reported that it could not deliver my reply to you (so here it is):

Hi,

I tried using PDF::Builder and I got the same error.  (Really cool though, that all I had to change in my code to get it to work was to replace API2 with Builder.)

I'm using GS strictly for testing.  What I'm really doing is opening the file in a browser: Chrome is fine, but all the others fail to open the PDF.

Just to try and clarify my test.

- Have pdf
- Make a copy of orig pdf using GS - SUCCESS
- Add "Hello World" text to orig pdf using PDF::API2/Builder - SUCCESS
- Save to new pdf using PDF::API2/Builder - SUCCESS
- Make a copy of new pdf using GS - FAIL

Thanks
Ted

My response:

Just to clarify, your original PDF (step 1) loads correctly in most/all browsers or readers -- no error messages, and no asking to save the PDF when you exit the browser or reader? If not, that means the original PDF is faulty, and it would not be surprising if things broke when PDF::API2/Builder tried to modify it.

When you say "make a copy... using GS", are you changing the format of the PDF file, or performing some other operation on it? I'm trying to understand why you're using GS if you're simply making a copy of the PDF -- why can't you just do a command-line copy/cp old.pdf new.pdf? If that's all you're doing, let's try eliminating the GS steps and see if the problem goes away. If it does go away (no GS involved), that means that GS is corrupting the PDF (although PDF::API2's changes may be an accessory to the crime).

Keep in mind that PDF::API2 is a PDF version 1.4 system, and often breaks when reading PDF 1.5 and up. Find out what version your PDF file is (it's the first line of the file). If it's greater than 1.4, that might be a problem for both API2 and Builder. A few features they can handle, but some will break things.
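
A small sketch for checking the version, assuming the file starts with the usual %PDF-x.y header line. The helper name pdf_header_version is invented for this example. It reads only the header, so a version declared elsewhere in the file would not be seen.

```perl
use strict;
use warnings;

# Hypothetical helper: return the version string from the %PDF-x.y header,
# or undef if the first line does not look like a PDF header.
sub pdf_header_version {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $first_line = <$fh>;
    close $fh;
    return undef unless defined $first_line;
    return $first_line =~ /\A%PDF-(\d+\.\d+)/ ? $1 : undef;
}

# Usage: perl pdfversion.pl some.pdf
if (@ARGV) {
    my $version = pdf_header_version($ARGV[0]);
    print defined $version ? "PDF version $version\n" : "No PDF header found\n";
}
```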
« Last Edit: October 30, 2019, 06:55:31 PM by Phil »

« Reply #4: November 04, 2019, 02:45:42 PM »
Thu Oct 31 22:34:14 2019 PMPERRY [...] cpan.org - Correspondence added

I'm making some progress. I still don't know why you're making a copy with GS, so I took it out. It still produces a corrupt PDF (starts to render, then quits with an error). The following code does produce a good PDF:
Code: [Select]
use strict;
use warnings;
use PDF::Builder ();

my $input_file = '357.pdf';
my $output_file = '357Text.pdf';

print "------------------------------------------------------\n";

my $pdf = PDF::Builder->open($input_file);

my $page = $pdf->openpage(1);
my @size = $page->mediabox();

my $font = $pdf->corefont('Helvetica-Bold-Oblique');

my $text = $page->text();
$text->font($font, 40);
$text->translate($size[2]/2, $size[3]/2);
$text->text_center('Hello World!');

$pdf->saveas($output_file);

Note that it simply opens the template file, selects page 1 (the only page), adds some text, and writes (saveas) to a different name. It should work the same with the current PDF::API2 (change text_center to text and correct the location X value).

When I use something closer to your original code, where you import one page into a new empty PDF, that produces a corrupt PDF all by itself (even without adding the text). Something in import_page() is not working right. I'll have to keep looking at it. For some reason having to do with resource names, import_page() creates a Form and outputs the page with formimage(). That passes the t-tests, but seems to create a corrupt PDF in the process.

Sat Nov 02 13:16:27 2019 PMPERRY [...] cpan.org - Correspondence added

BTW, you can ignore the bit about "change text_center to text". I forgot that PDF::API2 does in fact include a text_center() call. Otherwise, just changing Builder to API2 should work on a PDF::API2 system.

I haven't gotten anywhere yet on why import_page() seems to mess up the page.