[GH 149] PDFs failing on some readers

« March 05, 2021, 07:55:45 PM »
started by PhilterPaper 25 February 2021

This is split off from #141, as that issue should be restricted to the black/white color inversion on bilevel TIFFs. I think I have that one fixed now, although I'm not fully comfortable with when to invert the colors.

Per the previous issue, some TIFFs produce PDFs that cause some Readers to choke. For example, @carygravel supplied a G4 bilevel TIFF, and the PDF created from it reads just fine in Evince (Linux), Firefox (Windows and maybe Linux), and XpdfReader (Windows), but Adobe Acrobat Reader DC (Windows) fails to display the image part, giving a message about insufficient image data.

So far I have not been able to track this down. I see that a PDF that worked (produced by PDF::Builder not using the Graphics::TIFF library, I think) had raster data that was actually 8 bytes shorter than in the failing one, plus the last 4 bytes (in common) were different. Cary swears that libtiff should not be doing anything to the raster data, but did raise the question of whether CRLF line ends on Windows (what I'm using) could produce a different result than on Linux (NL line ends). I'm wondering whether Adobe is expecting an EOFB marker and failing to find it (thus the "short" raster data, but why only on this image?), while other Readers either ignore the missing marker or silently work around it. Anyway, AR is the only one that seems to fail -- other Readers are happy to properly display the page, and don't report any errors.
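A minimal sketch of how the two hypotheses above (a missing EOFB marker, or CRLF translation of the raster data) could be checked. It assumes the raw G4 raster bytes have already been pulled out of each PDF's image stream. The helper names `ends_with_eofb` and `looks_crlf_translated` are hypothetical and not part of PDF::Builder or Graphics::TIFF. EOFB in Group 4 data is two consecutive EOL codes (`000000000001` twice), possibly followed by a few zero pad bits to fill the last byte.

```python
# EOFB = two consecutive 12-bit EOL codes, as defined for Group 4 (ITU-T T.6).
EOFB_BITS = "000000000001" * 2


def ends_with_eofb(raster: bytes) -> bool:
    """True if the data ends with EOFB followed only by byte-filling zero bits."""
    # 24 bits of EOFB plus up to 7 pad bits fit easily in the last 8 bytes.
    bits = "".join(f"{byte:08b}" for byte in raster[-8:])
    stripped = bits.rstrip("0")          # drop trailing zero pad bits
    pad = len(bits) - len(stripped)
    # Strict on purpose: more than 7 pad bits means extra bytes after EOFB.
    return pad < 8 and stripped.endswith(EOFB_BITS)


def looks_crlf_translated(original: bytes, suspect: bytes) -> bool:
    """True if `suspect` is `original` with every LF expanded to CR LF."""
    return original != suspect and suspect == original.replace(b"\n", b"\r\n")
```

If the failing raster were exactly the working one with each 0x0A byte expanded to 0x0D 0x0A, the second helper would return true, which would point at a file handle opened in text mode on Windows. Separately, the PDF `CCITTFaxDecode` filter has an `EndOfBlock` decode parameter (default true) that says whether the data is expected to end with this marker, so the first helper may help explain a reader that insists on finding it.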

Now this problem has its own issue, and hopefully I'll be able to fix it at some point.

reply by PhilterPaper 26 February 2021

I've also tried my test suite on the non-Graphics::TIFF (old) code, with odd results. Some files, such as G4.tiff, aren't supported anyway, but others give strange errors such as 'package "1" does not support "val()" call' when converted in one order, while in another order of conversion they give inverted images. I need to look into this some more to find some rhyme or reason behind what's failing. Perhaps some minor fixes can be made to the non-GT code to at least better support some cases (e.g., fix the color inversion).

reply by PhilterPaper 27 February 2021

I think I've got the non-GT issue straightened out. I've also fixed some inverted color bilevels on non-GT (just pushed to GitHub). Here's how it now stands with my collection of test files:

  • non-Graphics::TIFF -- 8.tiff (8 bit grayscale) doesn't display correctly (white blocks are lines). Three color images also give "insufficient data for image" (display distorted on other Readers). Looking at these.
  • Graphics::TIFF -- G4.tiff produces a PDF page that on Adobe Acrobat Reader DC says "insufficient data for image", although XpdfReader, Firefox, and Evince display it fine.
Note that a number of test suite TIFF files with unsupported formats (alpha layer, G4 compressed bilevel) are omitted from the non-Graphics::TIFF test. No promises on the non-GT problems... if the fix looks fairly easy, I'll go ahead and do it, but otherwise it's just better to use Graphics::TIFF.

reply by PhilterPaper 28 February 2021

There's been mention of "JBIG2" here and there. It appears to be another bilevel compression method (along with "JBIG") that is incompatible with other methods (and with PDF), which would cause trouble if it snuck in somewhere. It would be good to find out the signature, so we can check whether G4 and perhaps some other troublemakers are in fact JBIG2-compressed. Maybe non-Adobe readers are able to handle it?
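One way to check which compression a TIFF actually declares is to read its Compression tag (tag 259) from the first IFD. The sketch below uses only the Python standard library and assumes a classic (non-BigTIFF) file whose first IFD describes the image of interest. The function name `tiff_compression` is hypothetical, and the name table lists only values believed to be standard (34661 is the code libtiff uses for JBIG). Anything else is left for the caller to report as a raw number.

```python
import struct

# Common values of the TIFF Compression tag (259).
COMPRESSION_NAMES = {
    1: "none", 2: "CCITT RLE", 3: "CCITT Group 3", 4: "CCITT Group 4",
    5: "LZW", 6: "old-style JPEG", 7: "JPEG", 8: "Deflate",
    32773: "PackBits", 34661: "JBIG",
}


def tiff_compression(data: bytes) -> int:
    """Return the Compression tag value from the first IFD of a TIFF."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF: bad byte-order mark")
    magic, ifd_offset = struct.unpack_from(order + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a classic TIFF: magic number is not 42")
    (n_entries,) = struct.unpack_from(order + "H", data, ifd_offset)
    for i in range(n_entries):
        entry = ifd_offset + 2 + 12 * i      # each IFD entry is 12 bytes
        tag, _ftype, _count = struct.unpack_from(order + "HHI", data, entry)
        if tag == 259:
            # A single SHORT sits left-justified in the 4-byte value field.
            (value,) = struct.unpack_from(order + "H", data, entry + 8)
            return value
    return 1  # TIFF 6.0 default when the tag is absent: no compression
```

For example, `COMPRESSION_NAMES.get(tiff_compression(open("G4.tiff", "rb").read()), "unknown")` would be expected to report "CCITT Group 4" if the file is what its name suggests.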

By the way, JBIG2, although it offers great compression, sounds somewhat dangerous. According to the Wikipedia article, it can substitute similar looking graphics blocks (such as "6" for "8"), resulting in a radically incorrect image!

reply by carygravel 1 March 2021

TIFF does not support JBIG2, as far as I know.

PDF, on the other hand, does support JBIG2 (though not JBIG, I think). As you say, there can be issues with the compression, and thus some authorities, particularly in the EU, do not allow JBIG2-compressed PDFs in archives.
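On the PDF side, the compression of an image is declared by its `/Filter` entry (`/CCITTFaxDecode` for G3/G4, `/JBIG2Decode` for JBIG2). A rough sketch for listing the filter names that appear in a PDF follows. It is a plain byte scan, not a PDF parser, so it assumes the object dictionaries are stored uncompressed (no object streams) and it counts filters on all streams, not just images. The function name `filter_names` is hypothetical.

```python
import re
from collections import Counter

# Matches "/Filter /Name" or "/Filter [ /Name1 /Name2 ]".
FILTER_RE = re.compile(rb"/Filter\s*(?:\[\s*((?:/\w+\s*)+)\]|/(\w+))")


def filter_names(pdf_bytes: bytes) -> Counter:
    """Count the stream filter names declared in raw PDF bytes."""
    counts = Counter()
    for match in FILTER_RE.finditer(pdf_bytes):
        if match.group(1) is not None:       # array form
            names = re.findall(rb"/(\w+)", match.group(1))
        else:                                # single-name form
            names = [match.group(2)]
        for name in names:
            counts[name.decode("ascii")] += 1
    return counts
```

Running this on the failing PDF would show whether the image stream is tagged `CCITTFaxDecode` as expected, or whether a `JBIG2Decode` filter has turned up somewhere.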