Subject: Cyrillic letters
1. The following Cyrillic glyphs (names according to
http://www.adobe.com/devnet/font/pdfs/5013.Cyrillic_Font_Spec.pdf)
afii10047 (uppercase 'Э')
afii10049 (uppercase 'Я')
afii10095 (lowercase 'э')
are not displayed when using TrueType fonts. I tried different encodings (CP1251, UTF8) with the same result.
2. When using core fonts, all Cyrillic characters are displayed overlapping each other with CP1251 encoding, and are not displayed at all with UTF8 encoding.
Perl version v5.10.1 built for MSWin32-x86-multi-thread
Binary build 1007 [291969] provided by ActiveState
Operating system Windows Vista Home Premium, Service Pack 1 (ver. 6.0.6001)
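For reference, the three glyphs listed in item 1 can be located in CP1251 with the core Encode module. This is only a sketch: the helper name cp1251_byte is made up for illustration, and the code points are the Unicode values of the letters named in the report.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Unicode code points of the three glyphs named in the report.
my %glyph = (
    afii10047 => 0x042D,   # uppercase E-reversed
    afii10049 => 0x042F,   # uppercase Ya
    afii10095 => 0x044D,   # lowercase e-reversed
);

# Hypothetical helper: return the single CP1251 byte (as an integer)
# that encodes the given Unicode code point.
sub cp1251_byte {
    my ($codepoint) = @_;
    return ord(encode('cp1251', chr($codepoint)));
}

for my $name (sort keys %glyph) {
    printf "%s  U+%04X  CP1251 0x%02X\n",
        $name, $glyph{$name}, cp1251_byte($glyph{$name});
}
```

Knowing the byte values makes it easier to check whether a given font's encoding table covers those positions.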
Subject: test-utf8.pdf
use locale;
use POSIX;
use PDF::API2;
my $encoding = 'cp1251';
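# setlocale() takes a category as its first argument. The locale name is
# platform-specific, so 'cp1251' may need replacing (this is an assumption).
POSIX::setlocale(POSIX::LC_ALL(), $encoding)
or warn "cannot set locale '$encoding'";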
my $pdf = PDF::API2->new();
$pdf->mediabox( 'A4' );
my $page = $pdf->page();
my $txt = $page->text;
my $font = $pdf->ttfont('Times.ttf', '-encode' => $encoding );
my $fontsize = 12;
$txt->font($font,$fontsize);
$txt->translate(10,700);
$txt->text("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
$txt->translate(10,650);
$txt->text("abcdefghijklmnopqrstuvwxyz");
# Cyrillic alphabet; the script file is expected to be saved as CP1251, matching $encoding.
$txt->translate(10,600);
$txt->text("абвгдеёжзийклмнопрстуфхцчшщьыъэюя");
$txt->translate(10,550);
$txt->text("АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯ");
# Switch to the core font; reuse the variables declared above instead of redeclaring them.
$font = $pdf->corefont('Times', '-encode' => $encoding );
$fontsize = 12;
$txt->font($font,$fontsize);
$txt->translate(10,400);
$txt->text("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
$txt->translate(10,350);
$txt->text("abcdefghijklmnopqrstuvwxyz");
# Cyrillic alphabet; the script file is expected to be saved as CP1251, matching $encoding.
$txt->translate(10,300);
$txt->text("абвгдеёжзийклмнопрстуфхцчшщьыъэюя");
$txt->translate(10,250);
$txt->text("АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯ");
# The core font selected earlier is still active; the section labels below use it.
$txt->translate(10,750);
$txt->text("Using true type font:");
$txt->translate(10,450);
$txt->text("Using core font:");
$pdf->saveas( 'test.pdf' );
#
Subject: [rt.cpan.org #57248]
Date: Mon, 15 Feb 2016 16:40:51 -0500
To: bug-PDF-API2 [...] rt.cpan.org
I modified the example test file to display 0x40 through 0xFF for both TrueType and Core fonts. I ran it for CP1251 (Cyrillic), CP1252 (Latin 1), CP1253 (Greek), and CP1254 (Turkish). This is Windows XP SP3, PDF::API2 2.025, Adobe Reader 11.0.08. All four character sets have some variety of MS "Smart Quotes" in the 0x80 - 0x9F range. I have not yet tried UTF-8 encoded text.
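A rough sketch of how such test rows could be built, using only core modules. The helper name byte_rows and the 16-bytes-per-row layout are assumptions for illustration, not what the modified script actually did. In the PDF script each row would be passed to $txt->text(); here the rows are just decoded and printed for inspection.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: return byte strings of up to 16 consecutive byte
# values each, covering $first..$last inclusive.
sub byte_rows {
    my ($first, $last) = @_;
    my @rows;
    for (my $start = $first; $start <= $last; $start += 16) {
        my $end = $start + 15 > $last ? $last : $start + 15;
        push @rows, pack('C*', $start .. $end);
    }
    return @rows;
}

my @rows = byte_rows(0x40, 0xFF);

binmode STDOUT, ':encoding(UTF-8)';
for my $encoding (qw(cp1251 cp1252 cp1253 cp1254)) {
    print "$encoding\n";
    for my $row (@rows) {
        # Bytes that Encode cannot map in this code page are replaced by
        # its default substitution character.
        my $copy = $row;
        printf "  0x%02X: %s\n", ord($row), decode($encoding, $copy);
    }
}
```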
In all cases, the TTF displays perfectly, even the unassigned characters in the Smart Quotes range. The three Cyrillic characters reported missing in the original bug report are present and in the right place. All the Core Font displays have a problem with the unassigned characters in the Smart Quotes range: they still display the empty box, but evidently with a near-zero width, so that the following character mostly overprints it.
Core Font only problems:
CP1251: All Cyrillic and possibly some other characters print correctly, but apparently advance only about 33% of their proper width, so they are overprinted by the following characters.
CP1252: The unassigned characters in the Smart Quotes range get overprinted, but the rest of the Latin-1 characters look OK.
CP1253: The Greek letters behave just like the Cyrillic letters in 1251.
CP1254: The Turkish letters behave just like the Latin-1 letters in 1252.
The bottom line is that TTF looks OK from here (at least for CP125x encoding), but Core Fonts have trouble with unassigned ("box") characters and with non-Latin characters: the glyphs look OK, but the text position is not advanced far enough, so we get overprinting. Perhaps the font data (especially character width) isn't being read correctly? Since it works for (e.g.) CP1252, it seems odd that it would fail for non-Latin sets (note that Turkish is Latin). That would imply that the font files themselves are defective or non-standard in some way.
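One way to test the width theory might be to ask the font object directly for its per-code widths. The sketch below is untested against this bug. The helper narrow_codes and the 300-unit threshold are made up for illustration, and it assumes that wxByEnc() on a PDF::API2 font object returns the advance width for a byte code in 1/1000 em units.

```perl
use strict;
use warnings;

# Hypothetical helper: list the byte codes in $from..$to whose advance
# width, as reported by the font object's wxByEnc() method, is below
# $min_width. Any object that provides wxByEnc() can be passed in.
sub narrow_codes {
    my ($font, $from, $to, $min_width) = @_;
    return grep { ($font->wxByEnc($_) // 0) < $min_width } $from .. $to;
}

# Usage with PDF::API2, if it is installed. The font name and encoding
# follow the test script in this ticket; the threshold is an assumption.
my $ok = eval {
    require PDF::API2;
    my $pdf  = PDF::API2->new();
    my $font = $pdf->corefont('Times', -encode => 'cp1251');
    for my $code (narrow_codes($font, 0xC0, 0xFF, 300)) {
        printf "0x%02X width %s\n", $code, $font->wxByEnc($code) // 'undef';
    }
    1;
};
print "PDF::API2 check skipped: $@\n" unless $ok;
```

If the Cyrillic range (0xC0 to 0xFF in CP1251) comes back with missing or very small widths while the Latin range looks normal, that would point at the core font width tables rather than at the viewer.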