If I write some text to use CP-1252 (Windows Latin 1 + "Smart Quotes"), naturally I would have to tell PDF::API2 what it's working with.
use Encode qw(decode encode_utf8);

$line = readline($filehandle);         # line is in the external encoding
$data = decode("ISO-8859-1", $line);   # data is now in Perl's internal encoding
$text->text($data);                    # add to PDF
print $outfile ( encode_utf8($data) );
While true ISO-8859-1 (Latin-1) text is OK to leave undecoded, many people on Windows machines may well end up supplying CP-1252 encoded text strings (a variation of Latin-1), which will cause problems if left undecoded.
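To make the difference concrete, here is a minimal sketch using only the core Encode module (no PDF::API2 involved). The sample bytes and variable names are made up for illustration. It decodes the same two "smart quote" bytes once as CP-1252 and once as ISO-8859-1:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: 0x93 and 0x94 are curly double quotes in CP-1252,
# but C1 control characters in true ISO-8859-1.
my $bytes = "\x93Hi\x94";

my $as_cp1252 = decode('cp1252',     $bytes);
my $as_latin1 = decode('iso-8859-1', $bytes);

# Show the code points of the first and last character in each result.
printf "cp1252:     U+%04X ... U+%04X\n",
    ord($as_cp1252), ord(substr($as_cp1252, -1));
printf "iso-8859-1: U+%04X ... U+%04X\n",
    ord($as_latin1), ord(substr($as_latin1, -1));
```

Decoded as CP-1252 the quotes come out as U+201C and U+201D; decoded as ISO-8859-1 they come out as U+0093 and U+0094, which are control characters rather than quotes.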
For example, there was a proposed change for hyphenation, where the soft hyphen was encoded as UTF-8 (two bytes).
Quote: Link?
Yet another issue that is nonexistent provided the input data is properly decoded.
Not quite. If I'm looking to match a soft hyphen, the code will have to know whether it's to look for ISO-8859-1 (single byte \xAD) or UTF-8 (double byte \xC2\xAD).
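As a rough illustration of the two positions, here is a sketch using only the core Encode module, with a made-up sample word. Once each input has been decoded with its correct encoding, a single pattern for U+00AD matches both. What the code still has to know is which encoding to pass to decode() in the first place:

```perl
use strict;
use warnings;
use Encode qw(decode);

# The soft hyphen (U+00AD) as it arrives in two external encodings.
my $latin1_bytes = "co\xADoperate";       # one byte:  AD
my $utf8_bytes   = "co\xC2\xADoperate";   # two bytes: C2 AD

# Decode each input according to its own (assumed known) encoding.
my $from_latin1 = decode('iso-8859-1', $latin1_bytes);
my $from_utf8   = decode('UTF-8',      $utf8_bytes);

# After decoding, both strings hold the same characters,
# so one pattern finds the soft hyphen in either.
for my $text ($from_latin1, $from_utf8) {
    my @parts = split /\x{AD}/, $text;
    print join('|', @parts), "\n";
}
```

Both iterations print co|operate. Matching against the raw, undecoded bytes would instead need a separate pattern per encoding.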
Regarding glyph detection, maybe take a walk through the examples/ directory and look at the code that prints out all the glyphs in a font (e.g., 020_corefonts). That code seems to be able to tell if a glyph is defined for a given code point.