
Pseudo page objects

*

sciurius
Pseudo page objects
« July 07, 2017, 02:29:13 AM »
For a complex typesetting job I would like to do the following:
- create text and gfx objects
- create the desired texts and graphics
- establish the size (bounding box)
- put the text and gfx objects onto another page

The main purpose is to decide whether the result still fits on the page or should be moved to the next page.
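The fit decision described above can be sketched independently of any PDF library. This is a minimal Python illustration, not PDF::API2 code: `BBox`, `union`, and `fits` are hypothetical names, and coordinates are assumed to be in points with the origin at the lower left of the page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned box in points, origin at the lower left of the page."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def height(self) -> float:
        return self.y1 - self.y0


def union(boxes):
    """Smallest box that encloses every box in the group."""
    boxes = list(boxes)
    return BBox(min(b.x0 for b in boxes), min(b.y0 for b in boxes),
                max(b.x1 for b in boxes), max(b.y1 for b in boxes))


def fits(group_box, cursor_y, bottom_margin):
    """True if the group fits between the current cursor and the bottom margin."""
    return cursor_y - group_box.height >= bottom_margin


# One text line plus one graphic, measured in their own local coordinates.
parts = [BBox(0, 0, 200, 14), BBox(0, 20, 120, 100)]
group = union(parts)
```

If `fits(group, cursor_y, bottom_margin)` is false, the whole group would go to the next page. How the group is then placed there is a separate question that this sketch does not address.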

My gut feeling is that this should be possible with PDF::API2 but my attempts have been unsuccessful.
Any suggestions/ideas?

*

Phil
Re: Pseudo page objects
« Reply #1: July 07, 2017, 09:27:13 AM »
Well, if your markup needs are very simple (just a stream of text to be fit to a column), the paragraph and section calls may be enough. They will return any text that needs to go to the next page (but could allow a widow). Anything more complicated than that would require either

  • A virtual output page, where you could tentatively write text and other things to the page, and if you're happy with it (i.e., it doesn't overflow), give a command to "put ink on the page" and make the write permanent.
  • A test write to ask how much space something will take, and make the decision whether to write on this page or go to the next. It would be the normal processing up to the point of "writing" to the output data structure. All the processing would have to be done again for the actual write.
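The second option, a test write, might look roughly like the Python sketch below. It is only an illustration under loud assumptions: character counts stand in for real font metrics, `page` is a plain list rather than a PDF page, and `layout` and `write_block` are made-up names, not PDF::API2 calls.

```python
import textwrap


def layout(text, width_chars, leading):
    """Wrap text and return (lines, height in points).

    Character counts are a stand-in for real glyph widths.
    """
    lines = textwrap.wrap(text, width=width_chars)
    return lines, len(lines) * leading


def write_block(page, text, y, bottom, width_chars=40, leading=12, dry_run=False):
    """Return the new cursor position, or None if the block does not fit.

    With dry_run=True the layout work is done but nothing is recorded on the page.
    """
    lines, height = layout(text, width_chars, leading)
    if y - height < bottom:
        return None
    if not dry_run:
        for i, line in enumerate(lines):
            page.append((y - (i + 1) * leading, line))
    return y - height
```

Note that the layout is computed again on the real write, which is exactly the repeated processing mentioned above. Keeping the `lines` from the dry run would avoid that, at the cost of holding extra state.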

If a full paragraph doesn't fit, you would want to know if you can split it (without widows or orphans), which of course gets into the field of paragraph shaping. You probably would not want to rearrange text, but it might be desirable to move up an image in order to fill a hole at the bottom of a page, or move text above an image to do the same thing. Either way (with text), you're likely going to need to split a paragraph. It would be up to you to be aware of text such as "in the image above" when that text has been moved above the referenced image! Some sort of cross-reference call to output your choice of appropriate text might be nice.

It shouldn't be difficult to add a call to tell you how much space remains on the page (both in lines at the current settings, and as a dimension such as points or cm). It might not be too bad to add a call to ask how much space (lines and dimension) a given new paragraph will take up (simple text from a string), and where (if anywhere) it can be split without introducing a widow or orphan. Here you're getting into more advanced typesetting that may not be appropriate for PDF::API2, but should be thought about for companion packages.
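As a rough illustration of the two queries mentioned here, the Python sketch below computes the lines remaining on a page and a split point that avoids widows and orphans. Both function names and the default minimum of two lines on each side of the break are assumptions made for the example, not existing PDF::API2 calls.

```python
def lines_remaining(cursor_y, bottom_margin, leading):
    """Whole lines that still fit between the cursor and the bottom margin."""
    return max(0, int((cursor_y - bottom_margin) // leading))


def split_point(total_lines, room, min_orphan=2, min_widow=2):
    """How many lines of a paragraph to keep on the current page.

    Returns total_lines if everything fits, 0 if the whole paragraph should
    move to the next page, otherwise a count that leaves at least min_orphan
    lines here and at least min_widow lines on the next page.
    """
    if total_lines <= room:
        return total_lines
    keep = min(room, total_lines - min_widow)
    return keep if keep >= min_orphan else 0
```

For example, a five-line paragraph with room for four lines would be split three and two rather than four and one.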

*

sciurius
Re: Pseudo page objects
« Reply #2: July 07, 2017, 03:10:57 PM »
Thanks for the feedback.
Unfortunately my needs are more complex than a paragraph of text.
Although both suggestions (virtual page and test write) are interesting, they involve repeating everything (on the real page and position) after the testing — precisely what I would like to avoid.

*

Phil
Re: Pseudo page objects
« Reply #3: July 07, 2017, 05:45:29 PM »
I think the virtual page method could be done without repeating everything. It would involve adding a flag to items added to objects, indicating whether this is a real write or if it's tentative. A "tentative" write, if it didn't overflow the page (or otherwise displease you), would be converted into a "real" write by changing the flag. Otherwise, you either keep adding "tentative" content, or erase it and do something else (new page, etc.). It might even be possible to change X and Y values of "tentative" content to re-position it (e.g., to stretch text baselines slightly to fill the page). That might require additional changes to the object data structure to mark what is a changeable address, and what should be kept relative to another location (e.g., when drawing you might want to change only absolute addresses, and not relative addresses).
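The tentative flag idea could be modelled along these lines. This is a Python sketch of the data structure only, not anything PDF::API2 provides: `Item`, `VirtualPage`, and their methods are hypothetical, and each item's `y` is assumed to be its top edge in points.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    x: float
    y: float          # top edge of the item, in points
    height: float
    content: str
    tentative: bool = True


@dataclass
class VirtualPage:
    bottom_margin: float = 72.0
    items: list = field(default_factory=list)

    def add(self, item):
        self.items.append(item)
        return item

    def overflows(self):
        """True if any item extends below the bottom margin."""
        return any(it.y - it.height < self.bottom_margin for it in self.items)

    def commit(self):
        """Turn every tentative write into a real one by flipping the flag."""
        for it in self.items:
            it.tentative = False

    def rollback(self):
        """Drop tentative items and return them, e.g. for reuse on a new page."""
        removed = [it for it in self.items if it.tentative]
        self.items = [it for it in self.items if not it.tentative]
        return removed

    def shift_tentative(self, dy):
        """Re-position tentative content only; committed items stay put."""
        for it in self.items:
            if it.tentative:
                it.y += dy
```

Because `rollback` hands back the removed items, they can be added to the next page without being built a second time.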

As an alternative, would it be useful, in general, to be able to "walk" all objects on the page, and move or delete items under program control? There could be "helper" functions to change text baseline spacing in a consistent manner, etc., or scale up/down some drawing (graphics). If a paragraph has overflowed, you could even chop off the bottom and move it to the next page, or call a paragraph shaper to do a little "nip and tuck" to get the paragraph to fit (replace the existing paragraph). None of this would be trivial, of course. It would probably involve keeping extra data with the page's objects, which would be purged when it's actually written to file.
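Two of the helpers mentioned here, changing baseline spacing and chopping off an overflowing tail, can be sketched in Python as follows. The sketch assumes the extra per-page data is simply a list of `(baseline_y, text)` pairs. Both function names are made up for the example.

```python
def respace(baselines, factor):
    """Scale the gaps between baselines by factor, keeping the first one fixed."""
    if not baselines:
        return []
    top = baselines[0]
    return [top - (top - y) * factor for y in baselines]


def chop_overflow(lines, bottom_margin):
    """Split (baseline_y, text) pairs into lines that stay and text that moves on."""
    kept = [(y, t) for y, t in lines if y >= bottom_margin]
    moved = [t for y, t in lines if y < bottom_margin]
    return kept, moved
```

A factor slightly above 1 stretches the text to fill a page, and a factor slightly below 1 squeezes it.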

If none of this works for you, perhaps you could describe some detailed examples of what you're trying to do here. I take it you're trying to output pages without a lot of (or perhaps, any) manual intervention, so trial-and-error fitting is unacceptable.

*

sciurius
Re: Pseudo page objects
« Reply #4: July 09, 2017, 04:49:17 PM »
You're thinking too much in terms of low-level operations.
I thought it would be straightforward to just create the objects, and then move the top-object to another location on another page if necessary.
I think I'll need to rethink this a bit more.

*

Phil
Re: Pseudo page objects
« Reply #5: July 10, 2017, 11:26:18 AM »
Well, I can't think of getting much lower in operations than manually moving objects between pages... but that's an interesting idea. Would the object be unique on the page (i.e., not sharing the same $text object as everything else)? Then it could be moved as one object, rather than having to first split up an object. It might be feasible if this is done early enough in the process, before a bunch of other stuff is done that creates a lot of cross links between PDF objects (targeted to a page) and complicates things.

I've been mulling over something like this for a while, and think that "writing" to a virtual page might be best, giving the program a chance to move stuff around on a page and even between pages. You might keep the last two or three pages "written" in virtual form (sort of a VM) to ease the task of adjustment, and when the next new page is started, declare the oldest page "done" and actually write it out to the file. Something like that.
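A minimal Python sketch of that sliding window of virtual pages is shown below, assuming pages are plain lists and "writing to the file" is just appending to another list. `PageWindow` and its methods are hypothetical names, not part of PDF::API2.

```python
from collections import deque


class PageWindow:
    """Keeps the last `keep` pages editable and flushes older ones."""

    def __init__(self, keep=3):
        self.keep = keep
        self.pending = deque()   # virtual pages, oldest first
        self.written = []        # stand-in for the output file

    def new_page(self):
        """Start a page; the oldest pending page is declared done if the window is full."""
        page = []
        self.pending.append(page)
        while len(self.pending) > self.keep:
            self.written.append(tuple(self.pending.popleft()))
        return page

    def move_last_item(self, src, dst):
        """Move the last item of pending page src to the top of pending page dst."""
        self.pending[dst].insert(0, self.pending[src].pop())

    def finish(self):
        """Flush every remaining virtual page."""
        while self.pending:
            self.written.append(tuple(self.pending.popleft()))
```

Pages are frozen into tuples when flushed, which makes the point that nothing can be adjusted once a page has left the window.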