[cairo] PDF memory usage?
britten at caris.com
Wed Jun 10 09:26:48 PDT 2009
Adrian Johnson wrote:
>> - Does the PDF surface keep all data in memory until the final
>> finish/flush is called?
> Data is kept in memory for the current page. When show_page() is called
> the data is written to the PDF file and freed from memory.
Ah, thanks. I confess I didn't see show_page() on the base Surface
there - I had been looking at the PDF surface...
Is there any conceptual problem with something like flush_to_file()
that writes the current contents to disk and frees them, but doesn't
advance to a new page? If it's theoretically possible, I might be
interested in exploring that option...
Also, can you (or someone) provide a bit of information about
cairo_pdf_surface_create_for_stream(), and when/how would someone
make use of it? The "written incrementally" in the description
sounds intriguing, but I'm not quite sure if it'd be useful to me...
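For what it's worth, here is a minimal sketch of how the stream variant might be exercised, assuming pycairo is installed. As far as I can tell, passing a file-like object to cairo.PDFSurface is the pycairo counterpart of cairo_pdf_surface_create_for_stream(). CountingSink and write_pages are hypothetical names made up for this illustration, and the page size and rectangle are arbitrary.

```python
# Sketch only: assumes pycairo is available. CountingSink and write_pages
# are hypothetical names introduced for illustration.
try:
    import cairo  # pycairo
except ImportError:
    cairo = None


class CountingSink:
    """Minimal file-like object; cairo only needs a write() method."""

    def __init__(self):
        self.bytes_seen = 0
        self.calls = 0

    def write(self, data):
        # A real sink would forward the data to a socket, pipe, or file here.
        self.bytes_seen += len(data)
        self.calls += 1
        return len(data)


def write_pages(sink, pages=3):
    # 595 x 842 points is roughly A4 (a point is 1/72 inch).
    surface = cairo.PDFSurface(sink, 595, 842)
    ctx = cairo.Context(surface)
    for _ in range(pages):
        ctx.rectangle(50, 50, 200, 100)
        ctx.fill()
        # Per the reply quoted above, the page data is written out here.
        ctx.show_page()
    surface.finish()


if cairo is not None:
    sink = CountingSink()
    write_pages(sink)
    print(sink.calls, "write calls,", sink.bytes_seen, "bytes")
```

The sink never holds the output itself, so this only helps if the data buffered for the current page is what fits in memory; it does not change when cairo decides to emit a page.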
>> For example, a 32-bit image that completely fills an A0 sheet
>> at 1200dpi would need over 8G of memory.
One thing I forgot to ask about:
I believe I recall seeing discussion here about somehow putting
JPEG data into a PDF. Would that approach be of any interest
to me, I wonder? Wouldn't writing out my image as a compressed
JPEG be [much] smaller than the in-memory 32-bit pixels?
It sounds related to some of the stuff covered at
http://www.verypdf.com/pdfinfoeditor/compression.htm, but I'm not
sure if any of that stuff is currently used/exposed by Cairo, or
perhaps planned for the future?
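To put a number on the quoted "over 8G" figure, here is a quick back-of-the-envelope calculation of the raw, uncompressed size. It assumes an ISO A0 sheet of 841 mm x 1189 mm and 4 bytes per pixel; raw_image_bytes is a hypothetical helper, not anything from cairo.

```python
# Illustrative arithmetic only; raw_image_bytes is a hypothetical helper.
MM_PER_INCH = 25.4


def raw_image_bytes(width_mm, height_mm, dpi, bytes_per_pixel=4):
    """Size of an uncompressed raster covering the sheet at the given dpi."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel


a0_bytes = raw_image_bytes(841, 1189, 1200)  # ISO A0 is 841 mm x 1189 mm
print(f"{a0_bytes / 2**30:.2f} GiB")
```

This is the baseline that any compressed encoding of the same image would be compared against.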
> PDF has a file size limit of about 10GB due to the use of a 10 digit
> number for the byte offset of each object.
Interesting - I didn't know that!
> Cairo is currently limited to
> 2GB on 32-bit architectures, and 10GB on 64-bit. I would not be
> surprised if many PDF consumers do not handle file sizes greater than
> the 2/4GB limit of 32-bit signed/unsigned integers.
Yep, I realize all the various OS limits override everything else.
I'm still just trying to explore what all the various limits are,
to get an idea what sort of solution might be possible...
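For my own reference, the limits mentioned above work out as follows. The "about 10GB" figure is taken here to be the largest value a 10-digit decimal offset can hold, and the "2/4GB" figures the maxima of signed and unsigned 32-bit integers; the names in the snippet are made up for illustration.

```python
# Illustrative arithmetic only; the names below are not part of cairo.
GIB = 2**30

limits = {
    "10-digit decimal byte offset": 10**10 - 1,
    "32-bit signed integer": 2**31 - 1,
    "32-bit unsigned integer": 2**32 - 1,
}

for name, max_offset in limits.items():
    print(f"{name}: {max_offset} bytes ({max_offset / GIB:.2f} GiB)")
```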
>> The typical approach for dealing with large output like this
>> seems to be to try and chunk/tile the data. However, with
>> the target being PDF, I'm not sure if this is possible
> Cairo does not support this. My recommendation is to use a 64-bit
> machine with enough memory for the job.
For development, I've already done that (on Linux), and that may
end up being my "Hail Mary" play. Unfortunately, getting the
(Windows) applications upgraded to 64-bit is going to take
longer ... :(
As always, many thanks for all the info!