[cairo] [PATCH 3/3] win32: Allow gdi operations for argb32 surfaces (allowed by surface flags)

Vasily Galkin galkin-vv at yandex.ru
Sat Apr 28 19:27:01 UTC 2018

This completes the patch series that speeds up CAIRO_OPERATOR_SOURCE
when it is used to copy data
to an argb32 cairo surface corresponding to a win32 dc
from a "backbuffer": a DibSection-based cairo surface
created with cairo_surface_create_similar().

This final patch allows the gdi compositor to be used on argb32 surfaces.
In practice, for display surfaces only copying (by BitBlt) is allowed with gdi,
since other operations are filtered out by flags in their implementations.
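
As a rough illustration of the gate described above, here is a minimal,
self-contained model of the format and flag tests. The types, enum values
and the MODEL_CAN_BITBLT constant are hypothetical stand-ins, not cairo's
real definitions, and the model simply returns false where the real
function continues with code not shown in the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; these are NOT cairo's
 * real types or values. */
typedef enum {
    MODEL_FORMAT_ARGB32,
    MODEL_FORMAT_RGB24,
    MODEL_FORMAT_A8
} model_format_t;

/* Stands in for CAIRO_WIN32_SURFACE_CAN_BITBLT; the value is made up. */
#define MODEL_CAN_BITBLT (1u << 0)

typedef struct {
    bool fallback;          /* surface is already using the fallback path */
    model_format_t format;  /* pixel format of the destination */
    unsigned flags;         /* capability flags of the destination */
} model_surface_t;

/* Gate before the patch: only rgb24 destinations pass the format test. */
static bool model_check_blit_before(const model_surface_t *dst)
{
    if (dst->fallback)
        return false;
    if (dst->format != MODEL_FORMAT_RGB24)
        return false;
    /* Simplification: the real function has more code after this test. */
    return (dst->flags & MODEL_CAN_BITBLT) != 0;
}

/* Gate after the patch: argb32 destinations pass the format test too,
 * but the capability flag still decides whether a blit is allowed. */
static bool model_check_blit_after(const model_surface_t *dst)
{
    if (dst->fallback)
        return false;
    if (dst->format != MODEL_FORMAT_RGB24
        && dst->format != MODEL_FORMAT_ARGB32)
        return false;
    /* Simplification: the real function has more code after this test. */
    return (dst->flags & MODEL_CAN_BITBLT) != 0;
}
```

In this model an argb32 surface with the blit flag set is rejected by the
"before" gate and accepted by the "after" gate, while a surface without
the flag is rejected by both.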

But copying pixels is the only operation used in the common scenario
"prepare an offscreen image and put it on the screen", so this is important for
presenting argb32 windows with cairo directly
or with gtk+gdk (which nowadays always creates argb32 windows).

Before this patch, pixel copy worked by:
1. mapping the image to memory (by copying data from the window dc to system memory,
which is very slow on Windows, maybe due to gpu or interprocess access);
2. copying the new data over that image;
3. copying the updated image from system memory back to the window dc.

After this patch there is only one step:
2+3. copying the new data over the window dc.
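
The difference between the two paths can be sketched with plain memory
buffers. This is only a model: the function names are hypothetical, plain
memcpy() stands in for the GDI transfers, and it does not reproduce the
fact that the readback in step 1 is far slower than an ordinary memory
copy on a real window dc.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Old path (model): read the window back, overwrite the copy, write it
 * back. Returns the number of bytes moved, or 0 if allocation fails. */
static size_t model_copy_via_readback(uint32_t *window,
                                      const uint32_t *backbuffer,
                                      size_t pixels)
{
    size_t bytes = pixels * sizeof(uint32_t);
    uint32_t *staging = malloc(bytes);

    if (staging == NULL)
        return 0;

    memcpy(staging, window, bytes);      /* step 1: window dc -> system memory */
    memcpy(staging, backbuffer, bytes);  /* step 2: new data over that image */
    memcpy(window, staging, bytes);      /* step 3: system memory -> window dc */

    free(staging);
    return 3 * bytes;
}

/* New path (model): one direct copy, standing in for a single BitBlt. */
static size_t model_copy_direct(uint32_t *window,
                                const uint32_t *backbuffer,
                                size_t pixels)
{
    size_t bytes = pixels * sizeof(uint32_t);

    memcpy(window, backbuffer, bytes);   /* steps 2+3 merged */
    return bytes;
}
```

Both functions leave the same pixels in the window buffer; the old path
moves three times as many bytes, and in the SOURCE-copy case the data
read back in step 1 is entirely overwritten by step 2.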

Completely eliminating step 1 gives a huge speedup and allows
argb32 cairo drawing to be as fast as typical dibsection-buffered gdi drawing.

There is a quick-and-dirty cairo-vs-gdi perf test made for this patch set.
The results show a several-fold improvement:

Before speedup

Painting 5000 32bits-per-pixel single-color frames of size 1056x1056 for profiling
GDI entire pipeline      : 4.123983 GB/s, 5408.053900 ms, 924.546998 FPS
GDI entire drawing      : 4.156272 GB/s, 5366.039400 ms
cairo entire pipeline      : 0.835951 GB/s, 26679.463300 ms, 187.410067 FPS
cairo entire drawing      : 0.838992 GB/s, 26582.750800 ms
cairo fill inmem    : 16.130683 GB/s, 1382.627100 ms
cairo to window     : 1.102623 GB/s, 20226.963700 ms

After speedup (running several times shows that there is 5-10% inaccuracy, so these results shouldn't be used as a source for comparing raw gdi vs cairo)

Painting 5000 32bits-per-pixel single-color frames of size 1056x1056 for profiling
GDI entire pipeline      : 4.139421 GB/s, 5387.883400 ms, 928.008204 FPS
GDI entire drawing      : 4.165124 GB/s, 5354.635400 ms
cairo entire pipeline      : 4.029344 GB/s, 5535.075100 ms, 903.330110 FPS
cairo entire drawing      : 4.063073 GB/s, 5489.126000 ms
cairo fill inmem    : 22.665569 GB/s, 983.991200 ms
cairo to window     : 5.049950 GB/s, 4416.423700 ms
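
For reading the tables above: the GB/s and FPS columns appear to follow
directly from the frame count, frame size and elapsed time. The helpers
below are a hypothetical sketch, not the test program's own code, and
they assume 1 GB = 10^9 bytes and 4 bytes per pixel, which is consistent
with the printed figures.

```c
/* Throughput in GB/s (assuming 1 GB = 1e9 bytes) for a run that paints
 * `frames` frames of `width` x `height` pixels in `elapsed_ms`. */
static double bench_gb_per_s(int frames, int width, int height,
                             int bytes_per_pixel, double elapsed_ms)
{
    double total_bytes = (double)frames * width * height * bytes_per_pixel;

    return total_bytes / (elapsed_ms / 1000.0) / 1e9;
}

/* Frames per second for the same run. */
static double bench_fps(int frames, double elapsed_ms)
{
    return frames / (elapsed_ms / 1000.0);
}
```

For example, bench_gb_per_s(5000, 1056, 1056, 4, 5408.0539) gives about
4.124, and bench_fps(5000, 5408.0539) gives about 924.5, matching the
first "GDI entire pipeline" line.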

There is an end-user visible speedup too; it relates to the following bug:


The cairo speedup allows more simultaneous meld windows
without eating 100% of a cpu core's time on spinner rendering.

gtk's speedup is about 1.7x, not as large as the ~7-8x of pure cairo on the results above.
It looks like gtk has some problems caching cairo surfaces
and recreates them every frame with an initial black fill.
 src/win32/cairo-win32-gdi-compositor.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/win32/cairo-win32-gdi-compositor.c b/src/win32/cairo-win32-gdi-compositor.c
index 0873391..4a09a70 100644
--- a/src/win32/cairo-win32-gdi-compositor.c
+++ b/src/win32/cairo-win32-gdi-compositor.c
@@ -488,7 +488,8 @@ static cairo_bool_t check_blit (cairo_composite_rectangles_t *composite)
     if (dst->fallback)
 	return FALSE;
-    if (dst->win32.format != CAIRO_FORMAT_RGB24)
+    if (dst->win32.format != CAIRO_FORMAT_RGB24
+	&& dst->win32.format != CAIRO_FORMAT_ARGB32)
 	return FALSE;
     if (dst->win32.flags & CAIRO_WIN32_SURFACE_CAN_BITBLT)
