A Quick Overview of Chrome's Rendering Path

I just wanted to see what Chrome was doing for its rendering compared to Firefox, especially after Slimming Paint. Chrome has three main processes that interact to render a web page.

  1.  The Renderer process - This is what you think of when you hear about multi-process Chrome. Each tab has its own renderer process.

  2. The GPU process - This is a sandboxed process where Chrome issues all hardware-accelerated GPU calls for compositing.

  3. The UI process - One process that renders the UI such as the tabs at the top of the browser.

At a high level, content is first rendered in the renderer process (1). The rendered content is shared with the GPU process to do the actual compositing (2). The GPU process then issues OpenGL calls to composite the content along with the browser UI on the screen. The most interesting stuff happens in the renderer process. The overall flow in the renderer process as of today to render a webpage (lots are changing in Slimming Paint) is:

  1. On the main thread of the renderer process, rendering starts during parsing of the HTML, which creates a DOM tree.

  2. These DOM nodes mostly get turned into LayoutObjects, which know how to paint themselves. These get layerized into multiple different kinds of layers you can read up on here.

  3. GraphicsLayers paint the items off the LayoutObjects into blink::DisplayItems. When content gets painted, Chrome issues Skia draw calls to draw the content. But the content isn't actually drawn; it's recorded into an SkPicture. Most display items in Chrome are wrappers around SkPictures or map to the Skia API.

  4. The generated blink::DisplayList is then translated into Chrome Compositor (cc) display lists, which are roughly the same but no longer contain references to the LayoutObjects. Yes, there are two display lists.

  5. This cc::DisplayList is shipped to the compositor thread of the renderer process. Using a work queue, multiple compositor tile worker threads replay the recorded Skia drawing commands to draw into 256x256 tiles.

  6. The resulting tiles are then composited with hardware acceleration in the GPU process. The GPU process does the tile GPU upload.
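The record-then-replay model at the heart of steps 3-5 can be sketched in a few lines. This is a toy illustration, not Skia's actual API; the `Picture` and `Canvas` names are made up for the example:

```python
class Picture:
    """A recording of draw commands, replayable later on any canvas.

    This mimics the role of an SkPicture: painting records commands
    instead of drawing pixels, and rasterization replays them later.
    """
    def __init__(self):
        self.commands = []

    def record(self, command, *args):
        self.commands.append((command, args))

    def playback(self, canvas):
        for command, args in self.commands:
            getattr(canvas, command)(*args)


class Canvas:
    """A fake raster target that just logs what it draws."""
    def __init__(self):
        self.ops = []

    def draw_text(self, text, x, y):
        self.ops.append(f"text {text!r} at ({x}, {y})")

    def draw_image(self, name, x, y):
        self.ops.append(f"image {name!r} at ({x}, {y})")


# Painting records commands instead of drawing immediately...
picture = Picture()
picture.record("draw_text", "Hello World", 8, 8)
picture.record("draw_image", "firefox.png", 8, 42)

# ...and rasterization replays them later, possibly on another thread.
canvas = Canvas()
picture.playback(canvas)
```

The key property is that the recording is a plain data structure, so it can be handed to another thread (the compositor) and replayed there without touching the DOM or layout tree again.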

Let's take a look at this with a real example. Consider a simple page with just some text and a 256x256 image:

<html>
Hello World
<img src="firefox.png" />
</html>

This generates a blink DisplayList like so:

[{index: 0, client: "0x1cd204404018 LayoutView #document", type: "DrawingDocumentBackground", rect: [0.000000,0.000000 1050.000000x755.000000], cacheIsValid: true},
{index: 1, client: "0x1cd204438010 InlineTextBox 'Hello World'", type: "DrawingPaintPhaseForeground", rect: [8.000000,8.000000 80.000000x18.000000], cacheIsValid: true},
{index: 2, client: "0x1cd204428018 LayoutImage IMG", type: "DrawingPaintPhaseForeground", rect: [8.000000,42.000000 256.000000x256.000000], cacheIsValid: false}]

The index is where the item exists in the display list. The client is what the display item is there to paint, and is a reference to a LayoutObject. The first item, LayoutView #document, is just the empty white background we have on a simple page. The "type" is the only unique thing about a blink display item: it says in which paint phase the item should be painted. Painting in this context means walking the layout tree and generating blink::DisplayItems, which are wrappers around an SkPicture. An SkPicture is just a recording of draw commands, such as "draw text" or "draw image", which map to the Skia API. Thus, unlike Gecko, Blink doesn't draw from the display items. Instead, display items are the result of recording the draw commands issued while painting items recursively down the layout tree. Once we have the blink display items, they are translated into cc display items, which are also just wrappers around the recorded SkPicture. Once the translation is done, the main thread notifies the compositor thread that there is work to do.
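The blink-to-cc translation step can be sketched as follows. The class and field names here are illustrative, not Chrome's real types; the point is that both kinds of item wrap the same recorded picture, but the cc copy drops the reference back to the LayoutObject:

```python
class BlinkDisplayItem:
    """Illustrative stand-in for a blink display item."""
    def __init__(self, client, item_type, rect, picture):
        self.client = client        # reference back to the LayoutObject
        self.item_type = item_type  # paint phase, e.g. "DrawingPaintPhaseForeground"
        self.rect = rect            # (x, y, width, height)
        self.picture = picture      # the recorded draw commands


class CcDisplayItem:
    """Illustrative stand-in for a cc display item: no LayoutObject reference."""
    def __init__(self, rect, picture):
        self.rect = rect
        self.picture = picture


def translate(blink_list):
    # Keep the geometry and the recorded picture; drop the client reference,
    # since the compositor thread must not touch main-thread layout objects.
    return [CcDisplayItem(item.rect, item.picture) for item in blink_list]


blink_list = [
    BlinkDisplayItem("LayoutView #document", "DrawingDocumentBackground",
                     (0, 0, 1050, 755), "picture-0"),
    BlinkDisplayItem("InlineTextBox 'Hello World'", "DrawingPaintPhaseForeground",
                     (8, 8, 80, 18), "picture-1"),
]
cc_list = translate(blink_list)
```

Dropping the LayoutObject reference is what makes it safe to hand the list to the compositor thread: nothing on that thread can accidentally poke at main-thread layout state.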

The compositor thread rasterizes content for the screen in 256x256 tiles. The compositor thread itself doesn't actually replay the SkPicture draw commands either; instead, it sets up a work queue with items to draw. Multiple compositor tile worker threads then take a work item and replay the SkPicture draw calls onto an SkCanvas. In our example we have three display items, but all of them are painted onto one SkCanvas.

Once the tile worker threads have finished rasterizing the cc::DisplayItems, Chrome can start compositing everything together. By default on Windows, Chrome doesn't use GPU acceleration to rasterize content, but it does for compositing. Remember the GPU process from the beginning? All GPU commands must be executed in the GPU process so that when the GPU driver does something bad (and it happens too often), it doesn't take down the browser. To do this, Chrome uses shared memory with the GPU process. The GPU process reads the final rasterized image and uploads it to the GPU. Then the GPU process issues GL draw calls to draw quads, which are just the rectangles containing the final bitmap image for each tile. Finally, Chrome's ubercompositor composites all the tiles together with the browser's UI.
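The tile work-queue scheme can be sketched like this. This is a minimal illustration, not Chrome code: the queue, worker count, and "rasterize" step are all stand-ins, and the 1050x755 size is the viewport from the display list dump above:

```python
import queue
import threading

TILE_SIZE = 256

def make_tiles(width, height):
    """Split the page into 256x256 tile coordinates."""
    return [(x, y)
            for y in range(0, height, TILE_SIZE)
            for x in range(0, width, TILE_SIZE)]

def raster_worker(work_queue, results, lock):
    """A tile worker: pop tiles and 'rasterize' them until told to stop."""
    while True:
        tile = work_queue.get()
        if tile is None:          # sentinel: no more work for this worker
            work_queue.task_done()
            break
        # A real rasterizer would replay the recorded SkPicture draw
        # commands clipped to this tile; we just record that it ran.
        with lock:
            results.add(tile)
        work_queue.task_done()

work_queue = queue.Queue()
results = set()
lock = threading.Lock()
workers = [threading.Thread(target=raster_worker,
                            args=(work_queue, results, lock))
           for _ in range(4)]
for w in workers:
    w.start()

# The compositor thread enqueues one work item per tile...
tiles = make_tiles(1050, 755)
for tile in tiles:
    work_queue.put(tile)
# ...then one sentinel per worker so every worker shuts down cleanly.
for _ in workers:
    work_queue.put(None)
for w in workers:
    w.join()
```

A 1050x755 page needs a 5x3 grid of 256x256 tiles, so fifteen work items end up rasterized, in whatever order the workers happen to pick them up.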