The Render Graph
The render graph is the system that schedules GPU work. Passes declare what they read and what they write, and the graph figures out the ordering, the memory layout, and the load and store operations. The motivation is that hand-ordering a real renderer does not scale.
The Problem: Manual Pass Ordering
A real renderer has dozens of passes. Shadow maps, geometry, SSAO, SSR, bloom, tonemapping, UI. Each one reads from and writes to intermediate textures. Doing this without automation means four kinds of pain.
The first is ordering. Shadow maps before geometry, geometry before SSAO, SSAO before compositing. Adding one pass means figuring out where it fits in the chain, and reordering one pass means re-checking three others.
The second is texture lifecycle. Allocate intermediate textures, track which ones are alive when, decide when to clear versus load and when to store versus discard. Getting this wrong shows up as black screens or stale data from a previous frame.
The third is memory. SSAO's intermediate texture and SSR's intermediate texture might never be alive at the same time. Without aliasing, both live in VRAM whether they need to or not.
The fourth is dynamic passes. Disabling bloom should not require rewriting the compositing pass's inputs. With hardcoded ordering, every conditional pass becomes an if statement threaded through the entire pipeline.
The render graph (also called a frame graph, after the Frostbite GDC 2017 talk "FrameGraph: Extensible Rendering Architecture in Frostbite") replaces all four with declared dependencies. Passes describe what they need. The graph handles ordering, memory, and lifecycle.
How It Works
The frame is modeled as a directed acyclic graph. Nodes are passes. Edges are resource dependencies. An edge from pass A to pass B means A produces data that B consumes.
This is the same abstraction as a build system (Make, Bazel) or a task scheduler. Given the edges, a topological sort produces a valid execution order. Once the order exists, the graph can analyze resource lifetimes, alias memory, compute load and store operations, and cull passes that do not contribute to any external output.
The key property is that dependencies are declarative, not imperative. A pass says "I read scene_color, I write bloom," not "run me after MeshPass and before PostProcessPass." Adding a new pass means declaring its slots, not editing every other pass that touches the same resources.
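To make the declarative model concrete, here is a minimal sketch, not Nightshade's implementation, that derives writer-to-reader edges from declared reads and writes and then orders the passes with Kahn's algorithm. `PassDecl` and `schedule` are hypothetical names, and the sketch assumes each resource has exactly one writer.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical pass declaration: a name plus the resources it reads and writes.
struct PassDecl {
    name: &'static str,
    reads: Vec<&'static str>,
    writes: Vec<&'static str>,
}

/// Derive writer -> reader edges from the declarations, then order the passes
/// with Kahn's algorithm. Returns None if the declarations form a cycle.
fn schedule(passes: &[PassDecl]) -> Option<Vec<&'static str>> {
    // Resource name -> index of the pass that writes it (one writer assumed).
    let mut writer: HashMap<&str, usize> = HashMap::new();
    for (i, pass) in passes.iter().enumerate() {
        for &res in &pass.writes {
            writer.insert(res, i);
        }
    }

    // One edge writer -> reader for every read of a resource some pass writes.
    let mut successors: Vec<Vec<usize>> = vec![Vec::new(); passes.len()];
    let mut in_degree = vec![0usize; passes.len()];
    for (reader, pass) in passes.iter().enumerate() {
        for res in &pass.reads {
            if let Some(&w) = writer.get(res) {
                if w != reader {
                    successors[w].push(reader);
                    in_degree[reader] += 1;
                }
            }
        }
    }

    // Kahn's algorithm: repeatedly emit a pass with no unscheduled dependencies.
    let mut ready: VecDeque<usize> =
        (0..passes.len()).filter(|&i| in_degree[i] == 0).collect();
    let mut order = Vec::with_capacity(passes.len());
    while let Some(i) = ready.pop_front() {
        order.push(passes[i].name);
        for &next in &successors[i] {
            in_degree[next] -= 1;
            if in_degree[next] == 0 {
                ready.push_back(next);
            }
        }
    }

    // Fewer scheduled passes than declared means a dependency cycle.
    if order.len() == passes.len() { Some(order) } else { None }
}
```

Declaring `composite`, `bloom`, and `mesh` in that order still yields `mesh`, `bloom`, `composite`, because the order comes from the edges rather than from the order of declaration.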
What This Buys
Five things fall out of the graph automatically.
- Pass ordering is topologically sorted from read and write dependencies.
- Transient textures with non-overlapping lifetimes share GPU memory.
- The graph picks `LoadOp::Clear`, `LoadOp::Load`, `StoreOp::Store`, and `StoreOp::Discard` per attachment based on who reads what.
- Passes that do not contribute to any external output are culled.
- Passes can be enabled and disabled at runtime without recompiling the graph.
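The load and store selection can be illustrated with a small sketch over an already-sorted pass list. The `LoadOp` and `StoreOp` enums here are simplified local stand-ins (in wgpu, `LoadOp::Clear` carries a clear value), and `OrderedPass` and `attachment_ops` are hypothetical names. The sketch also assumes two things the text does not state: a later write counts as a use, because it loads the existing contents, and external resources are always stored.

```rust
use std::collections::HashSet;

// Simplified local stand-ins for the GPU API's load/store enums.
#[derive(Debug, PartialEq, Clone, Copy)]
enum LoadOp { Clear, Load }

#[derive(Debug, PartialEq, Clone, Copy)]
enum StoreOp { Store, Discard }

/// Hypothetical pass record, already in execution order.
struct OrderedPass {
    reads: Vec<&'static str>,
    writes: Vec<&'static str>,
}

/// For each (pass index, written resource), choose a load op and a store op.
fn attachment_ops(
    passes: &[OrderedPass],
    external: &[&str],
) -> Vec<(usize, &'static str, LoadOp, StoreOp)> {
    let mut already_written: HashSet<&str> = HashSet::new();
    let mut ops = Vec::new();
    for (i, pass) in passes.iter().enumerate() {
        for &res in &pass.writes {
            // The first writer clears; later writers keep what is already there.
            let load = if already_written.insert(res) { LoadOp::Clear } else { LoadOp::Load };
            // Store if a later pass uses the resource or it leaves the graph.
            let used_later = passes[i + 1..]
                .iter()
                .any(|p| p.reads.contains(&res) || p.writes.contains(&res));
            let store = if used_later || external.contains(&res) {
                StoreOp::Store
            } else {
                StoreOp::Discard
            };
            ops.push((i, res, load, store));
        }
    }
    ops
}
```

For a clear, mesh, blit chain, the depth attachment is cleared and stored by the first pass, then loaded and discarded by the second, since nothing after the mesh pass uses it.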
The RenderGraph Struct
```rust
pub struct RenderGraph<C = ()> {
    graph: DiGraph<GraphNode<C>, ResourceId>,     // petgraph directed graph
    pass_nodes: HashMap<String, NodeIndex>,       // pass name -> graph node
    resources: RenderGraphResources,              // texture/buffer descriptors and handles
    execution_order: Vec<NodeIndex>,              // topologically sorted pass order
    store_ops: HashMap<ResourceId, StoreOp>,      // per-resource store operations
    clear_ops: HashSet<(NodeIndex, ResourceId)>,  // which passes clear which resources
    aliasing_info: Option<ResourceAliasingInfo>,  // memory sharing between transients
    culled_passes: HashSet<NodeIndex>,            // passes removed by dead-pass culling
    // ...
}
```
The generic parameter C is the "configs" type passed to passes during execution. Nightshade uses RenderGraph<World> so passes can read ECS state directly.
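The role of the generic parameter can be sketched with a stripped-down graph. `SketchPass`, `SketchGraph`, `LightPass`, and the `World` struct below are hypothetical stand-ins, not the real `PassNode` trait or Nightshade's `World`. The only point is that whatever type the graph is instantiated with is handed by reference to every pass at execution time.

```rust
/// Hypothetical stand-in for the real pass trait; only the generic parameter matters.
trait SketchPass<C> {
    fn execute(&mut self, configs: &C) -> String;
}

/// Stripped-down graph: a list of passes that all receive the same `C`.
struct SketchGraph<C = ()> {
    passes: Vec<Box<dyn SketchPass<C>>>,
}

impl<C> SketchGraph<C> {
    fn new() -> Self {
        SketchGraph { passes: Vec::new() }
    }

    /// Run every pass, forwarding the shared configs reference to each one.
    fn execute(&mut self, configs: &C) -> Vec<String> {
        self.passes.iter_mut().map(|pass| pass.execute(configs)).collect()
    }
}

/// Stand-in for an ECS world; the real type is far richer.
struct World {
    light_count: usize,
}

struct LightPass;

impl SketchPass<World> for LightPass {
    fn execute(&mut self, world: &World) -> String {
        // The pass reads state straight from the configs type.
        format!("shading {} lights", world.light_count)
    }
}
```

With `SketchGraph<World>`, a pass such as `LightPass` can read world state directly, with no separate channel for per-frame data.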
Lifecycle
1. Setup Phase (once at startup)
```rust
let mut graph = RenderGraph::new();

// Declare textures
let depth = graph.add_depth_texture("depth")
    .size(1920, 1080)
    .clear_depth(0.0)
    .transient();

let scene_color = graph.add_color_texture("scene_color")
    .format(wgpu::TextureFormat::Rgba16Float)
    .size(1920, 1080)
    .clear_color(wgpu::Color::BLACK)
    .transient();

let swapchain = graph.add_color_texture("swapchain")
    .format(surface_format)
    .external();

// Add passes with slot bindings
graph.add_pass(
    Box::new(clear_pass),
    &[("color", scene_color), ("depth", depth)],
)?;
graph.add_pass(
    Box::new(mesh_pass),
    &[("color", scene_color), ("depth", depth)],
)?;
graph.add_pass(
    Box::new(blit_pass),
    &[("input", scene_color), ("output", swapchain)],
)?;

// Compile: build edges, sort, compute aliasing
graph.compile()?;
```
2. Per-Frame Execution
```rust
// Provide the swapchain texture for this frame
graph.set_external_texture(swapchain_id, swapchain_view, width, height);

// Execute all passes, get command buffers
let command_buffers = graph.execute(&device, &queue, &world)?;

// Submit to GPU
queue.submit(command_buffers);
```
Key Methods
| Method | Description |
|---|---|
| `new()` | Create an empty graph |
| `add_color_texture()` | Declare a color render target (returns builder) |
| `add_depth_texture()` | Declare a depth buffer (returns builder) |
| `add_buffer()` | Declare a GPU buffer (returns builder) |
| `add_pass()` | Add a pass with slot-to-resource bindings |
| `pass()` | Fluent pass builder (alternative to `add_pass`) |
| `compile()` | Build dependency graph, topological sort, compute aliasing |
| `execute()` | Prepare and run all passes, return command buffers |
| `set_external_texture()` | Provide an external texture (e.g. swapchain) each frame |
| `set_pass_enabled()` | Enable/disable a pass at runtime |
| `get_pass_mut()` | Access a pass for runtime configuration |
| `resize_transient_resource()` | Change dimensions of a transient texture |
Compilation Steps
compile() runs seven steps in sequence.
- Build dependency edges. For each resource, an edge is created from the writer to every reader.
- Topological sort. The passes are ordered so every pass executes after its dependencies.
- Compute store ops. Each resource write is marked `Store` or `Discard` based on whether any later pass reads it.
- Compute clear ops. The first pass that writes a resource with a clear value gets `Clear`. The rest get `Load`.
- Compute resource lifetimes. Each transient gets a `first_use` and `last_use` pass index.
- Compute resource aliasing. Transient resources with non-overlapping lifetimes are assigned to the same pool slot.
- Dead pass culling. Passes that do not contribute to any external output are marked for skipping.
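The lifetime, aliasing, and culling steps can be sketched over a sorted pass list as follows. This is an illustration under stated assumptions, not the engine's code. `OrderedPass`, `lifetimes`, `assign_slots`, and `live_passes` are hypothetical names, the aliasing is a simple greedy first-fit, and every transient is assumed to be compatible with every other (same size and format), which a real allocator would have to check.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical pass record, already in execution order.
struct OrderedPass {
    reads: Vec<&'static str>,
    writes: Vec<&'static str>,
}

/// first_use / last_use pass index for every transient resource.
fn lifetimes(
    passes: &[OrderedPass],
    transients: &[&'static str],
) -> BTreeMap<&'static str, (usize, usize)> {
    let mut spans = BTreeMap::new();
    for (i, pass) in passes.iter().enumerate() {
        for &res in pass.reads.iter().chain(&pass.writes) {
            if transients.contains(&res) {
                // The first touch creates the span; every later touch extends it.
                let span = spans.entry(res).or_insert((i, i));
                span.1 = i;
            }
        }
    }
    spans
}

/// Greedy aliasing: resources whose lifetimes do not overlap share a pool slot.
fn assign_slots(
    spans: &BTreeMap<&'static str, (usize, usize)>,
) -> BTreeMap<&'static str, usize> {
    let mut by_start: Vec<(&'static str, (usize, usize))> =
        spans.iter().map(|(&res, &span)| (res, span)).collect();
    by_start.sort_by_key(|&(_, (first, _))| first);

    // last_use of the resource currently occupying each slot.
    let mut slot_busy_until: Vec<usize> = Vec::new();
    let mut slots = BTreeMap::new();
    for (res, (first, last)) in by_start {
        // Reuse a slot whose previous owner is already dead, else open a new one.
        let slot = match slot_busy_until.iter().position(|&end| end < first) {
            Some(free) => free,
            None => {
                slot_busy_until.push(last);
                slot_busy_until.len() - 1
            }
        };
        slot_busy_until[slot] = last;
        slots.insert(res, slot);
    }
    slots
}

/// Walk the passes backwards, keeping only those that feed an external output.
fn live_passes(passes: &[OrderedPass], external: &[&'static str]) -> Vec<bool> {
    let mut needed: HashSet<&str> = external.iter().copied().collect();
    let mut live = vec![false; passes.len()];
    for (i, pass) in passes.iter().enumerate().rev() {
        if pass.writes.iter().any(|w| needed.contains(w)) {
            live[i] = true;
            // Everything a live pass reads becomes needed in turn.
            needed.extend(pass.reads.iter().copied());
        }
    }
    live
}
```

In a frame where an SSAO intermediate is last used before an SSR intermediate is first written, `assign_slots` gives both the same slot, and a debug pass whose output nothing external consumes is reported as not live.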
Sub-Chapters
- Resources & Textures covers resource types, builders, and the difference between external and transient.
- Passes & the PassNode Trait covers how to implement a custom pass.
- Dependency Resolution & Scheduling covers how the order is computed.
- Resource Aliasing & Memory covers how transients share memory.
- Custom Passes walks through end-to-end examples.