17. RenderGraph Compute/UAV pattern (SSAO/SSR type)
This chapter covers the compute pass as a practical pattern in URP RenderGraph, using two reference cases:
- SSAO style: screen-space data (Depth/Normals) → output texture
- SSR style: temporal data (History) combined with motion/depth/normal/color
17.1 Raster vs Compute: When is Compute Advantageous?
Typical cases where compute is advantageous:
- When UAV (RandomWrite) access is needed (e.g. histogram/reduction/tile classification)
- Preprocessing that divides the screen into tiles/clusters (Forward+ light list, SSR tiles, etc.)
- Multiple outputs or unstructured memory access (e.g. scatter writes, appending a list to a buffer)
Conversely, cases where a raster (full-screen) pass is advantageous:
- Simple per-pixel filters dominated by texture sampling (blur/color correction)
- When load/store can be minimized on tile-based GPUs
Practical conclusion
Choosing compute usually means paying for more complex resource design (buffers/UAVs/barriers).
So first make clear why you actually need compute.
17.2 Basic framework of Compute Pass in RenderGraph
Following the URP documentation, a compute pass differs from a raster pass in two places:
- AddComputePass instead of AddRasterRenderPass
- ComputeGraphContext instead of RasterGraphContext
Concept code:
public override void RecordRenderGraph(RenderGraph renderGraph, ContextContainer frameData)
{
    using (var builder = renderGraph.AddComputePass<PassData>("MyComputePass", out var passData))
    {
        // Declare resource usage here (builder.UseTexture / builder.UseBuffer).
        builder.SetRenderFunc((PassData data, ComputeGraphContext ctx) =>
        {
            // Record compute commands on ctx.cmd.
        });
    }
}
Important differences
In a raster pass, the render target is set with SetRenderAttachment.
In a compute pass there is no render target; UAV/resource binding is usually done with SetComputeTextureParam/SetComputeBufferParam.
17.2.1 Compute pass is also a “Contract”
Like Raster, Compute must follow the rules of RenderGraph.
- Textures/buffers that are read are declared as Read
- Textures/buffers that are written are declared as Write
Based on these declarations, RenderGraph determines:
- Resource lifetime (when to create/discard)
- Barriers (synchronization)
- Execution order
In practice, compute passes tend to break less from render target binding mistakes (the typical raster failure) and more from UAV declaration/binding mistakes.
17.3 UAV(RandomWrite) texture design: enableRandomWrite
To use a texture as a write target (UAV) in a compute pass, random write must be enabled when the texture is created.
RenderGraph texture desc (concept):
desc.enableRandomWrite = true;
UAV textures are also subject to format restrictions that vary by platform. Where possible, choose a format that URP already uses or that you have verified on the target platform.
Common choices:
- SSAO: R8, R16, RHalf (precision/bandwidth balance)
- SSR: RGBAHalf-class formats (intermediate results/history)
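One way to act on the format restriction is to probe UAV support at runtime before creating the texture. The following is a minimal sketch, assuming Unity 6, where SystemInfo.IsFormatSupported takes a GraphicsFormatUsage (older versions expose FormatUsage instead). The helper name UavFormatPicker and the candidate order are illustrative assumptions, not a URP API.

```csharp
using UnityEngine;
using UnityEngine.Experimental.Rendering; // GraphicsFormat, GraphicsFormatUsage

// Hypothetical helper (not part of URP): picks the first candidate format
// that the current platform can bind for load/store (UAV) access.
public static class UavFormatPicker
{
    // Candidates for a single-channel AO target, smallest first (assumed order).
    static readonly GraphicsFormat[] AoCandidates =
    {
        GraphicsFormat.R8_UNorm,
        GraphicsFormat.R16_SFloat,
        GraphicsFormat.R32_SFloat,
    };

    public static bool TryPickAoFormat(out GraphicsFormat format)
    {
        format = GraphicsFormat.None;

        // Without compute support there is no UAV path at all.
        if (!SystemInfo.supportsComputeShaders)
            return false;

        foreach (var candidate in AoCandidates)
        {
            if (SystemInfo.IsFormatSupported(candidate, GraphicsFormatUsage.LoadStore))
            {
                format = candidate;
                return true;
            }
        }
        return false;
    }
}
```

If TryPickAoFormat returns false, falling back to a raster implementation of the effect is usually the safer choice.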
17.4 BufferHandle / GraphicsBuffer: Input/output buffer pattern
As the examples in the URP documentation show, a compute pass often writes to a buffer that is then read back on the CPU or consumed by another pass.
17.4.1 Creating an output buffer (Structured Buffer)
Concept:
- Create a GraphicsBuffer (Structured)
- Pass it into the RenderGraph pass data as a BufferHandle
- Write to it from the compute shader in the pass
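The three steps above can be sketched as a single pass, following the shape of the URP documentation sample. The shader property name _Output, the element count, and the int element type are illustrative assumptions; the kernel is assumed to declare [numthreads(64,1,1)].

```csharp
using UnityEngine;
using UnityEngine.Rendering;
using UnityEngine.Rendering.RenderGraphModule;
using UnityEngine.Rendering.Universal;

// Sketch: a compute pass that writes into a structured buffer.
sealed class BufferOutputPass : ScriptableRenderPass
{
    const int ElementCount = 64;    // assumption: one thread per element
    const int ThreadsPerGroup = 64; // assumption: matches [numthreads(64,1,1)]

    readonly ComputeShader _cs;
    readonly int _kernel;
    GraphicsBuffer _output;

    class PassData
    {
        public ComputeShader cs;
        public int kernel;
        public BufferHandle output;
    }

    public BufferOutputPass(ComputeShader cs, int kernel)
    {
        _cs = cs;
        _kernel = kernel;
        // Owned by this pass rather than by RenderGraph, so it persists across frames.
        _output = new GraphicsBuffer(GraphicsBuffer.Target.Structured, ElementCount, sizeof(int));
    }

    public GraphicsBuffer Output => _output;

    public override void RecordRenderGraph(RenderGraph renderGraph, ContextContainer frameData)
    {
        // Import the externally owned buffer so RenderGraph can track access to it.
        BufferHandle output = renderGraph.ImportBuffer(_output);

        using (var builder = renderGraph.AddComputePass<PassData>("Buffer Output", out var passData))
        {
            passData.cs = _cs;
            passData.kernel = _kernel;
            passData.output = output;

            // Declare the write so barriers and ordering are set up correctly.
            builder.UseBuffer(passData.output, AccessFlags.Write);

            builder.SetRenderFunc((PassData data, ComputeGraphContext ctx) =>
            {
                ctx.cmd.SetComputeBufferParam(data.cs, data.kernel, "_Output", data.output);
                ctx.cmd.DispatchCompute(data.cs, data.kernel, ElementCount / ThreadsPerGroup, 1, 1);
            });
        }
    }

    // Call from the owning feature's Dispose to free the GPU buffer.
    public void Release()
    {
        _output?.Release();
        _output = null;
    }
}
```

Keeping the GraphicsBuffer outside RenderGraph is a design choice for data that must outlive the frame; for purely frame-local buffers, a RenderGraph-created buffer may be the better fit.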
Caution
To read the result on the CPU, consider an asynchronous path such as AsyncGPUReadback; a synchronous readback stalls the pipeline.
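A non-blocking readback might look like the sketch below. It assumes the buffer holds int elements; the class name BufferReadback is hypothetical. The callback runs on a later frame, so the consumer has to tolerate a delay of a few frames.

```csharp
using Unity.Collections;
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical helper: queues an asynchronous readback of a GraphicsBuffer.
public static class BufferReadback
{
    public static void RequestInts(GraphicsBuffer source)
    {
        // Returns immediately; OnComplete is invoked once the data has arrived.
        AsyncGPUReadback.Request(source, OnComplete);
    }

    static void OnComplete(AsyncGPUReadbackRequest request)
    {
        if (request.hasError)
        {
            Debug.LogWarning("GPU readback failed.");
            return;
        }

        // The NativeArray is only valid inside this callback; copy out what you need.
        NativeArray<int> data = request.GetData<int>();
        if (data.Length > 0)
            Debug.Log($"Readback: {data.Length} elements, first = {data[0]}");
    }
}
```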
17.4.2 Five design questions when using buffers
- Is this buffer “frame-by-frame temporary” or “camera history”?
- Is the number of elements fixed? (If it varies, as with an SSR tile list, a counter or prefix sum is required)
- Which buffer type is needed: Structured/Raw/Append/Consume?
- Does the CPU need to read it? (If so, design the read timing/frequency/latency)
- Should buffers be separated per eye (XR) and per camera (multi-camera)?
17.5 Pattern 1: SSAO (Half-Resolution) Compute design steps
When implementing SSAO as a RenderGraph compute pass, the minimum design usually looks like this.
Input (required)
- Depth (Scene Depth)
- Normals (or normal reconstruction)
- Random/Noise (Blue Noise/Rotation Vector)
- Camera parameters (projection/near plane, etc.)
Output
- AO texture (half resolution)
- Upsampled result (full resolution) or blurred result, if needed
RenderGraph pass configuration (recommended)
- AO creation (Compute, half-res UAV write)
- AO Blur (Compute or Raster)
- Upsample + synthesis (usually Raster)
Practical tip
A common hybrid generates the AO in compute and handles blur/compositing in a full-screen raster pass.
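The raster half of such a hybrid could be sketched as follows. This assumes the AO TextureHandle comes from an earlier compute pass recorded in the same frame, and that compositeMaterial is a project-specific material that samples the blit source and blends multiplicatively onto the camera color. The class name AoCompositeSketch is hypothetical.

```csharp
using UnityEngine;
using UnityEngine.Rendering;
using UnityEngine.Rendering.RenderGraphModule;
using UnityEngine.Rendering.Universal;

// Sketch: read the compute-generated AO and composite it over the camera color.
static class AoCompositeSketch
{
    class PassData
    {
        public TextureHandle ao;
        public Material material;
    }

    public static void Record(RenderGraph renderGraph, UniversalResourceData resources,
                              TextureHandle ao, Material compositeMaterial)
    {
        using (var builder = renderGraph.AddRasterRenderPass<PassData>("AO Composite", out var passData))
        {
            passData.ao = ao;
            passData.material = compositeMaterial;

            // Declaring the read links this pass to the compute pass that wrote the AO.
            builder.UseTexture(passData.ao, AccessFlags.Read);
            builder.SetRenderAttachment(resources.activeColorTexture, 0);

            builder.SetRenderFunc((PassData data, RasterGraphContext ctx) =>
            {
                // Bilinear sampling in the material performs the half-res to full-res upsample.
                Blitter.BlitTexture(ctx.cmd, data.ao, new Vector4(1, 1, 0, 0), data.material, 0);
            });
        }
    }
}
```

Because this pass reads the AO texture, RenderGraph also has a consumer for the compute output and has no reason to cull the compute pass.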
17.6 Pattern 2: SSR (with time/history) Compute design steps
Screen Space Reflection (SSR) combines screen-space geometric constraints with temporal accumulation, so its resource requirements grow quickly.
Input (Representative)
- Color (current frame)
- Depth
- Normals
- Roughness/Metallic (material parameters)
- Motion Vectors
- History Color (previous frame)
Output (representative)
- Reflection Color (current)
- Temporal Accumulation Results (History Update)
RenderGraph Design Points
- History textures are per camera and must be reset on camera cuts and resolution changes.
Related: 04.8 History Render Textures
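The reset rule can be expressed as a small piece of per-camera state. This is a sketch with a hypothetical class name, HistoryValidity; how a camera cut is detected is left to the caller and passed in as a flag, which is an assumption about the surrounding code.

```csharp
// Hypothetical helper: tracks whether a camera's history texture can be reused.
public sealed class HistoryValidity
{
    int _width = -1;
    int _height = -1;
    bool _hasHistory;

    // Call once per frame and per camera, before recording the temporal pass.
    // Returns true when history must be discarded: first frame, resize, or camera cut.
    public bool NeedsReset(int width, int height, bool cameraCut)
    {
        bool sizeChanged = width != _width || height != _height;
        bool reset = !_hasHistory || sizeChanged || cameraCut;

        // Remember this frame's state for the next comparison.
        _width = width;
        _height = height;
        _hasHistory = true;
        return reset;
    }
}
```

Keep one instance per camera. When NeedsReset returns true, the temporal pass should ignore the history input for that frame, for example by using only the current frame's result.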
17.6.1 “Minimum pass decomposition” in SSR (practical perspective)
SSR implementations usually combine some of the following:
1. Ray march (or Hierarchical Z): search for candidate hit points (Compute)
2. Resolve: color sampling/fade at hit points (Compute or Raster)
3. Temporal Accumulation: history accumulation (Compute)
4. Denoise/Blur: noise removal (Compute or Raster)
Of these, step 3 is the reason History texture and motion vector design becomes essential.
17.7 Practical Code Skeleton: URP RendererFeature + Compute Pass
This code is a skeleton showing the “structure”. The actual API signature may differ depending on the URP/RenderGraph version.
Be sure to check and adjust by compiling in a Unity 6.3 project.
using UnityEngine;
using UnityEngine.Experimental.Rendering; // GraphicsFormat
using UnityEngine.Rendering;
using UnityEngine.Rendering.RenderGraphModule;
using UnityEngine.Rendering.Universal;

public sealed class ComputeAoFeature : ScriptableRendererFeature
{
    [System.Serializable]
    public sealed class Settings
    {
        public ComputeShader computeShader;
        public int kernelIndex = 0;
    }

    public Settings settings = new();

    sealed class ComputeAoPass : ScriptableRenderPass
    {
        // Must match [numthreads(8,8,1)] in the compute shader.
        const int GroupSize = 8;

        readonly ComputeShader _cs;
        readonly int _kernel;

        class PassData
        {
            public ComputeShader cs;
            public int kernel;
            public TextureHandle depth;
            public TextureHandle normals;
            public TextureHandle aoUav;
            public Vector3Int dispatch; // thread group counts (gx, gy, gz)
        }

        public ComputeAoPass(ComputeShader cs, int kernel)
        {
            _cs = cs;
            _kernel = kernel;
        }

        public override void RecordRenderGraph(RenderGraph renderGraph, ContextContainer frameData)
        {
            var resources = frameData.Get<UniversalResourceData>();

            // Inputs (whether these exist depends on the project/renderer settings).
            TextureHandle depth = resources.activeDepthTexture;
            TextureHandle normals = resources.cameraNormalsTexture;
            if (!depth.IsValid() || !normals.IsValid())
                return;

            // Output UAV texture: start from the camera color desc for sizing,
            // then override what should not be inherited from it.
            var aoDesc = renderGraph.GetTextureDesc(resources.activeColorTexture);
            aoDesc.name = "AO_UAV";
            aoDesc.colorFormat = GraphicsFormat.R8_UNorm; // assumed format; verify UAV support per platform
            aoDesc.msaaSamples = MSAASamples.None;        // the AO target is written as a plain 2D UAV
            aoDesc.clearBuffer = false;
            aoDesc.enableRandomWrite = true;
            aoDesc.width = Mathf.Max(1, aoDesc.width / 2);
            aoDesc.height = Mathf.Max(1, aoDesc.height / 2);
            var ao = renderGraph.CreateTexture(aoDesc);

            using (var builder = renderGraph.AddComputePass<PassData>("Compute AO", out var passData))
            {
                passData.cs = _cs;
                passData.kernel = _kernel;
                passData.depth = depth;
                passData.normals = normals;
                passData.aoUav = ao;
                // Round up so partially covered edge tiles are still dispatched.
                passData.dispatch = new Vector3Int(
                    (aoDesc.width + GroupSize - 1) / GroupSize,
                    (aoDesc.height + GroupSize - 1) / GroupSize,
                    1);

                builder.UseTexture(passData.depth, AccessFlags.Read);
                builder.UseTexture(passData.normals, AccessFlags.Read);
                builder.UseTexture(passData.aoUav, AccessFlags.Write);

                builder.SetRenderFunc((PassData data, ComputeGraphContext ctx) =>
                {
                    var cmd = ctx.cmd;
                    cmd.SetComputeTextureParam(data.cs, data.kernel, "_CameraDepthTexture", data.depth);
                    cmd.SetComputeTextureParam(data.cs, data.kernel, "_CameraNormalsTexture", data.normals);
                    cmd.SetComputeTextureParam(data.cs, data.kernel, "_AOTexture", data.aoUav);
                    cmd.DispatchCompute(data.cs, data.kernel, data.dispatch.x, data.dispatch.y, data.dispatch.z);
                });
            }

            // Optionally expose the result through a global slot so a later pass can access it
            // (whether this is needed depends on the project):
            // resources.xyz = ao; or cmd.SetGlobalTexture(...)
            // Note: RenderGraph may cull a pass whose outputs are never read.
        }
    }

    ComputeAoPass _pass;

    public override void Create()
    {
        // Recreate (or clear) the pass whenever the settings change.
        _pass = settings.computeShader != null
            ? new ComputeAoPass(settings.computeShader, settings.kernelIndex)
            : null;
    }

    public override void AddRenderPasses(ScriptableRenderer renderer, ref RenderingData renderingData)
    {
        if (_pass == null)
            return;

        // Ask URP to generate the depth and normals inputs this pass reads.
        _pass.ConfigureInput(ScriptableRenderPassInput.Depth | ScriptableRenderPassInput.Normal);
        renderer.EnqueuePass(_pass);
    }
}
17.7.1 HLSL (Compute) example of using UAV texture in shader
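// Compute shader snippet
#pragma kernel CSMain

RWTexture2D<float> _AOTexture;
Texture2D<float> _CameraDepthTexture;

[numthreads(8,8,1)]
void CSMain(uint3 id : SV_DispatchThreadID)
{
    // The dispatch is rounded up to whole groups, so skip threads outside the texture.
    uint width, height;
    _AOTexture.GetDimensions(width, height);
    if (id.x >= width || id.y >= height)
        return;

    // Example only: write depth straight through.
    // The AO target is half resolution, so read the matching full-resolution depth texel.
    float d = _CameraDepthTexture[id.xy * 2];
    _AOTexture[id.xy] = saturate(d);
}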
17.8 Performance design points (Compute)
17.8.1 Tile/thread group size
- numthreads(8,8,1) is a common default, but it is not necessarily optimal.
- The optimum depends on the memory access pattern (coherence), cache behavior, and whether shared memory is used.
Practical routine:
- First, verify correctness with 8x8
- Then measure performance with 16x16 and other sizes (per platform)
- Check whether the bottleneck is bandwidth or ALU (Profiler/GPU capture)
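When experimenting with different group sizes, it helps to derive the dispatch counts from the kernel instead of hard-coding 8. The sketch below uses ComputeShader.GetKernelThreadGroupSizes for that; the class name DispatchMath is hypothetical.

```csharp
using UnityEngine;

// Hypothetical helper: keeps Dispatch group counts in sync with the kernel's numthreads.
public static class DispatchMath
{
    // Integer ceiling division: number of groups needed to cover `size` threads.
    public static int DivRoundUp(int size, int groupSize) => (size + groupSize - 1) / groupSize;

    // Reads the kernel's declared numthreads and returns the group counts for a 2D target.
    public static Vector3Int GroupCounts(ComputeShader cs, int kernel, int width, int height)
    {
        cs.GetKernelThreadGroupSizes(kernel, out uint x, out uint y, out _);
        return new Vector3Int(DivRoundUp(width, (int)x), DivRoundUp(height, (int)y), 1);
    }
}
```

For a 960x540 half-resolution target, 8x8 groups give 120 x 68 groups and 16x16 groups give 60 x 34, because 540 is not a multiple of either size and the last row of groups is only partially filled.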
17.8.2 Reduce bandwidth (texture read/write)
A poorly designed compute pipeline reads and writes too many textures, which creates a bandwidth bottleneck.
- Use half resolution aggressively (SSAO/blur/some SSR intermediate results)
- Use smaller formats (R8/R16, etc., where possible)
- Remove unnecessary intermediate textures (use identical descs so RenderGraph can reuse allocations)
Related: 04.9 Blit optimization
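A rough footprint calculation shows how much resolution and format choices matter. This is a back-of-the-envelope sketch (uncompressed, no mips, one full read or write); the class name TextureFootprint is hypothetical.

```csharp
// Hypothetical helper: bytes touched by one full read or write of a 2D texture.
public static class TextureFootprint
{
    public static long Bytes(int width, int height, int bytesPerPixel)
        => (long)width * height * bytesPerPixel;
}

// Full-resolution RGBAHalf (8 bytes per pixel) at 1920x1080:
//   TextureFootprint.Bytes(1920, 1080, 8) == 16,588,800 bytes (about 15.8 MiB)
// Half-resolution R8 (1 byte per pixel) at 960x540:
//   TextureFootprint.Bytes(960, 540, 1) == 518,400 bytes
// That is a 32x difference per pass that touches the texture once.
```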
17.9 Debugging checklist (Compute)
- Is enableRandomWrite = true set in the texture desc?
- Did you declare write access in the builder? (UseTexture(..., AccessFlags.Write))
- Are the thread group size (numthreads) and the dispatch calculation consistent?
- Does the input texture actually exist? (Check with Requirements/RenderGraph Viewer)
- Does the platform support compute? (SystemInfo.supportsComputeShaders)
17.10 Official documentation (recommended)
- Execute compute shader in Render Pass (URP, RenderGraph): https://docs.unity3d.com/Manual/urp/render-graph-compute-shader-run.html
17.11 Read next
- Lit include chain/function map: 16. URP Lit actual map
- “Fully compatible” Pass template: 18. URP Lit-compatible Pass template