
Alex_Vlachos_Advanced_VR_Rendering_GDC2015

This document covers advanced rendering techniques for virtual reality, summarizing Valve's three-plus years of VR hardware and software research. Key topics include methods for stereo rendering; timing techniques such as prediction and avoiding GPU bubbles; reducing specular aliasing with normal-map mipping and roughness values; and geometric specular aliasing. The goal is high-quality rendering at a low GPU min spec to support widespread adoption of VR.

Advanced VR Rendering Alex Vlachos, Valve Alex@ValveSoftware.com
Outline 2
● VR at Valve
● Methods for Stereo Rendering
● Timing: Scheduling, Prediction, VSync, GPU Bubbles
● Specular Aliasing & Anisotropic Lighting
● Miscellaneous VR Rendering Topics

VR at Valve 3
● Began VR research 3+ years ago
● Both hardware and software engineers
● Custom optics designed for VR
● Display technology – low persistence, global display
● Tracking systems
  ● Fiducial-based positional tracking
  ● Desktop dot-based tracking and controllers
  ● Laser-tracked headset and controllers
● SteamVR API – cross-platform, OpenVR

HTC Vive Developer Edition Specs 4
● Refresh rate: 90 Hz (11.11 ms per frame)
● Low persistence, global display
● Framebuffer: 2160x1200 (1080x1200 per eye)
● Off-screen rendering ~1.4x in each dimension:
  ● 1512x1680 per eye = 2,540,160 shaded pixels per eye (brute force)
● FOV is about 110 degrees
● 360° room-scale tracking
● Multiple tracked controllers and other input devices
Room-Scale Tracking 5
Optics & Distortion (Pre-Warp) 6
The warp pass uses 3 sets of UVs for RGB separately to account for spatial and chromatic distortion. (Visualizing the 1.4x render target scalar)

Optics & Distortion (Post-Warp) 7
The warp pass uses 3 sets of UVs for RGB separately to account for spatial and chromatic distortion. (Visualizing the 1.4x render target scalar)

Shaded Visible Pixels per Second 8
● 720p @ 30 Hz: 27 million pixels/sec
● 1080p @ 60 Hz: 124 million pixels/sec
● 30" monitor 2560x1600 @ 60 Hz: 245 million pixels/sec
● 4K monitor 4096x2160 @ 30 Hz: 265 million pixels/sec
● VR 1512x1680x2 @ 90 Hz: 457 million pixels/sec
  ● We can reduce this to 378 million pixels/sec (later in the talk)
  ● Equivalent to a 30" monitor @ 100 Hz for a non-VR renderer
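The figures on this slide fall out of straightforward arithmetic (pixels per frame times refresh rate, doubled for two eyes). A quick sketch to reproduce them, with illustrative names of my own choosing:

```python
def pixels_per_sec(width, height, hz, eyes=1):
    """Shaded pixels per second for a display at the given refresh rate."""
    return width * height * hz * eyes

rates = {
    "720p @ 30 Hz": pixels_per_sec(1280, 720, 30),
    "1080p @ 60 Hz": pixels_per_sec(1920, 1080, 60),
    '30" monitor @ 60 Hz': pixels_per_sec(2560, 1600, 60),
    "4K monitor @ 30 Hz": pixels_per_sec(4096, 2160, 30),
    "VR @ 90 Hz": pixels_per_sec(1512, 1680, 90, eyes=2),
}
for name, r in rates.items():
    print(f"{name}: {r // 1_000_000} million pixels/sec")
```

Running this reproduces the 27 / 124 / 245 / 265 / 457 million pixels/sec figures from the slide.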
There Are No "Small" Effects 9
● Tracking allows users to get up close to anything in the tracked volume
● You can't implement a super expensive effect and claim "it's just this small little thing in the corner"
● Even your floors need to be higher fidelity than we have traditionally authored
● If it's in your tracked volume, it must be high fidelity

VR Rendering Goals 10
● Lowest GPU min spec possible
  ● We want VR to succeed, but we need customers
  ● The lower the min spec, the more customers we have
● Aliasing should not be noticeable to customers
  ● Customers refer to aliasing as "sparkling"
● Algorithms should scale up to multi-GPU installations
  ● Ask yourself, "Will 'X' scale efficiently to a 4-GPU machine?"
Outline 11
● VR at Valve
● Methods for Stereo Rendering
● Timing: Scheduling, Prediction, VSync, GPU Bubbles
● Specular Aliasing & Anisotropic Lighting
● Miscellaneous VR Rendering Topics

Stereo Rendering (Single-GPU) 12
● Brute-force run your CPU code twice (BAD)
● Use a geometry shader to amplify geometry (BAD)
● Resubmit command buffers (GOOD, our current solution)
● Use instancing to double the geometry (BETTER: half the API calls, improved cache coherency for VB/IB/texture reads)
  ● "High Performance Stereo Rendering For VR", Timothy Wilson, San Diego Virtual Reality Meetup

Stereo Rendering (Multi-GPU) 13
● AMD and NVIDIA both provide DX11 extensions to accelerate stereo rendering across multiple GPUs
● We have already tested the AMD implementation and it nearly doubles our framerate; we have yet to test the NVIDIA implementation but will soon
● Great for developers
  ● Everyone on your team can have a multi-GPU solution in their dev box
  ● This allows you to break framerate without uncomfortable low-framerate VR
  ● But lie to your team about framerate and report single-GPU fps :)
Outline 14
● VR at Valve
● Methods for Stereo Rendering
● Timing: Scheduling, Prediction, VSync, GPU Bubbles
● Specular Aliasing & Anisotropic Lighting
● Miscellaneous VR Rendering Topics
Prediction 15
● We aim to keep prediction times (render to photons) for the HMD and controller transforms as short as possible (accuracy is more important than total time)
● Low persistence global displays: the panel is lit for only ~2 ms of the 11.11 ms frame
NOTE: The image above is not optimal VR rendering, but it helps describe prediction (see later slides)

Pipelined Architectures 16
● Simulate the next frame while rendering the current frame
● We re-predict transforms and update our global cbuffer right before submit
● VR practically requires this due to prediction constraints
● You must conservatively cull on the CPU by about 5 degrees
Waiting for VSync 17
● Simplest VR implementation: predict right after VSync
● Pattern #1: Present(), clear the back buffer, read a pixel
● Pattern #2: Present(), clear the back buffer, spin on a query
● Great for an initial implementation, but please DO NOT DO THIS. GPUs are not designed for this.
● See John McDonald's talk:
  ● "Avoiding Catastrophic Performance Loss: Detecting CPU-GPU Sync Points", John McDonald, NVIDIA, GDC 2014

GPU Bubbles 18
● If you start submitting draw calls after VSync, the frame begins with a GPU bubble
● Ideally, your capture should look like this: (images are screen captures of NVIDIA Nsight)

"Running Start" 19
● If you start to submit D3D calls after VSync, you get a GPU bubble
● Instead, start submitting D3D calls 2 ms before VSync (2 ms is a magic number based on the 1.5-2.0 ms GPU bubbles we measured on current GPUs)
● But you end up predicting another 2 ms (24.22 ms total)

"Running Start" VSync 20
● Question: How do you know how far you are from VSync?
● Answer: It's tricky. Rendering APIs don't directly provide this.
● The SteamVR/OpenVR API on Windows spins, in a separate process, on calls to IDXGIOutput::WaitForVBlank(), notes the time, and increments a frame counter. The application can then call GetTimeSinceLastVSync(), which also returns a frame ID.
● GPU vendors, HMD devices, and rendering APIs should provide this
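Once some earlier vblank timestamp is known, deriving "time since the most recent vsync" and a frame ID is pure arithmetic. A minimal sketch of that bookkeeping; the function name echoes the OpenVR call mentioned above, but this is an illustrative stand-in, not the real API:

```python
FRAME_S = 1.0 / 90.0  # 11.11 ms per frame at 90 Hz

def time_since_last_vsync(now_s, vsync_epoch_s, frame_s=FRAME_S):
    """Return (seconds since the most recent vsync, frame ID), given the
    timestamp of some earlier observed vblank and the display frame period."""
    elapsed = now_s - vsync_epoch_s
    frame_id = int(elapsed // frame_s)        # whole frames since the observed vblank
    return elapsed - frame_id * frame_s, frame_id

# e.g. 25 ms after an observed vblank at 90 Hz, we are ~2.8 ms into frame 2
since, fid = time_since_last_vsync(0.025, 0.0)
```

The real implementation must keep refreshing the epoch (drift accumulates if the panel's true refresh rate differs slightly from nominal), which is why SteamVR keeps a process spinning on WaitForVBlank().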
"Running Start" Details 21
● To deal with a bad frame, you need to partially synchronize with the GPU
● We inject a query after clearing the back buffer, submit our entire frame, spin on that query, then call Present()
● This ensures we are on the correct side of VSync for the current frame, and we can now spin until our running start time

Why the Query Is Critical 22
● If a frame is late, the query will keep you on the right side of VSync for the following frame, ensuring your prediction remains accurate

Running Start Summary 23
● This is a solid 1.5-2.0 ms GPU perf gain!
● You want to see this in NVIDIA Nsight:
● You want to see this in Microsoft's GPUView:
Outline 24
● VR at Valve
● Methods for Stereo Rendering
● Timing: Scheduling, Prediction, VSync, GPU Bubbles
● Specular Aliasing & Anisotropic Lighting
● Miscellaneous VR Rendering Topics

Aliasing Is Your Enemy 25
● The camera (your head) never stops moving, which amplifies aliasing
● While there are more pixels to render, each pixel fills a larger angle than anything we've done before. Here are some averages:
  ● 2560x1600 30" monitor: ~50 pixels/degree (50 degree horizontal FOV)
  ● 720p 30" monitor: ~25 pixels/degree (50 degree horizontal FOV)
  ● VR: ~15.3 pixels/degree (110 degree FOV w/ 1.4x)
● We must increase the quality of our pixels
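These averages are just resolution divided by field of view. A quick check, assuming the VR figure divides the 1680-pixel render-target dimension by the ~110 degree FOV (the slide does not state which axis it uses):

```python
def pixels_per_degree(pixels, fov_degrees):
    """Angular resolution: pixels spanned per degree of field of view."""
    return pixels / fov_degrees

monitor_2560 = pixels_per_degree(2560, 50)   # ~51 px/deg, slide rounds to ~50
monitor_720p = pixels_per_degree(1280, 50)   # ~25.6 px/deg, slide rounds to ~25
vr = pixels_per_degree(1680, 110)            # ~15.3 px/deg
```

The roughly 3x drop in angular resolution versus a desktop monitor is why each VR pixel needs so much more care.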
4xMSAA Minimum Quality 26
● Forward renderers win for antialiasing because MSAA just works
● We use 8xMSAA if perf allows
● Image-space antialiasing algorithms must be compared side by side with 4xMSAA and 8xMSAA to know how your renderer will compare to others in the industry
● Jittered SSAA using the HLSL 'sample' modifier is obviously the best, but only if you can spare the perf

Normal Maps Are Not Dead 27
● Most normal maps work great in VR...mostly.
● What doesn't work:
  ● Feature detail larger than a few cm inside the tracked volume is bad
  ● Surface shape inside the tracked volume can't be in a normal map
● What does work:
  ● Distant objects outside the tracked volume that you can't inspect up close
  ● Surface "texture" and fine details
Normal Map Mipping Error 28
(Figure: Blinn-Phong specular. A zoomed-out normal map with box-filtered mips shows incorrect glossiness; a zoomed-out 36-sample super-sampled reference shows the expected glossiness.)

Normal Map Mipping Problems 29
● Any mip filter that just generates an averaged normal loses important roughness information
Normal Map Visualization 30
(Figure: 4096x4096 normal map, Fire Alarm: 4x4, 2x2, and 1x1 mip visualizations)

Normal Map Visualization 31
(Figure: 4096x4096 normal map, Fire Alarm: 8x8 and 16x16 mip visualizations)

Normal Map Visualization 32
(Figure: 4096x4096 normal map, Dota 2 Mirana body: 4x4, 2x2, and 1x1 mip visualizations)

Normal Map Visualization 33
(Figure: 4096x4096 normal map, Dota 2 Juggernaut sword handle: 4x4, 2x2, and 1x1 mip visualizations)

Normal Map Visualization 34
(Figure: 4096x4096 normal map, shoulder armor: 4x4, 2x2, and 1x1 mip visualizations)

Normal Map Visualization 35
(Figure: 4096x4096 normal map, metal siding: 4x4, 2x2, and 1x1 mip visualizations)
Roughness Encoded in Mips 36
● We can store a single isotropic value (visualized as the radius of a circle) that is the standard deviation of all 2D tangent normals from the highest mip that contributed to this texel
● We can also store a 2D anisotropic value (visualized as the dimensions of an ellipse) for the standard deviation in X and Y separately, which can be used to compute tangent-space axis-aligned anisotropic lighting!
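The per-texel statistic described above can be sketched in a few lines. This is an illustrative reading of the slide, not Valve's exact tool code: the anisotropic value is the per-axis standard deviation of the contributing top-mip tangent normals, and averaging the two axes into one isotropic value is one plausible reduction.

```python
import statistics

def texel_roughness(tangent_normals):
    """Roughness for one lower-mip texel from the 2D tangent-space normals
    (x, y pairs) of the top-mip texels that contribute to it.
    Returns (isotropic std dev, (std dev in X, std dev in Y))."""
    dev_x = statistics.pstdev([n[0] for n in tangent_normals])
    dev_y = statistics.pstdev([n[1] for n in tangent_normals])
    return (dev_x + dev_y) * 0.5, (dev_x, dev_y)
```

A flat region (identical normals) yields zero roughness; a region whose normals swing only along X yields a strongly anisotropic ellipse.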
Final Mip Chain 37
Add Artist-Authored Roughness 38
● We author 2D gloss = 1.0 - roughness
● Mip with a simple box filter
● Sum it with the normal map roughness at each mip level
● Because we have anisotropic gloss maps anyway, storing the generated normal map roughness is FREE
(Figure: isotropic gloss vs. anisotropic gloss)

Tangent-Space Axis-Aligned Anisotropic Lighting 39
● Standard isotropic lighting is represented along the diagonal
● Anisotropy is aligned with either of the tangent-space axes
● Requires only 2 additional values paired with a 2D tangent normal, so it fits into an RGBA texture (DXT5 >95% of the time)

Roughness to Exponent Conversion 40

    void RoughnessEllipseToScaleAndExp( float2 vRoughness,
        out float o_flDiffuseExponentOut, out float2 o_vSpecularExponentOut, out float2 o_vSpecularScaleOut )
    {
        o_flDiffuseExponentOut = ( ( 1.0 - ( vRoughness.x + vRoughness.y ) * 0.5 ) * 0.8 ) + 0.6; // Outputs 0.6-1.4
        o_vSpecularExponentOut.xy = exp2( pow( 1.0 - vRoughness.xy, 1.5 ) * 14.0 ); // Outputs 1-16384
        o_vSpecularScaleOut.xy = 1.0 - saturate( vRoughness.xy * 0.5 ); // This is a pseudo energy conserving scalar for the roughness exponent
    }

● Diffuse lighting is Lambert raised to an exponent (N.L^k) where k is in the range 0.6-1.4
● Experimented with anisotropic diffuse lighting, but not worth the instructions
● Specular exponent range is 1-16,384 and is a modified Blinn-Phong with anisotropy (more on this later)
How Anisotropy Is Computed 41
(Figure: tangent U lighting multiplied by tangent V lighting gives the final lighting)
Shader Code 42

Anisotropic Specular Lighting:

    float3 vHalfAngleDirWs = normalize( vPositionToLightDirWs.xyz + vPositionToCameraDirWs.xyz );
    float3 vSpecularNormalX = vHalfAngleDirWs.xyz - ( vTangentUWs.xyz * dot( vHalfAngleDirWs.xyz, vTangentUWs.xyz ) );
    float3 vSpecularNormalY = vHalfAngleDirWs.xyz - ( vTangentVWs.xyz * dot( vHalfAngleDirWs.xyz, vTangentVWs.xyz ) );

    float flNDotHX = max( 0.0, dot( vSpecularNormalX.xyz, vHalfAngleDirWs.xyz ) );
    float flNDotHkX = pow( flNDotHX, vSpecularExponent.x * 0.5 );
    flNDotHkX *= vSpecularScale.x;

    float flNDotHY = max( 0.0, dot( vSpecularNormalY.xyz, vHalfAngleDirWs.xyz ) );
    float flNDotHkY = pow( flNDotHY, vSpecularExponent.y * 0.5 );
    flNDotHkY *= vSpecularScale.y;

    float flSpecularTerm = flNDotHkX * flNDotHkY;

Isotropic Diffuse Lighting:

    float flDiffuseTerm = pow( flNDotL, flDiffuseExponent ) * ( ( flDiffuseExponent + 1.0 ) * 0.5 );

Isotropic Specular Lighting:

    float flNDotH = saturate( dot( vNormalWs.xyz, vHalfAngleDirWs.xyz ) );
    float flNDotHk = pow( flNDotH, dot( vSpecularExponent.xy, float2( 0.5, 0.5 ) ) );
    flNDotHk *= dot( vSpecularScale.xy, float2( 0.33333, 0.33333 ) ); // 0.33333 is to match the spec intensity of the aniso algorithm above
    float flSpecularTerm = flNDotHk;
Geometric Specular Aliasing 43
● Dense meshes without normal maps also alias, and roughness mips can't help you!
● We use partial derivatives of interpolated vertex normals to generate a geometric roughness term that approximates curvature. Here is the hacky math:

    float3 vNormalWsDdx = ddx( vGeometricNormalWs.xyz );
    float3 vNormalWsDdy = ddy( vGeometricNormalWs.xyz );
    float flGeometricRoughnessFactor = pow( saturate( max( dot( vNormalWsDdx.xyz, vNormalWsDdx.xyz ), dot( vNormalWsDdy.xyz, vNormalWsDdy.xyz ) ) ), 0.333 );
    vRoughness.xy = max( vRoughness.xy, flGeometricRoughnessFactor.xx ); // Ensure we don't double-count roughness if the normal map encodes geometric roughness

(Figure: visualization of flGeometricRoughnessFactor)
Geometric Specular Aliasing Part 2 44
● MSAA center vs. centroid interpolation: it's not perfect
● Normal interpolation can cause specular sparkling at silhouettes due to over-interpolated vertex normals
● Here's a trick we are using:
● Interpolate the normal twice: once with centroid, once without

    float3 vNormalWs : TEXCOORD0;
    centroid float3 vCentroidNormalWs : TEXCOORD1;

● In the pixel shader, choose the centroid normal if the squared length of the normal is 1.01 or greater

    if ( dot( i.vNormalWs.xyz, i.vNormalWs.xyz ) >= 1.01 )
    {
        i.vNormalWs.xyz = i.vCentroidNormalWs.xyz;
    }
Outline 45
● VR at Valve
● Methods for Stereo Rendering
● Timing: Scheduling, Prediction, VSync, GPU Bubbles
● Specular Aliasing & Anisotropic Lighting
● Miscellaneous VR Rendering Topics

Normal Map Encoding 46
● Projecting tangent normals onto the Z plane only uses 78.5% of the range of a 2D texel
● Hemi-octahedron encoding uses the full range of a 2D texel
● "A Survey of Efficient Representations for Independent Unit Vectors", Cigolle et al., Journal of Computer Graphics Techniques, Vol. 3, No. 2, 2014 (image modified from the above paper)
Scale Render Target Resolution 47
● It turns out 1.4x is just a recommendation for the HTC Vive (each HMD design has a different recommended scalar based on its optics and panels)
● On slower GPUs, scale the recommended render target scalar down
● On faster GPUs, scale the recommended render target scalar up
● If you've got GPU cycles to burn, BURN THEM

Anisotropic Texture Filtering 48
● Increases the perceived resolution of the panels (don't forget, we have fewer pixels per degree than traditional displays)
● Force this on for color and normal maps
  ● We use 8x by default
● Disable for everything else. Trilinear only, but measure perf; anisotropic filtering may be "free" if you are bottlenecked elsewhere.

Noise Is Your Friend 49
● Gradients are horrible in VR. Banding is more obvious than on LCD TVs.
● We add noise on the way into the framebuffer when we have floating-point precision in the pixel shader

    float3 ScreenSpaceDither( float2 vScreenPos )
    {
        // Iestyn's RGB dither (7 asm instructions) from Portal 2 X360, slightly modified for VR
        float3 vDither = dot( float2( 171.0, 231.0 ), vScreenPos.xy + g_flTime ).xxx;
        vDither.rgb = frac( vDither.rgb / float3( 103.0, 71.0, 97.0 ) ) - float3( 0.5, 0.5, 0.5 );
        return ( vDither.rgb / 255.0 ) * 0.375;
    }
```
Environment Maps 50
● The standard implementation at infinity only works for sky
● You need some type of distance remapping for environment maps
  ● A sphere proxy is cheap
  ● A box proxy is more expensive
  ● Both are useful in different situations
● Read this online article:
  ● "Image-based Lighting approaches and parallax-corrected cubemaps", Sébastien Lagarde, 2012

Stencil Mesh (Hidden Area Mesh) 51
● Stencil out the pixels you can't actually see through the lenses. GPUs are fast at early stencil rejection.
● Alternatively, you can render the mesh to the depth buffer at the near plane so everything early z-rejects instead
● Lenses produce radially symmetric distortion, which means you effectively see a circular area projected on the panels
Stencil Mesh (Warped View) 52
Stencil Mesh (Ideal Warped View) 53
Stencil Mesh (Wasted Pixels) 54
Stencil Mesh (Unwarped View) 55
Stencil Mesh (Unwarped View) 56
Stencil Mesh (Final Unwarped View) 57
Stencil Mesh (Final Warped View) 58
Stencil Mesh (Hidden Area Mesh) 59
● The SteamVR/OpenVR API will provide this mesh to you
● Results in a 17% fill rate reduction!
● No stencil mesh: VR 1512x1680x2 @ 90 Hz: 457 million pixels/sec
  ● 2,540,160 pixels per eye (5,080,320 pixels total)
● With stencil mesh: VR 1512x1680x2 @ 90 Hz: 378 million pixels/sec
  ● About 2,100,000 pixels per eye (4,200,000 pixels total)
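The 17% figure follows directly from the pixel counts on the slide:

```python
# Fill-rate savings from the hidden-area stencil mesh, using the slide's numbers.
per_eye_full = 1512 * 1680                 # 2,540,160 pixels per eye
per_eye_masked = 2_100_000                 # approximate visible pixels per eye

full_rate = per_eye_full * 2 * 90          # pixels/sec, both eyes at 90 Hz
masked_rate = per_eye_masked * 2 * 90

reduction = 1.0 - masked_rate / full_rate  # ~0.17, i.e. a 17% fill-rate cut
```

That takes the workload from 457 million to 378 million shaded pixels/sec, the reduced figure quoted earlier in the talk.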
Warp Mesh (Lens Distortion Mesh) 60
Warp Mesh (Brute-Force) 61
Warp Mesh (Cull UVs Outside 0-1) 62
Warp Mesh (Cull Stencil Mesh) 63
Warp Mesh (Shrink Wrap) 15% of pixels culled from the warp mesh 64
Performance Queries Required! 65
● You are always VSync'd
  ● Disabling VSync to see framerate will make you dizzy
● You need performance queries to report GPU workload
● The simplest implementation is to measure from the first to the last draw call
● Ideally, measure these things:
  ● Idle time from Present() to the first draw call
  ● First draw call to last draw call
  ● Idle time from the last draw call to Present()

Summary 66
● Stereo Rendering
● Prediction
● "Running Start" (saves 1.5-2.0 ms/frame)
● Anisotropic Lighting & Mipping Normal Maps
● Geometric Specular Antialiasing
● Stencil Mesh (saves 17% of pixels rendered)
● Optimized Warp Mesh (reduces cost by 15%)
● Etc.
Thank You! Alex Vlachos, Valve Alex@ValveSoftware.com 67

Comprehensive Analysis of OpenAI's Strategic Transformation at DevDay 2025.pdf
TrustArc Webinar - The Future of Third-Party Privacy Risk: Trends, Tactics & ...
TrustArc Webinar - The Future of Third-Party Privacy Risk: Trends, Tactics & ...
Session 1 - Agentic Automation Building the Enterprise Agent of Tomorrow
Session 1 - Agentic Automation Building the Enterprise Agent of Tomorrow
"How to run 200+ PHP services in production without losing your mind?", Yurii...
"How to run 200+ PHP services in production without losing your mind?", Yurii...
DOC-20250918-WA0002.pptx Understanding the World Around Us
DOC-20250918-WA0002.pptx Understanding the World Around Us
GDG Cloud Southlake #46: Ozan Unlu: AI at Scale in Observability and Security
GDG Cloud Southlake #46: Ozan Unlu: AI at Scale in Observability and Security
The Role of Human Experiences (HX) in GenAI Adoption
The Role of Human Experiences (HX) in GenAI Adoption
sap 2016年09月14日_PP_MRPType_VB_Blogged.pptx
sap 2016年09月14日_PP_MRPType_VB_Blogged.pptx
Session 2 - Specialized AI Associate Series: Orchestrator Deep-Dive and UiPa...
Session 2 - Specialized AI Associate Series: Orchestrator Deep-Dive and UiPa...

Alex_Vlachos_Advanced_VR_Rendering_GDC2015

  • 1.
    Advanced VR Rendering Alex Vlachos, Valve Alex@ValveSoftware.com
  • 2.
    Outline くろまる VR at Valve くろまる Methods for Stereo Rendering くろまる Timing: Scheduling, Prediction, VSync, GPU Bubbles くろまる Specular Aliasing & Anisotropic Lighting くろまる Miscellaneous VR Rendering Topics 2
  • 3.
    VR at Valve くろまる Began VR research 3+ years ago くろまる Both hardware and software engineers くろまる Custom optics designed for VR くろまる Display technology – low persistence, global display くろまる Tracking systems くろまる Fiducial-based positional tracking くろまる Desktop dot-based tracking and controllers くろまる Laser-tracked headset and controllers くろまる SteamVR API – Cross-platform, OpenVR 3
  • 4.
    HTC Vive Developer Edition Specs くろまる Refresh rate: 90 Hz (11.11 ms per frame) くろまる Low persistence, global display くろまる Framebuffer: 2160x1200 (1080x1200 per-eye) くろまる Off-screen rendering ~1.4x in each dimension: くろまる 1512x1680 per-eye = 2,540,160 shaded pixels per-eye (brute-force) くろまる FOV is about 110 degrees くろまる 360° room-scale tracking くろまる Multiple tracked controllers and other input devices 4
  • 5.
  • 6.
    Optics & Distortion (Pre-Warp) Warp pass uses 3 sets of UVs for RGB separately to account for spatial and chromatic distortion (Visualizing 1.4x render target scalar) 6
  • 7.
    Optics & Distortion (Post-Warp) Warp pass uses 3 sets of UVs for RGB separately to account for spatial and chromatic distortion (Visualizing 1.4x render target scalar) 7
  • 8.
    Shaded Visible Pixels per Second くろまる 720p @ 30 Hz: 27 million pixels/sec くろまる 1080p @ 60 Hz: 124 million pixels/sec くろまる 30" Monitor 2560x1600 @ 60 Hz: 245 million pixels/sec くろまる 4k Monitor 4096x2160 @ 30 Hz: 265 million pixels/sec くろまる VR 1512x1680x2 @ 90 Hz: 457 million pixels/sec くろまる We can reduce this to 378 million pixels/sec (later in the talk) くろまる Equivalent to 30" Monitor @ 100 Hz for a non-VR renderer 8
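The throughput figures above are just resolution × refresh rate (× two eyes for VR); a quick sanity check in plain Python using the slide's numbers (the function name is mine, for illustration):

```python
def shaded_pixels_per_sec(width, height, hz, eyes=1):
    """Shaded pixels per second = resolution x refresh rate (x eyes for VR)."""
    return width * height * hz * eyes

# Figures from the slide:
monitor_30in = shaded_pixels_per_sec(2560, 1600, 60)          # 245,760,000 (~245 M/sec)
vr_brute     = shaded_pixels_per_sec(1512, 1680, 90, eyes=2)  # 457,228,800 (~457 M/sec)
```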
  • 9.
    There Are No "Small" Effects くろまる Tracking allows users to get up close to anything in the tracked volume くろまる Can’t implement a super expensive effect and claim "it’s just this small little thing in the corner" くろまる Even your floors need to be higher fidelity than we have traditionally authored くろまる If it’s in your tracked volume, it must be high fidelity 9
  • 10.
    VR Rendering Goals 10 くろまる Lowest GPU min spec possible くろまる We want VR to succeed, but we need customers くろまる The lower the min spec, the more customers we have くろまる Aliasing should not be noticeable to customers くろまる Customers refer to aliasing as "sparkling" くろまる Algorithms should scale up to multi-GPU installations くろまる Ask yourself, "Will ‘X’ scale efficiently to a 4-GPU machine?"
  • 11.
    Outline くろまる VR at Valve くろまる Methods for Stereo Rendering くろまる Timing: Scheduling, Prediction, VSync, GPU Bubbles くろまる Specular Aliasing & Anisotropic Lighting くろまる Miscellaneous VR Rendering Topics 11
  • 12.
    Stereo Rendering (Single-GPU) くろまる Brute-force run your CPU code twice (BAD) くろまる Use geometry shader to amplify geometry (BAD) くろまる Resubmit command buffers (GOOD, our current solution) くろまる Use instancing to double geo (BETTER. Half the API calls, improved cache coherency for VB/IB/texture reads) くろまる "High Performance Stereo Rendering For VR", Timothy Wilson, San Diego Virtual Reality Meetup 12
  • 13.
    Stereo Rendering (Multi-GPU) くろまる AMD and NVIDIA both provide DX11 extensions to accelerate stereo rendering across multiple GPUs くろまる We have already tested the AMD implementation and it nearly doubles our framerate – have yet to test the NVIDIA implementation but will soon くろまる Great for developers くろまる Everyone on your team can have a multi-GPU solution in their dev box くろまる This allows you to break framerate without uncomfortable low-framerate VR くろまる But lie to your team about framerate and report single-GPU fps :) 13
  • 14.
    Outline くろまる VR at Valve くろまる Methods for Stereo Rendering くろまる Timing: Scheduling, Prediction, VSync, GPU Bubbles くろまる Specular Aliasing & Anisotropic Lighting くろまる Miscellaneous VR Rendering Topics 14
  • 15.
    Prediction くろまる We aim to keep prediction times (render to photons) for the HMD and controller transforms as short as possible (accuracy is more important than total time) くろまる Low persistence global displays: panel is lit for only ~2 ms of the 11.11 ms frame NOTE: Image above is not optimal VR rendering, but helps describe prediction (See later slides) 15
  • 16.
    Pipelined Architectures くろまる Simulating next frame while rendering the current frame くろまる We re-predict transforms and update our global cbuffer right before submit くろまる VR practically requires this due to prediction constraints くろまる You must conservatively cull on the CPU by about 5 degrees 16
  • 17.
    Waiting for VSync くろまる Simplest VR implementation, predict right after VSync くろまる Pattern #1: Present(), clear back buffer, read a pixel くろまる Pattern #2: Present(), clear back buffer, spin on a query くろまる Great for initial implementation, but please DO NOT DO THIS. GPUs are not designed for this. くろまる See John McDonald’s talk: くろまる "Avoiding Catastrophic Performance Loss: Detecting CPU-GPU Sync Points", John McDonald, NVIDIA, GDC 2014 17
  • 18.
    GPU Bubbles くろまる If you start submitting draw calls after VSync: くろまる Ideally, your capture should look like this: (Images are screen captures of NVIDIA Nsight) 18
  • 19.
    "Running Start" くろまる If you start to submit D3D calls after VSync: くろまる Instead, start submitting D3D calls 2 ms before VSync. (2 ms is a magic number based on the 1.5-2.0ms GPU bubbles we measured on current GPUs): くろまる But, you end up predicting another 2 ms (24.22 ms total) 19
  • 20.
    "Running Start" VSync 20 くろまる Question: How do you know how far you are from VSync? くろまる Answer: It’s tricky. Rendering APIs don’t directly provide this. くろまる On Windows, the SteamVR/OpenVR API runs a separate process that spins on IDXGIOutput::WaitForVBlank(), notes the time, and increments a frame counter. The application can then call GetTimeSinceLastVSync(), which also returns a frame ID. くろまる GPU vendors, HMD devices, and rendering APIs should provide this
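A toy model of the bookkeeping described above (hypothetical function names; the real implementation spins on IDXGIOutput::WaitForVBlank() in a separate process, while this sketch just assumes a fixed 90 Hz refresh period):

```python
FRAME_PERIOD = 1.0 / 90.0  # seconds per frame at 90 Hz

def time_since_last_vsync(now, last_observed_vblank):
    """Estimate how far into the current refresh interval we are, given the
    timestamp of any previously observed vblank and a fixed refresh period."""
    return (now - last_observed_vblank) % FRAME_PERIOD

# Example: 5 ms after a vblank that was observed two full frames ago
t = time_since_last_vsync(now=2 * FRAME_PERIOD + 0.005, last_observed_vblank=0.0)
```

The running-start point is then reached when `FRAME_PERIOD - t` drops below the ~2 ms magic number from the previous slide.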
  • 21.
    "Running Start" Details 21 くろまる To deal with a bad frame, you need to partially synchronize with the GPU くろまる We inject a query after clearing the back buffer, submit our entire frame, spin on that query, then call Present() くろまる This ensures we are on the correct side of VSync for the current frame, and we can now spin until our running start time
  • 22.
    Why the Query Is Critical 22 くろまる If a frame is late, the query will keep you on the right side of VSync for the following frame ensuring your prediction remains accurate
  • 23.
    Running Start Summary 23 くろまる This is a solid 1.5-2.0ms GPU perf gain! くろまる You want to see this in NVIDIA Nsight: くろまる You want to see this in Microsoft’s GPUView:
  • 24.
    Outline くろまる VR at Valve くろまる Methods for Stereo Rendering くろまる Timing: Scheduling, Prediction, VSync, GPU Bubbles くろまる Specular Aliasing & Anisotropic Lighting くろまる Miscellaneous VR Rendering Topics 24
  • 25.
    Aliasing Is Your Enemy くろまる The camera (your head) never stops moving. Aliasing is amplified because of this. くろまる While there are more pixels to render, each pixel fills a larger angle than anything we’ve done before. Here are some averages: くろまる 2560x1600 30" monitor: ~50 pixels/degree (50 degree H fov) くろまる 720p 30" monitor: ~25 pixels/degree (50 degree H fov) くろまる VR: ~15.3 pixels/degree (110 degree fov w/ 1.4x) くろまる We must increase the quality of our pixels 25
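The averages above are roughly panel pixels divided by field of view. A quick check (hedged: using the 1680-pixel dimension of the 1.4x-scaled eye buffer reproduces the slide's ~15.3 figure; the exact dimension/FOV pairing the slide used is not stated):

```python
def pixels_per_degree(pixels, fov_degrees):
    """Angular pixel density across a given field of view."""
    return pixels / fov_degrees

monitor = pixels_per_degree(2560, 50)   # ~51 pixels/degree
hd720   = pixels_per_degree(1280, 50)   # ~26 pixels/degree
vr      = pixels_per_degree(1680, 110)  # ~15.3 pixels/degree
```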
  • 26.
    4xMSAA Minimum Quality くろまる Forward renderers win for antialiasing because MSAA just works くろまる We use 8xMSAA if perf allows くろまる Image-space antialiasing algorithms must be compared side-by-side with 4xMSAA and 8xMSAA to know how your renderer will compare to others in the industry くろまる Jittered SSAA is obviously the best using the HLSL ‘sample’ modifier, but only if you can spare the perf 26
  • 27.
    Normal Maps Are Not Dead くろまる Most normal maps work great in VR...mostly. くろまる What doesn’t work: くろまる Feature detail larger than a few cm inside tracked volume is bad くろまる Surface shape inside a tracked volume can’t be in a normal map くろまる What does work: くろまる Distant objects outside the tracked volume you can’t inspect up close くろまる Surface "texture" and fine details: 27
  • 28.
    Normal Map Mipping Error 28 Blinn-Phong Specular Zoomed out normal map box filtered mips Zoomed out super-sampled 36 samples Expected glossiness Incorrect glossiness
  • 29.
    Normal Map Mipping Problems くろまる Any mip filter that just generates an averaged normal loses important roughness information 29
  • 30.
    Normal Map Visualization 30 4096x4096 Normal Map Fire Alarm 4x4 Mip Visualization 2x2 Mip 1x1 Mip
  • 31.
    Normal Map Visualization 31 4096x4096 Normal Map Fire Alarm 8x8 Mip Visualization 16x16 Mip Visualization
  • 32.
    Normal Map Visualization 32 4096x4096 Normal Map Dota 2 Mirana Body 4x4 Mip Visualization 2x2 Mip 1x1 Mip
  • 33.
    Normal Map Visualization 33 4096x4096 Normal Map Dota 2 Juggernaut Sword Handle 4x4 Mip Visualization 2x2 Mip 1x1 Mip
  • 34.
    Normal Map Visualization 34 4096x4096 Normal Map Shoulder Armor 4x4 Mip Visualization 2x2 Mip 1x1 Mip
  • 35.
    Normal Map Visualization 35 4096x4096 Normal Map Metal Siding 4x4 Mip Visualization 2x2 Mip 1x1 Mip
  • 36.
    Roughness Encoded in Mips くろまる We can store a single isotropic value (visualized as the radius of a circle) that is the standard deviation of all 2D tangent normals from the highest mip that contributed to this texel くろまる We can also store a 2D anisotropic value (visualized as the dimensions of an ellipse) for the standard deviation in X and Y separately that can be used to compute tangent-space axis-aligned anisotropic lighting! 36
  • 37.
  • 38.
    Add Artist-Authored Roughness くろまる We author 2D gloss = 1.0 – roughness くろまる Mip with a simple box filter くろまる Add/sum it with the normal map roughness at each mip level くろまる Because we have anisotropic gloss maps anyway, storing the generated normal map roughness is FREE 38 Isotropic Gloss Anisotropic Gloss
  • 39.
    Tangent-Space Axis-Aligned Anisotropic Lighting くろまる Standard isotropic lighting is represented along the diagonal くろまる Anisotropy is aligned with either of the tangent-space axes くろまる Requires only 2 additional values paired with a 2D tangent normal = Fits into an RGBA texture (DXT5 >95% of the time) 39
  • 40.
    Roughness to Exponent Conversion 40
        void RoughnessEllipseToScaleAndExp( float2 vRoughness, out float o_flDiffuseExponentOut, out float2 o_vSpecularExponentOut, out float2 o_vSpecularScaleOut )
        {
            o_flDiffuseExponentOut = ( ( 1.0 - ( vRoughness.x + vRoughness.y ) * 0.5 ) * 0.8 ) + 0.6; // Outputs 0.6-1.4
            o_vSpecularExponentOut.xy = exp2( pow( 1.0 - vRoughness.xy, 1.5 ) * 14.0 ); // Outputs 1-16384
            o_vSpecularScaleOut.xy = 1.0 - saturate( vRoughness.xy * 0.5 ); // This is a pseudo energy conserving scalar for the roughness exponent
        }
    くろまる Diffuse lighting is Lambert raised to an exponent, (N.L)^k, where k is in the range 0.6-1.4
    くろまる Experimented with anisotropic diffuse lighting, but not worth the instructions
    くろまる Specular exponent range is 1-16,384 and is a modified Blinn-Phong with anisotropy (more on this later)
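A small Python port of RoughnessEllipseToScaleAndExp (assumption: HLSL saturate clamps to [0,1]; the Python spelling is mine), useful for checking the commented output ranges:

```python
def roughness_ellipse_to_scale_and_exp(rx, ry):
    """Python port of the slide's RoughnessEllipseToScaleAndExp."""
    saturate = lambda v: min(max(v, 0.0), 1.0)
    diffuse_exp = ((1.0 - (rx + ry) * 0.5) * 0.8) + 0.6                    # 0.6-1.4
    spec_exp = tuple(2.0 ** ((1.0 - r) ** 1.5 * 14.0) for r in (rx, ry))   # 1-16384
    spec_scale = tuple(1.0 - saturate(r * 0.5) for r in (rx, ry))          # 0.5-1.0
    return diffuse_exp, spec_exp, spec_scale

smooth = roughness_ellipse_to_scale_and_exp(0.0, 0.0)  # specular exponent 2^14 = 16384
rough  = roughness_ellipse_to_scale_and_exp(1.0, 1.0)  # specular exponent 2^0  = 1
```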
  • 41.
    How Anisotropy Is Computed 41 Tangent U Lighting * = Tangent V Lighting Final Lighting * =
  • 42.
    Shader Code 42
    Anisotropic Specular Lighting:
        float3 vHalfAngleDirWs = normalize( vPositionToLightDirWs.xyz + vPositionToCameraDirWs.xyz );
        float3 vSpecularNormalX = vHalfAngleDirWs.xyz - ( vTangentUWs.xyz * dot( vHalfAngleDirWs.xyz, vTangentUWs.xyz ) );
        float3 vSpecularNormalY = vHalfAngleDirWs.xyz - ( vTangentVWs.xyz * dot( vHalfAngleDirWs.xyz, vTangentVWs.xyz ) );
        float flNDotHX = max( 0.0, dot( vSpecularNormalX.xyz, vHalfAngleDirWs.xyz ) );
        float flNDotHkX = pow( flNDotHX, vSpecularExponent.x * 0.5 );
        flNDotHkX *= vSpecularScale.x;
        float flNDotHY = max( 0.0, dot( vSpecularNormalY.xyz, vHalfAngleDirWs.xyz ) );
        float flNDotHkY = pow( flNDotHY, vSpecularExponent.y * 0.5 );
        flNDotHkY *= vSpecularScale.y;
        float flSpecularTerm = flNDotHkX * flNDotHkY;
    Isotropic Diffuse Lighting:
        float flDiffuseTerm = pow( flNDotL, flDiffuseExponent ) * ( ( flDiffuseExponent + 1.0 ) * 0.5 );
    Isotropic Specular Lighting:
        float flNDotH = saturate( dot( vNormalWs.xyz, vHalfAngleDirWs.xyz ) );
        float flNDotHk = pow( flNDotH, dot( vSpecularExponent.xy, float2( 0.5, 0.5 ) ) );
        flNDotHk *= dot( vSpecularScale.xy, float2( 0.33333, 0.33333 ) ); // 0.33333 is to match the spec intensity of the aniso algorithm above
        float flSpecularTerm = flNDotHk;
    Roughness to Exponent Conversion:
        void RoughnessEllipseToScaleAndExp( float2 vRoughness, out float o_flDiffuseExponentOut, out float2 o_vSpecularExponentOut, out float2 o_vSpecularScaleOut )
        {
            o_flDiffuseExponentOut = ( ( 1.0 - ( vRoughness.x + vRoughness.y ) * 0.5 ) * 0.8 ) + 0.6; // Outputs 0.6-1.4
            o_vSpecularExponentOut.xy = exp2( pow( 1.0 - vRoughness.xy, 1.5 ) * 14.0 ); // Outputs 1-16384
            o_vSpecularScaleOut.xy = 1.0 - saturate( vRoughness.xy * 0.5 ); // This is a pseudo energy conserving scalar for the roughness exponent
        }
  • 43.
    Geometric Specular Aliasing
    くろまる Dense meshes without normal maps also alias, and roughness mips can’t help you!
    くろまる We use partial derivatives of interpolated vertex normals to generate a geometric roughness term that approximates curvature. Here is the hacky math:
        float3 vNormalWsDdx = ddx( vGeometricNormalWs.xyz );
        float3 vNormalWsDdy = ddy( vGeometricNormalWs.xyz );
        float flGeometricRoughnessFactor = pow( saturate( max( dot( vNormalWsDdx.xyz, vNormalWsDdx.xyz ), dot( vNormalWsDdy.xyz, vNormalWsDdy.xyz ) ) ), 0.333 );
        vRoughness.xy = max( vRoughness.xy, flGeometricRoughnessFactor.xx ); // Ensure we don’t double-count roughness if normal map encodes geometric roughness
    Visualization of flGeometricRoughnessFactor 43
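ddx/ddy only exist in pixel shaders, but the same arithmetic can be sanity-checked on hand-made derivative vectors (a Python sketch; the input deltas are made up for illustration):

```python
def geometric_roughness_factor(ddx, ddy):
    """The slide's 'hacky math' on externally supplied normal derivatives:
    max squared derivative length, clamped to [0,1], raised to ~1/3."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    saturate = lambda v: min(max(v, 0.0), 1.0)
    return saturate(max(dot(ddx, ddx), dot(ddy, ddy))) ** 0.333

flat   = geometric_roughness_factor((0.0, 0.0, 0.0), (0.0, 0.0, 0.0))  # 0.0: no curvature
curved = geometric_roughness_factor((0.2, 0.0, 0.0), (0.0, 0.1, 0.0))  # normals changing fast

# The result is then applied as a floor: vRoughness.xy = max( vRoughness.xy, factor )
```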
  • 44.
    Geometric Specular Aliasing Part 2
    くろまる MSAA center vs centroid interpolation: It’s not perfect
    くろまる Normal interpolation can cause specular sparkling at silhouettes due to over-interpolated vertex normals
    くろまる Here’s a trick we are using:
    くろまる Interpolate the normal twice: once with centroid, once without
        float3 vNormalWs : TEXCOORD0;
        centroid float3 vCentroidNormalWs : TEXCOORD1;
    くろまる In the pixel shader, choose the centroid normal if normal length squared is greater than 1.01
        if ( dot( i.vNormalWs.xyz, i.vNormalWs.xyz ) >= 1.01 )
        {
            i.vNormalWs.xyz = i.vCentroidNormalWs.xyz;
        }
    44
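The centroid-fallback trick can be expressed as a tiny selection function (a sketch; the helper name and the tuple normals are illustrative, the 1.01 threshold is the slide's):

```python
def pick_normal(center_normal, centroid_normal, threshold=1.01):
    """Use the centroid-interpolated normal when the center-interpolated one
    is over-interpolated (length squared >= threshold)."""
    length_sq = sum(c * c for c in center_normal)
    return centroid_normal if length_sq >= threshold else center_normal

interior   = pick_normal((0.0, 0.0, 1.0), (0.6, 0.0, 0.8))  # unit length: keep center normal
silhouette = pick_normal((0.9, 0.0, 0.9), (0.6, 0.0, 0.8))  # len^2 = 1.62: fall back to centroid
```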
  • 45.
    Outline くろまる VR at Valve くろまる Methods for Stereo Rendering くろまる Timing: Scheduling, Prediction, VSync, GPU Bubbles くろまる Specular Aliasing & Anisotropic Lighting くろまる Miscellaneous VR Rendering Topics 45
  • 46.
    Normal Map Encoding くろまる Projecting tangent normals onto Z plane only uses 78.5% of the range of a 2D texel くろまる Hemi-octahedron encoding uses the full range of a 2D texel くろまる "A Survey of Efficient Representations for Independent Unit Vectors", Cigolle et al., Journal of Computer Graphics Techniques Vol. 3, No. 2, 2014 (Image modified from above paper) 46
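The 78.5% figure is the area of the unit disc of valid Z-plane-projected normals relative to the full [-1,1]² square a 2D texel can encode, i.e. π/4:

```python
import math

# Tangent normals projected onto the Z plane fall inside the unit disc,
# while a 2D texel can encode the whole [-1,1] x [-1,1] square:
used_fraction = math.pi / 4.0  # ~0.785, the slide's 78.5%
```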
  • 47.
    Scale Render Target Resolution くろまる Turns out, 1.4x is just a recommendation for the HTC Vive (Each HMD design has a different recommended scalar based on optics and panels) くろまる On slower GPUs, scale the recommended render target scalar down くろまる On faster GPUs, scale the recommended render target scalar up くろまる If you’ve got GPU cycles to burn, BURN THEM 47
  • 48.
    Anisotropic Texture Filtering くろまる Increases the perceived resolution of the panels (don’t forget, we only have ~15 pixels per degree) くろまる Force this on for color and normal maps くろまる We use 8x by default くろまる Disable for everything else. Trilinear only, but measure perf. Anisotropic filtering may be "free" if you are bottlenecked elsewhere. 48
  • 49.
    Noise Is Your Friend
    くろまる Gradients are horrible in VR. Banding is more obvious than on LCD TVs.
    くろまる We add noise on the way into the framebuffer when we have floating-point precision in the pixel shader
        float3 ScreenSpaceDither( float2 vScreenPos )
        {
            // Iestyn's RGB dither (7 asm instructions) from Portal 2 X360, slightly modified for VR
            float3 vDither = dot( float2( 171.0, 231.0 ), vScreenPos.xy + g_flTime ).xxx;
            vDither.rgb = frac( vDither.rgb / float3( 103.0, 71.0, 97.0 ) ) - float3( 0.5, 0.5, 0.5 );
            return ( vDither.rgb / 255.0 ) * 0.375;
        }
    49
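A Python port of ScreenSpaceDither (assumptions: HLSL frac(v) = v - floor(v); g_flTime passed as a parameter; the .xxx splat unrolled into per-channel divides), useful for confirming the noise amplitude never exceeds one 8-bit quantization step:

```python
import math

def screen_space_dither(x, y, time=0.0):
    """Per-pixel RGB dither noise; each channel lies in [-0.375/510, +0.375/510)."""
    frac = lambda v: v - math.floor(v)
    d = 171.0 * (x + time) + 231.0 * (y + time)  # dot( float2(171, 231), vScreenPos + t )
    rgb = [frac(d / m) - 0.5 for m in (103.0, 71.0, 97.0)]
    return [c / 255.0 * 0.375 for c in rgb]

sample = screen_space_dither(100.0, 200.0)
# Amplitude is at most 0.375 / (2 * 255), well under one 8-bit step (1/255)
```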
  • 50.
    Environment Maps くろまる Standard implementation at infinity = only works for sky くろまる Need to use some type of distance remapping for environment maps くろまる Sphere is cheap くろまる Box is more expensive くろまる Both are useful in different situations くろまる Read this online article: くろまる "Image-based Lighting approaches and parallax-corrected cubemaps", Sébastien Lagarde, 2012 50
  • 51.
    Stencil Mesh (Hidden Area Mesh) くろまる Stencil out the pixels you can’t actually see through the lenses. GPUs are fast at early stencil-rejection. くろまる Alternatively you can render to the depth buffer at near z so everything early z-rejects instead くろまる Lenses produce radially symmetric distortion which means you effectively see a circular area projected on the panels 51
  • 52.
  • 53.
    Stencil Mesh (Ideal Warped View) 53
  • 54.
  • 55.
  • 56.
  • 57.
    Stencil Mesh (Final Unwarped View) 57
  • 58.
    Stencil Mesh (Final Warped View) 58
  • 59.
    Stencil Mesh (Hidden Area Mesh) くろまる SteamVR/OpenVR API will provide this mesh to you くろまる Results in a 17% fill rate reduction! くろまる No stencil mesh: VR 1512x1680x2 @ 90Hz: 457 million pixels/sec くろまる 2,540,160 pixels per eye (5,080,320 pixels total) くろまる With stencil mesh: VR 1512x1680x2 @ 90Hz: 378 million pixels/sec くろまる About 2,100,000 pixels per eye (4,200,000 pixels total) 59
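The 17% figure follows from the pixel counts listed above (a quick arithmetic check in Python; 2,100,000 is the slide's approximate post-stencil count):

```python
full_per_eye   = 1512 * 1680   # 2,540,160 shaded pixels per eye, brute force
masked_per_eye = 2_100_000     # approximate pixels left after stencil-mesh rejection

reduction = 1.0 - masked_per_eye / full_per_eye  # ~0.17, the slide's 17%
pixels_per_sec = masked_per_eye * 2 * 90         # 378,000,000 = 378 million pixels/sec
```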
  • 60.
    Warp Mesh (Lens Distortion Mesh) 60
  • 61.
  • 62.
    Warp Mesh (Cull UV’s Outside 0-1) 62
  • 63.
    Warp Mesh (Cull Stencil Mesh) 63
  • 64.
    Warp Mesh (Shrink Wrap) 15% of pixels culled from the warp mesh 64
  • 65.
    Performance Queries Required! くろまる You are always VSync’d くろまる Disabling VSync to see framerate will make you dizzy くろまる Need to use performance queries to report GPU workload くろまる Simplest implementation is to measure first to last draw call くろまる Ideally measure these things: くろまる Idle time from Present() to first draw call くろまる First draw call to last draw call くろまる Idle time from last draw call to Present() 65
  • 66.
    Summary くろまる Stereo Rendering くろまる Prediction くろまる "Running Start" (Saves 1.5-2.0 ms/frame) くろまる Anisotropic Lighting & Mipping Normal Maps くろまる Geometric Specular Antialiasing くろまる Stencil Mesh (Saves 17% pixels rendered) くろまる Optimized Warp Mesh (Reduces cost by 15%) くろまる Etc. 66
  • 67.
    Thank You! Alex Vlachos, Valve Alex@ValveSoftware.com 67
