  1. Scalable Strand-Based Hair Rendering, by Erik S. V. Jansson

  2. Introduction
  (Image captions: The Witcher 3 (2015); Layered “Ruby” Mesh from [1]; e.g. Level of Detail Scheme [2]; Tomb Raider (2013); Deus Ex: Mankind Divided (2016))
  • Hair is an important component when rendering characters: humans and animals (i.e. fur) in games (real-time) and in movies (offline).
  • Simulating and rendering hair is computationally expensive: strands are self-shadowing, translucent, anisotropic and quite thin.
  • But the main issue is that there are just too many of them! It’s still quite common to approximate hair “cheaply” as a layered mesh.
  • Still, using a strand-based representation has advantages: 1) closer to reality, 2) high-quality close-up shots, 3) familiar to artists.
  • Problem: the number of characters we can render is limited. We need a scalable level-of-detail method that can handle animations.

  3. Introduction: Goals
  (Image captions: Alice: Madness Returns (2011); Assassin's Creed Odyssey (2018))
  • Design a “complete” strand-based hair rendering solution:
    Shading: how can we quickly model the reflection of light off a strand?
    Self-Shadowing: how should these strands receive or cast shadows?
    Transparency: how do we efficiently sort and blend transparent hair?
    Aliasing: what should we do with these jaggy, sub-pixel-sized strands?
  • Each individual solution must fit within real-time frame budgets, i.e. none of the merely “interactive” or “real-time” solutions you see in papers.
  • And it must be scalable and general enough to be useful:
    Scalable: should transition “gracefully” in the performance/quality domain.
    General: must be able to handle animation or simulation of the strands.
  Sounds reasonable? Let’s go!

  4. Hair Rendering

  5. Hair Rendering: Light Scattering
  • We’re using the vanilla Kajiya-Kay [3] strand lighting model. It models strands as thin cylinders to find the light scattering inside.
  • It’s the Phong reflection model equivalent for hair rendering. Instead of the surface normal n, it uses the tangent t along the strand.
  • It has both a diffuse and a specular term, just like Phong’s! It is not a physically-based model, but its use is still widespread in games.
  • Kajiya-Kay is cheap to compute and needs few parameters: the shader boils down to a few dot products, with a single pow and sqrt.
  • The tangent is easy to find if we are just using rasterization: just follow the strand segment from vertex to vertex and interpolate.
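  Below is a minimal sketch of the Kajiya-Kay terms in C++ with glm (the math library the renderer already uses). The parameter names and the exact term weighting are assumptions; only the diffuse/specular structure follows [3]:

    #include <glm/glm.hpp>
    #include <algorithm>
    #include <cmath>

    glm::vec3 kajiya_kay(const glm::vec3& diffuse,  // K_d, the diffuse color
                         const glm::vec3& specular, // K_s, the specular color
                         float shininess,           // p, the specular exponent
                         const glm::vec3& tangent,  // t, unit strand tangent
                         const glm::vec3& light,    // l, unit direction to light
                         const glm::vec3& eye) {    // e, unit direction to eye
        float cos_tl = glm::dot(tangent, light);
        float sin_tl = std::sqrt(std::max(0.0f, 1.0f - cos_tl * cos_tl));
        float cos_te = glm::dot(tangent, eye);
        float sin_te = std::sqrt(std::max(0.0f, 1.0f - cos_te * cos_te));
        // Diffuse: scales with sin(t, l), the "Lambert" term of a thin cylinder.
        glm::vec3 diffuse_term = diffuse * sin_tl;
        // Specular: cosine of the angle between the reflected cone and the eye.
        float cone = std::max(0.0f, cos_tl * cos_te + sin_tl * sin_te);
        return diffuse_term + specular * std::pow(cone, shininess);
    }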

  6. Light Scattering: Results

  7. Hair Rendering: Self-Shadowing
  • The naïve way would be to just use vanilla shadow mapping, but the hard shadows lead to unnatural-looking hair even after filtering.
  • A popular alternative is to use a Deep Opacity Map [4], but: it’s expensive! It needs multiple shadow maps per style and light source.
  • Instead we use the Approximated Deep Shadow Maps [5]: “how many strands n should be occluding the fragment?”, using one shadow map!
  • n = (z − z_f) / d, where z − z_f is how far away the fragment is from the 1st strand, and d is the (assumed constant) strand spacing.
  • Visibility is V = (1 − α)^n, where α is the opacity of the strands. Extra: sample the shadow map in small, fixed-size strides to get smooth shadows.
  • Problem: it assumes constant strand spacing (not general). In practice it still works OK though; Tomb Raider (2013) split the mesh.
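  A minimal sketch of this visibility estimate in C++; the parameter names are hypothetical, but the formula is the approximation from [5]:

    #include <algorithm>
    #include <cmath>

    float approximate_deep_shadow(float fragment_depth, // light-space depth of the shaded fragment
                                  float occluder_depth, // depth of the 1st strand, from the shadow map
                                  float strand_spacing, // d, the assumed constant strand spacing
                                  float strand_opacity) // alpha, the opacity of a single strand
    {
        // Estimate how many strands lie between the first occluder and us.
        float n = std::max(0.0f, (fragment_depth - occluder_depth) / strand_spacing);
        // Each strand lets (1 - alpha) of the light through: V = (1 - alpha)^n.
        return std::pow(1.0f - strand_opacity, n);
    }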

  8. Self-Shadowing: Results

  9. Self-Shadowing: Ambient Occlusion
  • Our self-shadowing only accounts for directional occlusion. We’d also like to find the ambient occlusion from neighboring strands!
  • For this we need to find the number of occluding hair strands.
  • Luckily, finding the strand density already gets us quite close to it. Observation: a raymarch gives us the amount of hair in the way.
  • We’ve tried two ways of finding the strand ambient occlusion: 1) ray march towards the faces and corners of the voxel, gathering up strands, 2) a Local Ambient Occlusion (LAO) [6] estimate by sphere projection.
  • Both solutions align locally with the ground-truth raytraced result, i.e. while the intensities might need some knobs, the shadows are there.
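  A minimal sketch of the raymarched variant (1) in C++; the exponential falloff, the step count and the sample_density helper are assumptions for illustration:

    #include <glm/glm.hpp>
    #include <cmath>

    float sample_density(const glm::vec3& position); // hypothetical volume lookup

    float strand_ambient_occlusion(const glm::vec3& origin, const glm::vec3& direction,
                                   float step_size, int steps, float absorption) {
        float hair_in_the_way = 0.0f; // accumulated strand density along the ray
        for (int i = 1; i <= steps; ++i)
            hair_in_the_way += sample_density(origin + direction * (step_size * i));
        // The more strands are in the way, the less ambient light gets through.
        return std::exp(-absorption * hair_in_the_way);
    }

  Averaging this over rays towards the faces and corners of the voxel gives the final ambient occlusion estimate.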

  10. Ambient Occlusion: Results

  11. Hair Rendering: Voxelization
  • So how can we find this strand density? By voxelizing lines! The general idea is to count how many lines have passed through each voxel.
  • We’ve tried two schemes, based on vertices and on segments:
    Vertex-based: project the vertices onto the grid and count the number of strands there.
    Segment-based: similar to [7], use a 3-D DDA rasterizer. Can you see why?
  • Both algorithms have their own advantages / disadvantages:
    Vertex-based: fast, but might miss voxels if the grid size or vertex spacing is high.
    Segment-based: doesn’t miss strands, but its cost scales with the grid resolution.
  • An efficient implementation is possible by using imageAtomicAdd. Memory requirements: a 256³ VK_FORMAT_R8 volume needs 16 MiB. A sketch of the vertex-based scheme follows below.
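  A minimal CPU-side sketch of the vertex-based scheme in C++; the real renderer does this in a compute shader with imageAtomicAdd, and the grid layout and names here are assumptions for illustration:

    #include <glm/glm.hpp>
    #include <atomic>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct DensityVolume {
        glm::vec3 origin;  // world-space minimum corner of the volume's AABB
        float voxel_size;  // world-space edge length of one (cubical) voxel
        int resolution;    // e.g. 256 for a 256^3 volume (16 MiB at R8)
        std::vector<std::atomic<std::uint8_t>> voxels; // resolution^3 counters

        void splat(const glm::vec3& vertex) {
            glm::ivec3 cell = glm::ivec3(glm::floor((vertex - origin) / voxel_size));
            if (glm::any(glm::lessThan(cell, glm::ivec3(0))) ||
                glm::any(glm::greaterThanEqual(cell, glm::ivec3(resolution))))
                return; // this vertex lies outside of the density volume
            std::size_t i = (cell.z * std::size_t(resolution) + cell.y) * resolution + cell.x;
            voxels[i].fetch_add(1); // the GPU equivalent is an imageAtomicAdd
        }
    };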

  12. Voxelization: Results
  Filtering the vertex-based solution reduces most gaps for 256³ volumes: 1) average the densities, 2) take the maximum density. In our case we use a 3³ kernel size, but this will change with the hair style / grid.
  (Image captions: Vertex-Based; Segment-Based)

  13. Hair Rendering: Results

  14. Hair Rendering: Level of Detail
  (Image caption: Return of the Obra Dinn (2018))
  • For the magnified case we use a line rasterization solution: it looks and performs acceptably in the “close-up” scenarios.
  • In “far away” scenarios our magnified technique breaks down: performance doesn’t scale well, and the results tend to get very “noisy”.
  • Using our density volume we can do direct volume rendering: raymarching scales directly with the number of fragments on the screen.
  • The minified case therefore uses strand volume rendering: good performance for “far away”, but it breaks down in “close-up” shots.
  • We haven’t implemented any automatic LoD transitions yet. We will probably just dither or blend the magnified and minified solutions, e.g. with a blend factor like in the sketch below.
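  A hypothetical sketch of such a transition in C++; since the deck says this is not implemented yet, the projected strand-width heuristic and its thresholds are purely illustrative:

    #include <algorithm>

    // Returns 0 for pure volume raymarching (minified), 1 for pure line
    // rasterization (magnified), and a dither/blend weight in between.
    float lod_blend_factor(float strand_width,      // world-space strand thickness
                           float distance_to_eye,   // camera-to-hair distance
                           float pixels_per_unit) { // projection/resolution term
        float projected = (strand_width / distance_to_eye) * pixels_per_unit;
        // Below ~0.1 pixels per strand, rasterized lines get noisy; above
        // ~1.0 pixels, the raymarched volume looks too blurry up close.
        return std::clamp((projected - 0.1f) / (1.0f - 0.1f), 0.0f, 1.0f);
    }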

  15. Level of Detail: Direct Volume Rendering
  • Create proxy geometry (e.g. an AABB) to house the volume. Rendering it will give you a “fake” surface point from which to find:
  • Surface: raymarch with constant steps until a density limit is reached. This gives you a point on the density isosurface used for shading!
  • Tangent: not the finite difference of the density (that’s the normal!). Instead: quantize and voxelize the tangents by imageAtomicAdd, or: try to find the direction of least change in the plane of the normal?
  • Shading: we need to estimate the same components as in the magnified case.
    Kajiya-Kay: trivial once we have the tangents, just do it like before.
    ADSM: raycast towards the light source to get the number of occluding hairs.
    LAO: the same calculations as in the magnified case (both use the volume).
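  A minimal sketch of the constant-step raymarch in C++; the step size, the density limit and the sample_density helper are assumptions for illustration:

    #include <glm/glm.hpp>
    #include <optional>

    float sample_density(const glm::vec3& position); // hypothetical volume lookup

    std::optional<glm::vec3> find_isosurface(glm::vec3 position, // “fake” surface point on the AABB
                                             const glm::vec3& direction,
                                             float step_size, int max_steps,
                                             float density_limit) {
        for (int i = 0; i < max_steps; ++i) {
            position += direction * step_size; // constant-sized steps
            if (sample_density(position) >= density_limit)
                return position; // a point on the density isosurface: shade it
        }
        return std::nullopt; // the ray left the volume without hitting hair
    }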

  16. Direct Volume Rendering: Results. Which one is rasterized, 1 or 2?

  17. Implementation
  • Written in C++17 from scratch using the Vulkan graphics API. Uses mostly header-only libraries: stb, jsoncpp, glm, imgui and tinyobj.
  • Optionally uses Embree for the raytracing, and glfw for surfaces. The renderer builds and runs well on both Windows and Linux (I’ve tested it).
  • Seems to work well on AMD, NVIDIA and Intel hardware too. Raymarching seems to run a bit better on AMD GPUs than on NVIDIA…
  • Wrote a Vulkan wrapper, vkpp, alongside the strand renderer. Will maybe release it as a separate thing. Not to be used in production! In hindsight: I should perhaps have handled buffer allocations with vma.
  • Hair styles are specified in an “extended” Cem Yuksel hair format.

  18. Implementation: Demo. Let’s not jinx it now…

  19. Implementation: Performance
  • System: Radeon RX Vega 64 and Ryzen Threadripper 1950X. Tested on Windows 10 @ 1920×1080 resolution without multisampling.
  • Method: Vulkan timestamp queries averaged over 60 frames.
  • Ponytail: 136,000 hair strands with 1,635,840 line segments.
    Rasterized (close): 3.10 ms (1.32 ms for shadow mapping and voxelization)
    Rasterized (far): 2.65 ms
    Raycasted (close): 10.5 ms (0.37 ms for voxelization, no shadow map cost)
    Raycasted (far): 0.98 ms
  • Bear: 961,280 strands of “fur” with 3,845,120 line segments.
    Rasterized (close): 4.70 ms (2.39 ms for shadow mapping and voxelization)
    Rasterized (far): 3.82 ms
    Raycasted (close): 10.8 ms (0.77 ms for voxelization, no shadow map cost)
    Raycasted (far): 1.69 ms
  • Note: that’s with ~7× more strands than in Tomb Raider (2013) [5]!

  20. Implementation: Let’s Enhance!

  21. Hair Rendering: Transparency and Anti-Aliasing
  • We haven’t been able to get around to this in the thesis just yet. However, we have a pretty good idea of what we’re going to do:
    Phone-Wire AA: fade each fragment in proportion to its pixel “coverage” [10].
    Per-Pixel Linked Lists: store a linked list of fragments per pixel, sort them by depth, and blend these fragments in the “correct” order, just as [9]. A sketch of the idea follows below.
  • These are all “safe bets”, as they have been proven to work well.
  • We also have a few ideas that we’d like to try (if we have time…):
    Screen-Space Hair Density: use the volume to find low strand-density areas, and adapt the number of fragments in the PPLL based on earlier observations.
    Weighted-Blended and Moment-Based OIT: both of these use weights for blending correctly. Can we use our volume to choose any “good” weights?
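  A minimal sketch of per-pixel linked list construction [9] in C++; real implementations run on the GPU (the atomic exchange below corresponds to an imageAtomicExchange on the head pointers), and the fixed-capacity node pool and names are assumptions for illustration:

    #include <atomic>
    #include <cstdint>
    #include <vector>

    constexpr std::uint32_t END_OF_LIST = ~0u; // heads start as END_OF_LIST

    struct FragmentNode {
        std::uint32_t color; // packed RGBA of the translucent strand fragment
        float depth;         // used later to sort each list back-to-front
        std::uint32_t next;  // index of the next node in this pixel's list
    };

    struct PerPixelLists {
        std::vector<std::atomic<std::uint32_t>> heads; // one list head per pixel
        std::vector<FragmentNode> nodes;               // shared node pool
        std::atomic<std::uint32_t> node_count { 0 };

        void insert(std::uint32_t pixel, std::uint32_t color, float depth) {
            std::uint32_t node = node_count.fetch_add(1);
            if (node >= nodes.size()) return; // pool exhausted: drop fragment
            nodes[node] = { color, depth, END_OF_LIST };
            // Atomically push the node onto the front of this pixel's list.
            nodes[node].next = heads[pixel].exchange(node);
        }
    };

  After all fragments are inserted, a per-pixel pass follows each list from its head, sorts the nodes by depth, and blends them in order.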

  22. Hair Rendering: Optimizations (Raycasting Optimizations in [8])
  • Right now we’re using a “brute-force” raymarching technique: doing constant-sized steps through the density volume to find an isosurface.
  • This accounts for most of the time spent in the volume rendering.
  • Some optimizations can be done to speed up the ray casting. Hierarchical Spatial Enumeration: make larger steps by using the LoD of the volume, as in the sketch below.
  • Another issue is that clearing the density volumes takes time. Workaround: do async work while we’re doing this, or reduce the dimensions.
  (Image captions: left, brute-force access patterns; right, optimized access patterns)
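  A hypothetical sketch of hierarchical spatial enumeration in C++: sample a coarser mip level of the density volume and leap over empty space with larger steps. The mip helper, the bounds guard and the level scheme are assumptions for illustration:

    #include <glm/glm.hpp>

    // Hypothetical helper: density from mip `level` of the volume, where
    // level 0 is the full-resolution grid and higher levels are coarser.
    float sample_density_lod(const glm::vec3& position, int level);

    glm::vec3 skip_empty_space(glm::vec3 position, const glm::vec3& direction,
                               float base_step, int coarsest_level, int max_steps) {
        for (int level = coarsest_level; level > 0; --level) {
            float step = base_step * float(1 << level); // one coarse cell per step
            // While the coarse cell ahead is empty, leap over it entirely.
            while (max_steps-- > 0 &&
                   sample_density_lod(position + direction * step, level) == 0.0f)
                position += direction * step;
        }
        return position; // continue marching from here with constant base_step
    }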

  23. Summary
  • We’ve developed a strand-based hair renderer that is:
    Scalable: uses a hybrid line-rasterizer and volume-raymarcher approach that gracefully switches between performance/quality,
    General: doesn’t pre-compute anything (i.e. it handles simulation),
    For: shading, self-shadowing, transparency, and anti-aliasing.
  • All of these run within real-time frame budgets, with time to spare. Remember: our evaluation data set is ~7× larger than you’d want.
  • Along the way, we’ve shown hair rendering methods to efficiently voxelize strands of hair into a volume by GPU compute, a way to approximate strand ambient occlusion with this volume, and how to do Kajiya-Kay in a volume, by voxelizing hair tangents.

  24. Thank You!
  • Matthäus Chajdas, Dominik Baumeister and Jason Lacroix,
  • Ingemar Ragnemalm and Harald Nautsch,
  • … and finally AMD GmbH for providing a seat in their office!
  • Source Code: https://github.com/CaffeineViking/vkhr (the report will later be up on https://eriksvjansson.net)
  • GitHub: @CaffeineViking • Mailing: eriksvjansson@gmail.com • Twitter: @CaffeineViking
  Questions?

  25. References
  [1] Scheuermann, Thorsten. “Hair Rendering and Shading”. GDC 2004 Presentation. (2004)
  [2] Stewart and Doyon. “Augmented Hair for Deus Ex Universe Projects: TressFX 3.0”. (2015)
  [3] Kajiya and Kay. “Rendering Fur with Three Dimensional Textures”. ACM SIGGRAPH. (1989)
  [4] Yuksel and Keyser. “Deep Opacity Maps”. Computer Graphics Forum (Vol. 27, No. 2). (2008)
  [5] Lacroix, Jason. “Survivor Reborn: Tomb Raider on DX11”. GDC 2013 Presentation. (2013)
  [6] Hernell et al. “Local Ambient Occlusion for Direct Volume Rendering”. IEEE TVCG. (2010)
  [7] Kanzler et al. “Voxel-Based Rendering Pipeline for Large 3D Line Sets”. IEEE TVCG. (2018)
  [8] Pawasauskas, John. “Volume Visualization With Ray Casting”. Worcester Polytechnic Institute. (1997)
  [9] Yang et al. “Real-Time Concurrent Linked List Construction on the GPU”. EG CGF. (2010)
  [10] Persson, Emil. “Phone-Wire Antialiasing”. GPU Pro 5 (Chapter 6). (2014)

  26. Appendices: Kajiya-Kay
