Raw DirectX 12

Alain Galvan
16 min read · Oct 31, 2021

DirectX 12 is the latest iteration of Microsoft’s proprietary computer graphics API used for Windows and Xbox platforms. DirectX 12, like new graphics APIs such as Vulkan, Apple Metal, or WebGPU, is focused on making a less complex driver and an API that’s closer to the architecture of modern GPUs. This means the creation of pipeline states directly, more control over when commands get executed with queues, and the ability to build these data structures with multiple threads.

DirectX focuses on real-time rendering, and is therefore designed for Game Developers and Computer Aided Design (CAD) Software Engineers. Because it is the industry standard among Computer Graphics APIs, you can expect nearly all compliant hardware to have robust support for it, making it the default choice for commercial projects.

From 3D/image authoring software like Marmoset Toolbag, Adobe Photoshop, and Autodesk Maya to commercial games such as Activision Blizzard’s Overwatch and Call of Duty, Epic’s Fortnite, and the vast majority of titles on Valve’s Steam, this graphics API is the most popular and ubiquitous one of them all (despite its platform exclusivity).

That’s not to say there aren’t uses for it outside of rendering: TensorFlow recently added support for a DirectX 12 backend for machine learning execution via DirectML, and GPGPU computations with the compute pipeline can handle physics simulations and much more.

DirectX 12 is currently supported on:

  • 🗔 Windows 7–11
  • ✖️ Xbox One — Xbox Series X/S

Support on Windows 7 is only partial, provided through Microsoft’s D3D12 On 7; full support starts with Windows 10.

While the official languages are C and C++, many other languages can use DirectX 12 through bindings.

I’ve prepared a GitHub repo with everything we need to get started. We’re going to walk through a Hello Triangle app in C++, a program that creates a triangle with the raster graphics pipeline and renders it onto the screen.

Setup

First install:

Then type the following in any terminal, such as VS Code’s Integrated Terminal.

# 🐑 Clone the repo
git clone https://github.com/alaingalvan/directx12-seed --recurse-submodules
# 💿 go inside the folder
cd directx12-seed
# 👯 If you forget to `recurse-submodules` you can always run:
git submodule update --init
# 👷 Make a ./build/ folder and place your project files inside:
# 🖼️ To build your Visual Studio solution on Windows x64
cmake . -B build -A x64
# 🍎 To build your XCode project on Mac OS
cmake . -B build -G Xcode
# 🐧 To build your .make file on Linux
cmake . -B build
# 🔨 Build on any platform:
cmake --build build

Overview

DirectX 12’s documentation recommends the use of ComPtr<T> as an alternative to std::shared_ptr<T>, with the benefit of better debugging and easier initialization of DirectX 12 data structures.
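To make the idea behind ComPtr<T> more concrete, here is a minimal, portable sketch of an owning pointer for objects that count their own references through AddRef() and Release(). It is not Microsoft's implementation: `MockResource`, `ScopedRef`, and `gLiveObjects` are hypothetical names invented for this illustration, and the mock object only imitates the reference counting that COM interfaces provide.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counter so the sketch can show when objects are destroyed.
int gLiveObjects = 0;

// Hypothetical stand-in for a COM object: it owns its reference count and
// deletes itself when the last reference is released.
struct MockResource
{
    MockResource() { ++gLiveObjects; }
    ~MockResource() { --gLiveObjects; }

    unsigned long AddRef() { return ++refs; }
    unsigned long Release()
    {
        const unsigned long remaining = --refs;
        if (remaining == 0)
        {
            delete this;
        }
        return remaining;
    }

    unsigned long refs = 1; // objects are handed out holding one reference
};

// Illustrative smart pointer: copies call AddRef(), destruction calls
// Release(), which is the core of what ComPtr<T> automates.
template <typename T>
class ScopedRef
{
  public:
    ScopedRef() = default;
    // Adopts the reference the raw pointer already holds.
    explicit ScopedRef(T* raw) : ptr(raw) {}
    ScopedRef(const ScopedRef& other) : ptr(other.ptr)
    {
        if (ptr)
        {
            ptr->AddRef();
        }
    }
    ScopedRef(ScopedRef&& other) noexcept : ptr(other.ptr)
    {
        other.ptr = nullptr;
    }
    // Copy-and-swap covers both copy and move assignment.
    ScopedRef& operator=(ScopedRef other) noexcept
    {
        std::swap(ptr, other.ptr);
        return *this;
    }
    ~ScopedRef()
    {
        if (ptr)
        {
            ptr->Release();
        }
    }

    T* get() const { return ptr; }
    T* operator->() const { return ptr; }

  private:
    T* ptr = nullptr;
};
```

Wrapping `new MockResource()` in a `ScopedRef` and letting it fall out of scope releases the object without an explicit Release() call, which is the convenience the rest of this post gives up by using raw pointers for readability.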

Regardless of whether or not you choose to use ComPtr<T>, the steps to rendering raster graphics with DirectX 12 are pretty similar to other modern graphics APIs:

  1. Initialize the API — Create your IDXGIFactory, IDXGIAdapter, ID3D12Device, ID3D12CommandQueue, ID3D12CommandAllocator, ID3D12GraphicsCommandList.
  2. Setup Frame Backings — Create your IDXGISwapChain, an ID3D12DescriptorHeap for your back buffers, your back buffer ID3D12Resource Render Target Views, and an ID3D12Fence to detect when a frame is finished rendering.
  3. Initialize Resources — Create your Triangle Data such as your ID3D12Resource Vertex Buffers, ID3D12Resource Index Buffer, and ID3D12Fences to detect when uploads to GPU memory are complete. Load your shader ID3DBlobs, create your constant buffer ID3D12Resources and their ID3D12DescriptorHeap, describe what resources will be accessible with an ID3D12RootSignature, and build your ID3D12PipelineState.
  4. Encode Commands — Write the pipeline commands you intend to execute to your ID3D12GraphicsCommandList, making sure to put ResourceBarriers where appropriate.
  5. Render — Update your GPU constant buffer data (Uniforms), submit commands to the ID3D12CommandQueue with ExecuteCommandLists, Present your swapchain, and await the next frame.
  6. Destroy — Destroy any data structures you’re done using with Release() or rely on ComPtr<T> deallocating for you.

The following will explain snippets that can be found in the GitHub repo, with certain parts omitted, and member variables (mMemberVariable) declared inline without the m prefix so their type is easier to see and the examples here can work on their own.

Window Creation

We’re using CrossWindow to handle cross platform window creation, so creating a Win32 window and updating it is very easy:

#include "CrossWindow/CrossWindow.h"
#include "Renderer.h"
#include <iostream>

void xmain(int argc, const char** argv)
{
    // 🖼️ Create Window
    xwin::WindowDesc wdesc;
    wdesc.title = "DirectX 12 Seed";
    wdesc.name = "MainWindow";
    wdesc.visible = true;
    wdesc.width = 640;
    wdesc.height = 640;
    wdesc.fullscreen = false;
    xwin::Window window;
    xwin::EventQueue eventQueue;
    if (!window.create(wdesc, eventQueue))
    {
        return;
    }
    // 🎨 Create a renderer
    Renderer renderer(window);
    // 🏁 Engine loop
    bool isRunning = true;
    while (isRunning)
    {
        bool shouldRender = true;
        // ♻️ Update the event queue
        eventQueue.update();
        // 🎈 Iterate through that queue:
        while (!eventQueue.empty())
        {
            // Update Events
            const xwin::Event& event = eventQueue.front();
            // 💗 On Resize:
            if (event.type == xwin::EventType::Resize)
            {
                const xwin::ResizeData data = event.data.resize;
                renderer.resize(data.width, data.height);
                shouldRender = false;
            }
            // ❌ On Close:
            if (event.type == xwin::EventType::Close)
            {
                window.close();
                shouldRender = false;
                isRunning = false;
            }
            eventQueue.pop();
        }
        // ✨ Update Visuals
        if (shouldRender)
        {
            renderer.render();
        }
    }
}

As an alternative to CrossWindow, you could use another library like GLFW, SFML, SDL, Qt, or just interface directly with the Win32 or UWP APIs.

Initialize API

Factory

Factories are the entry point to the DirectX 12 API, and will allow you to find adapters that you can use to execute DirectX 12 commands.

You can also create debug data structures such as Debug Controllers which can enable API usage validation.

// 👋 Declare DirectX 12 Handles
IDXGIFactory4* factory;
ID3D12Debug1* debugController;
// 🏭 Create Factory
UINT dxgiFactoryFlags = 0;
#if defined(_DEBUG)
// 🐛 Create a Debug Controller to track errors
ID3D12Debug* dc;
ThrowIfFailed(D3D12GetDebugInterface(IID_PPV_ARGS(&dc)));
ThrowIfFailed(dc->QueryInterface(IID_PPV_ARGS(&debugController)));
debugController->EnableDebugLayer();
debugController->SetEnableGPUBasedValidation(true);
dxgiFactoryFlags |= DXGI_CREATE_FACTORY_DEBUG;
dc->Release();
dc = nullptr;
#endif
HRESULT result = CreateDXGIFactory2(dxgiFactoryFlags, IID_PPV_ARGS(&factory));
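The snippets in this post call ThrowIfFailed without defining it. Microsoft's samples ship a helper along these lines; the version below is only a portable sketch of the idea. `HResult`, `failed`, and `throwIfFailed` are hypothetical stand-ins so the example compiles without the Windows headers, where HRESULT is a 32-bit signed value and the FAILED macro tests whether it is negative.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <string>

// Stand-in for HRESULT (assumption: a 32-bit signed integer, as on Windows).
using HResult = std::int32_t;

// Mirrors the FAILED() macro: negative values signal an error.
inline bool failed(HResult hr)
{
    return hr < 0;
}

// Converts a failing result code into a C++ exception so call sites can
// stay on one line instead of checking every return value by hand.
inline void throwIfFailed(HResult hr)
{
    if (failed(hr))
    {
        char message[32];
        std::snprintf(message, sizeof(message), "HRESULT 0x%08X",
                      static_cast<unsigned int>(hr));
        throw std::runtime_error(message);
    }
}
```

With a helper like this, a failed call surfaces as a `std::runtime_error` whose message carries the result code in hexadecimal, which is why the later snippets can wrap calls in try/catch blocks.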

Adapter

An Adapter provides information on the physical properties of a given DirectX device. You can query your current GPU, how much memory it has, etc.

// 👋 Declare Handles
IDXGIAdapter1* adapter;
// 🔌 Create Adapter
for (UINT adapterIndex = 0;
     DXGI_ERROR_NOT_FOUND != factory->EnumAdapters1(adapterIndex, &adapter);
     ++adapterIndex)
{
    DXGI_ADAPTER_DESC1 desc;
    adapter->GetDesc1(&desc);
    // ❌ Don't select the Basic Render Driver adapter.
    if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE)
    {
        // Release the skipped adapter so it doesn't leak
        adapter->Release();
        continue;
    }
    // ✔️ Check if the adapter supports Direct3D 12, and use that for the rest
    // of the application
    if (SUCCEEDED(D3D12CreateDevice(adapter, D3D_FEATURE_LEVEL_12_0,
                                    __uuidof(ID3D12Device), nullptr)))
    {
        break;
    }
    // ❌ Else we won't use this iteration's adapter, so release it
    adapter->Release();
}
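The selection logic in that loop can be looked at in isolation. The sketch below replays it over plain data so it runs anywhere; `AdapterInfo` and `pickAdapter` are hypothetical names for this illustration and not part of DXGI. Unlike the loop above, it also reports the case where no suitable adapter exists.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical summary of what the loop learns about each adapter.
struct AdapterInfo
{
    std::string name;
    bool isSoftware;    // corresponds to DXGI_ADAPTER_FLAG_SOFTWARE
    bool supportsD3D12; // corresponds to the D3D12CreateDevice probe
};

// Returns the index of the first hardware adapter that supports
// Direct3D 12, or -1 when none qualifies.
int pickAdapter(const std::vector<AdapterInfo>& adapters)
{
    for (std::size_t i = 0; i < adapters.size(); ++i)
    {
        if (adapters[i].isSoftware)
        {
            continue; // skip software rasterizers
        }
        if (adapters[i].supportsD3D12)
        {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Returning a sentinel makes the "nothing found" case explicit, so the caller can show an error instead of passing an invalid adapter to device creation.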

Device

A Device is your primary entry point to the DirectX 12 API, giving you access to the inner parts of the API. This is key to accessing important data structures and functions such as pipelines, shader blobs, render state, resource barriers, etc.

// 👋 Declare Handles
ID3D12Device* device;
// 💻 Create Device
ThrowIfFailed(D3D12CreateDevice(adapter, D3D_FEATURE_LEVEL_12_0,
                                IID_PPV_ARGS(&device)));

A Debug Device allows you to rely on DirectX 12’s debug mode. It can be difficult to keep track of data structures created with DirectX. With a debug device you’ll be able to detect leaked objects and verify that you’re creating and using the API correctly.

// 👋 Declare Handles
ID3D12DebugDevice* debugDevice;
#if defined(_DEBUG)
// 💻 Get debug device
ThrowIfFailed(device->QueryInterface(&debugDevice));
#endif

Command Queue

A Command Queue allows you to submit groups of draw calls, known as command lists, together to execute in order, thus allowing a GPU to stay busy and optimize its execution speed.

// 👋 Declare Handles
ID3D12CommandQueue* commandQueue;
// 📦 Create Command Queue
D3D12_COMMAND_QUEUE_DESC queueDesc = {};
queueDesc.Flags = D3D12_COMMAND_QUEUE_FLAG_NONE;
queueDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
ThrowIfFailed(device->CreateCommandQueue(&queueDesc,
IID_PPV_ARGS(&commandQueue)));

Command Allocator

A Command Allocator allows you to create command lists where you can define the functions you want the GPU to execute.

// 👋 Declare Handles
ID3D12CommandAllocator* commandAllocator;
// 🎅 Create Command Allocator
ThrowIfFailed(device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT,
IID_PPV_ARGS(&commandAllocator)));

Synchronization

DirectX 12 features a number of synchronization primitives that can help the driver know how resources will be used in the future, know when tasks have been completed by the GPU, etc.

A Fence lets your program know when certain tasks have been executed by the GPU, be it uploads to GPU exclusive memory, or when you’ve finished presenting to the screen.

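// 👋 Declare handles
UINT frameIndex;
HANDLE fenceEvent;
ID3D12Fence* fence;
UINT64 fenceValue;
// 🚧 Create fence
ThrowIfFailed(device->CreateFence(0, D3D12_FENCE_FLAG_NONE,
                                  IID_PPV_ARGS(&fence)));
// The fence starts at 0, so the first value we signal is 1
fenceValue = 1;
// 🔔 Create the event the CPU waits on until the fence is reached
fenceEvent = CreateEvent(nullptr, FALSE, FALSE, nullptr);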
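A fence can be thought of as a 64-bit counter that the GPU advances when it reaches a signal in its queue. The single-threaded model below is a sketch of that behaviour under simplifying assumptions; `FenceModel`, `QueueModel`, and `frameFinished` are hypothetical names, and real fences are advanced asynchronously by the GPU rather than by an explicit call.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>

// Hypothetical model: a fence is a counter that only moves forward.
struct FenceModel
{
    std::uint64_t completedValue = 0;
};

// Hypothetical model of a command queue's timeline.
class QueueModel
{
  private:
    struct PendingSignal
    {
        FenceModel* fence;
        std::uint64_t value;
    };
    std::queue<PendingSignal> pending;

  public:
    // Like a queue signal: the request is placed behind earlier work and
    // does not change the fence until the GPU gets there.
    void signal(FenceModel& fence, std::uint64_t value)
    {
        pending.push(PendingSignal{&fence, value});
    }

    // Pretends the GPU finished all work up to the next queued signal.
    // Returns false when nothing was pending.
    bool gpuAdvance()
    {
        if (pending.empty())
        {
            return false;
        }
        const PendingSignal next = pending.front();
        pending.pop();
        next.fence->completedValue = next.value;
        return true;
    }
};

// CPU-side check: has the GPU passed the value signalled for a frame?
inline bool frameFinished(const FenceModel& fence,
                          std::uint64_t frameFenceValue)
{
    return fence.completedValue >= frameFenceValue;
}
```

The point of the model is the gap between signalling and completion: right after `signal`, `frameFinished` is still false, and it only turns true once the simulated GPU has advanced. That gap is what the render loop waits on.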

A Barrier lets the driver know how a resource should be used in upcoming commands. This can be useful if say, you’re writing to a texture, and you want to copy that texture to another texture (such as the swapchain’s render attachment).

// 👋 Declare handles
ID3D12GraphicsCommandList* commandList;
ID3D12Resource* texResource; // the texture being transitioned
// 🔮 Create Barrier
D3D12_RESOURCE_BARRIER barrier = {};
barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
barrier.Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE;
barrier.Transition.pResource = texResource;
barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_COPY_SOURCE;
barrier.Transition.StateAfter = D3D12_RESOURCE_STATE_UNORDERED_ACCESS;
barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
commandList->ResourceBarrier(1, &barrier);
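With transition barriers the application has to state both the state a resource is currently in and the state it is moving to, so it is up to you to keep track of the current state. The sketch below shows one way to do that bookkeeping on the CPU. `ResourceState` and `TrackedResource` are hypothetical names, and the enum lists only a few illustrative states rather than the full set DirectX 12 defines.

```cpp
#include <cassert>

// A few illustrative states (assumption: a small subset for the example).
enum class ResourceState
{
    Present,
    RenderTarget,
    CopySource,
    UnorderedAccess
};

// Hypothetical helper that remembers the last state a resource was
// transitioned to, so a mismatched "before" state is caught early.
class TrackedResource
{
  public:
    explicit TrackedResource(ResourceState initial) : state(initial) {}

    // Accepts the transition only if `before` matches the tracked state.
    bool transition(ResourceState before, ResourceState after)
    {
        if (before != state)
        {
            return false; // the barrier would describe the wrong state
        }
        state = after;
        return true;
    }

    ResourceState current() const { return state; }

  private:
    ResourceState state;
};
```

A back buffer, for example, would go from `Present` to `RenderTarget` before drawing and back to `Present` afterwards; trying the first transition twice in a row is rejected because the tracked state no longer matches.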

Swapchain

Swapchains handle swapping and allocating back buffers to display what you’re rendering to a given window.

// 💾 Declare Data
unsigned width = 640;
unsigned height = 640;
// 👋 Declare Handles
static const UINT backbufferCount = 2;
UINT currentBuffer;
ID3D12DescriptorHeap* renderTargetViewHeap;
ID3D12Resource* renderTargets[backbufferCount];
UINT rtvDescriptorSize;
// ⛓️ Swapchain
IDXGISwapChain3* swapchain;
D3D12_VIEWPORT viewport;
D3D12_RECT surfaceSize;
surfaceSize.left = 0;
surfaceSize.top = 0;
surfaceSize.right = static_cast<LONG>(width);
surfaceSize.bottom = static_cast<LONG>(height);
viewport.TopLeftX = 0.0f;
viewport.TopLeftY = 0.0f;
viewport.Width = static_cast<float>(width);
viewport.Height = static_cast<float>(height);
// Viewport depth is a normalized range and must stay between 0 and 1
viewport.MinDepth = 0.0f;
viewport.MaxDepth = 1.0f;
if (swapchain != nullptr)
{
    // Resize the existing back buffers (every reference to the old
    // buffers must be released before this call)
    swapchain->ResizeBuffers(backbufferCount, width, height,
                             DXGI_FORMAT_R8G8B8A8_UNORM, 0);
}
else
{
    // ⛓️ Create swapchain
    DXGI_SWAP_CHAIN_DESC1 swapchainDesc = {};
    swapchainDesc.BufferCount = backbufferCount;
    swapchainDesc.Width = width;
    swapchainDesc.Height = height;
    swapchainDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
    swapchainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
    swapchainDesc.SwapEffect = DXGI_SWAP_EFFECT_FLIP_DISCARD;
    swapchainDesc.SampleDesc.Count = 1;
    IDXGISwapChain1* newSwapchain =
        xgfx::createSwapchain(window, factory, commandQueue, &swapchainDesc);
    // Ask the new swapchain for its IDXGISwapChain3 interface
    HRESULT swapchainSupport = newSwapchain->QueryInterface(
        __uuidof(IDXGISwapChain3), (void**)&swapchain);
    // QueryInterface added its own reference, so drop the original one
    // (assuming createSwapchain returned an owning pointer)
    newSwapchain->Release();
    ThrowIfFailed(swapchainSupport);
}
frameIndex = swapchain->GetCurrentBackBufferIndex();
// Describe and create a render target view (RTV) descriptor heap.
D3D12_DESCRIPTOR_HEAP_DESC rtvHeapDesc = {};
rtvHeapDesc.NumDescriptors = backbufferCount;
rtvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV;
rtvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE;
ThrowIfFailed(device->CreateDescriptorHeap(
&rtvHeapDesc, IID_PPV_ARGS(&renderTargetViewHeap)));
rtvDescriptorSize =
device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_RTV);
// 🎞️ Create frame resources
D3D12_CPU_DESCRIPTOR_HANDLE
    rtvHandle(renderTargetViewHeap->GetCPUDescriptorHandleForHeapStart());
// Create a RTV for each frame.
for (UINT n = 0; n < backbufferCount; n++)
{
    ThrowIfFailed(swapchain->GetBuffer(n, IID_PPV_ARGS(&renderTargets[n])));
    device->CreateRenderTargetView(renderTargets[n], nullptr, rtvHandle);
    rtvHandle.ptr += (1 * rtvDescriptorSize);
}
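The loop above walks the descriptor heap by adding the descriptor increment size to the handle's `ptr` field on each iteration. The same arithmetic can be written as a small function that jumps straight to a given index. This is a portable sketch: `CpuDescriptorHandle` and `offsetHandle` are hypothetical names, with the struct imitating the single `ptr` field of D3D12_CPU_DESCRIPTOR_HANDLE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for D3D12_CPU_DESCRIPTOR_HANDLE, which wraps one address field.
struct CpuDescriptorHandle
{
    std::size_t ptr;
};

// Returns the handle of the descriptor at `index`, given the handle of the
// first descriptor in the heap and the per-descriptor increment size that
// the device reports for that heap type.
inline CpuDescriptorHandle offsetHandle(CpuDescriptorHandle heapStart,
                                        std::uint32_t index,
                                        std::uint32_t incrementSize)
{
    return CpuDescriptorHandle{
        heapStart.ptr + static_cast<std::size_t>(index) * incrementSize};
}
```

The increment size differs between heap types and between GPUs, so it should always come from the device rather than being hard-coded.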

Initialize Resources

Root Signature

Root Signatures are objects that define what type of resources are accessible to your shaders, be it constant buffers, samplers, textures, structured buffers, etc.

// 👋 Declare Handles
ID3D12RootSignature* rootSignature;
// 🔎 Determine if we can get Root Signature Version 1.1:
D3D12_FEATURE_DATA_ROOT_SIGNATURE featureData = {};
featureData.HighestVersion = D3D_ROOT_SIGNATURE_VERSION_1_1;
if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_ROOT_SIGNATURE,
                                       &featureData, sizeof(featureData))))
{
    featureData.HighestVersion = D3D_ROOT_SIGNATURE_VERSION_1_0;
}
// 📂 Individual GPU Resources
D3D12_DESCRIPTOR_RANGE1 ranges[1];
ranges[0].BaseShaderRegister = 0;
ranges[0].RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_CBV;
ranges[0].NumDescriptors = 1;
ranges[0].RegisterSpace = 0;
ranges[0].OffsetInDescriptorsFromTableStart = 0;
ranges[0].Flags = D3D12_DESCRIPTOR_RANGE_FLAG_NONE;
//🗄️ Groups of GPU Resources
D3D12_ROOT_PARAMETER1 rootParameters[1];
rootParameters[0].ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;
rootParameters[0].ShaderVisibility = D3D12_SHADER_VISIBILITY_VERTEX;
rootParameters[0].DescriptorTable.NumDescriptorRanges = 1;
rootParameters[0].DescriptorTable.pDescriptorRanges = ranges;
// 🏢 Overall Layout
D3D12_VERSIONED_ROOT_SIGNATURE_DESC rootSignatureDesc;
rootSignatureDesc.Version = D3D_ROOT_SIGNATURE_VERSION_1_1;
rootSignatureDesc.Desc_1_1.Flags =
D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT;
rootSignatureDesc.Desc_1_1.NumParameters = 1;
rootSignatureDesc.Desc_1_1.pParameters = rootParameters;
rootSignatureDesc.Desc_1_1.NumStaticSamplers = 0;
rootSignatureDesc.Desc_1_1.pStaticSamplers = nullptr;
ID3DBlob* signature = nullptr;
ID3DBlob* error = nullptr;
try
{
    // 🌱 Create the root signature
    ThrowIfFailed(D3D12SerializeVersionedRootSignature(&rootSignatureDesc,
                                                       &signature, &error));
    ThrowIfFailed(device->CreateRootSignature(0, signature->GetBufferPointer(),
                                              signature->GetBufferSize(),
                                              IID_PPV_ARGS(&rootSignature)));
    rootSignature->SetName(L"Hello Triangle Root Signature");
}
catch (const std::exception&)
{
    // The error blob is only filled in when serialization fails
    if (error)
    {
        const char* errStr = (const char*)error->GetBufferPointer();
        std::cout << errStr;
        error->Release();
        error = nullptr;
    }
}
if (signature)
{
    signature->Release();
    signature = nullptr;
}

While these work well enough, using bindless resources is significantly easier. Matt Pettineo wrote a blog post about this.

Vertex Buffer

A Vertex Buffer stores the per vertex information available as attributes in your Vertex Shader. All buffers are ID3D12Resource objects in DirectX 12, be it Vertex Buffers, Index Buffers, Constant Buffers, etc.

// 💾 Declare Data
struct Vertex
{
float position[3];
float color[3];
};
Vertex vertexBufferData[3] = {{{1.0f, -1.0f, 0.0f}, {1.0f, 0.0f, 0.0f}},
                              {{-1.0f, -1.0f, 0.0f}, {0.0f, 1.0f, 0.0f}},
                              {{0.0f, 1.0f, 0.0f}, {0.0f, 0.0f, 1.0f}}};
// 👋 Declare Handles
ID3D12Resource* vertexBuffer;
D3D12_VERTEX_BUFFER_VIEW vertexBufferView;
const UINT vertexBufferSize = sizeof(vertexBufferData);
D3D12_HEAP_PROPERTIES heapProps;
heapProps.Type = D3D12_HEAP_TYPE_UPLOAD;
heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN;
heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN;
heapProps.CreationNodeMask = 1;
heapProps.VisibleNodeMask = 1;
D3D12_RESOURCE_DESC vertexBufferResourceDesc;
vertexBufferResourceDesc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
vertexBufferResourceDesc.Alignment = 0;
vertexBufferResourceDesc.Width = vertexBufferSize;
vertexBufferResourceDesc.Height = 1;
vertexBufferResourceDesc.DepthOrArraySize = 1;
vertexBufferResourceDesc.MipLevels = 1;
vertexBufferResourceDesc.Format = DXGI_FORMAT_UNKNOWN;
vertexBufferResourceDesc.SampleDesc.Count = 1;
vertexBufferResourceDesc.SampleDesc.Quality = 0;
vertexBufferResourceDesc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
vertexBufferResourceDesc.Flags = D3D12_RESOURCE_FLAG_NONE;
ThrowIfFailed(device->CreateCommittedResource(
&heapProps, D3D12_HEAP_FLAG_NONE, &vertexBufferResourceDesc,
D3D12_RESOURCE_STATE_GENERIC_READ, nullptr, IID_PPV_ARGS(&vertexBuffer)));
// 📄 Copy the triangle data to the vertex buffer.
UINT8* pVertexDataBegin;
// 🙈 We do not intend to read from this resource on the CPU.
D3D12_RANGE readRange;
readRange.Begin = 0;
readRange.End = 0;
ThrowIfFailed(vertexBuffer->Map(0, &readRange,
reinterpret_cast<void**>(&pVertexDataBegin)));
memcpy(pVertexDataBegin, vertexBufferData, sizeof(vertexBufferData));
vertexBuffer->Unmap(0, nullptr);
// 👀 Initialize the vertex buffer view.
vertexBufferView.BufferLocation = vertexBuffer->GetGPUVirtualAddress();
vertexBufferView.StrideInBytes = sizeof(Vertex);
vertexBufferView.SizeInBytes = vertexBufferSize;
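The stride in the vertex buffer view, and the byte offset of 12 that the input layout uses for COLOR in the Pipeline State section, both depend on how the compiler lays out `Vertex`. A few compile-time checks can guard those numbers. This is a sketch that assumes 4-byte floats; `vertexCount` is a hypothetical helper added for illustration.

```cpp
#include <cassert>
#include <cstddef>

struct Vertex
{
    float position[3];
    float color[3];
};

// The input layout hard-codes these byte offsets, so fail the build if the
// struct ever changes shape (assumes sizeof(float) == 4).
static_assert(offsetof(Vertex, position) == 0, "POSITION starts at byte 0");
static_assert(offsetof(Vertex, color) == 12, "COLOR starts at byte 12");
static_assert(sizeof(Vertex) == 24, "stride is 24 bytes per vertex");

// Number of whole vertices that fit in a buffer of the given size.
constexpr std::size_t vertexCount(std::size_t bufferSizeInBytes)
{
    return bufferSizeInBytes / sizeof(Vertex);
}
```

For the triangle above, three vertices of 24 bytes each give a 72 byte buffer, which is the value `sizeof(vertexBufferData)` produces.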

Index Buffer

An Index Buffer contains the individual indices of each triangle/line/point that you intend to draw.

// 💾 Declare Data
uint32_t indexBufferData[3] = {0, 1, 2};
// 👋 Declare Handles
ID3D12Resource* indexBuffer;
D3D12_INDEX_BUFFER_VIEW indexBufferView;
const UINT indexBufferSize = sizeof(indexBufferData);
D3D12_HEAP_PROPERTIES heapProps;
heapProps.Type = D3D12_HEAP_TYPE_UPLOAD;
heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN;
heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN;
heapProps.CreationNodeMask = 1;
heapProps.VisibleNodeMask = 1;
D3D12_RESOURCE_DESC vertexBufferResourceDesc;
vertexBufferResourceDesc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
vertexBufferResourceDesc.Alignment = 0;
vertexBufferResourceDesc.Width = indexBufferSize;
vertexBufferResourceDesc.Height = 1;
vertexBufferResourceDesc.DepthOrArraySize = 1;
vertexBufferResourceDesc.MipLevels = 1;
vertexBufferResourceDesc.Format = DXGI_FORMAT_UNKNOWN;
vertexBufferResourceDesc.SampleDesc.Count = 1;
vertexBufferResourceDesc.SampleDesc.Quality = 0;
vertexBufferResourceDesc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
vertexBufferResourceDesc.Flags = D3D12_RESOURCE_FLAG_NONE;
ThrowIfFailed(device->CreateCommittedResource(
&heapProps, D3D12_HEAP_FLAG_NONE, &vertexBufferResourceDesc,
D3D12_RESOURCE_STATE_GENERIC_READ, nullptr, IID_PPV_ARGS(&indexBuffer)));
// 📄 Copy data to DirectX 12 driver memory:
UINT8* pVertexDataBegin;
// 🙈 We do not intend to read from this resource on the CPU.
D3D12_RANGE readRange;
readRange.Begin = 0;
readRange.End = 0;
ThrowIfFailed(indexBuffer->Map(0, &readRange,
reinterpret_cast<void**>(&pVertexDataBegin)));
memcpy(pVertexDataBegin, indexBufferData, sizeof(indexBufferData));
indexBuffer->Unmap(0, nullptr);
// 👀 Initialize the index buffer view.
indexBufferView.BufferLocation = indexBuffer->GetGPUVirtualAddress();
indexBufferView.Format = DXGI_FORMAT_R32_UINT;
indexBufferView.SizeInBytes = indexBufferSize;

Constant Buffer

A Constant Buffer describes data that we’ll be sending to shader stages when drawing. Normally you would put Model View Projection Matrices or any specific variable data like colors, sliders, etc. here.

// 💾 Declare Data
struct
{
glm::mat4 projectionMatrix;
glm::mat4 modelMatrix;
glm::mat4 viewMatrix;
} cbVS;
// 👋 Declare Handles
ID3D12Resource* constantBuffer;
ID3D12DescriptorHeap* constantBufferHeap;
UINT8* mappedConstantBuffer;
// 🧊 Create the Constant Buffer
D3D12_HEAP_PROPERTIES heapProps;
heapProps.Type = D3D12_HEAP_TYPE_UPLOAD;
heapProps.CPUPageProperty = D3D12_CPU_PAGE_PROPERTY_UNKNOWN;
heapProps.MemoryPoolPreference = D3D12_MEMORY_POOL_UNKNOWN;
heapProps.CreationNodeMask = 1;
heapProps.VisibleNodeMask = 1;
D3D12_DESCRIPTOR_HEAP_DESC heapDesc = {};
heapDesc.NumDescriptors = 1;
heapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE;
heapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV;
ThrowIfFailed(device->CreateDescriptorHeap(&heapDesc,
IID_PPV_ARGS(&constantBufferHeap)));
D3D12_RESOURCE_DESC cbResourceDesc;
cbResourceDesc.Dimension = D3D12_RESOURCE_DIMENSION_BUFFER;
cbResourceDesc.Alignment = 0;
cbResourceDesc.Width = (sizeof(cbVS) + 255) & ~255;
cbResourceDesc.Height = 1;
cbResourceDesc.DepthOrArraySize = 1;
cbResourceDesc.MipLevels = 1;
cbResourceDesc.Format = DXGI_FORMAT_UNKNOWN;
cbResourceDesc.SampleDesc.Count = 1;
cbResourceDesc.SampleDesc.Quality = 0;
cbResourceDesc.Layout = D3D12_TEXTURE_LAYOUT_ROW_MAJOR;
cbResourceDesc.Flags = D3D12_RESOURCE_FLAG_NONE;
ThrowIfFailed(device->CreateCommittedResource(
&heapProps, D3D12_HEAP_FLAG_NONE, &cbResourceDesc,
D3D12_RESOURCE_STATE_GENERIC_READ, nullptr, IID_PPV_ARGS(&constantBuffer)));
constantBufferHeap->SetName(L"Constant Buffer Upload Resource Heap");
// 👓 Create our Constant Buffer View
D3D12_CONSTANT_BUFFER_VIEW_DESC cbvDesc = {};
cbvDesc.BufferLocation = constantBuffer->GetGPUVirtualAddress();
cbvDesc.SizeInBytes =
(sizeof(cbVS) + 255) & ~255; // CB size is required to be 256-byte aligned.
D3D12_CPU_DESCRIPTOR_HANDLE
cbvHandle(constantBufferHeap->GetCPUDescriptorHandleForHeapStart());
cbvHandle.ptr = cbvHandle.ptr + device->GetDescriptorHandleIncrementSize(
D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV) *
0;
device->CreateConstantBufferView(&cbvDesc, cbvHandle);
// 🙈 We do not intend to read from this resource on the CPU.
D3D12_RANGE readRange;
readRange.Begin = 0;
readRange.End = 0;
ThrowIfFailed(constantBuffer->Map(
    0, &readRange, reinterpret_cast<void**>(&mappedConstantBuffer)));
memcpy(mappedConstantBuffer, &cbVS, sizeof(cbVS));
// A null written range means the whole buffer may have been modified
constantBuffer->Unmap(0, nullptr);
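The expression `(sizeof(cbVS) + 255) & ~255` rounds the buffer size up to the next multiple of 256, which constant buffer views require. Written as a function, the rounding is easier to reuse and to check. `alignUp` is a hypothetical helper name, and the bit trick only works when the alignment is a power of two.

```cpp
#include <cassert>
#include <cstddef>

// Rounds `size` up to the next multiple of `alignment`.
// Precondition: `alignment` is a power of two (256 for constant buffers).
constexpr std::size_t alignUp(std::size_t size, std::size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

// Three 4x4 float matrices occupy 192 bytes, which rounds up to 256.
static_assert(alignUp(192, 256) == 256, "three matrices fit in one block");
```

Sizes that are already aligned stay unchanged, and anything one byte over spills into the next 256 byte block.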

Vertex Shader

Vertex Shaders execute per vertex, and are perfect for transforming a given object, performing per vertex animations with blend shapes, GPU skinning, etc.

// 👋 Declare handles
ID3DBlob* vertexShaderBlob = nullptr;
ID3DBlob* errors = nullptr;
#if defined(_DEBUG)
// 🐜 Enable better shader debugging with the graphics debugging tools.
UINT compileFlags = D3DCOMPILE_DEBUG | D3DCOMPILE_SKIP_OPTIMIZATION;
#else
UINT compileFlags = 0;
#endif
std::string path = "";
char pBuf[1024];
_getcwd(pBuf, 1024);
path = pBuf;
path += "\\";
std::wstring wpath = std::wstring(path.begin(), path.end());
std::wstring vertPath = wpath + L"assets/triangle.vert.hlsl";
try
{
    ThrowIfFailed(D3DCompileFromFile(vertPath.c_str(), nullptr, nullptr, "main",
                                     "vs_5_0", compileFlags, 0,
                                     &vertexShaderBlob, &errors));
}
catch (const std::exception&)
{
    // The errors blob is only filled in when the compiler reports a problem
    if (errors)
    {
        const char* errStr = (const char*)errors->GetBufferPointer();
        std::cout << errStr;
        errors->Release();
        errors = nullptr;
    }
}

Here’s the vertex shader:

cbuffer cb : register(b0)
{
    row_major float4x4 projectionMatrix : packoffset(c0);
    row_major float4x4 modelMatrix : packoffset(c4);
    row_major float4x4 viewMatrix : packoffset(c8);
};
struct VertexInput
{
    float3 inPos : POSITION;
    float3 inColor : COLOR;
};
struct VertexOutput
{
    float3 color : COLOR;
    float4 position : SV_Position;
};
VertexOutput main(VertexInput vertexInput)
{
    float3 inColor = vertexInput.inColor;
    float3 inPos = vertexInput.inPos;
    float3 outColor = inColor;
    float4 position = mul(float4(inPos, 1.0f), mul(modelMatrix, mul(viewMatrix, projectionMatrix)));
    VertexOutput output;
    output.position = position;
    output.color = outColor;
    return output;
}
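In HLSL, `mul(vector, matrix)` treats the vector as a row vector, so the shader above applies the model matrix first, then view, then projection. The CPU-side sketch below reproduces that order with tiny hand-written matrix helpers so it can be stepped through in a debugger. `Vec4`, `Mat4`, `mul`, `translation`, and `scale` are hypothetical helpers for this illustration, and a translation stands in for a real projection matrix to keep the numbers exact.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>; // indexed as m[row][column]

// Matrix times matrix.
Mat4 mul(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// Row vector times matrix, like HLSL's mul(vector, matrix).
Vec4 mul(const Vec4& v, const Mat4& m)
{
    Vec4 r{};
    for (int j = 0; j < 4; ++j)
        for (int k = 0; k < 4; ++k)
            r[j] += v[k] * m[k][j];
    return r;
}

// Row-vector convention: the translation lives in the last row.
Mat4 translation(float x, float y, float z)
{
    Mat4 m{};
    m[0][0] = m[1][1] = m[2][2] = m[3][3] = 1.0f;
    m[3][0] = x;
    m[3][1] = y;
    m[3][2] = z;
    return m;
}

// Uniform scale.
Mat4 scale(float s)
{
    Mat4 m{};
    m[0][0] = m[1][1] = m[2][2] = s;
    m[3][3] = 1.0f;
    return m;
}
```

Multiplying a point by the combined matrix `mul(model, mul(view, projection))` gives the same result as applying model, view, and projection one after another, which is why the shader can pre-combine them.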

Pixel Shader

Pixel Shaders execute once for each pixel of your output, including the other attachments that correspond to that pixel coordinate.

// 👋 Declare handles
ID3DBlob* pixelShaderBlob = nullptr;
ID3DBlob* errors = nullptr;
#if defined(_DEBUG)
// 🐜 Enable better shader debugging with the graphics debugging tools.
UINT compileFlags = D3DCOMPILE_DEBUG | D3DCOMPILE_SKIP_OPTIMIZATION;
#else
UINT compileFlags = 0;
#endif
std::string path = "";
char pBuf[1024];
_getcwd(pBuf, 1024);
path = pBuf;
path += "\\";
std::wstring wpath = std::wstring(path.begin(), path.end());
std::wstring fragPath = wpath + L"assets/triangle.frag.hlsl";
try
{
    ThrowIfFailed(D3DCompileFromFile(fragPath.c_str(), nullptr, nullptr, "main",
                                     "ps_5_0", compileFlags, 0,
                                     &pixelShaderBlob, &errors));
}
catch (const std::exception&)
{
    // The errors blob is only filled in when the compiler reports a problem
    if (errors)
    {
        const char* errStr = (const char*)errors->GetBufferPointer();
        std::cout << errStr;
        errors->Release();
        errors = nullptr;
    }
}

And here’s the pixel shader:

struct PixelInput
{
float3 color : COLOR;
};
struct PixelOutput
{
float4 attachment0 : SV_Target0;
};
PixelOutput main(PixelInput pixelInput)
{
float3 inColor = pixelInput.color;
PixelOutput output;
output.attachment0 = float4(inColor, 1.0f);
return output;
}

Pipeline State

The Pipeline State describes everything necessary to execute a given raster based draw call.

// 👋 Declare handles
ID3D12PipelineState* pipelineState;
// ⚗️ Define the Graphics Pipeline
D3D12_GRAPHICS_PIPELINE_STATE_DESC psoDesc = {};
// 🔣 Input Assembly
D3D12_INPUT_ELEMENT_DESC inputElementDescs[] = {
{"POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0,
D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0},
{"COLOR", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 12,
D3D12_INPUT_CLASSIFICATION_PER_VERTEX_DATA, 0}};
psoDesc.InputLayout = {inputElementDescs, _countof(inputElementDescs)};
// 🦄 Resources
psoDesc.pRootSignature = rootSignature;
// 🔺 Vertex Shader
D3D12_SHADER_BYTECODE vsBytecode;
vsBytecode.pShaderBytecode = vertexShaderBlob->GetBufferPointer();
vsBytecode.BytecodeLength = vertexShaderBlob->GetBufferSize();
psoDesc.VS = vsBytecode;
// 🖌️ Pixel Shader
D3D12_SHADER_BYTECODE psBytecode;
psBytecode.pShaderBytecode = pixelShaderBlob->GetBufferPointer();
psBytecode.BytecodeLength = pixelShaderBlob->GetBufferSize();
psoDesc.PS = psBytecode;
// 🟨 Rasterization
D3D12_RASTERIZER_DESC rasterDesc;
rasterDesc.FillMode = D3D12_FILL_MODE_SOLID;
rasterDesc.CullMode = D3D12_CULL_MODE_NONE;
rasterDesc.FrontCounterClockwise = FALSE;
rasterDesc.DepthBias = D3D12_DEFAULT_DEPTH_BIAS;
rasterDesc.DepthBiasClamp = D3D12_DEFAULT_DEPTH_BIAS_CLAMP;
rasterDesc.SlopeScaledDepthBias = D3D12_DEFAULT_SLOPE_SCALED_DEPTH_BIAS;
rasterDesc.DepthClipEnable = TRUE;
rasterDesc.MultisampleEnable = FALSE;
rasterDesc.AntialiasedLineEnable = FALSE;
rasterDesc.ForcedSampleCount = 0;
rasterDesc.ConservativeRaster = D3D12_CONSERVATIVE_RASTERIZATION_MODE_OFF;
psoDesc.RasterizerState = rasterDesc;
psoDesc.PrimitiveTopologyType = D3D12_PRIMITIVE_TOPOLOGY_TYPE_TRIANGLE;
// 🌀 Color/Blend
D3D12_BLEND_DESC blendDesc;
blendDesc.AlphaToCoverageEnable = FALSE;
blendDesc.IndependentBlendEnable = FALSE;
const D3D12_RENDER_TARGET_BLEND_DESC defaultRenderTargetBlendDesc = {
    FALSE,
    FALSE,
    D3D12_BLEND_ONE,
    D3D12_BLEND_ZERO,
    D3D12_BLEND_OP_ADD,
    D3D12_BLEND_ONE,
    D3D12_BLEND_ZERO,
    D3D12_BLEND_OP_ADD,
    D3D12_LOGIC_OP_NOOP,
    D3D12_COLOR_WRITE_ENABLE_ALL,
};
for (UINT i = 0; i < D3D12_SIMULTANEOUS_RENDER_TARGET_COUNT; ++i)
{
    blendDesc.RenderTarget[i] = defaultRenderTargetBlendDesc;
}
psoDesc.BlendState = blendDesc;
// 🌑 Depth/Stencil State
psoDesc.DepthStencilState.DepthEnable = FALSE;
psoDesc.DepthStencilState.StencilEnable = FALSE;
psoDesc.SampleMask = UINT_MAX;
// 🖼️ Output
psoDesc.NumRenderTargets = 1;
psoDesc.RTVFormats[0] = DXGI_FORMAT_R8G8B8A8_UNORM;
psoDesc.SampleDesc.Count = 1;
// 🌟 Create the raster pipeline state
try
{
    ThrowIfFailed(device->CreateGraphicsPipelineState(
        &psoDesc, IID_PPV_ARGS(&pipelineState)));
}
catch (const std::exception&)
{
    std::cout << "Failed to create Graphics Pipeline!";
}
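The blend description above disables blending and sets the factors to ONE for the source and ZERO for the destination with an ADD operation, so the pixel shader's output simply replaces what was in the render target. The sketch below evaluates the additive blend equation for a single channel to show why. `BlendFactor`, `factorValue`, and `blendAdd` are hypothetical names, and only four of the available factors are modelled.

```cpp
#include <cassert>

// A few illustrative blend factors (assumption: a subset for the example).
enum class BlendFactor
{
    Zero,
    One,
    SrcAlpha,
    InvSrcAlpha
};

// Resolves a factor to the number it contributes to the equation.
inline float factorValue(BlendFactor factor, float srcAlpha)
{
    switch (factor)
    {
    case BlendFactor::Zero:
        return 0.0f;
    case BlendFactor::One:
        return 1.0f;
    case BlendFactor::SrcAlpha:
        return srcAlpha;
    case BlendFactor::InvSrcAlpha:
        return 1.0f - srcAlpha;
    }
    return 0.0f;
}

// Additive blend for one channel: src * srcFactor + dst * dstFactor.
inline float blendAdd(float src, float dst, float srcAlpha,
                      BlendFactor srcFactor, BlendFactor dstFactor)
{
    return src * factorValue(srcFactor, srcAlpha) +
           dst * factorValue(dstFactor, srcAlpha);
}
```

With ONE and ZERO the destination term vanishes, which matches the opaque triangle in this post. Swapping in `SrcAlpha` and `InvSrcAlpha` gives conventional alpha blending instead.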

Encoding Commands

In order to execute draw calls, you’re going to need a place to write commands. A Command List is an object that can encode a number of commands to be executed by the GPU, be it configuring barriers, setting root signatures, etc.

// 👋 Declare handles
ID3D12CommandAllocator* commandAllocator;
ID3D12PipelineState* initialPipelineState;
ID3D12GraphicsCommandList* commandList;
// 📃 Create the command list.
ThrowIfFailed(device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT,
commandAllocator, initialPipelineState,
IID_PPV_ARGS(&commandList)));

Later, encode those commands and submit them:

// 🚿 Reset the command list and add new commands.
ThrowIfFailed(commandAllocator->Reset());
// 🖌️ Begin using the Raster Graphics Pipeline
ThrowIfFailed(commandList->Reset(commandAllocator, pipelineState));
// 🔳 Setup Resources
commandList->SetGraphicsRootSignature(rootSignature);
ID3D12DescriptorHeap* pDescriptorHeaps[] = {constantBufferHeap};
commandList->SetDescriptorHeaps(_countof(pDescriptorHeaps), pDescriptorHeaps);
D3D12_GPU_DESCRIPTOR_HANDLE
cbvHandle(constantBufferHeap->GetGPUDescriptorHandleForHeapStart());
commandList->SetGraphicsRootDescriptorTable(0, cbvHandle);
// 🖼️ Indicate that the back buffer will be used as a render target.
D3D12_RESOURCE_BARRIER renderTargetBarrier;
renderTargetBarrier.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
renderTargetBarrier.Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE;
renderTargetBarrier.Transition.pResource = renderTargets[frameIndex];
renderTargetBarrier.Transition.StateBefore = D3D12_RESOURCE_STATE_PRESENT;
renderTargetBarrier.Transition.StateAfter = D3D12_RESOURCE_STATE_RENDER_TARGET;
renderTargetBarrier.Transition.Subresource =
D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
commandList->ResourceBarrier(1, &renderTargetBarrier);
D3D12_CPU_DESCRIPTOR_HANDLE
    rtvHandle(renderTargetViewHeap->GetCPUDescriptorHandleForHeapStart());
rtvHandle.ptr = rtvHandle.ptr + (frameIndex * rtvDescriptorSize);
commandList->OMSetRenderTargets(1, &rtvHandle, FALSE, nullptr);
// 🎥 Record raster commands.
const float clearColor[] = {0.2f, 0.2f, 0.2f, 1.0f};
commandList->RSSetViewports(1, &viewport);
commandList->RSSetScissorRects(1, &surfaceSize);
commandList->ClearRenderTargetView(rtvHandle, clearColor, 0, nullptr);
commandList->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
commandList->IASetVertexBuffers(0, 1, &vertexBufferView);
commandList->IASetIndexBuffer(&indexBufferView);
commandList->DrawIndexedInstanced(3, 1, 0, 0, 0);
// 🖼️ Indicate that the back buffer will now be used to present.
D3D12_RESOURCE_BARRIER presentBarrier;
presentBarrier.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
presentBarrier.Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE;
presentBarrier.Transition.pResource = renderTargets[frameIndex];
presentBarrier.Transition.StateBefore = D3D12_RESOURCE_STATE_RENDER_TARGET;
presentBarrier.Transition.StateAfter = D3D12_RESOURCE_STATE_PRESENT;
presentBarrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
commandList->ResourceBarrier(1, &presentBarrier);
ThrowIfFailed(commandList->Close());

Render

Rendering in DirectX 12 is a simple matter of changing any constant buffer data you intend to update, submitting your command lists to be executed, presenting the swapchain so your Win32 or UWP window updates, and signaling your application that you’ve finished rendering.

// 👋 Declare handles
std::chrono::time_point<std::chrono::steady_clock> tStart, tEnd;
float elapsedTime = 0.0f;
void render()
{
    // ⌚ Frame limit set to 60 fps
    tEnd = std::chrono::steady_clock::now();
    float time =
        std::chrono::duration<float, std::milli>(tEnd - tStart).count();
    if (time < (1000.0f / 60.0f))
    {
        return;
    }
    tStart = std::chrono::steady_clock::now();
    // 🎛️ Update Uniforms
    elapsedTime += 0.001f * time;
    elapsedTime = fmodf(elapsedTime, 6.283185307179586f);
    // Rotate around the Y axis; glm::rotate matches the glm::mat4 in cbVS
    cbVS.modelMatrix = glm::rotate(glm::mat4(1.0f), elapsedTime,
                                   glm::vec3(0.0f, 1.0f, 0.0f));
    D3D12_RANGE readRange;
    readRange.Begin = 0;
    readRange.End = 0;
    ThrowIfFailed(constantBuffer->Map(
        0, &readRange, reinterpret_cast<void**>(&mappedConstantBuffer)));
    memcpy(mappedConstantBuffer, &cbVS, sizeof(cbVS));
    // A null written range means the whole buffer may have been modified
    constantBuffer->Unmap(0, nullptr);
    setupCommands();
    ID3D12CommandList* ppCommandLists[] = {commandList};
    commandQueue->ExecuteCommandLists(_countof(ppCommandLists), ppCommandLists);
    // 🎥 Present, then wait till finished to continue execution
    swapchain->Present(1, 0);
    // Use a separate name so the ID3D12Fence* `fence` is not shadowed
    const UINT64 currentFenceValue = fenceValue;
    ThrowIfFailed(commandQueue->Signal(fence, currentFenceValue));
    fenceValue++;
    if (fence->GetCompletedValue() < currentFenceValue)
    {
        ThrowIfFailed(
            fence->SetEventOnCompletion(currentFenceValue, fenceEvent));
        WaitForSingleObject(fenceEvent, INFINITE);
    }
    frameIndex = swapchain->GetCurrentBackBufferIndex();
}
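The frame limit at the top of render() can be pulled out into a small reusable class. The sketch below is one possible shape for it, not the code from the repo. `FrameLimiter` and `shouldRender` are hypothetical names, and the current time is passed in as a parameter so the logic can be exercised without waiting on a real clock.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper that accepts a frame only when enough time has
// passed since the previously accepted frame.
class FrameLimiter
{
  public:
    using Clock = std::chrono::steady_clock;

    explicit FrameLimiter(double targetFps)
        : interval(std::chrono::duration_cast<Clock::duration>(
              std::chrono::duration<double>(1.0 / targetFps)))
    {
    }

    // Returns true when at least one frame interval has elapsed, and
    // remembers `now` as the start of the new frame.
    bool shouldRender(Clock::time_point now)
    {
        if (now - last < interval)
        {
            return false;
        }
        last = now;
        return true;
    }

  private:
    Clock::duration interval;
    Clock::time_point last{}; // the clock's epoch, so the first frame passes
};
```

In a render loop you would call `limiter.shouldRender(FrameLimiter::Clock::now())` and return early when it yields false. Using `steady_clock` keeps the measurement immune to system clock adjustments.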

Destroy Handles

If you’re using ComPtr<T> data structures, then just like with shared pointers, you don't need to worry about destroying any handles you create. If you don't, you can call the Release() function built into every DirectX data structure.

Conclusion

DirectX 12 is a feature rich and robust computer graphics API, perfect for commercial projects. The API is similar to modern graphics APIs in its design, while also being the primary API maintained by both hardware driver engineers and engineers of commercial projects. This post reviews raster based drawing; however, there are more aspects of DirectX not discussed here that are worth reviewing, such as:

  • DirectML — Hardware accelerated machine learning model execution.
  • DirectX Raytracing — Hardware accelerated ray tracing and scene traversal.
  • Compute Shaders — GPGPU based execution of arbitrary tasks, such as image processing, physics, etc.

More Resources

Be sure to check out some of the following blog posts, tools, and projects:

Articles

Samples

Tools

  • Tim Jones (@tim_jones) released a VS Code Plugin called HLSL Tools that lets you lint your shaders more easily.

You’ll find all the source code described in this post in the GitHub repo here.

Alain Galvan

https://Alain.xyz | Graphics Software Engineer @ AMD, Previously @ Marmoset.co. Guest lecturer talking about 🛆 Computer Graphics, ✍ tech author.