ADD: KTX loader
@@ -1,7 +1,7 @@
|
|||||||
Texture Loading & Streaming
|
Texture Loading & Streaming
|
||||||
|
|
||||||
Overview
|
Overview
|
||||||
- Streaming cache: `src/core/texture_cache.{h,cpp}` asynchronously decodes images (stb_image) on a small worker pool (1–4 threads, clamped by hardware concurrency) and uploads them via `ResourceManager` with optional mipmaps. Descriptors registered up‑front are patched in‑place once the texture becomes resident. Large decodes can be downscaled on workers before upload to cap peak memory.
|
- Streaming cache: `src/core/texture_cache.{h,cpp}` asynchronously decodes images (stb_image) on a small worker pool (1–4 threads, clamped by hardware concurrency) and uploads them via `ResourceManager` with optional mipmaps. For FilePath keys, a sibling `<stem>.ktx2` (or direct `.ktx2`) is preferred over PNG/JPEG. Descriptors registered up‑front are patched in‑place once the texture becomes resident. Large decodes can be downscaled on workers before upload to cap peak memory.
|
||||||
- Uploads: `src/core/vk_resource.{h,cpp}` stages pixel data and either submits immediately or registers a Render Graph transfer pass. Mipmaps use `vkutil::generate_mipmaps(...)` and finish in `SHADER_READ_ONLY_OPTIMAL`.
|
- Uploads: `src/core/vk_resource.{h,cpp}` stages pixel data and either submits immediately or registers a Render Graph transfer pass. Mipmaps use `vkutil::generate_mipmaps(...)` and finish in `SHADER_READ_ONLY_OPTIMAL`.
|
||||||
- Integration points:
|
- Integration points:
|
||||||
- Materials: layouts use `UPDATE_AFTER_BIND`; descriptors can be rewritten after bind.
|
- Materials: layouts use `UPDATE_AFTER_BIND`; descriptors can be rewritten after bind.
|
||||||
@@ -18,11 +18,14 @@ Data Flow
|
|||||||
- `pumpLoads(...)` looks for entries in `Unloaded` or `Evicted` state that were seen recently (`now == 0` or `now - lastUsed <= 1`) and starts at most `max_loads_per_pump` decodes per call, while enforcing a byte budget for uploads per frame.
|
- `pumpLoads(...)` looks for entries in `Unloaded` or `Evicted` state that were seen recently (`now == 0` or `now - lastUsed <= 1`) and starts at most `max_loads_per_pump` decodes per call, while enforcing a byte budget for uploads per frame.
|
||||||
- Render passes mark used sets each frame with `markSetUsed(...)` (or specific handles via `markUsed(...)`).
|
- Render passes mark used sets each frame with `markSetUsed(...)` (or specific handles via `markUsed(...)`).
|
||||||
- Decode
|
- Decode
|
||||||
- Worker threads decode to RGBA8 with stb_image (`stbi_load` / `stbi_load_from_memory`). Results are queued for the main thread.
|
- FilePath: if the path ends with `.ktx2` or a sibling exists, parse KTX2 (2D, single‑face, single‑layer, no supercompression). Otherwise, decode to RGBA8 via stb_image.
|
||||||
-- Admission & Upload
|
- Bytes: always decode via stb_image (no sibling discovery possible).
|
||||||
- Before upload, an expected resident size is computed from chosen format (R/RG/RGBA) and mip count (full chain or clamped). A per‑frame byte budget (`max_bytes_per_pump`) throttles the total amount uploaded each pump.
|
- Admission & Upload
|
||||||
- If a GPU texture budget is set, the cache tries to free space by evicting least‑recently‑used textures not used this frame. If it still cannot fit, the decode is deferred (kept in the ready queue) or dropped with backoff if VRAM is tight.
|
- Before upload, an expected resident size is computed (exact for KTX2 by summing level byte lengths; estimated for raster by format×area×mip‑factor). A per‑frame byte budget (`max_bytes_per_pump`) throttles uploads.
|
||||||
- Uploads are created via `ResourceManager::create_image(...)`, which now supports an explicit mip count. Deferred upload paths generate exactly the requested number of mips.
|
- If a GPU texture budget is set, the cache evicts least‑recently‑used textures not used this frame. If it still cannot fit, the decode is deferred or dropped with backoff.
|
||||||
|
- Raster: `ResourceManager::create_image(...)` stages a single region, then optionally generates mips on GPU.
|
||||||
|
- KTX2: `ResourceManager::create_image_compressed(...)` allocates an image with the file’s `VkFormat` and records one `VkBufferImageCopy` per mip level (no GPU mip gen). Immediate path transitions to `SHADER_READ_ONLY_OPTIMAL`; RG path leaves it in `TRANSFER_DST` until a sampling pass.
|
||||||
|
- If the device cannot sample the KTX2 format, the cache falls back to raster decode.
|
||||||
- After upload: state → `Resident`, descriptors recorded via `watchBinding` are rewritten to the new image view with the chosen sampler and `SHADER_READ_ONLY_OPTIMAL` layout. For Bytes‑backed keys, compressed source bytes are dropped unless `keep_source_bytes` is enabled.
|
- After upload: state → `Resident`, descriptors recorded via `watchBinding` are rewritten to the new image view with the chosen sampler and `SHADER_READ_ONLY_OPTIMAL` layout. For Bytes‑backed keys, compressed source bytes are dropped unless `keep_source_bytes` is enabled.
|
||||||
- Eviction & Reload
|
- Eviction & Reload
|
||||||
- `evictToBudget(bytes)` rewrites watchers to fallbacks, destroys images, and marks entries `Evicted`. Evicted entries can reload automatically when seen again and a short cooldown has passed (default ~2 frames), avoiding immediate thrash.
|
- `evictToBudget(bytes)` rewrites watchers to fallbacks, destroys images, and marks entries `Evicted`. Evicted entries can reload automatically when seen again and a short cooldown has passed (default ~2 frames), avoiding immediate thrash.
|
||||||
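The KTX2-first lookup described under Decode can be sketched in isolation. This is a minimal sketch, not code from the commit: `ktx2_candidate` is a hypothetical helper name. The worker in `texture_cache.cpp` additionally checks that the candidate exists on disk before reading it.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper (not part of the commit): map a requested texture path
// to the .ktx2 file the cache would look for first.
inline std::filesystem::path ktx2_candidate(const std::filesystem::path &requested)
{
    if (requested.extension() == ".ktx2")
    {
        return requested; // direct .ktx2 request: use the path as-is
    }
    std::filesystem::path sibling = requested;
    sibling.replace_extension(".ktx2"); // e.g. albedo.png -> albedo.ktx2
    return sibling;
}
```

The extension comparison is case-sensitive, so under this sketch a file named `albedo.KTX2` would be treated as a raster path and would only be found through its lowercase sibling name.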
@@ -67,11 +70,15 @@ Implementation Notes
|
|||||||
- Progressive downscale
|
- Progressive downscale
|
||||||
- The decode thread downsizes large images by powers of 2 until within `Max Upload Dimension`, reducing both staging and VRAM. You can increase the cap or disable it (set to 0) from the UI.
|
- The decode thread downsizes large images by powers of 2 until within `Max Upload Dimension`, reducing both staging and VRAM. You can increase the cap or disable it (set to 0) from the UI.
|
||||||
|
|
||||||
|
KTX2 specifics
|
||||||
|
- Supported: 2D, single‑face, single‑layer, no supercompression; pre‑transcoded BCn (including sRGB variants).
|
||||||
|
- Not supported: UASTC/BasisLZ transcoding at runtime, cube/array/multilayer.
|
||||||
|
|
||||||
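Because only pre-transcoded BCn payloads are supported, the byte length of each mip level is predictable from its extent. The sketch below shows that arithmetic under stated assumptions: `mip_extent` and `bc_level_bytes` are hypothetical names, and the diff does not show whether `parse_ktx2` performs any such check. The block sizes are the standard ones: 8 bytes per 4x4 block for BC1/BC4, 16 bytes for BC2/BC3/BC5/BC6H/BC7.

```cpp
#include <algorithm>
#include <cstdint>

// Extent of one dimension at a given mip level, clamped to at least 1 texel.
// Matches the max(1, width >> level) expression used in the worker's logging.
inline uint32_t mip_extent(uint32_t base, uint32_t level)
{
    if (level >= 32)
    {
        return 1u; // avoid an out-of-range shift
    }
    return std::max<uint32_t>(1u, base >> level);
}

// Bytes occupied by one BCn mip level: the extent is rounded up to whole
// 4x4 blocks, and every block takes blockBytes (8 or 16, see above).
inline uint64_t bc_level_bytes(uint32_t width, uint32_t height, uint32_t blockBytes)
{
    const uint64_t blocksX = (static_cast<uint64_t>(width) + 3) / 4;
    const uint64_t blocksY = (static_cast<uint64_t>(height) + 3) / 4;
    return blocksX * blocksY * blockBytes;
}
```

A loader could compare `bc_level_bytes(mip_extent(w, i), mip_extent(h, i), blockBytes)` with the length recorded for level `i` and reject files where the two disagree.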
Limitations / Future Work
|
Limitations / Future Work
|
||||||
- Linear‑blit capability check
|
- Linear‑blit capability check
|
||||||
- `generate_mipmaps` always uses `VK_FILTER_LINEAR`. Add a format/feature check and a fallback path (nearest or compute downsample).
|
- `generate_mipmaps` always uses `VK_FILTER_LINEAR`. Add a format/feature check and a fallback path (nearest or compute downsample).
|
||||||
- Texture formats
|
- Texture formats
|
||||||
- Only 8‑bit RGBA uploads via stb_image today. Consider KTX2/BasisU for ASTC/BCn, specialized R8/RG8 paths, and float HDR support (`stbi_loadf` → `R16G16B16A16_SFLOAT`).
|
- Raster path: 8‑bit R/RG/RGBA via stb_image. Compressed path: BCn via `.ktx2`. Future: ASTC/ETC2, specialized R8/RG8 parsing, and float HDR support (`stbi_loadf` → `R16G16B16A16_SFLOAT`).
|
||||||
- Normal‑map mip quality
|
- Normal‑map mip quality
|
||||||
- Linear blits reduce normal length; consider a compute renormalization pass.
|
- Linear blits reduce normal length; consider a compute renormalization pass.
|
||||||
- Samplers
|
- Samplers
|
||||||
|
|||||||
@@ -34,6 +34,8 @@ add_executable (vulkan_engine
|
|||||||
core/frame_resources.cpp
|
core/frame_resources.cpp
|
||||||
core/texture_cache.h
|
core/texture_cache.h
|
||||||
core/texture_cache.cpp
|
core/texture_cache.cpp
|
||||||
|
core/ktx2_loader.h
|
||||||
|
core/ktx2_loader.cpp
|
||||||
core/config.h
|
core/config.h
|
||||||
core/vk_engine.h
|
core/vk_engine.h
|
||||||
core/vk_engine.cpp
|
core/vk_engine.cpp
|
||||||
|
|||||||
@@ -6,9 +6,12 @@
|
|||||||
#include <core/config.h>
|
#include <core/config.h>
|
||||||
#include <algorithm>
|
#include <algorithm>
|
||||||
#include "stb_image.h"
|
#include "stb_image.h"
|
||||||
|
#include "ktx2_loader.h"
|
||||||
#include <algorithm>
|
#include <algorithm>
|
||||||
#include "vk_device.h"
|
#include "vk_device.h"
|
||||||
#include <cstring>
|
#include <cstring>
|
||||||
|
#include <filesystem>
|
||||||
|
#include <fstream>
|
||||||
#include <limits>
|
#include <limits>
|
||||||
#include <cmath>
|
#include <cmath>
|
||||||
|
|
||||||
@@ -372,29 +375,101 @@ void TextureCache::worker_loop()
|
|||||||
_queue.pop_front();
|
_queue.pop_front();
|
||||||
}
|
}
|
||||||
|
|
||||||
// Decode using stb_image
|
DecodedResult out{};
|
||||||
|
out.handle = rq.handle;
|
||||||
|
out.mipmapped = rq.key.mipmapped;
|
||||||
|
out.srgb = rq.key.srgb;
|
||||||
|
out.channels = rq.key.channels;
|
||||||
|
out.mipClampLevels = rq.key.mipClampLevels;
|
||||||
|
|
||||||
|
// 1) Prefer KTX2 when source is a file path and a .ktx2 version exists
|
||||||
|
bool attemptedKTX2 = false;
|
||||||
|
if (rq.key.kind == TextureKey::SourceKind::FilePath)
|
||||||
|
{
|
||||||
|
std::filesystem::path p = rq.path;
|
||||||
|
std::filesystem::path ktxPath;
|
||||||
|
if (p.extension() == ".ktx2")
|
||||||
|
{
|
||||||
|
ktxPath = p;
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
ktxPath = p;
|
||||||
|
ktxPath.replace_extension(".ktx2");
|
||||||
|
}
|
||||||
|
std::error_code ec;
|
||||||
|
bool hasKTX2 = (!ktxPath.empty() && std::filesystem::exists(ktxPath, ec) && !ec);
|
||||||
|
if (hasKTX2)
|
||||||
|
{
|
||||||
|
attemptedKTX2 = true;
|
||||||
|
// Read file
|
||||||
|
fmt::println("[TextureCache] KTX2 candidate for '{}' → '{}'", rq.path, ktxPath.string());
|
||||||
|
std::ifstream ifs(ktxPath, std::ios::binary);
|
||||||
|
if (ifs)
|
||||||
|
{
|
||||||
|
std::vector<uint8_t> fileBytes(std::istreambuf_iterator<char>(ifs), {});
|
||||||
|
fmt::println("[TextureCache] KTX2 read {} bytes", fileBytes.size());
|
||||||
|
KTX2Image ktx{};
|
||||||
|
std::string err;
|
||||||
|
if (parse_ktx2(fileBytes.data(), fileBytes.size(), ktx, &err))
|
||||||
|
{
|
||||||
|
fmt::println("[TextureCache] KTX2 parsed: format={}, {}x{}, mips={}, faces={}, layers={}, supercompression={}",
|
||||||
|
string_VkFormat(static_cast<VkFormat>(ktx.format)), ktx.width, ktx.height,
|
||||||
|
ktx.mipLevels, ktx.faceCount, ktx.layerCount, ktx.supercompression);
|
||||||
|
size_t sum = 0; for (const auto &lv: ktx.levels) sum += static_cast<size_t>(lv.length);
|
||||||
|
fmt::println("[TextureCache] KTX2 levels: {} totalBytes={}", ktx.levels.size(), sum);
|
||||||
|
for (size_t li = 0; li < ktx.levels.size(); ++li)
|
||||||
|
{
|
||||||
|
fmt::println(" L{}: off={}, len={}, extent={}x{}", li, ktx.levels[li].offset,
|
||||||
|
ktx.levels[li].length,
|
||||||
|
std::max(1u, ktx.width >> li),
|
||||||
|
std::max(1u, ktx.height >> li));
|
||||||
|
}
|
||||||
|
out.isKTX2 = true;
|
||||||
|
out.ktxFormat = ktx.format;
|
||||||
|
out.ktxMipLevels = ktx.mipLevels;
|
||||||
|
out.ktx.bytes = std::move(ktx.data);
|
||||||
|
out.ktx.levels.reserve(ktx.levels.size());
|
||||||
|
for (const auto &lv : ktx.levels)
|
||||||
|
{
|
||||||
|
out.ktx.levels.push_back({lv.offset, lv.length, lv.width, lv.height});
|
||||||
|
}
|
||||||
|
out.width = static_cast<int>(ktx.width);
|
||||||
|
out.height = static_cast<int>(ktx.height);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
fmt::println("[TextureCache] parse_ktx2 failed for '{}' ({} bytes): {}",
|
||||||
|
ktxPath.string(), fileBytes.size(), err);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
fmt::println("[TextureCache] Failed to open KTX2 file '{}'", ktxPath.string());
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else if (p.extension() == ".ktx2")
|
||||||
|
{
|
||||||
|
fmt::println("[TextureCache] Requested .ktx2 '{}' but file not found (ec={})", p.string(), ec.value());
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// 2) Raster fallback via stb_image if not KTX2 or unsupported
|
||||||
|
if (!out.isKTX2)
|
||||||
|
{
|
||||||
int w = 0, h = 0, comp = 0;
|
int w = 0, h = 0, comp = 0;
|
||||||
unsigned char *data = nullptr;
|
unsigned char *data = nullptr;
|
||||||
if (rq.key.kind == TextureKey::SourceKind::FilePath)
|
if (rq.key.kind == TextureKey::SourceKind::FilePath)
|
||||||
{
|
{
|
||||||
data = stbi_load(rq.path.c_str(), &w, &h, &comp, 4);
|
data = stbi_load(rq.path.c_str(), &w, &h, &comp, 4);
|
||||||
}
|
}
|
||||||
else
|
else if (!rq.bytes.empty())
|
||||||
{
|
|
||||||
if (!rq.bytes.empty())
|
|
||||||
{
|
{
|
||||||
data = stbi_load_from_memory(rq.bytes.data(), static_cast<int>(rq.bytes.size()), &w, &h, &comp, 4);
|
data = stbi_load_from_memory(rq.bytes.data(), static_cast<int>(rq.bytes.size()), &w, &h, &comp, 4);
|
||||||
}
|
}
|
||||||
}
|
|
||||||
|
|
||||||
DecodedResult out{};
|
|
||||||
out.handle = rq.handle;
|
|
||||||
out.width = w;
|
out.width = w;
|
||||||
out.height = h;
|
out.height = h;
|
||||||
out.mipmapped = rq.key.mipmapped;
|
|
||||||
out.srgb = rq.key.srgb;
|
|
||||||
out.channels = rq.key.channels;
|
|
||||||
out.mipClampLevels = rq.key.mipClampLevels;
|
|
||||||
if (data && w > 0 && h > 0)
|
if (data && w > 0 && h > 0)
|
||||||
{
|
{
|
||||||
// Progressive downscale if requested
|
// Progressive downscale if requested
|
||||||
@@ -425,6 +500,7 @@ void TextureCache::worker_loop()
|
|||||||
{
|
{
|
||||||
stbi_image_free(data);
|
stbi_image_free(data);
|
||||||
}
|
}
|
||||||
|
}
|
||||||
|
|
||||||
{
|
{
|
||||||
std::lock_guard<std::mutex> lk(_readyMutex);
|
std::lock_guard<std::mutex> lk(_readyMutex);
|
||||||
@@ -447,34 +523,38 @@ size_t TextureCache::drain_ready_uploads(ResourceManager &rm, size_t budgetBytes
|
|||||||
{
|
{
|
||||||
if (res.handle == InvalidHandle || res.handle >= _entries.size()) continue;
|
if (res.handle == InvalidHandle || res.handle >= _entries.size()) continue;
|
||||||
Entry &e = _entries[res.handle];
|
Entry &e = _entries[res.handle];
|
||||||
if ((res.heap == nullptr && res.rgba.empty()) || res.width <= 0 || res.height <= 0)
|
if (!res.isKTX2 && ((res.heap == nullptr && res.rgba.empty()) || res.width <= 0 || res.height <= 0))
|
||||||
{
|
{
|
||||||
e.state = EntryState::Evicted; // failed decode; keep fallback
|
e.state = EntryState::Evicted; // failed decode; keep fallback
|
||||||
continue;
|
continue;
|
||||||
}
|
}
|
||||||
|
|
||||||
const uint32_t now = _context ? _context->frameIndex : 0u;
|
const uint32_t now = _context ? _context->frameIndex : 0u;
|
||||||
VkExtent3D extent{static_cast<uint32_t>(res.width), static_cast<uint32_t>(res.height), 1u};
|
VkExtent3D extent{static_cast<uint32_t>(std::max(0, res.width)), static_cast<uint32_t>(std::max(0, res.height)), 1u};
|
||||||
TextureKey::ChannelsHint hint = (e.key.channels == TextureKey::ChannelsHint::Auto)
|
TextureKey::ChannelsHint hint = (e.key.channels == TextureKey::ChannelsHint::Auto)
|
||||||
? TextureKey::ChannelsHint::Auto
|
? TextureKey::ChannelsHint::Auto
|
||||||
: e.key.channels;
|
: e.key.channels;
|
||||||
VkFormat fmt = choose_format(hint, res.srgb);
|
|
||||||
|
|
||||||
// Estimate resident size for admission control (match post-upload computation)
|
size_t expectedBytes = 0;
|
||||||
|
VkFormat fmt = VK_FORMAT_UNDEFINED;
|
||||||
uint32_t desiredLevels = 1;
|
uint32_t desiredLevels = 1;
|
||||||
if (res.mipmapped)
|
if (res.isKTX2)
|
||||||
{
|
{
|
||||||
if (res.mipClampLevels > 0)
|
fmt = res.ktxFormat;
|
||||||
{
|
desiredLevels = res.ktxMipLevels;
|
||||||
desiredLevels = res.mipClampLevels;
|
for (const auto &lv : res.ktx.levels) expectedBytes += static_cast<size_t>(lv.length);
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
desiredLevels = static_cast<uint32_t>(std::floor(std::log2(std::max(extent.width, extent.height)))) + 1u;
|
fmt = choose_format(hint, res.srgb);
|
||||||
}
|
if (res.mipmapped)
|
||||||
|
{
|
||||||
|
if (res.mipClampLevels > 0) desiredLevels = res.mipClampLevels;
|
||||||
|
else desiredLevels = static_cast<uint32_t>(std::floor(std::log2(std::max(extent.width, extent.height)))) + 1u;
|
||||||
}
|
}
|
||||||
const float mipFactor = res.mipmapped ? mip_factor_for_levels(desiredLevels) : 1.0f;
|
const float mipFactor = res.mipmapped ? mip_factor_for_levels(desiredLevels) : 1.0f;
|
||||||
const size_t expectedBytes = static_cast<size_t>(extent.width) * extent.height * bytes_per_texel(fmt) * mipFactor;
|
expectedBytes = static_cast<size_t>(extent.width) * extent.height * bytes_per_texel(fmt) * mipFactor;
|
||||||
|
}
|
||||||
|
|
||||||
// Byte budget for this pump (frame)
|
// Byte budget for this pump (frame)
|
||||||
if (admitted + expectedBytes > budgetBytes)
|
if (admitted + expectedBytes > budgetBytes)
|
||||||
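The admission arithmetic in this hunk can be tried out without Vulkan. The following is a hedged sketch with hypothetical names. In particular, the body of `mip_factor_for_levels` is not shown in the diff, so `mip_factor` below assumes it is the geometric series 1 + 1/4 + 1/16 + ... over the requested level count.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Level count of a full mip chain: floor(log2(max(w, h))) + 1, using integers.
inline uint32_t full_mip_levels(uint32_t width, uint32_t height)
{
    uint32_t largest = std::max(width, height);
    uint32_t levels = 1;
    while (largest > 1)
    {
        largest >>= 1;
        ++levels;
    }
    return levels;
}

// Assumed stand-in for mip_factor_for_levels(): sum of (1/4)^i for i < levels.
inline double mip_factor(uint32_t levels)
{
    double factor = 0.0;
    double term = 1.0;
    for (uint32_t i = 0; i < levels; ++i)
    {
        factor += term;
        term *= 0.25;
    }
    return factor;
}

// Raster estimate: base area x bytes per texel x mip factor.
inline std::size_t estimate_raster_bytes(uint32_t w, uint32_t h, uint32_t bytesPerTexel, uint32_t levels)
{
    const double base = static_cast<double>(w) * h * bytesPerTexel;
    return static_cast<std::size_t>(base * mip_factor(levels));
}

// KTX2: exact size, the sum of the per-level byte lengths from the file.
inline std::size_t exact_ktx2_bytes(const std::vector<uint64_t> &levelLengths)
{
    std::size_t total = 0;
    for (uint64_t len : levelLengths)
    {
        total += static_cast<std::size_t>(len);
    }
    return total;
}
```

Unlike the `std::log2` expression in the diff, the integer loop also returns a defined result (one level) for a zero-sized extent.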
@@ -503,6 +583,61 @@ size_t TextureCache::drain_ready_uploads(ResourceManager &rm, size_t budgetBytes
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if (res.isKTX2)
|
||||||
|
{
|
||||||
|
// Basic format support check: ensure the GPU can sample this format
|
||||||
|
bool supported = true;
|
||||||
|
if (_context && _context->getDevice())
|
||||||
|
{
|
||||||
|
VkFormatProperties props{};
|
||||||
|
vkGetPhysicalDeviceFormatProperties(_context->getDevice()->physicalDevice(), fmt, &props);
|
||||||
|
supported = (props.optimalTilingFeatures & VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT) != 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (!supported)
|
||||||
|
{
|
||||||
|
VkFormatProperties props{};
|
||||||
|
if (_context && _context->getDevice())
|
||||||
|
{
|
||||||
|
vkGetPhysicalDeviceFormatProperties(_context->getDevice()->physicalDevice(), fmt, &props);
|
||||||
|
}
|
||||||
|
fmt::println("[TextureCache] Compressed format unsupported: format={} (optimalFeatures=0x{:08x}) — fallback raster for {}",
|
||||||
|
string_VkFormat(fmt), props.optimalTilingFeatures, e.path);
|
||||||
|
// Fall back to raster path: requeue by synthesizing a non-KTX result
|
||||||
|
// Attempt synchronous fallback decode from file if available.
|
||||||
|
int fw = 0, fh = 0, comp = 0;
|
||||||
|
unsigned char *fdata = nullptr;
|
||||||
|
if (e.key.kind == TextureKey::SourceKind::FilePath)
|
||||||
|
{
|
||||||
|
fdata = stbi_load(e.path.c_str(), &fw, &fh, &comp, 4);
|
||||||
|
}
|
||||||
|
if (!fdata)
|
||||||
|
{
|
||||||
|
e.state = EntryState::Evicted;
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
VkExtent3D fext{ (uint32_t)fw, (uint32_t)fh, 1 };
|
||||||
|
VkFormat ffmt = choose_format(hint, res.srgb);
|
||||||
|
uint32_t mips = (res.mipmapped) ? static_cast<uint32_t>(std::floor(std::log2(std::max(fext.width, fext.height)))) + 1u : 1u;
|
||||||
|
e.image = rm.create_image(fdata, fext, ffmt, VK_IMAGE_USAGE_SAMPLED_BIT, res.mipmapped, mips);
|
||||||
|
stbi_image_free(fdata);
|
||||||
|
e.sizeBytes = static_cast<size_t>(fext.width) * fext.height * bytes_per_texel(ffmt) * (res.mipmapped ? mip_factor_for_levels(mips) : 1.0f);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
// Prepare level table for ResourceManager
|
||||||
|
std::vector<ResourceManager::MipLevelCopy> levels;
|
||||||
|
levels.reserve(res.ktx.levels.size());
|
||||||
|
for (const auto &lv : res.ktx.levels)
|
||||||
|
{
|
||||||
|
levels.push_back(ResourceManager::MipLevelCopy{ lv.offset, lv.length, lv.width, lv.height });
|
||||||
|
}
|
||||||
|
e.image = rm.create_image_compressed(res.ktx.bytes.data(), res.ktx.bytes.size(), fmt, levels);
|
||||||
|
e.sizeBytes = expectedBytes;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
// Optionally repack channels to R or RG to save memory
|
// Optionally repack channels to R or RG to save memory
|
||||||
std::vector<uint8_t> packed;
|
std::vector<uint8_t> packed;
|
||||||
const void *src = nullptr;
|
const void *src = nullptr;
|
||||||
@@ -535,6 +670,8 @@ size_t TextureCache::drain_ready_uploads(ResourceManager &rm, size_t budgetBytes
|
|||||||
|
|
||||||
uint32_t mipOverride = (res.mipmapped ? desiredLevels : 1);
|
uint32_t mipOverride = (res.mipmapped ? desiredLevels : 1);
|
||||||
e.image = rm.create_image(src, extent, fmt, VK_IMAGE_USAGE_SAMPLED_BIT, res.mipmapped, mipOverride);
|
e.image = rm.create_image(src, extent, fmt, VK_IMAGE_USAGE_SAMPLED_BIT, res.mipmapped, mipOverride);
|
||||||
|
e.sizeBytes = expectedBytes;
|
||||||
|
}
|
||||||
|
|
||||||
if (vmaDebugEnabled())
|
if (vmaDebugEnabled())
|
||||||
{
|
{
|
||||||
@@ -542,7 +679,6 @@ size_t TextureCache::drain_ready_uploads(ResourceManager &rm, size_t budgetBytes
|
|||||||
vmaSetAllocationName(_context->getDevice()->allocator(), e.image.allocation, name.c_str());
|
vmaSetAllocationName(_context->getDevice()->allocator(), e.image.allocation, name.c_str());
|
||||||
}
|
}
|
||||||
|
|
||||||
e.sizeBytes = expectedBytes;
|
|
||||||
_residentBytes += e.sizeBytes;
|
_residentBytes += e.sizeBytes;
|
||||||
e.state = EntryState::Resident;
|
e.state = EntryState::Resident;
|
||||||
e.nextAttemptFrame = 0; // clear backoff after success
|
e.nextAttemptFrame = 0; // clear backoff after success
|
||||||
|
|||||||
@@ -188,6 +188,17 @@ private:
|
|||||||
bool srgb{false};
|
bool srgb{false};
|
||||||
TextureKey::ChannelsHint channels{TextureKey::ChannelsHint::Auto};
|
TextureKey::ChannelsHint channels{TextureKey::ChannelsHint::Auto};
|
||||||
uint32_t mipClampLevels{0};
|
uint32_t mipClampLevels{0};
|
||||||
|
|
||||||
|
// Compressed path (KTX2 pre-transcoded BCn). When true, 'rgba/heap'
|
||||||
|
// are ignored and the fields below describe the payload.
|
||||||
|
bool isKTX2{false};
|
||||||
|
VkFormat ktxFormat{VK_FORMAT_UNDEFINED};
|
||||||
|
uint32_t ktxMipLevels{0};
|
||||||
|
struct KTXPack {
|
||||||
|
struct L { uint64_t offset{0}, length{0}; uint32_t width{0}, height{0}; };
|
||||||
|
std::vector<uint8_t> bytes; // full file content
|
||||||
|
std::vector<L> levels; // per-mip region description
|
||||||
|
} ktx;
|
||||||
};
|
};
|
||||||
|
|
||||||
void worker_loop();
|
void worker_loop();
|
||||||
|
|||||||
@@ -391,6 +391,17 @@ void ResourceManager::process_queued_uploads_immediate()
|
|||||||
vkutil::transition_image(cmd, imageUpload.image, imageUpload.initialLayout,
|
vkutil::transition_image(cmd, imageUpload.image, imageUpload.initialLayout,
|
||||||
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
|
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
|
||||||
|
|
||||||
|
if (!imageUpload.copies.empty())
|
||||||
|
{
|
||||||
|
vkCmdCopyBufferToImage(cmd,
|
||||||
|
imageUpload.staging.buffer,
|
||||||
|
imageUpload.image,
|
||||||
|
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
|
||||||
|
static_cast<uint32_t>(imageUpload.copies.size()),
|
||||||
|
imageUpload.copies.data());
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
VkBufferImageCopy copyRegion = {};
|
VkBufferImageCopy copyRegion = {};
|
||||||
copyRegion.bufferOffset = 0;
|
copyRegion.bufferOffset = 0;
|
||||||
copyRegion.bufferRowLength = 0;
|
copyRegion.bufferRowLength = 0;
|
||||||
@@ -407,6 +418,7 @@ void ResourceManager::process_queued_uploads_immediate()
|
|||||||
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
|
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
|
||||||
1,
|
1,
|
||||||
&copyRegion);
|
&copyRegion);
|
||||||
|
}
|
||||||
|
|
||||||
if (imageUpload.generateMips)
|
if (imageUpload.generateMips)
|
||||||
{
|
{
|
||||||
@@ -571,6 +583,13 @@ void ResourceManager::register_upload_pass(RenderGraph &graph, FrameResources &f
|
|||||||
VkBuffer staging = res.buffer(binding.stagingHandle);
|
VkBuffer staging = res.buffer(binding.stagingHandle);
|
||||||
VkImage image = res.image(binding.imageHandle);
|
VkImage image = res.image(binding.imageHandle);
|
||||||
|
|
||||||
|
if (!upload.copies.empty())
|
||||||
|
{
|
||||||
|
vkCmdCopyBufferToImage(cmd, staging, image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
|
||||||
|
static_cast<uint32_t>(upload.copies.size()), upload.copies.data());
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
VkBufferImageCopy region{};
|
VkBufferImageCopy region{};
|
||||||
region.bufferOffset = 0;
|
region.bufferOffset = 0;
|
||||||
region.bufferRowLength = 0;
|
region.bufferRowLength = 0;
|
||||||
@@ -580,17 +599,21 @@ void ResourceManager::register_upload_pass(RenderGraph &graph, FrameResources &f
|
|||||||
region.imageSubresource.baseArrayLayer = 0;
|
region.imageSubresource.baseArrayLayer = 0;
|
||||||
region.imageSubresource.layerCount = 1;
|
region.imageSubresource.layerCount = 1;
|
||||||
region.imageExtent = upload.extent;
|
region.imageExtent = upload.extent;
|
||||||
|
|
||||||
vkCmdCopyBufferToImage(cmd, staging, image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &region);
|
vkCmdCopyBufferToImage(cmd, staging, image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &region);
|
||||||
|
}
|
||||||
|
|
||||||
if (upload.generateMips)
|
if (upload.generateMips)
|
||||||
{
|
{
|
||||||
// NOTE: generate_mipmaps_levels() transitions the image to
|
// NOTE: generate_mipmaps_levels() transitions the image to
|
||||||
// VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL at the end.
|
// VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL at the end.
|
||||||
// Do not transition back to TRANSFER here.
|
|
||||||
vkutil::generate_mipmaps_levels(cmd, image, VkExtent2D{upload.extent.width, upload.extent.height},
|
vkutil::generate_mipmaps_levels(cmd, image, VkExtent2D{upload.extent.width, upload.extent.height},
|
||||||
static_cast<int>(upload.mipLevels));
|
static_cast<int>(upload.mipLevels));
|
||||||
}
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
// Transition to final layout for sampling
|
||||||
|
vkutil::transition_image(cmd, image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, upload.finalLayout);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
|
|
||||||
@@ -606,3 +629,63 @@ void ResourceManager::register_upload_pass(RenderGraph &graph, FrameResources &f
|
|||||||
}
|
}
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
|
||||||
|
AllocatedImage ResourceManager::create_image_compressed(const void* bytes, size_t size,
|
||||||
|
VkFormat fmt,
|
||||||
|
std::span<const MipLevelCopy> levels,
|
||||||
|
VkImageUsageFlags usage)
|
||||||
|
{
|
||||||
|
if (bytes == nullptr || size == 0 || levels.empty())
|
||||||
|
{
|
||||||
|
return {};
|
||||||
|
}
|
||||||
|
|
||||||
|
// Determine base extent from level 0
|
||||||
|
VkExtent3D extent{ levels[0].width, levels[0].height, 1 };
|
||||||
|
|
||||||
|
// Stage full payload as-is
|
||||||
|
AllocatedBuffer uploadbuffer = create_buffer(size, VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
|
||||||
|
VMA_MEMORY_USAGE_CPU_TO_GPU);
|
||||||
|
std::memcpy(uploadbuffer.info.pMappedData, bytes, size);
|
||||||
|
vmaFlushAllocation(_deviceManager->allocator(), uploadbuffer.allocation, 0, size);
|
||||||
|
|
||||||
|
// Create GPU image with explicit mip count; no mip generation
|
||||||
|
const uint32_t mipCount = static_cast<uint32_t>(levels.size());
|
||||||
|
AllocatedImage new_image = create_image(extent, fmt,
|
||||||
|
usage | VK_IMAGE_USAGE_TRANSFER_DST_BIT,
|
||||||
|
/*mipmapped=*/true, mipCount);
|
||||||
|
|
||||||
|
PendingImageUpload pending{};
|
||||||
|
pending.staging = uploadbuffer;
|
||||||
|
pending.image = new_image.image;
|
||||||
|
pending.extent = extent;
|
||||||
|
pending.format = fmt;
|
||||||
|
pending.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
|
||||||
|
pending.finalLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
|
||||||
|
pending.generateMips = false;
|
||||||
|
pending.mipLevels = mipCount;
|
||||||
|
pending.copies.reserve(levels.size());
|
||||||
|
|
||||||
|
for (uint32_t i = 0; i < mipCount; ++i)
|
||||||
|
{
|
||||||
|
VkBufferImageCopy region{};
|
||||||
|
region.bufferOffset = levels[i].offset;
|
||||||
|
region.bufferRowLength = 0; // tightly packed
|
||||||
|
region.bufferImageHeight = 0;
|
||||||
|
region.imageSubresource.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
|
||||||
|
region.imageSubresource.mipLevel = i;
|
||||||
|
region.imageSubresource.baseArrayLayer = 0;
|
||||||
|
region.imageSubresource.layerCount = 1;
|
||||||
|
region.imageExtent = { levels[i].width, levels[i].height, 1 };
|
||||||
|
pending.copies.push_back(region);
|
||||||
|
}
|
||||||
|
|
||||||
|
_pendingImageUploads.push_back(std::move(pending));
|
||||||
|
|
||||||
|
if (!_deferUploads)
|
||||||
|
{
|
||||||
|
process_queued_uploads_immediate();
|
||||||
|
}
|
||||||
|
|
||||||
|
return new_image;
|
||||||
|
}
|
||||||
|
|||||||
@@ -13,6 +13,13 @@ struct FrameResources;
|
|||||||
class ResourceManager
|
class ResourceManager
|
||||||
{
|
{
|
||||||
public:
|
public:
|
||||||
|
struct MipLevelCopy
|
||||||
|
{
|
||||||
|
uint64_t offset{0};
|
||||||
|
uint64_t length{0};
|
||||||
|
uint32_t width{0};
|
||||||
|
uint32_t height{0};
|
||||||
|
};
|
||||||
struct BufferCopyRegion
|
struct BufferCopyRegion
|
||||||
{
|
{
|
||||||
VkBuffer destination = VK_NULL_HANDLE;
|
VkBuffer destination = VK_NULL_HANDLE;
|
||||||
@@ -37,6 +44,8 @@ public:
|
|||||||
VkImageLayout finalLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
|
VkImageLayout finalLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
|
||||||
bool generateMips = false;
|
bool generateMips = false;
|
||||||
uint32_t mipLevels = 1;
|
uint32_t mipLevels = 1;
|
||||||
|
// For multi-region (per-mip) uploads
|
||||||
|
std::vector<VkBufferImageCopy> copies;
|
||||||
};
|
};
|
||||||
|
|
||||||
void init(DeviceManager *deviceManager);
|
void init(DeviceManager *deviceManager);
|
||||||
@@ -59,6 +68,14 @@ public:
|
|||||||
AllocatedImage create_image(const void *data, VkExtent3D size, VkFormat format, VkImageUsageFlags usage,
|
AllocatedImage create_image(const void *data, VkExtent3D size, VkFormat format, VkImageUsageFlags usage,
|
||||||
bool mipmapped, uint32_t mipLevelsOverride);
|
bool mipmapped, uint32_t mipLevelsOverride);
|
||||||
|
|
||||||
|
// Create an image from a compressed payload (e.g., KTX2 pre-transcoded BCn).
|
||||||
|
// 'bytes' backs a single staging buffer; 'levels' provides per-mip copy regions.
|
||||||
|
// No GPU mip generation is performed; the number of mips equals levels.size().
|
||||||
|
AllocatedImage create_image_compressed(const void* bytes, size_t size,
|
||||||
|
VkFormat fmt,
|
||||||
|
std::span<const MipLevelCopy> levels,
|
||||||
|
VkImageUsageFlags usage = VK_IMAGE_USAGE_SAMPLED_BIT);
|
||||||
|
|
||||||
void destroy_image(const AllocatedImage &img) const;
|
void destroy_image(const AllocatedImage &img) const;
|
||||||
|
|
||||||
GPUMeshBuffers uploadMesh(std::span<uint32_t> indices, std::span<Vertex> vertices);
|
GPUMeshBuffers uploadMesh(std::span<uint32_t> indices, std::span<Vertex> vertices);
|
||||||
|
|||||||
@@ -156,7 +156,6 @@ def process_one(img_path: Path, out_dir: Path, role, opts):
|
|||||||
ktx_trans = [
|
ktx_trans = [
|
||||||
"ktx", "transcode",
|
"ktx", "transcode",
|
||||||
"--target", target_bc,
|
"--target", target_bc,
|
||||||
"--zstd", "18",
|
|
||||||
str(tmp_ktx2), str(out_ktx2)
|
str(tmp_ktx2), str(out_ktx2)
|
||||||
]
|
]
|
||||||
rc = run_cmd(ktx_trans, dry_run=opts.dry_run)
|
rc = run_cmd(ktx_trans, dry_run=opts.dry_run)
|
||||||
|
|||||||