2.3 Parallelizing the Rendering Pipeline

At first I thought we were supposed to move the threads' computational work onto the GPU, but that does not seem to be what the documentation intends: doing so would require renaming the source files to .cu and changing how the project is compiled, which would break the submission requirements. :face_exhaling:

I saw the following sentence in the teaching assistant’s publicly shared article:

> Starting 8 threads can achieve a speedup ratio of over 4

(edit 1: source: https://xjtu.app/t/topic/9693)

The screenshot in the document does indeed show a speedup of $\frac{7.569}{1.231} \approx 6.15$ with 8 threads. However, if those eight threads have to be shared among the three kinds of workers VertexProcessor::worker_thread, Rasterizer::worker_thread, and FragmentProcessor::worker_thread, and we assume the workload is dominated by fragment processing while each kind needs at least one thread, the best allocation would be:
- VertexProcessor::worker_thread: 1
- Rasterizer::worker_thread: 1
- FragmentProcessor::worker_thread: 6
The speedup probably wouldn’t exceed 6.

This is because, in the ideal case, the 6 FragmentProcessor::worker_threads completely fill the otherwise idle time, cutting that portion of the time $t_0$ down to $\frac{1}{6}t_0$.
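As a rough sanity check (my own back-of-envelope estimate, assuming the three stages overlap fully and fragment processing accounts for essentially all of the serial time $t_0$), the speedup under this 1/1/6 allocation would be

$$S = \frac{t_v + t_r + t_f}{\max\left(t_v,\ t_r,\ \frac{t_f}{6}\right)} \approx \frac{t_0}{t_0/6} = 6,$$

where $t_v$, $t_r$, $t_f$ are the per-frame times spent in the vertex, rasterization, and fragment stages.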
(I'm not sure whether this way of analyzing it is correct. Or could we instead treat these three functions as thread "foremen", and use std::thread inside each foreman thread to spawn worker threads that do the actual work? :thinking:)

Similar to the figure below:
[figure: original image attachment not available]
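To make the foreman idea concrete, here is a minimal sketch of what I have in mind (purely illustrative and under my own assumptions; the fragment type, the shade callback, and the chunking are placeholders, not the lab framework's API):

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical "foreman": the stage's own thread splits its batch into chunks
// and hands each chunk to a short-lived std::thread worker.
void fragment_foreman(std::vector<int>& fragments,
                      const std::function<void(int&)>& shade,
                      unsigned num_workers = 6) {
    std::vector<std::thread> workers;
    const std::size_t chunk = (fragments.size() + num_workers - 1) / num_workers;

    for (unsigned w = 0; w < num_workers; ++w) {
        const std::size_t begin = w * chunk;
        const std::size_t end = std::min(begin + chunk, fragments.size());
        if (begin >= end) break;
        // Each worker shades its own disjoint slice, so no locking is needed here.
        workers.emplace_back([&fragments, &shade, begin, end] {
            for (std::size_t i = begin; i < end; ++i) shade(fragments[i]);
        });
    }
    for (auto& t : workers) t.join();  // the foreman waits for all of its workers
}
```

Each worker owns a disjoint slice, so no synchronization is needed beyond the final join; whether spawning threads per batch like this actually pays off is exactly what I'm unsure about.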

1 Like
  1. This experiment mainly implements parallelism on top of a "soft" (software) rasterizer, so it does not involve the GPU, and not every device supports CUDA anyway.
  2. I don't know which public article you are referring to, but superlinear speedup can occur in some cases; see the Wikipedia article 加速比 (Speedup) for details.
  3. You could of course maintain additional threads inside each worker_thread to compute its tasks in parallel, but that is unnecessary: keeping the data of those nested threads consistent while still getting a speedup is rather troublesome. Thinking about how many worker_threads to give each stage and how to optimize the algorithm inside each stage is already enough (a rough sketch follows below).
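As a minimal sketch of what "more worker_threads per stage" could look like, assuming each stage owns a pool of threads pulling jobs from its own thread-safe queue (Task, Stage, and process here are placeholder names, not the actual framework):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Placeholder task type; in the real pipeline this would be a vertex/fragment job.
struct Task { int payload = 0; };

// One pipeline stage that owns a pool of worker threads pulling from a shared queue.
class Stage {
public:
    explicit Stage(unsigned num_workers) {
        for (unsigned i = 0; i < num_workers; ++i)
            workers_.emplace_back([this] { worker_thread(); });
    }

    ~Stage() {
        {
            std::lock_guard<std::mutex> lk(m_);
            done_ = true;
        }
        cv_.notify_all();
        for (auto& t : workers_) t.join();
    }

    // The upstream stage (or the main thread) pushes work in here.
    void push(Task t) {
        {
            std::lock_guard<std::mutex> lk(m_);
            queue_.push(t);
        }
        cv_.notify_one();
    }

private:
    void worker_thread() {
        for (;;) {
            Task t;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return done_ || !queue_.empty(); });
                if (queue_.empty()) return;  // done_ was set and the queue is drained
                t = queue_.front();
                queue_.pop();
            }
            process(t);  // stage-specific work, done without holding the lock
        }
    }

    void process(Task&) { /* e.g. shade one fragment */ }

    std::queue<Task> queue_;
    std::mutex m_;
    std::condition_variable cv_;
    bool done_ = false;
    std::vector<std::thread> workers_;
};
```

Under that assumption, the pipeline could then instantiate, say, one Stage(1) for vertices, one Stage(1) for rasterization, and one Stage(6) for fragments, matching the 1/1/6 allocation discussed above.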
2 Likes