Game Engine, C 1, P 10, Fiber Benchmark Conclusion

This is the last post in the fiber task system benchmark series. We have tested 4 libraries and 2 of our own solutions, and it's time to review the results.

Results

The results of the benchmark are below.

Benchmark results

| Solution | Launch 1 | Launch 2 | Launch 3 | Launch 4 | Launch 5 | Launch 6 | Launch 7 | Total |
|----------|----------|----------|----------|----------|----------|----------|----------|-------|
| Single | 18.0532 | 16.256 | 16.4315 | 16.4052 | 16.4346 | 16.0924 | 16.2453 | 16.3108 |
| Multi | 11.0562 | 9.60286 | 11.0142 | 11.0135 | 9.84951 | 9.2653 | 11.0359 | 10.2969 |
| Sewing | 8.72879 | 8.68421 | 8.94425 | 8.67256 | 8.68754 | 8.66127 | 8.66274 | 8.7188 |
| Marl | 6.55166 | 6.90798 | 6.38344 | 6.44576 | 6.34505 | 6.35757 | 6.38263 | 6.4704 |
| Tasking | 6.65683 | 6.18805 | 6.3974 | 6.18256 | 6.20655 | 6.22112 | 6.20632 | 6.2337 |
| Job | 7.58917 | 7.54043 | 7.65869 | 7.56652 | 7.61828 | 7.56469 | 7.55989 | 7.5848 |
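
A note on the Total column: its values are not sums. For every row they match the mean of launches 2 through 7, so the first launch appears to be treated as a warm-up run. That is my reading of the numbers, not something stated in the post. A minimal sketch of that calculation, with `mean_without_first` as a made-up helper name:

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Mean of all launches except the first one (treated here as a warm-up run).
// Assumption: the "Total" column was computed this way. The numbers match,
// but the post does not say so explicitly.
double mean_without_first(const std::vector<double>& launches) {
    assert(launches.size() >= 2 && "need at least one launch after the warm-up");
    const double sum = std::accumulate(launches.begin() + 1, launches.end(), 0.0);
    return sum / static_cast<double>(launches.size() - 1);
}
```

Feeding it the seven Marl launches gives roughly 6.4704, and the seven Single launches give roughly 16.3108, which agrees with the table.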

Some of the results were unexpected, but I'm glad I was right in at least a few places. Let's discuss the times we got.
The single-threaded version is not 4 times slower than the others, because I did everything in a loop, which the CPU can optimize pretty well. Besides that, fiber switching adds a bit of overhead to the fiber-based versions.
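
To put "not 4 times slower" into numbers, the Total column can be turned into speedup factors relative to the single-threaded run. This is only a sketch of the arithmetic. The `Result` struct and the `speedups` function are names I made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One row of the results table: solution name and its "Total" value.
struct Result {
    std::string name;
    double total;
};

// For each result, returns baseline_total / result.total, i.e. how many
// times less time that solution took than the baseline.
std::vector<double> speedups(double baseline_total, const std::vector<Result>& results) {
    std::vector<double> factors;
    factors.reserve(results.size());
    for (const Result& r : results) {
        assert(r.total > 0.0 && "timings must be positive");
        factors.push_back(baseline_total / r.total);
    }
    return factors;
}
```

With the Single total of 16.3108 as the baseline, this gives about 2.62 for Tasking, 2.52 for Marl, 2.15 for Job, 1.87 for Sewing and 1.58 for Multi.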
The multi-threaded version was just a beginner-friendly showcase of multithreading, and it achieved just that. It is obviously faster than the single-threaded version, but it has a lot of overhead due to thread initialization, destruction and switching.
Sewing looked very promising to me at first, with a strict C API and a function that calculates all of the necessary memory beforehand. However, in the end it was the slowest of these libraries. The number of jobs has to be specified before the task system is created, so it is less flexible. It uses Boost.Context for fibers.
After benchmarking Marl I was prepared for it to be the fastest solution. It seems to have everything I may need: synchronization primitives and a tidy API. It has multiple task pool types and uses its own in-house code for working with fibers, which is licensed under Apache v2.0. I was really pleased with its result.
FiberTaskingLib was a bit weird to install. As far as I remember, something was wrong with my %PATH%, so the library couldn't be linked cleanly to my benchmark project. I am sure that it was me who did something wrong, but I didn't have any issues with the other libraries. Tasks can be added at runtime, and their number is not limited (at least I didn't reach a limit). It lacks some useful functions but is very neat overall. It uses Boost.Context for fiber management. At first, I was hesitant to call it the winner, but after some time I grew fond of it. Moreover, it is very well documented.
Fiber-job-system was the first library that I found. It was said to be developed after the Naughty Dog presentation, so I was confident in its performance and results. As of writing this article, there is no license on the repo, so I asked Jan personally for permission to use his code in my engine, which he granted. But the performance turned out to be less than perfect. Windows-only support is also an issue. I also ran into trouble with the task queue size, constantly getting out-of-range exceptions. It was clunky to work with, but it uses the native Windows Fiber API and has job priorities, so that was exciting, although, of course, I didn't use them in the benchmark.

Conclusion

What can I do now that I know the results? I can either use one of the libraries or try to implement my own solution based on them. First of all, I am very curious about the Windows Fiber API. I want to merge a few approaches, for example by benchmarking Marl with a Windows API backend. This should lead to promising results. Then I want to find out why FiberTaskingLib got better results than Marl. I think that it will take some time, but I'll be able to write a better system myself afterwards. In the end, I'll write my own library, borrowing code from Marl and FiberTaskingLib. I'll add a priority system as in Fiber-job-system. Thankfully, I have permission to take the code.
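
As a starting point for the priority system, here is a minimal sketch of just the ordering logic. It is not how Fiber-job-system does it. The three priority levels, the `PriorityJobQueue` class and all other names are hypothetical, and there is no locking or fiber code in it.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical priority levels, illustrative only.
enum class JobPriority : std::uint8_t { Low = 0, Normal = 1, High = 2 };

struct Job {
    JobPriority priority;
    std::uint64_t sequence;        // insertion order, used as a tie-breaker
    std::function<void()> work;
};

// Higher priority comes out first. Within one priority, the job that was
// pushed earlier comes out first.
struct JobOrder {
    bool operator()(const Job& a, const Job& b) const {
        if (a.priority != b.priority) return a.priority < b.priority;
        return a.sequence > b.sequence;
    }
};

// Single-threaded sketch: no locking, no fibers, only the ordering.
class PriorityJobQueue {
public:
    void push(JobPriority priority, std::function<void()> work) {
        assert(work && "job must be callable");
        jobs_.push(Job{priority, next_sequence_++, std::move(work)});
    }

    // Runs the most urgent job, if any. Returns false when the queue is empty.
    bool run_next() {
        if (jobs_.empty()) return false;
        // top() returns a const reference, so the callable is copied out
        // before the job is popped.
        std::function<void()> work = jobs_.top().work;
        jobs_.pop();
        work();
        return true;
    }

private:
    std::priority_queue<Job, std::vector<Job>, JobOrder> jobs_;
    std::uint64_t next_sequence_ = 0;
};
```

A real task system would have worker threads pulling from such a queue under a lock or from lock-free per-priority queues. That part is left out on purpose.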
For now, I'm tired of all the task / job systems and benchmarks. I will focus on something else, like writing some documentation. You can read the article here.
