Investigate simulation performance in the case of large detectors
Status: Resolved
Start date: 14 Jun 2017
Target version: Sprint 36
For large detectors, such as 1024x1024, 70% of the time is spent in IDetector2D::createSimulationElements (according to QtCreator/Callgrind, in Debug mode). Eigen constructors and destructors seem to account for most of it.
In the GUI this means that the ProgressBar stays at 0% most of the time (while the simulation elements are being created). This is not a problem per se, but it shows that there is room for improvement.
Within this item:
- Extend StandardSamples and Tests/PerformanceTest/test_performance.py to measure performance of simple simulations made on large detectors.
- Add a small executable to Tests/Functional/Core/CoreSpecial/ to easily profile the performance in case of large detectors (similar to PolDWBAMagCylinders2.cpp or CoreIOTest.cpp)
- Learn how to compile in Release mode with debug symbols, to be able to use Callgrind reliably.
- Analyze the origin of the slow performance
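The measurement idea behind the first and last bullets can be sketched as a small timing harness that separates the setup phase (creating per-pixel elements) from the computation itself, for growing detector sizes. This is only an illustration under stated assumptions: `create_elements` and `run_elements` are hypothetical stand-ins, not BornAgain functions, and the real test would call the actual simulation instead.

```python
import time


def create_elements(nx, ny):
    """Hypothetical stand-in for per-pixel setup (one element per detector pixel)."""
    return [(ix, iy) for ix in range(nx) for iy in range(ny)]


def run_elements(elements):
    """Hypothetical stand-in for the computation done on the elements."""
    return sum(ix + iy for ix, iy in elements)


def time_phases(nx, ny):
    """Time setup and computation separately for an nx-by-ny detector."""
    t0 = time.perf_counter()
    elements = create_elements(nx, ny)
    t1 = time.perf_counter()
    run_elements(elements)
    t2 = time.perf_counter()

    setup, compute = t1 - t0, t2 - t1
    total = setup + compute
    return {
        "pixels": len(elements),
        "setup_s": setup,
        "compute_s": compute,
        # Fraction of the total time spent in setup; 0.0 if nothing was measurable.
        "setup_fraction": setup / total if total > 0 else 0.0,
    }


if __name__ == "__main__":
    # Small sizes keep the sketch fast; a real test would go up to 1024x1024.
    for n in (64, 128, 256):
        r = time_phases(n, n)
        print(f"{n}x{n}: setup {r['setup_s']:.4f} s, "
              f"compute {r['compute_s']:.4f} s, "
              f"setup fraction {r['setup_fraction']:.0%}")
```

A setup fraction that grows with detector size would point at element creation, in line with the Callgrind observation above.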
How many instructions per cycle does our processor execute?
perf stat -a -- sleep 10
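Instructions per cycle (IPC) is the ratio of the instruction count to the cycle count over the measured interval. A minimal sketch of that arithmetic follows; the counter values are made-up placeholders, not measurements from this ticket.

```python
def instructions_per_cycle(instructions, cycles):
    """Return IPC = instructions / cycles for one measurement interval."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


if __name__ == "__main__":
    # Hypothetical counter values, for illustration only.
    instructions = 24_000_000_000
    cycles = 20_000_000_000
    print(f"IPC = {instructions_per_cycle(instructions, cycles):.2f}")
```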