Feature #1842

Improve simulation performance in the case of large detectors

Added by herck almost 3 years ago. Updated over 2 years ago.

Status: Rejected
Start date: 14 Jun 2017
Priority: Normal
Due date:
Assignee: dmitry
% Done:


Target version:Sprint 36


Related to #1818

  • Optimize the critical sections found during profiling in #1818.

Possible ways:
  • Do sim_element.setPolarization() only when necessary.
  • Revise std::move usage in SimulationElement.
  • Revise IDetector2D::getAxisBinIndex and related methods.

From my experience with RegionOfInterest: I was able to speed it up significantly by using a custom RegionOfInterest::xcoord with the number of dimensions hardcoded to 2. Shouldn't we drop all dimension checks in IDetector2D and remove the for-loops over the number of axes?
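To illustrate the suggestion above, here is a minimal sketch contrasting a generic bin-index lookup that loops over the axes with a version hardcoded for two dimensions. All names (axisBinIndexGeneric, Shape2D, xBin, yBin) are hypothetical and are not taken from the BornAgain code base. The sketch also assumes a row-major layout in which the last axis varies fastest, which may not match the actual detector layout.

```cpp
#include <cstddef>
#include <vector>

// Generic version (hypothetical): works for any number of axes,
// at the cost of a loop and a branch per axis.
std::size_t axisBinIndexGeneric(std::size_t global_index,
                                std::size_t selected_axis,
                                const std::vector<std::size_t>& axis_sizes)
{
    std::size_t remainder = global_index;
    // Walk from the fastest-varying (last) axis to the slowest (first).
    for (std::size_t i = axis_sizes.size(); i-- > 0;) {
        const std::size_t bin = remainder % axis_sizes[i];
        if (i == selected_axis)
            return bin;
        remainder /= axis_sizes[i];
    }
    return 0; // selected_axis was out of range
}

// Hardcoded 2D version (hypothetical): no loop, no dimension check.
// Assumes global_index = x * ny + y.
struct Shape2D {
    std::size_t nx;
    std::size_t ny;
};

inline std::size_t xBin(std::size_t global_index, const Shape2D& shape)
{
    return global_index / shape.ny;
}

inline std::size_t yBin(std::size_t global_index, const Shape2D& shape)
{
    return global_index % shape.ny;
}
```

Both versions return the same bins for a 2D detector; the hardcoded one just removes the per-call loop and vector access. Whether that is measurable in the real code would have to be checked with the profiler.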

  • Investigate the memory layout of the vector of SimulationElement objects.

Do we need to add some padding?
Do we have to reorder SimulationElements in a vector for better performance?
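As a rough illustration of what the padding question involves, the sketch below compares a plain per-pixel record with one padded to an assumed 64-byte cache line. PixelRecord, PaddedPixelRecord and their fields are hypothetical stand-ins; the real SimulationElement holds different data. The cache line size is also an assumption and depends on the target CPU.

```cpp
#include <cstddef>

// Hypothetical stand-in for a per-pixel record (not the real SimulationElement).
struct PixelRecord {
    double wavelength;
    double alpha_i;
    double phi_i;
    double intensity;
};

// Same fields, aligned and padded to an assumed 64-byte cache line,
// so that two neighbouring elements never share a line.
struct alignas(64) PaddedPixelRecord {
    double wavelength;
    double alpha_i;
    double phi_i;
    double intensity;
};

static_assert(sizeof(PixelRecord) == 32, "four doubles, no padding");
static_assert(sizeof(PaddedPixelRecord) == 64, "padded up to the alignment");

// Memory needed to store n elements of type T contiguously.
template <class T>
constexpr std::size_t footprintBytes(std::size_t n)
{
    return n * sizeof(T);
}

// For a 2048x2048 detector the padded variant needs twice the memory.
constexpr std::size_t kPixels = 2048u * 2048u;
static_assert(footprintBytes<PixelRecord>(kPixels) == 134217728u, "128 MiB");
static_assert(footprintBytes<PaddedPixelRecord>(kPixels) == 268435456u, "256 MiB");
```

The trade-off is visible in the numbers: padding can avoid cache-line sharing between threads working on adjacent elements, but it doubles the footprint in this example, so it would only pay off if profiling shows contention.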

ProfilingData.zip - Profiler output (can be extracted and viewed in kcachegrind) (39.8 MB) dmitry, 18 Dec 2017 10:18

valgrind.log - Profiling log (288 KB) dmitry, 18 Dec 2017 10:19

Related issues

Copied from BornAgain - Feature #1818: Investigate simulation performance in the case of large detectors (Resolved, 14 Jun 2017)


#1 Updated by herck almost 3 years ago

  • Copied from Feature #1818: Investigate simulation performance in the case of large detectors added

#2 Updated by pospelov over 2 years ago

  • Target version changed from Sprint 35 to Sprint 36

#3 Updated by dmitry over 2 years ago

  • Assignee set to dmitry

#4 Updated by dmitry over 2 years ago

From profiler output (ReleaseWithDebugInfo mode, 2048x2048 detector):
Creation of simulation elements (first 11 dumps):
  • Creating detector pixels - 21 % of time
  • Exponentiation (ieee754_pow_sse2) - 18 % of time
  • Allocating memory for simulation elements - 13 % of time
Running simulation on large detector (dumps 12 - 627):
  • 30 % of time is spent in MagneticMaterialImpl::polarizedSubtrSLD; half of that time is spent in MaterialUtils::MagnetizationCorrection.
    MagneticMaterialImpl::polarizedSubtrSLD is called 20 times per pixel.
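
To put the call count in perspective, a small worked calculation: it multiplies the detector size by the calls per pixel reported above. It assumes every pixel of the 2048x2048 detector is simulated (no masks or region of interest); the constant names are made up for this sketch.

```cpp
#include <cstdint>

// Inputs taken from the profiling note above.
constexpr std::uint64_t kDetectorSide = 2048; // pixels per detector side
constexpr std::uint64_t kCallsPerPixel = 20;  // polarizedSubtrSLD calls per pixel

// Assumes all pixels are simulated (no masks, no region of interest).
constexpr std::uint64_t kPixelCount = kDetectorSide * kDetectorSide;
constexpr std::uint64_t kTotalCalls = kPixelCount * kCallsPerPixel;

static_assert(kPixelCount == 4194304ULL, "2048 * 2048 pixels");
static_assert(kTotalCalls == 83886080ULL, "about 84 million calls per simulation");
```

At roughly 84 million calls per run, even a small per-call cost adds up to a noticeable share of the run time.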

#5 Updated by dmitry over 2 years ago

  • Status changed from Sprint to Rejected

Considering the possible future use of GPU computations, not much can be done for this issue. After refactoring, the creation of simulation elements has become 15 % faster, but the total time of simulations on large detectors remains more or less the same.
