When an application accesses main memory, it implicitly uses both the various processor caches and the disk swapping mechanism of the operating system's virtual memory manager. Both the processor caches and the virtual memory manager process data block-wise, so software is faster when the code and the data used by a single command are contained in few memory blocks. The principle that the data and the code processed by a command should reside in nearby regions of memory is called locality of reference.

This principle becomes even more important for performance in multi-threaded applications on multi-core systems: if several threads running on different cores access the same cache block, the contention causes a performance degradation.

This section proposes techniques to optimize the usage of the processor caches and of the virtual memory by increasing the locality of reference of code and data.

Put all the function definitions belonging to the same bottleneck near each other in the same compilation unit.
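To make locality of reference for data concrete, here is a minimal sketch, not taken from the original text. It assumes a matrix stored row by row in one contiguous `std::vector<int>`; the function names `sum_row_major` and `sum_column_major` are hypothetical and chosen only for this illustration. Both functions compute the same sum, but the first visits elements in the order they lie in memory, while the second jumps `cols` elements at every step and so touches a different memory block much more often.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sums a rows x cols matrix stored row by row.
// The inner loop walks adjacent elements, so consecutive accesses
// tend to fall in the same cache block (good locality of reference).
long long sum_row_major(const std::vector<int>& m,
                        std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            total += m[r * cols + c];
        }
    }
    return total;
}

// Hypothetical helper: same result, but the inner loop strides by
// `cols` elements, so consecutive accesses are far apart in memory
// (poor locality of reference for large matrices).
long long sum_column_major(const std::vector<int>& m,
                           std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long total = 0;
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) {
            total += m[r * cols + c];
        }
    }
    return total;
}
```

For matrices much larger than the processor caches, the row-major version is usually the faster of the two on typical hardware, but the size of the difference depends on the machine, the compiler, and the matrix dimensions, so it should be measured rather than assumed.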