We aim at exploiting parallelism in shared-memory multiprocessing systems, in order to speed up the execution time with as small redundancy in work as possible, for an elementary task that comes up frequently as a subroutine in the daily maintenance of large-scale time-dependent graphs representing real-world relationships or technological networks: the many-to-all time-dependent shortest paths (MATDSP) problem. MATDSP requires the computation of one time-dependent shortest-path tree (TDSPT) per origin-vertex and departure-time, from an arbitrary collection of pairs of origins and departure-times, towards all reachable destinations in the graph.

Our goal is to explore the potential and highlight the limitations of amorphous data parallelism, when dealing with MATDSP in multicore computing environments with a given amount of processing elements and a shared memory to exploit. Apart from speeding-up execution time, consumption of resources (and energy) is also critical. Therefore, we aim at limiting the work overhead for solving a MATDSP instance, as measured by the overall number of arc relaxations in shortest-path computations, while trying to minimize the overall execution time. Towards this direction, we provide several algorithmic engineering interventions for solving MATDSP concerning: (i) the compact representation of the instance; (ii) the choice and the improvement of the time-dependent single-source shortest path algorithm that is used as a subroutine; (iii) the way according to which the overall work is allocated to the processing elements; (iv) the adoption of the amorphous data parallelism rationale, in order to avoid costly synchronization among the processing elements while doing their own part of the work.

Our experimental evaluations, both on real-world and on synthetic benchmark instances of time-dependent road networks, provide insight how one should organize heavy MATDSP computations, depending on the application scenario. This insight is in some cases rather unexpected. For instance, it is not always the case that pure data parallelism (among otherwise totally independent processors) is the best choice for minimizing execution times. In certain cases it may be worthwhile to limit the level of data parallelism in favor of algorithmic parallelism, in order to achieve more efficient MATDSP computations.