“STEER: Asymmetry-aware Energy Efficient Task Scheduler for Cluster-based Multicore Architectures”, by Jing Chen, Madhavan Manivannan, Bhavishya Goel, Mustafa Abduljabbar, and Miquel Pericàs, to be presented at the IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2022).

Reducing the energy consumption of parallel applications is becoming increasingly important. Current chip multiprocessors (CMPs) incorporate asymmetric cores (i.e. static asymmetry) and DVFS (i.e. dynamic asymmetry) to enable energy efficient execution. To reduce cost and complexity, designs typically organize asymmetric cores into core-clusters supporting the same DVFS setting across cores in a cluster. Recent approaches that focus on energy efficient […]

“ERASE: Energy Efficient Task Mapping and Resource Management for Work Stealing Runtimes”, by Jing Chen, Madhavan Manivannan, Mustafa Abduljabbar, and Miquel Pericàs, published in ACM Transactions on Architecture and Code Optimization (TACO), vol. 19, no. 2, Article 27

Parallel applications often rely on work stealing schedulers in combination with fine-grained tasking to achieve high performance and scalability. However, reducing the total energy consumption in the context of work stealing runtimes is still challenging, particularly when using asymmetric architectures with different types of CPU cores. A common approach for energy savings involves dynamic voltage and frequency scaling (DVFS) wherein […]

eProcessor Presented at HPCA Workshop

We are happy that our paper, “Accelerating the Wavefront Alignment Algorithm on CPUs, GPUs and FPGAs” by Miquel Moreto and Santiago Marco-Sola, was presented at the 4th HPCA Workshop on Accelerator Architecture in Computational Biology and Bioinformatics (https://aacbb-workshop.github.io/).

“FastTrackNoC: A NoC with FastTrack Router Datapaths”, published in the 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA)

This paper introduces FastTrackNoC, a Network-on-Chip (NoC) router architecture that reduces packet latency by bypassing its switch traversal (ST) stage. It is based on the observation that there is a bias in the direction a flit takes through a router, e.g., in a 2D mesh network, non-turning hops are preferred, especially when dimension order routing is used. FastTrackNoC capitalizes on […]

The RISC-V IOMMU of eProcessor in a Nutshell

In today’s high-performance computer systems, particularly resource-intensive ones such as servers, the I/O transactions that read data from a hard disk or a network card constitute a significant part of the overall workload, which makes them essential to overall system performance. Most of today’s peripheral devices bypass the processor to minimize the […]

eProcessor cache coherence in a Nutshell

Multi-core processors have become ubiquitous across domains ranging from embedded systems to data centers because of their ability to facilitate energy-efficient, high-performance computing. The architecture of a typical multi-core processor mainly comprises cores that are the computational workhorses, caches that act as buffers to reduce memory access latency, and an interconnection network that connects all the on-chip […]
