Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
126 tokens/sec
GPT-4o
28 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Decoupled Access-Execute on ARM big.LITTLE (1701.05478v1)

Published 13 Jan 2017 in cs.DC

Abstract: Energy-efficiency plays a significant role given the battery lifetime constraints in embedded systems and hand-held devices. In this work we target the ARM big.LITTLE, a heterogeneous platform that is dominant in the mobile and embedded market, which allows code to run transparently on different microarchitectures with individual energy and performance characteristics. It allows to se more energy efficient cores to conserve power during simple tasks and idle times and switch over to faster, more power hungry cores when performance is needed. This proposal explores the power-savings and the performance gains that can be achieved by utilizing the ARM big.LITTLE core in combination with Decoupled Access-Execute (DAE). DAE is a compiler technique that splits code regions into two distinct phases: a memory-bound Access phase and a compute-bound Execute phase. By scheduling the memory-bound phase on the LITTLE core, and the compute-bound phase on the big core, we conserve energy while caching data from main memory and perform computations at maximum performance. Our preliminary findings show that applying DAE on ARM big.LITTLE has potential. By prefetching data in Access we can achieve an IPC improvement of up to 37% in the Execute phase, and manage to shift more than half of the program runtime to the LITTLE core. We also provide insight into advantages and disadvantages of our approach, present preliminary results and discuss potential solutions to overcome locking overhead.

Citations (4)

Summary

We haven't generated a summary for this paper yet.