As today’s state-of-the-art signal processing systems often require heterogeneous computing and special-purpose accelerators to offer highly efficient performance for mixed application workloads, including not only traditional signal processing algorithms, but also the demands to enable smart applications with data analytics, machine learning, as well as the capability interacting with both physical and cyber worlds via sensors and networks. Thus, the complexity of such systems has been increasing, and the focus of designing has been shifting to exploring the design space with a mixture of processing cores/accelerators and the interconnection networks between the components to optimize the performance and efficiency at the system level. Traditional simulation tools may offer accurate performance estimation at micro architectural level, but it is highly complicated to combine the simulators for various components to perform complex applications, and they fall in short in terms of their capabilities to profiling application workload. Furthermore, the speed of such complex simulation would be unacceptably slow with traditional system-level simulation framework such as SystemC. To solve the problem, we develop a rapid hybrid emulation/simulation framework that allows the user to execute full-blown system and application software and plug in emulators, simulators, and timing models for various components in the prototype system, switching the timing models dynamically with our just-in-time model selection mechanism, and connect the emulated/simulated components with scalable communication channels, so that the framework can be accelerated effectively by a multicore host. Our just-in-time model selection mechanism is capable of detecting and skipping regular program patterns to save the simulation time dramatically. In addition, our framework is capable of estimating the performance of different system configurations with concurrent multiple timing models, which further saves the time needed for traversing the design space. Our experimental results have shown that our dynamic model selection and multi-model approach collectively can speed up the design space exploration by 13.4 times on a quad-core host for cache simulation.
All Science Journal Classification (ASJC) codes