Multicore processor provides large computation capability but also involves the complicate parallel programming. One of major considerations in parallel programming is the performance. Traditional design methodologies which usually start a design on a selected platform spend a lot of effort and time on tuning performance and debugging. When platform is changed even with different number of cores, considerable redesign effort is required. Hence a flexible design methodology is necessary. In this paper, a design methodology is presented for video codec, by using MPEG-4 SP decoder as an example, on multicore processor. The parallelisms of MPEG-4 decoder are discussed and exposed with the dataflow model. The dataflow model provides a high-level abstraction of underlying hardware. Computation and communication of MPEG-4 decoder are separated and represented as modules and channels, respectively. It is possible to synthesize the model targeting to either dedicate hardware or software on multiprocessor. To map the high level dataflow model to Cell processor, the mapping flow, including offline profiling, task allocation and runtime libraries, are developed. According to the profiling results, the allocation algorithm could allocate tasks on multiprocessors as balanced as possible. An efficient synchronization mechanism on Cell processor is also proposed. We also discuss the impact of the model and the mapping flow corresponding to decoding speed. The results show that the proposed methodology gets considerable performance boost when the number of cores is increased.