Instruction Cache Compression

Code compression could lead to less overall system die area and therefore less cost. This is significant in the embedded system field where cost is very sensitive. In most of the recent approaches for code compression, only instruction ROM is compressed. Decompression is done between the cache and memory, and instruction cache is kept uncompressed. Additional saving could be achieved if the decompression unit is moved to between CPU and cache, and keeps the instruction cache compressed as well. In this paper, we explored the issues for compressing the instruction cache. In the process, we discuss a high level implementation for a 64-bit, 5-stage pipeline MIPS like processor with compressed instruction cache. The discussion is carried with the help of a compression algorithm with instruction level random access within the compressed file. In addition we present a branch compensation cache, a small cache mechanism to alleviate the unique branching penalties that branch prediction cannot reduce.

Code compression is a general technique. The most commonly used algorithms for block compression are variations of Huffman and arithmetic that compresses serially within the block, byte wise. A feature of Huffman coding is how the variable length codes can be packed together. A more sophisticated version of the Huffman approach is called arithmetic encoding. In this scheme, sequences of characters are represented by individual codes, according to their probability of occurrence. This has the advantage of better data compression, say 5-10%. Run-length encoding followed by either Huffman or arithmetic encoding is also a common strategy.

In this paper we modify the Code Compressed RISC Processor (CCRP) algorithm to compress the instruction cache. The original CCRP algorithm faces the problem of additional stages of DEC and CLB for single 64-bit instruction and granularity of random access. The modified algorithm can solve the above two problems. But branch penalty is another problem encountered by the modified algorithm. In order to avoid this, we suggest a method called Branch Compensation Cache.