
SCAN chain based clock gating for low power video codec design

R. VENKATASUBRAMANIAN, M.E. [Applied Electronics], M.N.M Jain Engineering College, Chennai-600097, [email protected]

C. KARTHIKEYAN, Assistant Professor, M.N.M Jain Engineering College, Chennai-600097, [email protected]

Abstract - This paper describes an efficient video codec design methodology for reducing power consumption through the use of the RTL clock gating feature in the Quartus Power Compiler. It presents an efficient low power VLSI architecture for motion estimation that supports multiple video coding standards, including H.264 BP/MP/HP, SVC, MVC, AVS, and VC-1. Advanced standards such as H.264 MP/HP, SVC, and MVC adopt Macroblock Adaptive Frame Field (MBAFF) coding, which complicates motion estimation, resulting in poor performance and forcing motion estimation to operate at lower data rates and speeds. To the best of our knowledge, this design challenge has not been discussed in previous works. Therefore, we develop a cross diamond search algorithm to manipulate motion prediction vectors that provides a higher compression ratio.

Index Terms - Block Matching Algorithm, Motion Estimation and Motion Compensation, Diamond Search Algorithm, Three Step Diamond Search Algorithm.

INTRODUCTION

Digital video compression techniques have played an important role in the world of telecommunication and multimedia systems, where bandwidth is still a valuable commodity. Hence, video compression techniques are of prime importance for reducing the amount of information needed to represent a picture sequence without losing much of its quality, as judged by human viewers. Modern compression techniques involve very complex electronic circuits, and their cost can only be kept to an acceptable level by high volume production of LSI chips. This means that the techniques of video compression have to be standardized. The history of compression begins in the 1960s. An analogue videophone system had been tried out in the 1960s, but it required a wide bandwidth and the postcard-size black-and-white pictures it produced did not add appreciably to voice communication. In the 1970s, it was realized that visual speaker identification could substantially improve a multiparty discussion, and videoconference services were considered. Interest increased with improvements in picture quality and digital coding. With the technology available in the 1980s, the COST211 video codec (encoder/decoder), based on differential pulse code modulation (DPCM; pulse code modulation is still used in CD audio files, which is why they are called PCM/.wav files), was standardized by the CCITT under the H.120 standard. This codec's target bit rate was 2 Mbit/s for Europe and 1.544 Mbit/s for North America, suitable for their respective first levels of digital hierarchy. However, although the image quality had very good spatial resolution (due to the nature of DPCM working on a pixel-by-pixel basis), it had very poor temporal quality. It was soon realized that in order to improve the image quality without exceeding the target bit rate, less than one bit should be used to code each pixel. This was only possible if a group of pixels (a "block") were coded together, such that the bits per pixel could be fractional. This led to the design of so-called block-based codecs.

During the late 1980s study period, of the 15 block-based videoconferencing proposals submitted to the ITU-T (formerly the CCITT), 14 were based on the Discrete Cosine Transform (DCT) and only one on Vector Quantization (VQ). The subjective quality of the video sequences presented to the panel showed hardly any significant differences between the two coding techniques. In parallel to the ITU-T's investigation during 1984-88, the Joint Photographic Experts Group (JPEG) was also interested in the compression of static images. They chose the DCT as the main unit of compression, mainly because of the possibility of progressive image transmission. JPEG's decision undoubtedly influenced the ITU-T in favouring DCT over VQ. By then there was worldwide activity in implementing the DCT in chips and on DSPs. In the late 1980s it was clear that the recommended ITU-T videoconferencing codec would use a combination of interframe DPCM, for minimum coding delay, and the DCT. The codec showed greatly improved picture quality over H.120. In fact, the image quality for videoconferencing applications was found to be reasonable at 384 kbit/s or higher, and good quality was possible at significantly higher bit rates of around 1 Mbit/s. This effort was later extended to systems based on multiples of 64 kbit/s (up to 30 multiples of this value). The standard definition was completed in late 1989; it is officially called the H.261 standard, and the coding method is referred to as the "p x 64" method (p is an integer between 1 and 30). In the early 1990s, the Motion Picture Experts Group (MPEG) started investigating coding techniques for the storage of video, for example on CD-ROMs. The aim was to develop a video codec capable of compressing highly active video such as movies, on hard disks, with a performance comparable to VHS quality. In fact, the basic framework of the H.261 generation of MPEG, called the MPEG-1 standard, was capable of accomplishing this task at 1.5 Mbit/s. Since encoding and decoding delays are not a major constraint for the storage of video, one can trade delay for compression efficiency. For example, in the temporal domain a DCT might be used rather than DPCM, or DPCM might be used with much improved motion estimation such that motion compensation removes the temporal correlation. The latter option was adopted in MPEG-1.

These days, MPEG-1 decoders/players are becoming commonplace for multimedia on computers. MPEG-1 decoder plug-in hardware boards (e.g. MPEG magic cards) have been around for a while, and software MPEG-1 decoders are now available with the release of operating systems or multimedia extensions for PC and Mac platforms. Since in all standard video codecs only the decoders have to comply with the proper syntax, software-based encoding has added extra flexibility that might even improve the performance of MPEG-1 in the future.

MPEG-1 was originally optimized for typical applications using non-interlaced video at 25 frames per second (fps) in the European format and 29.97 fps in the North American format, in the range of 1.2 to 1.5 Mbit/s, for image quality comparable to home VCRs; it can certainly be used at higher bit rates and resolutions. Early versions of MPEG-1 for interlaced video, such as those used in broadcast, were called MPEG-1+. A new generation of MPEG, called MPEG-2, was soon adopted by broadcasters (who were initially reluctant to use any compression on video!). MPEG-2 codes interlaced video at bit rates of 4-9 Mbit/s and is now well on its way to making a significant impact in a range of applications such as digital terrestrial broadcasting, digital satellite TV, digital cable TV, digital versatile disc (DVD) and many others. Television broadcasters have been using MPEG-2 coded digital formats since the late 90s. A slightly improved version of MPEG-2, called MPEG-3, was to be used for the coding of High Definition (HD) TV, but since MPEG-2 could itself achieve this, the MPEG-3 standards were folded into MPEG-2. It is foreseen that by 2014, the existing transmission of the NTSC North American format will cease and instead HDTV with MPEG-2 compression will be used in terrestrial broadcasting.

III. PROPOSED SYSTEM

VLSI plays a vital role in aeronautics, the medical field, satellite communications, automation, etc. Owing to its reliability, reconfigurability, and low power consumption, we design a high efficiency data access system architecture for a deblocking filter supporting multiple video coding standards. As mentioned earlier, deblocking filtering plays a key role in video codecs such as MPEG-4, H.264/AVC, etc. To handle this bottleneck efficiently, we propose a prediction data management (PDM) scheme to process frame pixels effectively. We implement a buffer between the deblocking filter and the motion compensation block so that each has adequate time to process the data. The second buffer is the output frame buffer (OFB). To resolve the write-out data problem mentioned in Fig. 2, we use an output frame module in the PDM scheme. This buffer collects the filtered data into a completed MB for writing out.
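As a rough illustration of how the output frame buffer described above can collect filtered data into a complete macroblock before write-out, the following Python sketch models that behaviour. It is not the paper's RTL; the 4x4 sub-block size, the class and method names, and the write_out callback are illustrative assumptions.

```python
import numpy as np

class OutputFrameBuffer:
    """Sketch of the output frame buffer (OFB): accumulate filtered
    4x4 sub-blocks until a whole 16x16 macroblock is assembled, then
    hand it to the external write-out routine in one burst.
    Sub-block size and the write_out callback are assumptions."""

    def __init__(self, write_out, mb=16, sub=4):
        self.write_out = write_out            # e.g. a DRAM burst-write routine
        self.mb, self.sub = mb, sub
        self.buf = np.zeros((mb, mb), dtype=np.uint8)
        self.filled = 0                       # number of sub-blocks collected

    def push(self, row, col, sub_block):
        """Store one filtered sub-block at (row, col) within the macroblock."""
        self.buf[row:row + self.sub, col:col + self.sub] = sub_block
        self.filled += 1
        if self.filled == (self.mb // self.sub) ** 2:   # macroblock complete
            self.write_out(self.buf.copy())              # write out as one MB
            self.filled = 0

# Usage sketch: ofb = OutputFrameBuffer(write_out=lambda mb: print(mb.shape))
```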

A. Block Diagram of the Proposed Architecture

I. SCOPE OF THE PROJECT

Due to the great innovation in display and information technology, the demand for data capacity has drastically increased in daily life. Data compression techniques are extensively applied to offer an acceptable solution to this scenario. Some images, such as satellite or medical images, have very high resolution; such images have large file sizes, and the computation time required to process them is high. Hence the compression of images and video has become the need of the hour. An image can be compressed using lossy or lossless compression techniques. In lossy image compression, the reconstructed image is not exactly the same as the original image. Lossless image compression removes redundant information and guarantees that the reconstructed image is reproduced without any loss relative to the original. Different image compression techniques have been suggested, but a technique with high compression and low loss is always preferred. Data compression is the key to such fast and efficient communication.

Figure 4.1: Block diagram of the proposed system

B. PROJECT DESCRIPTION
The proposed DS block motion estimation employs two search patterns, as illustrated in the figure, which are derived from the crosses in Fig. 1. The first pattern, called the large diamond search pattern (LDSP), comprises nine checking points, of which eight surround the center one to compose a diamond shape. The second pattern, consisting of five checking points, forms a smaller diamond shape and is called the small diamond search pattern (SDSP). In the searching procedure of the DS algorithm, the LDSP is used repeatedly until the step in which the minimum block distortion (MBD) occurs at the center point. The search pattern is then switched from LDSP to SDSP for the final search stage. Among the five checking points in the SDSP, the position yielding the MBD provides the motion vector of the best matching block.
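For clarity, a minimal software sketch of the DS procedure described above is given below. It is an algorithmic illustration rather than the paper's hardware design; the function name diamond_search, the cost callable (any block-distortion measure such as SAD evaluated at a candidate displacement), and the search_range bound are assumptions introduced here.

```python
# Minimal sketch of the diamond search (DS) procedure, assuming a
# block-distortion function `cost(dx, dy)` and a +/-search_range window.

LDSP = [(0, 0), (0, 2), (0, -2), (2, 0), (-2, 0),
        (1, 1), (1, -1), (-1, 1), (-1, -1)]        # large diamond: 9 points
SDSP = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]  # small diamond: 5 points

def diamond_search(cost, search_range=7):
    cx, cy = 0, 0                                  # center of the current LDSP
    while True:
        # Evaluate the LDSP centered at (cx, cy), clipped to the window.
        candidates = [(cx + dx, cy + dy) for dx, dy in LDSP
                      if abs(cx + dx) <= search_range and abs(cy + dy) <= search_range]
        best = min(candidates, key=lambda p: cost(*p))
        if best == (cx, cy):                       # MBD at the center: switch to SDSP
            break
        cx, cy = best                              # re-center the LDSP on the MBD point
    candidates = [(cx + dx, cy + dy) for dx, dy in SDSP
                  if abs(cx + dx) <= search_range and abs(cy + dy) <= search_range]
    return min(candidates, key=lambda p: cost(*p)) # final motion vector
```

A practical implementation would also skip points already evaluated in the previous LDSP placement, as noted later in the discussion of overlapping checking points.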

II. EXISTING SYSTEM

Several low power technologies exist to save both total energy and peak power consumption in video codec VLSI designs. Existing technologies include two-layer clock gating, utilization of skip mode, a memory hierarchy with memory access units, and internal memory partitioning. We implement a standby mode to reduce both clock power and memory power, saving total energy. We also use a technique called active period rescheduling to reduce the peak power consumption. Applying these technologies to an H.264/AVC codec, we achieve 43% and 45% savings in total energy for the encoder and decoder, respectively. Moreover, by applying the peak power reduction technique to the H.264/AVC decoder, we achieve a 45% saving in peak power. Video streaming, however, creates a bottleneck at the input of the decoder, which is the main drawback of current video technologies.

C. BLOCK MOTION MATCHING


In video coding, block-based motion estimation plays a vital role in video compression. Several block matching algorithms exist for motion estimation and motion compensation. In this paper, a three step diamond search algorithm is proposed. The performance of this algorithm is compared with other algorithms by means of error metrics and the number of search points. The algorithm achieves performance close to that of TSS while using fewer search points. Compared with the original DS algorithm, it requires less computation time and gives improved performance.

Block diagram of the existing hardware architecture for VLSI-oriented FELICS algorithm

Block-based motion estimation is one of the major components in video compression algorithms and standards. The objective of motion estimation is to reduce the temporal redundancy between frames in a video sequence and thus achieve better compression. In block-based motion estimation, each frame is divided into a group of equally sized macroblocks, and the goal is to find the macroblock in the reference frame that best matches the macroblock being encoded in the current frame. Once the best match is located, only the difference between the two macroblocks and the motion vector information are compressed. The most commonly used matching criterion is the sum of absolute differences (SAD), chosen for its simplicity and ease of hardware implementation. For an M x N block, where S_l(x, y) is the pixel value of frame l at relative position (x, y) from the macroblock origin and V_i = (dx, dy) is the displacement vector, SAD can be computed as

SAD(dx, dy) = sum_{x=0}^{M-1} sum_{y=0}^{N-1} | S_l(x, y) - S_{l-1}(x + dx, y + dy) |          (4.1)
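To make the matching criterion concrete, here is a minimal NumPy sketch of the SAD computation of equation (4.1). It is an illustrative software model, not the hardware SAD unit; the function name sad and the frame/coordinate arguments are assumptions.

```python
import numpy as np

def sad(cur_frame, ref_frame, x0, y0, dx, dy, M=16, N=16):
    """Sum of Absolute Differences between the current macroblock with
    origin (x0, y0) and the reference block displaced by the candidate
    motion vector (dx, dy); frames are indexed as frame[row, col]."""
    cur_block = cur_frame[y0:y0 + N, x0:x0 + M].astype(np.int32)
    ref_block = ref_frame[y0 + dy:y0 + dy + N,
                          x0 + dx:x0 + dx + M].astype(np.int32)
    return int(np.abs(cur_block - ref_block).sum())
```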

The search speed and the performance of an algorithm are determined by the shape and size of its search patterns. The TSS and NTSS algorithms use a square-shaped pattern, whereas the diamond search algorithm uses a diamond shape. The DS algorithm uses an unrestricted center-biased searching concept, which makes it computationally inefficient. In this paper, a three step diamond search algorithm is proposed to attain a computationally efficient search with reasonable distortion performance.

E. DIAMOND SEARCH ALGORITHM


The shape of the search pattern has an impact on the performance of the algorithm. Fast block matching algorithms such as TSS and NTSS have a square-shaped search pattern and provide reasonable performance. The distribution of global minimum points is concentrated at the center of the search window. A center-biased NTSS is used to achieve better performance than TSS, but it loses regularity and simplicity. The diamond search algorithm provides better performance than the TSS and NTSS algorithms. The DS algorithm uses a diamond-shaped pattern with nine search points: the center point, four points located at the corners, and another four points located at the midpoints of the edges of the diamond. This algorithm uses an unrestricted center-biased searching process. The diamond search employs a large diamond search pattern (LDSP) [Fig 5.1(a)] and a small diamond search pattern (SDSP) [Fig 5.1(b)].

There is a wide range of block matching algorithms (BMAs). Full or exhaustive search (FS) is one in which the block matching process is performed for every possible macroblock position in the search window. Full search is computationally expensive but very regular and easy to implement in hardware. Other block matching algorithms apply fast search techniques such as three-step search (TSS), hierarchical BMA, diamond search, hexagon search, and the simplex search (SS) algorithm. Recently, the FTS algorithm was introduced, and it was shown that its performance compares very well with other fast BMAs. In these fast algorithms, only selected subsets of the search positions are evaluated using SAD. As a result, these algorithms usually produce sub-optimal solutions, but the computational saving over FS is significant. When it comes to hardware implementation, on the other hand, the number of SAD calculations is not the only criterion for the choice of a motion estimation algorithm. Other criteria, such as algorithm regularity, suitability for pipelining and parallelism, computational complexity, and the number of gates, which directly affect the power consumption and cost of hardware, are also very important. For these reasons, there have been several implementations of the full search and hierarchical search, which are very regular.
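For reference, a sketch of the full (exhaustive) search baseline mentioned above is shown below; it evaluates every candidate displacement in the window with a supplied distortion function. The name full_search and the cost callable are illustrative assumptions.

```python
def full_search(cost, search_range=7):
    """Exhaustive (full) search: evaluate every displacement in the
    window and return the one with minimum distortion. Regular and
    simple, but costs (2 * search_range + 1) ** 2 evaluations per block."""
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            c = cost(dx, dy)
            if c < best_cost:
                best_mv, best_cost = (dx, dy), c
    return best_mv
```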


D. MOTION ESTIMATION
Video coding is an important process in many multimedia applications. In addition to spatial redundancy, temporal redundancy plays a vital role in the transmission of video frames. Motion estimation is a technique used to reduce the temporal redundancy. It uses the correlation between successive frames to predict the content of frames. In the motion estimation process, the frame is divided into a number of non-overlapping areas known as blocks, each typically with a standard size of 16x16. The difference between the current frame and the predicted frame content is calculated in motion estimation. In addition to motion estimation, some additional information is also needed to indicate any changes in the prediction process; this is known as motion compensation. Motion estimation and motion compensation algorithms are used to exploit the strong temporal redundancy between frames. The full search block matching algorithm (FSBMA) provides better performance but uses a larger number of search points, so there is a trade-off between the efficiency of an algorithm and the quality of the predicted image. Suboptimal algorithms are used for this purpose; they are computationally more efficient but do not give as good a quality as FSBMA. The suboptimal algorithms used in video transmission include the three step search (TSS), the new three step search (NTSS), the diamond search (DS), and the like.
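The block-based motion estimation and compensation flow described above can be sketched as follows; it reuses the sad(...) and full_search(...) sketches defined earlier and assumes 16x16 macroblocks. The names and the use of full search per block (chosen here only for brevity) are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def estimate_and_compensate(cur_frame, ref_frame, block=16, search_range=7):
    """Partition the current frame into 16x16 macroblocks, estimate one
    motion vector per block, and build the motion-compensated prediction
    and residual (the data that would actually be coded)."""
    H, W = cur_frame.shape
    prediction = np.zeros_like(cur_frame)
    vectors = {}
    for y0 in range(0, H - block + 1, block):
        for x0 in range(0, W - block + 1, block):
            def cost(dx, dy):
                # Candidates falling outside the reference frame are rejected.
                if not (0 <= y0 + dy <= H - block and 0 <= x0 + dx <= W - block):
                    return float("inf")
                return sad(cur_frame, ref_frame, x0, y0, dx, dy, block, block)
            dx, dy = full_search(cost, search_range)
            vectors[(x0, y0)] = (dx, dy)
            prediction[y0:y0 + block, x0:x0 + block] = \
                ref_frame[y0 + dy:y0 + dy + block, x0 + dx:x0 + dx + block]
    residual = cur_frame.astype(np.int32) - prediction.astype(np.int32)
    return vectors, residual
```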

Figure 5.1: (a) Large Diamond Search Pattern; (b) Small Diamond Search Pattern

As some of the search points in the newly formed LDSP overlap, only the non-overlapping points need to be evaluated. This greatly reduces the number of search points compared with other existing fast search algorithms. The search therefore evaluates five new points in the new LDSP if the MBD point is a corner point [Fig 5.2(a)] and three new points if the MBD point is at the edge of the pattern [Fig 5.2(b)]. The LDSP is used until the center point becomes the Minimum Block Distortion (MBD) point. Once the MBD point is at the center, the search switches to the SDSP, which uses four additional checking points [Fig 5.2(c)].


IV. SOFTWARE DESCRIPTION

Figure 5.2: (a) MBD point at a corner; (b) MBD point at an edge; (c) MBD point at the center

The MBD points thus obtained give the motion vector. The DS algorithm reduces the susceptibility of getting stuck at local minima due to its compact shape and relatively large step sizes in the horizontal and vertical directions. Thus the DS algorithm gives faster processing with distortion performance similar to the other fast search algorithms. However, an increase in the number of steps leads to more search points, which affects the speed of the algorithm and makes it computationally inefficient. A three step diamond search (TSDS) is proposed to overcome this disadvantage.

The first digital circuits were designed using electronic components such as vacuum tubes and transistors. Later, Integrated Circuits (ICs) were invented, in which a designer could place a digital circuit of fewer than 10 gates on a chip; this scale is called SSI (Small Scale Integration). With the advent of new fabrication techniques, designers could place more than 100 gates on an IC, a scale called MSI (Medium Scale Integration). Designing at the next level, one could create digital sub-blocks (adders, multiplexers, counters, registers, etc.) on an IC; this level is LSI (Large Scale Integration). Using this scale of integration, people succeeded in making digital subsystems (microprocessors, I/O peripheral devices, etc.) on a chip. With the advent of new technology, i.e., the CMOS (Complementary Metal Oxide Semiconductor) process technology, one can fabricate a chip containing more than a million gates.

A. TOOLS USED

MODELSIM


ModelSim is a verification and simulation tool for VHDL, Verilog, SystemVerilog, and mixed-language designs. The ModelSim simulation environment is organized around four topics: the basic simulation flow, the project flow, the multiple library flow, and debugging tools.

F. THREE STEP DIAMOND SEARCH ALGORITHM


The proposed TSDS algorithm uses the same type of patterns used in the DS algorithm but with a reduced step size. Based on the location of the MBD point, the number of checking points used in the successive steps varies. The number of searching steps is reduced to three, and the SDSP search is reached at the third step regardless of the location of the MBD point.

BASIC SIMULATION FLOW

The basic simulation flow consists of four steps: create a working library, compile the design files, load and run the simulator, and debug the results.

Figure 5.3: Example of the TSDS algorithm

The LDSP is used repeatedly until the center point becomes the MBD point. The compact configuration and reduced number of search points provide improved performance over other existing algorithms. An example of this algorithm is given in Fig 5.3. The TSDS algorithm is summarized as follows.

Algorithm
Step 1: The initial LDSP is centered at the origin of the search window. Test each point in the search pattern. If the MBD point is the center point, go to Step 3; otherwise go to Step 2.
Step 2: Form a new LDSP with the MBD point as the center point. If the new MBD point is at the center position, go to Step 3; otherwise repeat this step one more time.
Step 3: Form the SDSP with the previous MBD point as the center point. The new MBD point obtained in this step becomes the final solution, i.e., the motion vector (x, y).
The number of search points depends on the location of the MBD point. The MBD point also determines the search direction.
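One possible software reading of the three-step procedure summarized above is sketched below. It reuses the LDSP and SDSP point sets from the earlier DS sketch; the function name and window bound are assumptions, and the interpretation that Step 2 executes at most twice follows the wording "repeat this step one more time".

```python
def three_step_diamond_search(cost, search_range=7):
    """Sketch of the TSDS procedure summarized above: one initial LDSP
    step, at most two further LDSP placements, then one SDSP refinement."""
    def best_point(center, pattern):
        cx, cy = center
        cands = [(cx + dx, cy + dy) for dx, dy in pattern
                 if abs(cx + dx) <= search_range and abs(cy + dy) <= search_range]
        return min(cands, key=lambda p: cost(*p))

    center = (0, 0)
    mbd = best_point(center, LDSP)            # Step 1: initial LDSP at the origin
    if mbd != center:
        center = mbd
        for _ in range(2):                    # Step 2: executed at most twice
            mbd = best_point(center, LDSP)
            if mbd == center:                 # MBD at the center: go to Step 3
                break
            center = mbd                      # re-center on the MBD point
    return best_point(center, SDSP)           # Step 3: SDSP gives the motion vector
```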


Figure 6.1: Basic simulation flow

Creating the Working Library


In ModelSim, all designs are compiled into a library. You typically start a new simulation in ModelSim by creating a working library called "work," which is the default destination the compiler uses for compiled design units.

Compiling the Design

After creating the working library, you compile your design units into it. The ModelSim library format is compatible across all supported platforms, so you can simulate your design on any platform without having to recompile it.

Loading the Simulator with the Design and Running the Simulation With the design compiled, you load the simulator with your design by invoking the simulator on a top-level module (Verilog) or a configuration or entity/architecture pair (VHDL). Assuming the design loads successfully, the simulation time is set to zero, and you enter a run command to begin simulation.

Graphical User Interface Design Flow


You can use the Quartus II software graphical user interface (GUI) to perform all stages of the design flow. Figure 2 shows the Quartus II GUI as it appears when you first start the software. The Quartus II software includes a modular Compiler, which comprises the following modules (some modules are optional during a compilation, depending on your settings):
- Analysis & Synthesis
- Partition Merge
- Fitter
- Assembler
- TimeQuest Timing Analyzer
- Design Assistant
- EDA Netlist Writer
- HardCopy Netlist Writer

Debugging the Results


If you don't get the results you expect, you can use ModelSim's robust debugging environment to track down the cause of the problem.

Project Flow
A project is a collection mechanism for an HDL design under specification or test. Even though you don't have to use projects in ModelSim, they can ease interaction with the tool and are useful for organizing files and specifying simulation settings. The project flow consists of the following steps: create a project, add files to the project, compile the design files, run the simulation, and debug the results.




Figure 6.4: Graphical user interface design flow

To run all Compiler modules as part of a full compilation, on the Processing menu, click Start Compilation. You can also run each module individually by pointing to Start on the Processing menu and then clicking the command for the module you want to start. In addition, you can use the Tasks window to start Compiler modules individually. The Tasks window also allows you to change settings, view the report file for a module, or start other tools related to each stage in the flow.

V. SIMULATION RESULTS

Figure 6.2: Project flow

Altera Quartus II Design Flow

The Altera Quartus II design software provides a complete, multiplatform design environment that easily adapts to your specific design needs. It is a comprehensive environment for system-on-a-programmable-chip (SOPC) design. The Quartus II software includes solutions for all phases of FPGA and CPLD design.

A. PERFORMANCE COMPARISON
A window size of 15x15 is used for the experimentation of this algorithm, and the center point of the initial LDSP is at the origin of the search window. The performance of the algorithm is evaluated using error metrics such as the mean square error (MSE) and the peak signal-to-noise ratio (PSNR). The performance analysis has been done for an officer sequence.
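The error metrics used for this comparison can be stated as a short sketch: MSE over the frame and PSNR assuming 8-bit pixels (peak value 255). The function names are illustrative; this mirrors the standard definitions rather than any tool-specific routine.

```python
import numpy as np

def mse(original, reconstructed):
    """Mean square error between two frames of equal size."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB, assuming 8-bit pixel depth."""
    m = mse(original, reconstructed)
    if m == 0:
        return float("inf")            # identical frames
    return 10.0 * np.log10(peak ** 2 / m)
```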

Figure 6.3: Quartus II design flow

MOTION DETECTION

Figure 7.1: Performance comparison: (a) mean square error; (b) peak signal-to-noise ratio

Simulation results show that the proposed algorithm gives better performance than the existing TSS and DS algorithms, with a reduction in step size as well. The search is confined within the search window, and the reduction in the number of steps reduces the computational complexity. The simplicity and regularity of the algorithm allow an efficient implementation. The criterion used for the distortion measurement is the Sum of Absolute Differences (SAD), which gives the MBD point for the motion vector calculation. The checking points are arranged with two in the horizontal direction, two in the vertical direction, and one in each diagonal direction, which helps the algorithm reach a global minimum point. The maximum number of search points used is 23, whereas TSS uses 25. It achieves MSE performance close to the DS, TSS, and NTSS algorithms for image sequences with small as well as large motion content.

Figure 7.3: Motion detection

Compression is efficient if the micro-motions within the images are identified, and the cross diamond search algorithm is an efficient method to identify such micro-motions. If consecutive frames contain the same information, the compression ratio is high and block matching completes within a few cycles. The figure above shows a minimal number of pixel changes between the current frame and the reference frame, so only 3 compression vectors are executed.

MOTION VECTOR GENERATION

B. OUTPUT PARAMETERS
We use two memory units, one for the current frame and one for the reference frame. For the diamond search algorithm, we need 9 pixel values from both the current and the reference memory. The compression ratio is based on these 9 pixel values. Thus the output depends on the 9 pixel values of the frames, the Sum of Absolute Differences, and the memory addressing parameters.

MEMORY READ/WRITE PROCESS

Figure 7.4: Motion vector generation

The figure above shows the complete execution to identify compression vectors. Nine pixels are read from the current memory block and nine from the reference memory block. The SAD value is computed using a comparator. Based on the SAD value, the FSM controller moves the center-point pixel axis to the minimum SAD point. This process is executed for the complete frame, and the motion vectors are loaded into the vector memory.

VI. CONCLUSION

Figure 7.2: Memory read/write process

The simulation results above show the initial data loading from the memories and the address decoding logic. Initially the Reset signal is held at 1, and after 100 ns it goes low with the enable signal high. The address decoding logic controls the pipelining. After completing the macroblocks of the first row of the frame, processing continues along the same row until pixel addresses 1 to 640 are finished. After completing the first row, the address decoder changes its row address from pixel rows 0-15 to 16-31. This process continues until the block matching algorithm completes the whole frame.
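A small sketch of the macroblock address-scanning order described above is given below: all macroblocks of one 16-pixel row band (pixel columns up to 640) are processed before the row address advances to the next band (rows 0-15, then 16-31, and so on). The frame height of 480 and the generator form are assumptions for illustration.

```python
def macroblock_scan(frame_width=640, frame_height=480, mb=16):
    """Yield macroblock pixel-address ranges in the order described above:
    finish one 16-pixel row band across the full frame width before the
    row address advances to the next band."""
    for row_start in range(0, frame_height, mb):
        for col_start in range(0, frame_width, mb):
            yield (row_start, row_start + mb - 1), (col_start, col_start + mb - 1)

# Example: first two macroblock address ranges of the first band.
scan = macroblock_scan()
print(next(scan))   # ((0, 15), (0, 15))
print(next(scan))   # ((0, 15), (16, 31))
```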

We proposed a three step diamond search algorithm for computationally efficient block motion estimation for image compression. The proposed technique can be applied to both spatial and temporal image prediction. Because of the compact shape of its search pattern and its step size, it outperforms other existing algorithms such as TSS, NTSS, and DS in terms of computational efficiency while giving better performance. The algorithm can be used in video coding standards such as MPEG-4 and H.264/AVC because of its ease of implementation, better performance, and reduced computational complexity.

Future Enhancement

A de-blocking filter is a video filter applied to blocks in decoded video to improve visual quality and prediction performance by smoothing the sharp edges that can form between macroblocks when block coding techniques are used. The filter aims to improve the appearance of decoded pictures. Decoded frames contain corner artifacts that reduce the quality of an image. To avoid these artifacts, many de-blocking filters have been proposed, but they fail to remove ringing artifacts, and pixels are blurred due to non-uniform equalization of the filter coefficients. To overcome this issue, we design an adaptive bilateral loop filter that can be applied across both vertical and horizontal positions of the pixel blocks.

REFERENCES
[1] P. G. Howard and J. S. Vitter, "Fast and efficient lossless image compression," in Proc. IEEE Int. Conf. Data Compression, 1993, pp. 501-510.
[2] J. R. Jain and A. K. Jain, "Displacement measurement and its application in interframe image coding," IEEE Trans. Commun., vol. COM-29, pp. 1799-1808, Dec. 1981.
[3] T. Koga, K. Iinuma, A. Hirano, Y. Iijima, and T. Ishiguro, "Motion compensated interframe coding for video conferencing," in Proc. NTC81, New Orleans, LA, pp. G5.3.1-G5.3.5, Nov. 29-Dec. 3, 1981.
[4] Shan Zhu and Kai-Kuang Ma, "A new diamond search algorithm for block matching motion estimation," IEEE Trans. Image Process., vol. 9, pp. 287-290, Feb. 2000.
[5] Tsung-Han Tsai, Yu-Hsuan Lee, and Yu-Yu Lee, "Design and analysis of high-throughput lossless image compression engine using VLSI-oriented FELICS algorithm," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 18, no. 1, Jan. 2010.
[6] X. Wu and N. D. Memon, "Context-based, adaptive, lossless image coding," IEEE Trans. Commun., vol. 45, no. 4, pp. 437-444, Apr. 1997.
[7] M. J. Weinberger, G. Seroussi, and G. Sapiro, "The LOCO-I lossless image compression algorithm: Principles and standardization into JPEG-LS," IEEE Trans. Image Process., vol. 9, no. 8, pp. 1309-1324, Aug. 2000.
[8] R. Li, B. Zeng, and M. L. Liou, "A new three-step search algorithm for block motion estimation," IEEE Trans. Circuits Syst. Video Technol., vol. 6, pp. 438-442, Aug. 1994.
