SUN Lab
Yokohama National University
Learned Image Codec
Traditional image compression standards have been developed for more than 30 years. With the development of neural networks, however, learned image compression (LIC) has shown coding performance superior to that of the traditional standards. As shown in the following figure, recent LIC methods can achieve better coding performance than the latest standard, VVC intra.
Proposed LIC with GMM and attention, which was state of the art as of 2020.
Proposed LIC with mixed transformer and CNN, which was state of the art as of June 2023.
Video Coding for Machines
Cisco reported that machine-to-machine communication will account for up to 50% of all communication in the coming IoT society. Reducing the transmission burden between machines is therefore extremely important. Unlike coding for human vision, video coding for machines aims to improve the accuracy of machine vision tasks such as object detection and tracking.
Proposed semantic segmentation in the learned compressed domain
FPGA Neural Network Engine
Among hardware platforms, FPGAs offer higher hardware utilization and power efficiency than CPUs and GPUs. In addition, compared with ASICs, FPGAs are more flexible and reconfigurable, so they can keep up with the rapid development of neural network models. We developed an FPGA neural engine with a fine-grained pipeline and built an FPGA codec system for LIC.
A camera (bottom right) captures 720p 30 fps raw video, which is shown on the right display. One Xilinx VCU118 FPGA board handles the encoding. The encoded bitstream is sent to another FPGA board for decoding. Finally, the decoded video is shown on the left display.
To evaluate latency, we capture real-time timestamps; the difference between the raw timestamp and the decoded timestamp is the end-to-end latency, which is around 560 ms. We also use a power meter to measure the actual power consumption of the whole FPGA board.
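The latency measurement above can be sketched in a few lines of Python. This is only an illustration, not the lab's measurement tooling: the function name, the per-frame pairing, and the sample timestamps are hypothetical, and the sketch assumes that both timestamps come from the same clock and are expressed in milliseconds.

```python
from statistics import mean


def end_to_end_latency_ms(raw_ts_ms, decoded_ts_ms):
    """Return per-frame latency: decoded timestamp minus raw timestamp.

    Assumes both lists are in milliseconds, share one clock, and are
    aligned so that index i refers to the same frame in both lists.
    """
    if len(raw_ts_ms) != len(decoded_ts_ms):
        raise ValueError("timestamp lists must have the same length")
    return [dec - raw for raw, dec in zip(raw_ts_ms, decoded_ts_ms)]


# Hypothetical timestamps for three consecutive frames (illustrative only).
raw_ts = [0, 33, 67]          # time each raw frame appears on the right display
decoded_ts = [560, 593, 627]  # time each decoded frame appears on the left display

latencies = end_to_end_latency_ms(raw_ts, decoded_ts)
avg_latency = mean(latencies)
print(f"average end-to-end latency: {avg_latency} ms")
```

Averaging over many frames, rather than reading a single pair of timestamps, would smooth out frame-to-frame jitter in such a measurement.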
ASIC Chip Design
8K@120fps HEVC decoder chip in which I was in charge of Inverse Transform (IT) and De-quantization (IQ) design, led by Prof. Goto and Prof. Zhou.