Evaluating Pre-Processing Pipelines for Thermal-Visual Smart Camera Authors: Irida Shallari, Muhammad Imran, Najeem Lawal, Mattias O’Nils
Embedded processing in smart camera. Benefits: reduced data for communication and analysis; real-time monitoring. Challenges: energy consumption; performance; limited resources (computational, memory, energy, communication bandwidth).
Surveillance of vulnerable areas
Alternative approach (diagram labels: High frequency, Low frequency)
Problem statement: Design exploration in pixel-based image pre-processing pipeline for multi-sensor smart camera with respect to data communication vs. classification accuracy. Processing levels: low level, intermediate, high level. Operations: spatial filtering, temporal filtering, segmentation, morphology, labelling, classification, recognition. Trade-off: communication vs. performance.
Pre-processing architecture for smart camera. Diagram labels: 1280×1024 and 320×240 capturing; ROI; segmented ROI; classification algorithms (SURF, SIFT).
Image dataset: human and cyclist
Experimental setup: Nexys 4 board, which includes an Artix-7 FPGA; NVIDIA Tegra TK1; an IDS CMOS µEye visual camera with a focal length of 12 mm, a resolution of 1280×1024 and a pixel pitch of 5.3 μm; a Tamarisk 320 thermal camera with a focal length of 19 mm, a sensor resolution of 320×240 and a pixel pitch of 17 μm.
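The focal lengths, resolutions and pixel pitches listed for the two cameras are enough to estimate each camera's horizontal field of view. The sketch below assumes an ideal pinhole model with no lens distortion; the function name `horizontal_fov_deg` is illustrative and not part of the original work.

```python
import math

def horizontal_fov_deg(pixels, pixel_pitch_um, focal_length_mm):
    """Horizontal field of view in degrees under a pinhole-camera assumption."""
    # Sensor width in mm: pixel count times pixel pitch (converted from micrometres).
    sensor_width_mm = pixels * pixel_pitch_um / 1000.0
    return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))

# Values taken from the experimental setup.
visual_fov = horizontal_fov_deg(1280, 5.3, 12)   # visual camera
thermal_fov = horizontal_fov_deg(320, 17, 19)    # thermal camera
```

Under these assumptions the visual camera covers roughly 32 degrees horizontally and the thermal camera roughly 16 degrees, so the two images would need to be registered before pixels in one can be related to pixels in the other.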
Design exploration: thermal, thermal binary, visual, visual binary
Architecture. Camera node (320×240): capture, background subtraction, segmentation, morphology, binary ROI coding. Client device: decoding, classification (human, animal, cyclist).
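The camera-node and client-device stages named on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' FPGA implementation: the static background frame, the threshold of 30, the 3×3 opening, and bit packing as a stand-in for the binary ROI codec are all assumptions, and the function names are hypothetical.

```python
import numpy as np

def shift_stack(mask):
    """Stack the nine 3x3-neighbourhood shifts of a boolean mask (zero padded)."""
    h, w = mask.shape
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(3) for dx in range(3)])

def erode(mask):
    return shift_stack(mask).all(axis=0)

def dilate(mask):
    return shift_stack(mask).any(axis=0)

def camera_node(frame, background, threshold=30):
    """Capture -> background subtraction -> segmentation -> morphology -> binary ROI coding."""
    # Background subtraction against a static background frame (assumption).
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    # Segmentation by a global threshold (the value is arbitrary).
    mask = diff > threshold
    # Morphology: a 3x3 opening removes isolated noise pixels.
    mask = dilate(erode(mask))
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None  # no foreground, nothing to transmit
    y0, y1 = int(ys.min()), int(ys.max()) + 1
    x0, x1 = int(xs.min()), int(xs.max()) + 1
    roi = mask[y0:y1, x0:x1]
    # Binary ROI coding: one bit per pixel (stand-in for a real codec).
    payload = np.packbits(roi).tobytes()
    return (y0, x0, roi.shape, payload)

def client_decode(packet):
    """Rebuild the boolean ROI mask on the client device."""
    _, _, shape, payload = packet
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    return bits[:shape[0] * shape[1]].reshape(shape).astype(bool)

# Usage with a synthetic 320x240 thermal frame: one warm block plus one noise pixel.
background = np.zeros((240, 320), dtype=np.uint8)
frame = background.copy()
frame[50:70, 100:110] = 200
frame[5, 5] = 255
packet = camera_node(frame, background)
decoded = client_decode(packet)
```

In this synthetic case only the ROI offset, its shape and a few tens of bytes of packed bits leave the camera node, which illustrates why binary ROI coding keeps the communication cost low.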
Classification accuracy vs Communication cost
Alternative approach (diagram labels: High frequency, Low frequency)
Proposed architecture
Classification accuracy
Data compression. Table columns: Image type; Res./Size; KBytes (Raw data); KBytes (RAW_ROI); KBytes (JPEG_ROI); KBytes (Bin_ROI); KBytes (G4_ROI). Sub-columns: Human, Cycl. Visual: (1280×1024) 3840, (200×300) 1406, (250×500) 2930, 3, 4, 59, 122, 2. Thermal: (320×240) 75, (100×150) 117, (100×250) 20, 1, 15.
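The raw-data figures for the full frames can be reproduced with simple arithmetic. The sketch below assumes 3 bytes per pixel for the visual image, 1 byte per pixel for the thermal image, and 1 KByte = 1024 bytes; none of these are stated on the slide, and `raw_kbytes` is an illustrative helper.

```python
def raw_kbytes(width, height, bytes_per_pixel):
    """Uncompressed image size in KBytes (1 KByte = 1024 bytes)."""
    return width * height * bytes_per_pixel / 1024

visual_raw = raw_kbytes(1280, 1024, 3)   # assumes 24-bit colour
thermal_raw = raw_kbytes(320, 240, 1)    # assumes 8-bit intensity
```

Under these assumptions the results match the 3840 and 75 KBytes listed for the raw visual and thermal frames. The same assumptions give about 176 KBytes for a 200×300 visual ROI, which equals about 1406 when expressed in units of 1024 bits, so the RAW_ROI figures may be reported in a different unit than the raw-data column.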
Conclusions. We propose an architectural approach in which: thermal images are used as a mask to extract the ROI; ROI visual data is transmitted frequently; compressed visual data is transmitted rarely, for situational awareness. Visual compressed ROI offers: 13%-64% better classification accuracy than binary visual ROI or thermographic images (raw_ROI, bin_ROI); new applications requiring situational awareness.
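Using the thermal image as a mask for the visual ROI implies mapping a region found at 320×240 onto the 1280×1024 visual frame. The sketch below shows the simplest possible mapping, a per-axis scale. It assumes the two images are already registered to the same field of view, which is a simplification; the function name and the example box are hypothetical.

```python
import math
import numpy as np

def thermal_box_to_visual(box, thermal_size=(320, 240), visual_size=(1280, 1024)):
    """Scale an (x0, y0, x1, y1) box from thermal to visual pixel coordinates.

    Assumes both images cover the same scene area (no offset, no distortion).
    """
    sx = visual_size[0] / thermal_size[0]
    sy = visual_size[1] / thermal_size[1]
    x0, y0, x1, y1 = box
    # Floor the start and ceil the end so the visual ROI fully covers the thermal one.
    return (math.floor(x0 * sx), math.floor(y0 * sy),
            math.ceil(x1 * sx), math.ceil(y1 * sy))

# Usage: crop the visual frame with a box detected in the thermal frame.
visual = np.zeros((1024, 1280), dtype=np.uint8)   # placeholder visual frame
vx0, vy0, vx1, vy1 = thermal_box_to_visual((100, 50, 110, 70))
visual_roi = visual[vy0:vy1, vx0:vx1]
```

Only `visual_roi` would then be compressed and transmitted frequently, while the full visual frame is sent rarely for situational awareness.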
Questions?
Thank you!