In the last decade, numerous supervised deep learning approaches requiring large amounts of labeled data have been proposed for visual-inertial odometry (VIO) and depth map estimation. To overcome the data limitation, self-supervised learning has emerged as a promising alternative, exploiting constraints such as geometric and photometric consistency in the scene. In this study, we introduce a novel self-supervised deep learning-based VIO and depth map recovery approach (SelfVIO) using adversarial training and self-adaptive visual-inertial sensor fusion. SelfVIO learns to jointly estimate 6 degrees-of-freedom (6-DoF) ego-motion and a depth map of the scene from unlabeled monocular RGB image sequences and inertial measurement unit (IMU) readings. The proposed approach is able to perform VIO without the need for IMU intrinsic parameters and/or the extrinsic calibration between the IMU and the camera. We provide comprehensive quantitative and qualitative evaluations of the proposed framework, comparing its performance with state-of-the-art VIO, VO, and visual simultaneous localization and mapping (VSLAM) approaches on the KITTI, EuRoC and Cityscapes datasets. Detailed comparisons prove that SelfVIO outperforms state-of-the-art VIO approaches in terms of pose estimation and depth recovery, making it a promising approach among existing methods in the literature.
@article{almalioglu2019selfvio,title={SelfVIO: Self-supervised Deep Monocular Visual-Inertial Odometry and Depth Estimation},author={Almalioglu, Yasin and Turan, Mehmet and Sari, Alp Eren and Saputra, Muhamad Risqi U. and Porto Buarque de Gusmao, Pedro and Markham, Andrew and Trigoni, Niki},journal={Neural Networks},url={https://www.sciencedirect.com/science/article/pii/S0893608022000752},year={2022}}
2021
ICASSP
End-to-End Speech Recognition from Federated Acoustic Models
Yan Gao, Titouan Parcollet, Javier Fernandez-Marques, and 3 more authors
Training Automatic Speech Recognition (ASR) models under federated learning (FL) settings has attracted a lot of attention recently. However, the FL scenarios often presented in the literature are artificial and fail to capture the complexity of real FL systems. In this paper, we construct a challenging and realistic ASR federated experimental setup consisting of clients with heterogeneous data distributions using the French and Italian sets of the CommonVoice dataset, a large heterogeneous dataset containing thousands of different speakers, acoustic environments and noises. We present the first empirical study of an attention-based sequence-to-sequence End-to-End (E2E) ASR model with three aggregation weighting strategies – standard FedAvg, loss-based aggregation and a novel word error rate (WER)-based aggregation – compared in two realistic FL scenarios: cross-silo with 10 clients and cross-device with 2K and 4K clients. Our analysis of E2E ASR from heterogeneous and realistic federated acoustic models provides the foundations for future research and development of realistic FL-based ASR applications.
@article{DBLP:journals/corr/abs-2104-14297,author={Gao, Yan and Parcollet, Titouan and Fernandez-Marques, Javier and Porto Buarque de Gusmao, Pedro and Beutel, Daniel J. and Lane, Nicholas D.},title={End-to-End Speech Recognition from Federated Acoustic Models},journal={arXiv preprint arXiv:2104.14297},volume={abs/2104.14297},year={2021},}
T-RO
Graph-based Thermal-Inertial SLAM with Probabilistic Neural Networks
Muhamad Risqi U. Saputra, Chris Xiaoxuan Lu, Pedro Gusmao, and 3 more authors
Simultaneous Localization and Mapping (SLAM) systems typically employ vision-based sensors to observe the surrounding environment. However, the performance of such systems highly depends on the ambient illumination conditions. In scenarios with adverse visibility or in the presence of airborne particulates (e.g. smoke or dust), alternative modalities such as those based on thermal imaging and inertial sensors are more promising. In this paper, we propose the first complete thermal-inertial SLAM system which combines neural abstraction in the SLAM front end with robust pose graph optimization in the SLAM back end. We model the sensor abstraction in the front end by employing probabilistic deep learning parameterized by Mixture Density Networks (MDN). Our key strategies to successfully model this encoding from thermal imagery are the usage of normalized 14-bit radiometric data, the incorporation of hallucinated visual (RGB) features, and the inclusion of feature selection to estimate the MDN parameters. To enable a full SLAM system, we also design an efficient global image descriptor which is able to detect loop closures from thermal embedding vectors. We performed extensive experiments and analysis using three datasets, namely self-collected ground robot and handheld data taken in an indoor environment, and one public dataset (SubT-tunnel) collected in an underground tunnel. Finally, we demonstrate that an accurate thermal-inertial SLAM system can be realized in conditions of both benign and adverse visibility.
@article{saputra2021graph,title={Graph-based Thermal-Inertial SLAM with Probabilistic Neural Networks},author={Saputra, Muhamad Risqi U. and Lu, Chris Xiaoxuan and Porto Buarque de Gusmao, Pedro and Wang, Bing and Markham, Andrew and Trigoni, Niki},journal={IEEE Transactions on Robotics (T-RO)},year={2021}}
MLSys Workshop
On-device Federated Learning with Flower
Akhil Mathur, Daniel J. Beutel, Pedro Gusmao, and 6 more authors
Federated Learning (FL) allows edge devices to collaboratively learn a shared prediction model while keeping their training data on the device, thereby decoupling the ability to do machine learning from the need to store data in the cloud. Despite the algorithmic advancements in FL, the support for on-device training of FL algorithms on edge devices remains poor. In this paper, we present an exploration of on-device FL on various smartphones and embedded devices using the Flower framework. We also evaluate the system costs of on-device FL and discuss how this quantification could be used to design more efficient FL algorithms.
@inproceedings{mathur2021device,title={On-device Federated Learning with Flower},author={Mathur, Akhil and Beutel, Daniel J. and Porto Buarque de Gusmao, Pedro and Fernandez-Marques, Javier and Topal, Taner and Qiu, Xinchi and Parcollet, Titouan and Gao, Yan and Lane, Nicholas D},workshop={arXiv preprint arXiv:2104.03042},url={https://arxiv.org/abs/2104.03042},year={2021},maintitle={Fourth Conference on Machine Learning and Systems (MLSys)},booktitle={On-device Intelligence Workshop (MLSys)}}
ICRA
RadarLoc: Learning to Relocalize in FMCW Radar
Wei Wang, Pedro Gusmao, Bo Yang, and 2 more authors
In IEEE International Conference on Robotics and Automation, 2021
@inproceedings{wang2020radarloc,title={RadarLoc: Learning to Relocalize in FMCW Radar},author={Wang, Wei and Porto Buarque de Gusmao, Pedro and Yang, Bo and Markham, Andrew and Trigoni, Niki},booktitle={IEEE International Conference on Robotics and Automation},year={2021},}
2020
ACM SenSys
milliEgo: single-chip mmWave radar aided egomotion estimation via deep sensor fusion
Chris Xiaoxuan Lu, Muhamad Risqi U. Saputra, Peijun Zhao, and 6 more authors
In Proceedings of the 18th Conference on Embedded Networked Sensor Systems, 2020
@inproceedings{lu2020milliego,title={milliEgo: single-chip mmWave radar aided egomotion estimation via deep sensor fusion},author={Lu, Chris Xiaoxuan and Saputra, Muhamad Risqi U. and Zhao, Peijun and Almalioglu, Yasin and Porto Buarque de Gusmao, Pedro and Chen, Changhao and Sun, Ke and Trigoni, Niki and Markham, Andrew},booktitle={Proceedings of the 18th Conference on Embedded Networked Sensor Systems},pages={109--122},year={2020},}
RA-L
DeepTIO: A Deep Thermal-Inertial Odometry With Visual Hallucination
Muhamad Risqi U. Saputra, Pedro Gusmao, Chris Xiaoxuan Lu, and 7 more authors
@article{8968430,author={Saputra, Muhamad Risqi U. and Porto Buarque de Gusmao, Pedro and Lu, Chris Xiaoxuan and Almalioglu, Yasin and Rosa, Stefano and Chen, Changhao and Wahlström, Johan and Wang, Wei and Markham, Andrew and Trigoni, Niki},journal={IEEE Robotics and Automation Letters},title={DeepTIO: A Deep Thermal-Inertial Odometry With Visual Hallucination},year={2020},volume={5},number={2},pages={1672-1679},doi={10.1109/LRA.2020.2969170},}
Sensors Journal
Sensor Fusion for Magneto-Inductive Navigation
Johan Wahlström, Manon Kok, Pedro Gusmão, and 3 more authors
@article{8844709,author={Wahlström, Johan and Kok, Manon and Porto Buarque de Gusmão, Pedro and Abrudan, Traian E. and Trigoni, Niki and Markham, Andrew},journal={IEEE Sensors Journal},title={Sensor Fusion for Magneto-Inductive Navigation},year={2020},volume={20},number={1},pages={386-396},doi={10.1109/JSEN.2019.2942451},}
2019
DCOSS
Map-aided Navigation for Emergency Searches
Johan Wahlström, Pedro Gusmão, Andrew Markham, and 1 more author
In 2019 15th International Conference on Distributed Computing in Sensor Systems (DCOSS), 2019
@inproceedings{8804785,author={Wahlström, Johan and Porto Buarque de Gusmão, Pedro and Markham, Andrew and Trigoni, Niki},booktitle={2019 15th International Conference on Distributed Computing in Sensor Systems (DCOSS)},title={Map-aided Navigation for Emergency Searches},year={2019},volume={},number={},pages={25-32},doi={10.1109/DCOSS.2019.00027}}
ICRA
GANVO: Unsupervised Deep Monocular Visual Odometry and Depth Estimation with Generative Adversarial Networks
Yasin Almalioglu, Muhamad Risqi U. Saputra, Pedro P. B. de Gusmão, and 2 more authors
In 2019 International Conference on Robotics and Automation (ICRA), 2019
@inproceedings{8793512,author={Almalioglu, Yasin and Saputra, Muhamad Risqi U. and Gusmão, Pedro P. B. de and Markham, Andrew and Trigoni, Niki},booktitle={2019 International Conference on Robotics and Automation (ICRA)},title={GANVO: Unsupervised Deep Monocular Visual Odometry and Depth Estimation with Generative Adversarial Networks},year={2019},volume={},number={},pages={5474-5480},doi={10.1109/ICRA.2019.8793512}}
ICRA
Learning Monocular Visual Odometry through Geometry-Aware Curriculum Learning
Muhamad Risqi U. Saputra, Pedro P. B. Gusmao, Sen Wang, and 2 more authors
In 2019 International Conference on Robotics and Automation (ICRA), 2019
@inproceedings{8793581,author={Saputra, Muhamad Risqi U. and de Gusmao, Pedro P. B. and Wang, Sen and Markham, Andrew and Trigoni, Niki},booktitle={2019 International Conference on Robotics and Automation (ICRA)},title={Learning Monocular Visual Odometry through Geometry-Aware Curriculum Learning},year={2019},volume={},number={},pages={3549-3555},doi={10.1109/ICRA.2019.8793581}}
ICCV
Distilling Knowledge From a Deep Pose Regressor Network
Muhamad Risqi U. Saputra, Pedro Gusmao, Yasin Almalioglu, and 2 more authors
In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019
@inproceedings{9009104,author={Saputra, Muhamad Risqi U. and Gusmao, Pedro and Almalioglu, Yasin and Markham, Andrew and Trigoni, Niki},booktitle={2019 IEEE/CVF International Conference on Computer Vision (ICCV)},title={Distilling Knowledge From a Deep Pose Regressor Network},year={2019},volume={},number={},pages={263-272},doi={10.1109/ICCV.2019.00035}}
IROS
DeepPCO: End-to-End Point Cloud Odometry through Deep Parallel Neural Network
Wei Wang, Muhamad Risqi U. Saputra, Peijun Zhao, and 5 more authors
In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019
@inproceedings{8967756,author={Wang, Wei and Saputra, Muhamad Risqi U. and Zhao, Peijun and Gusmao, Pedro and Yang, Bo and Chen, Changhao and Markham, Andrew and Trigoni, Niki},booktitle={2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},title={DeepPCO: End-to-End Point Cloud Odometry through Deep Parallel Neural Network},year={2019},volume={},number={},pages={3248-3254},doi={10.1109/IROS40897.2019.8967756}}
2016
CoDIT
Gabor filter based image representation for object classification
Syed Tahir Hussain Rizvi, Gianpiero Cabodi, Pedro Gusmao, and 1 more author
In 2016 International Conference on Control, Decision and Information Technologies (CoDIT), 2016
@inproceedings{7593635,author={Rizvi, Syed Tahir Hussain and Cabodi, Gianpiero and Gusmao, Pedro and Francini, Gianluca},booktitle={2016 International Conference on Control, Decision and Information Technologies (CoDIT)},title={Gabor filter based image representation for object classification},year={2016},volume={},number={},pages={628-632},doi={10.1109/CoDIT.2016.7593635}}
2015
MMSP
Loop detection in robotic navigation using MPEG CDVS
Pedro P. B. Gusmao, Stefano Rosa, Enrico Magli, and 2 more authors
In 2015 IEEE 17th International Workshop on Multimedia Signal Processing (MMSP), 2015
@inproceedings{7340871,author={de Gusmao, Pedro P. B. and Rosa, Stefano and Magli, Enrico and Lepsøy, Skjalg and Francini, Gianluca},booktitle={2015 IEEE 17th International Workshop on Multimedia Signal Processing (MMSP)},title={Loop detection in robotic navigation using MPEG CDVS},year={2015},volume={},number={},pages={1-6},doi={10.1109/MMSP.2015.7340871}}
2011
ICME
Statistical modelling of outliers for fast visual search
Skjalg Lepsøy, Gianluca Francini, Giovanni Cordara, and 1 more author
In 2011 IEEE International Conference on Multimedia and Expo, 2011
@inproceedings{6012184,author={Lepsøy, Skjalg and Francini, Gianluca and Cordara, Giovanni and de Gusmao, Pedro Porto Buarque},booktitle={2011 IEEE International Conference on Multimedia and Expo},title={Statistical modelling of outliers for fast visual search},year={2011},volume={},number={},pages={1-6},doi={10.1109/ICME.2011.6012184}}