Yang Liu (刘阳)

Associate Professor @ Sun Yat-sen University (SYSU)

liuy856@mail.sysu.edu.cn
School of Computer Science and Engineering, SYSU, Guangzhou, China
HCP-Lab

I am currently an Associate Professor at the School of Computer Science and Engineering, Sun Yat-sen University (SYSU). I am a core member of the HCP Lab led by Prof. Liang Lin. I obtained my Ph.D. degree from Xidian University in 2019.

My research primarily focuses on Embodied AI, Multimodal Spatial Perception & Reasoning, and Causal Inference. I have published over 50 papers in top-tier conferences and journals, including TPAMI, TIP, TMECH, TKDE, CVPR, ICCV, and ACM MM. Several of my works have been selected as Oral/Highlight presentations or ESI Highly Cited Papers. I also authored the book "Multimodal Large Models: The New Paradigm of Artificial General Intelligence".

Research Interests

Embodied AI, 3D Spatial Reasoning, VL Navigation, Causal Inference, Robotic Manipulation

Selected Awards

  • Outstanding Teaching Achievement of Guangdong Province (2nd Prize), 2025
  • Excellent Author of PHE (Publishing House of Electronics Industry), 2024
  • CCF ChinaSoft 2023 Challenge (3rd Prize)
  • 3rd Guangdong Province Young CS Academic Show (1st Prize), 2023
  • National Scholarship for PhD Students, 2018

News

  • 2025-11 One T-IP paper is selected as an ESI Highly Cited Paper!
  • 2025-09 Two papers are accepted by NeurIPS 2025!
  • 2025-08 One paper is accepted by Neural Networks!
  • 2025-07 One paper is accepted by ACM MM 2025 as Oral!
  • 2025-06 One paper is accepted by ICCV 2025!
  • 2025-06 One paper is accepted by IEEE TKDE!
  • 2025-05 Our Embodied AI Survey paper is accepted by IEEE/ASME T-Mechatronics!
  • 2025-05 One paper is accepted by ACL 2025!
  • 2025-05 The paper "Cross-Modal Causal Representation Learning..." is accepted by IEEE T-IP!
  • 2025-04 Our CRA-GQA is selected as CVPR 2025 Highlight!
  • 2025-02 Three papers are accepted by CVPR 2025!
  • 2024-07 We release the paper list for Embodied AI!
  • 2024-07 One paper is accepted by ACM MM 2024!
  • 2024-06 The book "Multimodal Large Models: The New Paradigm of Artificial General Intelligence" (《多模态大模型:新一代人工智能技术范式》) is selected for the SYSU Undergraduate Textbook Series!
  • 2024-05 One first-author T-PAMI paper is selected as an ESI Hot Paper!
  • 2024-05 One first-author T-PAMI paper is selected as an ESI Highly Cited Paper!
  • 2024-04 The book on multimodal large models, 《多模态大模型:新一代人工智能技术范式》, is published!
  • 2023-12 I won the third prize in the CCF ChinaSoft 2023 Robotic Big Model and Embodied Intelligence Challenge!
  • 2023-11 One first-author T-IP paper is selected as an ESI Hot Paper!
  • 2023-10 An invention patent has been granted.
  • 2023-07 One paper is accepted by ACM MM 2023!
  • 2023-07 Two papers are accepted by ICCV 2023!
  • 2023-06 One paper is accepted by T-PAMI!
  • 2023-03 The open-source framework Causal-VLReasoning is online!
  • 2022-03 One paper is accepted by CVPR 2022 as an Oral presentation!
  • 2021-10 I started working as a Research Associate Professor at Sun Yat-sen University.

Publications

Books

Multimodal Large Models: The New Paradigm of Artificial General Intelligence
《多模态大模型:新一代人工智能技术范式》

Yang Liu, Liang Lin

Publishing House of Electronics Industry (PHE), 2024. Selected for the SYSU Undergraduate Textbook Series.

Multimodal Large Models: The New Paradigm of Artificial Intelligence

Liang Lin, Yang Liu

Springer, 2025.


Open-source Framework

CausalVLR: A Toolbox and Benchmark for Visual-Linguistic Causal Reasoning

Yang Liu, Weixing Chen, Guanbin Li, Liang Lin

CausalVLR is an open-source Python framework for causal relation discovery and inference. It implements state-of-the-art causal learning algorithms for a range of visual-linguistic reasoning tasks.

@article{CausalVLR, title={CausalVLR: A Toolbox and Benchmark for Visual-Linguistic Causal Reasoning}, author={Liu, Yang and Chen, Weixing and Li, Guanbin and Lin, Liang}, journal={arXiv preprint arXiv:2306.17462}, year={2023} }

Selected Papers

3DAffordSplat: Efficient Affordance Reasoning with 3D Gaussians

Zeming Wei#, Junyi Lin#, Yang Liu, Weixing Chen, Jingzhou Luo, Guanbin Li, Liang Lin

ACM International Conference on Multimedia (ACM MM), 2025 (Oral)

@article{3DAffordSplat, title={3DAffordSplat: Efficient Affordance Reasoning with 3D Gaussians}, author={Wei, Zeming and Lin, Junyi and Liu, Yang and Chen, Weixing and Luo, Jingzhou and Li, Guanbin and Lin, Liang}, year={2025}, journal={arXiv preprint arXiv:2504.11218} }
Beyond the Destination: A Novel Benchmark for Exploration-Aware Embodied Question Answering

Kaixuan Jiang, Yang Liu, Weixing Chen, Jingzhou Luo, Ziliang Chen, Ling Pan, Guanbin Li, Liang Lin

IEEE/CVF International Conference on Computer Vision (ICCV), 2025

@inproceedings{EXPRESSBench, title={Beyond the Destination: A Novel Benchmark for Exploration-Aware Embodied Question Answering}, author={Jiang, Kaixuan and Liu, Yang and Chen, Weixing and Luo, Jingzhou and Chen, Ziliang and Pan, Ling and Li, Guanbin and Lin, Liang}, year={2025}, booktitle={IEEE/CVF International Conference on Computer Vision (ICCV)} }
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI

Yang Liu, Weixing Chen, Yongjie Bai, Xiaodan Liang, Guanbin Li, Wen Gao, Liang Lin

IEEE/ASME Transactions on Mechatronics, 2025

@article{liu2024aligning, title={Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI}, author={Liu, Yang and Chen, Weixing and Bai, Yongjie and Liang, Xiaodan and Li, Guanbin and Gao, Wen and Lin, Liang}, journal={arXiv preprint arXiv:2407.06886}, year={2024} }
Cross-Modal Causal Representation Learning for Radiology Report Generation

Weixing Chen, Yang Liu, Ce Wang, Jiarui Zhu, Guanbin Li, Cheng-Lin Liu, Liang Lin

IEEE Transactions on Image Processing (T-IP), 2025 (ESI Highly Cited)

@article{chen2025visual, title={Cross-Modal Causal Representation Learning for Radiology Report Generation}, author={Chen, Weixing and Liu, Yang and Wang, Ce and Zhu, Jiarui and Li, Guanbin and Liu, Cheng-Lin and Lin, Liang}, journal={IEEE Transactions on Image Processing}, year={2025} }
ODMixer: Fine-grained Spatial-temporal MLP for Metro Origin-Destination Prediction

Yang Liu, Binglin Chen, Yongsen Zheng, Lechao Cheng, Guanbin Li, Liang Lin

IEEE Transactions on Knowledge and Data Engineering (TKDE), 2025

@article{liu2024fine, title={ODMixer: Fine-grained Spatial-temporal MLP for Metro Origin-Destination Prediction}, author={Liu, Yang and Chen, Binglin and Zheng, Yongsen and Cheng, Lechao and Li, Guanbin and Lin, Liang}, journal={arXiv preprint arXiv:2404.15734}, year={2024} }
Learning to See and Act: Task-Aware View Planning for Robotic Manipulation

Yongjie Bai#, Zhouxia Wang#, Yang Liu, Weixing Chen, Ziliang Chen, Mingtong Dai, Yongsen Zheng, Lingbo Liu, Guanbin Li, Liang Lin

Preprint, 2025

@misc{bai2025learningacttaskawareview, title={Learning to See and Act: Task-Aware View Planning for Robotic Manipulation}, author={Yongjie Bai and Zhouxia Wang and Yang Liu and Weixing Chen and Ziliang Chen and Mingtong Dai and Yongsen Zheng and Lingbo Liu and Guanbin Li and Liang Lin}, year={2025}, eprint={2508.05186}, archivePrefix={arXiv}, primaryClass={cs.RO} }
AutoLayout: Closed-Loop Layout Synthesis via Slow-Fast Collaborative Reasoning

Weixing Chen, Dafeng Chi, Yang Liu, Yuxi Yang, Yexin Zhang, Yuzheng Zhuang, Xingyue Quan, Jianye Hao, Guanbin Li, Liang Lin

Preprint, 2025

@article{chen2025autolayout, title={AutoLayout: Closed-Loop Layout Synthesis via Slow-Fast Collaborative Reasoning}, author={Chen, Weixing and Chi, Dafeng and Liu, Yang and Yang, Yuxi and Zhang, Yexin and Zhuang, Yuzheng and Quan, Xingyue and Hao, Jianye and Li, Guanbin and Lin, Liang}, journal={arXiv preprint arXiv:2507.04293}, year={2025} }
DART: Differentiable Dynamic Adaptive Region Tokenizer for Vision Foundation Models

Shicheng Yin, Kaixuan Yin, Yang Liu, Weixing Chen, Liang Lin

Preprint, 2025

@article{yin2025dart, title={DART: Differentiable Dynamic Adaptive Region Tokenizer for Vision Transformer and Mamba}, author={Shicheng Yin and Kaixuan Yin and Yang Liu and Weixing Chen and Liang Lin}, journal={arXiv preprint arXiv:2506.10390}, year={2025} }
Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method

Xinshuai Song*, Weixing Chen*, Yang Liu, Weikai Chen, Guanbin Li, Liang Lin

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025

@inproceedings{song2024towards, title={Towards long-horizon vision-language navigation: Platform, benchmark and method}, author={Song, Xinshuai and Chen, Weixing and Liu, Yang and Chen, Weikai and Li, Guanbin and Lin, Liang}, booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition}, year={2025} }
DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering

Jingzhou Luo, Yang Liu, Weixing Chen, Zhen Li, Yaowei Wang, Guanbin Li, Liang Lin

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025

@inproceedings{luo2025dspnet, title={DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering}, author={Luo, Jingzhou and Liu, Yang and Chen, Weixing and Li, Zhen and Wang, Yaowei and Li, Guanbin and Lin, Liang}, booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition}, year={2025} }
Cross-modal Causal Relation Alignment for Video Question Grounding

Weixing Chen, Yang Liu, Binglin Chen, Jiandong Su, Yongsen Zheng, Liang Lin

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025 (Highlight)

@inproceedings{chen2025cross, title={Cross-modal Causal Relation Alignment for Video Question Grounding}, author={Chen, Weixing and Liu, Yang and Chen, Binglin and Su, Jiandong and Zheng, Yongsen and Lin, Liang}, booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition}, year={2025} }
InfiniteWorld: A Unified Scalable Simulation Framework for General Visual-Language Robot Interaction

Pengzhen Ren, Min Li, Zhen Luo, Xinshuai Song, Ziwei Chen, Weijia Liufu, Yixuan Yang, Hao Zheng, Rongtao Xu, Zitong Huang, Tongsheng Ding, Luyang Xie, Kaidong Zhang, Changfei Fu, Yang Liu, Liang Lin, Feng Zheng, Xiaodan Liang

arXiv preprint arXiv:2412.05789, 2024

@article{ren2024infiniteworld, title={InfiniteWorld: A Unified Scalable Simulation Framework for General Visual-Language Robot Interaction}, author={Ren, Pengzhen and Li, Min and Luo, Zhen and Song, Xinshuai and Chen, Ziwei and Liufu, Weijia and Yang, Yixuan and Zheng, Hao and Xu, Rongtao and Huang, Zitong and others}, journal={arXiv preprint arXiv:2412.05789}, year={2024} }
Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering

Yang Liu, Guanbin Li, Liang Lin

IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2023 (ESI Highly Cited & Hot)

@article{liu2022cross, author={Liu, Yang and Li, Guanbin and Lin, Liang}, journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, title={Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering}, year={2023}, doi={10.1109/TPAMI.2023.3284038} }
Diversity Matters: User-Centric Multi-Interest Learning for Conversational Movie Recommendation

Yongsen Zheng, Guohua Wang, Yang Liu, Liang Lin

ACM International Conference on Multimedia (ACM MM), 2024

MEIA: Multimodal Embodied Perception and Interaction in Unknown Environments

Yang Liu, Xinshuai Song, Kaixuan Jiang, Weixing Chen, Jingzhou Luo, Guanbin Li, Liang Lin

arXiv preprint arXiv:2402.00290, 2024

@article{liu2024multimodal, title={MEIA: Multimodal Embodied Perception and Interaction in Unknown Environments}, author={Liu, Yang and Song, Xinshuai and Jiang, Kaixuan and Chen, Weixing and Luo, Jingzhou and Li, Guanbin and Lin, Liang}, journal={arXiv preprint arXiv:2402.00290}, year={2024} }
CausalGPT: Illuminating Faithfulness and Causality for Knowledge-based Reasoning with LLMs

Ziyi Tang, Ruilin Wang, Weixing Chen, Yongsen Zheng, Yang Liu, Keze Wang, Tianshui Chen, Liang Lin

arXiv preprint arXiv:2308.11914, 2023

@article{tang2023towards, title={Towards causalgpt: A multi-agent approach for faithful knowledge reasoning via promoting causal consistency in llms}, author={Tang, Ziyi and Wang, Ruilin and Chen, Weixing and Wang, Keze and Liu, Yang and Chen, Tianshui and Lin, Liang}, journal={arXiv preprint arXiv:2308.11914}, year={2023} }
Visual Causal Scene Refinement for Video Question Answering

Yushen Wei*, Yang Liu*, Hong Yan, Guanbin Li, Liang Lin

ACM International Conference on Multimedia (ACM MM), 2023 (Oral)

@inproceedings{10.1145/3581783.3611873, author = {Wei, Yushen and Liu, Yang and Yan, Hong and Li, Guanbin and Lin, Liang}, title = {Visual Causal Scene Refinement for Video Question Answering}, year = {2023}, series = {MM '23} }
SkeletonMAE: Graph-based Masked Autoencoder for Skeleton Sequence Pre-training

Hong Yan, Yang Liu, Yushen Wei, Zhen Li, Guanbin Li, Liang Lin

IEEE/CVF International Conference on Computer Vision (ICCV), 2023

@inproceedings{yan2023skeletonmae, title={Skeletonmae: graph-based masked autoencoder for skeleton sequence pre-training}, author={Yan, Hong and Liu, Yang and Wei, Yushen and Li, Zhen and Li, Guanbin and Lin, Liang}, booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision}, year={2023} }
Enhanced Soft Label for Semi-Supervised Semantic Segmentation

Jie Ma, Chuan Wang, Yang Liu, Liang Lin, Guanbin Li

IEEE/CVF International Conference on Computer Vision (ICCV), 2023

@inproceedings{ma2023enhanced, title={Enhanced Soft Label for Semi-Supervised Semantic Segmentation}, author={Ma, Jie and Wang, Chuan and Liu, Yang and Lin, Liang and Li, Guanbin}, booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision}, year={2023} }
DenseLight: Efficient Control for Large-scale Traffic Signals with Dense Feedback

Junfan Lin, Yuying Zhu, Lingbo Liu, Yang Liu, Guanbin Li, Liang Lin

International Joint Conference on Artificial Intelligence (IJCAI), 2023

@inproceedings{ijcai2023p672, title={DenseLight: Efficient Control for Large-scale Traffic Signals with Dense Feedback}, author={Lin, Junfan and Zhu, Yuying and Liu, Lingbo and Liu, Yang and Li, Guanbin and Lin, Liang}, booktitle={Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, {IJCAI-23}}, year={2023} }
Hybrid-Order Representation Learning for Electricity Theft Detection

Yuying Zhu, Yang Zhang, Lingbo Liu, Yang Liu, Guanbin Li, Mingzhi Mao, Liang Lin

IEEE Transactions on Industrial Informatics (T-II), 2023

@article{zhu2022hybrid, title={Hybrid-Order Representation Learning for Electricity Theft Detection}, author={Zhu, Yuying and Zhang, Yang and Liu, Lingbo and Liu, Yang and Li, Guanbin and Mao, Mingzhi and Lin, Liang}, journal={IEEE Transactions on Industrial Informatics}, year={2023} }
Urban Regional Function Guided Traffic Flow Prediction

Kuo Wang, Lingbo Liu, Yang Liu, Guanbin Li, Liang Lin

Information Sciences (INS), 2023

@article{TFP, title = {Urban regional function guided traffic flow prediction}, journal = {Information Sciences}, year = {2023}, author = {Kuo Wang and LingBo Liu and Yang Liu and GuanBin Li and Fan Zhou and Liang Lin}, }
Dual adversarial adaptation for cross-device real-world image super-resolution

Xiaoqian Xu, Pengxu Wei, Weikai Chen, Yang Liu, Mingzhi Mao, Liang Lin, Guanbin Li

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022 (Oral)

@inproceedings{xu2022dual, title={Dual adversarial adaptation for cross-device real-world image super-resolution}, author={Xu, Xiaoqian and Wei, Pengxu and Chen, Weikai and Liu, Yang and Mao, Mingzhi and Lin, Liang and Li, Guanbin}, booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition}, year={2022} }
Causal Reasoning Meets Visual Representation Learning: A Prospective Study

Yang Liu, Yushen Wei, Hong Yan, Guanbin Li, Liang Lin

Machine Intelligence Research (MIR), 2022 (Top-10 Downloads)

@article{liu2022causal, title={Causal Reasoning Meets Visual Representation Learning: A Prospective Study}, author={Liu, Yang and Wei, Yu-Shen and Yan, Hong and Li, Guan-Bin and Lin, Liang}, journal={Machine Intelligence Research}, year={2022} }
Cross-modal knowledge distillation for Vision-to-Sensor action recognition

Jianyuan Ni, Raunak Sarbajna, Yang Liu, Anne HH Ngu, Yan Yan

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022

@inproceedings{ni2022cross, title={Cross-modal knowledge distillation for vision-to-sensor action recognition}, author={Ni, Jianyuan and Sarbajna, Raunak and Liu, Yang and Ngu, Anne HH and Yan, Yan}, booktitle={ICASSP}, year={2022} }
TCGL: Temporal Contrastive Graph for Self-supervised Video Representation Learning

Yang Liu, Keze Wang, Lingbo Liu, Haoyuan Lan, Liang Lin

IEEE Transactions on Image Processing (T-IP), 2022 (ESI Highly Cited & Hot)

@article{liu2022tcgl, title={TCGL: Temporal Contrastive Graph for Self-Supervised Video Representation Learning}, author={Liu, Yang and Wang, Keze and Liu, Lingbo and Lan, Haoyuan and Lin, Liang}, journal={IEEE Transactions on Image Processing}, year={2022} }
Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition

Yang Liu, Keze Wang, Guanbin Li, Liang Lin

IEEE Transactions on Image Processing (T-IP), 2021

@article{liu2021semantics, title={Semantics-aware adaptive knowledge distillation for sensor-to-vision action recognition}, author={Liu, Yang and Wang, Keze and Li, Guanbin and Lin, Liang}, journal={IEEE Transactions on Image Processing}, year={2021} }
Deep Image-to-Video Adaptation and Fusion Networks for Action Recognition

Yang Liu, Zhaoyang Lu, Jing Li, Tao Yang, Chao Yao

IEEE Transactions on Image Processing (T-IP), 2020

@article{liu2019deep, title={Deep image-to-video adaptation and fusion networks for action recognition}, author={Liu, Yang and Lu, Zhaoyang and Li, Jing and Yang, Tao and Yao, Chao}, journal={IEEE Transactions on Image Processing}, year={2019} }
Hierarchically Learned View-Invariant Representations for Cross View Action Recognition

Yang Liu, Zhaoyang Lu, Jing Li, Tao Yang

IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), 2019

@article{liu2018hierarchically, title={Hierarchically learned view-invariant representations for cross-view action recognition}, author={Liu, Yang and Lu, Zhaoyang and Li, Jing and Yang, Tao}, journal={IEEE Transactions on Circuits and Systems for Video Technology}, year={2018} }
Global Temporal Representation based CNNs for Infrared Action Recognition

Yang Liu, Zhaoyang Lu, Jing Li, Tao Yang, Chao Yao

IEEE Signal Processing Letters (SPL), 2018

@article{liu2018global, title={Global temporal representation based cnns for infrared action recognition}, author={Liu, Yang and Lu, Zhaoyang and Li, Jing and Yang, Tao and Yao, Chao}, journal={IEEE Signal Processing Letters}, year={2018} }

PhD Dissertation

Cross-domain Human Action Recognition via Transfer Learning (基于迁移学习的跨域人体行为识别研究)

Yang Liu (Supervisor: Prof. Zhaoyang Lu)

Xidian University, 2019

@phdthesis{刘阳2019基于迁移学习的跨域人体行为识别研究, title={基于迁移学习的跨域人体行为识别研究}, author={刘阳}, year={2019}, school={西安电子科技大学} }

Academic Services

Reviewer for Journals

  • IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI)
  • IEEE Trans. on Image Processing (TIP)
  • IEEE Trans. on Neural Networks and Learning Systems (TNNLS)
  • IEEE Trans. on Cybernetics
  • International Journal of Computer Vision (IJCV)
  • ACM Trans. on Multimedia Computing, Communications, and Applications (TOMM)
  • Pattern Recognition (PR)
  • Neural Networks
  • Information Fusion
  • Advanced Science

PC Member / Reviewer for Conferences

CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, AAAI, IJCAI, ACM MM, ICASSP, UbiComp, ISWC
I serve as a Program Committee member or reviewer for top-tier computer vision and machine learning conferences.