Junyan Lin - Computer Science Ph.D. Student

About Me

I am a Ph.D. student in Computer Science at The Hong Kong Polytechnic University, supervised by Prof. Changwen Chen and Dr. Xin Jin. My recent work focuses on multimodal foundation models and remote sensing understanding. If you are interested in collaboration, feel free to contact me by email.

Research Interests

Multimodal Large Language Models
Vision-Language-Action Models and World Models
Multisource Remote Sensing Image Classification

Education

Ph.D. Student in Computer Science (Aug. 2025 - Expected Jun. 2029)
The Hong Kong Polytechnic University, Department of Computing, Hong Kong, China
Supervisors: Changwen Chen, Xin Jin
Visiting Student (Jan. 2024 - Aug. 2025)
Eastern Institute for Advanced Study, College of Information Science and Technology, Ningbo, China
Supervisor: Xiaoyu Shen
M.Phil. in Computer Science and Technology (GPA: 3.82/4.0, Sep. 2022 - Jun. 2025)
Ocean University of China, Qingdao, China
B.S. in Computer Science and Technology (GPA: 88/100, Rank: 1/68, Sep. 2018 - Jun. 2022)
Zhejiang Gongshang University, Hangzhou, China

Awards

National College Students Information Security Competition - National Second Prize (2021)
National Scholarship for Graduate Students (2024)
Outstanding Graduate Student (2024)

Publications

Multimodal Large Language Models:

J. Zhang, J. Tong, J. Lin*, et al. Think-as-You-See: Streaming Chain-of-Thought Reasoning for Large Vision-Language Models. CVPR 2026 (CCF-A, CV Tier 1 Conference)
J. Lin, J. Liu, S. Zhao, et al. Efficient Token Compression for the Understanding and Generation Unified MLLMs. ISCAS 2026 (CCF-B)
H. Chen, J. Lin, X. Chen, et al. Multimodal Language Models See Better When They Look Shallower. EMNLP 2025 Oral (CCF-B, NLP Tier 1 Conference)
J. Lin, H. Chen, Y. Fan, et al. Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices. CVPR 2025 (CCF-A, CV Tier 1 Conference)
J. Lin, H. Chen, D. Zhu, et al. To Preserve or Compress: A Study of Connectors Through Perceptual Efficacy in Multimodal Large Language Models. EMNLP 2024 (CCF-B, NLP Tier 1 Conference)
J. Lin, J. Tong, H. Wu, et al. Speak While Watching: Unleashing TRUE Real-Time Video Understanding Capability of Multimodal Large Language Models. arXiv 2026
J. Liu, Y. Wei, J. Lin, et al. Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs. VCIP 2024

Multisource Remote Sensing Image Classification:

J. Lin, F. Gao, X. Shi, et al. SS-MAE: Spatial-Spectral Masked Autoencoder for Multisource Remote Sensing Image Classification. IEEE TGRS 2023 (CCF-B, CAS Q1, JCR Q1, IF:8.2)
J. Lin, F. Gao, L. Qi, et al. Dynamic Cross-Modal Feature Interaction Network for Hyperspectral and LiDAR Data Classification. IEEE TGRS 2025 (CCF-B, CAS Q1, JCR Q1, IF:8.2)
J. Lin, X. Jin, F. Gao, et al. Boosting Spatial-Spectral Masked Auto-Encoder Through Mining Redundant Spectra for HSI-SAR/LiDAR Classification. IGARSS 2024
L. Cheng, J. Lin, F. Gao, et al. Hyperspectral Image Change Detection via Cross-Sample Slot Attention and Dual Gated Feed-Forward Network. PRCV 2024 (CCF-C)
X. Jin, J. Lin, F. Gao, et al. Sparse Focus Network for Multi-Source Remote Sensing Data Classification. IGARSS 2024
X. Shi, J. Lin, Y. Rao, et al. Gated-Cross Aggregation Network for Hyperspectral and LiDAR Data Classification. IGARSS 2023
L. Lv, J. Lin, F. Gao, et al. Hyperspectral and SAR Image Classification via Recursive Feature Interactive Fusion Network. IGARSS 2023
S. Hu, Y. Hu, J. Lin, et al. Multi-Scale Transformer Network for Hyperspectral Image Denoising. IGARSS 2023

For a complete list of publications, please see my Google Scholar profile.

Personal Interests

In my free time, I enjoy riding motorcycles, working out, and watching films.