👀 About Me

Hi, I’m Xiaorui Jiang, a third-year master’s student at the University of Science and Technology of China.

My research interests focus on cross-modal information representation and fusion, with emphasis on the following aspects:

  1. Multimodal Information Fusion: Investigating the representation and fusion of multimodal data in unsupervised settings,
    with a focus on methods such as federated learning (FL) and multi-view learning (MVL), aiming to achieve efficient
    information fusion without relying on labeled data.

  2. Multimodal Large Language Models (MLLMs): Improving the inference performance of MLLMs, with particular attention
    to strategies that require no additional training, to strengthen their reasoning ability and generalization.

  3. Applications of MLLMs: Starting in Nov. 2025, I will be working as an RA at PolyU, where I will focus on fine-tuning
    and optimizing MLLMs for specific domains and on applying these models in practical scenarios.

Currently, my papers under review or in progress include: (a) Unsupervised federated multi-view data representation;
(b) Learning dynamics in multi-view data representation; (c) Research on attention mechanisms in MLLMs and training-free
strategies for performance enhancement; (d) Training-free token compression methods for MLLMs.

If any of these topics interest you, I would be truly grateful for your insights.

👇 Please feel free to reach out to me (Last updated: Sept. 20, 2025)

📖 Google Scholar    💬 WeChat     📧 Email