Kaijie Zhu

"To see the world, things dangerous to come to, to see behind walls, draw closer, to find each other and to feel."

zhukaijie2021@ia.ac.cn

Beijing, China

I’m a third-year Master's student at the State Key Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences. I have also spent time at Microsoft, advised by Prof. Jindong Wang and Prof. Xing Xie.

My research interests lie in developing trustworthy AI systems and evaluating foundation models.

  • Trustworthy AI:
    • Reinforcing the robustness of foundation models against unexpected inputs, such as adversarial examples and jailbreak prompts.
    • Detecting AI-Generated Content (AIGC).
  • Evaluation of foundation models:
    • Dynamic evaluation to address test data contamination.
    • New evaluation metrics for generative models.
    • Evaluation benchmarks reflecting diverse real-world scenarios.

Please refer to my statement of purpose for details!

news

May 2, 2024 DyVal 2 is accepted to ICML 2024.
Jan 17, 2024 DyVal is accepted to ICLR 2024 as a spotlight paper!
Oct 22, 2023 I am looking for a Ph.D. position starting in Fall 2024!
Jul 18, 2023 Our paper “Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning” is accepted to ICCV 2023!

selected publications

  1. DyVal: Graph-informed Dynamic Evaluation of Large Language Models
    Kaijie Zhu, Jiaao Chen, Jindong Wang, and 3 more authors
    ICLR 2024 (Spotlight, Top 5%), 2023
  2. PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts
    Kaijie Zhu, Jindong Wang, Jiaheng Zhou, and 8 more authors
    arXiv preprint arXiv:2306.04528, 2023
  3. Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning
    Kaijie Zhu, Xixu Hu, Jindong Wang, and 2 more authors
    In ICCV 2023, 2023
  4. DyVal 2: Dynamic Evaluation of Large Language Models by Meta Probing Agents
    Kaijie Zhu, Jindong Wang, Qinlin Zhao, and 2 more authors
    In ICML 2024, 2024