Multi-Modality Researcher


Zhipu AI

Company Website


Beijing, China


We develop and align the GLM model series, including ChatGLM2-6/12/32/66/130B, CodeGeeX2, and VisualGLM. Both our open-source ChatGLM-6B model and its successor, ChatGLM2-6B, have attained #1 rankings on GitHub Trending and Hugging Face Trending (28 days). Together, these two models have been downloaded over five million times on Hugging Face within three months. We extend a warm invitation to talents who wish to join us in our ambitious mission of “teaching machines to think like humans”. We are currently offering positions across various fields, including Natural Language Processing (NLP), Vision, Multi-Modality Research, and Data Science and Engineering, and we welcome applicants at all levels of expertise.

The Multi-Modality Researcher will be responsible for the pre-training, alignment, instruction tuning, RLHF, and evaluation of large multi-modality models. The ideal candidate for this position:

  • Candidates should hold either a Ph.D., or a master’s degree with a minimum of two years’ experience in a research role.
  • A proven publication record in the relevant domain is required, evidenced by at least one paper focused on Computer Vision (CV) or Natural Language Processing (NLP).
  • We value a keen interest in multi-modality research. While experience in this field is not a prerequisite, candidates with a robust background in pure CV or NLP are highly encouraged to apply.
  • Exceptional engineering skills are critical for this role, enabling you to convert innovative ideas into tangible results.
  • Candidates familiar with large-scale pre-training, especially those with firsthand, extensive experience training language models with over 6 billion parameters, will receive preferential consideration.


Join us in pioneering the future of AI and pushing the boundaries of possibility.