Citation Recommendation with Large Language Models

Citation Recommendation, Large Language Models, Scientific Document Processing, Natural
Language Processing, Information Retrieval

Supporting claims with citations is a key part of scientific writing; however, due to the ever-increasing size of the literature, finding appropriate papers to cite can be a challenging and time-consuming task. Could Large Language Models (LLMs) such as ChatGPT (Liu et al., 2023), which has exploded in popularity in recent months thanks to its impressive performance on language-related tasks, be a solution?
The goal of this project is to explore how LLMs can be used for Citation Recommendation (CR):
ranking papers based on their suitability as a citation within a query text.
The main challenge of this project is that LLMs are trained to generate text, so it is not obvious how they should be applied to a recommendation problem like CR. Recent work such as Galactica (Taylor et al., 2022) has found success in directly generating the title of an appropriate paper to cite, albeit at the cost of occasionally hallucinating papers that do not exist. Another possibility would be to rank candidates by the LLM's log-probability of generating each candidate's title. Can you think of other interesting methods?
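To make the log-probability idea concrete, here is a minimal sketch of the ranking step. The `token_logprob` function is a hypothetical stand-in for a call to a real LM (e.g. one forward pass returning the log-probability of the next token given the context); the whitespace tokenization and length normalization are simplifying assumptions, not part of any specific method from the literature.

```python
import math

def score_title(prompt, title, token_logprob):
    # Sum the per-token log-probabilities of generating `title` after `prompt`.
    # `token_logprob(context, token)` is a placeholder for a real LM call.
    # We length-normalize so longer titles are not unfairly penalized.
    tokens = title.split()  # toy whitespace tokenization for illustration
    context = prompt
    total = 0.0
    for tok in tokens:
        total += token_logprob(context, tok)
        context += " " + tok  # condition later tokens on earlier ones
    return total / len(tokens)

def rank_candidates(prompt, titles, token_logprob):
    # Score every candidate title and sort in descending order of score.
    scored = [(title, score_title(prompt, title, token_logprob)) for title in titles]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With a real LM, `token_logprob` would be replaced by the model's conditional next-token log-probabilities; the ranking logic itself stays the same.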
You will be responsible for (1) conducting a literature review on the use of LLMs in similar recommendation problems, (2) designing different ways to use LLMs for CR, (3) evaluating the performance of pretrained LLMs on CR, and (4) fine-tuning pretrained LLMs for CR. We will schedule regular meetings to discuss ideas, progress, and next steps.
We are looking for a motivated student with an interest in Natural Language Processing and good programming skills. We intend for this project to require six months of full-time work, but its scope and length can be tailored to the preferences of the applicant.

To apply, please send your CV and a recent transcript to Jessica Lam (lamjessica (at) or Prof. Dr. Richard Hahnloser (rich (at)

Liu, Yiheng, et al. "Summary of ChatGPT/GPT-4 Research and Perspective Towards the Future of Large Language Models." arXiv preprint arXiv:2304.01852 (2023).
Taylor, Ross, et al. "Galactica: A large language model for science." arXiv preprint arXiv:2211.09085 (2022).

© 2023 Institut für Neuroinformatik