Kavli Affiliate: Zhuo Li
| First 5 Authors: Zhuo Li, Yuhao Du, Jinpeng Hu, Xiang Wan, Anningzhe Gao
| Summary:
Large language models (LLMs) have shown success in generating high-quality
responses. To better align LLMs with human preferences, various methods have
been proposed that rely on specific optimization processes; these, however,
are not applicable to black-box LLMs such as GPT-4, whose parameters are
inaccessible. For black-box LLMs, performance depends heavily on the quality
of the provided prompts. Existing methods for enhancing response quality often
rely on a prompt refinement model, yet these approaches may suffer from
semantic inconsistencies between the refined and original prompts and
typically overlook the relationship between them. To
address these challenges, we introduce a self-instructed in-context learning
framework that empowers LLMs to deliver more effective responses by generating
reliable derived prompts to construct informative contextual environments. Our
approach incorporates a self-instructed reinforcement learning mechanism,
enabling direct interaction with the response model during derived prompt
generation for better alignment. We then formulate querying as an in-context
learning task, using responses from LLMs combined with the derived prompts to
establish a contextual demonstration for the original prompt. This strategy
ensures alignment with the original query, reduces discrepancies from refined
prompts, and maximizes the LLMs’ in-context learning capability. Extensive
experiments demonstrate that the proposed method not only generates more
reliable derived prompts but also significantly improves the responses of
LLMs, including black-box models such as GPT-4.
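
The following is a minimal sketch (not the authors' code) of the querying step
described above: a derived prompt and the LLM's response to it are used as a
contextual demonstration for the original prompt. The callables
`generate_derived_prompt` and `query_llm` are hypothetical placeholders for a
prompt-generation model and a black-box LLM API, respectively.

```python
# Sketch of in-context querying with a derived-prompt demonstration.
# All names here are illustrative assumptions, not the paper's implementation.
from typing import Callable

def answer_with_derived_context(
    original_prompt: str,
    generate_derived_prompt: Callable[[str], str],
    query_llm: Callable[[str], str],
) -> str:
    # 1. Generate a derived prompt intended to stay semantically close
    #    to the original query.
    derived_prompt = generate_derived_prompt(original_prompt)

    # 2. Query the black-box LLM with the derived prompt.
    derived_response = query_llm(derived_prompt)

    # 3. Use the (derived prompt, response) pair as an in-context
    #    demonstration and ask the LLM to answer the original prompt.
    icl_query = (
        "Example:\n"
        f"Q: {derived_prompt}\n"
        f"A: {derived_response}\n\n"
        "Now answer the original question.\n"
        f"Q: {original_prompt}\n"
        "A:"
    )
    return query_llm(icl_query)
```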
| Search Query: ArXiv Query: search_query=au:"Zhuo Li"&id_list=&start=0&max_results=3