ORCHID: A Chinese Debate Corpus for Target-Independent Stance Detection and Argumentative Dialogue Summarization

Kavli Affiliate: Ke Wang

| First 5 Authors: Xiutian Zhao, Ke Wang, Wei Peng

| Summary:

Dialogue agents have been receiving increasing attention for years, and this
trend has been further boosted by the recent progress of large language models
(LLMs). Stance detection and dialogue summarization are two core tasks of
dialogue agents in application scenarios that involve argumentative dialogues.
However, research on these tasks is limited by the scarcity of public
datasets, especially for non-English languages. To address this language
resource gap in Chinese, we present ORCHID (Oral Chinese Debate), the first
Chinese dataset for benchmarking target-independent stance detection and debate
summarization. Our dataset consists of 1,218 real-world debates that were
conducted in Chinese on 476 unique topics, containing 2,436 stance-specific
summaries and 14,133 fully annotated utterances. Besides providing a versatile
testbed for future research, we also conduct an empirical study on the dataset
and propose an integrated task. The results show that the dataset is
challenging and suggest the potential of incorporating stance detection into
summarization for argumentative dialogue.

| Search Query: ArXiv Query: search_query=au:"Ke Wang"&id_list=&start=0&max_results=3