Multi-Modal Knowledge Graph Transformer Framework for Multi-Modal Entity Alignment

Kavli Affiliate: Lihong Wang

| First 5 Authors: Qian Li, Cheng Ji, Shu Guo, Zhaoji Liang, Lihong Wang

| Summary:

Multi-Modal Entity Alignment (MMEA) is a critical task that aims to identify
equivalent entity pairs across multi-modal knowledge graphs (MMKGs). The task
is challenging because it must reconcile several different types of
information, including neighboring entities, multi-modal attributes, and
entity types. Directly incorporating this information (e.g., via concatenation
or attention) can lead to an unaligned information space. To address these
challenges, we propose a novel MMEA transformer, called MoAlign, that
hierarchically introduces neighbor features, multi-modal attributes, and
entity types to enhance the alignment task. Taking advantage of the
transformer's ability to integrate multiple types of information, we design a
hierarchical modifiable self-attention block in the transformer encoder to
preserve the unique semantics of the different information types. Furthermore,
we design two entity-type prefix injection methods that integrate entity-type
information through type prefixes, which helps constrain the global
information of entities not present in the MMKGs. Extensive experiments on
benchmark datasets demonstrate that our approach outperforms strong
competitors and achieves excellent entity alignment performance.
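
The abstract only sketches the architecture at a high level. As a rough,
illustrative sketch (not the authors' released code), the following PyTorch
snippet shows one way a hierarchical self-attention block with an entity-type
prefix could be organized; the module name HierarchicalAlignBlock, the tensor
layouts, and the two-stage neighbor-then-attribute ordering are assumptions
made purely for illustration.

    # Illustrative sketch only -- not the MoAlign implementation. Assumes PyTorch
    # and hypothetical inputs: per-entity neighbor embeddings, multi-modal
    # attribute embeddings, and a learned entity-type prefix token.
    import torch
    import torch.nn as nn


    class HierarchicalAlignBlock(nn.Module):
        """Toy hierarchical attention block: the entity query first attends to its
        neighbor features, then to its multi-modal attribute features, with a
        type-prefix token prepended to each key/value set (prefix injection)."""

        def __init__(self, dim: int = 128, num_heads: int = 4, num_types: int = 16):
            super().__init__()
            self.type_prefix = nn.Embedding(num_types, dim)  # one prefix token per entity type
            self.neighbor_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.attribute_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
            self.norm1 = nn.LayerNorm(dim)
            self.norm2 = nn.LayerNorm(dim)

        def forward(self, entity, neighbors, attributes, type_ids):
            # entity:     (B, 1, D)   query embedding of the entity to align
            # neighbors:  (B, Nn, D)  embeddings of neighboring entities
            # attributes: (B, Na, D)  embeddings of multi-modal attributes
            # type_ids:   (B,)        integer entity-type ids
            prefix = self.type_prefix(type_ids).unsqueeze(1)  # (B, 1, D)

            # Stage 1: attend over [type prefix; neighbors].
            kv = torch.cat([prefix, neighbors], dim=1)
            h, _ = self.neighbor_attn(entity, kv, kv)
            h = self.norm1(entity + h)

            # Stage 2: attend over [type prefix; attributes] with the updated query.
            kv = torch.cat([prefix, attributes], dim=1)
            out, _ = self.attribute_attn(h, kv, kv)
            return self.norm2(h + out)


    # Minimal usage example with random tensors.
    block = HierarchicalAlignBlock()
    entity = torch.randn(2, 1, 128)
    out = block(entity, torch.randn(2, 5, 128), torch.randn(2, 3, 128), torch.tensor([0, 3]))
    print(out.shape)  # torch.Size([2, 1, 128])

Prepending a learned type-prefix token to each key/value set is used here as a
simple stand-in for the paper's prefix-injection idea; the actual MoAlign
design may differ.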

| Search Query: ArXiv Query: search_query=au:"Lihong Wang"&id_list=&start=0&max_results=3
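
The query string above uses the parameters of the public arXiv API (endpoint
http://export.arxiv.org/api/query), which returns an Atom XML feed. A minimal
Python sketch to reproduce the lookup:

    # Sketch of the arXiv API lookup corresponding to the search query above.
    import urllib.parse
    import urllib.request

    params = urllib.parse.urlencode({
        "search_query": 'au:"Lihong Wang"',
        "id_list": "",
        "start": 0,
        "max_results": 3,
    })
    with urllib.request.urlopen(f"http://export.arxiv.org/api/query?{params}") as resp:
        print(resp.read().decode("utf-8")[:500])  # first part of the Atom XML feed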
