# Key-Value Memory with Self-Attention for MultiModal Rumor Detection
This repository contains the code for the paper "Key-Value Memory with Self-Attention for MultiModal Rumor Detection" (He et al.), published at ACM Multimedia (ACMMM) 2020.
## Overview
Social media is rife with rumors and misinformation. Many previous studies rely on the textual modality alone for rumor detection, but a news story can be better understood with multimodal data such as text and images. In this paper, we propose a novel rumor detection model that incorporates both textual and visual features by learning the relations between images and texts as well as the relations within each modality. The model consists of three main layers. First, the image-text fusion layer combines image and text feature vectors, applying cross-attention and multi-head self-attention mechanisms to capture their relations. Second, the key-value memory layer stores multimodal representations of rumors and non-rumors and uses them as references against which the target news is compared. Finally, the prediction layer outputs the probability that the target news is a rumor. Experiments on two real-world datasets show that the proposed model outperforms state-of-the-art methods.
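To make the architecture concrete, below is a minimal NumPy sketch of the two core operations described above: cross-attention between text and image features, followed by a key-value memory read. All names, shapes, memory sizes, and the single-head simplification are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(text, image, d):
    # text: (n_words, d), image: (n_regions, d) -- hypothetical shapes.
    # Each word attends over image regions (single head for brevity; the
    # model also applies multi-head self-attention on top of this step).
    scores = text @ image.T / np.sqrt(d)      # (n_words, n_regions)
    return softmax(scores, axis=-1) @ image   # image-aware word features

def memory_read(query, keys, values, d):
    # keys/values: stored multimodal representations of known rumors and
    # non-rumors; query: fused representation of the target news.
    # Returns a reference vector consumed by the prediction layer.
    weights = softmax(query @ keys.T / np.sqrt(d))  # similarity to memory slots
    return weights @ values

d = 64
rng = np.random.default_rng(0)
text_feats = rng.normal(size=(20, d))   # 20 word vectors (placeholder)
img_feats = rng.normal(size=(49, d))    # e.g. a flattened 7x7 feature map
fused = cross_attention(text_feats, img_feats, d).mean(axis=0)  # pooled query
mem_keys = rng.normal(size=(100, d))    # 100 memory slots (assumed size)
mem_vals = rng.normal(size=(100, d))
reference = memory_read(fused, mem_keys, mem_vals, d)
print(reference.shape)  # (64,)
```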
## Requirements
- Python 3.6
- TensorFlow 1.4.0
- numpy
- pandas
- json (Python standard library)
- pickle (Python standard library)
- scikit-learn
- Keras with TensorFlow backend (for feature extraction)
## Data Preparation
### Twitter Dataset
The Twitter multimodal dataset can be downloaded from [Twitter Dataset](https://drive.google.com/file/d/1z09AtuyNQ_Xk2jQ62A7KW8bQPQJm8Zrs/view?usp=sharing). Put the downloaded images under the directory `./twitter_images`. Sample images are shown in `./images`.
### Weibo Dataset
The Weibo dataset is collected from [Weibo](https://www.weibo.com). However, we are unable to share it due to user privacy concerns.
### Feature Extraction
Run the following command to extract visual features with VGG19:
```bash
python img_feature_extraction.py --image_dir ./twitter_images --output_file ./data/twitter_img_features.pickle
```
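For reference, here is a minimal sketch of the kind of extraction `img_feature_extraction.py` performs, using the Keras VGG19 application mentioned in the requirements. The choice of the `fc2` layer and the filename-to-vector dictionary format are assumptions, not necessarily the script's exact behavior.

```python
import os
import pickle
import numpy as np
from keras.applications.vgg19 import VGG19, preprocess_input
from keras.preprocessing import image
from keras.models import Model

# Use the penultimate fully connected layer ('fc2', 4096-d) as the image
# representation (an assumption; the script may use a different layer).
base = VGG19(weights='imagenet', include_top=True)
extractor = Model(inputs=base.input, outputs=base.get_layer('fc2').output)

features = {}
image_dir = './twitter_images'
for fname in os.listdir(image_dir):
    img = image.load_img(os.path.join(image_dir, fname), target_size=(224, 224))
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    features[fname] = extractor.predict(x)[0]  # one (4096,) vector per image

with open('./data/twitter_img_features.pickle', 'wb') as f:
    pickle.dump(features, f)
```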
## Training
Run the following command to train the model on the Twitter dataset:
```bash
python train.py --data_dir ./data/twitter --image_feature_path ./data/twitter_img_features.pickle --output_dir ./models/twitter
```
## Testing
Run the following command to test the model on the Twitter dataset:
```bash
python test.py --data_dir ./data/twitter --image_feature_path ./data/twitter_img_features.pickle --model_dir ./models/twitter
```
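`test.py` reports classification performance on the held-out set. If you want to score saved predictions yourself, the standard metrics can be computed with scikit-learn (already in the requirements); the label vectors below are placeholders for illustration only.

```python
from sklearn.metrics import accuracy_score, classification_report

# y_true / y_pred would come from the test set and the model's output;
# the values here are hypothetical examples.
y_true = [1, 0, 1, 1, 0]   # 1 = rumor, 0 = non-rumor
y_pred = [1, 0, 0, 1, 0]

print('Accuracy:', accuracy_score(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=['non-rumor', 'rumor']))
```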
## Citation
If you use this code in your work, please cite our paper:
```bibtex
@inproceedings{he2020key,
  title={Key-Value Memory with Self-Attention for MultiModal Rumor Detection},
  author={He, Xin and Zhang, Xingkun and Lei, Tao and Zhang, Min and Chen, Bo and Xu, Peng and Zhou, Ke},
  booktitle={Proceedings of the 28th ACM International Conference on Multimedia},
  pages={1478--1486},
  year={2020}
}
```