# Key-Value Memory with Self-Attention for MultiModal Rumor Detection
This repository contains the code for the paper "Key-Value Memory with Self-Attention for MultiModal Rumor Detection" (He et al.), published at ACM Multimedia (ACMMM) 2020.
## Overview
Social media is rife with rumors and misinformation. Many previous studies rely on the textual modality alone for rumor detection, but a news story can be better understood with multimodal data such as text and images. In this paper, we propose a novel rumor detection model that incorporates both textual and visual features by learning the relations between images and texts as well as the relations within each modality. The model consists of three main layers. First, the image-text fusion layer combines image and text feature vectors, applying cross-attention and multi-head self-attention mechanisms to capture their relations. Second, the key-value memory layer stores multimodal representations of rumors and non-rumors and uses them as references against which the target news is compared. Finally, the prediction layer outputs the probability that the target news is a rumor. Experiments on two real-world datasets show that the proposed model outperforms state-of-the-art methods.
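To make the architecture concrete, below is a minimal NumPy sketch of the two core operations described above: cross-attention between text and image features, followed by a key-value memory read. All names, shapes, memory sizes, and the single-head simplification are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(text, image, d):
    # text: (n_words, d), image: (n_regions, d) -- hypothetical shapes.
    # Each word attends over image regions (single head for brevity; the
    # model also applies multi-head self-attention on top of this step).
    scores = text @ image.T / np.sqrt(d)      # (n_words, n_regions)
    return softmax(scores, axis=-1) @ image   # image-aware word features

def memory_read(query, keys, values, d):
    # keys/values: stored multimodal representations of known rumors and
    # non-rumors; query: fused representation of the target news.
    # Returns a reference vector consumed by the prediction layer.
    weights = softmax(query @ keys.T / np.sqrt(d))  # similarity to memory slots
    return weights @ values

d = 64
rng = np.random.default_rng(0)
text_feats = rng.normal(size=(20, d))   # 20 word vectors (placeholder)
img_feats = rng.normal(size=(49, d))    # e.g. a flattened 7x7 feature map
fused = cross_attention(text_feats, img_feats, d).mean(axis=0)  # pooled query
mem_keys = rng.normal(size=(100, d))    # 100 memory slots (assumed size)
mem_vals = rng.normal(size=(100, d))
reference = memory_read(fused, mem_keys, mem_vals, d)
print(reference.shape)  # (64,)
```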
## Requirements
- Python 3.6
- TensorFlow 1.4.0
- numpy
- pandas
- json (Python standard library)
- pickle (Python standard library)
- scikit-learn
- Keras with TensorFlow backend (for feature extraction)
## Data Preparation
### Twitter Dataset
The Twitter multimodal dataset can be downloaded from [Twitter Dataset](https://drive.google.com/file/d/1z09AtuyNQ_Xk2jQ62A7KW8bQPQJm8Zrs/view?usp=sharing). Put the downloaded images under the directory `./twitter_images`. Sample images are shown in `./images`.
### Weibo Dataset
The Weibo dataset is collected from [Weibo](https://www.weibo.com). However, we are unable to share it due to user privacy concerns.
### Feature Extraction
Run the following command to extract visual features with VGG19:
```bash
python img_feature_extraction.py --image_dir ./twitter_images --output_file ./data/twitter_img_features.pickle
```
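For reference, here is a minimal sketch of the kind of extraction `img_feature_extraction.py` performs, using the Keras VGG19 application mentioned in the requirements. The choice of the `fc2` layer and the filename-to-vector dictionary format are assumptions, not necessarily the script's exact behavior.

```python
import os
import pickle
import numpy as np
from keras.applications.vgg19 import VGG19, preprocess_input
from keras.preprocessing import image
from keras.models import Model

# Use the penultimate fully connected layer ('fc2', 4096-d) as the image
# representation (an assumption; the script may use a different layer).
base = VGG19(weights='imagenet', include_top=True)
extractor = Model(inputs=base.input, outputs=base.get_layer('fc2').output)

features = {}
image_dir = './twitter_images'
for fname in os.listdir(image_dir):
    img = image.load_img(os.path.join(image_dir, fname), target_size=(224, 224))
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    features[fname] = extractor.predict(x)[0]  # one (4096,) vector per image

with open('./data/twitter_img_features.pickle', 'wb') as f:
    pickle.dump(features, f)
```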
## Training
Run the following command to train the model on the Twitter dataset:
```bash
python train.py --data_dir ./data/twitter --image_feature_path ./data/twitter_img_features.pickle --output_dir ./models/twitter
```
## Testing
Run the following command to test the model on the Twitter dataset:
```bash
python test.py --data_dir ./data/twitter --image_feature_path ./data/twitter_img_features.pickle --model_dir ./models/twitter
```
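`test.py` reports classification performance on the held-out set. If you want to score saved predictions yourself, the standard metrics can be computed with scikit-learn (already in the requirements); the label vectors below are placeholders for illustration only.

```python
from sklearn.metrics import accuracy_score, classification_report

# y_true / y_pred would come from the test set and the model's output;
# the values here are hypothetical examples.
y_true = [1, 0, 1, 1, 0]   # 1 = rumor, 0 = non-rumor
y_pred = [1, 0, 0, 1, 0]

print('Accuracy:', accuracy_score(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=['non-rumor', 'rumor']))
```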
## Citation
If you use this code in your work, please cite our paper:
```bibtex
@inproceedings{he2020key,
  title={Key-Value Memory with Self-Attention for MultiModal Rumor Detection},
  author={He, Xin and Zhang, Xingkun and Lei, Tao and Zhang, Min and Chen, Bo and Xu, Peng and Zhou, Ke},
  booktitle={Proceedings of the 28th ACM International Conference on Multimedia},
  pages={1478--1486},
  year={2020}
}
```