Offensive Comment Detection Using Zero-Shot Learning: Nikhil Chilwant
Advisor: Prof. D. Klakow • In collaboration with Eternio GmbH.
Nikhil Chilwant
Matriculation no. : 2577689
▶ Use BERT to select data points from the ‘source domain’ similar to the ‘target domain’.
▶ The probability score from the domain classifier quantifies the domain similarity.
▶ Design the ‘learning curriculum’ of progressively harder samples: ‘easy’ → high probability.
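The curriculum step can be sketched as follows: given each source sample's target-domain probability from the domain classifier, order samples from easy (high probability) to hard (low probability). The function name and signature are illustrative, not the author's implementation.

```python
def build_curriculum(samples, domain_probs):
    """Order source-domain samples from 'easy' to 'hard'.

    A sample is 'easy' when the domain classifier assigns it a high
    probability of belonging to the target domain (illustrative sketch).
    """
    order = sorted(range(len(samples)), key=lambda i: domain_probs[i], reverse=True)
    return [samples[i] for i in order]
```

During training, batches would then be drawn from the front of this ordering first, gradually admitting harder (lower-probability) samples.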
\[
\min_{\theta} \; \frac{1}{|S|} \sum_{(x_i, y_i) \in S} L(x_i, y_i; \theta) \;+\; \lambda \, d_k^2(D_s, D_t; \theta) \tag{2}
\]
L: Cross-entropy loss
S: collection of the labelled source domain data
λ: regularization parameter
k: rational quadratic kernel
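A minimal NumPy sketch of the regularizer \(d_k^2(D_s, D_t)\), assuming it is a kernel two-sample discrepancy (squared MMD) computed with the rational quadratic kernel named above. The biased estimator and the kernel hyperparameters (`alpha`, `length_scale`) are assumptions for illustration, not the author's exact implementation.

```python
import numpy as np

def rational_quadratic_kernel(X, Y, alpha=1.0, length_scale=1.0):
    """k(x, y) = (1 + ||x - y||^2 / (2 * alpha * l^2))^(-alpha).

    X: (n, d) array, Y: (m, d) array; returns an (n, m) kernel matrix.
    """
    sq_dists = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return (1.0 + sq_dists / (2.0 * alpha * length_scale ** 2)) ** (-alpha)

def mmd_squared(Xs, Xt, kernel=rational_quadratic_kernel):
    """Biased estimator of the squared MMD between source and target features."""
    k_ss = kernel(Xs, Xs).mean()  # within-source similarity
    k_tt = kernel(Xt, Xt).mean()  # within-target similarity
    k_st = kernel(Xs, Xt).mean()  # cross-domain similarity
    return k_ss + k_tt - 2.0 * k_st
```

In Eq. (2) this quantity would be computed on model features (hence the dependence on θ) and scaled by λ; here it is shown on raw arrays for clarity.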
▶ Dataset? → Facebook’s hateful meme challenge [6].
▶ Includes ‘benign confounders’.
▶ Inspired by OSCAR.
▶ Visual region features and associated object tag pairs (figure a) are coupled in the shared space (figure b).
▶ Objects act like ‘anchor points’ in the semantics space (figure c).
▶ Train the ‘extended VL-BERT’ using caption text, object entity tags, race tags, and image regions.
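As a hypothetical sketch, the textual side of such an input could be assembled by concatenating the caption with the tag segments before tokenization. The separator token, segment order, and function name below are assumptions for illustration, not the actual ‘extended VL-BERT’ preprocessing.

```python
def build_text_input(caption, object_tags, race_tags, sep="[SEP]"):
    """Assemble the text stream fed alongside image-region features.

    Illustrative only: mirrors OSCAR-style pairing of caption text with
    object tags, here extended with race tags as an extra segment.
    """
    segments = [caption, " ".join(object_tags), " ".join(race_tags)]
    return f" {sep} ".join(seg for seg in segments if seg)
```

The image-region features would be appended as a separate visual input stream by the model itself; only the text assembly is shown here.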
▶ If the text and image do not align, then the meme is probably hateful.
▶ Use UNITER with an ITM (Image-Text Matching) head.
▶ Use ERNIE-ViL without any modification.
▶ AUROC: 0.845, accuracy: 73.20%.
▶ Next step: analyze the results and try to improve the performance.
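The alignment heuristic above can be sketched as a simple decision rule. In practice the match probability would come from the UNITER ITM head; here the score source and the threshold value are assumptions for illustration.

```python
def flag_if_misaligned(itm_match_prob, threshold=0.5):
    """Heuristic from the slide: if the image-text matching probability is
    low (text and image do not align), the meme is probably hateful.

    `threshold` is an assumed hyperparameter, not taken from the source.
    """
    return "hateful" if itm_match_prob < threshold else "not hateful"
```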