Offensive Comment Detection Using Zero-Shot Learning: Nikhil Chilwant
Advisor: Prof. D. Klakow • In collaboration with Eternio GmbH.
Nikhil Chilwant
Matriculation no. : 2577689
▶ Use BERT to select data points from the ‘source domain’ similar to the ‘target domain’.
▶ The probability score from the domain classifier quantifies the domain similarity.
▶ Design the ‘learning curriculum’ of progressively harder samples: ‘easy’ → high probability.
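The curriculum step can be sketched as follows: given each source sample's target-domain probability from the domain classifier, order samples from easy (high probability) to hard (low probability). The function name and signature are illustrative, not the author's implementation.

```python
def build_curriculum(samples, domain_probs):
    """Order source-domain samples from 'easy' to 'hard'.

    A sample is 'easy' when the domain classifier assigns it a high
    probability of belonging to the target domain (illustrative sketch).
    """
    order = sorted(range(len(samples)), key=lambda i: domain_probs[i], reverse=True)
    return [samples[i] for i in order]
```

During training, batches would then be drawn from the front of this ordering first, gradually admitting harder (lower-probability) samples.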
\[
\min_{\theta} \; \frac{1}{|S|} \sum_{(x_i, y_i) \in S} L(x_i, y_i; \theta) \;+\; \lambda \, d_k^2(D_s, D_t; \theta) \tag{2}
\]
L: Cross-entropy loss
S: collection of the labelled source domain data
λ: regularization parameter
k: rational quadratic kernel
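A minimal NumPy sketch of the regularizer \(d_k^2(D_s, D_t)\), assuming it is a kernel two-sample discrepancy (squared MMD) computed with the rational quadratic kernel named above. The biased estimator and the kernel hyperparameters (`alpha`, `length_scale`) are assumptions for illustration, not the author's exact implementation.

```python
import numpy as np

def rational_quadratic_kernel(X, Y, alpha=1.0, length_scale=1.0):
    """k(x, y) = (1 + ||x - y||^2 / (2 * alpha * l^2))^(-alpha).

    X: (n, d) array, Y: (m, d) array; returns an (n, m) kernel matrix.
    """
    sq_dists = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return (1.0 + sq_dists / (2.0 * alpha * length_scale ** 2)) ** (-alpha)

def mmd_squared(Xs, Xt, kernel=rational_quadratic_kernel):
    """Biased estimator of the squared MMD between source and target features."""
    k_ss = kernel(Xs, Xs).mean()  # within-source similarity
    k_tt = kernel(Xt, Xt).mean()  # within-target similarity
    k_st = kernel(Xs, Xt).mean()  # cross-domain similarity
    return k_ss + k_tt - 2.0 * k_st
```

In Eq. (2) this quantity would be computed on model features (hence the dependence on θ) and scaled by λ; here it is shown on raw arrays for clarity.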
▶ Dataset? → Facebook’s hateful meme challenge [6].
▶ Includes ‘benign confounders’.
▶ Inspired by OSCAR.
▶ Visual region features and associated object tag pairs (figure a) are coupled in the shared space (figure b).
▶ Objects act like ‘anchor points’ in the semantics space (figure c).
▶ Train the ‘extended VL-BERT’ using caption text, object entity tags, race tags, and image regions.
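As a hypothetical sketch, the textual side of such an input could be assembled by concatenating the caption with the tag segments before tokenization. The separator token, segment order, and function name below are assumptions for illustration, not the actual ‘extended VL-BERT’ preprocessing.

```python
def build_text_input(caption, object_tags, race_tags, sep="[SEP]"):
    """Assemble the text stream fed alongside image-region features.

    Illustrative only: mirrors OSCAR-style pairing of caption text with
    object tags, here extended with race tags as an extra segment.
    """
    segments = [caption, " ".join(object_tags), " ".join(race_tags)]
    return f" {sep} ".join(seg for seg in segments if seg)
```

The image-region features would be appended as a separate visual input stream by the model itself; only the text assembly is shown here.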
▶ If the text and image do not align, then the meme is probably hateful.
▶ Use UNITER with an ITM (Image-Text Matching) head.
▶ Use ERNIE-ViL without any modification.
▶ AUROC: 0.845, accuracy: 73.20%.
▶ Next step: analyze the results and try to improve the performance.
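The alignment heuristic above can be sketched as a simple decision rule. In practice the match probability would come from the UNITER ITM head; here the score source and the threshold value are assumptions for illustration.

```python
def flag_if_misaligned(itm_match_prob, threshold=0.5):
    """Heuristic from the slide: if the image-text matching probability is
    low (text and image do not align), the meme is probably hateful.

    `threshold` is an assumed hyperparameter, not taken from the source.
    """
    return "hateful" if itm_match_prob < threshold else "not hateful"
```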