[[Home](Home)]

---
A team led by Prof. Bialonski took first place in the international competition "GermEval", which addressed the identification of toxic and fact-claiming comments in social media data. Several teams had submitted their machine learning models this summer to automatically identify problematic Facebook comments; such models have the potential to assist moderators in sifting through user-generated content. The team's model beat the models of all other participants, including teams from the German Research Center for Artificial Intelligence (DFKI), TU Vienna, the University of Regensburg, the Austrian Institute of Technology, and the City University of New York (CUNY).

We are happy that the team has agreed to tell us more about their approach and their experience of taking part in such a competition.
Data Science, FB Medizintechnik und Technomathematik, FH Aachen

**Monday, 20 September 2021, 11am**

**Link to the VC:** https://webconf.fz-juelich.de/b/wen-mym-pj7

**Abstract:** The availability of language representations learned by large pretrained neural network models (such as BERT and ELECTRA) has led to improvements in many downstream Natural Language Processing tasks in recent years. Pretrained models usually differ in their pretraining objectives, architectures, and the datasets they are trained on, all of which can affect downstream performance. In this talk, we introduce our approach, which was ranked first in the GermEval 2021 competition for identifying toxic and fact-claiming comments. We created ensembles of BERT and ELECTRA models and investigated whether and how classification performance depends on the number of ensemble members and their composition, and whether exploiting label-label correlations was helpful for improving classification performance.

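For readers unfamiliar with model ensembling, the following is a minimal sketch (not the speakers' code) of the general idea behind a soft-voting ensemble of pretrained transformer classifiers: each member produces class probabilities, and the ensemble prediction is the class with the highest average probability. The checkpoint names and the binary toxic/non-toxic label scheme are illustrative assumptions; in practice each member would first be fine-tuned on the GermEval 2021 training data, and hard majority voting would be an alternative combination rule.

```python
"""Minimal sketch of a soft-voting ensemble of transformer classifiers.

NOT the speakers' implementation: checkpoint names and the binary
toxic/non-toxic label scheme are illustrative assumptions only.
"""
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical ensemble members: one German BERT and one German ELECTRA checkpoint.
CHECKPOINTS = [
    "deepset/gbert-base",
    "german-nlp-group/electra-base-german-uncased",
]


def ensemble_predict(text: str) -> int:
    """Average the members' class probabilities (soft voting) and return the argmax."""
    member_probs = []
    for name in CHECKPOINTS:
        tokenizer = AutoTokenizer.from_pretrained(name)
        # num_labels=2: toxic vs. non-toxic (here the classification head is untrained;
        # each member would be fine-tuned on labeled comments before being used).
        model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
        model.eval()
        inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
        with torch.no_grad():
            logits = model(**inputs).logits  # shape: (1, 2)
        member_probs.append(torch.softmax(logits, dim=-1))
    # Soft voting: pick the class with the highest mean probability across members.
    mean_probs = torch.stack(member_probs).mean(dim=0)
    return int(mean_probs.argmax(dim=-1).item())


if __name__ == "__main__":
    print(ensemble_predict("Das ist ein Beispielkommentar."))
```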
---

[[Home](Home)]