@@ -17,5 +17,7 @@ Data Science, FB Medizintechnik und Technomathematik, FH Aachen
**Abstract:** The availability of language representations learned by large pretrained neural network models (such as BERT and ELECTRA) has led to improvements in many downstream Natural Language Processing tasks in recent years. Pretrained models usually differ in pretraining objectives, architectures, and the datasets they are trained on, all of which can affect downstream performance. In this talk, we introduce our approach, which was ranked first in the GermEval 2021 competition for identifying toxic and fact-claiming comments. We created ensembles of BERT and ELECTRA models and investigated whether and how classification performance depends on the number of ensemble members and their composition, and whether exploiting label-label correlations was helpful for improving classification performance.
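A common way to combine such fine-tuned models is to average their per-comment probabilities and threshold the mean. The sketch below illustrates this soft-voting idea with hypothetical scores; the function name, member probabilities, and threshold are illustrative assumptions, not the authors' actual setup.

```python
import numpy as np

def ensemble_predict(member_probs, threshold=0.5):
    """Soft-voting sketch: average the positive-class probabilities of all
    ensemble members, then threshold the mean to get a binary label.

    member_probs: array of shape (n_members, n_comments), illustrative only.
    """
    mean_probs = np.mean(member_probs, axis=0)  # average over ensemble members
    return (mean_probs >= threshold).astype(int), mean_probs

# Three hypothetical members (e.g. BERT/ELECTRA fine-tunings with different
# seeds) scoring four comments for one label such as "toxic":
probs = np.array([
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.1, 0.4, 0.5],
    [0.7, 0.3, 0.7, 0.3],
])
labels, mean_probs = ensemble_predict(probs)
# labels → [1, 0, 1, 0]: a comment is flagged when the mean score >= 0.5
```

Varying which members enter `probs` (different architectures, seeds, or checkpoints) is one simple way to study how ensemble size and composition affect performance.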