From c58972874155f9be01716f57c2ba4af5072cf3f0 Mon Sep 17 00:00:00 2001
From: "Jitsev, Jenia" <j.jitsev@fz-juelich.de>
Date: Mon, 20 Jul 2020 03:22:44 +0200
Subject: [PATCH] JJ: update study on what tasks lead to good generalization

---
 Description.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/Description.md b/Description.md
index 1955b42..2a9d087 100644
--- a/Description.md
+++ b/Description.md
@@ -109,3 +109,15 @@ https://arxiv.org/abs/1901.11210
   - **Out Of Distribution error** (*transferability / compatibility*): This is a heatmap showing how the image differs from our training data. If the heatmap is too bright then the image is very different from our training data and the model will likely not work. We will prevent an image from being processed if it is not similar enough to our training data in order to prevent errors in predictions.
   - **Predictive image regions** (*explainability*): The brighter each pixel is in the heatmap the more influence it can have on the predictions. If the color is bright it means that a change in these pixels will change the prediction.
   - **Disease Predictions** (*uncertainty* representation): A probability indicating how likely the image contains the disease. 50\% means the network is not sure.
+
+##### On the limits of cross-domain generalization in automated X-ray prediction
+
+* *"focuses on quantifying what X-rays diagnostic prediction tasks generalize well across multiple different datasets"*
+* paper: https://arxiv.org/abs/2002.02497
+
+Joseph Paul Cohen, Mohammad Hashir, Rupert Brooks, Hadrien Bertrand
+
+
+This large-scale study quantifies which X-ray diagnostic prediction tasks generalize well across multiple datasets. We present evidence that the generalization problem stems not from a shift in the images but from a shift in the labels. We study cross-domain performance, agreement between models, and model representations. We find discrepancies between performance and agreement: some models that both achieve good performance disagree in their predictions, while other models agree yet perform poorly. We also test for concept similarity by regularizing a network to group tasks across multiple datasets together, and observe variation across the tasks. All code and data are publicly available: https://github.com/mlmed/torchxrayvision
+
+Code: https://github.com/mlmed/torchxrayvision
-- 
GitLab