A new hope!
(I will just ignore all promises and cringe posts from before)
This now becomes my research journal.
This week I mainly worked on my research progress presentation, which I also intend to use on the research seminar class. The topic was the basics on machine learning interpretability, so I used occlusion as the baseline example (actually RISE), and from there I went to explain ROSA and ROSAGDS.
Although I was a bit nervous about the presentation, there were many valuable comments, which I took note.
Rework the augmentation sets

Sensei commented that building GDS from a set of augmentations of the same image does not make sense the way I am doing. GDS is supposed to contain the difference on a set of different subspaces, and since the GDS dimensionality was becoming so small, that is a sign the augmentation subspaces were too similar.

Limasan also pointed out that my technique could be used for outofdistribution interpretability. Given the ROSA family is a class agnostic technique, I could as well input anything and try to analyze what the model sees. Not sure if that fits as an outofdistribution interpreter exactly, but something of the sort maybe.
These comments made me remember of Makisensei’s presentation, when he commented on semisupervised learning and how they had this concept of weak and strong augmentations. My idea now is to build GDS from 4 subspaces:
 weak augmentation: similar to the original image
 strong augmentation: distant from the original image
 weak aug on noise: baseline noise
 strong aug on noise: far from original noise
I still wish to keep using the original image, but I think that building GDS from these 4 sets could lead to a better grasp of the difference and what the representation vector is actually spanning. In other words, I hypothesize that the important parts of the original image generate transformations on the vector that will be far from all these 4 sets, thus giving meaning to using GDS and the “difference” we are looking for becomes more carefully defined.
Augmentations
In that sense, I already have some ideas on what could become the weak/strong augmentations:
strong
 patches / random crop
 diffusion model (+ control net): photo without x
 mask and inpaint
weak
 diffusion model (canny edge control net): photo of x
Compare the methods
 Many people asked how to measure the efficacy of each model and I believe comparing simple occlusion with ROSA is unfair. The point is that for metrics such as deletion and insertion, the fact the interpreter had access to the class, makes it hard for classagnostic methods to compete.
However, after some thinking, maybe the consistency metric is the solution.
The fact I didn’t notice before is that consistency is well defined directly from the definition of explanation:
Definition 1: Explanation
Let \( X \in \mathbb{R}^{C \times H \times W} \) be an image and \(f: \mathbb{R}^{C \times H \times W} \rightarrow \mathbb{R}^{n}\) a classification model. If \(M \in \mathbb{Z}_2^{H \times W}\) is a binary mask and \(S = X \odot M\) is a masked subset of the input, then the explanation \(E(f \vert X)\) is the minimal nontrivial subset which maintains the output
\[E(f \vert X) = \min_{S}{S} : f(X) = f(S) \; \& \; S > 0\]Idea: Consistency metric
Given #definition 1, it follows that the best explanation is the one that is closest to the original output and has the minimal amount of nonmasked entries.
\[C(E(f \vert X)) = \dfrac{1}{f(E)  f(X)S}\]Of course some things have to be adjusted in case the function outputs a class (make a == comparison instead), but thinking in terms of the representation vector or probability vector, I think this might be useful for comparing ROSA and Occlusion respectively.
Optimize
This is an extra set of ideas that come from things I have already been thinking and recent sensei email on Uchiyamasan’s approach.
I believe there is room for optimization in occlusion techniques, starting with:
Masking from instance segmentation
Just create a masking class that uses maskrcnn to generate a limited number of masks from instance segmentation. This model is available in pytorch, and I could compare with generating from bbox as well
Approximation
It happens that Uchiyamasan approximation technique was able to render good results on video transformers, what lead me to think if the same would be seen on image interpretability.
I think also that replacing the original gradient for something like SmoothGrad or IntegratedGradients could also lead to further improvements.
Outro
I should definitely use this for explaining selfsupervised methods, but maybe checking the literature on interpreting these things is also important. Possibly there is something on NLP or even StableDiffusion interpretation.
Also, Limasan commented on performing a feature watching during training for these interpretability maps. This is already in my plans, but I still need to finish implementing it and setting an option to watch only a sample of the dataset.