Making our analysis tools ready for Exascale
Currently, our analysis tools work well for small problems of up to 1000 nodes. The PDF reports give users a good overview of the data and let them quickly spot its structure.
For larger datasets of up to about 10 000 nodes, zooming becomes necessary, and the PS/PDF viewer has to be able to open the file in the first place.
For even larger datasets of 100 000 nodes and up, creating the reports is no longer possible: we run into out-of-memory errors even on machines with "large pools of RAM" (128 GiB).
For 1 000 000 nodes the compressed SION file would exceed 12 TB, and the corresponding report would likely be larger than 3 TB; a node-by-node matrix at that scale has on the order of 10^12 entries, so even a few bytes per entry add up to terabytes. That is at the limit of the RAM most current CPUs can address, and most PS/PDF viewers cannot handle files of this size.
For the future, this leaves us with a few options for handling these file sizes:
- We supply a GUI that displays only a part of the matrix, effectively memory-mapping it in the background. This would be the most work.
- We extend our current scripts to downsample or down-average the matrix to a reasonable size (see the sketch after this list).
- We extend our current scripts to show only an excerpt of the matrix. This should be easy to do.
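As a rough illustration of the last two options, here is a minimal sketch in Python/NumPy. It assumes the matrix has been dumped to a plain binary file (our real data lives in SION files, so this is not the actual I/O path), and the names `downaverage` and `excerpt` are placeholders, not existing scripts.

```python
import numpy as np

def downaverage(matrix, target=4096):
    """Shrink an (n, n) matrix to roughly (target, target) by block averaging."""
    n = matrix.shape[0]
    if n <= target:
        return matrix
    block = -(-n // target)                        # ceil division: edge length of one block
    pad = block * target - n                       # pad so the matrix divides evenly
    padded = np.pad(matrix, ((0, pad), (0, pad)))  # edge blocks include some zero padding
    return padded.reshape(target, block, target, block).mean(axis=(1, 3))

def excerpt(path, n, row0, col0, size, dtype=np.float64):
    """Read only a (size, size) window of an (n, n) matrix stored as a raw
    binary dump, without loading the whole file into memory."""
    mm = np.memmap(path, dtype=dtype, mode="r", shape=(n, n))
    return np.array(mm[row0:row0 + size, col0:col0 + size])
```

Block averaging keeps the overall structure visible at report-friendly sizes, while the memmap-based excerpt is essentially what the GUI option would do behind the scenes, just without the interface.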
Any other ideas?
I suggest we put this on the back burner until we need it.