SIONlib issueshttps://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues2019-12-02T16:03:44+01:00https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/211sion_open_rank does not support key-value files, user-specified endianness(?)2019-12-02T16:03:44+01:00Benedikt Steinbuschb.steinbusch@fz-juelich.desion_open_rank does not support key-value files, user-specified endianness(?)https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/191Move documentation of `sionconfig` / build process to its own page2021-08-25T15:20:30+02:00Benedikt Steinbuschb.steinbusch@fz-juelich.deMove documentation of `sionconfig` / build process to its own pageCurrently the Doxygen documentation contains a short paragraph about `sionconfig` and how to use SIONlib from a user's code on the page "Installation, debugging and error messages". This should be expanded and moved to its own page.Currently the Doxygen documentation contains a short paragraph about `sionconfig` and how to use SIONlib from a user's code on the page "Installation, debugging and error messages". This should be expanded and moved to its own page.2.0.0https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/184Audit open and close functions for memory leaks2019-12-02T16:03:43+01:00Benedikt Steinbuschb.steinbusch@fz-juelich.deAudit open and close functions for memory leaksAddress sanitizer indicates that some open and close functions leak memory on the error paths (e.g. when called with incorrect parameters, as some tests do). A more thorough audit should be performed.Address sanitizer indicates that some open and close functions leak memory on the error paths (e.g. when called with incorrect parameters, as some tests do). A more thorough audit should be performed.https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/182_sion_file_set_position_ansi and _sion_file_get_position_ansi use POSIX funct...2021-08-25T15:20:29+02:00Benedikt Steinbuschb.steinbusch@fz-juelich.de_sion_file_set_position_ansi and _sion_file_get_position_ansi use POSIX functions on some platformsE.g., on Linux, said functions reach around the C standard `FILE*` via `fileno()` and modify/inspect the position of the POSIX file descriptor directly via `lseek()`. This leads to confusing results by making the buffering that C standar...E.g., on Linux, said functions reach around the C standard `FILE*` via `fileno()` and modify/inspect the position of the POSIX file descriptor directly via `lseek()`. This leads to confusing results by making the buffering that C standard I/O performs observable, e.g. in a sequence `read -> set_position -> read` the second `read` can return bytes that were read by the first `read` and still remain in the I/O buffer.
This behavior is not consistent with POSIX (where no buffering is performed) or the C standard library (where `fseek` transparently flushes buffer as necessary) or even with itself (on mac OS (Darwin) `_sion_file_set_position_ansi` is implemented via `fseek`).
To paper over the inconsistency most (but not all) uses of `_sion_file_set_position()` and `_sion_file_get_position()` in SIONlib are prefixed with an explicit `_sion_file_flush()`.
Changing the implementation of the functions in question to use `fseek` and `ftello` like on mac OS does not break anything. In fact, after doing so, it is possible to remove `_sion_file_flush` from all sequences `_sion_file_flush -> _sion_file_{get|set}_position` without breaking any tests.
== Question
Is there a reason for `_sion_file_{get|set}_position_ansi` being implemented the way they are or can they be changed to be more consistent?https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/179Skip buffering for large write operations2019-12-02T16:03:44+01:00Benedikt Steinbuschb.steinbusch@fz-juelich.deSkip buffering for large write operationsIn buffered mode, `sion_fwrite` collects small amounts of data in a buffer and only writes to the file when the buffer is full. However, if the amount of data in a single write operation is larger than the buffer size, all of the data is...In buffered mode, `sion_fwrite` collects small amounts of data in a buffer and only writes to the file when the buffer is full. However, if the amount of data in a single write operation is larger than the buffer size, all of the data is first copied to the buffer (in blocks of the buffer size) and then written to the file from the buffer. This seems inefficient, especially with the continuous write modifications, which allow write operations to be quite large.https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/172`_sion_free_filedesc_all_localranks` is dead code but should probably be call...2021-08-25T15:20:29+02:00Benedikt Steinbuschb.steinbusch@fz-juelich.de`_sion_free_filedesc_all_localranks` is dead code but should probably be called somewhereWhile working on #157 (dead code removal), I did not remove `_sion_free_filedesc_all_localranks`, because the corresponding allocation function is indeed called and the `all_localranks` field is used. Find the right spot to insert calls ...While working on #157 (dead code removal), I did not remove `_sion_free_filedesc_all_localranks`, because the corresponding allocation function is indeed called and the `all_localranks` field is used. Find the right spot to insert calls to `_sion_free_filedesc_all_localranks`.https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/171Public functions without tests2021-08-25T15:20:31+02:00Benedikt Steinbuschb.steinbusch@fz-juelich.dePublic functions without testsThese symbols were not removed while working on #157 (dead code removal), because they appear in public headers and are not prefix by `_`.
```
sion_free_io_info
sion_set_fp_closed
sion_get_dfile
sion_isdebug
sion_debug_off
sion_generic...These symbols were not removed while working on #157 (dead code removal), because they appear in public headers and are not prefix by `_`.
```
sion_free_io_info
sion_set_fp_closed
sion_get_dfile
sion_isdebug
sion_debug_off
sion_generic_register_get_multi_filename_cb
```
They should probably be called from the test suite.https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/170_sion_errorprint(_, _SION_ERROR_RETURN, ...) without returning2019-12-02T16:03:43+01:00Benedikt Steinbuschb.steinbusch@fz-juelich.de_sion_errorprint(_, _SION_ERROR_RETURN, ...) without returningAs a reaction to #169, I dug out my knowledge of regexen and searched the code for additional instances of `_sion_errorprint(_, _SION_ERROR_RETURN, ...)` that do not obviously return. Here is a list:
- [src/parlib/sion_generic_buddy.c@2...As a reaction to #169, I dug out my knowledge of regexen and searched the code for additional instances of `_sion_errorprint(_, _SION_ERROR_RETURN, ...)` that do not obviously return. Here is a list:
- [src/parlib/sion_generic_buddy.c@2173#L795](src/parlib/sion_generic_buddy.c@2173#L795)
- [src/parlib/sion_generic_buddy.c@2173#L897](src/parlib/sion_generic_buddy.c@2173#L897)
- [src/parlib/sion_generic_collective.c@2173#L188](src/parlib/sion_generic_collective.c@2173#L188)
- [src/parlib/sion_ompi_coll_cb_gen.c@2173#L158](src/parlib/sion_ompi_coll_cb_gen.c@2173#L158)
- [src/parlib/sion_ompi_coll_cb_gen.c@2173#L218](src/parlib/sion_ompi_coll_cb_gen.c@2173#L218)
- [src/parlib/sion_ompi_coll_cb_gen.c@2173#L364](src/parlib/sion_ompi_coll_cb_gen.c@2173#L364)
- [src/parlib/sion_ompi_coll_cb_gen.c@2173#L419](src/parlib/sion_ompi_coll_cb_gen.c@2173#L419)
https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/169Collective writes larger than the chunk size deadlock2021-08-25T15:20:30+02:00Benedikt Steinbuschb.steinbusch@fz-juelich.deCollective writes larger than the chunk size deadlockReported by Michael Bareford of EPCC (m.bareford@epcc.ed.ac.uk):
> So each process makes two calls to a `write_vector` method that then calls `sion_coll_fwrite(rhs.data(), sizeof(T), rhs.size(), _sid)`, where `T` is either `unsigned int...Reported by Michael Bareford of EPCC (m.bareford@epcc.ed.ac.uk):
> So each process makes two calls to a `write_vector` method that then calls `sion_coll_fwrite(rhs.data(), sizeof(T), rhs.size(), _sid)`, where `T` is either `unsigned int` or `double`. At present, it seems that writing the string data and the element ids causes no problem, only when the element data (the vectors of `double`s) are also written do some processes fail to return from `sion_coll_fwrite`.
And later:
> Just to let you know, I have fixed my problem. It was caused by specifying an inadequate chunk size in the call to sion_paropen_mpi.
This probably can and should be detected and reported as an error.https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/163Incorrect intent declarations in Fortran bindings2019-12-02T16:03:43+01:00Benedikt Steinbuschb.steinbusch@fz-juelich.deIncorrect intent declarations in Fortran bindingsAnke Kreuzer recently reported a segmentation fault originating in a call to `fsion_paropen_mpi`.
> Hallo Benedikt,
>
> also der Code rund um das FSION_PAROPEN_MPI sieht so aus aus:
>
> fsblksize = -1
> chunksize = INT(16+nt*(96+...Anke Kreuzer recently reported a segmentation fault originating in a call to `fsion_paropen_mpi`.
> Hallo Benedikt,
>
> also der Code rund um das FSION_PAROPEN_MPI sieht so aus aus:
>
> fsblksize = -1
> chunksize = INT(16+nt*(96+120*ndl),KIND=8)
>
> IF(myid==0) THEN
> WRITE(*,*) " Before create_Com"
> ENDIF
> color = crank/24
> key = modulo(crank,24);
> ! WRITE(*,*) "My id: ",myid , "color: ", color, "key: ", key
>
> CALL MPI_COMM_SPLIT(MPI_COMM_WORLD, color, myid,
> & lcomm, ierr)
> IF(myid==0) THEN
> WRITE(*,*) " After create_Com, before SION_PAROPEN_MPI"
> ENDIF
>
> So, bis hier hin laeuft das Programm durch und dann kommt:
>
> CALL FSION_PAROPEN_MPI('NAM_CPs',
> & 'bw',-1,icomm0,lcomm,
> & chunksize, fsblksize, myid, newfname, sid)
>
> IF(myid==0) THEN
> WRITE(*,*) " After paropen"
> ENDIF
>
> Und diese letzte Ausgabe wird nie ausgegeben, da vorher der Segmentation
> fault passiert.
The problem is that SIONlib tries to return the number of files it actually opened (based on the number of distinct local communicators) in the `nfiles` argument. However, the special value `-1` is specified as an integer literal and as the dummy argument `nfiles` is declared with `intent(in)` is probably placed in a read-only part of memory.
A quick inspection reveals that the `intent` declarations on the Fortran subroutine and the argument const-ness on the C function that is called underneath differ in several places.
```
subroutine fsion_paropen_mpi(fname,file_mode,nfiles,fgcomm,flcomm,chunksizes,fsblksize,&
& globalrank,newfname,sid)
implicit none
character(len=*), intent(in) :: fname
character(len=*), intent(in) :: file_mode
integer, intent(in) :: nfiles
integer, intent(in) :: fgcomm
integer, intent(inout) :: flcomm
integer*8, intent(inout) :: chunksizes
integer*4, intent(inout) :: fsblksize
integer, intent(in) :: globalrank
character(len=*), intent(out) :: newfname
integer, intent(out) :: sid
call fsion_paropen_mpi_c(fname,file_mode,nfiles,fgcomm,flcomm,chunksizes,fsblksize,&
& globalrank,newfname,sid)
end subroutine fsion_paropen_mpi
```
```
void fsion_paropen_mpi_c(char *fname,
char *file_mode,
int *numFiles,
MPI_Fint * fgComm,
MPI_Fint * flComm,
sion_int64 *chunksize,
sion_int32 *fsblksize,
int *globalrank,
char *newfname,
int *sid,
int fname_len,
int file_mode_len,
int newfname_len);
```https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/140Add checks for malloc(0)2019-12-02T16:03:42+01:00Kay ThustAdd checks for malloc(0)Check all circumstances where `malloc(0)` could occur and handle them properly since it is implementation dependent.
Usually this should not be an issue, since zero sized allocs should be prevented by other logic, e.g. size of global co...Check all circumstances where `malloc(0)` could occur and handle them properly since it is implementation dependent.
Usually this should not be an issue, since zero sized allocs should be prevented by other logic, e.g. size of global communicator > 0.https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/136Use more recent fortran for f90 interface2019-12-02T16:03:43+01:00Kay ThustUse more recent fortran for f90 interfaceE.g. integer*8 is f77 style, use selected_int_kind insteadE.g. integer*8 is f77 style, use selected_int_kind insteadhttps://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/134Add convenience functions to access filedesc members2021-08-25T15:20:29+02:00Kay ThustAdd convenience functions to access filedesc membersE.g. sion_get_chunksize
Alternatively fix sion_get_locations.E.g. sion_get_chunksize
Alternatively fix sion_get_locations.2.0.0https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/133Clean up debugging information2019-12-02T16:03:57+01:00Kay ThustClean up debugging information- `DFUNCTION` should be used consistently
- "Enter" debug messages should be the first
- "Exit" should be the last, if possible also for cases where the function does not exit successful- `DFUNCTION` should be used consistently
- "Enter" debug messages should be the first
- "Exit" should be the last, if possible also for cases where the function does not exit successfulhttps://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/110Check debug levels and usage of DFUNCTION2019-12-02T16:03:57+01:00Kay ThustCheck debug levels and usage of DFUNCTIONhttps://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/97check for more use of 'const' where possible2019-12-02T16:03:56+01:00Kay Thustcheck for more use of 'const' where possiblePointer arguments should be `const` wherever possible.Pointer arguments should be `const` wherever possible.https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/90All console output should also be shown in debug messages2019-12-02T16:03:56+01:00Kay ThustAll console output should also be shown in debug messagesFor example output of `_sion_errorprint` should be shown in in debug to get a consistent order of all output.For example output of `_sion_errorprint` should be shown in in debug to get a consistent order of all output.v1.6https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/82build python interface2019-12-02T16:03:56+01:00Kay Thustbuild python interfacesionconfig needs additional information: pythonpath, pythonversionsionconfig needs additional information: pythonpath, pythonversionhttps://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/80Error codes for unsuccessful read / write calls2021-08-25T15:20:43+02:00Kay ThustError codes for unsuccessful read / write callsreported by Bert Wesarg
> Das Problem ist wohl eher, das wir gerne zw. 'erfolgreichem lesen, auch wenn es zu wenig war' und 'fehler beim lesen' untscheiden wollen. Mit der ASNI API kann man das gut mit ferror() unterscheiden. Aber bei s...reported by Bert Wesarg
> Das Problem ist wohl eher, das wir gerne zw. 'erfolgreichem lesen, auch wenn es zu wenig war' und 'fehler beim lesen' untscheiden wollen. Mit der ASNI API kann man das gut mit ferror() unterscheiden. Aber bei sion_fread_key() sehe ich da jetzt nicht eine moeglichkeit.
Und später
> Auch bei ANSI kann man anhand des Rükgabewertes nicht entscheiden, ob es ein error oder ein end-of-file ist. Ob nun die Anzahl der Bytes oder der Elemente zurück gegeben werden, spielt da keine Rolle. Ich sehe also gerade nicht, wie diese Information vorher verfügbar hätte sein könnte. POSIX auf der anderen Seite gibt einem -1 wieder, in Falle eines Fehlers.2.0.0https://gitlab.jsc.fz-juelich.de/cstao-public/SIONlib/SIONlib/-/issues/38Optimize internal data strucure2014-10-29T17:43:01+01:00Wolfgang FringsOptimize internal data strucureSome vectors in the SIONlib internal data structure should be defined as sion_int32 instead sion_int64:
-> all_globalranks
-> all_localranks
-> all_blockcount
This is also a change to the file format!
Some vectors in the SIONlib internal data structure should be defined as sion_int32 instead sion_int64:
-> all_globalranks
-> all_localranks
-> all_blockcount
This is also a change to the file format!
v1.6Wolfgang FringsWolfgang Frings