Troubleshooting · Changes

Changed Linktest to LinkTest, authored Jul 21, 2022 by Max Holicki
Troubleshooting.md
View page @ 77f3ae76
Here we list fixes to problems encountered when building or running LinkTest. These fixes may be out of date or may not work on your system, but we hope they solve your problem or point you in the right direction.
[[_TOC_]]
# ParaStation MPI and `shmget(0, sizeof(shm_com_t), IPC_CREAT | 0777) : No space left on device`
When using ParaStation MPI and testing the MPI communication API with LinkTest, you may encounter an error message of the following form:
```
<PSP:r§TASKID§:shmget(0, sizeof(shm_com_t), IPC_CREAT | 0777) : No space left on device>
```
where `§TASKID§` is an 8-digit, zero-filled task id, also known as a rank. If other, terminating error messages follow, LinkTest will terminate; otherwise it will hang.
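Because LinkTest may hang rather than terminate, it can help to scan the job output for this message. The following is a minimal Python sketch, assuming the output has been captured as text; the function name `affected_ranks` and the sample log lines are made up for illustration and are not part of LinkTest or ParaStation MPI.

```python
import re

# Pattern for the ParaStation MPI message quoted above; the 8-digit,
# zero-filled task id (rank) is captured as group 1.
PSP_SHMGET_ERROR = re.compile(
    r"<PSP:r(\d{8}):shmget\(0, sizeof\(shm_com_t\), IPC_CREAT \| 0777\) "
    r": No space left on device>"
)


def affected_ranks(log_text):
    """Return the sorted, de-duplicated ranks that reported the shmget error."""
    return sorted({int(m.group(1)) for m in PSP_SHMGET_ERROR.finditer(log_text)})


# Hypothetical log excerpt, made up for illustration only.
sample_log = "\n".join([
    "<PSP:r00000012:shmget(0, sizeof(shm_com_t), IPC_CREAT | 0777) : No space left on device>",
    "some unrelated output line",
    "<PSP:r00000003:shmget(0, sizeof(shm_com_t), IPC_CREAT | 0777) : No space left on device>",
    "<PSP:r00000012:shmget(0, sizeof(shm_com_t), IPC_CREAT | 0777) : No space left on device>",
])

print(affected_ranks(sample_log))  # [3, 12]
```

An empty result only means the message was not found in the captured text; it does not rule out other causes of a hang.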
This error message does not by itself stop LinkTest. It occurs when so many connections are to be tested that no more shared memory can be allocated on the communication device. This is a hardware limitation, not a software limitation. It commonly occurs when oversubscribing or overcommitting nodes with tasks, for example when using twice as many tasks on a node as it has logical cores.
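As a rough check for the oversubscription case described above, the number of tasks per node can be compared with the number of logical cores. This is a minimal Python sketch; the helper name `oversubscription_factor` and the example numbers are hypothetical, and no particular threshold is guaranteed to trigger or avoid the error.

```python
import os


def oversubscription_factor(tasks_per_node, logical_cores=None):
    """Return tasks per logical core; values above 1.0 mean oversubscription."""
    if logical_cores is None:
        # os.cpu_count() reports the logical cores of the machine running
        # this script, which may differ from the compute nodes.
        logical_cores = os.cpu_count() or 1
    if tasks_per_node < 0 or logical_cores <= 0:
        raise ValueError("tasks_per_node must be >= 0 and logical_cores > 0")
    return tasks_per_node / logical_cores


# Hypothetical example: 96 tasks on a node with 48 logical cores.
print(oversubscription_factor(96, logical_cores=48))  # 2.0
```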
At the time of writing four potential solutions exist:
1. Test using fewer tasks. Depending on your requirements, however, this may not be possible.
2. Use a different MPI implementation, such as Open MPI version 4.1.1.
3. Set the `PSP_UCP` environment variable to `2`. This option is undocumented, its exact effect is unknown, and it may change in future ParaStation MPI versions. It causes ParaStation MPI to use UCP in the background, just as setting `PSP_UCP` to `1` does. With `PSP_UCP` set to `1`, however, MPI behaves differently than with `2` and the same error still occurs. When using this option you may encounter errors similar or identical to the following: `ib_mlx5_dv.c:160 UCX ERROR mlx5dv_devx_obj_create(QP) failed, syndrome 0: Cannot allocate memory`. In this case you can either try testing with a different communication API or see option 4.
4. Upgrade your communication hardware such that it has more memory and can support more connections. Depending on how many connections you want to test this may be enough.
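The `PSP_UCP` option above can be tried for a single run without changing the surrounding shell environment. The following is a minimal Python sketch, assuming the job is started from a Python wrapper; the helper `run_with_psp_ucp` is hypothetical, and the command used here is only a stand-in that prints the variable, since the actual LinkTest launch command depends on your system. Whether the variable reaches every MPI task also depends on your launcher.

```python
import os
import subprocess
import sys


def run_with_psp_ucp(command, value="2"):
    """Run command with PSP_UCP set in a copy of the current environment."""
    env = os.environ.copy()
    env["PSP_UCP"] = value  # only the child process sees this setting
    return subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )


# Stand-in command (an assumption for illustration): replace it with the
# launch command that is appropriate for LinkTest on your system.
check_command = [sys.executable, "-c", "import os; print(os.environ['PSP_UCP'])"]

result = run_with_psp_ucp(check_command)
print(result.stdout.strip())  # 2
```

Copying the environment rather than modifying `os.environ` keeps the setting limited to the launched process, which makes it easy to compare runs with and without the option.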