Known Issues

  • Support for PBS job arrays is currently disabled.
     
  • Apids are not always cleaned up properly, resulting in error messages such as "Cpuset file /dev/cpuset/xxxxxxx/cpus wrote -1 of 5; found no other local apids", where xxxxxxx is a number, e.g. 1290794. This problem is being worked on.
  • Error message "undefined reference to `__pgas_register_dv'" when using the Cray compilers. This is an issue with cce 8.1.4 (the current default) and earlier. Installed cce versions 8.1.5 and later do not have this problem. If you still need to use an affected compiler, the workaround is:
    • When using the C++ compiler, use the flag "-hpgas_runtime".
    • For other compilers, add the link flag "-lpgas-dmapp".
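
The workaround above can be captured in a small shell helper for build scripts. This is only a sketch: the function name pgas_workaround_flag and the file names are hypothetical, and it assumes that the Cray wrapper CC is the C++ compiler and that cc and ftn are the "other compilers" the workaround refers to.

```shell
# Hypothetical helper: print the extra flag needed to work around the
# undefined reference to __pgas_register_dv with cce 8.1.4 and earlier.
# Assumption: CC is the C++ wrapper; any other wrapper (cc, ftn) takes
# the link flag instead.
pgas_workaround_flag() {
  case "$1" in
    CC) echo "-hpgas_runtime" ;;
    *)  echo "-lpgas-dmapp" ;;
  esac
}

# Example use when linking (my_app.o is a placeholder object file):
#   CC  -o my_app my_app.o $(pgas_workaround_flag CC)
#   ftn -o my_app my_app.o $(pgas_workaround_flag ftn)
```
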
  • Following further examination, the color listing feature of "ls" will be disabled by default. The issue is performance: the extra work required by the color feature is not handled well by Lustre when the number of files is large or when the files are striped widely across the Lustre OSTs.
  • Running emacs with an X display fails because of a missing libgtk-x11-2.0.so.0.
  • pat_build: Can not locate shared object file 'libmpich.so.1'
  • The Blue Waters portal (this site) home page shows 0 hours used against allocations. Charging of accounts for job time has started, but the usage is not yet displayed on the portal.
  • If you see "relocation truncated to fit: R_X86_64_PC32 against `.bss'" or similar error messages, you need to enable special compiler/linker options:
    • Cray: compile: -hpic, link: -dynamic -hpic
    • GNU: compile: -mcmodel=medium (and possibly -fpic), link: -mcmodel=medium (and possibly -dynamic)
    • PGI: compile: -mcmodel=medium -Mlarge_arrays, link: -dynamic -mcmodel=medium -Mlarge_arrays
      • Before linking, remove ATP: module delete atp
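
For build scripts that must work under more than one programming environment, the options listed above can be looked up by environment name. The sketch below is illustrative only: the function name large_model_flags and the file names are hypothetical, the optional GNU flags (-fpic, -dynamic) are left out, and reading the environment name from a variable such as PE_ENV is an assumption about how the programming environment modules are set up.

```shell
# Hypothetical helper: print the compile or link flags listed above for
# a given programming environment (Cray, GNU, or PGI; case-insensitive).
# Usage: large_model_flags <env> <compile|link>
large_model_flags() {
  env_name=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  case "$env_name:$2" in
    CRAY:compile) echo "-hpic" ;;
    CRAY:link)    echo "-dynamic -hpic" ;;
    GNU:compile)  echo "-mcmodel=medium" ;;
    GNU:link)     echo "-mcmodel=medium" ;;
    PGI:compile)  echo "-mcmodel=medium -Mlarge_arrays" ;;
    PGI:link)     echo "-dynamic -mcmodel=medium -Mlarge_arrays" ;;
    *) echo "unknown environment or stage: $1 $2" >&2; return 1 ;;
  esac
}

# Example use (big.c is a placeholder; PE_ENV is assumed to name the
# loaded programming environment):
#   cc $(large_model_flags "$PE_ENV" compile) -c big.c
#   cc $(large_model_flags "$PE_ENV" link) -o big big.o
```
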
  • Messages "Unable to open kdreg version file: No such file or directory" and "Warning: Unable to open kgni version file /sys/class/gemini/kgni0/version errno 2 at line 600 in file cdm.c"
    • These messages appear when using a dynamically linked executable built with the Cray compiler wrappers, on a login node or a node that is not directly connected to the Cray Gemini network. 
    • In most cases, these messages are benign and do not affect the functionality of the executable.
    • These messages do not appear when running the same executable on a Gemini-connected node, such as a MOM node in an interactive session.
  • User screen sessions are not available or disappear
    • The default auto-logout on idle is set to 4 hours. If you have not interacted with any session, including screen sessions, for over 4 hours, the system terminates those sessions automatically. 
  • TCMALLOC: Codes may crash in tcmalloc. Cray is working on fixing these crashes. In the meantime, use the flag "-hsystem_alloc" to avoid linking in tcmalloc. Using this flag may introduce a new set of problems, so use it with caution. Continue to open bugs on Jira if you see problems with tcmalloc. autoconf scripts may also fail in some cases due to tcmalloc; in that case, pass "-hsystem_alloc" to the configure script.
  • The PGI builds of netcdf and netcdf-hdf5parallel are missing Fortran support in version 4.2.1.1. To work around this:
    • Swap cray-netcdf-hdf5parallel/4.2.1.1 for netcdf-hdf5parallel/4.2.0 (default)
    • Swap cray-netcdf/4.2.1.1 for netcdf/4.2.0
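
With Environment Modules, the swaps described above would look roughly like the following. This assumes the "module swap" subcommand is available wherever "module load" and "module unload" work; run only the line that matches the module you currently have loaded.

```shell
# Replace the 4.2.1.1 builds with the 4.2.0 modules named in the workaround.
module swap cray-netcdf-hdf5parallel/4.2.1.1 netcdf-hdf5parallel/4.2.0
module swap cray-netcdf/4.2.1.1 netcdf/4.2.0
```
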
  • Warning messages when using the HDF5 library. These messages are benign and can be safely ignored. For example:

/opt/cray/hdf5/1.8.11/cray/81/lib/libhdf5_cray.a(H5PL.o): In function `H5PL__open$$CFE_id_56395c9c_01603595': 
/home/users/seanb/pelibs/hdf5/1.8.11/rpm/BUILD/cray-hdf5-1.8.11-cce1-serial/src/H5PL.c:531: warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking

  • Error messages with the cray-mpich 6.0.1 library
    • /usr/bin/ld: Warning: alignment 16 of symbol `mpifcmb8_' in /opt/cray/mpt/6.0.1/gni/mpich2-gnu/48/lib/libmpich_gnu_48.a(setbot.o) is smaller than 32 in ClimateDP_MPI.o
    • The solution is to move to cray-mpich 6.0.2:

$ module unload cray-mpich/6.0.1
$ module load cray-mpich/6.0.2

Fixed known issues

  • Missing man pages for aprun, xtnodestat etc. - fixed
  • In certain scenarios, files may be created with wrong dates: the date is set to 1901 instead of today's date. This may also prevent the user from successfully using Eclipse remote tools. - fixed, monitoring
  • GCC: Using -finstrument-functions with SSE intrinsics is not currently supported. Related bug. - fixed in the latest version of CCE
  • Using MPI-IO calls, files cannot be opened/closed more than 1012 times. Load the updated module: xt-mpich2/5.4.5. See this ticket for more information. - fixed, the updated xt-mpich2 is now the default module
  • The ftn wrapper under the PGI programming environment does not produce a working binary when compiling CUDA Fortran code. - fixed in the compiler wrapper (ftn)