System News

Filtered by: Unscheduled Outages

  • Blue Waters Notice: System returned to service

    Created by: jenos 2019-02-06 15:31:27 (Channels: unscheduledoutages)

    Blue Waters Users:

    The reboot is complete and the system has returned to service as of 3:14pm CT. Tomorrow's near-line storage maintenance will proceed as planned.

    We apologize for any inconvenience.

  • Blue Waters Unplanned Reboot

    Created by: bbode 2019-02-06 10:00:16 (Channels: unscheduledoutages)

    Blue Waters Users,

    We experienced an issue that has left the high-speed network in an unrecoverable state. We must reboot the system to recover, and all running jobs will be lost. The login nodes and endpoints (ncsa#Nearline ncsa#BlueWaters) will remain available during the reboot. The current estimate for return to...

  • Blue Waters Notice: System returned to service

    Created by: tbouvet 2019-01-12 13:09:15 (Channels: scheduledoutages|unscheduledoutages)

    Blue Waters Users:

    The reboot is complete and the system has returned to service.

    We apologize for any inconvenience.

  • Blue Waters Notice: Unplanned Reboot

    Created by: tbouvet 2019-01-12 09:58:18 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users,

    We experienced an issue that has left the high-speed network in an unrecoverable state. We must reboot the system to recover, and all running jobs will be lost. The login nodes and endpoints (ncsa#Nearline ncsa#BlueWaters) will remain available during the reboot. The current estimate for return to...

  • Blue Waters Notice: Nearline System (HPSS) return to service

    Created by: tbouvet 2019-01-10 15:24:20 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users:

    The ncsa#Nearline endpoint returned to service at 3 PM CT and is now available. We apologize for any inconvenience.

    -Blue Waters

  • Blue Waters Notice: Nearline System (HPSS) remains unavailable.

    Created by: tbouvet 2019-01-10 09:38:30 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users:

    The ncsa#Nearline endpoint is paused pending repairs to systems damaged during the data center power failure on 1/09/19. We apologize for any inconvenience.

    Please check back for an update on the situation.

    -Blue Waters

  • Blue Waters Notice: System has returned to service

    Created by: jenos 2019-01-10 09:37:52 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users:

    The power outage at the NPCF building has been resolved. All Blue Waters services are now available with the exception of the Nearline storage system, which will take a bit longer to recover. The job scheduler has been resumed and access to the login nodes has been re-enabled. There will be...

  • Blue Waters Notice: Power disruption

    Created by: jenos 2019-01-09 16:21:59 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users:


    Update 4:20pm:

    All systems up except Nearline. Access remains restricted while performance tests complete.

    A power outage at the building housing the Blue Waters system has caused a service interruption; all running jobs were consequently terminated. All Blue Waters subsystems have been affected and are currently out of...

  • Blue Waters Notice: System returned to service

    Created by: jenos 2018-12-27 23:43:11 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users:

    The power outage at the Blue Waters building has been resolved and the compute nodes have been restarted. The Blue Waters system returned to normal operations at 11:40 PM CT. We apologize for any inconvenience.

  • Blue Waters Return to Service

    Created by: tbouvet 2018-11-14 18:35:11 (Channels: unscheduledoutages)

    The metadata server failover on the Home file system completed at 6:23 PM CT. I/O transactions initiated during the failover should have resumed normal operation when the failover completed. Blue Waters has resumed normal operations.

  • Blue Waters File System Issue

    Created by: tbouvet 2018-11-14 17:45:36 (Channels: unscheduledoutages)

    Blue Waters is currently experiencing a metadata server failover on the home file system that began at 5:30 PM CT. As a result, all I/O transactions for the home file system will block until the failover completes on all clients. The rest of Blue Waters including the other filesystems are operating...

  • Blue Waters Return to Service

    Created by: tbouvet 2018-11-14 11:32:42 (Channels: unscheduledoutages)

    The metadata server failover on the Home file system completed at 11:26 AM CT. The file system issue started at 10:45 AM. I/O transactions initiated during the failover should have resumed normal operation when the failover completed. Blue Waters has resumed normal operations.

  • Blue Waters File System Issue

    Created by: tbouvet 2018-11-14 11:19:47 (Channels: unscheduledoutages)

    Blue Waters is currently experiencing a metadata server failover on the home file system that began at 11:15 AM CT. As a result, all I/O transactions for the home file system will block until the failover completes on all clients. The rest of Blue Waters including the other filesystems are operating...

  • Blue Waters returned to service

    Created by: mshow 2018-11-05 06:18:32 (Channels: unscheduledoutages)

    The system has been returned to full service operation after a brief unscheduled outage.

  • Blue Waters system issue

    Created by: mshow 2018-11-05 01:21:18 (Channels: unscheduledoutages|general)

    The system has experienced a fault that will require a full system shutdown and reboot. It will take several hours before the system is returned to full operation.

  • Blue Waters Partial Scratch Outage

    Created by: tbouvet 2018-06-23 21:31:02 (Channels: unscheduledoutages|systemnotices)

    Blue Waters experienced a network switch failure that resulted in a partial outage of the scratch filesystem (ost168-179) from 7:44 PM CT to 7:59 PM CT. Jobs that ended during this time may have been impacted. I/O transactions targeting the affected storage server should block until the ost targets returned...

  • Blue Waters returned to full service

    Created by: mshow 2018-06-12 10:23:37 (Channels: mandatory|unscheduledoutages|systemnotices)

    Blue Waters has returned to full service after recovery from a power event.

  • Blue Waters system power interruption

    Created by: mshow 2018-06-12 04:56:14 (Channels: mandatory|unscheduledoutages|systemnotices)

    Thunderstorms have resulted in a power interruption of the Blue Waters system. This outage impacts both the compute nodes and all filesystems. Therefore, a full reboot will be necessary. Return to service is estimated at approximately 10 am Central time.

    BW Admin

  • Blue Waters has Returned to Service

    Created by: tbouvet 2018-06-07 14:27:12 (Channels: unscheduledoutages|systemnotices)

    Blue Waters returned to full service at 2:14 PM CT. The issue encountered required a full system reboot to resolve. All running jobs were lost, so if your job exited prematurely, please resubmit it from its latest checkpoint file.

  • Blue Waters System Issue

    Created by: tbouvet 2018-06-07 10:12:41 (Channels: unscheduledoutages)

    Blue Waters is experiencing a full system issue that began at 6:30 AM CT. System support staff are evaluating the issue and attempting to restore normal service, which may require a full system reboot. Job scheduling is paused until the issue is resolved. Interim updates will be posted on the Blue Waters...