System News

Filtered by: Unscheduled Outages

  • Blue Waters: NPCF Power Issue Update

    Created by: tbouvet 2019-07-06 03:45:32 (Channels: unscheduledoutages|systemnotices)

    Blue Waters: NPCF Power Issue 7/5/2019 3PM

    All Blue Waters resources are available except for the compute nodes. The compute nodes are being rebooted, and all running jobs were lost. No return-to-service (RTS) ETA is available yet.

  • Blue Waters Returned to Service

    Created by: bbode 2019-07-05 17:57:32 (Channels: unscheduledoutages|systemnotices)

    Blue Waters has been rebooted and returned to service at 5:55PM following a power interruption earlier this afternoon. All running jobs were lost due to the outage.

    Please email help+bw@ncsa.illinois.edu to report any issues.

  • Blue Waters: NPCF Power Issue, Scheduler paused expect full reboot

    Created by: tbouvet 2019-07-05 13:32:15 (Channels: unscheduledoutages|systemnotices)

    A power outage at the building housing the Blue Waters system has caused a service interruption; the Login Nodes, Network, Storage, Compute Nodes, and Near-line Storage may be unavailable. It is unknown at this time when a return to service can be expected. Watch the Blue Waters portal blog for updates.

  • Nearline Tape Library Has Returned To Service

    Created by: briandi 2019-06-28 13:27:38 (Channels: unscheduledoutages|systemnotices)

    The NCSA_Nearline storage subsystem issue on Blue Waters was resolved and the system returned to normal operations at 1:00 pm.

  • Nearline Tape Library Emergency System Maintenance

    Created by: briandi 2019-06-28 09:45:21 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users, Blue Waters is experiencing an issue on the HPSS storage subsystem (ncsa#Nearline) that began at 9:00 AM. System support staff are evaluating and attempting to restore normal service. The rest of Blue Waters subsystems remain in normal operation. Transfers into or out of the system will be unavailable until...
  • Blue Waters: Nearline Tape Library Has Returned To Service

    Created by: tbouvet 2019-05-31 12:58:05 (Channels: unscheduledoutages|systemnotices)

    The Nearline subsystem reboot completed at 11:30AM today.

    All existing transfers should resume as Nearline was returned to service.

  • Nearline Tape Library System Reboot

    Created by: bbode 2019-05-31 09:28:04 (Channels: unscheduledoutages|systemnotices)

    The Nearline subsystem is currently undergoing an emergency reboot to clear multiple issues. It is expected to return to service by 1PM today.

    All existing transfers will resume once Nearline returns to service.

  • Blue Waters Has Returned to Service

    Created by: tbouvet 2019-05-16 17:15:55 (Channels: unscheduledoutages|systemnotices)

    The storage issue on the Blue Waters projects file system has been resolved and the system returned to normal operations at 5:07 PM CT. Any teams who were impacted by the file system issue have been contacted individually. The scheduler has resumed normal operations. Thank you for your patience while this was...
  • Blue Waters Project File System Update

    Created by: tbouvet 2019-05-16 09:29:40 (Channels: unscheduledoutages|systemnotices)

    We are in the process of running a file system check and repair of a small portion of the projects file system. When that is complete we will assess the results and take appropriate action. Update: File system repair continues and is expected to last until late this afternoon (5/16). If...
  • Blue Waters Project File System Issue

    Created by: tbouvet 2019-05-15 15:20:25 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users, Blue Waters is currently experiencing a storage server issue for a portion of the projects filesystem. As a result, all I/O transactions targeting the affected storage server will block. A single storage server supplies a small fraction of the file system data and all remaining storage servers continue normal operation....
  • Blue Waters OSS Return to Service

    Created by: squaire3 2019-05-12 12:33:07 (Channels: unscheduledoutages|systemnotices)

    The storage issue on Blue Waters was resolved and the system returned to normal operations at 12:30 PM CT. Scheduler has also been resumed.

  • Blue Waters OSS Failover

    Created by: squaire3 2019-05-12 09:18:53 (Channels: unscheduledoutages|systemnotices)

    Blue Waters is currently experiencing a storage server failover on the OSS file system that began at 8:08 AM CT. As a result, all I/O transactions targeting the affected storage server will block until the failover completes on all clients. A single storage server supplies a small fraction of the...
  • Nearline Tape Library System Maintenance

    Created by: glasgow 2019-04-29 16:55:33 (Channels: unscheduledoutages|systemnotices)

    Nearline maintenance work is complete and the service has returned to full operations as of: 1600hrs, April 29th, 2019 --- Nearline is undergoing emergency service on one tape library. The work is related to fallout from last week's hardware service and is expected to take approximately 10 hours to complete. Some files may...
  • Nearline Tape Library System Maintenance

    Created by: glasgow 2019-04-29 09:42:49 (Channels: unscheduledoutages|systemnotices)

    Nearline is undergoing emergency service on one tape library. The work is related to fallout from last week's hardware service and is expected to take approximately 10 hours to complete. Some files may not be accessible during that time.

    Start time: 0945hrs to ~ 2000hrs, April 29, 2019

  • Blue Waters Return to Service

    Created by: squaire3 2019-04-12 15:41:38 (Channels: unscheduledoutages|systemnotices)

    The high-speed network issue on Blue Waters was resolved and the system returned to normal operations at 3:34 PM CT. Scheduler has also been resumed.

  • Blue Waters Scheduler Paused - HSN Issue

    Created by: squaire3 2019-04-12 15:02:55 (Channels: unscheduledoutages|systemnotices)

    Blue Waters is experiencing an issue on the high-speed network that began at 2:10 PM CT. System support staff are evaluating and attempting to restore normal service. Job scheduling is paused until the issue is resolved. The file systems and data transfer services are operating normally. Logins have been occasionally hanging...
  • Nearline Endpoint Paused for Storage Maintenance

    Created by: glasgow 2019-04-08 23:14:20 (Channels: unscheduledoutages|systemnotices)

    The Nearline endpoint has now been returned to normal operations. The Blue Waters Nearline endpoint will be paused beginning at 1700hrs CDT. New and current user actions/requests will be paused and will resume normal activity when the endpoint is released. No user action is necessary. This maintenance...
  • Nearline Endpoint Paused for Storage Maintenance

    Created by: glasgow 2019-04-08 16:27:20 (Channels: unscheduledoutages|systemnotices)

    The Blue Waters Nearline endpoint will be paused beginning at 1700hrs CDT. New and current user actions/requests will be paused and will resume normal activity when the endpoint is released. No user action is necessary. This maintenance window will be used to conduct resource management operations that have been deferred...
  • Blue Waters Returned to Service

    Created by: tbouvet 2019-04-07 15:14:36 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users,

    The storage server issue on the scratch file system is resolved. I/O transactions initiated during the outage should have resumed when the Lustre target returned. Blue Waters has resumed normal operations.

  • Blue Waters Scheduler is paused 9:30 AM

    Created by: tbouvet 2019-04-07 10:30:00 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users, Blue Waters is currently experiencing a storage server issue for a small portion of the scratch filesystem. As a result, all I/O transactions targeting the affected storage server will block. A single storage server supplies a small fraction of the file system data and all remaining storage servers continue normal...
  • BlueWaters cabinet failure

    Created by: mshow 2019-03-23 16:44:36 (Channels: unscheduledoutages|systemnotices)

    The cabinet has been restored.

  • BlueWaters repair complete

    Created by: mshow 2019-03-23 05:13:22 (Channels: unscheduledoutages|systemnotices)

    The cabinet has been restored.

  • BlueWaters cabinet failure

    Created by: mshow 2019-03-23 03:46:37 (Channels: unscheduledoutages|systemnotices)

    A cabinet has shut down, resulting in job loss and an incomplete network configuration. It is unknown at this time when a return to service can be expected for that cabinet. Watch the Blue Waters portal blog for updates.

  • Blue Waters returned to service

    Created by: mshow 2019-02-24 03:41:58 (Channels: mandatory|unscheduledoutages|systemnotices)

    The high-speed network issue on Blue Waters was resolved and the system returned to normal operations at 3:30 AM CT.

  • Blue Waters High Speed Network issue

    Created by: mshow 2019-02-23 23:43:19 (Channels: unscheduledoutages|systemnotices)

    Blue Waters is experiencing an issue on the high-speed network that began at 9:48 PM CT. System support staff are evaluating and attempting to restore normal service. Job scheduling is paused until the issue is resolved. The file systems and data transfer services are operating normally. Interim updates will be posted on...
  • Blue Waters Notice: System returned to service

    Created by: jenos 2019-02-06 15:31:27 (Channels: unscheduledoutages)

    Blue Waters Users:

    The reboot is complete and the system has returned to service as of 3:14pm CT. Tomorrow's near-line storage maintenance will proceed as planned.

    We apologize for any inconvenience.

  • Blue Waters Unplanned Reboot

    Created by: bbode 2019-02-06 10:00:16 (Channels: unscheduledoutages)

    Blue Waters Users, We experienced an issue that has left the high-speed network in an unrecoverable state. We have to reboot the system to recover and all running jobs will be lost. The login nodes and endpoints (ncsa#Nearline ncsa#BlueWaters) will remain available during the reboot. The current estimate for return to...
  • Blue Waters Notice: System returned to service

    Created by: tbouvet 2019-01-12 13:09:15 (Channels: scheduledoutages|unscheduledoutages)

    Blue Waters Users:

    The reboot is complete and the system has returned to service.

    We apologize for any inconvenience.

  • Blue Waters Notice: Unplanned Reboot

    Created by: tbouvet 2019-01-12 09:58:18 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users, We experienced an issue that has left the high-speed network in an unrecoverable state. We have to reboot the system to recover and all running jobs will be lost. The login nodes and endpoints (ncsa#Nearline ncsa#BlueWaters) will remain available during the reboot. The current estimate for return to...
  • Blue Waters Notice: Nearline System (HPSS) return to service

    Created by: tbouvet 2019-01-10 15:24:20 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users:

    The ncsa#Nearline endpoint is available and has returned to service at 3PM CT. We apologize for any inconvenience.

    -Blue Waters

  • Blue Waters Notice: Nearline System (HPSS) remains unavailable.

    Created by: tbouvet 2019-01-10 09:38:30 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users:

    The ncsa#Nearline endpoint is paused pending repairs to systems damaged during the data center power failure on 1/09/19. We apologize for any inconvenience.

    Please check back for an update on the situation.

    -Blue Waters

  • Blue Waters Notice: System has returned to service

    Created by: jenos 2019-01-10 09:37:52 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users: The power outage at the NPCF building has been resolved. All Blue Waters services are now available with the exception of the Nearline storage system, which will take a bit longer to recover. The job scheduler has been resumed and login nodes have access re-enabled. There will be...
  • Blue Waters Notice: Power disruption

    Created by: jenos 2019-01-09 16:21:59 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users: Update 4:20pm: All systems up except Nearline.  Access remains restricted while performance tests complete. A power outage at the building housing the Blue Waters system has caused a service interruption; all running jobs were consequently terminated.  All Blue Waters subsystems have been affected and are currently out of...
  • Blue Waters Notice: System returned to service

    Created by: jenos 2018-12-27 23:43:11 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users:

    The power outage at the Blue Waters building has been resolved and the compute nodes have been restarted. The Blue Waters system has been returned to normal operations at 11:40 PM CT. We apologize for any inconvenience.

  • Blue Waters Return to Service

    Created by: tbouvet 2018-11-14 18:35:11 (Channels: unscheduledoutages)

    The metadata server failover on the Home file system completed at 6:23 PM CT. I/O transactions initiated during the failover should have resumed normal operation when failover completed. Blue Waters has resumed normal operations.

  • Blue Waters File System Issue

    Created by: tbouvet 2018-11-14 17:45:36 (Channels: unscheduledoutages)

    Blue Waters is currently experiencing a metadata server failover on the home file system that began at 5:30 PM CT. As a result, all I/O transactions for the home filesystem will block until the failover completes on all clients. The rest of Blue Waters including the other filesystems are operating...
  • Blue Waters Return to Service

    Created by: tbouvet 2018-11-14 11:32:42 (Channels: unscheduledoutages)

    The metadata server failover on the Home file system completed at 11:26 AM CT. The file system issue started at 10:45 AM. I/O transactions initiated during the failover should have resumed normal operation when failover completed. Blue Waters has resumed normal operations.

  • Blue Waters File System Issue

    Created by: tbouvet 2018-11-14 11:19:47 (Channels: unscheduledoutages)

    Blue Waters is currently experiencing a metadata server failover on the home file system that began at 11:15 AM CT. As a result, all I/O transactions for the home filesystem will block until the failover completes on all clients. The rest of Blue Waters including the other filesystems are operating...
  • BlueWaters returned to service

    Created by: mshow 2018-11-05 06:18:32 (Channels: unscheduledoutages)

    The system has been returned to full service operation after a brief unscheduled outage.

  • BlueWaters system issue

    Created by: mshow 2018-11-05 01:21:18 (Channels: unscheduledoutages|general)

    The system has experienced a fault that will require a full system shutdown/reboot. It will take several hours before the system is returned to full operations.

  • Blue Waters Partial Scratch Outage

    Created by: tbouvet 2018-06-23 21:31:02 (Channels: unscheduledoutages|systemnotices)

    Blue Waters experienced a network switch failure that resulted in a partial outage of the scratch filesystem (ost168-179) from 7:44 PM CT to 7:59 PM CT. Jobs that ended during this time may have been impacted. I/O transactions targeting the affected storage server should block until the ost targets returned...
  • Blue Waters returned to full service

    Created by: mshow 2018-06-12 10:23:37 (Channels: mandatory|unscheduledoutages|systemnotices)

    Blue Waters has returned to full service after recovery from a power event.

  • Blue Waters system power interruption

    Created by: mshow 2018-06-12 04:56:14 (Channels: mandatory|unscheduledoutages|systemnotices)

    Thunderstorms have resulted in a power interruption of the Blue Waters system. This outage impacts both the compute nodes and all filesystems. Therefore, a full reboot will be necessary. Return to service is estimated to be approximately 10 AM Central time.

    BW Admin

  • Blue Waters has Returned to Service

    Created by: tbouvet 2018-06-07 14:27:12 (Channels: unscheduledoutages|systemnotices)

    Blue Waters has returned to full service at 2:14 PM CT. The issue encountered required a full system reboot to resolve. All running jobs were lost, so please resubmit your jobs from the latest checkpoint file if they exited prematurely.

  • Blue Waters System Issue

    Created by: tbouvet 2018-06-07 10:12:41 (Channels: unscheduledoutages)

    Blue Waters is experiencing a full system issue that began at 6:30 AM CT. System support staff are evaluating and attempting to restore normal service but may require a full system reboot. Job scheduling is paused until the issue is resolved. Interim updates will be posted on the Blue Waters...