System News

Filtered by: Mandatory | Unscheduled Outages

  • Blue Waters returned to service

    Created by: jenos 2020-05-22 14:07:30 (Channels: unscheduledoutages)

    Blue Waters has been returned to service as of 2:00pm CT after a full system reboot to correct the problem on the high-speed network. All previously running jobs were lost.

    The Blue Waters Team

  • Blue Waters System Reboot in Progress

    Created by: jenos 2020-05-22 09:05:09 (Channels: unscheduledoutages)

    The Blue Waters compute system is being rebooted this morning due to an unexpected complication with hardware and cooling maintenance.  Login node, scheduler, filesystem, and data mover components are not expected to be interrupted by this operation.  Thank you for your patience while the issue is resolved.  Estimated return to...
  • Blue Waters returned to service

    Created by: kingda 2020-03-29 18:41:43 (Channels: unscheduledoutages|systemnotices)

    Blue Waters has been returned to service after a full system reboot to correct a problem on the high-speed network. All previously running jobs were lost.

    The Blue Waters Team

  • Blue Waters Unscheduled Outage

    Created by: jenos 2020-03-29 13:01:11 (Channels: unscheduledoutages)

    Blue Waters is experiencing an issue with the high-speed network that began at 12:02 PM CT. System support staff are evaluating and attempting to restore normal service. Job scheduling is paused until the issue is resolved. The file systems and data transfer services are operating normally. Interim updates will be posted on...
  • NCSA Operations Update

    Created by: bbode 2020-03-18 08:32:01 (Channels: mandatory)

    Valued partners and collaborators, NCSA is doing our part to protect the health and wellbeing of our employees and the community. Therefore, we have taken the necessary steps to help minimize the risk of infection, including transitioning our staff to working remotely. These measures are to protect our team and to...
  • Blue Waters returned to service

    Created by: bbode 2020-03-16 16:07:00 (Channels: unscheduledoutages|systemnotices)

    Blue Waters has been returned to service after a full system reboot to correct a problem on the high-speed network. All previously running jobs were lost.

    The Blue Waters Team

  • Blue Waters Announcement: Nearline Retirement Approaching!

    Created by: bbode 2020-03-02 09:28:06 (Channels: mandatory|scheduledoutages)

    Hello Blue Waters Partners, The Blue Waters nearline (tape) subsystem will be permanently shut down on March 31, 2020. This is a hard deadline with no extension possible as NCSA is contractually obligated to remove the controlling software and metadata following the shutdown since we will no longer have the right...
  • Blue Waters Nearline Phase Out

    Created by: bbode 2020-01-28 10:09:30 (Channels: mandatory)

    Hello Blue Waters Partners, This is a reminder that the Blue Waters nearline (tape) system will be permanently shut down and dismantled on 4/1/2020 (no extension is possible). A rush on data retrieval is expected as the end of nearline service approaches which will be limited by the retrieval throughput...
  • Blue Waters returned to service

    Created by: mshow 2019-12-22 23:32:00 (Channels: mandatory|unscheduledoutages)

    After a full system reboot and checkout, the system has been returned to full service operations.

  • Blue Waters system reboot

    Created by: mshow 2019-12-22 13:55:58 (Channels: unscheduledoutages|systemnotices)

    While the filesystem issue has been resolved, a full system reboot will be required before returning to production status. It is our expectation that the system will return later this evening.

    BW Admin Team

  • Blue Waters: UPDATE Scheduler Remains Paused

    Created by: tbouvet 2019-12-22 08:21:11 (Channels: unscheduledoutages|systemnotices)

    UPDATE: Scheduler Remains Paused as we continue to restore the scratch file system to service. The Blue Waters scheduler is currently paused due to a metadata server issue with the scratch file system. We are actively working the issue and new logins will likely hang without completion. Status updates will be...
  • Blue Waters: Scheduler paused because of a scratch file system issue.

    Created by: tbouvet 2019-12-21 20:06:21 (Channels: unscheduledoutages|systemnotices)

    The Blue Waters scheduler is currently paused due to a metadata server issue with the scratch file system. We are actively working the issue and new logins will likely hang without completion.

    Status updates will be posted to the blog on the Blue Waters portal.

    The Blue Waters team.

  • Blue Waters Returned to Service

    Created by: tbouvet 2019-12-21 19:13:07 (Channels: unscheduledoutages|systemnotices)

    Blue Waters returned to service 12/21/2019 at 7:00PM following today's file system issue.

    Please email help+bw@ncsa.illinois.edu to report any issues.

     

  • Blue Waters: Scheduler paused due to file system issue

    Created by: bbode 2019-12-21 13:08:13 (Channels: unscheduledoutages|systemnotices)

    The Blue Waters scheduler is currently paused due to two down storage targets in the scratch file system. Staff are currently working to resolve the issue. 

    Status updates will be posted to the blog on the Blue Waters portal.

    The Blue Waters team.

  • Blue Waters Operations Transitioning

    Created by: jenos 2019-12-19 19:03:08 (Channels: mandatory|systemnotices|general|policychange)

    Hello Blue Waters partners, Today marks the conclusion of regular NSF Blue Waters operations and allocations.  Allocations ending today will be granted the normal 90 day grace period to transfer data off Blue Waters storage systems. After today, jobs remaining in the queue will be permitted to run but job submission...
  • Blue Waters Returned to Service

    Created by: bbode 2019-11-28 22:36:06 (Channels: unscheduledoutages|systemnotices)

    Blue Waters has been rebooted and returned to service at 10:35PM following an issue with the high-speed network earlier this afternoon. All running jobs were lost due to the outage.

    Please email help+bw@ncsa.illinois.edu to report any issues.

  • Blue Waters: HSN issues, full reboot in progress

    Created by: bbode 2019-11-28 17:59:37 (Channels: unscheduledoutages|systemnotices)

    An issue with the high-speed network on Blue Waters has forced a full system reboot. We currently anticipate a return to service at 11PM CST.

    Status updates will be posted to the blog on the Blue Waters portal.

    The Blue Waters team.

  • Nearline Tape Library Has Returned To Service

    Created by: briandi 2019-11-14 17:19:54 (Channels: unscheduledoutages|systemnotices)

    The NCSA_Nearline storage subsystem issue on Blue Waters was resolved and the system returned to normal operations at 3:30 pm.

     

  • Nearline Tape Library Emergency System Maintenance

    Created by: briandi 2019-11-14 12:06:32 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users, Blue Waters is experiencing an issue on a subset of the HPSS storage subsystem (ncsa#Nearline) that began early this morning. System support staff are evaluating and attempting to restore it to normal service. The rest of Blue Waters subsystems remain in normal operation. Some transfers in and out of...
  • Blue Waters Returned to Service

    Created by: kingda 2019-10-01 20:46:38 (Channels: scheduledoutages|unscheduledoutages|systemnotices)

    Blue Waters returned to service at 8:45PM following today's scheduler maintenance. 

    Please email help+bw@ncsa.illinois.edu to report any issues.

  • Scheduled systems testing period extended to 8PM Central

    Created by: jenos 2019-10-01 16:54:57 (Channels: scheduledoutages|unscheduledoutages|systemnotices)

    Update: This testing period will be extended to 8PM due to unanticipated system-related delays. A scheduled systems testing period will take place on Tuesday, October 1st from 7AM to 8PM Central, necessitating a shutdown of the job scheduler. Compute nodes will not be available during the test period. Login nodes and file...
  • Blue Waters outage

    Created by: mshow 2019-10-01 10:15:54 (Channels: unscheduledoutages|systemnotices)

    Multiple cabinets have failed within the Blue Waters system. The failed area will be bypassed and operations will continue. 

  • Blue Waters: NPCF Power Issue Update

    Created by: tbouvet 2019-07-06 03:45:32 (Channels: unscheduledoutages|systemnotices)

    Blue Waters: NPCF Power Issue 7/5/2019 3PM

    All Blue Waters resources are available except for the compute nodes. The compute nodes are being rebooted and all running jobs were lost. No return-to-service (RTS) ETA yet.

  • Blue Waters Returned to Service

    Created by: bbode 2019-07-05 17:57:32 (Channels: unscheduledoutages|systemnotices)

    Blue Waters has been rebooted and returned to service at 5:55PM following a power interruption earlier this afternoon. All running jobs were lost due to the outage.

    Please email help+bw@ncsa.illinois.edu to report any issues.

  • Blue Waters: NPCF Power Issue, Scheduler paused expect full reboot

    Created by: tbouvet 2019-07-05 13:32:15 (Channels: unscheduledoutages|systemnotices)

    A power outage at the building housing the Blue Waters system has caused a service interruption; the Login Nodes, Network, Storage, Compute Nodes, and Near-line Storage may be unavailable. It is unknown at this time when a return to service can be expected. Watch the Blue Waters portal blog for updates.

  • Blue Waters: Nearline Tape Library Has Returned To Service

    Created by: tbouvet 2019-05-31 12:58:05 (Channels: unscheduledoutages|systemnotices)

    The Nearline subsystem reboot completed at 11:30AM today.

    All existing transfers should resume as Nearline was returned to service.

  • Nearline Tape Library System Reboot

    Created by: bbode 2019-05-31 09:28:04 (Channels: unscheduledoutages|systemnotices)

    The Nearline subsystem is currently undergoing an emergency reboot to clear multiple issues. It is expected to return to service by 1PM today.

    All existing transfers will resume once Nearline returns to service.

  • Blue Waters Has Returned to Service

    Created by: tbouvet 2019-05-16 17:15:55 (Channels: unscheduledoutages|systemnotices)

    The storage issue on the Blue Waters projects file system has been resolved and the system returned to normal operations at 5:07 PM CT. Any teams who were impacted by the file system issue have been contacted individually. The scheduler has resumed normal operations. Thank you for your patience while this was...
  • Blue Waters Project File System Update

    Created by: tbouvet 2019-05-16 09:29:40 (Channels: unscheduledoutages|systemnotices)

    We are in the process of running a file system check and repair of a small portion of the projects file system. When that is complete we will assess the results and take appropriate action. Update: File system repair continues and is expected to last until late this afternoon (5/16). If...
  • Blue Waters Project File System Issue

    Created by: tbouvet 2019-05-15 15:20:25 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users, Blue Waters is currently experiencing a storage server issue for a portion of the projects filesystem. As a result, all I/O transactions targeting the affected storage server will block. A single storage server supplies a small fraction of the file system data and all remaining storage servers continue normal operation....
  • Blue Waters OSS Return to Service

    Created by: squaire3 2019-05-12 12:33:07 (Channels: unscheduledoutages|systemnotices)

    The storage issue on Blue Waters was resolved and the system returned to normal operations at 12:30 PM CT. Scheduler has also been resumed.

  • Blue Waters OSS Failover

    Created by: squaire3 2019-05-12 09:18:53 (Channels: unscheduledoutages|systemnotices)

    Blue Waters is currently experiencing a storage server failover on the OSS file system that began at 8:08 AM CT. As a result, all I/O transactions targeting the affected storage server will block until the failover completes on all clients. A single storage server supplies a small fraction of the...
  • Nearline Tape Library System Maintenance

    Created by: glasgow 2019-04-29 16:55:33 (Channels: unscheduledoutages|systemnotices)

    Nearline maintenance work is complete and the service has returned to full operations as of: 1600hrs, April 29th, 2019 --- Nearline is undergoing emergency service on one tape library. The work is related to fallout from last week's hardware service and is expected to take approximately 10 hours to complete. Some files may...
  • Nearline Tape Library System Maintenance

    Created by: glasgow 2019-04-29 09:42:49 (Channels: unscheduledoutages|systemnotices)

    Nearline is undergoing emergency service on one tape library. The work is related to fallout from last week's hardware service and is expected to take approximately 10 hours to complete. Some files may not be accessible during that time.

    Start time: 0945hrs to ~ 2000hrs, April 29, 2019

  • Blue Waters Return to Service

    Created by: squaire3 2019-04-12 15:41:38 (Channels: unscheduledoutages|systemnotices)

    The high-speed network issue on Blue Waters was resolved and the system returned to normal operations at 3:34 PM CT. Scheduler has also been resumed.

  • Blue Waters Scheduler Paused - HSN Issue

    Created by: squaire3 2019-04-12 15:02:55 (Channels: unscheduledoutages|systemnotices)

    Blue Waters is experiencing an issue on the high-speed network that began at 2:10 PM CT. System support staff are evaluating and attempting to restore normal service. Job scheduling is paused until the issue is resolved. The file systems and data transfer services are operating normally. Logins have been occasionally hanging...
  • Nearline Endpoint Paused for Storage Maintenance

    Created by: glasgow 2019-04-08 23:14:20 (Channels: unscheduledoutages|systemnotices)

    The Nearline endpoint has now been returned to normal operations. The Blue Waters Nearline endpoint will be paused beginning at 1700hrs CDT. New and current user actions/requests will be paused and will resume normal activity when the endpoint is released. No user action is necessary. This maintenance...
  • Nearline Endpoint Paused for Storage Maintenance

    Created by: glasgow 2019-04-08 16:27:20 (Channels: unscheduledoutages|systemnotices)

    The Blue Waters Nearline endpoint will be paused beginning at 1700hrs CDT. New and current user actions/requests will be paused and will resume normal activity when the endpoint is released. No user action is necessary. This maintenance window will be used to conduct resource management operations that have been deferred...
  • Blue Waters Returned to Service

    Created by: tbouvet 2019-04-07 15:14:36 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users,

    The storage server issue on the scratch file system is resolved. I/O transactions initiated during the outage should have resumed when the Lustre target returned. Blue Waters has resumed normal operations.

  • Blue Waters Scheduler is paused 9:30 AM

    Created by: tbouvet 2019-04-07 10:30:00 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users, Blue Waters is currently experiencing a storage server issue for a small portion of the scratch filesystem. As a result, all I/O transactions targeting the affected storage server will block. A single storage server supplies a small fraction of the file system data and all remaining storage servers continue normal...
  • Blue Waters cabinet failure

    Created by: mshow 2019-03-23 16:44:36 (Channels: unscheduledoutages|systemnotices)

    The cabinet has been restored.

  • Blue Waters repair complete

    Created by: mshow 2019-03-23 05:13:22 (Channels: unscheduledoutages|systemnotices)

    The cabinet has been restored

  • Blue Waters cabinet failure

    Created by: mshow 2019-03-23 03:46:37 (Channels: unscheduledoutages|systemnotices)

    A cabinet has shut down, resulting in job loss and an incomplete network configuration. It is unknown at this time when a return to service can be expected for that cabinet. Watch the Blue Waters portal blog for updates.

  • Blue Waters returned to service

    Created by: mshow 2019-02-24 03:41:58 (Channels: mandatory|unscheduledoutages|systemnotices)

    The high-speed network issue on Blue Waters was resolved and the system returned to normal operations at 3:30 AM CT.

  • Blue Waters High Speed Network issue

    Created by: mshow 2019-02-23 23:43:19 (Channels: unscheduledoutages|systemnotices)

    Blue Waters is experiencing an issue on the high-speed network that began at 9:48 PM CT. System support staff are evaluating and attempting to restore normal service. Job scheduling is paused until the issue is resolved. The file systems and data transfer services are operating normally. Interim updates will be posted on...
  • Blue Waters Notice: System returned to service

    Created by: jenos 2019-02-06 15:31:27 (Channels: unscheduledoutages)

    Blue Waters Users:

    The reboot is complete and the system has returned to service as of 3:14pm CT. Tomorrow's near-line storage maintenance will proceed as planned.

    We apologize for any inconvenience.

     

  • Blue Waters Unplanned Reboot

    Created by: bbode 2019-02-06 10:00:16 (Channels: unscheduledoutages)

    Blue Waters Users, We experienced an issue that has left the high-speed network in an unrecoverable state. We have to reboot the system to recover and all running jobs will be lost. The login nodes and endpoints (ncsa#Nearline ncsa#BlueWaters) will remain available during the reboot. The current estimate for return to...
  • Blue Waters Notice: System returned to service

    Created by: tbouvet 2019-01-12 13:09:15 (Channels: scheduledoutages|unscheduledoutages)

    Blue Waters Users:

    The reboot is complete and the system has returned to service.

    We apologize for any inconvenience.

     

  • Blue Waters Notice: Unplanned Reboot

    Created by: tbouvet 2019-01-12 09:58:18 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users, We experienced an issue that has left the high-speed network in an unrecoverable state. We have to reboot the system to recover and all running jobs will be lost. The login nodes and endpoints (ncsa#Nearline ncsa#BlueWaters) will remain available during the reboot. The current estimate for return to...
  • Blue Waters Notice: Nearline System (HPSS) return to service

    Created by: tbouvet 2019-01-10 15:24:20 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users:

    The ncsa#Nearline endpoint is available and has returned to service at 3PM CT. We apologize for any inconvenience.

    -Blue Waters

  • Blue Waters Notice: Nearline System (HPSS) remains unavailable.

    Created by: tbouvet 2019-01-10 09:38:30 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users:

    The ncsa#Nearline endpoint is paused pending repairs to systems damaged during the data center power failure on 1/09/19. We apologize for any inconvenience.

    Please check back for an update on the situation.

    -Blue Waters

     

  • Blue Waters Notice: System has returned to service

    Created by: jenos 2019-01-10 09:37:52 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users: The power outage at the NPCF building has been resolved. All Blue Waters services are now available with the exception of the Nearline storage system, which will take a bit longer to recover. The job scheduler has been resumed and login nodes have access re-enabled. There will be...
  • Blue Waters Notice: Power disruption

    Created by: jenos 2019-01-09 16:21:59 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users: Update 4:20pm: All systems up except Nearline.  Access remains restricted while performance tests complete. A power outage at the building housing the Blue Waters system has caused a service interruption; all running jobs were consequently terminated.  All Blue Waters subsystems have been affected and are currently out of...
  • Blue Waters Notice: System returned to service

    Created by: jenos 2018-12-27 23:43:11 (Channels: unscheduledoutages|systemnotices)

    Blue Waters Users:

    The power outage at the Blue Waters building has been resolved and the compute nodes have been restarted. The Blue Waters system has been returned to normal operations at 11:40 PM CT. We apologize for any inconvenience.

     

  • Blue Waters Return to Service

    Created by: tbouvet 2018-11-14 18:35:11 (Channels: unscheduledoutages)

    The metadata server failover on the Home file system completed at 6:23 PM CT. I/O transactions initiated during the failover should have resumed normal operation when failover completed. Blue Waters has resumed normal operations.

  • Blue Waters File System Issue

    Created by: tbouvet 2018-11-14 17:45:36 (Channels: unscheduledoutages)

    Blue Waters is currently experiencing a metadata server failover on the home file system that began at 5:30 PM CT. As a result, all I/O transactions for the home filesystem will block until the failover completes on all clients. The rest of Blue Waters including the other filesystems are operating...
  • Blue Waters Return to Service

    Created by: tbouvet 2018-11-14 11:32:42 (Channels: unscheduledoutages)

    The metadata server failover on the Home file system completed at 11:26 AM CT. The file system issue started at 10:45 AM. I/O transactions initiated during the failover should have resumed normal operation when failover completed. Blue Waters has resumed normal operations.

  • Blue Waters File System Issue

    Created by: tbouvet 2018-11-14 11:19:47 (Channels: unscheduledoutages)

    Blue Waters is currently experiencing a metadata server failover on the home file system that began at 11:15 AM CT. As a result, all I/O transactions for the home filesystem will block until the failover completes on all clients. The rest of Blue Waters including the other filesystems are operating...
  • Blue Waters returned to service

    Created by: mshow 2018-11-05 06:18:32 (Channels: unscheduledoutages)

    The system has been returned to full service operation after a brief unscheduled outage.

  • Blue Waters system issue

    Created by: mshow 2018-11-05 01:21:18 (Channels: unscheduledoutages|general)

    The system has experienced a fault that will require a full system shutdown/reboot. This will take several hours before the system is returned to full operations.

  • Blue Waters Partial Scratch Outage

    Created by: tbouvet 2018-06-23 21:31:02 (Channels: unscheduledoutages|systemnotices)

    Blue Waters experienced a network switch failure that resulted in a partial outage of the scratch filesystem (ost168-179) from 7:44 PM CT to 7:59 PM CT. Jobs that ended during this time may have been impacted. I/O transactions targeting the affected storage server should block until the OST targets returned...
  • Blue Waters returned to full service

    Created by: mshow 2018-06-12 10:23:37 (Channels: mandatory|unscheduledoutages|systemnotices)

    Blue Waters has returned to full service after recovery from a power event.

  • Blue Waters system power interruption

    Created by: mshow 2018-06-12 04:56:14 (Channels: mandatory|unscheduledoutages|systemnotices)

    Thunderstorms have resulted in a power interruption of the Blue Waters system. This outage impacts both the compute nodes and all filesystems. Therefore, a full reboot will be necessary. Return to service is estimated to be approximately 10 am Central time.

    BW Admin

  • Blue Waters has Returned to Service

    Created by: tbouvet 2018-06-07 14:27:12 (Channels: unscheduledoutages|systemnotices)

    Blue Waters has returned to full service at 2:14 PM CT. The issue encountered required a full system reboot to resolve. All running jobs were lost, so please resubmit your jobs from the latest checkpoint file if your job exited prematurely.

  • Blue Waters System Issue

    Created by: tbouvet 2018-06-07 10:12:41 (Channels: unscheduledoutages)

    Blue Waters is experiencing a full system issue that began at 6:30 AM CT. System support staff are evaluating and attempting to restore normal service but may require a full system reboot. Job scheduling is paused until the issue is resolved. Interim updates will be posted on the Blue Waters...