2016 SPIN Student Projects
Gregory Bauer, University of Illinois at Urbana-Champaign
Usage Details
Gregory Bauer, Mohammadreza Ahmadzadehraji, Yushuo Lin, Aniruddha Pispati, Samuel Kaufman, Yibo Jiang

Students Pushing Innovation (SPIN) is an outreach and training effort by NCSA to expose students to the various projects located at NCSA.
The current 2016 SPIN cadre has several students working with mentors on the Blue Waters project. These projects are:
Mentor: Jeremy Enos
A universal monitoring issue spans a range of Unix platforms, from HPC supercomputers to standalone Linux servers. Particularly for administrative activity on systems with shared administrative responsibility, tracking who did what, and when, is a well-known challenge. Several approaches have been taken to address the issue, but all have specific weaknesses or cannot meet a rapidly evolving need. With the Multi-Session Monitoring tool concept, a specific set of gateway hosts would be enabled with secure shell session recording capability. Sessions that propagated through the gateways would then be recorded as well, so no custom software changes would be needed on any further endpoints for them to be included in the recording. Further, logic could be applied to the recording content, or to system information outside the sessions, to identify when sessions propagated to different hosts or when an identity change took place. Ultimately, these recordings would be cataloged and made available for historical or live viewing from a web-based interface (a Python-based session-to-HTML recorder already exists and could potentially be extended for this effort). The interface would permit filtering on various criteria, starting with user, endpoint host, idle time, command, and point in time (or live). The result would be the capability to use the interface to very rapidly determine the fourth item when any three of who, what, when, and where an action was taken are provided. This type of tool would make waves in a community facing the challenges of multi-stewardship of Unix resources, particularly when some endpoints (e.g., Unix "appliances") may not be modified with the instrumentation tools that already exist to partially address this challenge.
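As a toy illustration of the gateway-side recording idea, the sketch below runs a shell (or any command) under a pseudo-terminal and appends its output, prefixed with timestamps, to a log file. The log path, the timestamp format, and the use of Python's pty module are assumptions for illustration only, not details of the proposed tool:

```python
#!/usr/bin/env python3
"""Minimal sketch of a session recorder, assuming a Linux host.
A real tool would also catalog sessions and track propagation."""
import os
import pty
import time

def record_session(log_path, shell="/bin/sh"):
    """Spawn a shell (or argv list) under a pseudo-terminal and append
    every chunk of its output, prefixed with a timestamp, to the log,
    so the session can later be replayed or searched."""
    with open(log_path, "ab") as log:
        def reader(fd):
            # Called by pty.spawn for each chunk of terminal output.
            try:
                data = os.read(fd, 1024)
            except OSError:  # pty closed: the session has ended
                return b""
            if data:
                stamp = ("%.6f " % time.time()).encode()
                log.write(stamp + data)
                log.flush()
            return data
        pty.spawn(shell, reader)

if __name__ == "__main__":
    record_session("/tmp/session.log")  # hypothetical log location
```

Because the recording happens on the gateway side of the pseudo-terminal, the programs running inside the session need no modification, which is the key property the proposal relies on.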
Mentor: Jeremy Enos
The large-scale scientific simulation data routinely generated by researchers on Blue Waters is central to the process by which scientific breakthroughs are made. While this data presents significant challenges at every stage of its lifecycle, its prominence drives concerted efforts at each of those stages. Behind the scenes, however, an analogous kind of data is generated: extensive system diagnostic information. As opposed to the physically structured simulation data, this diagnostic data may be considered a collection of independent events and/or measurements. Although this information is extensive, technical staff have nonetheless taken steps to curate and store it so that it remains available for future analysis. We believe the relative obscurity of this data, along with the fact that it typically affects discoveries only indirectly, is the reason that methods for benefiting from it are ad hoc or incomplete. We also believe this data represents a vast set of opportunities for research and development. In this project we will begin by evaluating current development efforts in visualizing this data, with the aim of creating a tool that will actually be deployed and used by HPC professionals. We will then plan future activities by coordinating several perspectives: the existing data, the experts generating it, those who might benefit from its use, challenges or gaps in coverage that require innovation to address, and what looks fun.
Mentor: Andriy Kot
The Blue Waters applications group has a big data challenge. Not all data can reside in a database; this is especially true of the system monitoring data for Blue Waters. We have a large number of temporal records stored in a collection of relatively big files, around 50 GB each. The records are variable in length and are not sorted, and each record contains multiple data points. We would like to extract a subset of data points from a subset of records, then perform some basic statistical operations such as sum, min, max, and mean. Doing this in a straightforward way requires a lot of computing time, and looking at the same files repeatedly (e.g., for a different set of data points from the same records) costs that same amount of computing time every time. The proposed solution is to index the data files, either by preprocessing or on demand, and to store the indexes in a database to accelerate all subsequent queries. The interested SPIN student should have some understanding of file I/O (an understanding of parallel file I/O would be a plus) and some relational database experience.
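To make the index-then-query idea concrete, here is a minimal sketch assuming a hypothetical line-oriented record format ("&lt;epoch&gt; key=value ..."); the real files are far larger and their format is not specified here. One pass stores each record's timestamp and byte offset in SQLite, so later queries seek directly to matching records instead of rescanning the whole file:

```python
#!/usr/bin/env python3
"""Sketch of the proposed approach: index big record files once,
then answer statistical queries from the index. The record format
and schema are assumptions for illustration only."""
import sqlite3

def build_index(data_path, db_path):
    """One pass over the data file: store each record's timestamp and
    byte offset so queries can seek instead of rescanning."""
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE IF NOT EXISTS idx (ts REAL, off INTEGER)")
    db.execute("DELETE FROM idx")
    with open(data_path, "rb") as f:
        off = f.tell()
        for line in iter(f.readline, b""):
            ts = float(line.split(None, 1)[0])  # leading epoch stamp
            db.execute("INSERT INTO idx VALUES (?, ?)", (ts, off))
            off = f.tell()
    db.commit()
    return db

def query_mean(db, data_path, key, t0, t1):
    """Mean of one data point over records in [t0, t1], reading only
    the byte ranges the index points at."""
    vals = []
    with open(data_path, "rb") as f:
        for (off,) in db.execute(
                "SELECT off FROM idx WHERE ts BETWEEN ? AND ?", (t0, t1)):
            f.seek(off)
            fields = dict(p.split(b"=") for p in f.readline().split()[1:])
            if key in fields:
                vals.append(float(fields[key]))
    return sum(vals) / len(vals) if vals else None
```

The same index serves every subsequent query over those records, which is where the repeated-scan cost described above is saved; a real implementation would index additional fields and handle the binary record format.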
Mentor: Michael Showerman
This project will involve the development of a mobile application that interfaces with a variety of Blue Waters information services and integrates a customizable alert system. The work will involve software development of an Android and iOS app, extending an existing prototype. Prior C++ or Java programming experience is greatly preferred, but mobile programming can be learned as part of this project.
Mentor: Robert Sisneros
The Hadoop MapReduce software framework is commonly used for processing "big data," and as such we are evaluating its potential for success on modern HPC equipment through deployment on Blue Waters. While there has been some work on creating visualization algorithms in that framework, there is little overlap with the state of the art for visualizing large-scale HPC simulation data. We would like to explore the design of a software volume renderer, implemented in this framework, that we can stand up quickly and improve incrementally and modularly. The goal of this project is to create a useful and releasable software package with a low barrier to entry for future contributing developers. The students will learn valuable skills while Blue Waters benefits from their contributions to the projects they are working on.

For some projects a small amount of node-hours and storage is needed: 5,000 node-hours across XE and XK nodes with the default storage quotas. Access to system logs, etc. will be handled via different group ACLs.
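As a rough sketch of how a volume renderer maps onto the MapReduce model, the toy example below splits a volume into z-slabs, maps each slab to a partial image via maximum-intensity projection, and reduces by elementwise max. A production renderer with full compositing, and the actual Hadoop job wiring, are beyond this illustration:

```python
#!/usr/bin/env python3
"""Toy map/reduce-style volume rendering sketch. Maximum-intensity
projection stands in for a full compositing renderer; map_block and
reduce_images would become Hadoop map and reduce tasks."""
from functools import reduce

def map_block(block):
    """Map: project one z-slab (block[z][y][x]) down to a partial
    2-D image by taking the max over its z-slices."""
    ny, nx = len(block[0]), len(block[0][0])
    return [[max(sl[y][x] for sl in block) for x in range(nx)]
            for y in range(ny)]

def reduce_images(a, b):
    """Reduce: composite two partial images. For max-intensity
    projection compositing is an elementwise max, so it is
    order-independent -- exactly what MapReduce requires."""
    return [[max(p, q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def render(blocks):
    """Render a volume given as a list of independent z-slabs."""
    return reduce(reduce_images, (map_block(b) for b in blocks))
```

Choosing an order-independent compositing operator is the key design decision here: it lets the reduce stage combine partial images in whatever order the framework delivers them, which keeps the renderer simple to stand up and easy to extend module by module.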