Workshop attendees are being identified through a mix of invitation and expression of interest. Position papers are invited from interested individuals, institutions, and organizations (formal and virtual), whether or not they have already been invited.
The National Science Foundation is sponsoring a workshop on the topic “Cyberinfrastructure Software Sustainability,” to be held March 26 and 27, 2009, at Indiana University. To encourage broad and incisive community input on this topic, the workshop organizers are issuing an open call for position papers. Position papers should be no longer than 3 pages.
The iPlant Collaborative (iPlant) was funded in February 2008 by NSF as a project to “create a new type of organization – a cyberinfrastructure collaborative for the plant sciences” – that seeks to transform the way plant biologists answer Grand Challenge questions and collaborate in the data-laden and cross-disciplinary research environment in which we now live.
Grand Challenges in the plant sciences are research questions that are currently intractable with conventional approaches. For example: What are the genetics of species range limits? How do we improve crop yield under environmental stress? Which genes and pathways have an effect on ecophysiological behavior of plants? The iPlant Collaborative focuses on using cyberinfrastructure development as one way to resolve such questions.
The iPlant Collaborative would very much like the opportunity to learn from leaders in software sustainability and reusability. As our project moves forward, these issues must be addressed from the beginning of our software and data life cycles.
As the nation prepares for the TeraGrid “eXtreme Digital” future, establishing an independent, unbiased national software oversight committee focused on meeting the requirements of a much broader national science community could revitalize the nation’s interest in computational science and enable researchers to secure maximum advantage from the world-class resources to be deployed. The goals, structure, and responsibilities of such an oversight committee are outlined as a proactive mechanism for dealing with software sustainability into the future.
Scientific research is based upon the ability to compare theory and simulations with experimental and observational data. The data are assembled in reference collections to enable comparison of future analyses with the current state-of-the-art understanding. The reference collections are published for use by the entire scientific discipline. This process is used by all science disciplines to document scientific progress and facilitate exchange of knowledge. We present an approach that has been used to implement data grid technology that is used internationally in support of data sharing, data publication, data preservation, and data analysis.
Open source development relies primarily on voluntary effort. To be sustained by open source development, cyberinfrastructure projects must survive the transition from being sustained by funding to being sustained by volunteers. They must be able to attract and retain volunteers. We point out that although examples of successful open source projects exist, it is not yet possible to ascertain a priori whether individual projects can attract and hold sufficient numbers of volunteers to be successful. We assert that infrastructure projects are more suitable to open source development than are research projects because the motives of the founders predispose them to open projects to communities of developers very early, establishing communities that are more likely to survive the transition from funding to independence. Finally, we encourage funding agencies that are interested in open source development to formulate policies and guidelines that clearly set expectations for the transition from funding to independence.
This paper summarizes my personal thoughts on how the Fedora Project can serve as a model for the
NSF to follow to sustain the development and widespread use of a diverse ecosystem of scientific
software in an open and transparent manner. The NSF's role would parallel the one which Red Hat
currently plays in the Fedora Project, by making key investments and setting project direction, while
leaving room for the community to grow the project in new ways. I've been selected to write this paper
from the Fedora perspective in part because, as a scientific researcher and Fedora contributor, I straddle
both the science and software development communities. In the following discussion I focus on two
aspects of the Fedora project structure I feel are most relevant to the CI Software Sustainability
Workshop goals and NSF's core mission.
The Open Source Software development model provides a number of benefits for Cyberinfrastructure Software development. We describe how sustainability can be achieved for Open Source projects by using the foundation community model. Foundations are created to address specific issues and provide a common infrastructure, rules, and licenses for individual projects. These foundations provide longevity, standard expectations, and cultural commonality for software projects that we expect to increase cyberinfrastructure sustainability. We elaborate on ways that granting agencies can interact with foundations to develop software and note that some forethought must go into the design and governance of these software foundations.
It is important in considering sustainability to ask what needs to be sustained.
Specifically, is it software products, or the capabilities enabled by that software? In this
position paper, we argue that sustaining capabilities and architecting to enable change are
often the better choice, particularly when one considers the costs and consequences of
maintaining a specific software product. Further, these approaches are not independent –
software products architected to support change are themselves more maintainable. We explore
the idea of sustaining capabilities in more detail, discuss its application to scientific
cyberinfrastructure, and conclude with a discussion of how this reframes the discussion
of sustainability and leads to different approaches to achieving long-term sustainability of
scientific cyberinfrastructure.
This paper describes our position on the sustainability of Cyber Infrastructure (CI) software and the support it requires, from the perspective of managing the GridChem TeraGrid Science Gateway, a production-quality cyber-environment currently in use. We present a case for elaborating a national CI software policy, drawing also on the perspectives of developers of multiple components of domain-engaging CI. In the process, we identify possible ways for NSF to provide goal-oriented support for CI software sustainability that meets the needs of science and engineering research communities in their 21st-century multidisciplinary endeavors. Continual research community development, engagement, and involvement in CI software development, deployment, and usage is noted as one of the critical components of CI software sustainability.
In all areas of science, technology, and digital industry, vast amounts of data are produced that need to be reliably and efficiently stored, processed, accessed, and shared. The inadequacy of traditional distributed computing systems in dealing with complex data handling problems in our new data-rich world requires a new paradigm called data-aware distributed computing. Advances in data-aware distributed computing will capitalize on NSF’s investments in TeraGrid, DataNets, and other large-scale cyberinfrastructure and computational science efforts, and will directly impact scientific discovery and economic development in the nation.