Description of Data and Information Management
The goal of the PIE LTER data and information system is to provide a centralized network of information and data related to the Plum Island Ecosystem. This network provides researchers access to common information and data in addition to protected long-term storage. Data and information are also easily accessible to local, regional, and state partners and the broader scientific community. Researchers associated with PIE are committed to the integrity of the information and databases resulting from the research.
Access by the public and scientific community to data and information has been provided since 1998 through the web site, http://ecosystems.mbl.edu/PIE, but we are in the process of migrating to a new web site, http://pie-lter.mbl.edu (see below). The web site contains information on personnel, data, published and unpublished papers, reports and School Yard education. The data section features Core Research and Signature Data, Data Links, Education and Outreach and Physical Characteristics. PIE maintains an Intranet site with archived datasets, from which the PIE web site is updated annually. Some datasets (streaming data logger data) are updated more frequently. The organization of the PIE home page mirrors the nomenclature of the Intranet archive, which allows for easy updating of datasets. MBL researchers can directly access archived data on MBL's server. Non-MBL researchers have access to a secure FTP site at MBL for archival backup of their data (both unprocessed and processed). PIE maintains a server at the Rowley Field Station to manage streaming of telemetry data from weather, water quality and eddy flux remote stations. Near-real-time data (helpful when planning research schedules) are available on the Rowley Field Station web site, http://www.pielter.org/, and will be integrated into the new web site.
New PIE Web Site
In a collaborative effort with 8 other LTER sites (LNO, ARC, SEV, LUQ, NTL, VCR, JRN and NWT), we are migrating our web site to Drupal, an open-source, web-based relational content management system. We made the decision to move to Drupal because management of the current PIE web site requires time-consuming manual editing of HTML to update content, and the web site does not allow for search and discovery of information content. The development of this web site began in 2010 and is expected to be completed by 2013. A Drupal system runs on a Linux or Windows OS with an Apache, MySQL and PHP installation (LAMP or WAMP). The resulting Drupal-based web site has powerful capabilities for querying data and information in a relational database via an Internet web interface. The Drupal Environmental Information Management System (DEIMS) collaborative, coordinated by Inigo San Gil (LNO, MCR), has grown to include sites beyond the LTER network. The goal of PIE, collaborating LTER sites and other DEIMS sites is to provide a viable environmental information content management system with standardized core content types, so that programming and development work can be shared across sites. Sharing of code across sites is easily accomplished using Drupal modules and Drupal export/import capabilities of PHP code. Utilizing a variety of Drupal modules such as Views, Panels and Taxonomy (content tagging via controlled vocabulary), relational information content will be easily searched and discovered. The Drupal system, via open-source shared modules, leverages the programming capabilities of thousands of programmers around the world, http://drupal.org/home. The DEIMS collaborative has a repository of information located at
PIE information and databases are stored on an MBL Microsoft Windows server with a level 3 RAID array that is backed up on an external drive nightly. Once a month the external drive is replaced and stored in an offsite location. The PIE Drupal web site is served from a Linux virtual machine with a LAMP setup, which is backed up nightly and mirrored to an offsite location.
Data Management and Design of Research Projects
Data management and design of research projects are coordinated through an information management team. The information management team consists of: Anne Giblin (Lead PI), Joe Vallino (PI), Robert (Hap) Garritt (IM), Gil Pontius (PI) and research assistants associated with program areas. The team has the necessary leadership, knowledge and technical expertise for creating and maintaining the PIE research information. Hap Garritt, a senior research assistant with The Ecosystems Center, MBL, has been the information manager (IM) since 1998 and is responsible for overseeing the overall integrity of the data and information system for PIE LTER. Hap has over 30 years of experience in ecological research, an MS in Ecosystems Ecology and is very active in PIE research. Hap's regular research activities involve him in the design and execution of many of the research projects, which allows for a smooth incorporation of data and information into the PIE database.
Several meetings each year provide each researcher the opportunity to communicate with the PIE information management team regarding the design of the specific research project and subsequent incorporation of data and information into the PIE LTER database.
Contributions of Data to Database
Individual researchers are responsible for providing metadata and data for each of the core research areas using a Microsoft Excel template. Researchers on the PIE-LTER are expected to follow the LTER Network data release policy defined on the LTER web page, http://www.lternet.edu/data/netpolicy.html. Researchers using the facilities of the PIE LTER are expected to comply with the LTER policy even if they are not funded by the LTER. Data files must include accompanying documentation files that completely describe the data. The Excel template allows for consistent metadata entry and subsequent conversion, via a Visual Basic macro, to XML-structured Ecological Metadata Language (EML 2.1.0) according to EML Best Practices for LTER Sites. Individual researchers are responsible for quality assurance, quality control, data entry, validation and analysis for their respective projects. Researchers are reminded about contributions to the database several times during the year via email, teleconference calls and field sampling trips, in addition to announcements during our Annual Spring PIE LTER All Scientists Meeting. LTER researchers who fall behind in their data submission are referred to the Executive Committee for further action.
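The template-to-EML conversion described above is done at PIE with a Visual Basic macro, which is not reproduced here. As a rough illustration of the idea only, the following Python sketch turns a few metadata fields into a minimal EML 2.1.0 skeleton. The function name `metadata_to_eml`, the dictionary keys and the sample values are hypothetical, and the output is a simplified fragment that has not been checked against the full EML schema or the LTER best-practice requirements.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by EML 2.1.0 documents.
EML_NS = "eml://ecoinformatics.org/eml-2.1.0"
ET.register_namespace("eml", EML_NS)


def metadata_to_eml(meta: dict) -> str:
    """Build a minimal EML-like XML string from a metadata dictionary.

    The keys (package_id, system, title, creator_surname, abstract) are
    illustrative assumptions, not the fields of the actual PIE template.
    """
    root = ET.Element(
        f"{{{EML_NS}}}eml",
        {"packageId": meta["package_id"], "system": meta["system"]},
    )
    dataset = ET.SubElement(root, "dataset")
    ET.SubElement(dataset, "title").text = meta["title"]

    # One creator, identified here by surname only.
    creator = ET.SubElement(dataset, "creator")
    individual = ET.SubElement(creator, "individualName")
    ET.SubElement(individual, "surName").text = meta["creator_surname"]

    # The abstract text is wrapped in a <para> element.
    abstract = ET.SubElement(dataset, "abstract")
    ET.SubElement(abstract, "para").text = meta["abstract"]

    # Reuse the creator as the dataset contact for this sketch.
    contact = ET.SubElement(dataset, "contact")
    contact_name = ET.SubElement(contact, "individualName")
    ET.SubElement(contact_name, "surName").text = meta["creator_surname"]

    return ET.tostring(root, encoding="unicode")


example = {
    "package_id": "example.1.1",   # placeholder identifier
    "system": "example-system",    # placeholder system name
    "title": "Example dataset title",
    "creator_surname": "Researcher",
    "abstract": "Placeholder abstract text.",
}
print(metadata_to_eml(example))
```

A real conversion would also need to carry the attribute-level metadata (column names, units, missing-value codes) from the template, which this sketch omits.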
Data Accessibility and Timeliness
Researchers on the PIE-LTER are required to contribute data to the PIE-LTER database. Researchers on associated projects have been and will continue to be encouraged to both publish and contribute data to the PIE-LTER database. It is recognized that investigators on PIE-LTER have first opportunity to use data in publications, but it is also recognized that data sets must be submitted in a timely manner for incorporation into the PIE-LTER database. Data are typically posted on the PIE web site within one to two years, and selected data are made available in near real time to promote ecological awareness of the local environment. PIE follows the data release policy for the LTER network that states:
“There are two types of data: Type I (data that is freely available within 2 years) with minimum restrictions and, Type II (Exceptional data sets, rare in occurrence that are available only with written permission from the PI/investigator(s)).”
Datasets are available across the breadth of PIE research in the watersheds and estuary. We currently have no registration requirements for either viewing or downloading data from our web site, which has resulted in seamless access to all PIE LTER data. PIE data downloads on our web site are accompanied by a metadata document, which requests (based on the honor system) that users of the data notify the corresponding Principal Investigator about their reasons for acquiring the data and any resulting publication intentions. While our current system allows easy access to data, it does not allow us to track individual users through a registration interface. During 2012, we will begin integrating the Data Access Server interface developed by the LTER Network Office (LNO) with the new PIE Drupal web site as a means of standardized registration and documentation of the use of PIE data sets. The Data Access Server will require users interested in downloading data to register and comply with the LTER Network Data Access Policy.
File Naming and Site Name Protocol
The Excel metadata template file contains one sheet for metadata and one for data. The resulting combined metadata and data file should be named starting with a three-letter acronym for the main research theme of the data set (see table below). A two-letter acronym that describes the research site is appended next (see table below). A descriptor is then appended that briefly describes the dataset. For instance, dissolved oxygen transect data from the Parker River is named "EST-PR-O2", and tidal-creek nutrient data from the long-term experiments is named "LTE-TC-NUT". The data manager may edit file names as appropriate.
The same naming conventions apply to the PIE LTER site names. All site names have the same three-letter and two-letter acronyms, followed by an appended descriptor. For instance, the Middle Bridge YSI monitoring station is named "MON-PR-MBYSI". The descriptor is either an abbreviation describing the station or a river kilometer distance that describes its location along one of the rivers in the Plum Island ecosystem. Please use this River Kilometer Image as an aid in determining names for sampling locations. The image is color-coded to show which two-letter site descriptor code should be used in station naming. Locations on a river are determined by river kilometer, where the mouth of Plum Island Sound is the starting point at 0.0 km. River kilometer markings are clearly labeled in this image, as is the 0.0 km starting point in the Sound.
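The naming convention above can be checked mechanically. The following Python sketch builds and parses names of the form THEME-SITE-DESCRIPTOR, based only on the three examples given in the text. The helper names (`NAME_PATTERN`, `build_name`, `parse_name`) are hypothetical, and the pattern assumes upper-case acronyms and a descriptor made of letters, digits and decimal points; it does not check acronyms against the official tables.

```python
import re

# Assumed shape: three-letter theme, two-letter site, then a descriptor.
# The descriptor character set is an assumption drawn from the examples.
NAME_PATTERN = re.compile(r"^([A-Z]{3})-([A-Z]{2})-([A-Za-z0-9.]+)$")


def build_name(theme: str, site: str, descriptor: str) -> str:
    """Join the three parts with hyphens and validate the result."""
    name = f"{theme.upper()}-{site.upper()}-{descriptor}"
    if NAME_PATTERN.match(name) is None:
        raise ValueError(f"name does not follow the convention: {name!r}")
    return name


def parse_name(name: str) -> dict:
    """Split a dataset or site name into theme, site and descriptor."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"name does not follow the convention: {name!r}")
    return {
        "theme": match.group(1),
        "site": match.group(2),
        "descriptor": match.group(3),
    }


# Examples taken from the text above.
print(build_name("est", "pr", "O2"))
print(parse_name("MON-PR-MBYSI"))
```

A fuller version could also look up the theme and site acronyms in the tables below and reject unknown codes.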
Major Research Theme Acronyms
|Higher Trophic Levels
|Long Term Experiments
|Short Term Projects & Experiments
|Geographic Information Systems
Some Site Descriptors
|Plot-level Marsh Fertilization