ARCTIC LTER INFORMATION MANAGEMENT
Overall Strategy and Structure
Information management in the Arctic LTER has two principal aims. The first is to maximize data access both within the project and to other researchers. We try to maximize data access by rapidly adding new data sets to the data base (usually before publication) and by making all of the data sets available for downloading by anyone; the only requirements are: (1) users must identify themselves via the LTER Network’s data access system or the LTER Network Information System (NIS) and (2) NSF and the Arctic LTER project must be acknowledged in any use of the data. The second aim is to optimize data usability and integration for within-site synthesis and modeling, regional and long-term scaling, and multisite or global comparisons and syntheses. Careful planning at the research design stage is required to ensure that any single set of measurements is easily linked to other measurements; typically this includes working closely with collaborating projects so that their work on LTER sites and experiments is optimally integrated.
The structure of our information management system parallels the overall structure of the project, with four major components to the ARC LTER information system linked to the terrestrial, streams, lakes, and landscape interactions research components. A Senior RA, Jim Laundre, is the overall project information manager with responsibility for overseeing the integrity of the ARC information system. Information management is a primary responsibility of all four full-time RAs associated with each of the research components. While each of the four core RAs maintains the data in their area, all are in frequent communication on overall data compatibility and metadata standards (currently two work at the MBL in Woods Hole, one is at University of Michigan, and one at University of Vermont). Each RA is deeply involved in the actual research design, day-to-day management, and data collection within their area. The four RAs work closely in the field with investigators, technicians, and students to ensure quality control and appropriate documentation. For most of the past year we have also employed, with annual supplemental funding, an information management RA specifically charged with validating and uploading our data base to the new “PASTA” system at the LTER Network Data Portal (https://portal.lternet.edu). Overall guidance is provided by the ARC Executive Committee while Laundre attends the LTER Network Information Manager's meetings and makes sure we are kept up to date and compatible with Network data standards.
Each year at our annual winter meeting in Woods Hole we review the status of the information system and ways of improving its accessibility and ease of use. At this meeting we focus in particular on the upcoming summer season and on how to design our research for optimum integration of diverse data sets. All project personnel including postdocs, graduate students, and occasional REU students participate in these discussions.
Availability of Datasets
Datasets of the Arctic LTER project are available from the Arctic LTER web site (http://ecosystems.mbl.edu/arc/Datatable.html) and can be downloaded once a user is registered with the Network Data Access System (http://metacat.lternet.edu/das/). We ask only that the LTER project and the principal investigator responsible for the data set be informed and that NSF and the ARC LTER be acknowledged in any published papers that use the data.
Data from the large-scale experiments and from routine monitoring are available online as soon as the data are checked for quality and, where necessary, transformed for presentation in standard units and scales. Many data sets, such as weather observations, stream flow, flower counts, and data that do not require a great deal of post-collection chemical or other analysis, are available within 3-6 months of collection. Other data, particularly from samples requiring chemical analysis in our home laboratories, may take up to two years to appear online. We also ask collaborating projects to contribute their datasets to our online database, and most do so. In addition to hosting datasets on our web server, the ARC LTER participates in the LTER Network’s ClimDB, HydroDB, and EcoTrends information systems. These centralized databases provide access to meteorological, hydrological, and long-term change data from all the LTER sites. We have recently begun transferring our data sets into the new “PASTA” system developed for the LTER NIS; this transfer is nearing completion as of late May 2013.
Format of Datasets
Investigators, technicians, and students who collect the data are responsible for data analysis, quality control, and documentation. This ensures that the data are checked and documented by those most familiar with them. While investigators may use any software for their own data entry and analysis, we expect all submitted documentation and datasets to conform to the required ARC LTER formats. The metadata and data are submitted using ARC LTER’s Excel-based metadata form. Comments are used extensively throughout the sheet to help users fill out the form. Data validation lists are used to create drop-down lists for units, measurement scale, and number types. For researchers who do not use Excel, a rich text form is available, with the data submitted as comma-delimited ASCII. Researchers are encouraged to include the metadata worksheet in their Excel workbooks to facilitate documentation; the worksheet was designed to be easily moved or copied. Submitted files are checked for conformance by the four senior RAs. Once files are accepted, they are placed in the appropriate data directories on the web. An Excel macro is used to parse the metadata form and generate the HTML, XML, and data files needed for accessing the data via the web. The XML file conforms to the LTER Network’s “EML Best practices” and is PASTA ready. The XML file is uploaded to the LTER Network Office metacat server and the new LTER Network Data Portal (https://portal.lternet.edu) via a harvest list. Uploaded files are then available from the LTERNET data catalog or any metacat server.
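The kind of conformance check described above can be illustrated with a short sketch. This is not the ARC LTER Excel macro or the RAs' actual procedure; it is a minimal Python example that assumes a submission has already been reduced to a list of attribute descriptions (name, unit, measurement scale) plus a comma-delimited ASCII data file. The function name check_submission, the dictionary keys, and the ALLOWED_SCALES vocabulary are all hypothetical stand-ins for the form's drop-down lists.

```python
import csv
import io

# Hypothetical controlled vocabulary standing in for the form's drop-down list
# of measurement scales (an assumption, not the actual ARC LTER list).
ALLOWED_SCALES = {"nominal", "ordinal", "interval", "ratio", "dateTime"}
NUMERIC_SCALES = {"interval", "ratio"}


def check_submission(attributes, data_text):
    """Return a list of problems; an empty list means the file conforms.

    attributes: list of dicts with hypothetical keys "name", "unit", "scale".
    data_text:  contents of a comma-delimited ASCII file with a header row.
    """
    problems = []

    # 1. Every attribute must use a measurement scale from the controlled list.
    for attr in attributes:
        if attr["scale"] not in ALLOWED_SCALES:
            problems.append(f"{attr['name']}: unknown scale {attr['scale']!r}")

    rows = list(csv.reader(io.StringIO(data_text)))
    if not rows:
        problems.append("data file is empty")
        return problems

    # 2. The header row must list the documented attributes, in order.
    header, body = rows[0], rows[1:]
    expected = [attr["name"] for attr in attributes]
    if header != expected:
        problems.append(f"header {header} does not match metadata {expected}")
        return problems

    # 3. Values in numeric columns must parse as numbers (blank = missing).
    for line_no, row in enumerate(body, start=2):
        if len(row) != len(expected):
            problems.append(f"line {line_no}: expected {len(expected)} fields")
            continue
        for attr, value in zip(attributes, row):
            if attr["scale"] in NUMERIC_SCALES and value != "":
                try:
                    float(value)
                except ValueError:
                    problems.append(
                        f"line {line_no}, {attr['name']}: {value!r} is not numeric"
                    )
    return problems
```

Returning a list of problems rather than stopping at the first one lets a reviewer send a single, complete report back to the submitter.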
General site information and publications
General information about the ARC LTER project is provided on our web site (http://ecosystems.mbl.edu/arc/) including site descriptions, past proposals and other documents, a site bibliography including publications based on project research (Table 4-2), educational opportunities, contact information for site personnel, and links to related sites. This information is updated once a year or whenever major changes occur.
Toolik Field Station Environmental Monitoring Program
The Arctic LTER and its precursor projects have maintained an environmental monitoring program at Toolik Lake since 1975, including basic weather data as well as stream and lake observations. These data have always been made available to other projects and to Toolik Field Station (TFS) management but, as the number and diversity of projects at TFS have grown, it has become clear that it would be more appropriate for TFS to maintain these observations and make them available via the TFS web site. Increased support for TFS from NSF-OPP has also made it possible for TFS to make additional observations that the ARC LTER cannot afford by itself.
To accommodate these changes, since September 2006 TFS has gradually assumed responsibility for maintenance and data management of the main Toolik weather station, which LTER has been supporting since 1987. The ARC LTER project is still responsible for collection and management of weather and other data collected from experimental plots and as part of LTER research. Toolik Field Station weather data are available from the TFS web site (http://toolik.alaska.edu/edc/index.php), which also offers a new weather data query and plotting capability. The TFS Environmental Data Center has added further components including plant phenological monitoring, bird observations, and other year-round observations of weather and natural history that cannot be made by LTER personnel, who are not year-round residents.
Geographic Information Systems, Mapping, and Remote Sensing
Geographic information from the Toolik Lake region is extensive, detailed, and linked to several key global and regional databases. Because much of this first-class information system was developed with funding independent from the ARC LTER project, we have focused our efforts on ensuring access to this valuable database and on optimizing its usability for our needs. Where appropriate, we have contributed some funds and personnel support to guarantee this access and usability. Links to the key databases are provided on the Arctic LTER web site at http://ecosystems.mbl.edu/arc/datacatalog.html; these include:
The data are stored on a Microsoft Windows server with RAID level 3. The data files are backed up to a backup storage drive daily. Once a month a drive is removed and stored in a separate building. In the near future the data and web site will be part of the Marine Biological Laboratory data backup system, which will provide off-site and redundant storage.
Anticipated changes, 2013-2017
Several changes are planned to our overall Information Management strategy and practices. Our current approach was reviewed favorably in the 2007 Site Review, with no major changes recommended. We do plan to continue organizing and making available older “legacy” data sets in line with LTER NISAC recommendations. We are currently completing the transition of our metadata from EML Best Practices level 2/3 (no attribute EML) to the new PASTA system (all of the old data sets are still available using METACAT; ~200 data sets are available in PASTA). Bringing the metadata up to PASTA standards requires that data sets be reviewed and, where appropriate, consolidated into multi-year files. Differences in methods and personnel will require that some years remain separate. For some data sets we will use a relational database for storing and retrieving subsets of data.
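As an illustration of how consolidating yearly files and retrieving subsets from a relational database might fit together, the following minimal Python sketch loads per-year comma-delimited files into a single SQLite table and queries a range of years. It is not the system planned for the ARC LTER; the table name obs, the column names (site, variable, value), and the functions load_years and subset are all hypothetical.

```python
import csv
import io
import sqlite3


def load_years(conn, yearly_csv):
    """Consolidate per-year CSV text into one table, tagging rows with the year.

    yearly_csv maps a year (int) to CSV text with the hypothetical columns
    "site", "variable", and "value".
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS obs "
        "(year INTEGER, site TEXT, variable TEXT, value REAL)"
    )
    for year, text in sorted(yearly_csv.items()):
        reader = csv.DictReader(io.StringIO(text))
        conn.executemany(
            "INSERT INTO obs VALUES (?, ?, ?, ?)",
            [(year, r["site"], r["variable"], float(r["value"])) for r in reader],
        )
    conn.commit()


def subset(conn, variable, first_year, last_year):
    """Return (year, site, value) rows for one variable over a range of years."""
    cursor = conn.execute(
        "SELECT year, site, value FROM obs "
        "WHERE variable = ? AND year BETWEEN ? AND ? "
        "ORDER BY year, site",
        (variable, first_year, last_year),
    )
    return cursor.fetchall()
```

Keeping the year as an explicit column means years that must remain separate because of method changes can still live in their own tables or files without affecting this query pattern.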
We will also be implementing a content management system framework based on the Drupal Environmental Information Management System (DEIMS). This multisite LTER effort is aimed at using the Drupal Content Management System to deploy a data model based on Ecological Metadata Language (EML) and to develop a common set of tools for use at LTER sites. This implementation will allow us to meet and exceed the new LTER Executive Board expectations for data accessibility, specifically concerns about core and non-core data sets. For more information see the 2009 LTER ASM workgroup “No dead end information” website, http://asm.lternet.edu/2009/workgroups/no-dead-ends-lter-information-website. Currently we have a beta site at http://arc-lter.ecosystems.mbl.edu.
As described above, Toolik Field Station started an environmental monitoring program in 2006 and has taken over some of the basic weather and environmental measurements, e.g., precipitation chemistry; all of these data are regularly added to the ARC data base. Plans are also underway to work with the Toolik Field Station GIS manager to generate EML files for some of the basic site GIS files. This would include the research locations and layers with vegetation, topography, streams, and lakes.
As the research program at TFS grows we expect increased challenges as well as opportunities for information management. Two that are likely to affect our work in the next six years are (1) establishment of the Arctic Observing Network (AON), including several projects at TFS, and (2) establishment of a National Ecological Observatory Network (NEON) site at TFS. Carbon, water, and energy-balance data sets from collaborating projects of the AON program are already available at http://ecosystems.mbl.edu/arc/AON/AONdata.html.