Extending the Data Life Cycle


I started this post on the data life cycle while at the Ecological Society of America (ESA) annual meeting last summer. Somehow it got put aside and forgotten. I found it while doing some electronic tidying for the New Year and decided to quickly revive it. Hopefully you don’t need a New Year’s resolution to be a better data steward, but if you do, maybe some of this will be useful. I find that every time I attend a data management workshop I leave more confused than enlightened. The presenters seem to speak a different language, or talk about something without identifying where in the data life cycle it fits. I don’t blame them; I think it’s a case of expert blind spots and a lack of knowledge on my part. This is the type of thing I wish I had learned in the intro-to-research course that many of us have to take as first-year grad students.

While this is a very incomplete list of resources, hopefully it will help get people started, as it helps me get my own data life cycle planning organized. Managing your data for the long term is incredibly important, as 80% of data is lost within 20 years! Since I started this post, Ethan White and colleagues published a very nice paper on preparing data for reuse in Ideas in Ecology and Evolution (open access). Below is a list of tools designed to help with the data life cycle (with a focus on ecology, not including genetics and GIS-specific resources):

Data Management Planning

Databases (local computer or local server)

Metadata

Repositories

Find, Retrieve, and Compile Data from Repositories (Google is probably not your best option here)

Other potentially useful information and websites from the ESA Meeting:

Sustainable Environment Actionable Data (SEAD): http://sead-data.net/

  • Active content repository
  • Virtual Archive

Terra Populus (TerraPop): http://www.terrapop.org/

[from their website]: Terra Populus will integrate the world’s population and environmental data, including

  • Population censuses and surveys
  • Land cover information from remote sensing
  • Climate records from weather stations
  • Land use records from statistical agencies

Data Life Stream

Traditional Data Life Stream where data is unavailable for reuse after initial use by the original researchers.

DataONE tool-based description of the data life cycle
Data available for reuse and re-purposing in perpetuity
