IU Data to Insight Center releases Komadu, a new suite of data provenance software tools to help researchers track and verify big data
As today’s researchers deal with ever-expanding data sets and share them with colleagues around the world, it’s increasingly important for that data to have a documented history that establishes its validity and quality. Called “data provenance,” this history reveals the origins of each data object, as well as the processes applied to it by various research teams. Good data provenance can have a transformational impact on scientific discovery.
“The Komadu tools are made for capturing, representing, and using data provenance, which tells us where a piece of digital data came from, particularly digital data that has undergone transformation by software algorithms,” said Beth Plale, director of the Data to Insight Center and managing director of IU’s Pervasive Technology Institute. “Who carried out a transformation on a piece of data, why, and when are all critical bits of information to someone interested in using the data in a different setting. Data provenance, for instance, can expose errors that crop up when one day’s run of an image processing pipeline differs from another day because of a missing file.”