Job Provenance - Insight into Very Large Provenance Datasets

Authors	KŘENEK Aleš, MATYSKA Luděk, SITERA Jiří, RUDA Miroslav, DVOŘÁK František, FILIPOVIČ Jiří, ŠUSTR Zdeněk, SALVET Zdeněk
Year of publication	2008
Type	Article in Proceedings
Conference	Provenance and Annotation of Data and Processes. Second International Provenance and Annotation Workshop, IPAW 2008; LNCS
MU Faculty or unit	Faculty of Science
web	http://www.springerlink.com/content/m634jl860g15/
Field	Computer hardware and software
Keywords	Provenance
Description	Following the job-centric monitoring concept, the Job Provenance (JP) service organizes provenance records on a per-job basis. It is designed to manage a very large number of records, as required in the EGEE project, where it was originally developed. This quantitative aspect is also the focus of the presented demonstration. We show JP's capability to retrieve data items of interest from a large dataset of full records of more than 1 million jobs, to perform non-trivial transformations on those data, and to organize the results so that repeated interactive queries are possible. The application area of the demo is derived from that of the previous Provenance Challenges. Although the topic of the demo (a computational experiment) is rather artificial, the demonstration still delivers its main message: JP supports non-trivial transformations and interactive queries on large data sets.
Related projects:	Highly Parallel and Distributed Computing Systems