The Performance of the Czech National Grid Infrastructure after Major Reconfiguration of Job Scheduling System

Authors

KLUSÁČEK Dalibor, TÓTH Šimon

Year of publication 2014
Type Article in Proceedings
Conference Cracow Grid Workshop 2014
MU Faculty or unit

Faculty of Informatics

Field Informatics
Keywords queue reconfiguration; multi-resource fairness; plan-based scheduling
Description This work describes the outcomes of a major reconfiguration of the job scheduling system used in the Czech National Grid MetaCentrum, carried out in January and July 2014. MetaCentrum serves a variety of users and research groups, so it is very important to guarantee that computational resources are used efficiently and shared fairly among users. With the significant growth of MetaCentrum (1,500 CPU cores in 2009 vs. 10,000 CPU cores in 2014), we recently had to revise our scheduling approaches to better reflect the increased size of the system and the growing heterogeneity of hardware resources and users' workloads. This revision took place in three major steps during 2014. First, a new multi-resource-aware fair-sharing algorithm was deployed to improve fairness with respect to the growing heterogeneity of resources and users' demands. Second, a large queue reconfiguration was carried out to decrease resource fragmentation and improve utilization. Finally, a new plan-based job scheduler enabling schedule optimization was deployed in July 2014; it currently manages 5 large computer clusters with 4,500 CPU cores.