MULTIVARIATE STATISTICAL ANALYSIS OF ARCHAEOLOGICAL SAMPLES AND THEIR VISUALISATION

Faltusová, Veronika; Vaculovič, Tomáš; Tomkova, Katerina; Kanický, Viktor

Authors	DILLINGEROVÁ Veronika VACULOVIČ Tomáš TOMKOVA Katerina KANICKÝ Viktor
Year of publication	2018
Type	Conference abstract
MU Faculty or unit	Faculty of Science
Citation
Description	Historical contact between Bavaria and Bohemia has been documented in various fields. It is also reflected in the archaeological record, so it is important to investigate how finds from the two regions are connected. This work aims to find differences and similarities between glass beads found in Bavaria and Bohemia. The elemental composition of the glass beads was measured by LA-ICP-MS. From an archaeological point of view, it is usually important to compare main oxides, such as K2O and MgO, to differentiate soda-lime natron glass from plant ash glass. This study, however, focuses on the use of machine learning algorithms to find patterns in the trace elements of the glass beads. Because the number of observations is low (only 51 glass beads), it is important to find the most suitable algorithm. We therefore compared multiple machine learning algorithms on this data set. The main purpose of the work is to show the possibilities of algorithms such as k-nearest neighbours (KNN), decision-tree methods such as random forest (RF), and linear discriminant analysis (LDA). First, an exploratory analysis was performed to identify the variables (elements) with high discriminating power. The Python programming language with the scikit-learn library and the SAS Enterprise Guide software were used. The study compared multiple machine learning algorithms for finding differences and similarities within the set of glass beads from Bohemia and Bavaria. Because these algorithms work on different principles, it can be beneficial to use more than one of them, especially here, where each can reveal different features of the observed dataset. Acknowledgment: This work was supported by the project CEITEC2020 (LQ1601) of the Ministry of Education, Youth and Sports of the Czech Republic and by a Student Project Grant at MU (specific research, rector's programme), Category A (MUNI/A/1288/2017).
Related projects:	CEITEC 2020; Research, development and applications in analytical and physical chemistry
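
The comparison of KNN, RF and LDA described in the abstract could be sketched roughly as follows with scikit-learn. This is only an illustration under stated assumptions: the data are synthetic stand-ins (not the measured beads), and the log transform, the hyperparameters and the 5-fold stratified cross-validation are choices made for this example, since the abstract does not say how the models were preprocessed, tuned or validated.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in data, not the real LA-ICP-MS measurements.
# 51 samples to mirror the small data set; 8 hypothetical trace-element columns.
rng = np.random.default_rng(0)
n_samples, n_elements = 51, 8
y = np.array([0] * 26 + [1] * 25)  # hypothetical labels: 0 = Bohemia, 1 = Bavaria
X = rng.lognormal(mean=0.0, sigma=0.5, size=(n_samples, n_elements))
X[y == 1, :2] *= 1.8  # make two columns informative, purely for illustration

# ASSUMPTION: log-transform the concentrations before modelling.
X_log = np.log10(X)

models = {
    # KNN is distance-based, so features are standardised inside the pipeline.
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "LDA": LinearDiscriminantAnalysis(),
}

# Stratified folds keep both regions represented in every split,
# which matters when there are only 51 observations.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X_log, y, cv=cv, scoring="accuracy")
    for name, model in models.items()
}

for name, fold_scores in scores.items():
    print(f"{name}: mean accuracy {fold_scores.mean():.2f} "
          f"(std {fold_scores.std():.2f})")
```

With so few observations the fold-to-fold spread is typically large, so the mean accuracies are best read alongside their standard deviations rather than as a strict ranking of the algorithms.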