Evaluating the Use of Clustering for Automatically Organising Digital Library Collections

Hall, Mark, Clough, Paul and Stevenson, Mark (2012) Evaluating the Use of Clustering for Automatically Organising Digital Library Collections. Theory and Practice of Digital Libraries 2012, 23-27 September 2012, Paphos, Cyprus, 7489, pp. 323-334, ISBN 9783642332890, ISSN 0302-9743, DOI https://doi.org/10.1007/978-3-642-33290-6_35.

halletal2012b.pdf - Accepted Version


Large digital libraries have become available in recent years through digitisation and aggregation projects. These large collections present a challenge to new users who wish to discover what is available in them. Subject classification can help with this task, but in large collections it is frequently incomplete or inconsistent. Automatic clustering algorithms offer a solution; however, the question remains whether they produce clusters that are sufficiently cohesive and distinct to support discovery and exploration in digital libraries. In this paper we present a novel approach to investigating cluster cohesion that is based on identifying intruders in a cluster. The results from a human-subject experiment show that clustering algorithms produce clusters that are sufficiently cohesive to be used where no (consistent) manual classification exists.

Item Type: Conference or Workshop Item (Paper)
Additional Information: Second International Conference, TPDL 2012, Paphos, Cyprus, September 23-27, 2012. Proceedings
Subjects: Z Bibliography. Library Science. Information Resources > Z665 Library Science. Information Science
Divisions: Computing and Information Systems
Date Deposited: 23 Oct 2013 10:36
URI: http://repository.edgehill.ac.uk/id/eprint/5719
