Knowledge organization
Knowledge organization (KO), organization of knowledge, organization of information, or information organization is an intellectual discipline concerned with activities such as document description, indexing, and classification that serve to provide systems of representation and order for knowledge and information objects. It addresses the "activities carried out and tools used by people who work in places that accumulate information resources (e.g., books, maps, documents, datasets, images) for the use of humankind, both immediately and for posterity. It discusses the processes that are in place to make resources findable, whether someone is searching for a single known item or is browsing through hundreds of resources just hoping to discover something useful. Information organization supports a myriad of information-seeking scenarios."[1] Traditional human-based approaches performed by librarians, archivists, and subject specialists are increasingly challenged by computational (big data) algorithmic techniques. KO as a field of study is concerned with the nature and quality of such knowledge organizing processes (KOP) (such as taxonomy and ontology) as well as the resulting knowledge organizing systems (KOS).
Divergent historical and theoretical approaches towards organizing knowledge are based on different views of knowledge, cognition, language, and social organization. This richness lends itself to many complementary ways to consider knowledge organization. The academic International Society for Knowledge Organization (ISKO) engages with these issues via the research journal Knowledge Organization.
Theoretical approaches
One widely used analysis of organizational principles summarizes them as Location, Alphabet, Time, Category, Hierarchy (LATCH).[2]
Traditional approaches
Among the major figures in the history of KO are Melvil Dewey (1851–1931) and Henry Bliss (1870–1955).
Dewey's goal was an efficient way to manage library collections, not an optimal system to support library users. His system was meant to be used in many libraries as a standardized way to manage collections. The first version of this system was published in 1876.
An important assumption of Henry Bliss (and of many of his contemporaries in KO) was that the sciences tend to reflect the order of Nature and that library classification should reflect the order of knowledge as uncovered by science:
Natural order --> Scientific Classification --> Library classification (KO)
The implication is that librarians, in order to classify books, should know about scientific developments. This should also be reflected in their education: “Again from the standpoint of the higher education of librarians, the teaching of systems of classification . . . would be perhaps better conducted by including courses in the systematic encyclopedia and methodology of all the sciences, that is to say, outlines which try to summarize the most recent results in the relation to one another in which they are now studied together. . . .” (Ernest Cushing Richardson, quoted from Bliss, 1935, p. 2).
Other principles that may be attributed to the traditional approach to KO include:
- Principle of controlled vocabulary
- Cutter’s rule about specificity
- Hulme’s principle of literary warrant (1911)[3]
- Principle of organizing from the general to the specific
Today, after more than 100 years of research and development in LIS, the “traditional” approach still has a strong position in KO and in many ways its principles still dominate.
Facet analytic approaches
The founding of this approach may be dated to the publication of S. R. Ranganathan’s Colon Classification in 1933. The approach has been further developed by, in particular, the British Classification Research Group. In many ways this approach has dominated what might be termed “modern classification theory.”
The best way to explain this approach is probably to explain its analytico-synthetic methodology. “Analysis” means breaking down each subject into its basic concepts. “Synthesis” means combining the relevant units and concepts to describe the subject matter of the information package in hand.
Given subjects (as they appear in, for example, book titles) are first analyzed into a few common categories, which are termed “facets”. Ranganathan proposed five such facets in his PMEST formula (Personality, Matter, Energy, Space and Time):
- Personality is the distinguishing characteristic of a subject.
- Matter is the physical material of which a subject may be composed.
- Energy is any action that occurs with respect to the subject.
- Space is the geographic component of the location of a subject.
- Time is the period associated with a subject.
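The analytico-synthetic method described above can be illustrated with a minimal Python sketch. The class name `FacetedSubject`, the function `synthesize`, the example title, and the single " : " separator are all assumptions made for illustration; the sketch does not reproduce the actual notation of the Colon Classification.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class FacetedSubject:
    """One subject analyzed into the five PMEST facets (hypothetical model)."""
    personality: Optional[str] = None  # distinguishing characteristic
    matter: Optional[str] = None       # physical material
    energy: Optional[str] = None       # action or process
    space: Optional[str] = None        # geographic location
    time: Optional[str] = None         # period


def synthesize(subject: FacetedSubject, separator: str = " : ") -> str:
    """Combine the facets that are present, in PMEST order, into one label."""
    values = [getattr(subject, f.name) for f in fields(subject)]
    return separator.join(v for v in values if v is not None)


# Analysis step: a made-up title, "Weaving of cotton textiles in India
# in the 1950s", broken down into its basic concepts.
subject = FacetedSubject(
    personality="textiles",
    matter="cotton",
    energy="weaving",
    space="India",
    time="1950s",
)

# Synthesis step: the concepts are recombined in a fixed facet order.
label = synthesize(subject)
print(label)  # textiles : cotton : weaving : India : 1950s
```

Facets that do not apply to a subject are simply left empty and skipped during synthesis, so the same fixed order can describe both simple and compound subjects.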
The information retrieval tradition (IR)
Important in the IR tradition have been, among others, the Cranfield experiments, which began in the 1950s, and the TREC experiments (Text Retrieval Conferences), which started in 1992. The Cranfield experiments introduced the measures “recall” and “precision” as criteria for evaluating system efficiency. They found that classification systems such as UDC and facet-analytic systems were less efficient than free-text searches or low-level indexing systems (“UNITERM”). According to Ellis (1996, 3–6), the Cranfield I test produced the following results:
System | Recall |
---|---|
UNITERM | 82.0% |
Alphabetical subject headings | 81.5% |
UDC | 75.6% |
Facet classification scheme | 73.8% |
Although these results have been criticized and questioned, the IR tradition became much more influential while library classification research lost influence. The dominant trend has been to consider only statistical averages. What has largely been neglected is the question of whether there are certain kinds of queries for which other kinds of representation, for example controlled vocabularies, may improve recall and precision.
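The two evaluation measures named above have standard set-based definitions: recall is the share of relevant documents that were retrieved, and precision is the share of retrieved documents that are relevant. The following Python sketch shows the calculation. The document identifiers and the function names are made up for illustration and have no connection to the Cranfield data.

```python
def recall(retrieved: set, relevant: set) -> float:
    """Share of the relevant documents that the search retrieved."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved: set, relevant: set) -> float:
    """Share of the retrieved documents that are relevant."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


# Hypothetical search: four documents are relevant to the query,
# and the system returns five documents, three of them relevant.
relevant = {"d1", "d2", "d3", "d4"}
retrieved = {"d1", "d2", "d3", "d7", "d9"}

r = recall(retrieved, relevant)      # 3 of 4 relevant documents found
p = precision(retrieved, relevant)   # 3 of 5 retrieved documents relevant
print(f"recall={r:.2f} precision={p:.2f}")  # recall=0.75 precision=0.60
```

A figure such as the 82.0% recall in the table is an average of this kind of per-query value over many queries, which is why it cannot show whether particular kinds of queries behave differently.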
User-oriented and cognitive views
The best way to define this approach is probably by method: Systems based upon user-oriented approaches must specify how the design of a system is made on the basis of empirical studies of users.
User studies demonstrated very early that users prefer verbal search systems to systems based on classification notations. This is one example of a principle derived from empirical studies of users. Adherents of classification notations may, of course, still have an argument: that notations are well defined and that users may miss important information by not considering them.
Folksonomies are a recent kind of KO based on indexing by users rather than by librarians or subject specialists.
Bibliometric approaches
These approaches are primarily based on using bibliographical references to organize networks of papers, mainly by bibliographic coupling (introduced by Kessler 1963) or co-citation analysis (independently suggested by Marshakova 1973[4] and Small 1973). In recent years it has become a popular activity to construe bibliometric maps as structures of research fields.
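The difference between the two measures can be sketched in Python. Bibliographic coupling counts the references that two citing papers share, while co-citation counts the papers that cite two items together. The citation data and function names below are hypothetical and serve only to illustrate the counting.

```python
from collections import Counter
from itertools import combinations

# Hypothetical citation data: citing paper -> set of cited references.
references = {
    "A": {"r1", "r2", "r3"},
    "B": {"r2", "r3", "r4"},
    "C": {"r3", "r5"},
}


def coupling_strength(refs: dict) -> dict:
    """Bibliographic coupling: number of references two citing papers share."""
    return {
        (p, q): len(refs[p] & refs[q])
        for p, q in combinations(sorted(refs), 2)
    }


def cocitation_counts(refs: dict) -> Counter:
    """Co-citation: number of papers whose reference lists contain both items."""
    counts = Counter()
    for cited in refs.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


coupling = coupling_strength(references)
cocitation = cocitation_counts(references)

print(coupling[("A", "B")])      # 2: A and B both cite r2 and r3
print(cocitation[("r2", "r3")])  # 2: r2 and r3 are cited together by A and B
```

Coupling strength is fixed once the two papers are published, whereas co-citation counts can grow as later papers cite both items. Either set of pairwise counts can serve as the edge weights of a bibliometric map.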
Two considerations are important in considering bibliometric approaches to KO:
- The level of indexing depth is partly determined by the number of terms assigned to each document. In citation indexing this corresponds to the number of references in a given paper. On average, scientific papers contain 10–15 references, which provide quite a high level of depth.
- The references, which function as access points, are provided by those with the highest subject expertise: the experts writing in the leading journals. This expertise is much higher than that which library catalogs or bibliographical databases are typically able to draw on.
The domain analytic approach
Domain analysis is a sociological-epistemological standpoint. The indexing of a given document should reflect the needs of a given group of users or a given ideal purpose. In other words, any description or representation of a given document is more or less suited to the fulfillment of certain tasks. A description is never objective or neutral, and the goal is not to standardize descriptions or make one description once and for all for different target groups.
The development of the Danish library “KVINFO” may serve as an example that explains the domain-analytic point of view.
KVINFO was founded by the librarian and writer Nynne Koch, and its history goes back to 1965. Koch was employed at the Royal Library in Copenhagen in a position without influence on book selection. She was interested in women’s studies and began personally to collect printed catalog cards for books in the Royal Library that were considered relevant to women’s studies. She developed a classification system for this subject. Later she became the head of KVINFO and received a budget for buying books and journals, and still later KVINFO became an independent library. The important theoretical point is that the Royal Library had an official systematic catalog of a high standard. Normally it is assumed that such a catalog can identify relevant books for users whatever their theoretical orientation. This example demonstrates, however, that for a specific user group (feminist scholars) an alternative way of organizing catalog cards was important. In other words, different points of view need different systems of organization.
Domain analysis (DA) is the only approach to KO that has seriously examined epistemological issues in the field, i.e., compared the assumptions made in different approaches to KO and examined questions of subjectivity and objectivity in KO. Subjectivity is not just about individual differences. Such differences are of minor interest because they cannot be used as guidelines for KO. What seems important are collective views shared by many users. This kind of shared subjectivity is related to philosophical positions. In any field of knowledge, different views are always at play. In the arts, for example, different views of art are always present. Such views determine how art works are regarded, how they are written about, how they are organized in exhibitions, and how writings on art are organized in libraries (see Ørom 2003). In general, different philosophical positions on any issue have implications for relevance criteria, for information needs, and for criteria for organizing knowledge.
See also
- Automatic document classification
- Body of knowledge
- Dewey Decimal Classification
- Discipline (academia)
- Document classification
- Information ecology
- Knowledge organization systems
- Library and Information Science
- Library classification
- List of academic fields
- Outline of academic disciplines
- Personal information management
References
- Joudrey, Daniel N., and Arlene G. Taylor. The Organization of Information, 4th ed. Santa Barbara, CA: Libraries Unlimited, 2018.
- Wurman, Richard Saul. Information Anxiety. 1990. ISBN 0553348566.
- Hulme, E. W. Principles of book classification. Library Association Record, n. 13-14, 1911-1912.
- Marshakova, I. V. "System of Document Connections Based on References" (PDF). Nauchn-Techn.Inform. 1973.