KH Coder

Developer(s): Koichi Higuchi
Stable release: 2.00f / Dec 2015
Preview release: 3.Beta.01 / Mar 2020
Operating system: Microsoft Windows, Linux, macOS
Type: Qualitative data analysis, Text mining, Content analysis
License: GPL2
Website: khcoder.net/en/

KH Coder is open-source software for computer-assisted qualitative data analysis, particularly quantitative content analysis and text mining. It can also be used for computational linguistics. It supports the processing and linguistic analysis of text in several languages, such as Japanese, English, French, German, Italian, Portuguese and Spanish. Specifically, it provides statistical analyses such as co-occurrence networks, self-organizing maps, multidimensional scaling and similar calculations. [1]

KH Coder has been well received by researchers worldwide and is used in many disciplines, including neuroscience, sociology, psychology, public health, media studies, education research and computer science. More than 500 English-language research papers that mention it are listed in Google Scholar. [2] According to a list compiled by the author, more than 3,500 academic research papers using KH Coder have been published. [3]

KH Coder has been described as a user-friendly tool "for identifying themes in large unstructured data sets, such as online reviews or open-ended customer feedback" [4] and has been reviewed in comparison with WordStat. [5]

Features

KH Coder provides further search and statistical analysis functions through back-end tools such as the Stanford POS Tagger, the natural language processing toolkit FreeLing, the Snowball stemmer, MySQL and R.
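
As a rough illustration of the idea behind a word co-occurrence network, one of the analyses KH Coder offers, the following Python sketch counts how often pairs of words appear in the same document. It is not KH Coder's implementation, which relies on the back-end tools named above. The function name `cooccurrence_counts`, the `min_count` threshold and the naive whitespace tokenisation are assumptions made for this example only.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, min_count=1):
    """Count how often each pair of distinct words shares a document.

    Hypothetical helper for illustration; not part of KH Coder.
    """
    pair_counts = Counter()
    for text in documents:
        # Naive tokenisation: split on whitespace, strip punctuation, lowercase.
        words = {w.strip(".,;:!?\"'()").lower() for w in text.split()}
        words.discard("")
        # Count each unordered pair of distinct words once per document.
        for pair in combinations(sorted(words), 2):
            pair_counts[pair] += 1
    # Keep only pairs that co-occur in at least `min_count` documents.
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


docs = [
    "The cat chased the mouse.",
    "The dog chased the cat.",
    "A mouse ate the cheese.",
]
edges = cooccurrence_counts(docs, min_count=2)
for (a, b), n in sorted(edges.items()):
    print(f"{a} -- {b}: {n}")
```

Each retained pair can be read as an edge of a network whose nodes are words. A real analysis would normally lemmatise or stem the words and drop function words such as "the" before counting.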

References

  1. Vinithra, S.N.; Arun Selvan, S.J.; Anand Kumar, M.; Soman, K.P. (2015): Simulated and Self-Sustained Classification of Twitter Data based on its Sentiment. Indian Journal of Science and Technology. Vol. 8, Issue 24.
  2. Google Scholar search using the keywords "KH Coder" and "KHCoder".
  3. Higuchi, Koichi (2017): Scholarly research using KH Coder.
  4. Towler, Will (2014): Text Analytics For Everyone. UX Magazine, July 31, 2014.
  5. Huirong, Cheng; Guobin, Huang; Lin, Zheng (2015): Comparison of Software for Unstructured Text Analysis: KH Coder vs. Wordstat. 图书与情报, 2015(04): 110-117.