match_instruments {harmonydata}    R Documentation

Match Instruments Function

Description

This function takes a list of instruments, converts it to the format the database accepts, and matches the instruments using the 'Harmony Data API'. It returns the matched instruments.

Usage

match_instruments(
  instruments,
  topics = list(),
  is_negate = TRUE,
  clustering_algorithm = "affinity_propagation"
)

Arguments

instruments

A list of instruments to be matched.

topics

A list of topics with which to tag the questions. Default is an empty list.

is_negate

A boolean indicating whether to apply negation-based preprocessing. Default is TRUE.

This option addresses a common limitation in large language model (LLM) embeddings, where antonyms (e.g., "happy" and "sad") may be treated as similar due to contextual overlap. When is_negate = TRUE, the function prepends negation terms such as "not" or "didn't" to the input sentences and evaluates whether this increases or decreases their cosine similarity. If the similarity increases after negation, the model interprets the sentences as antonyms and returns a negative similarity score.

When is_negate = FALSE, negation is skipped and most similarity values returned will be positive.

The Harmony API defaults to is_negate = TRUE. Users who want to detect antonymy through negative similarity values can keep the default; users who prefer only positive scores can set is_negate = FALSE.
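
The sign rule described for is_negate can be illustrated with a small, self-contained sketch. This is not the Harmony implementation: the helper names cosine_similarity() and signed_similarity() are hypothetical, the three-dimensional vectors are made-up stand-ins for real sentence embeddings, and returning the negated similarity with a flipped sign is an assumption about how the negative score is formed.

```r
# Cosine similarity between two numeric vectors.
cosine_similarity <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Hypothetical sign rule (assumption, not the package's code):
# if negating the first sentence makes the pair MORE similar,
# treat the pair as antonyms and return a negative score.
signed_similarity <- function(vec_a, vec_b, vec_a_negated) {
  sim_plain   <- cosine_similarity(vec_a, vec_b)
  sim_negated <- cosine_similarity(vec_a_negated, vec_b)
  if (sim_negated > sim_plain) -sim_negated else sim_plain
}

# Toy "embeddings" chosen by hand for illustration only.
happy     <- c(0.9, 0.4, 0.1)
glad      <- c(0.9, 0.5, 0.1)
sad       <- c(0.8, 0.1, 0.6)
not_happy <- c(0.8, 0.2, 0.5)  # stands in for the negated form of "happy"

antonym_score <- signed_similarity(happy, sad,  not_happy)  # negative
synonym_score <- signed_similarity(happy, glad, not_happy)  # positive
```

With these toy vectors, negating "happy" moves it closer to "sad", so the pair receives a negative score, while the "happy"/"glad" pair keeps its positive score.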

clustering_algorithm

A string value to select the clustering algorithm to use. Must be one of: "affinity_propagation", "kmeans", "deterministic", "hdbscan". Default is "affinity_propagation".
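
Because clustering_algorithm only accepts four values, it can be convenient to check the choice locally before calling the API. The sketch below uses base R's match.arg(); the helper name check_clustering_algorithm() and the constant CLUSTERING_ALGORITHMS are hypothetical and not part of harmonydata.

```r
# The four documented values, with the documented default first.
CLUSTERING_ALGORITHMS <- c(
  "affinity_propagation", "kmeans", "deterministic", "hdbscan"
)

# Hypothetical helper: returns the validated algorithm name, or stops
# with an error listing the allowed values. Note that match.arg() also
# accepts unambiguous prefixes (e.g. "km" resolves to "kmeans").
check_clustering_algorithm <- function(clustering_algorithm = CLUSTERING_ALGORITHMS) {
  match.arg(clustering_algorithm, CLUSTERING_ALGORITHMS)
}

algo <- check_clustering_algorithm("kmeans")

# The validated value could then be passed on, for example:
# match_instruments(instruments, clustering_algorithm = algo)
```

Calling check_clustering_algorithm() with no argument returns the default, "affinity_propagation".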

Value

A list containing the matched instruments retrieved from the Harmony Data API. The returned object includes attributes such as the similarity matrix, identified clusters, associated cluster topics, and other relevant metadata.

Author(s)

Ulster University [cph]

Examples



instrument_A <- create_instrument_from_list(list(
  "How old are you?",
  "What is your gender?"
))

instrument_B <- create_instrument_from_list(list(
  "Do you smoke?"
))

instruments <- list(instrument_A, instrument_B)

matched_instruments <- match_instruments(
  instruments,
  topics = list("anxiety", "depression")
)



[Package harmonydata version 0.3.1 Index]