corpusQuery,KorAPConnection-method {RKorAPClient} | R Documentation
Search corpus for query terms
Description
corpusQuery performs a corpus query via a connection to a KorAP API server.
Usage
## S4 method for signature 'KorAPConnection'
corpusQuery(
  kco,
  query = if (missing(KorAPUrl)) {
    stop("At least one of the parameters query and KorAPUrl must be specified.",
      call. = FALSE)
  } else {
    httr2::url_parse(KorAPUrl)$query$q
  },
  vc = if (missing(KorAPUrl)) "" else httr2::url_parse(KorAPUrl)$query$cq,
  KorAPUrl,
  metadataOnly = TRUE,
  ql = if (missing(KorAPUrl)) "poliqarp" else httr2::url_parse(KorAPUrl)$query$ql,
  fields = c("corpusSigle", "textSigle", "pubDate", "pubPlace", "availability",
    "textClass", "snippet", "tokens"),
  accessRewriteFatal = TRUE,
  verbose = kco@verbose,
  expand = length(vc) != length(query),
  as.df = FALSE,
  context = NULL
)
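The defaults for query, vc and ql in the usage above are all read from the query string of KorAPUrl. The following is a minimal sketch of that lookup, assuming the httr2 package is installed; the URL is the one used in the Examples section, and the variable names korap_url and params are illustrative only.

```r
library(httr2)

# Example URL taken from the Examples section of this page.
korap_url <- "https://korap.ids-mannheim.de/?q=Ameise&cq=pubDate+since+2017&ql=poliqarp"

# url_parse() splits the URL into its components; $query is a named list
# holding the query-string parameters.
params <- url_parse(korap_url)$query

params$q   # default for the `query` argument
params$cq  # default for the `vc` argument
params$ql  # default for the `ql` argument
```

Because the three defaults come from the same URL, passing KorAPUrl alone is enough to reproduce a query that was composed in the web frontend.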
Arguments
kco |
|
query |
string that contains the corpus query. The query language depends on the |
vc |
string describing the virtual corpus in which the query should be performed. An empty string (default) means the whole corpus, as far as it is license-wise accessible. |
KorAPUrl |
instead of providing the query and vc string parameters, you can also simply copy a KorAP query URL from your browser and use it here (and in |
metadataOnly |
logical that determines whether queries should return only metadata without any snippets. This can also be useful to prevent access rewrites. Note that the default value is TRUE.
If you want your corpus queries to return not only metadata but also KWICs, you need to authorize your RKorAPClient application as explained in the authorization section of the RKorAPClient Readme on GitHub and set the |
ql |
string to choose the query language (see the section on Query Parameters in the Kustvakt-Wiki for possible values). |
fields |
character vector specifying which metadata fields to retrieve for each match. Available fields depend on the corpus. For DeReKo (German Reference Corpus), possible fields include:
Use |
accessRewriteFatal |
abort if query or given vc had to be rewritten due to insufficient rights (not yet implemented). |
verbose |
print some info |
expand |
logical that decides if |
as.df |
return result as data frame instead of as S4 object? |
context |
string that specifies the size of the left and the right context returned in |
Value
Depending on the as.df parameter, a tibble or a KorAPQuery() object that, among other information, contains the total number of results in @totalResults. The resulting object can be used to fetch all query results (with fetchAll()) or the next page of results (with fetchNext()).
A corresponding URL to be used within a web browser is contained in @webUIRequestUrl.
Please make sure to check $collection$rewrites to see if any unforeseen access rewrites of the query's virtual corpus had to be performed.
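As a rough illustration of how the slots named above can be read, the sketch below defines a small helper and runs it on a stand-in object. Both query_overview and the MockQuery class are hypothetical and not part of RKorAPClient; the mock values are made up so that the code runs without a server connection.

```r
library(methods)

# Hypothetical helper (not part of RKorAPClient): collect the two slots
# mentioned in the Value section from an S4 query object.
query_overview <- function(q) {
  list(
    totalResults = q@totalResults,
    webUIRequestUrl = q@webUIRequestUrl
  )
}

# Stand-in class with made-up values so the sketch runs offline.
# A real object would be returned by corpusQuery() with as.df = FALSE.
setClass("MockQuery",
  representation(totalResults = "numeric", webUIRequestUrl = "character"))
mock <- new("MockQuery",
  totalResults = 42,
  webUIRequestUrl = "https://korap.ids-mannheim.de/?q=Ameise")

overview <- query_overview(mock)
print(overview$totalResults)
print(overview$webUIRequestUrl)
```

With a live connection, the same helper could be applied to the result of corpusQuery(), provided as.df is left at FALSE so that an S4 object rather than a tibble is returned.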
References
https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/9026
See Also
KorAPConnection(), fetchNext(), fetchRest(), fetchAll(), corpusStats()
Other corpus search functions: fetchAll,KorAPQuery-method, fetchNext,KorAPQuery-method
Examples
## Not run:
# Fetch basic metadata for "Ameisenplage"
KorAPConnection() |>
  corpusQuery("Ameisenplage") |>
  fetchAll()

# Fetch specific metadata fields for bibliographic analysis
query <- KorAPConnection() |>
  corpusQuery("Ameisenplage",
    fields = c("textSigle", "author", "title", "pubDate", "pubPlace", "textType"))
results <- fetchAll(query)
results@collectedMatches
## End(Not run)
## Not run:
# Use the copy of a KorAP-web-frontend URL for an API query of "Ameise" in a virtual corpus
# and show the number of query hits (but don't fetch them).
KorAPConnection(verbose = TRUE) |>
  corpusQuery(
    KorAPUrl =
      "https://korap.ids-mannheim.de/?q=Ameise&cq=pubDate+since+2017&ql=poliqarp"
  )
## End(Not run)
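## Not run:
# Plot the time/frequency curve of "Ameisenplage"
library(dplyr) # mutate(), group_by(), summarise()
library(tidyr) # complete()

# Keep the connection in a variable: it is needed again inside the pipeline
# for corpusStats(). (The native pipe |> cannot pipe into a { } block.)
kco <- KorAPConnection(verbose = TRUE)
kco |>
  corpusQuery("Ameisenplage") |>
  fetchAll() |>
  slot("collectedMatches") |>
  mutate(year = lubridate::year(pubDate)) |>
  dplyr::select(year) |>
  group_by(year) |>
  summarise(Count = dplyr::n()) |>
  # Normalize each yearly hit count by the token count of that year's virtual corpus
  mutate(Freq = mapply(function(f, y) {
    f / corpusStats(kco, paste("pubDate in", y))@tokens
  }, Count, year)) |>
  dplyr::select(-Count) |>
  complete(year = min(year):max(year), fill = list(Freq = 0)) |>
  plot(type = "l")
## End(Not run)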
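The normalization step in the last example divides each yearly hit count by the size of a per-year virtual corpus of the form "pubDate in <year>". The sketch below isolates that arithmetic so it can be run offline. The helper vc_for_years is hypothetical, and the counts and token figures are made-up numbers; in the real example the token counts come from corpusStats(kco, vc)@tokens.

```r
# Hypothetical helper: build one virtual-corpus definition per year,
# using the "pubDate in <year>" form from the example above.
vc_for_years <- function(years) paste("pubDate in", years)

# Made-up hit counts and corpus sizes, for illustration only.
years  <- 2015:2017
counts <- c(2, 0, 5)
tokens <- c(1e6, 2e6, 5e6)

vcs  <- vc_for_years(years)
freq <- counts / tokens  # relative frequency of the query term per year

print(data.frame(year = years, vc = vcs, Freq = freq))
```

Dividing by the per-year token count makes years with very different corpus sizes comparable, which is why the example plots Freq rather than the raw Count.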