lambdaG_visualize {idiolect} | R Documentation |
Visualize the output of the LambdaG algorithm
Description
This function outputs a colour-coded list of sentences belonging to the input Q text ordered from highest to lowest \lambda_G, as shown in Nini et al. (under review).
Usage
lambdaG_visualize(
q.data,
k.data,
ref.data,
N = 10,
r = 30,
output = "html",
print = "",
scale = "absolute",
cores = NULL
)
Arguments
q.data |
A single questioned or disputed text as a |
k.data |
A known or undisputed corpus containing exclusively a single candidate author's texts as a |
ref.data |
The reference dataset as a |
N |
The order of the model. Default is 10. |
r |
The number of iterations. Default is 30. |
output |
A string detailing the file type of the colour-coded text output. Either "html" (default) or "latex". |
print |
A string indicating the path to the folder where the colour-coded text file should be saved. If left empty (default), no file is saved. |
scale |
A string indicating what scale to use to colour-code the text file. If "absolute" (default) then the raw |
cores |
The number of cores to use for parallel processing. If NULL (default), a single core is used. |
Value
The function outputs a list of two objects. The first is a data frame in which each row is a token in the Q text, with the values of \lambda_G for the token and for its sentence, sorted in decreasing order of sentence \lambda_G, and with the relative contribution of each token and each sentence to the final \lambda_G as a percentage. The second is the raw html or LaTeX code that generates the colour-coded file. If a path is provided for the print argument, the function will also save the colour-coded text as an html or plain text file.
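The layout of the returned data frame can be pictured with a small base R sketch. This is only an illustration under stated assumptions, not the package's implementation: the column names (sentence_id, token_score, sentence_score, token_pct, sentence_pct) and the toy values are hypothetical, and it assumes that a sentence score is the sum of its token scores and that a percentage contribution is a score divided by the overall total.

```r
# Toy per-token scores; all column names and values are hypothetical.
toy <- data.frame(
  sentence_id = c(1, 1, 2, 2, 3),
  token       = c("P", "V", "the", "N", "J"),
  token_score = c(0.5, 1.5, -0.5, 0.25, 2.25)
)

# Assumption: a sentence score is the sum of the scores of its tokens.
toy$sentence_score <- ave(toy$token_score, toy$sentence_id, FUN = sum)

# Assumption: relative contribution = score / overall total, in percent.
total <- sum(toy$token_score)
toy$token_pct    <- 100 * toy$token_score / total
toy$sentence_pct <- 100 * toy$sentence_score / total

# Sort rows from the highest- to the lowest-scoring sentence,
# keeping tokens in their original order within each sentence.
toy <- toy[order(-toy$sentence_score, toy$sentence_id), ]
toy
```

Under these assumptions the token percentages add up to 100, while negative scores produce negative contributions.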
References
Nini, A., Halvani, O., Graner, L., Gherardi, V., Ishihara, S. Authorship Verification based on the Likelihood Ratio of Grammar Models. https://arxiv.org/abs/2403.08462v1
Examples
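# Questioned (Q) text: first document, keeping only sentences of at most 10 tokens
q.data <- corpus_trim(enron.sample[1], "sentences", max_ntoken = 10) |> quanteda::tokens("sentence")
# Known (K) texts: documents 2 to 5
k.data <- enron.sample[2:5] |> quanteda::tokens("sentence")
# Reference data: all remaining documents
ref.data <- enron.sample[6:ndoc(enron.sample)] |> quanteda::tokens("sentence")
# r = 2 keeps the example fast (the default is 30 iterations)
outputs <- lambdaG_visualize(q.data, k.data, ref.data, r = 2)
outputs$table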
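A variant of the call that requests LaTeX output instead of html might look like the sketch below. It uses only arguments listed under Usage and the table element shown above. The quanteda:: prefixes are added here on the assumption that quanteda may not be attached, and the name outputs_latex is chosen purely for illustration.

```r
library(idiolect)

# Same data preparation as in the example above, with explicit namespaces.
q.data <- quanteda::corpus_trim(enron.sample[1], "sentences", max_ntoken = 10) |>
  quanteda::tokens("sentence")
k.data <- enron.sample[2:5] |> quanteda::tokens("sentence")
ref.data <- enron.sample[6:quanteda::ndoc(enron.sample)] |>
  quanteda::tokens("sentence")

# Request LaTeX rather than html; r = 2 only keeps the run short.
outputs_latex <- lambdaG_visualize(q.data, k.data, ref.data, r = 2, output = "latex")

# Per the Value section, the result is a list of two objects.
length(outputs_latex)
head(outputs_latex$table)
```

Because print is left at its default, no file is written; the LaTeX code is available only in the returned list.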