Pragmatic features

Local consistency

Semantic similarity between a sentence and its predecessor.

The average semantic similarity score, usually computed with cosine similarity, is a common measure in natural language processing (NLP) for evaluating the semantic proximity between sentences.

Cosine similarity measures how similar two vectors are in a multidimensional space by taking the cosine of the angle between them; cosine distance is its complement (1 minus the similarity). In NLP, this measure is often used to compare word or phrase vectors, where the vectors represent the semantic distribution of terms. The cosine_similarity function in the scikit-learn library is an implementation of this measure.

Higher values indicate greater similarity and less semantic distance. Lower values indicate less similarity and greater semantic distance.
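
As a rough illustration of the local consistency score, the sketch below averages the cosine similarity between each sentence vector and its predecessor. It assumes every sentence has already been converted to a numeric vector by some embedding step not specified here; the function names cosine_sim and local_coherence are hypothetical, and the pure-Python cosine stands in for scikit-learn's cosine_similarity.

```python
import math


def cosine_sim(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: a zero vector is treated as having no similarity.
        return 0.0
    return dot / (norm_u * norm_v)


def local_coherence(sentence_vectors):
    """Mean similarity between each sentence and the one before it."""
    pairs = zip(sentence_vectors, sentence_vectors[1:])
    sims = [cosine_sim(prev, cur) for prev, cur in pairs]
    # Assumption: a sample with fewer than two sentences scores 0.0.
    return sum(sims) / len(sims) if sims else 0.0


# Toy 2-dimensional vectors standing in for real sentence embeddings.
vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
score = local_coherence(vectors)
```

In this toy case the first pair is identical and the second pair is orthogonal, so the average falls midway between the two extremes.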

Words denoting uncertainty

Words denoting uncertainty about the nature of an image element to be described.

Number of occurrences of the following words in the sample: "think", "look", "like", "kind", "seem", "maybe", "can", "something".

Calculated in two ways:

in absolute numbers
in relation to the total number of words in the sample
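
The two calculations can be sketched as follows. This is a minimal example under stated assumptions: words are split with a simple regular expression, matching is exact (no lemmatization, so "looks" does not count as "look"), and the function name count_words is hypothetical.

```python
import re

UNCERTAINTY_WORDS = {"think", "look", "like", "kind", "seem",
                     "maybe", "can", "something"}


def count_words(text, word_list):
    """Return (absolute count, count relative to total words)."""
    # Assumption: a word is a run of letters or apostrophes, lowercased.
    tokens = re.findall(r"[a-z']+", text.lower())
    absolute = sum(1 for token in tokens if token in word_list)
    relative = absolute / len(tokens) if tokens else 0.0
    return absolute, relative


sample = "I think it looks like some kind of kite, maybe."
absolute, relative = count_words(sample, UNCERTAINTY_WORDS)
```

The same function could be reused for any other single-word list by passing a different set.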

Word list inspired by Garrard et al. (2014). A French version still needs to be created.

Difficulty finding the right words

Use of words indicating lexical access difficulties.

Number of instances of the following words in the sample: "know", "remember", "unable".

Calculated in two ways:

in absolute numbers
in relation to the total number of words in the sample

Word list inspired by Garrard et al. (2014) and Rentoumi et al. (2014). A French version still needs to be created.

Connotation of speech

Emotions generated by speech, determined by the average valence of the words in the speech. The average valence score of all the words in the sample will be obtained when the psycholinguistic variables are extracted. For each word, possible scores range from 1 to 9. A higher score indicates that a word has a more positive connotation, while a lower score indicates a more negative connotation.

If the average valence is greater than or equal to 5, the label "positive connotation" will be given to the speech.
If the average valence is greater than or equal to 4 and less than 5, the label "neutral connotation" will be given to the speech.
If the average valence is less than 4, the label "negative connotation" will be given to the speech.
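
The three thresholds above translate directly into a small labeling function. In this sketch the per-word valence scores are assumed to be already available as a list of numbers between 1 and 9; the function names mean_valence and connotation_label are hypothetical.

```python
def mean_valence(scores):
    """Average of the per-word valence scores (each assumed in 1..9)."""
    if not scores:
        raise ValueError("no valence scores available")
    return sum(scores) / len(scores)


def connotation_label(average):
    """Map an average valence to one of the three connotation labels."""
    if average >= 5:
        return "positive connotation"
    if average >= 4:
        return "neutral connotation"
    return "negative connotation"


# Hypothetical valence scores for a three-word sample.
label = connotation_label(mean_valence([6.0, 5.0, 4.0]))
```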

Formulaic expressions

Expressions with a fixed form and non-literal meaning with attitudinal nuances.

Total number of occurrences of the following formulaic expressions in the sample: "well", "so", "I guess", "you know", "as it is", "as it were" (Van Lancker Sidtis et al., 2015).

Calculated in the following two ways:

in absolute numbers
in relation to the total number of words in the sample
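
Because several of these expressions span more than one word, counting them token by token is not enough. The sketch below, offered as one possible approach rather than the method actually used, searches the lowercased text for each expression at word boundaries; the function name count_expressions and the tokenization rule are assumptions.

```python
import re

FORMULAIC_EXPRESSIONS = ["well", "so", "I guess", "you know",
                         "as it is", "as it were"]


def count_expressions(text, expressions):
    """Return (absolute count, count relative to total words)."""
    lowered = text.lower()
    # Assumption: a word is a run of letters or apostrophes.
    total_words = len(re.findall(r"[a-z']+", lowered))
    absolute = 0
    for expression in expressions:
        # \b keeps "so" from matching inside words such as "some".
        pattern = r"\b" + re.escape(expression.lower()) + r"\b"
        absolute += len(re.findall(pattern, lowered))
    relative = absolute / total_words if total_words else 0.0
    return absolute, relative


sample = "Well, you know, it is a picnic, so to speak, I guess."
absolute, relative = count_expressions(sample, FORMULAIC_EXPRESSIONS)
```

Note that this simple matching counts every occurrence regardless of its function in the sentence, so "well" used as an adverb is counted the same way as "well" used as a formulaic opener.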

Modalizations

An individual's opinions about the content of his or her description (or what is happening on the image to be described), including doubts and concerns about his or her production.

Total number of occurrences of the following expressions in the sample: "I think", "In my opinion", "of course", "naturally", "unsure", "likely", "could be that", "unfortunately", "surely" (Boschi et al., 2017; Boyé et al., 2014).

Calculated in the following two ways:

in absolute numbers
in relation to the total number of words in the sample

Filler words

Words or groups of words used to emphasize what will be said or has been said, or which indicate that an individual is thinking about what to say next.

Total number of occurrences of the expressions "you know" and "I mean" in the sample.

Calculated in two ways:

in absolute numbers
in relation to the total number of words in the sample

This feature may provide information about an individual's lexical access capacity.

Note: This table is a modified version of the one found in Slegers et al. 2021 and Pellerin Sophie.

Variable names

Coherence_locale
Sentiment-valence
Emotion
Nombre_de_mots_incertitude
Frequence_relative_mots_incertitude
Nombre_de_mots_difficulte_acces_lexical
Frequence_relative_mots_difficulte_acces_lexical
Nombre_de_mots_expression_formulaiques
Frequence_relative_mots_expression_formulaiques
Nombre_de_mots_modalisations
Frequence_relative_mots_modalisations
Nombre_de_mots_de_remplissage
Frequence_relative_mots_de_remplissage