Pragmatic features
Local consistency
Semantic similarity between a sentence and its predecessor.
The average semantic similarity score, typically computed as cosine similarity, is a common measure in natural language processing (NLP) for evaluating the semantic proximity between sentences.
Cosine similarity measures how alike two vectors in a multidimensional space are, and is computed as the cosine of the angle between them. In NLP, it is often used to compare word or sentence vectors, where the vectors represent the semantic distribution of terms. The cosine_similarity function in the scikit-learn library is an implementation of this measure.
Higher values indicate greater similarity and less semantic distance. Lower values indicate less similarity and greater semantic distance.
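As an illustration only, the local consistency score can be sketched as the mean cosine similarity between each sentence vector and the one before it. The function names cosine_similarity and local_coherence below are hypothetical, the sentence vectors are assumed to come from some embedding model not specified here, and returning 0.0 for a zero vector is a convention chosen for this sketch. The similarity is written in plain Python so the example is self-contained.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)

def local_coherence(sentence_vectors):
    """Mean cosine similarity between each sentence and its predecessor."""
    pairs = zip(sentence_vectors, sentence_vectors[1:])
    scores = [cosine_similarity(prev, cur) for prev, cur in pairs]
    if not scores:
        return None  # fewer than two sentences: the score is undefined
    return sum(scores) / len(scores)

# Toy vectors standing in for real sentence embeddings.
vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
score = local_coherence(vectors)
```

In this toy case the first pair of sentences is identical and the second pair is orthogonal, so the score falls midway between the two extremes.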
Words denoting uncertainty
Words denoting uncertainty about the nature of an image element to be described.
Number of occurrences of the following words in the sample: "think", "look", "like", "kind", "seem", "maybe", "can", "something".
Calculated in two ways:
- in absolute numbers
- in relation to the total number of words in the sample
Word list inspired by Garrard et al. (2014). A French version still needs to be created.
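A minimal sketch of the two calculations, assuming simple lowercase tokenization and exact word matching, so that inflected forms such as "looks" or "seems" are not counted. The names tokenize and count_listed_words are hypothetical, and the actual pipeline may lemmatize the sample first.

```python
import re

UNCERTAINTY_WORDS = {"think", "look", "like", "kind", "seem",
                     "maybe", "can", "something"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())

def count_listed_words(text, word_list):
    """Return (absolute count, count relative to total words)."""
    tokens = tokenize(text)
    absolute = sum(1 for token in tokens if token in word_list)
    relative = absolute / len(tokens) if tokens else 0.0
    return absolute, relative

sample = "I think it looks like maybe a kind of kite"
absolute, relative = count_listed_words(sample, UNCERTAINTY_WORDS)
```

The same function can be reused with any other single-word list by passing a different set.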
Difficulty finding the right words
Use of words indicating lexical access difficulties.
Number of instances of the following words in the sample: "know", "remember", "unable".
Calculated in two ways:
- in absolute numbers
- in relation to the total number of words in the sample
Word list inspired by Garrard et al. (2014) and Rentoumi et al. (2014). A French version still needs to be created.
Connotation of speech
Emotions generated by speech. Depends on the average valence of the words in the speech. The average valence score of all the words in the sample will be obtained when the psycholinguistic variables are extracted. For each word, possible scores range from 1 to 9. A higher score indicates that a word has a more positive connotation, while a lower score indicates a more negative connotation.
- If the average valence is greater than or equal to 5, the label "positive connotation" will be given to the speech.
- If the average valence is greater than or equal to 4 and less than 5, the label "neutral connotation" will be given to the speech.
- If the average valence is less than 4, the label "negative connotation" will be given to the speech.
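The labelling rule above can be sketched as follows. The function name connotation_label is hypothetical, and the per-word valence scores (on the 1 to 9 scale) are assumed to have been looked up already during extraction of the psycholinguistic variables.

```python
def connotation_label(valences):
    """Map the mean word valence (1-9 scale) to a connotation label."""
    if not valences:
        return None  # no scored words: no label can be assigned
    mean_valence = sum(valences) / len(valences)
    if mean_valence >= 5:
        return "positive connotation"
    if mean_valence >= 4:
        return "neutral connotation"
    return "negative connotation"

label = connotation_label([6.2, 5.1, 4.8])
```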
Formulaic expressions
Expressions with a fixed form and non-literal meaning with attitudinal nuances.
Total number of occurrences of the following formulaic expressions in the sample: "well", "so", "I guess", "you know", "as it is", "as it were" (Van Lancker Sidtis et al., 2015).
Calculated in the following two ways:
- in absolute numbers
- in relation to the total number of words in the sample
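Because some of these expressions span several words, a per-token lookup is not enough. The sketch below matches each expression as a whole phrase in the normalized text. The name count_expressions is hypothetical, and two choices are assumptions rather than part of the original definition: each occurrence counts once whatever its length, and the denominator is the total number of words in the sample.

```python
import re

FORMULAIC_EXPRESSIONS = ["well", "so", "I guess", "you know",
                         "as it is", "as it were"]

def count_expressions(text, expressions):
    """Return (absolute count, count relative to total words)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    normalized = " ".join(tokens)  # punctuation removed, single spaces
    absolute = 0
    for expression in expressions:
        words = [re.escape(word) for word in expression.lower().split()]
        # Match the expression as whole words only.
        pattern = r"\b" + r"\s+".join(words) + r"\b"
        absolute += len(re.findall(pattern, normalized))
    relative = absolute / len(tokens) if tokens else 0.0
    return absolute, relative

sample = "Well, you know, it is a kitchen, so I guess she is washing up."
absolute, relative = count_expressions(sample, FORMULAIC_EXPRESSIONS)
```

The same function applies to any other list that mixes single words and multi-word expressions.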
Modalizations
An individual's opinions about the content of his or her description (or what is happening in the image to be described), including doubts and concerns about his or her production.
Total number of occurrences of the following expressions in the sample: "I think", "In my opinion", "of course", "naturally", "unsure", "likely", "could be that", "unfortunately", "surely" (Boschi et al., 2017; Boyé et al., 2014).
Calculated in the following two ways:
- in absolute numbers
- in relation to the total number of words in the sample
Filler words
Words or groups of words used to emphasize what will be said or has been said, or which indicate that an individual is thinking about what to say next.
Total number of occurrences of the expressions "you know" and "I mean" in the sample.
Calculated in two ways:
- in absolute numbers
- in relation to the total number of words in the sample
May provide information about an individual's lexical access ability.
Note: This table is a modified version of the one found in Slegers et al. 2021 and Pellerin Sophie.
Variable names
Coherence_locale
Sentiment-valence
Emotion
Nombre_de_mots_incertitude
Frequence_relative_mots_incertitude
Nombre_de_mots_difficulte_acces_lexical
Frequence_relative_mots_difficulte_acces_lexical
Nombre_de_mots_expression_formulaiques
Frequence_relative_mots_expression_formulaiques
Nombre_de_mots_modalisations
Frequence_relative_mots_modalisations
Nombre_de_mots_de_remplissage
Frequence_relative_mots_de_remplissage