speechmetryflow

An automated Nextflow-based workflow that extracts both audio and text metrics from speech tasks (such as picture descriptions) at scale.

Running

nextflow run lingualab/speechmetryflow -r {last_release_or_tag} --input participant_ids.csv

Replace {last_release_or_tag} in the -r option with the release or tag you want to use.

Files needed

participant_ids.csv

This CSV file must contain at least the following 4 columns:

  • participant_id: required for the pipeline to find your files. File names must begin with the participant_id. To specify the folders where your files are located, see nextflow.config.
  • language: 2 choices, en or fr.
  • sex: 2 choices, male or female.
  • task: 2 choices, cookie_theft or picnic.

Example:

participant_id,language,sex,task
sub-PKM8767,en,male,cookie_theft
sub-SBK4467,en,female,picnic
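
If you prefer to generate or update this file programmatically, the following Python sketch writes a CSV with the required columns (the two rows are the example participants above; replace them with your own):

import csv

# Example rows matching the table above; replace with your own participants.
rows = [
    {"participant_id": "sub-PKM8767", "language": "en", "sex": "male", "task": "cookie_theft"},
    {"participant_id": "sub-SBK4467", "language": "en", "sex": "female", "task": "picnic"},
]

with open("participant_ids.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["participant_id", "language", "sex", "task"])
    writer.writeheader()
    writer.writerows(rows)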

nextflow.config

Example for the elm server:

params {
    audio_folder = "/data/brambati/dataset/CCNA/derivatives/audio_extract"
    text_folder = "/data/brambati/dataset/CCNA/derivatives/cookie_txt"
}

And then run:

nextflow run lingualab/speechmetryflow -r {last_release_or_tag} -profile unf_elm --input participant_ids.csv
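
The pipeline locates each participant's audio and text files by their participant_id prefix inside these folders. The Python sketch below only illustrates that naming convention (the path is the example value above, and the pipeline's actual matching logic may differ):

from pathlib import Path

# Example folder from the nextflow.config snippet above.
audio_folder = Path("/data/brambati/dataset/CCNA/derivatives/audio_extract")
participant_id = "sub-PKM8767"

# Files for this participant are expected to start with the participant_id.
audio_files = sorted(audio_folder.glob(f"{participant_id}*"))
print(audio_files)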

Output

The pipeline produces CSV files in the results/Statistics directory:

  • population_lingualab_audio.csv: metrics computed with lingualabpy_lingualab_audio from lingualabpy
  • population_uhmometer_metrics.csv: metrics computed with uhm-o-meter
  • population_lingualab_text.csv: metrics computed with Text2Variable
  • population_opensmile_metrics_{feature_set}.csv: metrics computed with opensmile
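
These files can be inspected with any CSV reader; a minimal pandas sketch follows (the opensmile feature_set value used below, eGeMAPSv02, is only a placeholder, use the one your run actually produced):

import pandas as pd

stats_dir = "results/Statistics"

audio = pd.read_csv(f"{stats_dir}/population_lingualab_audio.csv")
uhm = pd.read_csv(f"{stats_dir}/population_uhmometer_metrics.csv")
text = pd.read_csv(f"{stats_dir}/population_lingualab_text.csv")
# "eGeMAPSv02" is a hypothetical feature_set value; use the one from your run.
opensmile = pd.read_csv(f"{stats_dir}/population_opensmile_metrics_eGeMAPSv02.csv")

print(audio.head())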