collect_files
Reliability status: stable
Collect exactly one file per subdirectory of ROOT by matching EXAMPLE_FILE. Without --variable-part the match is exact and files are automatically renamed with the subdirectory name. With --variable-part a fixed-length pattern match is used.
usage: collect_files [-h] --root ROOT --example-file EXAMPLE_FILE
[--variable-part VARIABLE_PART] --copy-to COPY_TO
[--force] [--rename]
Named Arguments
- --root
Root directory to scan (immediate subdirectories are collection units)
- --example-file
Example filename to search for in each subdirectory
- --variable-part
Substring in EXAMPLE_FILE that varies across subdirectories. When omitted, exact filename matching is used and files are automatically renamed with the subdirectory name.
- --copy-to
Destination folder for collected files (created if missing)
- --force
Skip subdirectories with no match instead of failing
Default:
False
- --rename
Rename collected files by inserting the subdirectory name before the extension (e.g. data.dat from 013_ROI -> data_013_ROI.dat). Automatic when --variable-part is omitted.
Default:
False
Why?
When merging data from multiple ROIs, you need to collect one file per subdirectory into a single folder. Two problems arise:
- Localization files have the same name in every ROI (localizations_3D_barcode.dat), so they cannot be copied to the same folder without renaming.
- Trace files have different names (the ROI number varies), but you need to select only the right files and ignore others (e.g. Matrix files).
collect_files solves both problems with a single tool: exact matching with automatic renaming handles the first, and fixed-length pattern matching handles the second.
Two Matching Modes
Exact match (no --variable-part)
When all files share the same name, omit --variable-part. Files are automatically
renamed by inserting the subdirectory name before the extension:
collect_files --root data/RUT \
--example-file "localizations_3D_barcode.dat" \
--copy-to collected/
Result: localizations_3D_barcode_013_ROI.dat, localizations_3D_barcode_014_ROI.dat, etc.
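The renaming rule can be sketched in a few lines of Python. This is an illustration of the documented behavior only, not the tool's actual source; the function name renamed_with_subdir is hypothetical.

```python
from pathlib import Path


def renamed_with_subdir(filename: str, subdir_name: str) -> str:
    """Insert the subdirectory name before the file extension.

    Hypothetical helper illustrating the documented rule; the real
    implementation inside collect_files may differ.
    """
    path = Path(filename)
    # stem is the name without its last suffix; suffix keeps the leading dot.
    return f"{path.stem}_{subdir_name}{path.suffix}"


print(renamed_with_subdir("localizations_3D_barcode.dat", "013_ROI"))
# prints: localizations_3D_barcode_013_ROI.dat
```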
Variable match (with --variable-part)
When filenames differ by a fixed-length substring (e.g. the ROI number), specify
--variable-part so any characters of the same length are accepted at that position:
collect_files --root data/RUT \
--example-file "Trace_3D_barcode_mask-mask0_ROI-13.ecsv" \
--variable-part "13" \
--copy-to collected/
This matches ROI-14.ecsv, ROI-25.ecsv, etc. but rejects ROI-021.ecsv
(different length) and _Matrix_uniqueBarcodes.ecsv (different total length).
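One way to picture fixed-length matching is as a regular expression in which the variable part is replaced by the same number of wildcard characters. The sketch below is an assumption about how such a match could be built, not the tool's implementation; build_pattern is a hypothetical name, and requiring the variable part to occur exactly once in the example filename is a simplification made here.

```python
import re


def build_pattern(example_file: str, variable_part: str) -> "re.Pattern[str]":
    """Build a fixed-length pattern from an example filename.

    Hypothetical helper: assumes variable_part occurs exactly once
    in example_file (a simplification for this sketch).
    """
    if example_file.count(variable_part) != 1:
        raise ValueError("variable part must occur exactly once in the example")
    start = example_file.index(variable_part)
    prefix = example_file[:start]
    suffix = example_file[start + len(variable_part):]
    # Escape the fixed text; accept any characters of the same length in between.
    wildcard = ".{%d}" % len(variable_part)
    return re.compile(re.escape(prefix) + wildcard + re.escape(suffix))


pattern = build_pattern("Trace_3D_barcode_mask-mask0_ROI-13.ecsv", "13")
for name in (
    "Trace_3D_barcode_mask-mask0_ROI-14.ecsv",
    "Trace_3D_barcode_mask-mask0_ROI-021.ecsv",
):
    # fullmatch enforces the total length, so ROI-021 is rejected.
    print(name, bool(pattern.fullmatch(name)))
```

Because the whole filename must match, any candidate whose total length differs from the example is rejected automatically.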
Notes
- Each immediate subdirectory of --root is scanned recursively for exactly one match.
- If a subdirectory has zero or multiple matches, the script stops with a clear error.
- Use --force to skip subdirectories with no match.
- Use --rename to insert the subdirectory name in the output filename (automatic in exact mode).
- Original file metadata (timestamps, permissions) is preserved.
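The rules in these notes can be summarized as a short Python sketch for the exact-match case. It is a minimal illustration under stated assumptions, not the script's source: collect_exact is a hypothetical name, the error type is chosen arbitrarily, and shutil.copy2 is used here because it copies timestamps and permission bits along with the file contents.

```python
import shutil
from pathlib import Path


def collect_exact(root: Path, filename: str, copy_to: Path, force: bool = False) -> list:
    """Copy exactly one file named `filename` from each immediate subdirectory.

    Hypothetical sketch of the documented behavior (exact mode only).
    """
    copy_to.mkdir(parents=True, exist_ok=True)  # destination is created if missing
    copied = []
    for subdir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Recursive scan of this collection unit.
        matches = [p for p in subdir.rglob("*") if p.is_file() and p.name == filename]
        if not matches and force:
            continue  # with force, subdirectories without a match are skipped
        if len(matches) != 1:
            raise RuntimeError(
                f"{subdir.name}: expected exactly one match, found {len(matches)}"
            )
        source = matches[0]
        # Exact mode renames automatically: subdirectory name before the extension.
        target = copy_to / f"{source.stem}_{subdir.name}{source.suffix}"
        shutil.copy2(source, target)  # copy2 also preserves metadata
        copied.append(target)
    return copied
```

Note that force only relaxes the zero-match case in this sketch; multiple matches in one subdirectory still raise an error, as the notes describe.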
Replaces
This script replaces the former localization_cp_files script and the find -exec cp
pattern previously used for trace files.