Tutorial 1: Merge Multi-ROI Chromatin Trace Data
This tutorial walks through the standard workflow for merging chromatin tracing data from multiple regions of interest (ROIs):
1. Collect trace files from all ROIs into one folder
2. Assess quality by computing Pearson correlations between ROIs
3. Remove ROIs with poor correlation (outliers)
4. Re-assess the correlation matrix after removal
5. Merge the remaining trace files into a single table
6. Compute basic statistics on the merged dataset
7. Next steps: link to Tutorial 2 for quality control
Step 0: Set up your data and output paths
[2]:
import os
from pathlib import Path
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
Set the path to your data folder:
[3]:
data_path = "/home/devos/Documents/data_to_compare_pdx1/PDX1"
Set up destination folder for output:
[6]:
dest_path = f"{data_path}/"
Check ROIs detected:
[7]:
print(f"Data path: {data_path}")
print(f"Output path: {dest_path}")
print(f"\nAvailable ROIs:")
for d in sorted(Path(data_path).iterdir()):
    if d.is_dir() and "ROI" in d.name:
        print(f" {d.name}")
Data path: /home/devos/Documents/data_to_compare_pdx1/PDX1
Output path: /home/devos/Documents/data_to_compare_pdx1/PDX1/
Available ROIs:
016_ROI
017_ROI
018_ROI
019_ROI
020_ROI
021_ROI
022_ROI
023_ROI
024_ROI
025_ROI
026_ROI
027_ROI
028_ROI
029_ROI
030_ROI
031_ROI
Step 1: Collect trace files from all ROIs
collect_files scans each subdirectory of --root for a file matching --example-file. The --variable-part "18" marks the ROI number in the example filename as the part that varies from folder to folder. Because the rest of the name must match at a fixed length, Matrix files (which have a different filename length) are naturally excluded.
[22]:
!collect_files --root {data_path} --example-file "Trace_3D_barcode_mask-mask0_ROI-18_Pdx1_filtered_Pdx1.ecsv" --variable-part "18" --copy-to {dest_path}/raw_traces --force
Matched (16):
016_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/016_ROI/Trace_3D_barcode_mask-mask0_ROI-16_Pdx1_filtered_Pdx1.ecsv
017_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/017_ROI/Trace_3D_barcode_mask-mask0_ROI-17_Pdx1_filtered_Pdx1.ecsv
018_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/018_ROI/Trace_3D_barcode_mask-mask0_ROI-18_Pdx1_filtered_Pdx1.ecsv
019_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/019_ROI/Trace_3D_barcode_mask-mask0_ROI-19_Pdx1_filtered_Pdx1.ecsv
020_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/020_ROI/Trace_3D_barcode_mask-mask0_ROI-20_Pdx1_filtered_Pdx1.ecsv
021_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/021_ROI/Trace_3D_barcode_mask-mask0_ROI-21_Pdx1_filtered_Pdx1.ecsv
022_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/022_ROI/Trace_3D_barcode_mask-mask0_ROI-22_Pdx1_filtered_Pdx1.ecsv
023_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/023_ROI/Trace_3D_barcode_mask-mask0_ROI-23_Pdx1_filtered_Pdx1.ecsv
024_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/024_ROI/Trace_3D_barcode_mask-mask0_ROI-24_Pdx1_filtered_Pdx1.ecsv
025_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/025_ROI/Trace_3D_barcode_mask-mask0_ROI-25_Pdx1_filtered_Pdx1.ecsv
026_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/026_ROI/Trace_3D_barcode_mask-mask0_ROI-26_Pdx1_filtered_Pdx1.ecsv
027_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/027_ROI/Trace_3D_barcode_mask-mask0_ROI-27_Pdx1_filtered_Pdx1.ecsv
028_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/028_ROI/Trace_3D_barcode_mask-mask0_ROI-28_Pdx1_filtered_Pdx1.ecsv
029_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/029_ROI/Trace_3D_barcode_mask-mask0_ROI-29_Pdx1_filtered_Pdx1.ecsv
030_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/030_ROI/Trace_3D_barcode_mask-mask0_ROI-30_Pdx1_filtered_Pdx1.ecsv
031_ROI -> /home/devos/Documents/data_to_compare_pdx1/PDX1/031_ROI/Trace_3D_barcode_mask-mask0_ROI-31_Pdx1_filtered_Pdx1.ecsv
Copied 16 file(s) to /home/devos/Documents/data_to_compare_pdx1/PDX1/raw_traces
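To make the fixed-length matching idea concrete, here is a minimal sketch of how an example filename and a variable part could be turned into a pattern. It is not the collect_files implementation: build_pattern is a hypothetical helper, and the Matrix-style filename in the candidate list is an invented example used only to show a name of a different length being rejected.

```python
import re


def build_pattern(example_file: str, variable_part: str) -> re.Pattern:
    """Build a regex from an example filename (hypothetical helper).

    The first occurrence of `variable_part` is replaced by a run of digits
    of the same length; everything else must match literally.
    """
    head, sep, tail = example_file.partition(variable_part)
    if not sep:
        raise ValueError(f"{variable_part!r} not found in {example_file!r}")
    digits = r"\d{%d}" % len(variable_part)
    return re.compile(re.escape(head) + digits + re.escape(tail))


example = "Trace_3D_barcode_mask-mask0_ROI-18_Pdx1_filtered_Pdx1.ecsv"
pattern = build_pattern(example, "18")

candidates = [
    "Trace_3D_barcode_mask-mask0_ROI-16_Pdx1_filtered_Pdx1.ecsv",
    "Trace_3D_barcode_mask-mask0_ROI-31_Pdx1_filtered_Pdx1.ecsv",
    # Invented Matrix-style name: same prefix, different overall length.
    "Trace_3D_barcode_mask-mask0_ROI-16_Pdx1_filtered_Pdx1_Matrix.ecsv",
]

# fullmatch enforces the fixed length, so the longer name is dropped.
matched = [name for name in candidates if pattern.fullmatch(name)]
print(matched)
```

Using fullmatch rather than search is what makes the length constraint strict: any extra suffix or prefix causes the candidate to be skipped.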
Step 2: Compute Pearson correlations between ROIs
trace_pearsons computes a pairwise distance map for each ROI (median 3D distance between every barcode pair), then calculates the Pearson correlation between these maps.
A high correlation between two ROIs means they share similar chromatin organization. An ROI with low correlation against all others is likely an outlier (imaging artifact, poor segmentation, etc.).
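The computation described above can be sketched with numpy on synthetic data. This is an illustration of the idea, not the trace_pearsons code: the data layout (one array of shape (n_barcodes, 3) per trace, with NaN for undetected barcodes), the function names, and the synthetic coordinates are all assumptions.

```python
import numpy as np


def median_distance_map(traces):
    """Median pairwise 3D distance between barcodes over all traces of one ROI.

    traces: list of (n_barcodes, 3) arrays; NaN rows mark undetected barcodes
    (assumed layout, not the pyHiM table format).
    """
    stack = np.stack(traces)                            # (n_traces, n_barcodes, 3)
    diff = stack[:, :, None, :] - stack[:, None, :, :]  # all barcode pairs
    dist = np.linalg.norm(diff, axis=-1)                # NaN where a barcode is missing
    return np.nanmedian(dist, axis=0)                   # (n_barcodes, n_barcodes)


def map_correlation(map_a, map_b):
    """Pearson correlation between the upper triangles of two distance maps."""
    iu = np.triu_indices_from(map_a, k=1)               # skip diagonal and duplicates
    a, b = map_a[iu], map_b[iu]
    ok = np.isfinite(a) & np.isfinite(b)                # keep pairs defined in both maps
    return float(np.corrcoef(a[ok], b[ok])[0, 1])


# Two synthetic "ROIs" sampled from the same underlying structure plus noise.
rng = np.random.default_rng(0)
base = np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 1, 0], [4, 1, 1]], dtype=float)
roi_a = [base + rng.normal(scale=0.05, size=base.shape) for _ in range(20)]
roi_b = [base + rng.normal(scale=0.05, size=base.shape) for _ in range(20)]

map_a = median_distance_map(roi_a)
map_b = median_distance_map(roi_b)
r = map_correlation(map_a, map_b)
print(f"Pearson r between the two maps: {r:.3f}")
```

Only the upper triangle is compared because the maps are symmetric and the diagonal is zero by construction; including them would inflate the correlation.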
[15]:
!ls {dest_path}/raw_traces/*.ecsv | trace_pearsons --pipe -O {dest_path}
# Display the correlation matrix
img = mpimg.imread(f"{dest_path}/trace_correlation_matrix.png")
fig, ax = plt.subplots(figsize=(10, 10))
ax.imshow(img)
ax.axis('off')
plt.tight_layout()
plt.show()
Analyzing 16 trace files...
Processing Trace_3D_barcode_mask-mask0_ROI-16_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-17_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-18_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-19_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-20_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-21_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-22_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-23_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-24_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-25_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-26_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-27_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-28_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-29_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-30_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-31_Pdx1_filtered_Pdx1.ecsv
$ Saved correlation matrix as /home/devos/Documents/data_to_compare_pdx1/PDX1/trace_correlation_matrix.png
$ Saved correlation matrix data in NPY format: /home/devos/Documents/data_to_compare_pdx1/PDX1/trace_correlation_matrix.npy
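Outlier removal (step 3 in the overview) works from this saved matrix; by the next run, the files for ROIs 17, 23, 26 and 28 are no longer in raw_traces. One way to pick candidates is to rank ROIs by their mean correlation with all the others. The sketch below uses a toy matrix. The flag_outliers helper, the 0.5 threshold, and the assumption that matrix rows follow the sorted file order are illustrative choices, not values taken from this tutorial.

```python
import numpy as np


def flag_outliers(corr, labels, threshold):
    """Return labels whose mean off-diagonal correlation is below `threshold`.

    Hypothetical helper; the threshold must be chosen per dataset.
    """
    off_diag = np.array(corr, dtype=float)   # copy so the input is untouched
    np.fill_diagonal(off_diag, np.nan)       # ignore self-correlation
    mean_corr = np.nanmean(off_diag, axis=1)
    return [label for label, m in zip(labels, mean_corr) if m < threshold]


# Toy 4x4 matrix. With real data one would instead load the saved file, e.g.
# corr = np.load(f"{dest_path}/trace_correlation_matrix.npy")
# and build `labels` from the sorted file list (assumed row order).
corr = np.array([
    [1.0, 0.9, 0.2, 0.8],
    [0.9, 1.0, 0.3, 0.9],
    [0.2, 0.3, 1.0, 0.1],
    [0.8, 0.9, 0.1, 1.0],
])
labels = ["A", "B", "C", "D"]

outliers = flag_outliers(corr, labels, threshold=0.5)
print(outliers)
```

The flagged files would then be moved or deleted from raw_traces before re-running trace_pearsons. It is worth inspecting the matrix image as well, since a single hard threshold can hide borderline cases.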
Step 4: Re-run Pearson to verify improvement
After removing the outlier ROIs, the correlation matrix should show higher overall values.
[19]:
!ls {dest_path}/raw_traces/*.ecsv | trace_pearsons --pipe -O {dest_path}
# Display the updated correlation matrix
img = mpimg.imread(f"{dest_path}/trace_correlation_matrix.png")
fig, ax = plt.subplots(figsize=(10, 10))
ax.imshow(img)
ax.axis('off')
plt.tight_layout()
plt.show()
Analyzing 12 trace files...
Processing Trace_3D_barcode_mask-mask0_ROI-16_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-18_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-19_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-20_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-21_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-22_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-24_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-25_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-27_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-29_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-30_Pdx1_filtered_Pdx1.ecsv
Processing Trace_3D_barcode_mask-mask0_ROI-31_Pdx1_filtered_Pdx1.ecsv
$ Saved correlation matrix as /home/devos/Documents/data_to_compare_pdx1/PDX1/trace_correlation_matrix.png
$ Saved correlation matrix data in NPY format: /home/devos/Documents/data_to_compare_pdx1/PDX1/trace_correlation_matrix.npy
Step 5: Merge trace files
[20]:
!ls {dest_path}/raw_traces/*.ecsv | trace_merge -F {dest_path} -N merged_traces.ecsv
Number of trace files to merge: 12
$ Merged trace file will contain 31407 traces
Read and accumulated 12 trace files
$ Saving output table as /home/devos/Documents/data_to_compare_pdx1/PDX1/merged_traces.ecsv ...
Finished execution
Step 6: Compute basic statistics
[21]:
!trace_stats --input {dest_path}/merged_traces.ecsv
$ Importing table from pyHiM format
Successfully loaded trace table: /home/devos/Documents/data_to_compare_pdx1/PDX1//merged_traces.ecsv
Statistics for /home/devos/Documents/data_to_compare_pdx1/PDX1//merged_traces.ecsv:
- Number of unique ROIs: 12
- Number of unique chromatin traces: 3388
- Number of unique barcodes: 23
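The three counts above are unique-value counts over columns of the merged table. A minimal sketch of that operation on a few in-memory rows follows. The column names "ROI #", "Trace_ID" and "Barcode #", the one-row-per-spot layout, and the count_unique helper are assumptions for illustration, not the trace_stats implementation.

```python
from collections import defaultdict


def count_unique(rows, columns):
    """Count distinct values per column (hypothetical helper)."""
    seen = defaultdict(set)
    for row in rows:
        for col in columns:
            seen[col].add(row[col])
    return {col: len(seen[col]) for col in columns}


# Toy table: one row per detected spot (assumed layout and column names).
rows = [
    {"ROI #": 16, "Trace_ID": "t1", "Barcode #": 1},
    {"ROI #": 16, "Trace_ID": "t1", "Barcode #": 2},
    {"ROI #": 16, "Trace_ID": "t2", "Barcode #": 1},
    {"ROI #": 18, "Trace_ID": "t3", "Barcode #": 2},
    {"ROI #": 18, "Trace_ID": "t3", "Barcode #": 3},
]

stats = count_unique(rows, ["ROI #", "Trace_ID", "Barcode #"])
for name, n in stats.items():
    print(f"- Number of unique values in {name!r}: {n}")
```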
Next steps
The merged trace file is ready for downstream analysis.
Continue with Tutorial 2 — Quality Control to:
Generate detailed quality metrics with trace_analyzer
Interpret barcode detection, neighbor distances, and barcode frequency plots
Decide on filtering thresholds for Tutorial 3
Summary
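| Step | Script | Output |
|---|---|---|
| Collect | collect_files | raw_traces/ |
| Correlations | trace_pearsons | trace_correlation_matrix.png, trace_correlation_matrix.npy |
| Filter ROIs | Python (numpy) | removed outlier files from raw_traces/ |
| Merge | trace_merge | merged_traces.ecsv |
| Statistics | trace_stats | stdout |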
Output location: the folder set in dest_path (in this run, /home/devos/Documents/data_to_compare_pdx1/PDX1/)