trace_genomic_coordinates

Reliability status: development

Assigns genomic coordinates (Chrom, Chrom_Start, Chrom_End) from a BED file to each row of a chromatin trace table, matching rows on the ‘Barcode #’ column. If the BED file provides a fifth column, the barcode in the trace table is replaced by that value.

usage: trace_genomic_coordinates [-h] [--input INPUT] --bed BED
                                 [--output OUTPUT] [--pipe] [--auto-continue]

Named Arguments

--input

Path to the input trace file (ECSV format).

--bed

Path to the BED file containing genomic coordinates.

--output

Path to save the updated trace file.

--pipe

Read the list of trace files from stdin (pipe).

Default: False

--auto-continue

Automatically continue processing even if some barcodes are unmatched.

Default: False

Usage example

trace_genomic_coordinates --input trace_file.ecsv --bed bed_file.bed --output output_file.ecsv

BED file format

Warning

The BED file must not contain a header line.

4-column BED (default behavior)

chrX            14785864        14789298                1
chrX            14789398        14792430                2
chrX            14792433        14795380                3
chrX            14795381        14798629                7
chrX            14799003        14802202                8
chrX            14802255        14805371                9
chrX            14805412        14809056                10
chrX            14809057        14812112                11
chrX            14812113        14814817                12
chrX            14814823        14818656                13
chrX            14824792        14829604                14

The last (fourth) column must contain only numbers and must match the barcode IDs in the ‘Barcode #’ column of the trace table, which are themselves taken from the filenames processed by pyHiM.
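The matching rule above can be illustrated with a short Python sketch. This is not the tool's implementation: `read_bed` and `annotate` are hypothetical helper names, the trace table is stood in for by a list of plain dictionaries rather than an ECSV file, and rejecting non-numeric fields is an assumption about how a stray header line could be caught.

```python
import io


def read_bed(handle):
    """Parse a headerless, whitespace-separated 4-column BED stream.

    Hypothetical helper: returns {barcode_id: (chrom, start, end)}.
    """
    coords = {}
    for line_no, line in enumerate(handle, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) < 4:
            raise ValueError(f"line {line_no}: expected at least 4 columns")
        chrom, start, end, barcode = fields[:4]
        if not (start.isdigit() and end.isdigit() and barcode.isdigit()):
            # A header line such as "chrom start end name" ends up here.
            raise ValueError(f"line {line_no}: non-numeric field (header line?)")
        coords[int(barcode)] = (chrom, int(start), int(end))
    return coords


def annotate(rows, coords):
    """Add Chrom, Chrom_Start, Chrom_End to each matching row.

    Returns the set of barcode IDs that have no entry in the BED file.
    """
    unmatched = set()
    for row in rows:
        hit = coords.get(row["Barcode #"])
        if hit is None:
            unmatched.add(row["Barcode #"])
            continue
        row["Chrom"], row["Chrom_Start"], row["Chrom_End"] = hit
    return unmatched


bed_text = """\
chrX    14785864    14789298    1
chrX    14789398    14792430    2
"""
bed_coords = read_bed(io.StringIO(bed_text))

# Stand-in for the trace table: one matching barcode, one unmatched.
trace_rows = [{"Barcode #": 1}, {"Barcode #": 5}]
unmatched_ids = annotate(trace_rows, bed_coords)

print(trace_rows[0])
print(sorted(unmatched_ids))
```

In this sketch, barcode 5 has no BED entry, so its row is left without coordinates and is reported as unmatched.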

5-column BED (barcode substitution)

If you provide a fifth column, trace_genomic_coordinates will substitute barcode names in the Trace table: the 4th column must match the barcode ID in the trace table, and the 5th column is the value that will be used after substitution.

chrX            14785864        14789298                1              HoxA1
chrX            14789398        14792430                2              HoxA2
chrX            14792433        14795380                3              HoxA3
chrX            14795381        14798629                7              HoxA7
chrX            14799003        14802202                8              HoxA8
chrX            14802255        14805371                9              HoxA9
chrX            14805412        14809056                10             HoxA10
chrX            14809057        14812112                11             HoxA11
chrX            14812113        14814817                12             HoxA12
chrX            14814823        14818656                13             HoxA13
chrX            14824792        14829604                14             HoxA14
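As a rough illustration of the substitution, the Python sketch below reads two of the 5-column lines above and rewrites the barcode of each row. It is a sketch under assumptions, not the tool's code: `read_bed5` is a hypothetical helper, the trace table is represented by plain dictionaries, and keeping the original ID when a line has no fifth column is an assumed fallback.

```python
def read_bed5(lines):
    """Parse headerless BED lines that have 4 or 5 columns.

    Hypothetical helper: returns two dictionaries keyed by the numeric
    barcode ID in column 4: coordinates, and the replacement name from
    column 5 (when present).
    """
    coords, names = {}, {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        chrom, start, end, barcode = fields[:4]
        coords[int(barcode)] = (chrom, int(start), int(end))
        if len(fields) >= 5:
            names[int(barcode)] = fields[4]
    return coords, names


bed_lines = [
    "chrX  14785864  14789298  1  HoxA1",
    "chrX  14795381  14798629  7  HoxA7",
]
locus_coords, locus_names = read_bed5(bed_lines)

# Stand-in for the trace table rows.
table_rows = [{"Barcode #": 7}, {"Barcode #": 1}]
for row in table_rows:
    old_id = row["Barcode #"]
    # Column 4 is used for matching ...
    row["Chrom"], row["Chrom_Start"], row["Chrom_End"] = locus_coords[old_id]
    # ... and column 5 becomes the new barcode value.
    row["Barcode #"] = locus_names.get(old_id, old_id)

print(table_rows)
```

Matching is done on the original numeric ID before the value is replaced, so the order of the two steps inside the loop matters.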