trace_export_to_fofct

Reliability status: development

Convert a pyHiM trace table (ECSV) to a FOF-CT CSV file (https://fish-omics-format.readthedocs.io/en/latest/index.html).

The metadata of the experiment is stored in a JSON file. It includes:

  • the genome assembly

  • the experimenter’s name

  • contact information

required_keys = ["genome_assembly", "experimenter_name", "experimenter_contact"]

usage: trace_export_to_fofct [-h] [--ecsv_file ECSV_FILE]
                             [--bed_file BED_FILE] [--json_file JSON_FILE]
                             [--output_file OUTPUT_FILE]

Named Arguments

--ecsv_file

Path to the ECSV file

--bed_file

Path to the BED file

--json_file

Path to the JSON file with the metadata. Default: parameters.json


--output_file

Path to the output CSV file

Example BED file

The BED file must not have a header line.

chr2L	2343645	2356099	5
chr2L	2356147	2369783	9
chr2L	2369828	2381912	13
chr2L	2381947	2393854	17
chr2L	2393892	2405589	21

Notes

This first version ignores two columns of the pyHiM trace table:

  • the “mask_id” column: it can usually be linked to the “Cell_ID” column, but not always, because two “mask_id” values can be linked to the same Cell_ID.

  • the “label” column: it is usually linked to RNA species, but it acts as a global mask covering many cells, so it does not represent individual RNA spots.

The output CSV file will have the following columns:

  • Spot_ID

  • Trace_ID

  • X

  • Y

  • Z

  • Chrom

  • Chrom_Start

  • Chrom_End

  • Extra_Cell_ROI_ID (“ROI #” in the pyHiM trace table)
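The column mapping above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual code. Apart from “ROI #”, the input keys used here ("Spot_ID", "Trace_ID", "x", "y", "z", "Barcode #") are assumed names for the trace table columns, and the spot and barcode values are made up. It only covers the data columns; how the JSON metadata ends up in the output file is not shown.

```python
import csv
import io

# Output columns as listed above.
FOFCT_COLUMNS = [
    "Spot_ID", "Trace_ID", "X", "Y", "Z",
    "Chrom", "Chrom_Start", "Chrom_End", "Extra_Cell_ROI_ID",
]


def to_fofct_row(spot, barcodes):
    """Build one output row from a trace-table spot and a barcode lookup.

    The input key names other than "ROI #" are assumptions for this sketch.
    """
    chrom, chrom_start, chrom_end = barcodes[spot["Barcode #"]]
    return {
        "Spot_ID": spot["Spot_ID"],
        "Trace_ID": spot["Trace_ID"],
        "X": spot["x"],
        "Y": spot["y"],
        "Z": spot["z"],
        "Chrom": chrom,
        "Chrom_Start": chrom_start,
        "Chrom_End": chrom_end,
        "Extra_Cell_ROI_ID": spot["ROI #"],
    }


# Hypothetical spot and barcode lookup, for illustration only.
spot = {"Spot_ID": "s1", "Trace_ID": "t1", "x": 1.0, "y": 2.0, "z": 3.0,
        "Barcode #": 5, "ROI #": 5}
barcodes = {5: ("chr2L", 2343645, 2356099)}

# Write the header and one row to an in-memory buffer instead of a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FOFCT_COLUMNS, lineterminator="\n")
writer.writeheader()
writer.writerow(to_fofct_row(spot, barcodes))
print(buffer.getvalue())
```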

The command takes the following arguments:

  • the path to the ECSV file

  • the path to the BED file

  • the path to the JSON file (optional)

  • the path to the output CSV file (optional)

Example

trace_export_to_fofct --ecsv_file /path/to/Trace_3D_barcode_KDtree_ROI-5.ecsv --bed_file /path/to/barcode.bed --json_file /path/to/parameters.json --output_file /path/to/output.csv

Example JSON file:

{
  "genome_assembly": "GRCh38",
  "experimenter_name": "Dr. Pirulo",
  "experimenter_contact": "pirulo@gmail.com"
}
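Before running the export, it can help to check that a metadata file contains every required key. The following is a minimal sketch of such a check; the helper name missing_metadata_keys is hypothetical and not part of the tool, and the JSON is parsed from a string here rather than read from parameters.json.

```python
import json

# Keys listed as required on this page.
REQUIRED_KEYS = ["genome_assembly", "experimenter_name", "experimenter_contact"]


def missing_metadata_keys(metadata):
    """Return the required keys that are absent from the metadata dict."""
    return [key for key in REQUIRED_KEYS if key not in metadata]


# The example JSON content from above, parsed from a string for illustration.
example = json.loads(
    """
    {
      "genome_assembly": "GRCh38",
      "experimenter_name": "Dr. Pirulo",
      "experimenter_contact": "pirulo@gmail.com"
    }
    """
)
print(missing_metadata_keys(example))  # prints []
```

An empty list means the file is complete; any returned names are the keys still to be added.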

To link the traces to the chromosomes, we use a BED file that contains the barcode information. We expect the BED file to have the following columns:

  • chrName

  • startSeq

  • endSeq

  • Barcode_ID
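As an illustration of how these four columns can be used, the sketch below reads a header-less BED file into a lookup from Barcode_ID to genomic coordinates. It assumes tab-separated fields and integer barcode IDs, as in the example BED file; the function name read_barcode_bed is hypothetical and this is not necessarily how the tool reads the file.

```python
import csv
import io


def read_barcode_bed(handle):
    """Map each Barcode_ID to its (chrName, startSeq, endSeq) tuple."""
    barcodes = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:  # skip blank lines
            continue
        chr_name, start_seq, end_seq, barcode_id = row[:4]
        barcodes[int(barcode_id)] = (chr_name, int(start_seq), int(end_seq))
    return barcodes


# Two lines from the example BED file, read from memory instead of disk.
bed_text = "chr2L\t2343645\t2356099\t5\nchr2L\t2356147\t2369783\t9\n"
barcodes = read_barcode_bed(io.StringIO(bed_text))
print(barcodes[9])  # prints ('chr2L', 2356147, 2369783)
```

With a real file, the same function can be called on an open file handle, for example one obtained with open("/path/to/barcode.bed", newline="").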