Saving & Loading single cells and trajectory data

In this tutorial, we walk through saving and loading single cells and trajectory data with the livecellx library. By the end, you should be able to efficiently store and retrieve your single cells and trajectories as JSON files.

[1]:
# import some common libraries
import os
import glob
import os.path
import numpy as np
import json, random, cv2
from cellpose import models
from cellpose.io import imread
import matplotlib.pyplot as plt

from tqdm import tqdm
from pathlib import Path
from skimage import measure
from PIL import Image, ImageSequence

# from livecellx import segment
from livecellx import core
from livecellx.core import datasets
from livecellx.core import SingleCellTrajectory, SingleCellStatic
from livecellx.core.datasets import LiveCellImageDataset, SingleImageDataset

SingleCellStatic: saving & loading

Before we dive into data processing, let's define the output directory where our results will be saved.

[2]:
io_out_dir = Path("test_io_output")
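The cell above only defines a path object; nothing is created on disk at that point. A minimal sketch for creating the directory up front, using only the standard library `pathlib`. Whether the livecellx writers create missing directories on their own is not stated in this tutorial, so the explicit call is a precaution rather than a documented requirement:

```python
from pathlib import Path

io_out_dir = Path("test_io_output")

# parents=True also creates missing parent directories;
# exist_ok=True makes the cell safe to rerun.
io_out_dir.mkdir(parents=True, exist_ok=True)
```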

Loading single cells from existing mask files

We now set up the paths to the datasets used throughout this tutorial. Make sure the paths below point to your actual datasets. mask_dataset_path should contain your segmentation masks, from which the individual cell information is derived. The file names in mask_dataset_path, when sorted alphabetically, must follow the order of the time points.

Note: The default sorting simply orders the url (file name) list by string value. Without leading zeros in the time index, the resulting time order may be wrong: for example, the string T10 (the 10th file) sorts before the string T2. If your files follow a custom naming pattern, pass LiveCellImageDataset a time2url dictionary that maps each time point to its file location, so that time-lapse data is read in the correct order.
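The sorting pitfall described in the note can be reproduced with plain Python. The sketch below uses hypothetical file names without zero padding and a hypothetical helper, `natural_key`, that compares the digit runs in a name as integers. It is one possible way to build a time2url dictionary, not a livecellx utility:

```python
import re

def natural_key(name):
    """Split a name into alternating text and integer chunks for numeric-aware sorting."""
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", name)]

# Hypothetical file names without leading zeros in the time index.
names = ["cell_T10.png", "cell_T2.png", "cell_T1.png"]

string_sorted = sorted(names)                    # T10 lands before T2
natural_sorted = sorted(names, key=natural_key)  # T1, T2, T10

# Map each time index to its file, in the corrected order.
time2url = {t: url for t, url in enumerate(natural_sorted)}
print(string_sorted)
print(time2url)
```

A dictionary built this way can then be passed as `LiveCellImageDataset(time2url=time2url, ext="png")`, the same keyword form used for the DIC images later in this tutorial.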

[3]:
# using the Path class from the pathlib module to work with file and directory paths
dataset_dir_path = Path(
    "../datasets/test_data_STAV-A549/DIC_data"
)

mask_dataset_path = Path("../datasets/test_data_STAV-A549/mask_data")

Loading the mask dataset

We’ll now load the mask dataset using the LiveCellImageDataset class. This dataset holds the segmentation masks that label the cells in our images.

[4]:
mask_dataset = LiveCellImageDataset(mask_dataset_path, ext="png")
mask_dataset.time2url
3 png img file paths loaded;
[4]:
{0: '../datasets/test_data_STAV-A549/mask_data/seg_STAV-A549_VIM_24hours_NoTreat_NA_YL_Ti2e_2022-12-21_T252_XY01_DIC.tif.png',
 1: '../datasets/test_data_STAV-A549/mask_data/seg_STAV-A549_VIM_24hours_NoTreat_NA_YL_Ti2e_2022-12-21_T253_XY01_DIC.tif.png',
 2: '../datasets/test_data_STAV-A549/mask_data/seg_STAV-A549_VIM_24hours_NoTreat_NA_YL_Ti2e_2022-12-21_T254_XY01_DIC.tif.png'}

Organizing and loading DIC images

Next, we organize and load the DIC images, which give a detailed, high-contrast view of cell boundaries.

[5]:
# use the glob module to list the DIC files in sorted order,
# then map each time index to its file path
dic_paths = sorted(glob.glob(str(dataset_dir_path / "*_DIC.tif")))
time2url = {i: path for i, path in enumerate(dic_paths)}
dic_dataset = LiveCellImageDataset(time2url=time2url, ext="tif")
# dic_dataset = LiveCellImageDataset(dataset_dir_path, ext="tif")

We check that the time2url mapping is correct.

[6]:
dic_dataset.time2url
[6]:
{0: '../datasets/test_data_STAV-A549/DIC_data/STAV-A549_VIM_24hours_NoTreat_NA_YL_Ti2e_2022-12-21_T252_XY01_DIC.tif',
 1: '../datasets/test_data_STAV-A549/DIC_data/STAV-A549_VIM_24hours_NoTreat_NA_YL_Ti2e_2022-12-21_T253_XY01_DIC.tif',
 2: '../datasets/test_data_STAV-A549/DIC_data/STAV-A549_VIM_24hours_NoTreat_NA_YL_Ti2e_2022-12-21_T254_XY01_DIC.tif'}
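Since single cells are derived from masks and paired with DIC frames by time index, it can help to confirm that both mappings refer to the same frames. The sketch below is an illustration only: `frame_tokens` is a hypothetical helper, the paths are shortened stand-ins, and the `_T<number>_` pattern is an assumption based on the file names printed above.

```python
import re

def frame_tokens(time2url):
    """Map each time index to the integer in the _T<number>_ part of its file name."""
    tokens = {}
    for t, url in time2url.items():
        match = re.search(r"_T(\d+)_", url)
        tokens[t] = int(match.group(1)) if match else None
    return tokens

# Shortened, hypothetical paths following the naming pattern of this dataset.
mask_time2url = {
    0: "mask_data/seg_A549_T252_XY01_DIC.tif.png",
    1: "mask_data/seg_A549_T253_XY01_DIC.tif.png",
}
dic_time2url = {
    0: "DIC_data/A549_T252_XY01_DIC.tif",
    1: "DIC_data/A549_T253_XY01_DIC.tif",
}

mask_tokens = frame_tokens(mask_time2url)
dic_tokens = frame_tokens(dic_time2url)

# Time indices whose mask and DIC files point at different frames.
mismatched = [t for t in mask_tokens if mask_tokens[t] != dic_tokens.get(t)]
print(mismatched)
```

With the real datasets, the same check would take `mask_dataset.time2url` and `dic_dataset.time2url` as inputs.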

Preparing single cells from mask dataset

Using the mask dataset and DIC images, we will now prepare the single cell data.

[7]:
from skimage.measure import regionprops
from livecellx.core.io_sc import prep_scs_from_mask_dataset
single_cells = prep_scs_from_mask_dataset(mask_dataset, dic_dataset)
single_cells[1].meta
100%|██████████| 3/3 [00:09<00:00,  3.30s/it]
[7]:
{'label_in_mask': 2}

Saving single cells to JSON

Here, we save the single cell data into a JSON file for future use.

[9]:
sc_json_list = SingleCellStatic.write_single_cells_json(single_cells, io_out_dir/"single_cells.json", dataset_dir=io_out_dir/"dataset", return_list=True)

Loading single cells from JSON

Once saved, you can easily load this data back into your workspace.

[11]:
loaded_scs = SingleCellStatic.load_single_cells_json(io_out_dir/"single_cells.json")
loaded_scs[0]
[11]:
SingleCellStatic(id=07c0b946-b728-4da9-bbd6-0136007364ad, timeframe=0, bbox=[ 143.  978.  207. 1044.])

We can print the IDs of the first two single cells:

[12]:
for sc in single_cells[:2]:
    print(sc.id)
ae749ff4-5971-4508-bb1f-14555fd0bd3e
0c0bb021-3a52-41e2-92d8-64ff26dd8eaf

To ensure data integrity, you may want to compare the loaded data with the original:

[13]:
for sc in single_cells:
    for loaded_sc in loaded_scs:
        if sc.id == loaded_sc.id:
            # compare contour, contours are np.array
            assert np.allclose(sc.contour, loaded_sc.contour), f"the difference between sc.contour and loaded_sc.contour is {sc.contour - loaded_sc.contour}, ids are {sc.id} and {loaded_sc.id}"
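The nested loop above passes silently when an id has no counterpart in the loaded list. A possible variant, sketched here with stand-in objects rather than real SingleCellStatic instances, indexes the loaded cells by id and reports both missing ids and changed contours. `compare_contours` is a hypothetical helper; it only assumes that each object exposes `id` and `contour`, as used in the cell above:

```python
from types import SimpleNamespace
import numpy as np

def compare_contours(original_cells, loaded_cells):
    """Return (ids missing after loading, ids whose contours differ)."""
    loaded_by_id = {sc.id: sc for sc in loaded_cells}
    missing, changed = [], []
    for sc in original_cells:
        loaded = loaded_by_id.get(sc.id)
        if loaded is None:
            missing.append(sc.id)
        elif np.shape(sc.contour) != np.shape(loaded.contour):
            changed.append(sc.id)  # shapes differ, so the values cannot match
        elif not np.allclose(sc.contour, loaded.contour):
            changed.append(sc.id)
    return missing, changed

# Stand-in objects for illustration; not real livecellx single cells.
original = [
    SimpleNamespace(id="a", contour=np.array([[0, 0], [0, 1]])),
    SimpleNamespace(id="b", contour=np.array([[1, 1], [2, 2]])),
    SimpleNamespace(id="c", contour=np.array([[3, 3], [4, 4]])),
]
loaded = [
    SimpleNamespace(id="a", contour=np.array([[0, 0], [0, 1]])),
    SimpleNamespace(id="b", contour=np.array([[1, 1], [2, 5]])),
]

missing, changed = compare_contours(original, loaded)
print(missing, changed)
```

Called as `compare_contours(single_cells, loaded_scs)`, two empty lists would indicate that every cell was found and every contour matched.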

SingleCellTrajectory: saving & loading

Tracking single cells for trajectories

To track the movement of cells across different frames, we will now generate trajectories for each cell.

[14]:
from typing import List
from livecellx.track.sort_tracker_utils import (
    gen_SORT_detections_input_from_contours,
    update_traj_collection_by_SORT_tracker_detection,
    track_SORT_bbox_from_contours,
    track_SORT_bbox_from_scs
)


sct_collection = track_SORT_bbox_from_scs(single_cells, dic_dataset, mask_dataset=mask_dataset, max_age=1, min_hits=1)

Saving the trajectories

Finally, we save the generated trajectories for future reference.

[15]:
sct_collection.write_json(io_out_dir/"sct_collection.json")
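As a quick sanity check after writing, the output can be parsed with the standard library `json` module. The sketch below makes no assumption about the layout livecellx writes: `summarize_json` is a hypothetical helper that only reports the top-level type and entry count, and the demonstration runs on a throwaway file with placeholder content:

```python
import json
import tempfile
from pathlib import Path

def summarize_json(path):
    """Parse a JSON file and return (top-level type name, number of entries or None)."""
    with open(path) as f:
        data = json.load(f)
    size = len(data) if isinstance(data, (list, dict)) else None
    return type(data).__name__, size

# Placeholder content for demonstration; not the actual livecellx file layout.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.json"
    demo_path.write_text(json.dumps({"0": {"placeholder": True}, "1": {"placeholder": True}}))
    summary = summarize_json(demo_path)

print(summary)
```

For the file written above, the call would be `summarize_json(io_out_dir / "sct_collection.json")`; a parse error at this point would indicate an incomplete or corrupted write.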