This page introduces a few concepts upon which PickCells is based and will help you get started with using PickCells.
- What is an ‘experiment’ in PickCells?
- How does a PickCells experiment work?
- What kind of data can I generate?
What is an ‘experiment’ in PickCells?
An experiment in PickCells is the analysis of a set of images where meaningful information is extracted, comprehensively annotated and saved in a database that can be shared with others.
Let us explore each of these concepts in more detail:
“Set of images”
Some biological experiments we do in the lab will generate images. For example, we create images to see how an embryo develops over time, or if there is any difference in organisation of mutant cells compared to wild type cells.
These experiments may generate too many images for us to look at them all in detail. Besides, simply looking at images can be deceptive, and most of the rich information contained within images requires quantification to be useful.
So, for such an experiment in the lab, a PickCells experiment allows us to perform quantitative and exploratory image analysis in batch on a whole set of images.
For one PickCells experiment, there is a single biological experiment in the lab. So, by a set of images, we mean all the comparable images acquired on the same microscope and with the same settings for a single biological experiment. Images from experimental conditions and from technical or biological replicates are comparable, while images from distinct biological experiments are not.
When we look at an image, our brain is capable of building a sophisticated mental model of the scene contained in that image. It does so particularly well because of the strong prior knowledge we possess. We know that a tissue is made of cells and that a cell contains only one unique nucleus (most of the time). We also know what a nucleus should look like. A trained person can recognise the body axes of an embryo in an image in just a few seconds, because that person knows what the head and the tail of the embryo should look like.
One phase of a PickCells experiment is to automatically label, or annotate, these recognisable structures in the images that compose the experiment so that image intensities or regions are translated into scenes composed of meaningful objects compatible with the mental model that we, as experimenters, have about our study system. This can be seen as a transfer of our prior knowledge to PickCells, or, as making the computer speak our own language.
With this annotation in place, the novel information that PickCells can generate (e.g. the distribution of median intensities) is translated into information relevant to our experiment (e.g. the distribution of the expression level of a marker of interest in nuclei).
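As a rough illustration of this translation, the sketch below computes one median intensity per labelled object. It is not PickCells code: the tiny image, the label mask (0 for background, 1 and 2 for two nuclei) and the function name `median_intensity_per_object` are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import median

# Hypothetical 4x4 single-channel image (pixel intensities).
intensity = [
    [10, 12, 50, 52],
    [11, 13, 54, 56],
    [0, 0, 58, 60],
    [0, 0, 0, 0],
]

# Hypothetical annotation: 0 is background, 1 and 2 are two segmented nuclei.
labels = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [0, 0, 2, 2],
    [0, 0, 0, 0],
]


def median_intensity_per_object(intensity, labels):
    """Group pixel intensities by object label and return one median per object."""
    pixels = defaultdict(list)
    for intensity_row, label_row in zip(intensity, labels):
        for value, label in zip(intensity_row, label_row):
            if label != 0:  # skip background pixels
                pixels[label].append(value)
    return {label: median(values) for label, values in pixels.items()}


# Without the labels these are just pixel statistics; with them, each value
# reads as "marker level in nucleus 1", "marker level in nucleus 2", and so on.
nucleus_expression = median_intensity_per_object(intensity, labels)
print(nucleus_expression)
```

The point of the sketch is only that the annotation (the label mask) is what turns a number into a statement about a nucleus.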
In addition to the ‘image to biological object translation’ described above, anything that can be documented can be added as annotations in a PickCells experiment. For example, an experiment is generally composed of multiple experimental conditions with both biological and technical replicates. A PickCells experiment allows us to define the structure of our experimental design. This annotation is essential to be able to compare experimental conditions with one another, but it can also be useful to help us to share our findings with our colleagues. It should be relatively easy for them to understand and explore the data generated by our experiment. It also makes it easier for us to figure out what we have done when we come back to our complex experiments in the future.
“Saved in a database”
PickCells experiments are saved within a database, so that we can work on our experiment over time, pause work on our experiment and return to it at a later date, and share our experiment with our colleagues.
The type of database that PickCells uses is a graph database.
How does a PickCells experiment work?
Performing an image analysis involves completing a sequence of tasks. PickCells offers a collection of modules, which are the functional units of PickCells, each of which can accomplish a specific analysis task. As the administrator of a PickCells experiment, our role is to activate and configure modules in a given order to carry out our desired analysis.
An example could be as follows, where the objective is to analyse the relative level of expression of a protein in 2 compartments of the cells (comparing nucleus versus cytoplasm):
The PickCells modules required to complete our analysis are shown at the top as square boxes and in order of activation from left to right.
The data that are created, edited or used by the modules are shown as ellipse nodes: Images, Nucleus, Cytoplasm, Relationship.
The arrows show the relationships between modules and data:
- Green: data used as input to the module.
- Red: data output from, or created by, the module.
- Orange: data edited, or updated, by the module.
- Blue: relationship between two types of object (discussed in more detail below, in What kind of data can I generate?).
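To see why the order of activation matters, here is a minimal sketch of such a workflow as plain Python data. It does not use the PickCells API. The module names are hypothetical, and only the data names (Images, Nucleus, Cytoplasm, Relationship) come from the description above. Each module declares which data it uses, creates and edits, and a small check confirms that every module finds its inputs already available.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    name: str
    uses: set = field(default_factory=set)     # data read as input
    creates: set = field(default_factory=set)  # data output by the module
    edits: set = field(default_factory=set)    # existing data updated by the module


def check_activation_order(modules, initial_data):
    """Return the data available at the end, or raise if a module runs too early."""
    available = set(initial_data)
    for module in modules:
        missing = (module.uses | module.edits) - available
        if missing:
            raise ValueError(f"{module.name} needs {sorted(missing)} first")
        available |= module.creates
    return available


# Hypothetical module names, listed in order of activation.
pipeline = [
    Module("segment nuclei", uses={"Images"}, creates={"Nucleus"}),
    Module("edit nuclei", uses={"Images"}, edits={"Nucleus"}),
    Module("segment cytoplasm", uses={"Images", "Nucleus"},
           creates={"Cytoplasm", "Relationship"}),
    Module("measure intensities", uses={"Images"}, edits={"Nucleus", "Cytoplasm"}),
]

final_data = check_activation_order(pipeline, initial_data={"Images"})
print(sorted(final_data))
```

Activating the same modules in reverse order would fail the check, because the measurement step would try to edit nuclei that have not been created yet.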
What kind of data can I generate?
With PickCells you are able to generate classical data tables where each row corresponds, for example, to one cell, and each column to a type of measurement, for example, the cell volume or the average intensity in one particular channel.
Now, very often the relationships between diverse types of objects are as important as the objects themselves. For example, imagine an experiment which consists of grafting different cell populations isolated in the lab into a damaged tissue in order to test if the cells can repair a lesion. To determine which cell population is the most effective, the following questions can be asked:
- How many grafted cells effectively survived?
- To what extent has the lesion been repaired?
- Is there any evidence that the grafted cells directly remodelled the lesion?
In order to answer these questions, we need to detect the cells in the image and also the contour of the lesion, which needs to be documented at the start and at the end of the experiment. Having data tables for each of these objects can help us answer questions 1 and 2. However, for question 3, it is useful to determine the relative positions of the grafted cells with respect to the various regions of the lesion (repaired or still damaged). If the grafted cells are found preferentially on repaired regions, this might provide a good indication that the cells can remodel the damaged tissue.
Here is a representation of the dataset we would like to obtain:
This needs to be done for each cell population that we want to test. As we also want to make sure that our results are reproducible, several grafts (and, therefore, their images) are obtained for each population. Our dataset becomes very complex if we have to handle multiple separate data table files.
PickCells solves this issue by organising the data into a property graph in order to keep everything tidy and easily searchable (for simplicity, only image 3 is ‘developed’, and it only contains 3 cells):
In the property graph above, each node corresponds to one object in the experiment. Each node has an associated list of properties. For example:
- An ‘Image’ node can have the properties: name, dimensions, acquisition time, number of channels, etc.
- A ‘Cell’ node can have the properties: Marker intensity, volume, shape, etc.
For each object type, categories can be created, grouping together objects of that type. These are represented by the boxes surrounding the nodes in the graph above, which includes the following categories:
- ‘Population1’: ‘Image’s in Population 1.
- ‘Population2’: ‘Image’s in Population 2.
- ‘Repaired’: ‘Region’s that are repaired.
- ‘Damaged’: ‘Region’s that are damaged.
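The node, property and category vocabulary can be made concrete with a small sketch. This is an illustrative data model, not how PickCells stores its data: the `Node` class, the `to_table` and `in_category` helpers, all property values and the assignment of ‘Image3’ to ‘Population2’ are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_type: str  # e.g. "Image", "Cell", "Region"
    name: str
    properties: dict = field(default_factory=dict)
    categories: set = field(default_factory=set)


# All values below are invented for illustration only.
nodes = [
    Node("Image", "Image3", {"channels": 2}, {"Population2"}),
    Node("Cell", "Cell1", {"volume": 310.0, "marker_intensity": 12.5}),
    Node("Cell", "Cell2", {"volume": 295.0, "marker_intensity": 48.0}),
    Node("Cell", "Cell3", {"volume": 402.0, "marker_intensity": 15.1}),
    Node("Region", "Region1", categories={"Repaired"}),
    Node("Region", "Region2", categories={"Damaged"}),
]


def to_table(nodes, node_type, columns):
    """Build a classical table: one row per node of a type, one column per property."""
    return [
        [node.name] + [node.properties.get(column) for column in columns]
        for node in nodes
        if node.node_type == node_type
    ]


def in_category(nodes, node_type, category):
    """Names of the nodes of a given type that belong to a category."""
    return [
        node.name
        for node in nodes
        if node.node_type == node_type and category in node.categories
    ]


cell_table = to_table(nodes, "Cell", ["volume", "marker_intensity"])
repaired_regions = in_category(nodes, "Region", "Repaired")
print(cell_table)
print(repaired_regions)
```

Seen this way, a classical data table is just one view of the graph: the nodes of a single type, with their properties laid out as columns.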
The property graph allows us to create tables for a specific population by telling PickCells how to navigate the property graph. For example, if we want ‘All the cells in Image3’, we can define a query which starts at ‘Image3’ and collects all ‘Cell’s related to ‘Image3’ via the ‘IDENTIFIED IN’ relationship:
Alternatively, we could ask for all the ‘Cell’s in ‘Image3’ which are found in the ‘Repaired’ region of the lesion. These are identified by searching for ‘Cell’s that are in both ‘Image3’ and in a ‘Region’ such that the ‘Region’ has been categorised as ‘Repaired’ and the ‘Region’ is related to a ‘Lesion’ related to ‘Image3’. In this example, there is one such ‘Cell’, ‘Cell2’.
When searching in this way, the relationships between objects can be traversed in either direction.
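The two searches described above can be sketched as plain graph traversals. This is not the PickCells query interface. Only the ‘IDENTIFIED IN’ relationship is named in the text; the ‘LOCATED IN’ and ‘PART OF’ relationship names, the node names ‘Lesion1’, ‘Region1’ and ‘Region2’, and the exact wiring of the edges are assumptions chosen to match the example.

```python
# (source, relationship, target) triples for the 'Image3' part of the graph.
edges = [
    ("Cell1", "IDENTIFIED IN", "Image3"),
    ("Cell2", "IDENTIFIED IN", "Image3"),
    ("Cell3", "IDENTIFIED IN", "Image3"),
    ("Lesion1", "IDENTIFIED IN", "Image3"),
    ("Region1", "PART OF", "Lesion1"),     # hypothetical relationship name
    ("Region2", "PART OF", "Lesion1"),
    ("Cell2", "LOCATED IN", "Region1"),    # hypothetical relationship name
    ("Cell3", "LOCATED IN", "Region2"),
]
node_types = {
    "Image3": "Image", "Lesion1": "Lesion",
    "Region1": "Region", "Region2": "Region",
    "Cell1": "Cell", "Cell2": "Cell", "Cell3": "Cell",
}
categories = {"Region1": {"Repaired"}, "Region2": {"Damaged"}}


def neighbours(node, relationship, direction):
    """Follow a relationship from `node`: 'out' goes source to target, 'in' the reverse."""
    if direction == "out":
        return {t for s, r, t in edges if s == node and r == relationship}
    return {s for s, r, t in edges if t == node and r == relationship}


def cells_in_image(image):
    """Start at the image and walk 'IDENTIFIED IN' backwards, keeping only cells."""
    return {
        n for n in neighbours(image, "IDENTIFIED IN", "in")
        if node_types[n] == "Cell"
    }


def cells_in_repaired_regions(image):
    """Cells of the image located in a 'Repaired' region of a lesion of that image."""
    result = set()
    for cell in cells_in_image(image):
        for region in neighbours(cell, "LOCATED IN", "out"):
            if "Repaired" not in categories.get(region, set()):
                continue
            for lesion in neighbours(region, "PART OF", "out"):
                if image in neighbours(lesion, "IDENTIFIED IN", "out"):
                    result.add(cell)
    return result


print(sorted(cells_in_image("Image3")))
print(sorted(cells_in_repaired_regions("Image3")))
```

Note that `neighbours` takes a direction: the first query walks ‘IDENTIFIED IN’ against its direction (from image to cells), while the second follows relationships forwards from each cell back to the image.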
Property graphs provide both an intuitive way to model real world problems and a powerful and scalable data structure which can be efficiently queried to identify novel patterns in the data.
A good place to learn more about how property graphs are used to model and interrogate real-world problems is the Neo4j GraphGist pages.