July 2021 – Real Data Digital Pathology blog

Uploading and downloading

The Pathomation software platform for digital pathology and virtual microscopy offers powerful slide presentation and interaction capabilities.

Much of the focus (especially with respect to end-users) is on slide visualization. But workflow can be organized through our platform as well.

In this article we focus on upload- and download-capabilities. We distinguish between three different mechanisms. We also provide background and insights into how and why we decided to provide the functionality in this fashion.

PMA.transfer

For many cases, PMA.transfer is your initial workhorse of choice. It’s a user-friendly end-user facing application with many features. For ad hoc slide transfers from your local hard disk to a PMA.core instance, PMA.transfer is the go-to tool.

There are two ways to obtain PMA.transfer: you can download it individually from its own website, or it can be installed in combination with your latest PMA.start download during the installation procedure.

If you already have PMA.start up and running and don’t want to go through the trouble of downloading the complete setup package again, you can download PMA.transfer individually through its own website at https://www.pathomation.com/pma.transfer. From there, you can also select individual versions of the software.

Regardless of how you obtain PMA.transfer, the software does require you to have PMA.start up and running, to function itself.

The reason for this is that PMA.transfer relies on PMA.start to tell its what slides reside on the hard disk. PMA.start already has all the logic on board for slide data processing, and we didn’t want to re-implement all this code in PMA.transfer. The result is that PMA.transfer knows what a slide is on your hard disk through PMA.start. You need not concern yourself figuring out if it’s a single- or multi-file file format that you’re working with. PMA.transfer deals with slides and that’s it. Do read our other blog article to find out why virtual slides are so big and complicated in the first place.

Transferring slides with PMA.transfer

Once you have PMA.transfer open, you see the slides on your own hard disk in the left-hand panel. You can connect to any PMA.core instance that you have the credentials for. If you find yourself re-connecting to the same server again and again, we recommend that you use the site manager. You can also use the site manager for complex scenarios like geo-replicated PMA.core instances.

You can both upload and download slides with PMA.transfer. What it is exactly that you do depends a bit on your point of view. Typically upload goes from your local computer to the PMA.core server; download is the reverse (from PMA.core to your hard disk). But there are ways to rig PMA.transfer so it operates between two server instances of PMA.core instead of interfacing with just your local computer. Essentially, you’re simply transferring slides from one location to another. The same API calls are being user underneath the hood regardless of whether you’re doing one or the other. More on that API later in this article by the way. Stay tuned.

We tried to make PMA.transfer extremely low-threshold and userfriendly, which means that there’s usually more than one way to get something accomplished. For instance, you can transfer slides from one side to the other either with context-menus, or via drag and drop. Selecting multiple slides works the same way as you’re used to from the Windows Explorer with Ctrl+Click and Shift+Click actions.

A full manual of PMA.transfer workings, interfaces, and best practices is available as an online wiki.

PMA.core

As mentioned above: PMA.transfer can be configured for site to site transfer. But even in that scenario, it still relies on PMA.start. This isn’t always convenient or even possible. Therefore; PMA.core 2.0 and higher contains its own slide transfer interface. It’s not as feature-rich as PMA.transfer, but if you quickly want to copy a large volume of slides from one server to another, it will get the job done. Plus, you can use this to copy slides from anywhere to anywhere; Migrating your old FTP server to updated S3-based cloud storage becomes a breeze. You can do it remotely, and asynchronously: just close your browser after initiating the operation and check back later.

Back-end API features

Occasionally, we get the question from people “ok, but I don’t want drag and drop stuff. I have an [incoming] folder somewhere on my system, and I want to automatically transfer all of those slides to their final destination overnight, when network use is low and I don’t feel like I’m hogging other people’s bandwidth…”. They then ask for the API call to make this happen.

If you’re one of these people: you’re on the right track. But what you’re really asking for is automation. Bear with us here. The API is part of that, but not the whole story.

See, these are the various API calls involved in slide transfer:

PMA.transfer uses these; and so does PMA.core. They work. But we still highly recommend that you do not engage in calling these methods yourself. Instead, you should rely on the SDKs instead.

The reason for this is related to the underlying structure of the virtual slides themselves. There can be many files, they can be big. Therefore, the API methods are mostly involved with uploading chunks or partial slide data to PMA.core. We understand that there may be instances where it is convenient to have direct interaction with these, to perhaps build progress indicators to monitor a processing workflow. For those uses, we recommend studying the implementation of the Core::upload() methods in PMA.python.

Automation

We illustrate here how slide transfer automation can work for you by means of a Jupyter notebook utilizing the PMA.python SDK.

The Core module contains both upload() and download() methods, which each rely on PMA.start.

When using the upload() function, the SDK assumes you’re transferring slides from PMA.start to PMA.core. Therefore, it’s good to do some preliminary verification and make sure that everything is in place before starting your transfer:


from pma_python import core
sourceSession = core.connect()
targetSession = core.connect("https://test.pathomation.com/ pma.core", "username", "scrt")
print(sourceSession, targetSession)

This block of code should result in two meaningful strings. If not, you can already interrupt the flow and send an email to a system administrator, informing him or her that something is wrong.

After confirming the connection, you should also check that both your source folder on your local hard disk (this can be the folder where your scanner deposits new WSIs).

Since you’re handling two PMA.core instances (PMA.start utilizes a limited of PMA.core, too), it’s a good idea to explicitly mentioned the PMA.core sessionID values as optional arguments in your code (we typically don’t do this when only interacting with a single instance; remember: good programmers are not only lazy but also know when to be):

core.get_directories("C:/", sessionID = sourceSession)
core.get_slides("C:/wsi", sessionID = sourceSession)
core.get_directories("_sys_ref/", sessionID = targetSession)
core.get_slides("_sys_ref/experiment", sessionID = targetSession)

Once you’re assured that your source data is available and your target folder is ready (and doesn’t have the data you’re looking for yet), it’s time to do the actual upload:

core.upload("C:/wsi/OS-3.ndpi", "_sys_ref/experiment", targetSession)

This call is spectacularly unimpressive. But behind the scenes, it figures out which files belong to the slide, and transfers them one by one to your target destination,. Large files are automatically split in smaller blocks.

And of course this can be made a lot more complex. This is the step where the magic happens: you can build a loop around all the slides found in the source folder, and you can add checks to make sure the target folder is empty to begin with (or at least doesn’t contain the to be transferred slide yet).

Regardless, after the transfer is complete, you want to verify that everything went right.

You instinct probably tells you to compare the ImageInfo objects on both sides:

However, this is not a good idea, as the ImageInfo will actually differ, as it contains a couple of location-dependent and slide-specific criteria as well. Comparing the information then just becomes confusing.

Rather, we recommend that after transferring a slide, you merely pull up the fingerprint for each slide:

core.get_fingerprint('_sys_ref/experiment/OS-3.ndpi', sessionID = targetSession)
core.get_fingerprint('C:/wsi/OS-3.ndpi', sessionID = sourceSession)

Comparing these two strings is a lot easier, whether through visual (interactive) or automated detection:

Which way do I take?

Due to their size and file structure, whole slide images are complicated data-beasts. ~~Wrestling~~ Managing them takes some time and practice.

The Pathomation software platform for digital pathology and virtual microscopy recognizes that both simple and complex slide transfer scenarios emerge from daily practice. Therefore, we offer different tools and routes to efficiently transfer slides, depending on which category of user you fall into and the scenario you with to implement.

Month: July 2021

Three ways to transfer your virtual slides