Bioinformatics is coming

The back-story of Pathomation is well-known: once upon a time, there was an organization that had many WSI scanners. All these scanners came with different pieces of software, which made it inconvenient for pathologists to operate digitally. After all, you don’t alternate between Microsoft Office and Apache OpenOffice depending on what kind of letter you want to write and which department you want to address it to.

Tadaaaah, ten years later Pathomation’s PMA.core tile server acts as a veritable Rosetta stone for a plethora of scanner vendors, imaging modalities, and storage capacities alike.

There’s a second story, too, however: Pathomation could not have been founded with a chance encounter of a couple of people with exactly the right background at the right time. As it happened, in 2012, the annual Digital Pathology Association in Baltimore (coinciding with a pass-through of hurricane Sandy) had a keynote speech by John Tomaszewski about the use of bioinformatics tools to (amongst others) determine Gleason scores on prostate slides. One of the co-founders of Pathomation was in the audience and thought “what a great idea…”.

Bioinformatics meets digital pathology

Pathomation reached out to the bioinformatics community early on. Within the ISCB and ECCB communities however, interest was initially low: only one or two talks (or even posters) at the Vienna, Berlin, and Dublin editions discussed anything even remotely related to microscopy. The few people that did operate in this intersection of both fields, expressed mostly frustration having to spend a seemingly outrageous amount of time just learning how to extract the pixels from the raw image data files.

A simple observation emerged: bioinformatics could contribute a lot more to digital pathology, if only we could make it easier to port the data back and forth between the two communities. Say: A universal honest broker for whole slide imaging.

But having such software (in the form of its CE-IVD certified PMA.core tile server and freeware “starter kit” PMA.start) is not enough. We still had to get the word out.

So Pathomation next set out on its own and started organizing its own digital / computational pathology workshops. Coinciding with the European-based bi-annual ECCB events in The Hague (The Netherlands) and Athens (Greece) The proceedings of these are still available online.

A maturing relationship

Fast forward to 2022 and things are a bit different.

As Pathomation has forged on as a middleware provider, so has bioinformatics gradually started to contribute to digital pathology. As we already hypothesized 10 years ago: the datasets in pathology (and by extension: microscopy, histology, time lapse data, MSI etc) are too interesting not to be explored with tools that have already contributed so much to other fields like genetics and medicine.

When you go to OUP’s leading Bioinformatics journal and search for “pathology”, more than 10,000 hits spring up.

Correspondingly, a search for “bioinformatics” in the Journal of Pathology Informatics (JPI) yields significantly fewer results. That’s not unexpected, as oftentimes “bioinformatics” wouldn’t be mentioned as a wholesale term, but one would rather reference a particular protocol, method or technique instead. Chubby Peewee, anyone?

The relationship between digital pathology and bioinformatics is clearly well established and maintained.

Building bridges

Occasionally we’re asked what we offer in the field of image analysis (IA). We’ve always maintained our stance that we don’t want to get into this directly ourselves by means of offering an alternative product (or software component) to well established packages like ImageJ, QuPath (open source) or Visiopharm, HALO, or Definiens (commercial).

Instead we’ve pursued our mission of becoming the one true agnostic broker for digital pathology. This means that we look at all of these (and more) and determine how we can best contribute to help people transfer information back-and-forth between different environments.

SDKs

Many in silico experiments start of as scripts. In bioinformatics, one of the first extensively supported scripting languages was Perl, in the form of the BioPerl framework. It still exists and is in use today, but (at the risk of alienating Perl-aficionados; we mean no offense) has been surpassed in popularity by Python (and BioPython).

Looking at the success of (bio)python, the first language we decided to support in the form of providing an SDK for it was Python: our PMA.python library is included in the Python Package Index (PyPI), and very easy to install through an interactive framework like Jupyter or Anaconda.

Several articles on this blog tackle digital pathology challenges by means of our own PMA.python library, and can be used as an inspiration for your own efforts:

All our SDKs are available as open source through GitHub, and since PMA.python, we’ve also added on PMA.java and PMA.php.

We have a development portal with sections on each programming language that we support, and we use the respective “best practice” mechanisms in each to provide exhaustive documentation on provided functions and methods as well.

Plugins

The plugin concept is a powerful one: many companies that build broadly used software have realized that they can’t themselves foresee all uses and provide a way for third-party providers to extend the software themselves through a plugin mechanism. In fact, software has a better chance of becoming broadly used when it offers this kind of functionality from the start. It’s called “the network effect”, and the network always wins.

Not everybody wants (or needs) to start from the ground up. There are many environments that can serve as a starting point for image analysis. Two specific ones related to digital pathology are ImageJ and QuPath. As we needed a proving ground for our own PMA.java SDK anyway, we decided to work towards these two environments. Today, we have plugins available for both.

Our plugins are available from our website, free of charge. You can download our plugins for image analysis, which are bioinformatics related. For more mundane applications, we also offer plugins to content management systems (CMS) and learning management systems (LMS).

The PMANN interface

If there is no plugin architecture available, oftentimes an intermediary file can be used to exchange data. We apply this technique ourselves in PMA.studio, where a virtual tray can persist over times by exporting it to and importing it back from a CSV file as needed.

So it is with commercial providers. Visiopharm has its own (binary) MLD file format to store annotations in, and Indica Labs’ HALO uses an XML-based file.

What you typically want to do, is run an algorithm repeatedly, with slightly different parameters. This results then in different datasets, that are subject to comparison and often even manual inspection or curation.

The PathoMation ANNotation interface assists in this: As the analytical environments are oftentimes unsuited for curators to work with (both in terms of complexity as well as monetary cost), a slide in PMA.core can be instructed to look for external annotations in various locations. You can have slide.svs associated with algo1.mld,  algo2.mld, and algo3.mld. You can interpret the overlaying datalayers computationally, or you can visualize them in an environment like PMA.studio.

What’s more: PMANN allows you to overlay data from multiple environments simultaneously. So you can do parameter optimization in one environment, but you can also run an algorithm across different environments and see which environment (or method like deep learning or random forest) performs the best.

Don’t reinvent the wheel

Bioinformatics is a much more mature field than digital pathology, with a much broader community. At Pathomation, we strongly believe that many techniques developed are transferable. Data-layers and data-sets are equally apt to enrich each other. Genetics and sequencing offer resolution, but tissue can add structure as an additional source of information to an experiment outcome, and help with interpretation. The work of Ina Koch that we referenced earlier is a typical example of this. Another great example is the TCGA dataset, which has been incorporating whole slide images as well as sequencing data for a couple of years now.

At Pathomation, we aim to offer connectivity. Connectivity to different slide scanner vendors (by supporting as many file formats as possible), but also by allowing different software tools to exchange data in a standard manner. To this end, Pathomation also recently signed up for the Empaia initiative.

We strongly believe that as a physician or researcher alike, you should be able to focus on what you want to do, rather than how you want to do it. We build middleware, SDKs, and plugins to help you do exactly that.

Do you have experience with coupling bioinformatics and digital pathology? Have you used our software already in your own research? Let us know, and you may find yourself the subject of a follow-up blog post.

What we do

Middleware

At the end of the day, what we do is straightforward: Pathomation makes middleware software for digital pathology.

Now depending on who you talk to, one or more terms in that statement may take some explaining:

  • Pathomation: it’s not phantomation (ghostbusters, anybody?), photomation (photonics, quantum physics; nah, to be honest we’re probably not smart enough)… It’s Pathomation, from Pathology – Automation.
  • Middleware software: Middleware acts as a broker between different components. Your typical example of middleware would be a printer driver, which allows a user to convert text and pixels from the computer screen to ink on paper. On that note, “pixel broker”, is another way how we like to describe ourselves. The slide scanner converts the tissue from the glass slide into a (huge!) collection of pixels, and we make sure these pixels can be read and presented properly, regardless of what scanner was used to generate them, and regardless of where they are stored
  • Digital pathology: it started in the 60s at Massachusetts General Hospital in Boston with tele-pathology. In the 2000s, engineers figured out that they could automate microscopes to take sequential pictures of various regions of interest on glass slides, and then stitch those to create (again: huge!) compound images that would let pathologists bypass the traditional microscope altogether and use the computer screen as a virtual microscope.
  • Pathology: the medical sub-specialty that diagnoses disease at the smallest level. Up to 70% of therapeutic treatment is attributable to one pathological exam or another. That’s… huge actually, and it’s why it’s all the more important to make the pixels from the scan flow to that storage location to that computer screen as smooth as possible.

But here’s another way of looking at is: Pathomation is the Rosetta Stone to your digital pathology content.

Pathomation is a rosetta stone for digital pathology

We make middleware software that optimizes the transport of pixels. We show the pathologist the pixels he or she needs, when he or she wants, where he or she desires to have them.

The central piece of software we develop for that is called PMA.core, and on top of PMA.core we have a variety of end-user front-end applications like PMA.slidebox, PMA.studio, or PMA.control.

Growing software

We didn’t start off with these though. So bear with us as we take a little trip down memory lane.

Once you have a successful piece of software, it doesn’t take long for people to ask “hey, can I also use it for this (or that)?”. Therefore, on top of PMA.core, we built several Software Development Kits (SDKs) that make it easier for other software developers to integrate digital pathology into their own applications (image analysis (AI/DL/ML), APLIS, LIMS, PACS…).

The next question is: “I don’t know how to program. I just have a WordPress blog that I want to embed some slides in.” So we wrote a series of plugins for various Content Management Systems (CMS), Learning Management Systems (LMS), and even third-party Image Analysis (IA) tools.

Eventually, we got into integrated end-user application development, too. As far as we’re concerned, we have three software packages that cater to end-users:

  • PMA.slidebox caters to educational applications whereby people just want to share collections with students, conference participants, or peers. A journal could benefit from this and publish virtual slides via a slidebox to serve as supplemental material to go with select author papers.
  • PMA.studio wants to be a pathologist’s cockpit. You can have slides presented in grid layouts, coming from different PMA.core servers. But if you’re an image analyst working with whole slide images (WSI), that works, too. Integrate data sources, and annotations. Have a video conference, and prepare high-resolution images (snapshots) for your publications… Do it all from the convenience of a single flexible and customizable user interface. If you’re an oncologist or surgeon that only occasionally wants to look along, PMA.studio may be a bit of overkill. But to build custom portals for peripheral users, you have those SDKs of course.
  • PMA.control goes above and beyond the simple collections that you place online with PMA.slidebox. With PMA.control, you can manage participants, manage what content they see at that what time, organize complex training sessions, organize the development and evaluation of new scoring schemes etc. With PMA.control, you are in… control.

A platform

Pathomation solves one problem really well. Because the problem manifests itself in many ways, we offer a number of routes to address it.

Different scanners output different file formats, that need to be distributed in different ways.

Due to the diverse applications of virtual microscopy and digital pathology, we conceived our own toolbox different from the beginning. Instead of a single application, we set off from the start to build a platform instead.

Pathomation offers a software platform.

  • You start with PMA.core, which handles virtual slides, user management, data capture, and audit trailing. All applications need these capabilities.
  • On top of PMA.core comes a range of connectable components. These can take the form of an application (desktop, web, mobile), but can also be connecting pieces to other external software. We do not offer Image Analysis, but we give you the right connectors to pass your data on to your favorite AI environment. We do not have our own LIMS, but we offer technical building blocks (APIs) for independent vendors to embed slide handling capabilities on their end on top of our central PMA.core.

So whether you’re looking to build custom portals for your own user-base (PMA.UI framework), or are looking for a one-stop solution to evaluate new histological scoring protocols (PMA.control); we have you covered.

Contact us to continue the conversation!

 Thoughts on Software Development Life Cycle (part 3)

This is part 3 in a series of software development. The earlier parts are available here:

In this episode, we want to pick up where we left of, and discuss how a small company like ourselves, can effectively still have an efficient software validation process, without slowing things down and ending stranded in a hopeless bureaucracy.

Indeed it’s a bad joke that, once you have your software validated, (CE) certified, and government approved, you stop developing it any further, because the overhead (i.e. cost) of going through the whole process again, doesn’t justify the (typically small) incremental gains that are to be had from releasing a new version.

No Sir, at Pathomation we like to be creative and innovate, and we’ve always said that administration should not stand in the way of that. So what did we do? Like with our original software development itself, we looked at automation to provide part of the answer.

Remember the schematic from last time?

We came up with a whole datababase schema to put behind this and optimize workflow from one type of documentation to the next.

Custom granularity is what we’re after here, and we like to think we come pretty close!

Because SSMS isn’t known for user-friendly data entry, we also built a set of PHP scripts around it, so we can easily populate it (and most importantly: never have to worry about synchronizing Word-documents by hand again!)

We’ll use parts of the PMA.core project for sample data and illustrate our approach throughout the rest of this article.

Oh, and if you’re curious about the other product names you see in the list: have a look at our product portal.

Requirements and specifications

One of the key documents to keep track of when developing software are the user requirement specifications (URS).

After writing out a general description of what our software is supposed to do (the System Overview document – .sov extension in the schema), we can start to  capture in more granular detail what the different bells and whistles are going to be.

There’s a lot of room for interpretation at this level: Since PMA.core’s bread and butter is supporting as many different slide scanners as possible; Each file format is a separate URS entry.

For each URS, a specification is written up. Subsequently, tickets can be assigned to it. Tickets can originate from different locations:

Talkback is our historical original ticketing system based Corey Trager’s excellent BugTracker.Net project.

As we grew our team, we outgrew BugTracker and upgraded to Jira.

Features can be the result of a helpdesk-ticket. Keeping in mind what requirements originate from actual user requests (and not Pathomation’s CTO’s crazy brain) is useful to prioritize.

A completely annotated URS ends up looking like this:

We have all information in a single page. What a difference from a few years ago, where we had to puzzle these pieces together from several Word-documents.

When are you done with your software? When you’ve written sufficient tests to prove that all your user requirement specifications effectively work as intended.

Risk assessment and testing

In order to get a grasp on what “sufficient” testing means, a risk assessment has to occur first.

We provide product owners with a wizard-like approach to determine the risk analysis for each URS individually. Let’s have a look at this one:

The Risk Analysis then becomes as follows:

Do this for each URS, and you can come up with a granular test plan. In the future, we’d like to couple this back to the URS detail screen itself: A feature with high-risk, should have 3 tests; a feature with medium-risk, should have 2 tests; and a feature with low-risk can be sufficiently documented by providing a video.

Reporting

Remember our original Word-mess?

We (and you too) still need these documents at the end of the day for filing. You can complain about paper generating, but when to think about it: it absolutely makes sense to still produce these textual snapshots.

Just look at it from the other side: Imagine you’re a regulatory agency and you get applications from 100 companies. Each company deposits a database and a set of scripts to interface it with the message “oh, just get the scripts up and running and you’ll get all our information. it’s super-easy, barely an inconvenience”. The only alternative then is for the agency to provide its own templates for all 100 companies to fill out.

Luckily that direction is easy from where we’re standing. What we do to provide the necessary regulatory documentation is extract the proper information from the database, format it properly as HTML, and then print those webpages to PDF documents.

A traceability matrix? Well, that’s just a matter of a couple of outer joins and summarize the outcome in a table.

Documentation. Validation. Done.

What’s left?

We used the technique described in this article to obtain our self-certified CE-IVD label for our PMA.core software.

We also worked with external independent consultants to make sure that our technique would indeed withstand outside scrutiny. After all, any software developer can self-certify their own software.

It’s important to note that in addition to keeping track of a number of documentation items, we also performed a validation study in two hospitals, with two different slide scanners. This confirmed everything we did so far, indeed applied to a clinical setting as well.

Now it’s onto PMA.core 3.1. This will be a minor release, with a focus on additional security features like LDAPS and improved IAM / EC2 integration.

For PMA.core 3.1, we’re not doing the whole process described in here from scratch. How we solve this problem then in an incremental fashion, is food for a next article.

Thoughts on Software Development Life Cycle (part 2)

See our earlier article on the processes that we developed at Pathomation to improve our software development practices.

Software validation

Much has been written about software testing, verification, and validation. We’ll spare you the details, subtleties, and intricacies of these. Fire up your favorite search engine. Contact your friendly neighborhood consultant for a nice sit-down fireside chat on semantics if you must.

Suffice it to say that validation is the highest level of scrutiny you can throw against your code. And, like many processes, at Pathomation, we take a pragmatic approach.

Pragmatic does not need to mean that we bypass rules or cut corners. Au contraire.

The need

The process sometimes gets a bad rep. If software validation is so involving, why even do it at all? After all:

  • You have a ticketing system like jira or Bugzilla in place, right?
  • You have source code control like git or svn in place, right?
  • Your developers can just test their code properly before they commit, right?
  • Right?…

Anybody with any real-life experience in software development knows it’s just not that simple.

At Pathomation, we have jira; we have git; we add Slack on top of that for real-time communication and knowledge transfer; etc. And yet, there was something missing.

Consider the following typical problems during non-validated SDLC:

  • Regression bugs
  • Incorrect results
  • Wrong priorities
  • Bottlenecks and capacity planning problems

Are you rolling your eyes now and thinking “Du-uuuuh”? Do you think these are just inherent to software development in general? Well, let’s see if there’s a way to reduce the impact of these at least a bit.

GAMP

Writing software is sort of like a manufacturing process. So with terms like GLP (Good Lab Practice) and GMP (Good Manufacturing Practice) already in existence, it made sense to expand the terminology to GAMP, which stands for Good Automated Manufacturing Process (and yes, is derived from GMP).

In essence, GAMP means that you’ve documented all the needs that your software addresses, have processes in place to follow-up on the fullfillment of these needs (the actual code writing process), and subsequently have a system in place to provide that the needs are effectively met.

GAMP helps organizations to prove that their software does what you say it does.

There a different levels of GAMP, and as the levels increase, so does the burden of proof.

  • Gamp levels 1-3 are reserved for really widespread applications. Think about operating systems and firmware. When you install Windows on a server, you expect it to do what it does. You may still check out a couple of things (probably specific procedures tied to your unique situation), but roughly speaking you can rely on the vendor that the software does what you expect it to do, and that bugs will be handled and dealt with accordingly.
  • Gamp level 4 is tied to software that can still be considered COTS, but is somewhat less widespread than, say, an operating system. It may be an open source application that you think is a good fit in your organization: it may have a wide user base, but it’s hard at the same time to beat the resources of the big tech companies. A certain level of scrutiny is still warranted.
  • Gamp level 5 is for niche software applications. It requires the highest level of checks, tests, and reporting. To some extent, everybody that builds their own software (including the big techs) is expected to do their own Gamp 5 validation.

We like to brag that we see a lot of users. But regardless how many satisfied users we have, we’ll never come even close to software that has the user bandwidth of Microsoft Office, Google Chrome, or even specialty database management systems like MongoDB or Neo4J.

PMA.core (including the PMA.UI visualization framework) is niche and custom software. Therefore, all of its derived components must go through extensive Gamp 5 validation procedures.

Next stop: Amazon. Read instructions. Clear.

Hard times and manual labor

In principle, it’s all very simple: you document everything that you do and provide evidence that you’ve done it, and that at the end of the day things work as planned and expected.

But at the very least, you need somebody to monitor this entire process, and, most importantly: keep it going. So we did contract with an external organization, and it sort of worked. That is, after a lot of frustration, we ended up with a list of documents that was good enough to claim that version 1.0 of our software was now validated:

The experience was not a fun one; nor a creative one; nor a productive one; nor… There were many things it wasn’t. In typical Pathomation (remember, we’re rebels at heart) we started wondering how we could improve the process. We identified two major bottlenecks:

  • Lack of involvement: it’s all too easy to throw money at a problem hoping that it will go away. It doesn’t. Read our separate rant about consultants for a somewhat different perspective on the consultancy world.
  • Inefficient procedures. No, wait, that’s too polite. How about hopelessly obsolete antiquated workflows? Getting there. Except for the word “flow”. What we did; it didn’t flow at all; think of molasses flowing; or lava…

Essentially we ended up sending back and forth a bunch of Word document. A lot of them… and they were long…

And you dread the moment when you want to add anything afterwards, because that involves making modifications in long documents that all need to reference each other correctly. Like below:

A user requirement specification (URS) needs to have functional specifications (FS), followed by technical specifications (TS). Since the all these are spread out across separate Word documents, you need a fourth document to track the items across the documents; a traceability matrix (TM). The TM is stored as an Excel spreadsheet, because stored tables in a Word document would just be silly… apparently??

They say insanity is repeating the same process over again and expecting different results, right? That was our conclusion after our first couple of iterations and experience with the software validation process as a whole.

A tunnel… with light!

Realizing that we would first and foremost have to take more ownership of the validation process, we thought about tackling the “not a fun one; nor a creative one; nor a productive one” accolades. Pathomation is an innovative software company itself. Could we put some of that innovation and software savviness into the software validation process itself, too?

We started by looking back at the delivered documents from our manual procedure. After some doodling on paper, we deduced a flow-chart that we agreed would be pretty much common to each validation project:

Our 30k view in place, our next step was to start thinking about what the most efficient way could be to fill it our for new products and releases going forward. That is the story we’ll be elaborating on in part 3 of this mini-series.

Thoughts on Software Development Life Cycle (SDLC) – part 1

What we do

At the end of the day, what we do is straightforward: Pathomation makes middleware software for digital pathology.

Now depending on who you talk to, one or more terms in that statement may take some explaining:

  • Pathomation: it’s not phantomation (ghostbusters, anybody?), photomation (photonics, quantum physics; nah, to be honest we’re probably not smart enough)… It’s Pathomation, from Pathology – Automation.
  • Middleware software: Middleware acts as a broker between different components. Your typical example of middleware would be a printer driver, which allows a user to convert text and pixels from the computer screen to ink on paper. On that note, “pixel broker”, is another way how we like to describe ourselves. The slide scanner converts the tissue from the glass slide into a (huge!) collection of pixels, and we make sure these pixels can be read and presented properly, regardless of what scanner was used to generate them, and regardless of where they are stored
  • Digital pathology: it started in the 60s at Massachusetts General Hospital in Boston with tele-pathology. In the 2000s, engineers figured out that they could automate microscopes to take sequential pictures of various regions of interest on glass slides, and then stitch those to create (again: huge!) compound images that would let pathologists bypass the traditional microscope altogether and use the computer screen as a virtual microscope.
  • Pathology: the medical sub-specialty that diagnoses disease at the smallest level. Up to 70% of therapeutic treatment is attributable to one pathological exam or another. That’s… huge actually, and it’s why it’s all the more important to make the pixels from the scan flow to that storage location to that computer screen as smooth as possible.

So there you have it: we make middleware software that optimizes the transport of pixels. We show the pathologist the pixels he or she needs, when he or she wants, where he or she desires to have them.

The central piece of software we develop for that is called PMA.core, and on top of PMA.core we have a variety of end-user front-end applications like PMA.slidebox, PMA.studio, or PMA.control.

Growing software

We didn’t start off with these though. So bear with us as we take a little trip down memory lane.

Once you have a successful piece of software, it doesn’t take long for people to ask “hey, can I also use it for this (or that)?”. Therefore, on top of PMA.core, we built several Software Development Kits (SDKs) that make it easier for other software developers to integrate digital pathology into their own applications (image analysis (AI/DL/ML), APLIS, LIMS, PACS…).

The next question is: “I don’t know how to program. I just have a WordPress blog that I want to embed some slides in.” So we wrote a series of plugins for various Content Management Systems (CMS), Learning Management Systems (LMS), and even third-party Image Analysis (IA) tools.

Eventually, we got into integrated end-user application development, too. As far as we’re concerned, we have three software packages that cater to end-users:

  • PMA.slidebox caters to educational applications whereby people just want to share collections with students, conference participants, or peers. A journal could benefit from this and publish virtual slides via a slidebox to serve as supplemental material to go with select author papers.
  • PMA.studio wants to be a pathologist’s cockpit. You can have slides presented in grid layouts, coming from different PMA.core servers. But if you’re an image analyst working with whole slide images (WSI), that works, too. Integrate data sources, and annotations. Have a video conference, and prepare high-resolution images (snapshots) for your publications… Do it all from the convenience of a single flexible and customizable user interface. If you’re an oncologist or surgeon that only occasionally wants to look along, PMA.studio may be a bit of overkill. But to build custom portals for peripheral users, you have those SDKs of course.
  • PMA.control goes above and beyond the simple collections that you place online with PMA.slidebox. With PMA.control, you can manage participants, manage what content they see at that what time, organize complex training sessions, organize the development and evaluation of new scoring schemes etc. With PMA.control, you are in… control.

Development

How do we develop it all? We’re a small company after all, with a small team, and many people wear multiple hats.

First, we like technological centipedes. If your ambition is to become an expert on jquery, SQL, or vue.js, and only do that, Pathomation is not the place for you. Even when you’re a full-stack developer, we’ll encourage you at least to get out of your comfort zone and write an occasional Jupyter notebook to test someone else’s code as part of our QA process.

Second, we have a workflow process that is tailored to our size, and we use the tools that we find useful. We don’t treat Scrum and Agile as gospel, but we adapt what we think makes sense.

Pipelines and dashboards

Sometimes people think we’re crazy for supporting so many tools and technologies. Sometimes we think we’re crazy for supporting to many tools and technologies.

But the truth of the matter is: we really want to remain flexible. We strive to be the go-to digital pathology middleware solution on the market, and that’s not going to happen by carving out a niche within a niche (as in: “everybody should just use PHP to build digital pathology portals from hereon”). We could probably have a more comprehensive PMA.java SDK if all we did was focus on Java, but we wouldn’t be able to help you with that if you come to us with a PyTorch algorithm or a Drupal blog.

All technologies come with their own peculiarities though, and as much as we adhere to the above centipede principle, no one can know it all. Or should even know it all.

So we’re big fans of dashboards and pipelines. To give you an idea:

  • We use standard applications like Jira, Slack, and Git extensively on a daily basis
  • We also have our own dashboard portal, that integrates meta-data from the above into KPIs that make sense to us.
  • A KPI is a KPI; we prefer knowledge. We’re lucky enough that we still have relatively short decision processes and everyone in the company understands what we’re doing. Because of this, we typically stay away from Google Analytics, because nobody gets anything out of a KPI.
  • Scrum and Agile are full of rituals. If you want to perform rituals though, your calling may be in religion rather than technology. Not that we frown upon religious callings. We’re just saying there’s a difference. Stand-up meetings are a good example. For high-profile projects, we’ve found stand-up meetings to be extremely valuable. But when that (phase of the) project is over, there’s no need to keep the meeting schedule around anymore. If we did that, it would be the only thing we would do anymore.
  • We look at standard methodologies as kitchen appliances; you use the ones you need when you need them. No need to institutionalize them. Some days you need a conventional oven; some days you need a microwave. There’s a reason they both exist. Chefs tell us that hybrid appliances that claim to do both, don’t really provide either functionality particularly well.

Does it work?

Sure, Pathomation is a small company, and we’re unconventional. Some call us rebels, renegades… Some partner with us exactly because of this even.

But of course, being rebellious cannot be a goal in itself; or even a merit.

What we’ve come to realize in the last year or so particularly is that products are only parts of the equation. What a company needs, in addition, to become successful is a set of processes.

The question then becomes: So does it work?

We think it does.

What it comes down to is that we do have a process that works now. Developers receive their input via Jira; managers can monitor respective projects through Jira dashboards. Build pipelines for different environments (no matter how diverse) are all organized through Jenkins. We’re on Microsoft Teams for meetings and chats, both at the office and at home. And for real-time technical support amongst the developers, there’s Slack.

Even though we use COTS software, we’ve still tuned it to match our own scale and pace. We still prefer on-premise installations instead of cloud, and we probably only have begun to scratch the surface of the possibilities in Jira (Confluence has recently been added to our arsenal).

There’s no right or wrong, or why you should follow our lead. Each company is different and should probably find what works for them. And realize that standardization can be a two-edged sword.

But even so: does it work? How do these processes in our case lead to better software?

We recently obtained our CE-IVD mark for PMA.core. So: yes.

But the tools and techniques described in this article don’t give the complete picture. So in a follow-up article, we plan to elaborate on what else was needed in addition to what we describe here, to get to yet another level of quality.

Custom panels and functionality in PMA.studio

Because sometimes you want more

PMA.studio has a lot of functionality out of the box, but sometimes you want more. Having PMA.studio open on one screen and your organization’s (AP)L(I)MS on another is not always ideal. And what if you don’t have multiple screens?

As we pointed out before, PMA.studio offers a panel-based layout. The standard panels can be moved around and even stacked. The standard ribbon in PMA.studio offers some convenient default layouts, too.

Further configuration of PMA.studio is available through the Configure tab, where you can enable individual panels.

But what if you can’t find the panel that you’re looking for? Maybe the content that you’re looking for is at a different website, and if you just could have that particular page available withing PMA.studio as a separate panel…

Custom panels

PMA.studio offers the possibility the add a custom panel w/ select content from a particular URL.

Let’s say that you want to have a reference website available next to your slide content. We’ll use our own PMA.studio wiki website at https://docs.pathomation.com/pma.studio as an example.

You could start by pulling up PMA.studio in one browser, and our wiki in another browser. You start off nice and smooth like this:

But soon your layout gets messy. The 50% screen width really gives too much to the wiki, and let’s not even get into how easy it is to have that other browser window snowed under a ton of other applications (the word processor you’re using to write a paper in, your ELN, your EPR, your LIMS…)!

There’s a straightforward solution you. Click on the “More” button in the Layout group from the Home tab:

And a new dialog shows up. In addition to selecting any number of default panels, you can also define custom panels. Like so:

After clicking ok, your new panel appears, making your overall screen layout look like this:

Now that’s more like it!

This is already great for reference data, but what if you want to combine this with slide awareness? In other words: you want to have the content in the panel change automatically depending on the selected slide.

Passing parameters

The custom panel mechanism in PMA.studio automatically passes along references to other webpages that trace back to a current PMA.core tile server and selected slide.

This mechanism is typically hidden from plain view to reduce the functional complexity of it all, but a single line of PHP brings up the necessary data:

print_r($_GET);

When we create yet another custom panel that refers to this new page, we see the following appear:

And let’s just say that we don’t like the way PMA.studio displays a slide’s thumbnail and label image. We’d rather just have those in separate panel, too, so we have more control over how they’re displayed.

We also know that the thumbnail of slide X can be reached via:

https://server/pma.core/thumbnail?sessionID=…&pathOrUid=X…

And the label image of slide X can be reached via:

https://server/pma.core/barcode?sessionID=…&pathOrUid=X…

We can therefore make two new scripts, that translate the received input parameters from PMA.studio, and translates those to the correct querystring variables for our respective thumbnail and label images:

thumbnail.php:
<?php
$url = $_GET["server"].
	"thumbnail?sessionID=".$_GET["sessionId"].
	"&pathOrUid=".$_GET["slideUid"];
header("location: $url");
?>
barcode.php:
<?php
$url = $_GET["server"].
	"barcode?rotation=0&sessionID=".$_GET["sessionId"].
	"&pathOrUid=".$_GET["slideUid"];
header("location: $url");
?>

We place the new files on a server, and reference them via two separate custom panels:

When we navigate the slides in a folder one by one, we now see that the panels are updated accordingly, too.

Other applications

In this post, we showed you how you can configure custom panels in PMA.studio exactly to your liking, and how the content of such panels can synchronize with currently viewed slides.

We showed you how to pass along information through some trivial examples, referring back to our own infrastructure.

Now you can build your interfaces, like we demonstrated in an earlier blog article.

PMA.studio is more than just a universal slide viewer; you can turn it into your own veritable organizational cockpit. Think e.g. about custom database queries against your back-end LIMS, bio-repository, or data warehouse. You can show the data where you want it, when you want it, all with a few configuration tweaks. No longer do you have to juggle multiple browsers, as PMA.studio simply allows you to build your own custom dashboards.

Find out more about PMA.studio through our landing page at https://www.pathomation.com/pma.studio.

Sharing facilities in PMA.studio

Let’s get together

Sharing content is arguably one of the most important applications of digital pathology, if not for the Web in general.

PMA.studio allows you to share content in a variety of ways. There is a dedicated group for sharing content on the ribbon:

When you just want to share what you’re currently looking at, chances are that you can get by with one of the quick share buttons:

  • If you want to share the current folder you’re navigating, click on the “Share folder” button
  • If you want to share the current slide that you’re looking at, click on the “Share slide” button
  • If you want to share the current grid that you’re looking at, click on the “Share grid” button
  • Etc.

If you want more control over what and how you’re sharing content, you can click on the final “Share” button of the group. You could say that that’s our “universal” share button.

It allows for further customization of your share link, including:

  • Password-protect your link
  • Expire the link (e.g. students can only access it for the duration of a test)
  • Include or exclude annotations from the shared link
  • Use a QR code instead of a plain text link
  • Etc.

Our best advice is for you to play with the various options. But do let us know when you think there are some features missing or you think something is broken.

Share administration

We’ve worked hard on making the sharing concept in PMA.studio broadly applicable to a variety of content. We’ve also worked on making it easy to share content.

So with all this sharing going on then, it’s only natural to be asking after a while “wait, what am I actually sharing?”.

On the configure tab, in the “Panels” group, you can active the “Shared links panel”

Once clicked, you get a new panel with an overview of everything you’ve shared so far.

The buttons behind each link allow various operations.

One application of this is to recycle a share link and re-use as you see fit.

You can also (temporarily) invalidate links, or delete them altogether.

The history is linked to your PMA.core login, so if at first you don’t see anything, make sure that you’re connected to the PMA.core instance for which you expect to see share links.

Monitoring

On the back-end of PMA.studio, administrators can get an overview of all created shares across all users. They can also use this view to temporarily suspend or even delete shares.

Automation

While we highly advocate the implementation of the PMA.UI framework in third-party software like (AP)L(I)MS, PACS, VNA, and other digital pathology consumers, we realize that this is not trivial. In a proof-of-concept phase, all you may want to do is show a button in your own user interface that then subsequently just pops up a viewport to the content that you want to launch. Easy-peasy, as they say.

Let’s say that you have an existing synoptic reporting environment that looks like this:

In order to convince your administration that adding digital pathology to it is a really good idea, you want to upgrade the interface to this:

With PMA.studio, you can now get exactly this effect.

Let’s switch to Jupiter to see how this works:

.First some homekeeping. We import the pma_python core module, and connect to the PMA.core instance that holds our slide.

Our slide is stored at “cases_eu/breast/06420637F/HE_06420637F0001S.mrxs”. Let’s make sure that the slide exists in that location by requesting its SlideInfo dictionary:

Alternatively, we can also write some exploratory code to get to the right location:

The PMA.studio API

Ok, we’ve identified our slide. Now let’s go to the PMA.studio. Unfortunately, PMA.python doesn’t have a studio module yet, so we’ll have to interface with the API directly for the time being.

The back-end call that we need is /API/Share/CreateLinkForSlide and takes on the following parameters:

We create the URL that invokes the API by hand first. We can do this accordingly:

Never mind that pma._pma_q() method that we use. It’s a fast and easy way for ourselves to properly encode HTTP querystring arguments. You’re free to piggy-back on ours, or use your own preferred method.

After execution of the code, you get a URL that looks like this:

https://yourserver/pma.studio.2/api/Share/CreateLinkForSlide?userName=user1&password=user1&serve.rUrl=https%3A%2F%2Fyourserver%2Fpma.core.2&pathOrUid=IRVGLPWBFT

The URL by itself doesn’t do anything, but create the share link. So you still need to invoke it. You can do this by either copying the URL to a webbrowser, or by invoking it from Python as well:

Again: it’s the return result from the URL that you want to distribute to others and not the initial URL.

To confirm that it worked, you go back to PMA.studio and check your panel with the share link overview:

But you can also just pull up the resulting URL in a new browser window:

Yay, it worked!

Automating folders

You can also create links that point to folders.

That slide that we just referenced? It turns out to be an H&E slide, and along with the other slides in the folder, actually comprises a single patient case.

So you can emulate cases by organizing their slides per folder, with each folder representing a case. Your hierarchy can then be:

[patient]

    [case]

        [slides]

Say that we want to offer a case-representation of breast cancer patient 06420637F. We use Share/CreateLinkForFolder and point to a folder instead of a slide:

The result again appears on the PMA.studio side. And clicking on it results in a mini-browser interface:

What’s next

After PMA.core, we’re starting to provide back-end API calls into PMA.studio as well. Even though as we prefer developers to integrate with PMA.UI directly, there are scenarios where automation through PMA.UI makes sense. When you’re in one of the following scenarios when:

  • PMA.studio is your main cockpit interface to work with slide content, but there are a few other routes (like an intranet) through which you want to provide quick access to content, too.
  • You have an (AP)LI(M)S, PACS, VNA, or other system and you’re in a PoC phase to add digital pathology capabilities to your own platform, PMA.studio automation may be a quicker route to go than adapting our SDKs.

Do keep in mind however that we’re providing the PMA.studio back-end mostly for convenience, at least for the time being. There are any number of ways in which you may want to integrate digital pathology in your infrastructure and workflows. For a high level of customization, you’re really going to have to move up to PMA.UI, as well as a back-end counterpart like PMA.python, PMA.php, or PMA.java.

Four ways to identify slides

By filename

The most straightforward way to identify a slide is by its filename.

When you request the slides that are in a subfolder, you get a list of filenames back. Each filename refers to a slide.

By Unique Identifier (UID)

Once you obtain a filename referring to a slide, you typically want to do something with it.

Using filenames as references throughout your software is problematic however, for a variety of reasons:

  • A full path reference can become really long, and may not fit a field. No matter how careful you are, at some point there’s always that 51-character string that just won’t quite fit into the varchar field that was defined with a standard field size of varchar(50)
  • Unicode-encoding can be tricky, and many languages complicate matters further by providing different methods for querystrings, querystring parameters etc. Not to mention the databasefield that you forgot to make nvarchar instead of varchar. Good luck chasing that one!
  • Using filename references is just not safe. Imagine that you’re passing on a URL that looks like lookAtMySlide.jsp?slide=case35%2fslide03.svs… It’s all too easy (or even tempting) for the recipient to want to try out variations on that scheme: “hmm, I wonder what slides 2 and 4 look like?” or “let’s have a look at cases 1-34 too”

For this purpose, we’ve introduced the UID-principle. A UID is a 6-character random string, tied to a particular slide in a particular location (folder). The UIDs are generated by the PMA.core engine, so there’s no collusion possible between UIDs referring to different slides. By their nature, there’s no sequential logic to them either, so there’s no point wondering asking information about slides YT4TGQ or YT4TGS, after finding out slide YT4TGR exists.

You can retrieve the UID of any slide through the GetUID() method. For this, you’ll need an instance of PMA.core, because slide anonymization is not supported by our free PMA.start viewer.

If you’ve had a look at our API calls and SDK methods, you’ll notice that many calls have a parameter argument PathOrUid, rather than just Path. This means that each time you specify a filename to identify a slide, you might as well make life a bit more easy on yourself (as well as possibly your compliance department!) and use the UID parameter instead. Have a look then at the following semantically identical calls:

The one notably exception to this would be the invocation of the GetUID() method itself, of course. But don’t worry; you can’t accidentally request the UID of a UID. If a slide reference passed on doesn’t refer to any existing content, you’ll just get a runtime error instead.

By Fingerprint

UIDs are great, but what if you want to track virtual slides? Physical slides aren’t static and move around; this real-life environment is oftentimes mimicked in the virtual world, where a lifecycle of a slide can go like this:

Different systems may be responsible for the different types of movement, making it very hard to track the virtual slide’s lifecycle in its entirety.

This is where a slide’s fingerprint can come in handy. Unlike a UID, the fingerprint is a signature string that is calculated based on a slide’s actual characteristics. We have a whole separate article on the subject.

The bottom line is that when you have new_slides/slide17.svs, and you move it to validated_slides/slide17.svs, you’ll be able to identify these slides as being identical through their fingerprint signature.

Let’s say that we have a slide slide54123.mrxs in the incoming folder, that get subsequently moved (and renamed) to a folder related to bladder research, to finish its lifecycle in an archival folder.

Have a look at the following code then:

Note that the UID is different for all three slides, but the fingerprint remains the same, even as the filename changes!

And you can use this for even more applications:

If you’re later wondering if new_slides/slide17.svs is a re-scanned slide, or whether it’s just the original slide that somebody forgot to delete, you can also use the fingerprint for this. If it is the old version that is just lingering, it will still have the same fingerprint. However, if it’s a newly re-scanned version of the physical slide, the fingerprint will be different, due to subtle changes in the image capturing process. It’s interesting to note than that in the latter case, the UID will still be the same.

Why would you still use UID instead of a fingerprint signature? B/c a fingerprint takes some time to generate: a slide must actually be (at least partially) and analyzed to obtain its fingerprint. The UID in contrast is only a random string that is generated in a split second. In many cases, all you want is a point to a slide. For a variety of reasons, the UID is then a faster alternative.

By barcode

Virtual slides comes from physical slides, so how did people identify slides before the advent of technology? Well, first they invented the are on the slide that we now refer to as slide label, and coated it with a material they could easily write (typically in pencil). Later on, label printers were introduced, and in combination with bio-repository systems provided by the (AP)LI(M)S, random (bar)codes could now be imprinted on small stickers that could be pasted on the slides directly, so that no scribbling was necessary anymore.

The idea is that the barcodes are machine-readable, and could be used to match all sorts of information afterwards, at the same time guaranteeing anonymity of the slide itself (a barcode identifier only makes sense in the context a particular hospital / lab / (AP)LI(M)S / biorepository).

A barcode in many ways is the real-world equivalent of a UID, but it doesn’t have to be. Consider this:

  • There can be structure to encoded barcodes, most often sequence information
  • The barcode can still encode certain patient or doctor information, though this is rather rare. People aren’t doing this anymore because of practical concerns, as a barcode only holds a limited number of characters.
  • Unlike UIDs, barcodes can take on all sorts of shapes and forms, making it difficult to provide universal identification services
  • You can have more than one barcode on a slide
  • The barcode can still be pasted on manually at an angle, which can make it hard for machines to recognize, whereas a human lab technician in such a case would just rotate her hand-scanner a bit.
  • The barcode can be applied to a slide before it goes through a series of chemical steps, that can in turn degrade the barcode and make it less legible

Even with the above caveats, we don’t argue the value of barcoding per se. It’s definitely way better than the alternative (pencil, scribbling). At the same time, barcoding makes most sense within a setting of one lab (or lab-group), one set of hardware, and one (AP)LI(M)S, so that the entire pipeline can be calibrated to a (beforehand agreed-upon) specific format.

At Pathomation, we offer the possibility to extract barcodes from label images through the GetBarcode(). We have our basic implementation in such a way that it works on a wide variety of labels:

Another way to study this behavior is via the debugger in your webbrowser:

The thing about barcodes is unfortunately: it’s not waterproof. The resolution of your scanner may not be high-quality enough, and there are a number of other reasons it can still go wrong. We’ve seen stupid things happen, like the number “1” being interpreted like the lowercase letter “L”  etcetera.

You can use the existing implementation for test scenarios. However, for production environments, we should be involved in the IQ /OQ / PQ loop, so we can advise properly on how best to roll this out. When you know that your identification scheme only includes numbers, we can configure the recognition engine to not accidentally pick up letters (and prevent a 1 > l switch from ever happening in the first place).

In summary

At Pathomation, we pride ourselves with the slogan “digital pathology for pathologists, by pathologists”. We know of the struggles to identify and keep track of slides, both physically and virtually. Therefore, we offer different ways of identification. We’ve published content on this before, but this is the first article in which we neatly outline all options next to each other.

Who’s in the driver’s seat?

We offer a platform…

Pathomation is not just about selling software. We offer a comprehensive platform with a variety of technical components that help you build tailor-made digital pathology environments and set up custom workflows.

From simple viewing to automated back-end image analysis and data integration, we believe we have the broadest offering on the market today. Best of all: our technology is centered around PMA.core, a powerful tile server on top of which everything else can be connected and integrated.

Recently, we published a video in which we showcase how one of our customers adopted our components into their own SaaS solution:

The customer offers services for second opinion counseling. It had already built a proprietary workflow portal for patients and pathologists to log in and submit new or evaluate existing data. Until the Pathomation components were integrated however, slide exchange was limited to upload and download mechanisms.

The front-end uses primarily two controls: PMA.UI Gallery and PMA.UI Viewport.

Now that the Pathomation PMA.UI slide visualization framework is integrated in the customer’s codebase, things run a lot smoother: slides are uploaded directly to the customer’s website, visualization is instantly thanks to PMA.UI’s viewport component, and even annotations can be added by various actors throughout a submitted case’s workflow process.

To help customers like Agoko get started and guide them to the process, we have our own developer portal. There, you can find articles and tutorials on how to adapt our technology both on the client (JavaScript) and server-side (Java, Python, PHP).

We’re working on a dedicated YouTube channel for Pathomation software developers, too. Head over there to get the basic skills you need to get started in the respective programming language of your choice.

… in more ways than one

While we have a number of customers that currently integrate our platform into their own online infrastructure, this is not for all. If you already have an application, and you have the technical resources (people) to work on this, that’s great. But what if you’re more limited in terms of time, money, staff… If you’re a startup, every decision has its implications. If you’re an image analysis or algorithm shop for instance, you may invest more heavily into your back end, rather than your front-end presentation (or at least postpone that for a later stage).

But you still want to allow people to upload slides. You still want to be able to allow experts (human or AI) to make annotations, and you still want _your_ end-users to see findings and results.

In that case, PMA.studio may be a more convenient solution, than learning how to integrate individual components. PMA.studio is a web-based slide viewer. In its simplest form, it looks like this:

However, PMA.studio is much more than just a slide viewer. All those integration capabilities that we mentioned earlier in the context of PMA.core can be visualized through various controls and panels. So while you can use PMA.studio as a viewer, it’s more likely you end up with interfaces that look like this:

PMA.studio brings modern user interface elements like a ribbon and panel-layout to a browser-based digital pathology environment. For the panels, we use the GoldenLayout library. From the start, we realized that one size will never fit all, so we made the layout of both the ribbon and the panel orientation completely configurable via XML configuration files. You can do this both for panels:

And for the ribbon:

Not only that, but you can also custom create new panels, that retrieve data from your own databases, present workflows etc. This means that our customer might as well have ended up with an interface that looks like this:

Flexibility and choices

The flexibility offered through both our SDKs and PMA.studio allows any customer to easily white-label our various software components.

Which route you decide to go with depends on you, but having had the experience with various customers on different projects, we’d be happy to guide you along the way. Contact us for a free consultation today and see how we can help take your digital pathology infrastructure to the next level.

Three ways to transfer your virtual slides

Uploading and downloading

The Pathomation software platform for digital pathology and virtual microscopy offers powerful slide presentation and interaction capabilities.

Much of the focus (especially with respect to end-users) is on slide visualization. But workflow can be organized through our platform as well.

In this article we focus on upload- and download-capabilities. We distinguish between three different mechanisms. We also provide background and insights into how and why we decided to provide the functionality in this fashion.

PMA.transfer

For many cases, PMA.transfer is your initial workhorse of choice. It’s a user-friendly end-user facing application with many features. For ad hoc slide transfers from your local hard disk to a PMA.core instance, PMA.transfer is the go-to tool.

There are two ways to obtain PMA.transfer: you can download it individually from its own website, or it can be installed in combination with your latest PMA.start download during the installation procedure.

If you already have PMA.start up and running and don’t want to go through the trouble of downloading the complete setup package again, you can download PMA.transfer individually through its own website at https://www.pathomation.com/pma.transfer. From there, you can also select individual versions of the software.

Regardless of how you obtain PMA.transfer, the software does require you to have PMA.start up and running, to function itself.

The reason for this is that PMA.transfer relies on PMA.start to tell its what slides reside on the hard disk. PMA.start already has all the logic on board for slide data processing, and we didn’t want to re-implement all this code in PMA.transfer. The result is that PMA.transfer knows what a slide is on your hard disk through PMA.start. You need not concern yourself figuring out if it’s a single- or multi-file file format that you’re working with. PMA.transfer deals with slides and that’s it. Do read our other blog article to find out why virtual slides are so big and complicated in the first place.

Transferring slides with PMA.transfer

Once you have PMA.transfer open, you see the slides on your own hard disk in the left-hand panel. You can connect to any PMA.core instance that you have the credentials for. If you find yourself re-connecting to the same server again and again, we recommend that you use the site manager. You can also use the site manager for complex scenarios like geo-replicated PMA.core instances.

You can both upload and download slides with PMA.transfer. What it is exactly that you do depends a bit on your point of view. Typically upload goes from your local computer to the PMA.core server; download is the reverse (from PMA.core to your hard disk). But there are ways to rig PMA.transfer so it operates between two server instances of PMA.core instead of interfacing with just your local computer.  Essentially, you’re simply transferring slides from one location to another. The same API calls are being user underneath the hood regardless of whether you’re doing one or the other. More on that API later in this article by the way. Stay tuned.

We tried to make PMA.transfer extremely low-threshold and userfriendly, which means that there’s usually more than one way to get something accomplished. For instance, you can transfer slides from one side to the other either with context-menus, or via drag and drop. Selecting multiple slides works the same way as you’re used to from the Windows Explorer with Ctrl+Click and Shift+Click actions.

A full manual of PMA.transfer workings, interfaces, and best practices is available as an online wiki.

Pathomation PMA.transfer wiki

PMA.core

As mentioned above: PMA.transfer can be configured for site to site transfer. But even in that scenario, it still relies on PMA.start. This isn’t always convenient or even possible. Therefore; PMA.core 2.0 and higher contains its own slide transfer interface. It’s not as feature-rich as PMA.transfer, but if you quickly want to copy a large volume of slides from one server to another, it will get the job done. Plus, you can use this to copy slides from anywhere to anywhere; Migrating your old FTP server to updated S3-based cloud storage becomes a breeze. You can do it remotely, and asynchronously: just close your browser after initiating the operation and check back later.

Back-end API features

Occasionally, we get the question from people “ok, but I don’t want drag and drop stuff. I have an [incoming] folder somewhere on my system, and I want to automatically transfer all of those slides to their final destination overnight, when network use is low and I don’t feel like I’m hogging other people’s bandwidth…”. They then ask for the API call to make this happen.

If you’re one of these people: you’re on the right track. But what you’re really asking for is automation. Bear with us here. The API is part of that, but not the whole story.

See, these are the various API calls involved in slide transfer:

PMA.transfer uses these; and so does PMA.core. They work. But we still highly recommend that you do not engage in calling these methods yourself. Instead, you should rely on the SDKs instead.

The reason for this is related to the underlying structure of the virtual slides themselves. There can be many files, they can be big. Therefore, the API methods are mostly involved with uploading chunks or partial slide data to PMA.core. We understand that there may be instances where it is convenient to have direct interaction with these, to perhaps build progress indicators to monitor a processing workflow. For those uses, we recommend studying the implementation of the Core::upload() methods in PMA.python.

Automation

We illustrate here how slide transfer automation can work for you by means of a Jupyter notebook utilizing the PMA.python SDK.

The Core module contains both upload() and download() methods, which each rely on PMA.start.

When using the upload() function, the SDK assumes you’re transferring slides from PMA.start to PMA.core. Therefore, it’s good to do some preliminary verification and make sure that everything is in place before starting your transfer:


from pma_python import core
sourceSession = core.connect() targetSession = core.connect("https://test.pathomation.com/ pma.core", "username", "scrt") print(sourceSession, targetSession)

This block of code should result in two meaningful strings. If not, you can already interrupt the flow and send an email to a system administrator, informing him or her that something is wrong.

After confirming the connection, you should also check that both your source folder on your local hard disk (this can be the folder where your scanner deposits new WSIs).

Since you’re handling two PMA.core instances (PMA.start utilizes a limited of PMA.core, too), it’s a good idea to explicitly mentioned the PMA.core sessionID values as optional arguments in your code (we typically don’t do this when only interacting with a single instance; remember: good programmers are not only lazy but also know when to be):

core.get_directories("C:/", sessionID = sourceSession)
core.get_slides("C:/wsi", sessionID = sourceSession)
core.get_directories("_sys_ref/", sessionID = targetSession)
core.get_slides("_sys_ref/experiment", sessionID = targetSession)

Once you’re assured that your source data is available and your target folder is ready (and doesn’t have the data you’re looking for yet), it’s time to do the actual upload:

core.upload("C:/wsi/OS-3.ndpi", "_sys_ref/experiment", targetSession)

This call is spectacularly unimpressive. But behind the scenes, it figures out which files belong to the slide, and transfers them one by one to your target destination,. Large files are automatically split in smaller blocks.

And of course this can be made a lot more complex. This is the step where the magic happens: you can build a loop around all the slides found in the source folder, and you can add checks to make sure the target folder is empty to begin with (or at least doesn’t contain the to be transferred slide yet).

Regardless, after the transfer is complete, you want to verify that everything went right.

You instinct probably tells you to compare the ImageInfo objects on both sides:

However, this is not a good idea, as the ImageInfo will actually differ, as it contains a couple of location-dependent and slide-specific criteria as well. Comparing the information then just becomes confusing.

Rather, we recommend that after transferring a slide, you merely pull up the fingerprint for each slide:

core.get_fingerprint('_sys_ref/experiment/OS-3.ndpi', sessionID = targetSession)
core.get_fingerprint('C:/wsi/OS-3.ndpi', sessionID = sourceSession)

Comparing these two strings is a lot easier, whether through visual (interactive) or automated detection:

Which way do I take?

Due to their size and file structure, whole slide images are complicated data-beasts. Wrestling Managing them takes some time and practice.

The Pathomation software platform for digital pathology and virtual microscopy recognizes that both simple and complex slide transfer scenarios emerge from daily practice. Therefore, we offer different tools and routes to efficiently transfer slides, depending on which category of user you fall into and the scenario you with to implement.