To index or not to index?

A question we get frequently from potential customers is “how do we import our slides into your system (PMA.core)?”. The answer is: we don’t. In contrast with other Image Management Systems (IMS), we opted to not go for a central database. In our opinion, databases only seem like a good idea at first, but inevitable always cause problems down the road.

People also ask us “how easy is it to import our slides?”. The latter phrasing is probably more telltale than the first, as it assumes that is not the case apparently with other systems, i.e., other systems often put you in a situation where it is not easy to register slides. It still puts us in an awkward position, as we then actually have to explain that there is no import process as such. Put the slides where you want them, and that’s it. You’re done. Finito.

Here are some of the reasons why you would want a database overlaying your slides:

  • Ease of data association. Form data and overlaying graphical annotation objects can be stored with the slide’s full path reference as a foreign key.
  • Ease of search for a specific slide. Running a search query on a table is decidedly faster than parsing a potentially highly hierarchical directory tree structure
  • Rapid access to slide metadata. Which is not the same as our first point: data association. Slide metadata is information that is already incorporated into the native file format itself. A database can opt to extract such information periodically and store it internally in a centralized table structure, so that it is more easily extracted in real time when needed.

When taken together, the conclusion is that such databases are nothing more but glorified indexing systems. Such an indexing system invariable turns into a gorilla… An 800 lbs gorilla for that matter… Let’s talk about it:

  • An index takes time to build
  • An index consumes resources, both during and after construction.
  • With a rapidly evolving underlying data structure, the index is at risk of being behind the curve and not reflecting the actual data
  • In order to control the index and not constantly having to rebuild it, a guided (underlying) data management approach may be needed
  • At some point, in between index builds, and outside the controlled data entry flow, someone is going to do something different to your data
  • Incremental index builds to bypass performance bottlenecks are problematic when data is updated

Now there are scenarios where all of the above doesn’t matter, or at least doesn’t matter all that much. Think of a conventional library catalog; does it really matter if your readers can only find out about the newest Dean Koontz book that was purchased a day after it was actually registered in the system? Even with rapidly moving inventory systems: when somebody orders an item that is erroneously no longer available from your webstore… Big whoop. The item is placed on back-order, or the end-user simply cancels the order. It you end up making the majority of your customers mad this way, then the problem is not in your indexing system, but in your supply chain itself. There’s no doubt that for webshops and library catalogs, indexes speed up search, and the pros on average outweigh the cons.

But digital pathology is different. Let’s look at each of the arguments against indexing and see how much weight they carry in a WSI environment:

  • An index takes time to build. When are you going to run it? Digital pathology was created so you can have round the clock availability of your WSI data. Around the clock. Across time-zones. Anything that takes time, also takes resources. CPU cycles, memory. So expect performance of your overall system to go down while this happens.
  • Resource (storage) consumption during and after construction. So be careful about what you are going to index in terms of storage. Are you going to index slide metadata? Thumbnails? How much data are your practically talking about? How much data are you going to index to begin with? And how much of your indexed data will realistically ever be accessed (more on that subject in a separate post)?
  • Rapidly evolving underlying data structure. Assume a new slide is generated once every two minutes, and a quantification algorithm (like HistoQC) takes about a minute to complete per slide. This means you have a new datapoint every minute. And guess which datapoint the physician wants to see now now NOW…
  • Guided data management approach. One of the great uses of digital pathology is the sharing of data. You can share your data, but other can also share it with you. So apart from your in-house scanner pipeline; what do you going to do with the external hard disk someone just sent you? Data hierarchies come in all shapes and sizes. Sometimes it’s a patient case; sometimes it’s toxicological before/after results; sometimes it’s a cohort from a drug study. Are you going to setup data import pipelines for all these separate scenarios? Who’s going to manage those?
  • Sometimes, somewhere, someone is going to do something different to your data. Because the above pipelines won’t work. No matter how carefully you design them. Sometimes something different is needed. You need to act, and there’s no time for re-design first. The slide gets replaced, and now the index is out-of-date. Or the slide is renamed because of earlier human error, and the index can’t find it anymore. And as is often the case: this isn’t about the scenarios that you can think of; but about the scenarios you can’t.

Safe to say that we think an indexing mechanism for digital pathology and whole slide images is not a good idea. Just don’t do it.

Do you DICOM?

Standardization efforts in digital pathology

DICOM has been working on a standard description of digital pathology (DP) imaging data has been underway for a few years now. Digital pathology and whole slide imaging (WSI) is the focus of DICOM workgroup 26. A summary of its efforts can be found in David Clunie‘s paper in the Journal of Pathology informatics at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6236926/

In 2014, we published our own conference paper on the effort (during the 12th European Conference on Digital Pathology in Paris, France) . The abstract is available through the Researchgate website; the full presentation from the ECP Paris conference is available through https://www.slideshare.net/YvesSucaet/digital-pathology-information-web-services-dpiws-convergence-in-digital-pathology-data-sharing

The focus of this blog is on imaging. To be complete, readers interested in digital pathology standardization efforts, should also have a look at the IHE PaLM initiative.  Additional resources can also be found on slide 45 of our SlideShare publication.

Pathomation supports DICOM

At Pathomation, we’ve been supporting DICOM supplement 145 file format extension (PDF available here) for a while now. We recently added our own “dicomizer” tool to our free PMA.start software. This is a command-line tool (CLI) that allows for the conversion of any WSI file format into a into a DICOM VL (Visible Light) Whole Slide Image IOD (Information Object Definition).

The Pathomation dicomizer is currently available on Windows only (other platforms pending) and can be installed as part of the regular setup process.

After installation, you have to navigate to the folder in which you installed the tool (typically c:\program files\pathomation\dicomizer) , and then you can invoke the tool through the command-line, like this:

Running the validation tool from David Clunie on our generated DICOM slides results in the following:

We can also visualize the slides side by side in two browser windows, empirically “proving” that the DICOM output is equivalent to the original slide:

So there you have it. DICOM has been involved in the digital pathology standardization process for a while now. For those interested to support it, you can now use Pathomation’s free dicomizer tool to get hands-on experience.

 

 

 

 

 

PHP SDK & Packagist, making your life easier!

Over the past few years, dependency management tools have become a key component to successfully building high-level applications.

The number of third party libraries (with all the dependencies involved) has just insanely grown during the last few years – and expected to grow even more in the upcoming years – that managing the packages, the versions and the dependencies can become a serious nightmare for both developers and project managers. Dependency management tools therefore provide solutions to relieve them of the hassle of keeping dependencies up to date and organized so they can focus on their main mission, building awesome applications.

To make libraries easily shareable online and available for downloads, submissions, updates, bug reporting…Dependency management tools rely naturally on central repositories (we can name Maven central repository for Java, PyPI for python…and the list goes on).

As introduced in our previous posts, our Java SDK is already available on Maven, our Python SDK is also available on PyPI, so the next move logically is to add the PHP SDK to a central repository for PHP to make it easily available for the public…but what do we have for PHP?

For PHP, the dependency management tool to go for is composer, as pointed out earlier, it relies – like other dependency management tools – on its own central repository Packagist.

Installing Composer

In the same logic as most dependency management tools , Composer is available via command line interface which offers indeed way more powerful possibilities than a GUI.
To be able to execute Composer commands, you need to have it installed first on your computer, for more information you can refer to install composer.

Once composer is successfully installed, you can check if your environment is well set up via the following command :

Your first sample code

Next step is to create a sample code to call method getVersionInfo() from Class Core on pma-php (once obviously installed via composer).

To install pma-php via composer in a PHP project, you simply issue the following command in the project folder :

As you could already notice, we didn’t specify any version to install for composer, so it installed automatically the latest release (v2.0.0.40).

Targeting a specified release is straightforward via this command :

For the list of available releases for pma-php library, you can refer to its official Packagist page

Once the library pma-php is successfully installed on a project folder, the following files/folders are created :

Next step is to create a PHP file to call method getVersionInfo() of Class Core on pma-php :


<?php

require_once 'vendor/autoload.php';
use Pathomation\PmaPhp\Core;

echo Core::getVersionInfo();

Running this script returns value “2.0.0.1346”, which is the version of PMA.start installed on my own computer.

That’s all it takes! easy, isn’t it?

Run & share your Python code online with Jupyter notebooks

If you’ve had a look on previous posts on our Python SDK, you have a already a pretty good idea about scripts you can run against PMA.start (or PMA.core) for retrieving image data, Slide visualization

Most of the times running code locally is all what’s needed, however there are situations where being able to run it online and share it with collaborators (students, colleagues…) on the other side of the globe can come very handy and offer a richer and more interactive experience.
There are couple of free/paid services to run & share Python code online, our focus in this tutorial will be on Azure notebooks powered by Jupyter.

What are Azure notebooks & Jupyter?

As stated on the official documentation, Azure Notebooks is a free service for anyone to develop and run code in their browser using Jupyter. Jupyter is an open source project that enables combing markdown prose, executable code, and graphics onto a single canvas.

Azure Notebooks currently supports Python 2, Python 3, R and F# and their popular packages (e.g for Python the Anaconda distro is preinstalled), but our focus for the moment will be on Python 3 since it’s the minimum required version for the Python SDK

Your first Azure notebooks project

To create your first Azure notebooks project, navigate to the home page
click on Try it now then login (any Microsoft, Gmail,…. email address).

Once successfully logged in, navigate to section My Projects

Click on button New Project :

Introduce a Project name and and ID then click on Create (It’s worth mentioning that you have to set the project as Public if you wish to be able to share your notebooks with other users)

Click on the newly created project :

Notebooks are organized by projects, this makes it very efficient to create separate projects depending on scripts, target hosts and target audience to share notebooks with.

To create a new notebook, click on menu then select Notebook. Add a name for the notebook and select your Python 3 version (either 3.5 or 3.6 as both compatible with the Python SDK)

Notebooks are organized into cells, each cell can contain one or multiple python scripts to execute.

First cell should always be the following one as it’s required to install first pma_python packages on the running server before being able to interact with the Python SDK.

From there on you can add as many cells as you wish to interact with the Python SDK via scripts we introduced on previous posts or ones you create yourself.

Once your notebook modified and saved, you can easily share the created project with other users via share button on My projects page.

The terminal

In addition to its intuitive and easy-to-use interface, Azure notebooks do provide provides access to a complete terminal running on the server. To access the terminal first click on the Jupyter icon in the upper left hand corner of your notebook server. Then click on the New button on the upper right hand side of the notebook list. Finally click Terminal.

Your newly created notebook is accessible on the running server and executing shell commands there which can be useful for downloading data, copying files, inspecting processes, or editing files with traditional Unix tools.

What whole slide images (WSIs) are made of

Whole Slide Images

If you already know about pyramidical image files, feel free to skip this paragraph. If you don’t, sticks around; it’s important to understand how microscopy data coming out of slide scanners is structured to be able to manipulate it.

It all starts with a physical slide: a physical slide is a thin piece of glass, with the dimensions

When a physical slide is registered in a digital fashion, it is translated into a 2-dimensional pixel matrix. At a 40X magnification, it takes a grid of4 x 4 pixels to represent 1 square micrometer. We can also say that the image has a resolution of 0.25 microns per pixel. This is also expressed as 4 pixels per micron (PPM).

All of this means that in order to present our 5 cm x 2 cm physical specimen from the first part of this tutorial series in a 40X resolution we need (5 * 10 * 1000 * 4) * (2 * 10 * 1000 * 4) = 200k x 80k = 16B pixels

Now clearly that image is way too big to load in memory all at once, and even with advanced compression techniques, the physical sizes of these is roughly around one gigabyte per slide. So what people have thought of is to package the image data as a pyramidal stack.

Pyramid of Cestius as a metaphore for pyramidal stack images. By Francesco Gasparetti from Senigallia, Italy – Piramide Cestia, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=2614848

Ok, perhaps not that kind of pyramid…

But you can imagine a very large image being downsampled a number of times until it receives a manageable size. You just keep dividing the number of pixels by two, and eventually you get a single image that still represents the whole slide, but is only maybe 782 x 312 pixels in size. This then becomes the top of your pyramid and we label it as zoomlevel 0.

At zoomlevel 1, we get a 1562 x 624 pixel image etc. It turns out that our original image of 200k x 80k pixels is found at zoomlevel 8. Projected onto our pyramid, we get something like this:

Worked out example (showing the different zoomlevels) of a pyramidal stack for a concrete whole slide image.

So the physical file representing the slide doesn’t just store the high-resolution image, it stored a pyramidal stack with as many zoomlevels as needed to reach the deepest level (or highest resolution). The idea is that depending on the magnification that you want to represent on the screen, you read data from a different zoomlevel.

Tiles

The pyramid stack works great up to certain resolution. But pretty quick we get into trouble and the images become too big once again to be shown in one pass. And of course, that is eventually what we want to do: Look at the images in their highest possible detail.

In order to work around this problem, the concept of tiles is introduced. The idea is that at each zoomlevel, a grid overlays the image data, arbitrarily breaking the image up in tiles. This leads to a representation like this:

Now, for any area of the slide that we want to display at any given time to the end-user, we can determine the optimal zoomlevel to select from, as well a select number of tiles that are sufficient to show the desired “field of view”, rather than asking the user to wait to download the entire (potentially huge!) image. This goes as follows:

Or, put the other way around (from the browser’s point of view):

So there you have it: whole slide images are nothing but tiled pyramid-shaped stacks of image data.

A look at PMA.view

Architecture

Apart from PMA.start, Pathomation also offers a professional range of products. Yes, professional is a euphemism for “not free”, but we do feel you get quite some value in return. And some of the money flows back to our developers so they can also keep working diligently on improving PMA.start, and the free product offering around it, including our SDKs and software plugin for ImageJ/FIJI.

At the core of it all always sits PMA.core. Even PMA.start runs on top of PMA.core; albeit a restricted version, that only can access local data on your personal system. Hence the name PMA.core.lite. The professional version, PMA.core, can do loads more, including making annotations, capture form meta-data, as well as track user activity in a 21CFR.11 compliant manner. Both PMA.core has been validated conform GAMP 5 guidelines.

In a different article on this blog we explained how big (and why!) these whole slide images get. PMA.core then is responsible for extracting tiles from the original images when the users wants it. These tiles can be extracted via one of our language-specific SDKs, or end-users can use a viewer software built on top of PMA.core, and understand how user (mouse) operations need to be translated into tile requests.

At Pathomation, our viewer software is PMA.view. Like PMA.core, it comes in two flavors: PMA.view and PMA.view.lite. The distinction is made in order to provide better interaction with the respective underlaying versions of PMA.core. One could also say that PMA.start as a product is the combination of PMA.view.lite and PMA.core.lite. PMA.view in turn interacts with (multiple instances of) PMA.core.

As you can suspect, Pathomation also offers other applications next to PMA.view, that are also built on top PMA.core. But that’s the focus of a different post (sneak preview of what we mean through our YouTube channel).

PMA.view features

Below is a screenshot of PMA.view. The main element of the user interface are a ribbon, a central viewing panel for slide visualization, and two side panel which in turn may contain one or more sections.

The content of the ribbon (as well as the number of tabs etc) is completely configurable through an XML file. Similarly, the content and sections of the side panels is configurable through XML configuration files. Editors for all are provided in PMA.view administrative interface. Syntax highlighting and restore options are provided as well.

The central viewport for slide viewing is a Zooming User Interface (ZUI); you navigate slides by panning left and right, up and down, and by zooming in and out. You can use the mouse scrollwheel or drag a rectangle with the mouse while holding down the shift-key (on your keyboard) to zoom in on a specific area of your choosing.

In the left panel you typically see a navigation tree, representing the slides hosted by PMA.core. PMA.view can connect to multiple PMA.core instances simultaneously. This is useful when involved in international collaboration, of even in a situation where you have a central hospital hub with several smaller satellite offices spread throughout a region. Just put a tile server in each location to prevent having to transport (digital or – worse – physical) slides around.

Apart from convenient slide management across multiple sites, PMA.view offers many other features that people have come to expect from modern slide viewers, including:

  • Capture structured or free text meta-data
  • Seamless support for different scanning modalities (brightfield, fluorescence, and z-stacking)
  • Brightness and contrast controls
  • On-slide annotations in arbitrary colors and shapes (rectangle, circle, freehand etc.)
  • Annotation toggling based on type and author

Slide sharing

One of the big selling points of digital pathology is sharing slide content, without the need to physically distribute the slides via regular mail. Apart from the obvious improvement this bring regarding speed, there’s a secondary advantage that you can send slide to multiple parties at the same time. The third advantage is that the slides can’t get lost in the mail or damaged during transport anymore. In return for that of course, we occasionally encounter over-eager spam filters.

Two important impediments that prevent slide sharing however are the following:

  • When I share a slide with you, I have to make sure to specify which file format I’m sharing with you, so you get get the appropriate viewer
  • When I share a slide with you, I have to upload a LOT of data to WeTransfer, Aspera, or a good ole’ fashion FTP site, where you in turn can download… a LOT of data… again.

Pathomation’s PMA.core and PMA.view combo solve both problems for you. PMA.core abstracts any proprietary slide file format to “just” pixels, and PMA.view allows you to share slides with a counter-party in the form of HTML hyperlinks.

How does this work? PMA.view has a dedicated “Share” button on its ribbon to create links that point directly to selected content. There are different kinds of content that you can share:

  • You can share all slides in a selected folder, thus mimicking a patient case
  • You can share an individual slide
  • You can share a pre-selected region of interest within a slide

Share links are always formatted the same, but they can be used in multiple ways. You can:

  • Use links directly as they are. You can share them with your buddies via email, during a Skype chat session, WebEx, GoToMeeting, whatever.
  • Convert links into scannable QR codes. When you’re giving a presentation during a conference, or in a classroom setting, text-based links are cumbersome to present. Ironically, text-based links are not well suited for print media, either, for the same reason: it’s too easy to make a type copying the link character by character. It’s more convenient to present a QR-code then that people can scan with their smartphones or tablets, and immediately view, or convert into the actual text-link for use elsewhere.
  • Embed them into your own web-content. If you still have an actual website, that is, a place on a server somewhere where you deposit your own HTML code, you can now sprinkle live slides throughout the site and have them embedded in an <iframe> tag. Because not everybody knows how these work, PMA.view will give you the necessary HTML code that you can past directly into your own website. You’ll notice that within the HTML snippet, the plain old original link from above resurfaces. And it gets better: Whether you use plain old notepad to make your website in the traditional sense, or you use a CMS like WordPress or Drupal, an LMS like Canvas, Moodle, or Blackboard, or a social media platform: these too boil down to sending HTML code to the browser, so there’s usually a way to use <iframe>s there, too.

Remember the Zooming User Interface (ZUI) terminology we introduced you to earlier? Well, last but not least, when you click on a PMA.view slide-link, you’re essentially instantiating our ZUI. There are no plugins required, nothing to download, it’s just all basic JavaScript and HTML 5. As a consequence, it’s also easy to configure the layout of the ZUI. And that’s what the last set of options at the bottom of the share dialog is about.

How do you want your audience to experience your slide when they go to it? Do you want them to see the barcode? The overview?… It’s all in your hands, and we think this level of control and flexibility is pretty awesome.

Organizing pipelines

So as awesome as we think ourselves to be, there’s always room for improvement, right? So here’s a scenario that a customer of our came across recently:

  • We have a large number of slides that we want to embed throughout various pages of our proprietary customer portal. We like the PMA.view slide embedding <iframe> capability, but it’s really a pain to generate all these links one by one. Because there are so many, it’s also rather tedious making sure that they are ALL clicked on.

Is there a better way? Yes, there is.

When you look at the links that are generated, it’s not rocket science to figure out how they’re built. The customer wanted to have a link to a thumbnail of a slide, which always looks like this:

http://yourserver/view/EmbedThumbnail/{seemingly random charachters}

As well as a link to the actual slide ZUI, which always looks like this:

http://yourserver/view/Embed/{seemingly random charachters}

The character string at the end of these links is a particular slide’s unique identifier (UID). When we switch over to our PHP SDK, we can write just a few lines of code that gets all UIDs from all slides in a particular directory:

<?php
require_once "lib_pathomation.php";
?>
<html><head><title>All thumbnails for all slides</title></head><body>
<?php
$session = connect("http://yourserver/core", "username", "secret");
echo "SessionID for universal viewer account = ".$session."<br>";
foreach (getSlides("rootdir/subdir", $session) as $slide) {
    echo "<h3>$slide</h6>";
    $uid = getUID($slide, $session);
    $thumb = "http://yourserver/view/EmbedThumbnail/$uid";
    echo "<a href='$thumb'><img border=0 src='$thumb' height='50' align='left' /></a>";
    echo "<tt>$thumb</tt><br />";
    echo "<br clear='all'>";
}
?>
</body></html>

 

Of course, you can modify this script anyway you want to compensate for your particular directory hierarchy and structure.

Then, it was just a matter of simple string concatenation to provide the client with a custom website where they were able to retrieve all of the links to their slides in batch. As the page interact with PMA.core directly at that point,

So, for our client, we figured out how to organize a pipeline to facilitate their content production process. We user our PHP SDK on top of PMA.core to generate links that in turn exploit the slide sharing capabilities of PMA.view. Now that’s cool!

But we still want more

Do you have a scenario that you have difficulties with or want to see optimized? Let us know; we’ll he happy to talk to you.

 

Slide visualization in Python

Now that both PHP and Java have methods for embedded slide visualization, we can’t leave Python out. Originally we didn’t think there would be much need for this, but it’s at least confusing to have certain methods available in one version of our SDK, while not in others.

In addition, interactive visualization is definitely a thing in Python; just have a look at what you can do with Bokeh. Ideally and ultimately, we’d like to add digital pathology capabilities to an already existing framework like Bokeh, but in this blog post we’ll just explore how you can embed a slide into your IPython code as is.

As PMA.python is not a standard library, it bears to start your notebooks with the necessary detection code for the library. If it doesn’t work, it’s bad manners to leave your users in the dark, so we’ll provide some pointers on what needs to be done, too:


try:
    from pma_python import core
    print("PMA.python loaded: " + core.__version__)
except ImportError:
    print("PMA.python not found")
    print("You don't have the PIP PMA.python package installed.\n"
        + "Please obtain the library through 'python -m pip install pma_python'")

If all goes well, you’ll see something like this:

Once you’re assured the PMA.python library is good to go, you should probably verify that you can connect to your PMA.core instance (which can be PMA.start, too, of course; just leave the username and password out in that case):


server = "http://yourserverhere/pma.core/"
user = "your_username"
pwd = "your_password"
slide = "rootdir/subdir/test.scn"
session = core.connect(server, user, pwd)
if (session):
    print("Successfully connected to " + server + ": " + session)
else:
    print("Unable to connect to PMA.core " + server + ". Did you specify the right credentials?")

If all goes well, you should get a message that reads like this:

Successfully connected to http://yourserverhere/pma.core

Finally, the visualization part. Note that Pathomation provides a complete front-end Javascript-framework for digital pathology. In order to bring these capabilities into (I)Python then, it sufficient to write some encapsulation code around this basic demonstration code:


def show_slide(server, session, slide):
    try:
        from IPython.core.display import display, HTML
    except ImportError:
        print("Unable to render slide inline. Make sure that you are running this code within IPython")
        return
    
    render = """
        <script src='""" + server + """scripts/pma.ui/pma.ui.view.min.js' type="text/javascript"></script>

<div id="viewer" style="height: 500px;"></div>
<script type="text/javascript">
            // initialize the viewport
            var viewport = new PMA.UI.View.Viewport({
                    caller: "Jupyter",
                    element: "#viewer",
                    image: '""" + slide + """',
                    serverUrls: ['"""+ server + """'],
                    sessionID: '""" + session + """'
                },
                function () {
                    console.log("Success!");
                },
                function () {
                    console.log("Error! Check the console for details.");
                });
        </script>"""
display(HTML(render))

Our method is a bit more bulky than strictly needed; it’s robust in this sense that it makes sure that it is actually running in an IPython environment like Anaconda, and will also provide output to the JavaScript console in case the slide can load for some reason.

Rendering a slide inline within a Python / Jupyter notebook is now trivial (just make sure you ran the cell in which you define the above method, before invoking the following piece of code):


show_slide(server, session, slide)

The result look like this:

There is never an excuse not to use exploratory data analysis to get initial insights in your data. As switching environments, browsers, screens… can be tedious, and notebooks are meant to encapsulate complete experiments, interactive visualization of select whole slide images may just be one more thing you want to include.

The .ipynb file can be downloaded here and used as a starting point in your own work.

By studying the PMA.UI framework, you can learn more about how to further modify and customize your interactive views.

Now, anybody out there who wants to pick up our Bokeh challenge?

Building digital pathology web apps with Java

Working with slides

The Java Core class is part of our PMA.java API (introductory materials available here) and comes with a number of methods to navigate WSI slides on your local hard disk. Most often you’ll be alternating between getDirectories() and getSlides().

Here’s an example that will allow you to navigate your hard disk in a tree-like fashion :


import java.io.IOException;
import java.net.URLEncoder;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;

public class Test extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

      ServletOutputStream out = response.getOutputStream();

      if (!Core.isLite()) {
         System.out.println("PMA.start not found");
         return;
      }

      Core.connect();

      out.println("<html>");
      out.println("<ul>");
      if (request.getParameter("p") == null) {
         for (String rd : Core.getRootDirectories()) {
            out.println(
                  "<li><a href='?p=" + URLEncoder.encode(rd, "UTF-8").replace("+", "%20") + "'>" + rd + "</li>");
         }
      } else {
         String[] parts = request.getParameter("p").split("/");
         for (String rd : Core.getRootDirectories()) {
            if (parts[0].equals(rd)) {
               out.println("<li><b>" + rd + "</b>");
               for (String subdir : Core.getDirectories(rd)) {
                  out.println("<ul>");
                  String subdirparts[] = subdir.split("/");
                  if (request.getParameter("p").indexOf(subdir) == -1) {
                     out.println("<li><b>" + subdirparts[subdirparts.length - 1] + "</b>");
                     // keep drilling down, or see if you can retrieve slides as well
                     out.println("</li>");
                  } else {
                     out.println("<li><a href='?p=" + URLEncoder.encode(subdir, "UTF-8").replace("+", "%20")
                           + "'>" + subdirparts[subdirparts.length - 1] + "</a></li>");
                  }
                  out.println("</ul>");
               }
               out.println("</li>");
            } else {
               out.println("<li><a href='?p=" + URLEncoder.encode(rd, "UTF-8").replace("+", "%20") + "'>" + rd
                     + "</a></li>");
            }
         }
      }
      out.println("</ul>");
      out.println("</html>");
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

Yes, this should all be in a recursive function so you can dig down to just about any level in your tree structure. However, this post is not about recursive programming; we’re mostly interested in showing you what our API/SDK can do.

For instance, you can retrieve the slides in a selected folder and generate a link to them for visualization:


// now handle the slides in the subfolder
out.println("<ul>");
for (String slide : Core.getSlides(subdir)) {
   String[] slideParts = slide.split("/");
   out.println("<li>" + slideParts[slideParts.length - 1] + "</li>");
}
out.println("</ul>");


Introducing the UI class

We can do better than our last example. Providing a link to PMA.start is easy enough, but once you get to that level you’ve lost control over the rest of your layout. What if you make a website where you want to place slides in certain predefined locations and placeholders?

That’s where the UI class comes in. Currently, you can use it to either embed slide viewports, or thumbnail galleries in your own website.

Here’s how you can include an arbitrary slide:


import java.io.IOException;
import java.util.List;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;
import com.pathomation.UI.UI;

public class Test extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

      // setup parameters
      UI.pmaUIJavascriptPath = UI.pmaStartUIJavascriptPath;
      ServletOutputStream out = response.getOutputStream();
      String sessionID = Core.connect();

      // pick a slide to embed in your page
      String slide = Core.getSlides("C:/my_slides/").get(0);
      List<String> results = UI.embedSlideBySessionID("http://localhost:54001/", slide, sessionID);
      // the first element in the list contains the generated front end code
      out.println(results.get(0));
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}


The embedSlideBySessionID() method return a string that serves as an identifier for the generated viewport. Use this identifier to subsequently define a style for your viewport:


// actually embed slide
// the second element in the list corresponds to the viewport ID
String viewport = UI.embedSlideBySessionID("http://localhost:54001/", slide, sessionID).get(1);
out.println("<style type=\"text/css\">\n" + 
"#" + viewport + "{\n" 
      + "width: 500px;\n" 
      + "height: 500px;\n"
      + "border: 2px dashed green;\n" 
      + "}\n" 
+ "</style>");

The result is now a 500 x 500 fixed square (with a dashed border) that doesn’t change as your modify the browser window:


You can have as many viewports on a single page as you want; each is automatically assigned a new ID, and so you can set separate layout for each one.

Working with galleries

What if you have a collection of slides and you want to present an overview of these (browsing through slide filenames is tiring and confusing). You could already combine the code we have in this post so far and request thumbnails for a list of a slides found in a directory, subsequently rendering selected slides in a viewport.

But what if you have 50 slides in the folder? Do you really want to handle the scrolling, just-in-time rendering of initially hidden thumbnails etc.?

Pretty early on in our Pathomation career we found ourselves facing the same problems. We re-invented our own wheel a couple of times, after which we figured it was round enough to encapsulate in a piece of re-usable code.

You guessed it: the UI class provides a way to generate galleries, too. At its simplest implementation, only one line of code is needed (setup not included):


import java.io.IOException;
import java.util.List;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;
import com.pathomation.UI.UI;

public class Test extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

             ServletOutputStream out = response.getOutputStream();
             String sessionID = Core.connect();
             out.println("<p>" + sessionID + "</p>\n");
             
             UI.pmaUIJavascriptPath = UI.pmaStartUIJavascriptPath;

             List<String> results = UI.embedGalleryBySessionID("http://localhost:54001/",
             "C:/my_slides", sessionID);
             // the first element in the list contains the generated front end code
             out.println(results.get(0));        
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

You’ll notice that you can select slides in the gallery, but they’re not particularly reactive. For that, you’ll need to instruct PMA.UI to provide a viewport as well. When a slide is clicked in the gallery, the slide is then shown in the viewport:

out.println(UI.linkGalleryToViewport(gallery, "viewer"));

 

The default orientation of a gallery is “horizontal”, but you can set it to a vertical layout, too:


List<String> results = UI.embedGalleryBySessionID("http://localhost:54001/", "C:/my_slides", sessionID,
      new HashMap<String, String>() {
         {
            put("mode", "vertical");
         }
      });
// the first element in the list contains the generated front end code
out.println(results.get(0));

In which you can build something that looks like this:

 

Try it!

You can build pretty complicated interfaces already this way. One possibly scheme e.g. is where you offer end-users the possibility to compare slides. You need two galleries and two viewports, and that goes like this:


List<String> results = UI.embedGalleryBySessionID("http://localhost:54001/", "C:/my_slides", sessionID);
List<String> results2 = UI.embedGalleryBySessionID("http://localhost:54001/", "C:/my_slides", sessionID);
try {
	out.println(
		 "<table width=\"100%\"><tr><th width=\"50%\">Slide 1</th><th "
		 + "width=\"50%\">Slide 2</th></tr><tr><td width=\"50%\" valign=\"top\">"
		 + results.get(0)
		 + "</td><td width=\"50%\" valign=\"top\">"
		 + results2.get(0)
		 + "</td></tr><tr><td width=\"50%\" valign=\"top\">"
		 + UI.linkGalleryToViewport(results.get(1), "viewerLeft")
		 + "</td><td width=\"50%\" valign=\"top\">"
		 + UI.linkGalleryToViewport(results2.get(1), "viewerRight")
		 + "</td></tr></table>");	
} catch (Exception e) {
	e.printStackTrace();
}

Under the hood

As demonstrated in this post through the Java SDK : simple single-line instructions in Java are translated in whole parts of JavaScript code. The UI class takes care of loading all the required libraries, and makes sure housekeeping is taken care of in case of multiple viewports, galleries, etc.

You can see this for yourself by looking at the source-code of your page, after it loads in the webbrowser.

The JavaScript framework that we wrote ourselves for browser-based WSI visualization is called PMA.UI. It comes with its own set of tutorials, and there is much more you can do with PMA.UI than through the Java SDK alone.

However, we found in practice that there is much demand for cursive embedding of WSI content in any number of places on a website or corporate intranet. In my cases, a gallery browser and a “live” slide viewport are sufficient. In those scenarios, the Java SDK can definitely come in handy and offer a reduced learning curve.

The SDK should help you get started . By studying the interaction between the Java-code and the generated JavaScript, you can eventually master the PMA.UI library as well and interact with it directly.

By all means, do send us screenshots of your concoctions (let us know when you need help from your friendly neighborhood pathologists, too)! Perhaps we can have a veritable “wall of WSI fame” up one day.

Code samples for the Java SDK

Java sample code availability

Three months ago we introduced the SDK for Java.

A set of code samples can come very handy to ease end users’ first interaction with both PMA.Start and its bigger brother PMA.core. A folder named “samples” was added for this purpose to our PMA.java GitHub repository.

Introduction

For the samples we added, we create a Java HTTPServlet class that interacts with the Java SDK to generate HTML code to be displayed on a web browser.
We aim to make the samples as simple & general as possible, end users can then adapt them to their needs depending on what architecture & frameworks they rely on for their Java web application.

PMA.Start

For the following samples to be run successfully, you’ll need to have PMA.Start installed and running on your computer.
For more information about PMA.Start, refer to https://free.pathomation.com
If you don’t have PMA.Start installed yet on your computer, make sure you download it from https://free.pathomation.com/download/ for free.

This first sample checks if PMA.Start (PMA.core.lite) is installed and running on end user’s computer :


package samples_10;

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.*;

public class IdentifyingPMAStart extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

      ServletOutputStream out = response.getOutputStream();
      out.println("<html>");
      // test for PMA.core.lite (PMA.start)
      out.println("Are you running PMA.core.lite? " + (Core.isLite() ? "Yes!" : "no :-(") + "<br />");
      out.println("Seeing 'no' and want to see 'yes'? Make sure PMA.start is running on your system or download it from "
                  + "<a href = \"http://free.pathomation.com\">http://free.pathomation.com</a>");
      out.println("</html>");

   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

If PMA.Start is actually running, it returns :

Version info

The second sample is quite similar to the previous one for it checks if PMA.Start is running, if yes it displays its version number

package samples_20;

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;

public class GettingVersionInformationPMAStart extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

      ServletOutputStream out = response.getOutputStream();
      out.println("<html>");
      // test for PMA.core.lite (PMA.start)
      if (!Core.isLite()) {
         // don't bother running this script if PMA.start isn't active
         out.println("PMA.start is not running. Please start PMA.start first");
      } else {
         // assuming we have PMA.start running; what's the version number?
         out.println("You are running PMA.start version " + Core.getVersionInfo());
      }
      out.println("</html>");
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

Making the connection

For the third sample, a connection is established to PMA.Start to retrieve a sessionID


package samples_30;

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;

public class ConnectingToPMAStart extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
         
      ServletOutputStream out = response.getOutputStream();
      String sessionID = Core.connect();
      out.println("<html>");
      if (sessionID == null) {
         out.println("Unable to connect to PMA.start");
      } else {
         out.println("Successfully connected to PMA.start; sessionID = " + sessionID);
         // not always needed; depends on whether the client (e.g. browser) still needs to SessionID as well
         Core.disconnect();
      }
      out.println("</html>");   
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

If the connection succeeds, the sessionID is then displayed on the web browser

Finding your hard disks

For the fourth sample, a connection is established to PMA.Start, then the local HDD is parsed to retrieve root directories (Drive letters) :


package samples_40;

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;

public class GettingDriveLettersFromPMAStart extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

      ServletOutputStream out = response.getOutputStream();
      out.println("<html>");
      // establish a default connection to PMA.start
      if (Core.connect() != null) {
         out.println("The following drives were found on your system:" + "<br/>");
         for (String rootDirectory : Core.getRootDirectories()) {
            out.println(rootDirectory + "<br/>");
         }
         out.println("Can't find all the drives you're expecting? For network-connectivity (e.g. mapped drive access) you need PMA.core instead of PMA.start");
      } else {
         out.println("Unable to find PMA.start");
      }
      out.println("</html>");         
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

If the connection succeeds, a list of drive letters is displayed one by one on the web browser

Navigating directories

The fifth sample is very similar to previous one, it selects the first root directory (drive letter), its content is then displayed one by one


package samples_60;

import java.io.IOException;
import java.util.List;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;

public class GettingDirectoriesPMAStart extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

      ServletOutputStream out = response.getOutputStream();
      out.println("<html>");
      String sessionID = Core.connect();
      if (sessionID == null) {
         out.println("Unable to connect to PMA.start");
      } else {
         out.println("Successfully connected to PMA.start" + "<br/>");
         List<String> rootDirs = Core.getRootDirectories();
         out.println("Directories found in " + rootDirs.get(0) + ":" + "<br/>");
         List<String> dirs = Core.getDirectories(rootDirs.get(0), sessionID);
         for (String d : dirs) {
            out.println(d + "<br/>");
         }
         // not always needed; depends on whether the client (e.g. browser) still needs to SessionID as well
         Core.disconnect(sessionID);
      }
      out.println("</html>");         
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

Successful connection results in a list of drive letters with their respective folders shown one by one

Looking for slides

For the next sample, a connection is established to PMA.Start, then we look for the first non empty directory inside the local HDD to browse its content


package samples_80;

import java.io.IOException;
import java.util.List;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;

public class GettingSlidesPMAStart extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

      ServletOutputStream out = response.getOutputStream();
      out.println("<html>");
      String sessionID = Core.connect();
      if (sessionID == null) {
         out.println("Unable to connect to PMA.start");
      } else {
         out.println("Successfully connected to PMA.start" + "<br/>");
         // for this demo, we don't know where we can expect to find actual slides
         // the getFirstNonEmptyDirectory() method wraps around recursive calls to getDirectories() and is useful to "just" find a bunch of slides in "just" any folder
         String dir = Core.getFirstNonEmptyDirectory("/", sessionID);
         out.println("Looking for slides in " + dir + ":" + "<br/>");
         List<String> slides = Core.getSlides(dir, sessionID);
         for (String slide : slides) {
            out.println(slide + "<br/>");
         }
         // not always needed; depends on whether the client (e.g. browser) still needs to SessionID as well
         Core.disconnect(sessionID);
      }
      out.println("</html>");         
      
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

A quite similar result can be obtained if connection succeeds

Listing slides

Last sample for PMA.Start is a continuation for previous one, once the first non empty directory is found, the list of slides within is displayed successively (besides each slide name, its UID is added)


package samples_90;

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;

public class GetUIDSlidePMAStart extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

      ServletOutputStream out = response.getOutputStream();
      out.println("<html>");
      String sessionID = Core.connect();
      if (sessionID == null) {
         out.println("Unable to connect to PMA.start");
      } else {
         out.println("Successfully connected to PMA.start" + "<br/>");
         String dir = Core.getFirstNonEmptyDirectory("/", sessionID);
         out.println("Looking for slides in " + dir + "<br/>");
         for (String slide : Core.getSlides(dir, sessionID)) {
            out.println(slide + " - " + Core.getUid(slide, sessionID) + "<br/>");
         }
         // not always needed; depends on whether the client (e.g. browser) still needs to SessionID as well
         Core.disconnect(sessionID);
      }
      out.println("</html>");      
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

Here is what you can get for a successful connection

PMA.core

One main difference between PMA.Start and PMA.core is the former is run on a server which implies end users connect to through authentication (server URL, username, password).
For better code organisation & readability, we added Config class which intends to provide a single file to define the connection values to interact with PMA.core.
As You can notice later on the samples for PMA.core, HTTPServlet Classes for each sample retrieve connection values from Config class.
End users are invited to substitute the server’s url & credentials with their own respective values!


package configuration;

public class Config {
   
   // modify the following three lines for your specific circumstances:
   public static String pmaCoreServer = "http://my_server/pma.core";
   public static String pmaCoreUser = "user";
   public static String pmaCorePass = "secret";
}

First sample on the list checks if the url provided refers to a PMA.Start instance


package samples_10;

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.*;
import Configuration.Config;

public class IdentifyingPMACore extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

      String pmaCoreServer = Config.pmaCoreServer;

      ServletOutputStream out = response.getOutputStream();
      out.println("<html>");
      // testing actual "full" PMA.core instance that may or may not be out there
      out.println("Are you running PMA.start(PMA.core.lite) at " + pmaCoreServer + " ? " + ((Core.isLite(pmaCoreServer) != null && 
            (Core.isLite(pmaCoreServer) == true))  ? "Yes!" : "no :-(") + "<br />");
      out.println(
            "Are you running PMA.start(PMA.core.lite) at http://nowhere ? "
                  + ((Core.isLite("http://nowhere") != null && (Core.isLite("http://nowhere") == true)) ? "Yes!" : "no :-("));
      out.println("</html>");

   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

The second sample retrieves the version info for the provided PMA.core instance, it also shows what you would get trying a “bogus” URL


package samples_20;

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;
import Configuration.Config;

public class GettingVersionInformationPMACore extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
         
      String pmaCoreServer = Config.pmaCoreServer;
      
      ServletOutputStream out = response.getOutputStream();
      out.println("<html>");
      out.println("You are running PMA.core version " + Core.getVersionInfo(pmaCoreServer) + " at " + pmaCoreServer + "<br/>");
      
      // what happens when we run it against a bogus URL?
      String version = Core.getVersionInfo("http://nowhere/");
      
      if (version == null) {
         out.println("Unable to detect PMA.core at specified location (http://nowhere/)");
      } else {
         out.println("You are running PMA.core version " + version);
      }
      out.println("</html>");
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

For the third sample a connection is established to the PMA.Core instance, then the sessionID generated is displayed on web browser


package samples_30;

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;
import Configuration.Config;

public class ConnectingToPMACore extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
         
      String pmaCoreServer = Config.pmaCoreServer;
      String pmaCoreUser = Config.pmaCoreUser;
      String pmaCorePass = Config.pmaCorePass;
      
      ServletOutputStream out = response.getOutputStream();
      String sessionID = Core.connect(pmaCoreServer, pmaCoreUser, pmaCorePass);
      out.println("<html>");
      if (sessionID == null) {
         out.println("Unable to connect to PMA.core at specified location (" + pmaCoreServer + ")");
      } else {
         out.println("Successfully connected to PMA.core; sessionID = " + sessionID);
         // not always needed; depends on whether the client (e.g. browser) still needs to SessionID as well
         Core.disconnect();
      }
      out.println("</html>");   
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

In analogy to previous PMA.Start samples, The next sample searches for root directories of the provided PMA.core instance and displays them one by one


package samples_40;

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;
import Configuration.Config;

public class GettingRootDirectoriesFromPMACore extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

      String pmaCoreServer = Config.pmaCoreServer;
      String pmaCoreUser = Config.pmaCoreUser;
      String pmaCorePass = Config.pmaCorePass;
      
      ServletOutputStream out = response.getOutputStream();
      String sessionID = Core.connect(pmaCoreServer, pmaCoreUser, pmaCorePass);
      out.println("<html>");
      if (sessionID == null) {
         out.println("Unable to connect to PMA.core at specified location (" + pmaCoreServer + ")");
      } else {
         out.println("Successfully connected to " + pmaCoreServer + "<br/>");
         out.println("You have the following root-directories at your disposal:" + "<br/>");
         for (String rd : Core.getRootDirectories(sessionID)) {
            out.println(rd + "<br/>");
         }
         Core.disconnect(sessionID);
      }
      out.println("</html>");         
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

The fifth sample on the list is very similar to previous one, except that instead of displaying a complete list of root directories it only displays the first one and browses its content


package samples_60;

import java.io.IOException;
import java.util.List;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;
import Configuration.Config;

public class GettingDirectoriesPMACore extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

      String pmaCoreServer = Config.pmaCoreServer;
      String pmaCoreUser = Config.pmaCoreUser;
      String pmaCorePass = Config.pmaCorePass;

      ServletOutputStream out = response.getOutputStream();
      String sessionID = Core.connect(pmaCoreServer, pmaCoreUser, pmaCorePass);
      out.println("<html>");
      if (sessionID == null) {
         out.println("Unable to connect to PMA.core at specified location (" + pmaCoreServer + ")" + "<br/>");
      } else {
         out.println("Successfully connected to PMA.core; sessionID = " + sessionID + "<br/>");
         List<String> rootDirs = Core.getRootDirectories();
         out.println("Directories found in " + rootDirs.get(0) + ":" + "<br/>");
         List<String> dirs = Core.getDirectories(rootDirs.get(0), sessionID);
         for (String d : dirs) {
            out.println(d + "<br/>");
         }
         // not always needed; depends on whether the client (e.g. browser) still needs to SessionID as well
         Core.disconnect(sessionID);
      }
      out.println("</html>");         
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

For the next sample, a connection is established to the provided PMA.core instance.
If Successful, we look for the first non empty directory to display the list of slides within successively


package samples_80;

import java.io.IOException;
import java.util.List;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;
import Configuration.Config;

public class GettingSlidesPMACore extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

      String pmaCoreServer = Config.pmaCoreServer;
      String pmaCoreUser = Config.pmaCoreUser;
      String pmaCorePass = Config.pmaCorePass;
      
      ServletOutputStream out = response.getOutputStream();
      String sessionID = Core.connect(pmaCoreServer, pmaCoreUser, pmaCorePass);
      out.println("<html>");
      if (sessionID == null) {
         out.println("Unable to connect to PMA.core at specified location (" + pmaCoreServer + ")" + "<br/>");
      } else {
         out.println("Successfully connected to PMA.core; sessionID = " + sessionID + "<br/>");
         // for this demo, we don't know where we can expect to find actual slides
         // the getFirstNonEmptyDirectory() method wraps around recursive calls to getDirectories() and is useful to "just" find a bunch of slides in "just" any folder
         String dir = Core.getFirstNonEmptyDirectory("/", sessionID);
         out.println("Looking for slides in " + dir + ":" + "<br/>");
         List slides = Core.getSlides(dir, sessionID);
         for (String slide : slides) {
            out.println(slide + "<br/>");
         }
         // not always needed; depends on whether the client (e.g. browser) still needs to SessionID as well
         Core.disconnect(sessionID);
      }
      out.println("</html>");      
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

What we did on previous sample is extended on the last sample on this list, so in addition to displaying slides’ name we add also the UID for each slide


package samples_90;

import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.pathomation.Core;
import Configuration.Config;

public class GetUIDSlidePMACore extends HttpServlet {

   private static final long serialVersionUID = 1L;

   @Override
   protected void doGet(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {

      String pmaCoreServer = Config.pmaCoreServer;
      String pmaCoreUser = Config.pmaCoreUser;
      String pmaCorePass = Config.pmaCorePass;
      
      ServletOutputStream out = response.getOutputStream();
      String sessionID = Core.connect(pmaCoreServer, pmaCoreUser, pmaCorePass);
      out.println("<html>");
      if (sessionID == null) {
         out.println("Unable to connect to PMA.core at specified location (" + pmaCoreServer + ")" + "<br/>");
      } else {
         out.println("Successfully connected to PMA.core; sessionID = " + sessionID + "<br/>");
         String dir = Core.getFirstNonEmptyDirectory("/", sessionID);
         out.println("Looking for slides in " + dir + "<br/>");
         for (String slide : Core.getSlides(dir, sessionID)) {
            out.println(slide + " - " + Core.getUid(slide, sessionID) + "<br/>");
         }
         // not always needed; depends on whether the client (e.g. browser) still needs to SessionID as well
         Core.disconnect(sessionID);
      }
      out.println("</html>");      
   }

   @Override
   protected void doPost(HttpServletRequest request, HttpServletResponse response)
         throws ServletException, IOException {
      this.doGet(request, response);
   }
}

What’s next?

In this blog post we toured you around our PMA.java SDK. We focused on basic back-end interactions like connecting and finding out which slides are available. We showed you how to navigate around our sample files that we now deposit as part our respective github repositories.

We intend to keep building on top of these initial samples. At the time you read this, there may already be more samples available in addition to the ones described here, so make sure to check.

In a next blog post, we intend to provide you with sample code to start building graphical user interfaces for digital pathology using the pathomation.UI namespace (if you just can’t wait; feel free to already read how this is done in PHP).

Clipping, cropping, and pasting, oh my!

Tile servers and tiles

The Pathomation API and SDKs are built around tiles. PMA.start and PMA.core are essentially tile servers. Conceptually, Whole Slide Imaging servers are not that different from GIS software. Putting it in big data terms, the difference between the two often lies in the Velocity of the data; GIS software has the luxury of being concerned with only one planet Earth, whereas a totally new whole slide image is generated every couple of minutes or so. Say what you will; exo-planets will never be mapped at the speed of tissue.

This impacts how the two categories of software can (afford to) manipulate tiles behind the scenes. Data duplication of Planet Earth’s satellite imagery is acceptable if it speeds up the graphics rendering process. In contrast, this is not the case for whole slide images. Because of the amount of data generated in a short timeframe, storage and time needed to extract all tiles beforehand registered somewhere on a scale from unnecessary, over impractical, up to just downright impossible.

That being said, a tile is valuable. It took time to extract and to render, and it will be gone once you release it, so you better do something useful with it once you have it!

What’s in a tile?

A tile in Pathomation is typically 500 by 500 pixels. That’s actually a LOT of pixels (250,000). Add to that the fact that we’re usually talking about 24-bit data stored in the RGB color space, and you end up with 750,000 bytes needed to store a single tile in-memory. It also means that when we compute an individual tile’s histogram, we need no fewer than 750,000 computations to take place. If you have a grid of 1000 x 2000 tiles… you do the math.

But of course, today’s GPUs solve all this for you, right? We can do billions of computations per second. We have gigabytes of RAM memory available, and it’s all cheap. Why even bother button up the original slide in tiles at all?

Because algorithms and optimization still matter. At our recent CPW2018 workshop, one very clear message was that we cannot solve problems in pathology by brute force. Knowing what happens behind the scenes is still relevant.

In an AI-centric world, deep learning (DL) is at the center of that center. Can we really solve all problems in the world by just adding more layers to our networks?

With real problems, can we even afford to waste C/GPU cycles using brute force “throw enough at the wall; something will stick” approaches? Or did XKCD essentially get it right when they illustrated the goal of technology?

So, this is just a long rant to illustrate our point that we think it’s still worthwhile to think about proper algorithmic design and parallelization. The tile as a basic unit is key to that, and our software can help you get bite-size tiles for your processing pleasure.

Loading images and tiles

If you’ve made it this far, it means you at least partially agree with out tile-centric vision. Cool! Perhaps you’ve even tried a couple of our code snippets in our earlier tutorials already. Even cooler! Perhaps you’ve already experienced how SLOW some of our proposed solutions to problems are. In the latter: stick with us; we totally plan on addressing all of these issues in the coming months through posts examining various aspects of these problems.

Before we get into this however, let’s just explore some of the basic techniques there are in Python to work with partial image content. Here’s how we can load an image from disk:

import matplotlib.pyplot as plt
img_grid = plt.imread("ref_grid.png")
plt.imshow(img_grid)

And here’s how we can load a tile through Pathomation:

from pma_python import core
core.connect()    # connect to PMA.start
img_tile = core.get_tile("C:/my_slides/CMU-1.svs")   # make sure this file exists on your HDD
plt.imshow(img_tile)

The internal Python representations are slightly different:

print(type(img_grid))
print(type(img_tile))

But we can convert PIL image-objects to Numpy arrays just as easily. We can convert an image to a numpy array, and subsequently visualize that one:

import numpy as np
arr_tile = np.array(img_tile)
print(type(arr_tile))
plt.imshow(arr_tile)

Converting a numpy ndarray back to a PIL Image goes like this:

import numpy as np
type(arr_grid)
pil_grid = Image.fromarray(np.uint8(arr_grid))
type(pil_grid)

Take a note of this! There’s a tremendous amount of operations possible in Python, but some of it is in numpy, other things occur in matplotlib, there’s PIL etc. Chances are that you’ll be converting back and forth between different types quite often.

Subplots

Matplotlib offers a convenient way to combine multiple images into a grid-like organization:

import matplotlib.pyplot as plt
from pma_python import core
core.connect()    # connect to PMA.start

img_tile1 = core.get_tile("C:/my_slides/CMU-1.svs", zoomlevel = 0)
img_tile2 = core.get_tile("C:/my_slides/CMU-1.svs", zoomlevel = 1)
img_tile3 = core.get_tile("C:/my_slides/CMU-1.svs", zoomlevel = 2)

plt.subplot(1,3,1)
plt.imshow(img_tile1)
plt.subplot(1,3,2)
plt.imshow(img_tile2)
plt.subplot(1,3,3)
plt.imshow(img_tile3)

plt.show()

And here’s a one more neat trick:

def plot_slide_as_tiles(slide_ref, zoomlevel):
    dims = core.get_zoomlevels_dict(slide_ref)[zoomlevel]
    max_x = dims[0]
    max_y = dims[1]
    plt.subplots(max_y, max_x, figsize=(15,15))
    for x in range(0,max_x):
        for y in range(0,max_y):
            plt.subplot(max_y, max_x, (x+1) + y * max_x)
            plt.imshow(core.get_tile(slide_ref, zoomlevel = zoomlevel, x = x, y = y))
    plt.plot(figsize=(200,100))
plt.show()

Basic operations

Let’s go back to the original image shown with this post: it’s a 100 x 100 pixel image, purposefully and deliberately divided in a 3 x 3 grid. Why? Because 100 isn’t divided by 3. So:

  • in the corners, we have 33 x 33 pixels squares,
  • in the center we have a 34 x 34 pixels square,
  • in the top and bottom center section we have two rectangles of 34 pixels wide and 33 pixels tall,
  • In the left and right section of the middle band of the image we have two rectangles of 33 pixels wide and 34 pixels tall.

What’s the importance of this image? It allows us to experiment in a convenient way with cropping. See, when dealing with array data it’s very easy to be just one-element off. You forget to process the last or first element, your offset is just one-off, or another couple of hundred variations on this basic scenario.

Let’s start by cutting the image into strips:

from PIL import Image
import matplotlib.pyplot as plt

grid = Image.open("ref_grid.png")
col1 = grid.crop((0, 0, 33, 99))
col2 = grid.crop((33, 0, 67, 99))
col3 = grid.crop((67, 0, 99, 99))

plt.subplot(1, 3, 1)
plt.imshow(col1)
plt.subplot(1, 3, 2)
plt.imshow(col2)
plt.subplot(1, 3, 3)
plt.imshow(col3)

plt.show()

The output of this script is as follows:

Now let’s see if we can loop this operation:

def cut_in_strips(img, num_strips):
    w = img.width
    interval = w / num_strips
    for i in range(0, num_strips):
        strip = img.crop((interval * i, 0, interval * (i+1), w))
        plt.subplot(1, num_strips, i+1)
        plt.imshow(strip)

cut_in_strips(grid, 3)

It works, but we actually sort of got lucky here. The key is that the width of our image is 100 pixels, and 100 doesn’t divide exactly by 3. It turns out that when we calculate interval, the variable automatically assuming the floating point data type. This may not always be the case (and certainly not in all languages). We can actually simulate what could go wrong by forcing interval into an integer datatype:

interval = (int)(w / num_strips)

You can see now that the third strip shows an extra pixel-edge that is clearly overflow from the third one.

“What’s the big deal?”, you might ask. After all, Python got it right the first time. Why bother?

Because Python might not get it right all the time. Our explicit conversion to int raises a typical off-by-one error. Furthermore: as images are typically converted to 2-dimensional arrays, and as we can have hundreds of tiles next to each other, this kind of one-off errors can easily snowball into big problems.

And in defense of sell-documenting code, the correct syntax to calculate the interval statement should be something more along the lines of:

interval = (float)(w * 1.0 / num_strips)

Remember, the ultimate goal is to break apart a “native” 500 x 500 Pathomation time into smaller pieces (say 25 100 x 100 tiles) and be able to parallelize tasks on these smaller tasks (as well as operate at a coarser zoomlevel).

So with this in mind, we can now plot any image into an arbitrary grid of images:

def plot_as_grid(img, num_rows, num_cols):
    w = img.width
    h = img.height
    interval_x = (float)(w * 1.0 / num_cols)
    interval_y = (float)(h * 1.0 / num_rows)
    for y in range(0, num_rows):
        for x in range(0, num_cols):
            cell = img.crop((interval_x * x, interval_y * y, interval_x * (x+1), interval_y * (y + 1)))
            plt.subplot(num_rows, num_cols, y * num_cols + x + 1)
            plt.imshow(cell)

plot_as_grid(grid, 3, 3)
plot_as_grid(grid, 2, 2)
plot_as_grid(grid, 9, 9)

 Re-constituting an image