Exploiting the Pathomation software stack for business intelligence (BI)

A customer query

As part of our commercial offering, Pathomation offers hosting services for organizations that don’t want to make a long-term investment in on-premise server hardware, or whose server farm is incompatible with our system requirements (PMA.core requires a Microsoft Windows stack).

One such customer came to us recently and asked how much space they were consuming on the virtual machine they rented from us.

This was our answer:

And this is how we obtained the answer. We wrote the following script:

In Python:


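from pma_python import core

def get_slide_size(session, slideRef):
    # Size in bytes that a single slide occupies on disk
    info = core.get_slide_info(slideRef, sessionID = session)
    return info["PhysicalSize"]

def get_total_storage(session, path):
    # Sum the sizes of all slides below path (True = recurse into subfolders)
    total = 0
    for slide in core.get_slides(path, session, True):
        total = total + get_slide_size(session, slide)
    return total

def map_storage_usage(srv, usr, pwd):
    # Build a dictionary of root-directory -> consumed bytes
    sess = core.connect(srv, usr, pwd)
    usage = {}  # named "usage" so the built-in map() is not shadowed
    for rd in core.get_root_directories(sess):
        usage[rd] = get_total_storage(sess, rd)
    return usage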

You can also do this in PHP, of course:


<?php
require "lib_pathomation.php"; 	// PMA.php library

use Pathomation\PmaPhp\Core;

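function getTotalStorage($sessionID, $dir) {
    // Note: FALSE only lists the slides directly inside $dir,
    // whereas the Python version above recurses into subfolders (True)
    $slides = Core::getSlides($dir, $sessionID, $recursive = FALSE);
    $infos = Core::GetSlidesInfo($slides, $sessionID);

    echo "Got slides for ".$dir.PHP_EOL;

    // Extract the size in bytes from each slide's info record
    $func = function($value) {
        return $value["PhysicalSize"];
    };

    $s = array_sum(array_map($func, $infos));
    return $s;
}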

function mapStorageUsage($serverUrl, $username, $password) {
    $sessionID = Core::Connect($serverUrl, $username, $password);
    
    $map = array();
    $rootdirs = Core::getRootDirectories($sessionID);
    foreach ($rootdirs as $rd) {
        $map[$rd] = getTotalStorage($sessionID, $rd);
        
    }
    return $map;
}
?>

Creating a dictionary that contains the number of consumed bytes per root-directory now comes down to using just a single line of code:

In Python:

print(map_storage_usage("http://server/pma.core", "user", "secret_password"))

And in PHP:


print_r( mapStorageUsage ("http://server/pma.core", "user", "secret_password"));

Making our code more robust

Unfortunately, if you’ve been running your PMA.core tile server for a while, chances are that the script will come crashing down miserably. It’s the “should have, could have, would have” syndrome of programming. There’s probably a meme for this out there somewhere (in the interest of productivity, we won’t search for it ourselves but let you look for that one). What it comes down to is this: mount points and access permissions change, and in any large enough data repository some files will eventually end up corrupt. Sooner or later, either the core.get_slides() call or the core.get_slide_info() call is going to fail.

So to make our script a bit more robust, we can add try… except… exception handling in Python:


def get_slide_size(session, slideRef):
    try:    
        info = core.get_slide_info(slideRef, sessionID = session)
        return info["PhysicalSize"]
    except Exception:  # a bare "except:" would also swallow Ctrl+C (KeyboardInterrupt)
        print("Unable to get slide information from", slideRef)
        return 0

def get_total_storage(session, path):
    total = 0
    try:
        for slide in core.get_slides(path, session, True):
            total = total + get_slide_size(session, slide)
    except Exception:  # a bare "except:" would also swallow Ctrl+C (KeyboardInterrupt)
        print("Unable to get data from", path)
    return total

def map_storage_usage(srv, usr, pwd):
    sess = core.connect(srv, usr, pwd)
    map = {}
    for rd in core.get_root_directories(sess):
        map[rd] = get_total_storage(sess, rd)
    return map

As well as in PHP:


<?php
require_once "lib_pathomation.php";

use Pathomation\PmaPhp\Core;

function getTotalStorage($sessionID, $dir) {
    try {
        $slides = Core::getSlides($dir, $sessionID, $recursive = FALSE);
        $infos = Core::GetSlidesInfo($slides, $sessionID);
        
        $func = function($value) {
            return $value["PhysicalSize"];
        };

        $s = array_sum(array_map($func, $infos));
        return $s;
    }
    catch(Exception $e) {
        // echo "unable to get data from ".$dir.PHP_EOL;
        return 0;  // report 0 bytes so the caller gets a number instead of null
    }
}

function mapStorageUsage($serverUrl, $username, $password) {
    $sessionID = Core::Connect($serverUrl, $username, $password);    
    $map = array();
    $rootdirs = Core::getRootDirectories($sessionID);
    foreach ($rootdirs as $rd) {
        $map[$rd] = getTotalStorage($sessionID, $rd);   
    }
    return $map;
}

print_r(mapStorageUsage("http://server/core/", "user", "secret"));
?>

In Python, you can also use the pprint (pretty-print) module from the standard library to clean up the output a bit:
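A minimal sketch of what that could look like is shown here. The dictionary is made-up sample data standing in for the result of map_storage_usage(); the root-directory names and byte counts are hypothetical.

```python
from pprint import pformat, pprint

# Hypothetical sample data standing in for the result of map_storage_usage();
# the root-directory names and byte counts are invented for illustration.
usage = {
    "research_archive_2019": 48318382080,
    "clinical_trials_phase_2": 1288490188800,
    "teaching_collection": 7516192768,
}

# A narrow width makes pprint put every root-directory on its own line;
# keys are sorted alphabetically by default.
pprint(usage, width=40)

# pformat() returns the same text as a string, e.g. for writing to a log file
report = pformat(usage, width=40)
```

With only a handful of root-directories the gain is small, but once a server exposes dozens of them, one entry per line is considerably easier to scan than a single long dictionary dump.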

And in PHP, we add a convenient method to make the numbers a bit easier to read:


function human_filesize($bytes, $decimals = 2) {
    $size = array('B','kB','MB','GB','TB','PB','EB','ZB','YB');
    $factor = floor((strlen($bytes) - 1) / 3);
    return sprintf("%.{$decimals}f", $bytes / pow(1024, $factor)) . @$size[$factor];
}
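A comparable helper in Python might look like the sketch below. It is not part of pma_python; the function name simply mirrors the PHP one. It picks the unit by repeated division by 1024 rather than by counting digits, so its output can differ from the PHP helper for values just below a unit boundary.

```python
def human_filesize(num_bytes, decimals=2):
    """Format a byte count with 1024-based units (illustrative helper)."""
    units = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
    size = float(num_bytes)
    index = 0
    # Step up one unit at a time until the value fits or the units run out
    while size >= 1024 and index < len(units) - 1:
        size /= 1024
        index += 1
    return f"{size:.{decimals}f}{units[index]}"


# Usage with a hypothetical byte count
print(human_filesize(48318382080))
```

Applied to every value of the storage dictionary, this turns raw byte counts into figures that are easier to read at a glance.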

Input for business intelligence (BI)

In this article we showed how you can use automation to let Pathomation’s PMA.core tile server generate an overview report of how much space your slides consume. Depending on your specific folder structure, you can further customize this for breakdowns into sizable data-morsels that fit your particular appetite.
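One way such a breakdown could be structured is sketched below. To keep the sketch runnable without a server, the directory listing and the per-directory total are passed in as functions. The names list_subdirs and total_for and the sample paths and sizes are all hypothetical. Against a live server, total_for could wrap get_total_storage() from above, and list_subdirs would wrap whatever directory-listing call your SDK version provides (an assumption, not something shown in this article).

```python
def breakdown_by_subdirectory(root, list_subdirs, total_for):
    """Return {subdirectory: bytes} for the direct children of root."""
    return {sub: total_for(sub) for sub in list_subdirs(root)}


def largest_first(breakdown):
    """Sort (path, bytes) pairs so the biggest consumers come first."""
    return sorted(breakdown.items(), key=lambda item: item[1], reverse=True)


# Invented stand-ins for a real folder structure and real slide sizes
fake_tree = {"archive": ["archive/2021", "archive/2022"]}
fake_sizes = {"archive/2021": 3 * 1024**3, "archive/2022": 5 * 1024**3}

report = breakdown_by_subdirectory(
    "archive",
    lambda path: fake_tree.get(path, []),
    lambda path: fake_sizes[path],
)
for path, size in largest_first(report):
    print(path, size)
```

Passing the two lookups in as functions keeps the reporting logic independent of the server connection, which also makes it easy to try out on sample data first.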

Given the average size of a slide, this kind of information can be vital to your organization: slides can accumulate fast, and it is important to keep a handle on their growth and space occupation.
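To follow growth over time, the storage dictionary could be logged at regular intervals and compared later. The sketch below uses only the Python standard library. The function names, the three-column CSV layout, and the sample figures are assumptions for illustration, and an in-memory buffer stands in for a real log file.

```python
import csv
import io
from datetime import date


def append_snapshot(handle, usage, day):
    """Write one CSV row per root-directory: date, directory, bytes."""
    writer = csv.writer(handle)
    for directory, size in sorted(usage.items()):
        writer.writerow([day.isoformat(), directory, size])


def growth_per_directory(rows):
    """Return {directory: latest size - earliest size} from the logged rows."""
    first, last = {}, {}
    # ISO dates sort chronologically, so sorting the rows orders them by day
    for day, directory, size in sorted(rows):
        first.setdefault(directory, int(size))
        last[directory] = int(size)
    return {directory: last[directory] - first[directory] for directory in first}


# Invented figures; in practice usage would come from map_storage_usage()
log = io.StringIO()  # in-memory stand-in for a CSV file on disk
append_snapshot(log, {"archive": 100, "teaching": 40}, date(2024, 1, 1))
append_snapshot(log, {"archive": 160, "teaching": 45}, date(2024, 2, 1))

log.seek(0)
growth = growth_per_directory(csv.reader(log))
print(growth)
```

A plain CSV log like this can be loaded into a spreadsheet or a BI tool without further conversion.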

Pathomation’s software platform is more than just a slide viewing solution: it can help generate insights into your storage resource consumption and be used as a veritable planning and evaluation tool before actual investments take place.