Pipeline Pilot in Docker - speeding up complex deployments

Background

If you have ever had to manage, administer or install a Pipeline Pilot server, you know that this is usually no piece of cake. It's a fairly long process that involves a lot of more or less manual steps (which you could automate, but have you actually done so?).

On top of that, once you have your server installed and configured, it is usually very cumbersome to install a second, third or fourth one on the same machine, for instance to build separate dev/test/prod environments, or development environments for different people. So in the end, most of the time you have one or two instances of PP on a machine and that's it.

If you want to clone a particular install (configuration, all installed packages, your own packages, users, etc.), possibly including a few custom modifications you made to PP, then this is also a mainly manual process, although BIOVIA Hub is, I think, meant to answer this specific use case.

Pipeline Pilot stuffed into Docker

Lately I have been spending most of my time deploying one of our applications that uses Pipeline Pilot on different servers and environments, and got a bit fed up with this. So we tried to see whether we could use docker to ease this whole process for this particular application. Fortunately for you, we also came up with a sufficiently generic docker image for Pipeline Pilot, so that various user/administrator profiles can use the image to do very easy and flexible deployments.

The Pipeline Pilot Image

The image is what holds the Pipeline Pilot install itself. It's the thing you can pull from a repo or import from an archive file, and magically everything works when launching "docker run". If you use the provided dockerfile to build your image and start the PP container, you will notice that the container contains a plain, empty install of Pipeline Pilot: just a few base packages and the server itself. Nothing is configured on this image yet.

Now, every Pipeline Pilot install is a bit different: the settings vary slightly, user management might differ, etc. This is why we wanted to configure Pipeline Pilot servers at run-time rather than build-time, to have maximum flexibility on deployment. But in the end you are free to do whatever you want.

Another thing is that a docker container always starts in the same state. So in order to run a bunch of PP servers with different settings, packages, etc., several things had to be adapted around a basic PP installation container.

Building the Pipeline Pilot 2017R2 Image (tested on Linux)

As I'm not working for BIOVIA, I am not allowed to provide you with the working docker image as is. So basically you have to build it yourself at your end. Fortunately, docker makes things relatively easy: all you need is the attached dockerfile to build the image. All commands in the dockerfile are explained in the file itself and are not further detailed in this post. The base image I'm using evolves relatively quickly, but during my last tests all builds were successful.

Place this dockerfile in a folder on a machine where the docker daemon is installed. Then place the Pipeline Pilot installer in the same folder (unzipped into a BPP2017R2 subfolder). Finally, place your Pipeline Pilot license in this BPP2017R2 folder and name it pp.lic.
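
Assuming you call the build folder ppbuild (the name is your choice), the build context should then look roughly like this:

ppbuild/
  dockerfile
  BPP2017R2/
    pp.lic
    (unzipped Pipeline Pilot installer files)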

Now you can build the Pipeline Pilot docker image by issuing the following command:

docker build -t myimagename .

The build will take quite a significant amount of time, but once it has finished you should see the docker image "myimagename" listed when showing all available docker images ("docker images").
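
If you prefer to move the built image around as a file rather than via a registry (the "import from an archive" route mentioned above), docker save and docker load do exactly that; the tar file name below is arbitrary:

docker save -o pp2017r2_image.tar myimagename
docker load -i pp2017r2_image.tar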

Note that the dockerfile can still be optimized; this might follow in an updated blog post.

The collection data volume

Each flavor of Pipeline Pilot you want to run should have a distinct data volume associated with it. Hereafter I'll refer to this volume as the collection volume. This volume contains:

  • install.sh (contains the instructions to run when you start the container)

  • pp.lic (the license file of your PP install)

  • config.xml (the PP server configuration, that's the file you can export from your admin portal)

  • DataSource.xmlx (optional): the DB data sources defined in the admin portal of a PP server configuration

  • numbered folders containing the non-BIOVIA PP collections to install (for instance 01_dng_chemistry, 02_dng_network)

The install.sh script provided here installs all custom collections into the apps/discngine folder. You can change that to whatever suits your needs!
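
To give an idea, here is a minimal sketch of what such an install.sh could look like. The PP_HOME path, the license drop location and the pkgutil invocation are assumptions (they depend on where your dockerfile installs PP and on your PP version); treat this as a template, not the exact script:

#!/bin/sh
# Minimal install.sh sketch - paths and tool invocations are assumptions, adapt them.
PP_HOME=/opt/BIOVIA/PPS                       # assumed install location inside the image
cp /collections/pp.lic "$PP_HOME/licensing/"  # assumed license drop location
# apply the exported server configuration (config.xml) here, e.g. via configUtil
# then loop over the numbered collection folders and install each package
for pkg in /collections/[0-9][0-9]_*; do
    # hypothetical package installation call - check your PP docs for the real tool/syntax
    "$PP_HOME/bin/pkgutil" "$pkg"
done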

To all purists: I know that associating a volume with an image to run a particular instance of software is not the idea behind a dockerized application. However, the build process of PP is so long, and the configuration variety so large, that I opted for this as the easiest way to give the user a maximum amount of flexibility.
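
Named volumes can be created and pre-populated before the first run. One possible way to do it (the collections_dev volume name and the local collections_dev_src source folder are just examples):

docker volume create collections_dev
docker run --rm -v collections_dev:/collections -v "$PWD/collections_dev_src":/src busybox cp -a /src/. /collections/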

Running a Pipeline Pilot docker image

Once you have built your docker image (or imported it), you can run it using variations of the following command line (the -p flag maps a host port to a container port; a default PP install listens on 9944 for HTTP and 9943 for HTTPS inside the container):

docker run -v collections_dev:/collections -p 9918:9944 -p 9917:9943 ppimagename

You can run another configuration of the same PP server version like this: 

docker run -v collections_prod:/collections -p 9916:9944 -p 9915:9943 ppimagename

and you can do this on the very same host machine, as long as each container gets its own host ports and its own collection volume.

When you run this, the install.sh script is executed first, and once it has finished, startserver is run. So if you have 10 big collections to install on startup, this might take a while. You can now start three different servers with the same configuration but different port mappings (-p flag), for instance, and deploy separate development environments, dev/test/prod environments, etc.
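
If you want the containers to keep running in the background, you would typically also add -d and --name; for example (the container names and host ports are arbitrary):

docker run -d --name pp_dev -v collections_dev:/collections -p 9918:9944 -p 9917:9943 ppimagename
docker run -d --name pp_test -v collections_test:/collections -p 9916:9944 -p 9915:9943 ppimagename
docker run -d --name pp_prod -v collections_prod:/collections -p 9914:9944 -p 9913:9943 ppimagename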

Feel free to post comments and suggestions on how to improve the docker image. I know it's not perfect yet and probably never will be, but right now it fills a rather pressing need on our side.

What about user management?

The example provided here does not include any specific user management, as this is also very configuration-specific. Note that local user accounts are not exported and imported using the PP admin portal tools, so they cannot easily be imported via an xml file. However, the install.sh script can easily incorporate sections for local user creation: basically, you can configure your Pipeline Pilot server with the configUtil tool provided by BIOVIA, all from within the install.sh script.
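
As a purely illustrative sketch, a user-creation section in install.sh could be structured like this; the local_users.txt file is a hypothetical helper of my own, and the echo is a placeholder for the real configUtil invocation, whose syntax you should take from the BIOVIA documentation:

# sketch: read local users to create from a file in the collection volume
while read -r user; do
    # replace this echo with the actual configUtil call (syntax is version specific)
    echo "would create local PP user: $user"
done < /collections/local_users.txt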

If you want to activate LDAP authentication, this can also be done, but you'd need to pass through the host's /etc/passwd and/or shadow files (on Linux) to correctly configure PAM authentication in a docker container, for example. There is a lot of documentation on these aspects online (general docker user authentication).
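
For the PAM route, the usual docker pattern is to bind-mount the relevant host files read-only into the container, for example (whether exposing /etc/shadow this way is acceptable is a question for your security team):

docker run -v collections_dev:/collections -v /etc/passwd:/etc/passwd:ro -v /etc/shadow:/etc/shadow:ro -p 9918:9944 -p 9917:9943 ppimagename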

Did you know?

We are already using a slightly more complex version of what was explained above for our 3decision production server. So we have already tested its stability a bit and are satisfied.


Downloads

 