Torque PBS & Ubuntu 16.04/Mint 18

There are some programs that like MPI. There are others that are… kind of single threaded, but work pretty well with a PBS (portable batch server) to actually queue up tasks and generally speed up execution.

The I-TASSER suite, for protein structure prediction, is one of the latter.

If you’re in academia, I-TASSER is free, so it’s a useful tool to have even if it’s not used very often.

But getting Ubuntu to play nice with a PBS can be something of a trick… partly because the version included with Ubuntu is now old. Very old.

And the newer versions are still free – it only costs money for Torque if you want to use the more powerful schedulers like Maui. Which I don’t, because I’m usually the only person actually logging in to the boxes I administer. This may change in the future, but for now, I don’t need a complex PBS.

Anyway, to get Torque working without using the version included in the repos (because it’s ancient) requires relatively little work in the grand scheme of things…

The first job is to get the basic requirements for Torque installed:

sudo apt-get install libboost-all-dev libssl-dev libxml2-dev

Boost pulls in a ton of things, so it may or may not be worth adding --no-install-recommends to the end of that apt-get command. I didn’t, but I’m not short on space.

If you’ve not got a C compiler installed, now is the time for that as well. Fortunately, Torque doesn’t need anything fancy like cmake to build, just good ol’ ./configure, make, make install.

Now they’re installed, you can go download the Torque source code from Adaptive Computing. Now, annoyingly, the most recent release (6.1.1.1 as of writing) screws up for me for reasons I can’t figure out. I know from prior experience that 6.0.2 works 100%, so I’ll stick to that. It’s still newer than what is in the Ubuntu repos…

Extract the source somewhere sensible, like ~/bin using tar xzvf [torque.tgz] and run ./configure, then watch for any errors – there shouldn’t be any. When it’s all done, type make. You can use make -j [number of CPU cores] to speed things up a bit. Once that is done, switch to root with either sudo bash or su -, and type make install.

Now comes some fun bits.

There is a nice script in the folder you just built Torque in called torque_setup, but that’s not everything you need.

The first thing to check is that you have your hostname listed appropriately in /etc/hosts. Now, here is where static IP addresses really make your life easier: if you are using DHCP and your router decides to change you IP, Torque will stop working. Very frustrating.

Anyway, while lots of things need 127.0.0.1 to point to localhost, Torque also needs it to point to the server name. I name mine after elements of the periodic table, but you can do whatever you want.

Here’s what my /etc/hosts file looks like:

127.0.0.1 localhost
127.0.0.1 hydrogen
169.254.1.100 hydrogen
169.254.1.101 helium
169.254.1.102 lithium
169.254.1.103 beryllium

Without this extra 127.0.0.1 entry, Torque doesn’t work. It also works putting localhost and the hostname on the same 127.0.0.1 line.

Now you can run ./torque_setup [username] and answer y at the prompt.

Now run, echo '/usr/local/lib' > /etc/ld.so.conf.d/torque.conf and ldconfig. This tells the system where the Torque libraries are.

And echo "hydrogen np=32" > /var/spool/torque/server_priv/nodes and echo "hydrogen" > /var/spool/torque/mom_priv/config. These tell Torque about the nodes (and how many CPUs each node has) and the pbs_mom which server it’s running on. With Torque 6, trqauthd should do the job of pbs_mom, I think?

Get the server running again with pbs_server, pbs_sched and trqauthd (as root) at the commandline.

Then check that it’s working with qmgr -c 'p s' (the space is important).

Finally, check that it works by starting an interactive PBS session with qsub -I as a normal user (you can’t run this as root).

Should all work OK now!

Leave a Reply

Your email address will not be published. Required fields are marked *