Tag Archives: computing

Dodgy Internet Connections: Ping Script

I’m having some fun at the minute with a dodgy internet connection. It really is driving me to distraction, I must admit.

It manifests as a connection that drops in and out almost at random. In fact, if I heavily load the connection, rather than dropping out as I’d expect, it seems to stabilise. But I can’t really download hundreds of GB all the time, can I? And it still drops out occasionally under high load too – an intermittent fault, which is the worst kind of issue anyone has to troubleshoot.

Regardless, I’m trying to collect evidence of the flakiness (although tonight it doesn’t seem to be misbehaving – I’ve had a largely painless experience online), so I need some way of proving that something is up.

Note that I’m in Windows 10 most evenings, due to Skype, Word and Visual Studio. Yup, spend all day at work on Linux, move to Windows of an evening. Insert moaning about no good Office suites on Linux here.

So, anyway, a quick Windows (non-Powershell) script to ping a site at regular intervals, and log if not successful. Right now I’m not interested in when it is working – only when it’s not.

@echo off
setlocal enabledelayedexpansion
rem Set the target to ping - replace the placeholder with a real IP address
set hostIP=[put an IP address here]
:loop
set pingline=1
rem Line 2 of ping's output holds the reply (or timeout) message
for /f "delims=" %%A in ('ping -n 1 -w 250 -l 255 %hostIP%') do (
    if !pingline! equ 2 (
        set logline=!date! !time! "%%A"
        rem A successful reply contains "TTL=", so log only the failures
        echo !logline! | find "TTL=">nul || echo !logline! >> pinglog.txt
        )
    set /a pingline+=1
    )
timeout 10
goto loop

All credit due, I found this here and modified it slightly so I didn’t have to download something from Windows Server 2003. OK, so timeout is slightly less accurate than sleep, but for my purposes it’s sufficient. It will run until you kill it; closing the command-line window will do.

Torque PBS & Ubuntu 16.04/Mint 18

There are some programs that like MPI. There are others that are… kind of single-threaded, but work pretty well with a PBS (Portable Batch System) to queue up tasks and generally speed up execution.

The I-TASSER suite, for protein structure prediction, is one of the latter.

If you’re in academia, I-TASSER is free, so it’s a useful tool to have even if it’s not used very often.

But getting Ubuntu to play nice with a PBS can be something of a trick… partly because the version included with Ubuntu is now old. Very old.

And the newer versions are still free – it only costs money for Torque if you want to use the more powerful schedulers like Maui. Which I don’t, because I’m usually the only person actually logging in to the boxes I administer. This may change in the future, but for now, I don’t need a complex PBS.

Anyway, to get Torque working without using the version included in the repos (because it’s ancient) requires relatively little work in the grand scheme of things…

The first job is to get the basic requirements for Torque installed:

sudo apt-get install libboost-all-dev libssl-dev libxml2-dev

Boost pulls in a ton of things, so it may or may not be worth adding --no-install-recommends to the end of that apt-get command. I didn’t, but I’m not short on space.

If you’ve not got a C compiler installed, now is the time for that as well (sudo apt-get install build-essential covers it). Fortunately, Torque doesn’t need anything fancy like CMake to build – just good ol’ ./configure, make, make install.

With those installed, you can go and download the Torque source code from Adaptive Computing. Annoyingly, the most recent release (6.1.1.1 as of writing) screws up for me for reasons I can’t figure out. I know from prior experience that 6.0.2 works 100%, so I’ll stick to that. It’s still newer than what’s in the Ubuntu repos…

Extract the source somewhere sensible, like ~/bin, using tar xzvf [torque.tgz], then run ./configure and watch for any errors – there shouldn’t be any. When that’s done, type make; you can use make -j [number of CPU cores] to speed things up a bit. Once the build has finished, switch to root with either sudo bash or su -, and type make install.
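Condensed, those build steps look like the script below. Writing them into a small helper script keeps them reviewable before anything actually runs; note the tarball name is my assumption for version 6.0.2 – adjust it to whatever file you actually downloaded.

```shell
# Sketch of the build sequence above, saved to a reviewable helper script.
# torque-6.0.2.tar.gz is an assumed filename - change it to match your download.
cat > build-torque.sh <<'EOF'
#!/bin/sh
set -e                        # stop at the first error
cd ~/bin
tar xzvf torque-6.0.2.tar.gz
cd torque-6.0.2
./configure
make -j"$(nproc)"             # parallel build across all CPU cores
sudo make install
EOF
chmod +x build-torque.sh
```

Run it with ./build-torque.sh once the tarball is in place.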

Now come the fun bits.

There is a nice script called torque.setup in the folder you just built Torque in, but that’s not everything you need.

The first thing to check is that you have your hostname listed appropriately in /etc/hosts. Here is where static IP addresses really make your life easier: if you are using DHCP and your router decides to change your IP, Torque will stop working. Very frustrating.

Anyway, while lots of things need 127.0.0.1 to point to localhost, Torque also needs it to point to the server name. I name mine after elements of the periodic table, but you can do whatever you want.

Here’s what my /etc/hosts file looks like:

127.0.0.1 localhost
127.0.0.1 hydrogen
169.254.1.100 hydrogen
169.254.1.101 helium
169.254.1.102 lithium
169.254.1.103 beryllium

Without this extra 127.0.0.1 entry, Torque doesn’t work. Putting localhost and the hostname on the same 127.0.0.1 line also works.
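A quick sanity check that the loopback entries are in place – this is safe to run on any box, and on the Torque server you’d expect to see both localhost and the machine’s hostname in the output:

```shell
# List every 127.0.0.1 line in /etc/hosts; localhost should always appear,
# and on the Torque server the machine's hostname should appear as well.
grep '^127\.0\.0\.1' /etc/hosts
```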

Now you can run ./torque.setup [username] and answer y at the prompt.

Next, run echo '/usr/local/lib' > /etc/ld.so.conf.d/torque.conf followed by ldconfig (both as root). This tells the dynamic linker where the Torque libraries are.

Then echo "hydrogen np=32" > /var/spool/torque/server_priv/nodes and echo "hydrogen" > /var/spool/torque/mom_priv/config. The first tells Torque about the nodes (and how many CPUs each node has); the second tells each pbs_mom which server it reports to. (In Torque 4 and later, trqauthd handles client authorisation – it replaces the old pbs_iff, not pbs_mom, which still runs the jobs.)

Get the server running again with pbs_server, pbs_sched and trqauthd (as root) at the command line.

Then check that it’s working with qmgr -c 'p s' (the space is important).

Finally, check that it works by starting an interactive PBS session with qsub -I as a normal user (you can’t run this as root).
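Once the interactive session works, batch jobs are just shell scripts with #PBS directives at the top, submitted with qsub. A minimal sketch (the job name and core count here are purely illustrative):

```shell
# Create a minimal PBS job script; you'd submit it with: qsub testjob.sh
cat > testjob.sh <<'EOF'
#!/bin/sh
#PBS -N testjob
#PBS -l nodes=1:ppn=4
#PBS -j oe
cd "$PBS_O_WORKDIR"
echo "Running on $(hostname)"
EOF
```

-N names the job, -l requests one node with four cores, -j oe merges stdout and stderr into one file, and $PBS_O_WORKDIR is the directory qsub was run from.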

Should all work OK now!

nVidia GPUs: Fixing Power State Woes in Linux

This one is easy. I mean, embarrassingly easy.

A little background, however: in Linux, nVidia GPUs don’t appear to clock down properly, even when idle and not connected to a monitor. This is frustrating for several reasons:

  1. P0 (full power) state when ‘idle’ wastes power, which wastes money
  2. The cards idle hot (my GTX 1080s idle around 60°C without this tweak, and around 35-40°C with it)
  3. The P0 state downshifts to P2 (approximately a 20% performance drop!) when under compute load (which is stupid, but apparently to protect consumer grade cards under sustained load)
  4. I expect my computers to do what I tell them

Fortunately, nvidia-smi, the system management interface, can come to your rescue, without needing Xorg or nvidia-settings or a GUI.

Pick your favourite text editor (VI(m), Emacs, nano, pico, Xed… even Leafpad (*shudder*)) and point it, with root privileges, to:

/etc/rc.local

Then add:

/usr/bin/nvidia-smi -i 0,1 -pm ENABLED

To the file, just above exit 0, then save and quit. Reboot (or run the line once by hand as root), and your GPUs will have persistence mode enabled, which will make sure they clock down appropriately.

The -i flag indicates which GPUs you want to instruct with this command, and the -pm flag just sets persistence mode to enabled.
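To confirm it’s taken effect, nvidia-smi can report each card’s state directly; the query fields below are standard nvidia-smi ones, and the snippet falls back cleanly on a machine without the driver installed:

```shell
# Report index, power state and persistence mode for every GPU.
# Falls back to a message if nvidia-smi isn't present on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    report=$(nvidia-smi --query-gpu=index,pstate,persistence_mode --format=csv)
else
    report="nvidia-smi not found on this machine"
fi
echo "$report"
```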

There’s another trick to get the cards to sit at P0 state when in compute mode, but I’ll admit I’ve not tried it because mine already get plenty warm enough.

UEFI & Linux

I’ll do a longer post on this another time, going into the details of how to persuade Windows 10 Pro and Ubuntu 16.04/Mint 18(.1) (or other distro of choice that supports SecureBoot) to play nicely together with SecureBoot enabled and third party drivers so CUDA can actually… y’know, work.

But for right now, a quick PSA (that’s Public Service Announcement, not psa as in the gene family…):

I’ve not had Ubuntu get it quite right yet with the USB stick I used to install from; the installer wants to format the very USB stick it’s running from, which naturally enough doesn’t work.

For the purposes of this, I’m going to assume you want to run an SSD (solid state drive) as your main system drive, with a larger spinning-rust HDD for bulk data storage. I’ve done various setups over the years, but this one seems to work the best.

Partitioning must be set manually (I’ll fire up Virtualbox and take some screenshots to demonstrate later) to the following…

EFI System Partition (512MB) Beginning of SSD
Swap Partition (16GB) End of SSD
/ (that’s ‘root’ and holds everything else) the rest of the SSD
/data all the HDD

You also need to set the installation target of the GRUB bootloader to /dev/sda1 (or wherever the EFI System Partition actually is – not the MBR!) or the system won’t boot with EFI/SecureBoot enabled.
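After the install, you can sanity-check the layout from the running system. This is only a read-only look at the disks – nothing here writes anything:

```shell
# Show the block-device layout; the ESP should show up as a small vfat
# partition near the start of the SSD. Fall back to a message if lsblk
# isn't available (e.g. in a container).
layout=$(lsblk -o NAME,SIZE,FSTYPE,MOUNTPOINT 2>/dev/null || echo "lsblk unavailable")
echo "$layout"
```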

When Drupal Breaks After Updating

{{This post is a bit of a blast from the past, being somewhat historical. Well, I can’t change the publication date like I could in Drupal (I won’t go into the reasons I quit using Drupal just now) so it gets bumped up to 2017…}}

Since this keeps catching me out every time I haven’t had to update Drupal for a while, here’s the most common way that Drupal manages to “500 Internal Error” itself after an update:

.htaccess

One little hash can totally ruin my day (well, for about five minutes until I remember why exactly everything is broken…)

Find the line:

# RewriteBase /

and change it to read:

RewriteBase /

Job done. Simple, but incredibly annoying if not expected, as it doesn’t break the ‘main’ webpage… just every other page and link on the entire site.

NB: If you’re sensible and running a tight set of permissions on your .htaccess file, don’t forget to chmod it first so you can actually make changes, then chmod it back again when done; forgetting that last bit can be a Bad Thing™.
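The whole dance – loosen permissions, uncomment the line, tighten again – can be scripted. This demo recreates the situation in a scratch directory so it’s safe to run anywhere; in real life you’d point it at the .htaccess in your Drupal root:

```shell
cd "$(mktemp -d)"                         # scratch directory for the demo
printf '# RewriteBase /\n' > .htaccess    # stand-in for the real file
chmod 444 .htaccess                       # the 'tight permissions' starting point
chmod u+w .htaccess                       # loosen...
sed -i 's|^# RewriteBase /$|RewriteBase /|' .htaccess
chmod 444 .htaccess                       # ...and tighten again when done
grep 'RewriteBase' .htaccess              # prints: RewriteBase /
```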

Stopping Drupal from Summarising Everything (Badly)

{{This post is a bit of a blast from the past, being somewhat historical. Well, I can’t change the publication date like I could in Drupal (I won’t go into the reasons I quit using Drupal just now) so it gets bumped up to 2017…}}

I absolutely abhor the “summary” or “teaser” view that Drupal insists on imposing by default when an Article exceeds a certain number of lines. It’s the most annoying thing in the world – especially when it gives you the option of forcing a break, but then completely ignores you and does its own thing anyway.

Regardless, there is a method of fixing it:

In the Administration Menu, head for Structure, then Content Types. Click ‘Manage Display’, then ‘Custom Display Settings’, and uncheck ‘Teaser’.

This must be done for each Content type.

It doesn’t, however, stop Drupal from giving you the incredibly stupid “Summary” of the post above the post when you hit “Preview”. But that is a minor irritation.

If anyone has found a way to do this for all Content types across the whole site in one go, that would be great.

Python, PIP and Visual Studio

{{This post is a bit of a blast from the past, being somewhat historical. Well, I can’t change the publication date like I could in Drupal (I won’t go into the reasons I quit using Drupal just now) so it gets bumped up to 2017…}}

When Python moved to the ‘pip’ installer system, I was torn about it. On one hand, it sounded like a great idea – package management! Great! But so far I’ve spent a fair bit of time grumbling over its oddities and annoyances.

Trying to install Numpy, it just wouldn’t do it. It would always fail with an error message that, while not the most useless in the world (that honour is reserved for the wonderful “Error: no error”), could have been quite a bit less cryptic.

I recognise the file it’s complaining about – vcvarsall.bat – because it’s part of Visual Studio.

It’s interesting that it’s complaining, because I have Visual Studio 2013 Community installed, and am slowly learning that… (shock)… I quite like it. But the issue is that Python 3.4 was compiled using Visual Studio 2010, and it looks for vcvarsall.bat in one place and one place only.

Anyway, there are several options, most of which are simply unacceptable in my view.

“The internet” seems to think that installing Visual C++ 2010 Express when I already have VS2013 installed is a sensible (nay, essential!) thing to do. I disagree. So I refused to do it.

The next “internet suggestion” was to install MinGW and point pip at it as a compiler. OK, but I’ve got a compiler already, and I want to use that. While my dev environments tend to have a lot in them, I don’t need a Swiss Army Knife with 1001 tools – just a few will do me, thanks.

The third was to use pre-compiled binaries. OK, I’ve done that before with things like PyMol – let me just take this opportunity to plug Christoph Gohlke’s superb collection – but it has to be possible to do this myself…

To that end, you can just do the following:

Create an environment variable called “VS100COMNTOOLS” (Visual Studio 2010 is version 10.0) which points at the same directory as “VS120COMNTOOLS” (Visual Studio 2013 being version 12.0). From a command prompt, setx VS100COMNTOOLS "%VS120COMNTOOLS%" will do it.

Which works perfectly.

A New Job

So, I’ve moved on to newer pastures.

I’m not involved directly with magnetic resonance any longer, although that will always be close to my (scientific) heart. Being back in a wet lab, albeit while still doing a lot of computing related work, is exciting.

This blog/site/whatever-you-want-to-call-it won’t focus on my job. I’ll post now and again about curiosities encountered and minor challenges overcome (particularly where a recalcitrant bit of code is concerned), but almost anything that catches my attention may be fair game.

Books, music, computing, science… perhaps a bit of sight-seeing if I go somewhere particularly worth sharing with more than just close friends and family… all are good.

But if you just come for one or two posts, that’s fine too.

Of course, the remit might expand as time passes, but on that front… we shall have to see.

In the meantime, take care all.