Sur un air de balisette

2010/02/17

Launching Python scripts via Condor

Filed under: Python — Alexandre Fayolle @ 14:03

As part of an ongoing customer project, I’ve been learning about the Condor queue management system (actually it is more than just a batch queue management system, tacking the High-throughput computing problem, but in my current project, we’re not using the full possibilities of Condor, and the choice was dictated by other considerations outside the scope of this note). The documentation is excellent, and the features of the product are really amazing (pity the project runs on Windows, and we cannot use 90% of these…).

To launch a job on a computer participating in the Condor farm, you just have to write a job file which looks like this:

Universe=vanilla
Executable=$path_to_executabe
Arguments=$arguments_to_the_executable
InitialDir=$working_directory
Log=$local_logfile_name
Output=$local_file_for_job_stdout
Error=$local_file_for_job_stderr
Queue

and then run condor_submit my_job_file and use condor_q to monitor the status your job (queued, running…)

My program is generating Condor job files and submitting them, and I’ve spent hours yesterday trying to understand why they were all failing : the stderr file contained a message from Python complaining that it could not import site and exiting.

A point which was not clear in the documentation I read (but I probably overlooked it) is that the executable mentionned in the job file is supposed to be a local file on the submission host which is copied to the computer running the job. In the jobs generated by my code, I was using sys.executable for the Executable field, and a path to the python script I wanted to run in the Arguments field. This resulted in the Python interpreter being copied on the execution host and not being able to run because it was not able to find the standard files it needs at startup.

Once I figured this out, the fix was easy: I made my program write a batch script which launched the Python script and changed the job to run that script.

UPDATE : I’m told there is a Transfer_executable=False line I could have put in the script to achieve the same thing.

Advertisements

2010/02/12

Why you shoud get rid of os.system, os.popen, etc. in your code

Filed under: Python — Alexandre Fayolle @ 13:57

I regularly come across code such as:

output = os.popen('diff -u %s %s' % (appl_file, ref_file), 'r')

Code like this might well work machine but it is buggy and will fail (preferably during the demo or once shipped).

Where is the bug?

It is in the use of %s, which can inject in your command any string you want and also strings you don’t want. The problem is that you probably did not check appl_file and ref_file for weird things (spaces, quotes, semi colons…). Putting quotes around the %s in the string will not solve the issue.

So what should you do? The answer is “use the subprocess module”: subprocess.Popen takes a list of arguments as first parameter, which are passed as-is to the new process creation system call of your platform, and not interpreted by the shell:

pipe = subprocess.Popen(['diff', '-u', appl_file, ref_file], stdout=subprocess.PIPE)
output = pipe.stdout

By now, you should have guessed that the shell=True parameter of subprocess.Popen should not be used unless you really really need it (and even them, I encourage you to question that need).

Blog at WordPress.com.