open-source Notes

Notes of an open-source programmer.
24 Nov

How To: create your own work generator script (part 1)

This article deals with creating your own modified work generator for a BOINC project. In fact, it is just an explanation of a file I created for a project admin who explained his application workflow to me. It’s not meant for general use, but it can serve as a blueprint for your own. These are the assumptions I was given:

  • a single application using the wrapper feature
  • one input file (a text file containing configuration parameters)
  • four output files
  • some kind of database, packaged with the application

The configuration file changes with every job (workunit) and is generated either by hand or by an external program. The work generator’s task is to feed the text files found in a specific directory into the BOINC framework, and to continue until a specified threshold of tasks is reached or the directory is empty. So this is my outline of the work generator workflow:

  1. Look for all files in directory ../input_files/
  2. Take File#1 as input file for Job#1 and create it
  3. Move File#1 into the download directory
  4. Go to step 2, proceed with File#2 and create Job#2
  5. Continue until all files found in step 1 are processed OR a certain threshold is reached (CUSHION)
  6. Start again at step 1 when the number of unsent tasks is too low

And now I’ll try to convert this concept into some functional C++ code. I’ll divide the file into four parts (header, make_job, main_loop and main). Let’s start with the header and some definitions.

// os-tools work generator written by Christian Beer
// http://blog.os-tools.net
// Copyright (C) 2008 Christian Beer
//
// This is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation;
// either version 2.1 of the License, or (at your option) any later version.
//
// This software is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
// See the GNU Lesser General Public License for more details.
//
// To view the GNU Lesser General Public License visit
// http://www.gnu.org/copyleft/lesser.html
// or write to the Free Software Foundation, Inc.,
// 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

// ostools_work_generator.C: a BOINC work generator.
// This work generator has the following properties:
//
// - Runs as a daemon, and creates an unbounded supply of work.
//   It attempts to maintain a "cushion" of 5 unsent job instances (tasks)
// - Creates work for the application "ostools_app".
// - Moves the input file for each job from input to download directory;
//   the job names contain a timestamp and a sequence number, so that they're unique.
// - Search for TODO within this file to see more configuration options

#include 
#include 
#include 
#include 
#include 

#include "filesys.h"
#include "boinc_db.h"
#include "error_numbers.h"
#include "backend_lib.h"
#include "parse.h"
#include "util.h"

#include "sched_config.h"
#include "sched_util.h"
#include "sched_msgs.h"

#define CUSHION 5
  // maintain at least this many unsent tasks (results)
#define REPLICATION_FACTOR  1
  // generate this many tasks for every job (workunit) created
#define INPUT_FOLDER "../input_files"
  // The directory where the input files are stored
  // NO trailing slash!
// Search for TODO within this file to see more configuration options

// globals
char* wu_template;
DB_APP app;
int start_time;
int seqno;

Basically I used the includes from the sample work generator and added the INPUT_FOLDER define. I also considered putting all configuration values at the top, but that isn’t needed because they are set only once, when the work generator is adapted to a specific project and application. Instead I added TODO comments at those points; we’ll come to them later in the script.

The config option CUSHION is the threshold of unsent tasks (results) that the generator maintains for the specified application. The generator creates as many jobs as are needed to reach this value.

The next option, REPLICATION_FACTOR, influences this threshold indirectly. The replication factor determines how many tasks are initially generated for each job.
If you set CUSHION to 10 and REPLICATION_FACTOR to 2, the generator will produce 5 jobs with 2 tasks each on the first run. You shouldn’t generate too many tasks at first; keep in mind that your server hardware probably can’t handle the load and/or traffic of hundreds of thousands of tasks. SETI@home, for example, tries to maintain a cushion of 500,000 unsent tasks and uses four servers to handle the load and traffic.
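
The arithmetic behind that example can be written down as a small helper. This is a sketch, not the code of the generator itself: the name jobs_needed is hypothetical, and rounding up when the missing tasks do not divide evenly is my assumption (a generator could just as well round down and catch up on the next pass).

```cpp
// Hypothetical helper: number of jobs to create so that the count of unsent
// tasks reaches at least `cushion`, given `replication_factor` tasks per job.
int jobs_needed(int unsent_tasks, int cushion, int replication_factor)
{
    if (replication_factor <= 0 || unsent_tasks >= cushion) {
        return 0;  // nothing to do, or an unusable configuration
    }
    const int missing = cushion - unsent_tasks;
    // Ceiling division (assumption): round up so the cushion is fully met.
    return (missing + replication_factor - 1) / replication_factor;
}
```

With no unsent tasks, a cushion of 10 and a replication factor of 2, this yields the 5 jobs from the example above.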
Finally, the INPUT_FOLDER option specifies the folder where the raw input files are stored. It is given as a path relative to the working directory of the generator (bin/), but you can also define an absolute path somewhere else. You just have to make sure that the generator (more specifically, the user running the generator) can read and write the files and that only valid input files are stored there. This work generator does not check the validity of the input files, but such a check should be easy to implement.
The last part of the header consists of some globals that are used later in the code.

Read the next parts of this How To: the make_job function, the main_loop function and the main function.
