Jobsched is a scheduler that allows you easily run periodic tasks on many hosts using just curl
.
He allows you to execute any specified scripts or arbitrary python code in designated time intervals.
He comes just with Flask
and apscheduler
dependencies for initial setup.
Easiest way to play with Jobsched (works if you have virtualenv):
git clone https://github.com/frenzykryger/jobsched
cd jobsched
virtualenv shedenv
source schedenv/bin/activate
pip install -r requirements.txt
./debug.sh
Now you have Jobsched, ready for scheduling!
Jobsched require two command line options to start:
--conf-dir
- directory with all config files--tasks-dir
- directory with tasks files
catalog.cfg
specify tasks that will be allowed to run in Jobsched
For example, lets look on this configuration:
catalog.cfg
[runfoo] file=runscript.py class=RunScript script=foo.bash [example] file=example.py class=ExampleJob phrase=foo [timeout] file=timeout.py class=TimeoutJob phrase=awwwww
This is ini file with several sections. Each section describe one task.
- Section name is a task name
- file is name of python source file in directory specified by
--tasks-dir
- class is class name which represents this task in file
- other options are arguments which will be passed to
__init__
of this class
Example: Task runfoo is a class RunScript located in runscript.py. It accepts argument script (that this task will run).
jobsched.cfg
is actually a python source file with just three options:
- JOB_LIVETIME In seconds. Only applies for jobs that are controlled by watchdog.
- JOBSCHED_PORT
- SCHED_CONFIG -
Advanced Python Scheduler
configuration. Dictionary with any possible scheduler config options. See http://packages.python.org/APScheduler/ for details. For example, here you can tell scheduler to store it's jobs in MySQL, file, sqlite, PostgreSQL, MongoDB. Or you can specify special threadpool for jobs.
By default scheduler store its jobs to a file using Schelve. Adjusting this options should allow you to make persistent jobs (which could be run after Jobsched restart), or even run jobs on several hosts. Just like its done with Quartz scheduler.
logging.cfg
is just configuration file for ... logging module.
Default level is DEBUG and all logs just go to stdout.
Jobsched provides RESTful interface for job scheduling.
You allowed only to schedule task that was described in catalog.cfg
.
To get current task catalog you need to perform this query:
curl -i http://127.0.0.1:58631/catalog
HTTP/1.0 200 OK Content-Type: application/json Content-Length: 65 Server: Werkzeug/0.8.3 Python/2.7.3 Date: Sun, 12 Aug 2012 23:45:29 GMT { "tasks": [ "runfoo", "example", "timeout" ] }
As you see, here we have tasks "runfoo", "example" and "timeout".
Currently, only periodic running tasks are supported for scheduling.
To schedule new job you need to perform query like this one:
curl -iH 'content-type:application/json' -d '{"task":"example","name": "fake", "interval": {"seconds": 1}}' http://127.0.0.1:58631/jobs
HTTP/1.0 202 ACCEPTED Content-Type: text/html; charset=utf-8 Content-Length: 0 Location: http://127.0.0.1:58631/jobs/22bc4c4e-6749-4e25-b329-4ba0afe07d41 Server: Werkzeug/0.8.3 Python/2.7.3 Date: Sun, 12 Aug 2012 23:50:02 GMT
This is a POST query to /jobs
.
Query format is
{ "task": "<one of tasks from tasks catalog>", "name": "<name of this job>", "interval": "<interval between job runs. Job will start not immediately, but after this interval" }
Interval format is
{ "seconds": <seconds>, "minutes": <minutes>, "hours": <hours>, "days": <days>, "weeks": <weeks> }
Any of this interval parameters may be omitted
Jobsched will return HTTP 400, if request is malformed.
If scheduling was successful, Jobsched will return HTTP 202 and URL to scheduled job in Location header.
To view job details you can memorize it's job id or store entire URL from Location header after job scheduling.
You need to perform query like this one:
curl -i http://127.0.0.1:58631/jobs/2db27b92-d458-41eb-840b-dea280d17fb2
HTTP/1.0 200 OK Content-Type: application/json Content-Length: 140 Server: Werkzeug/0.8.3 Python/2.7.3 Date: Mon, 13 Aug 2012 00:15:44 GMT { "next_run": "2012-08-13 04:15:44.932468", "interval": "0:00:01", "id": "2db27b92-d458-41eb-840b-dea280d17fb2", "name": "fake" }
This is GET query to /jobs/<uuid>
.
To view list of scheduled jobs just do GET query to /jobs
Example:
curl -i http://127.0.0.1:58631/jobs
HTTP/1.0 200 OK Content-Type: application/json Content-Length: 186 Server: Werkzeug/0.8.3 Python/2.7.3 Date: Mon, 13 Aug 2012 00:20:18 GMT { "values": [ { "next_run": "2012-08-13 04:20:18.932468", "interval": "0:00:01", "id": "2db27b92-d458-41eb-840b-dea280d17fb2", "name": "fake" } ] }
To unschedule job you must send HTTP DELETE to /jobs/<uuid>
For example:
curl -i -X DELETE http://127.0.0.1:58631/jobs/2db27b92-d458-41eb-840b-dea280d17fb2
HTTP/1.0 200 OK Content-Type: text/html; charset=utf-8 Content-Length: 0 Server: Werkzeug/0.8.3 Python/2.7.3 Date: Mon, 13 Aug 2012 00:23:16 GMT
Jobsched comes with runscript.RunScript task that will allow you to execute any script with maximum livetime set to JOB_LIVETIME.
But instead, you can execute any arbitrary python code as a job!
To do this, create some file in your --task-dir
, for example job.py
.
Create a class there
class MyJob(object): def __init__(self, **my_parameters): """Initialization of your task parameters""" pass def __call__(self, info): """Your job goes here!""" pass
Then, just add it to your catalog.cfg
[myGlamorousJob] file=job.py class=MyJob myFineParameter=1 myBelovedArgument=foo theBestFunctionArgumentIeverSeen=3
Restart Jobsched.
And viola! Now you can schedule job myGlamorousJob
!
Propably you see info
parameter in __call__
. It contains your job id and timeout. Maybe you want to manage your job livetime yourself.
Instead you can use watchdog
decorator and he will manage your job execution for you!
Just like this
from jobsched.watchdog import watchdog class MyJob(object): def __init__(self, **my_parameters): """Initialization of your task parameters""" pass @watchdog() def __call__(self, info): """Your job goes here!""" pass
And thats all! Now your job will be vicously killed after JOB_LIVETIME seconds and you can find what happened in Jobsched logs, because watchdog streamed all your job output there.
You think you know better what timeout is desireable for your job?
Not a problem
from jobsched.watchdog import watchdog class MyJob(object): def __init__(self, **my_parameters): """Initialization of your task parameters""" pass @watchdog(1) def __call__(self, info): """Your job goes here!""" pass
Now your job maximum execution time is set to 1 second!
This is essentially all that someone might say about Jobsched.
- v0.1 - Initial scheduler implementation, watchdog, support for periodic jobs.