
I'm writing a build-script framework, so that build scripts can be configured via a YAML file, tasks.yml:

---
release: # This is a 'runner', or group of related tasks. They share common initialization needs
  config:
    repo: ci-repo # Configuration parameters can be set at runner level and overridden by nested tasks/steps
  tasks: # List of tasks provided by this runner. We can run one task at a time
    publish:
      config: # Tasks can have configuration parameters
        properties:
          moo: maa
          shoo: shaa
      steps:
        - publish:
            path: org/path
    promote:
      steps: # Tasks can have many steps that will run sequentially
        - promote:
            repo: promotion-repo
        - send_email:
            recipient: [email protected]
buildcxx: # Another runner
  tasks:
    debug:
      steps:
        - clean_build_folders
        - cmake:
            cmake_args: -DFoo=Bar
...

The entry point of my package, bsf.py, accepts 2 arguments:

> ./bsf.py RUNNER TASK

For example, with the above tasks.yml:

> ./bsf.py release promote

This will run all the steps in the promote task of the release runner.
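Once tasks.yml is parsed, selecting the steps to run is just a pair of nested dictionary lookups. A minimal sketch, with the parsed structure written out as a plain dict (a hypothetical hand-built equivalent of what the YAML loader returns):

```python
# Hand-written equivalent of the parsed tasks.yml (release runner only)
tasks = {
    'release': {
        'config': {'repo': 'ci-repo'},
        'tasks': {
            'promote': {
                'steps': [
                    {'promote': {'repo': 'promotion-repo'}},
                    {'send_email': {'recipient': '[email protected]'}},
                ],
            },
        },
    },
}

runner, task = 'release', 'promote'  # as given on the command line
steps = tasks[runner]['tasks'][task]['steps']
for step in steps:
    # Each step is either a bare string or a one-key dict of name -> config
    print(list(step)[0])
```

This is the shape the framework code below navigates: runners at the top level, tasks under each runner's `tasks` key, and steps as a list under each task.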

#!/usr/bin/env python
"""bsf.py"""
from argparse import ArgumentParser
import importlib
import inspect
import sys
import yaml
from runners import *  # Need to import all runners


def parse_arguments():
    mainparser = ArgumentParser('BSF')
    mainparser.add_argument('runner', help='Runner as specified in tasks.yml')
    mainparser.add_argument('task', help='Task to run as specified in tasks.yml')
    mainparser.add_argument('-s', '--source', default='source', help='Location of source to be built')
    return mainparser.parse_args()


def get_config(cfg):
    with open(cfg, 'r') as stream:
        return yaml.load(stream)


def get_runner_class(runner, module='runners'):
    """
    Returns the runner class specified by the runner name
    :param runner: string with runner class name
    :param module: module where to search for runner classes
    :return: runner class
    """
    runners = inspect.getmembers(sys.modules[module], inspect.isclass)  # Returns classes in module
    runner_list = dict()
    for rn in runners:
        runner_class_name, runner_class_object = rn
        runner_name = str(runner_class_name.lower())  # Convert class name to lowercase, maybe better to use a class prop
        # dict with lower case task name and proper class name capitalization
        runner_list[runner_name] = runner_class_name
    return getattr(importlib.import_module(module), runner_list[runner])


def get_resulting_task_config(default_tasks, runner_config, task_config):
    """
    Returns the resulting config by overriding:
    - default is overridden by runner
    - runner is overridden by task
    :param default_tasks: global tasks definition and config
    :param runner_config: current runner config
    :param task_config: current task config
    :return: dictionary containing the resulting config
    """
    resulting_config = dict(default_tasks)  # Copy so the caller's defaults are not mutated
    resulting_config.update(runner_config)
    resulting_config.update(task_config)
    return resulting_config


def main():
    args = parse_arguments()
    runner = args.runner
    task = args.task
    default_tasks = get_config('tasks.yml')
    print 'Running %s:%s' % (runner, task)
    runner_config = dict()
    task_config = dict()
    if runner not in default_tasks:
        print 'ERROR: runner %s not defined in tasks' % runner
        sys.exit(-1)
    if task not in default_tasks[runner]['tasks']:
        print 'ERROR: task %s not defined in runner %s' % (task, runner)
        sys.exit(-1)
    task_definition = {tsk: default_tasks[runner]['tasks'][tsk]
                       for tsk in default_tasks[runner]['tasks'] if tsk == task}
    if 'config' in default_tasks[runner]:
        runner_config = default_tasks[runner]['config']
    if 'config' in default_tasks[runner]['tasks'][task]:
        task_config = default_tasks[runner]['tasks'][task]['config']
    runner_class = get_runner_class(runner)
    rnr = runner_class(args.source, task_definition, runner_config, task_config)
    rnr.do(task)


if __name__ == '__main__':
    main()
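The layered override in get_resulting_task_config (defaults < runner < task) can be sketched independently. One detail worth noting: dict.update mutates in place, so starting from a copy keeps the caller's defaults intact. A minimal illustration, not the framework's code:

```python
def merge_config(default, runner, task):
    # Later layers win: default < runner < task.
    # Copy first so update() does not mutate the caller's default dict.
    merged = dict(default)
    merged.update(runner)
    merged.update(task)
    return merged

default = {'repo': 'ci-repo', 'verbose': False}
runner = {'repo': 'release-repo'}
task = {'verbose': True}
print(merge_config(default, runner, task))
# 'repo' comes from the runner layer, 'verbose' from the task layer
```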

And this is the pipeline.py file, with different classes extending a base one.

The reason for extending the base class is that we may not always need all the components: for example, running a build task will not need the 'artifact repository manager' that the publish and promote tasks need. This lets us use just the parts of the framework that we need at a given time.

"""pipeline.py: Pipeline classes
We have a basic pipeline class with the minimum config.
Additional classes extend the functionality and initialize different parts as needed
This allows to just initialize the required parts and group related methods together
"""
import os
import sys
class Pipeline(object):
 """Base pipeline class, in charge of the minimal configuration
 """
 def __init__(self, source):
 self.source = source #: Source code folder
 self.build_number = os.environ.get('bamboo_buildNumber')
 self.vcs = None #: Version control manager
 self.binary_repo = None #: Binary repository manager, ie: Artifactory
 self.confluence_client = None #: Confluence API client
 self._build_version = None
 def init_vcs(self):
 """Initialize VCS manager from environment and/or info from source folder
 """
 self.vcs = 'Foo'
 @ property
 def build_version(self):
 if self._build_version is None:
 self._build_version = os.environ.get('bamboo_build_version')
 if self._build_version is None:
 """Ideally will try to get the version by other means"""
 print "ERROR: Can't determine the build version"
 sys.exit(-1)
 return self._build_version
class ReleaseWorker(Pipeline):
 """Extends the Pipeline with release tasks
 """
 def __init__(self, source):
 super(ReleaseWorker, self).__init__(source)
 def publish(self, path, repo):
 print 'ReleaseWorker: Publishing to %s in %s' % (path, repo)
 # self.pipeline.binary_repo.publish(path, repo)
 def promote(self, repo):
 print 'ReleaseWorker: Promoting to %s' % repo
 # self.pipeline.binary_repo.promote(repo)
 def send_email(self, recipient):
 print 'Sending email to %s' % recipient
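The build_version property above is a lazy-cache pattern: the environment lookup happens on first access and the result is memoized in _build_version. A stripped-down sketch of the same pattern, raising an exception instead of calling sys.exit so it can be exercised in isolation:

```python
import os

class Pipeline(object):
    def __init__(self):
        self._build_version = None  # Filled in lazily on first access

    @property
    def build_version(self):
        if self._build_version is None:
            self._build_version = os.environ.get('bamboo_build_version')
        if self._build_version is None:
            raise RuntimeError("Can't determine the build version")
        return self._build_version

# Hypothetical CI environment for the sake of the example
os.environ['bamboo_build_version'] = '1.2.3'
print(Pipeline().build_version)
```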

And finally, this is the runner.py file. Each runner has its own configure method and shares a common do method. The generic do iterates through the steps (which are methods of the corresponding pipeline worker class), running them.

from pipeline import ReleaseWorker
import inspect


def _parse_step_config(step):
    # Apply step-specific config. Defaults are defined at the step method level
    if type(step) is dict:  # Step contains additional config
        step_name = step.keys()[0]  # Steps should be a dict with one single item
        step_config = step[step_name]
    else:
        step_name = step
        step_config = dict()
    return step_name, step_config


class Runner(object):
    """Base runner class"""
    def __init__(self, source, task_definition, runner_config, task_config):
        self.source = source
        self.config = runner_config
        self.config.update(task_config)
        self.task = task_definition
        self.pipeline = None  # Should be initialized by the child class

    def configure(self):
        """Configuration should be done at child level"""
        pass

    def do(self, task):
        """Ideally, child classes should not override this method"""
        self.configure()
        steps = self.task[task]['steps']
        for step in steps:
            print('-' * 120 + '\nRunning step %s' % step)
            step_name, step_config = _parse_step_config(step)
            step_method = getattr(self.pipeline, step_name)  # Get the method
            valid_args = inspect.getargspec(step_method).args[1:]  # Arguments from task config that are applicable
            # Compute step config; copy so one step's config does not leak into the next
            step_resulting_config = dict(self.config)
            step_resulting_config.update(step_config)
            step_arguments = {arg: step_resulting_config[arg]
                              for arg in valid_args
                              if arg in step_resulting_config}  # Dict with applicable args
            step_method(**step_arguments)  # Run step


class Release(Runner):
    def configure(self):
        self.pipeline = ReleaseWorker(source=self.source)
        self.pipeline.binary_repo = 'Foo'  # This should be an object from some manager class
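The core of do is dynamic dispatch: look up the step method by name with getattr, then pass only the config keys the method actually accepts. The framework uses inspect.getargspec (Python 2); a self-contained sketch of the same filtering with inspect.signature, using a hypothetical stand-in worker class:

```python
import inspect

class Worker(object):
    """Hypothetical stand-in for a pipeline worker class."""
    def promote(self, repo, dry_run=False):
        return 'promoting to %s (dry_run=%s)' % (repo, dry_run)

# The step config may contain keys the method does not accept
step_config = {'repo': 'promotion-repo', 'unused_key': 42}

worker = Worker()
method = getattr(worker, 'promote')                  # look up the step by name
params = list(inspect.signature(method).parameters)  # bound method, so no 'self'
kwargs = {k: v for k, v in step_config.items() if k in params}
print(method(**kwargs))  # only 'repo' is passed through
```

Because only applicable keys survive the filter, a step method can declare defaults (like dry_run above) and the merged config simply overrides whichever ones it names.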

In case you want to get these files, you can save all the copy/paste by cloning this repo. I hope no one gets upset about me adding that link.

I have several concerns with this:

  • I have the feeling I'm overcomplicating things
  • I have a horrible naming pattern there:
    • runner = group of tasks
    • pipeline = build (the noun, as in 'my build is broken')

In case you are interested in the real project, you can follow the development here

asked Sep 12, 2015 at 14:31

1 Answer


OK, I managed to simplify the get_runner_class method.

The first step is to add a name class attribute to each runner, so we don't have to rely on the class name matching the tasks.yml key.

New Runner class:

class Release(Runner):
    name = 'release'
...

The other simplification is to use Runner.__subclasses__() to get the list of all possible runners.

New get_runner_class method (much cleaner and more explicit):

def get_runner_class(runner):
    runners = Runner.__subclasses__()
    for rn in runners:
        if rn.name == runner:
            return rn
    return None
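A self-contained sketch of the mechanism, with hypothetical runner classes just to show the lookup:

```python
class Runner(object):
    name = None

class Release(Runner):
    name = 'release'

class BuildCxx(Runner):
    name = 'buildcxx'

def get_runner_class(runner):
    # __subclasses__() lists every direct subclass defined so far,
    # so each runner only needs to declare its 'name' class attribute.
    for rn in Runner.__subclasses__():
        if rn.name == runner:
            return rn
    return None

print(get_runner_class('buildcxx').__name__)
```

One caveat: `__subclasses__()` only returns direct subclasses, so runners that inherit from another runner (rather than from Runner itself) would need a recursive walk to be found.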
answered Sep 13, 2015 at 13:44
