Dataflow computing in Python


I have n processes (usually n < 10, but it should scale) running on different machines and communicating via RabbitMQ over AMQP. The processes are typically long-running and can be implemented in any language (although most are Java/Python).

Each process requires multiple inputs (numbers and/or strings) and produces multiple outputs (numbers and/or strings). Executing a process is asynchronous: you send a message to its input queue and wait for the callback triggered by its output queue.
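For concreteness, the send/callback round trip might look roughly like the sketch below using pika (one of several Python AMQP clients). The queue names ("proc_a.in"/"proc_a.out") and the JSON message format are assumptions for illustration, not part of my actual setup.

```python
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Assumed convention: each process listens on "<name>.in" and publishes to "<name>.out".
channel.queue_declare(queue="proc_a.in")
channel.queue_declare(queue="proc_a.out")

def on_output(ch, method, properties, body):
    """Callback fired when the process publishes its outputs."""
    outputs = json.loads(body)          # e.g. {"y": 42.0, "status": "done"}
    print("received outputs:", outputs)
    ch.stop_consuming()

# Send the inputs (numbers/strings only) to the process's input queue ...
channel.basic_publish(exchange="",
                      routing_key="proc_a.in",
                      body=json.dumps({"x": 3.14, "name": "run-1"}))

# ... and wait asynchronously for the callback triggered by the output queue.
channel.basic_consume(queue="proc_a.out",
                      on_message_callback=on_output,
                      auto_ack=True)
channel.start_consuming()
connection.close()
```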

Ideally the user specifies some inputs and the desired outputs, and the system should:

  • Determine which processes are required and generate the dependency graph
  • Topologically sort the graph and execute it; node transitions will need to be event-driven

A node should fire when all of its inputs are ready (see the sketch below). I can ignore cycles for now, but eventually there will be cycles (i.e., two processes may need to be iterated until their outputs no longer change).
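To make the planning and firing rules concrete, here is a minimal, single-machine sketch. The process names and their inputs/outputs are invented, and the actual execution step is a placeholder; in the real system each firing would be the AMQP round trip shown earlier. It derives the set of required processes for the requested outputs and fires a node as soon as all of its inputs are available, which amounts to executing the dependency graph in topological order without computing the sort explicitly.

```python
from typing import Dict, Set, Tuple

# Hypothetical process table: name -> (input names, output names).
PROCESSES: Dict[str, Tuple[Set[str], Set[str]]] = {
    "scale":   ({"x"},                 {"x_scaled"}),
    "label":   ({"name"},              {"label"}),
    "combine": ({"x_scaled", "label"}, {"report"}),
}

def plan(requested_outputs):
    """Walk backwards from the requested outputs to find the processes needed."""
    producer = {out: name for name, (_, outs) in PROCESSES.items() for out in outs}
    needed, todo = set(), list(requested_outputs)
    while todo:
        value = todo.pop()
        if value in producer and producer[value] not in needed:
            name = producer[value]
            needed.add(name)
            todo.extend(PROCESSES[name][0])   # this node's inputs may need producers too
    return needed

def run(initial_values, requested_outputs):
    """Event-driven execution: a node fires as soon as all its inputs are available."""
    values = dict(initial_values)
    pending = plan(requested_outputs)
    while pending:
        ready = {n for n in pending if PROCESSES[n][0].issubset(values)}
        if not ready:
            raise RuntimeError("cannot make progress; missing inputs for %s" % pending)
        for name in ready:
            # Placeholder for "publish to the input queue, wait for the output callback".
            values.update({out: "<%s result>" % name for out in PROCESSES[name][1]})
        pending -= ready
    return {k: values[k] for k in requested_outputs}

print(run({"x": 3.14, "name": "run-1"}, ["report"]))
```

Cycles would break the "fire once" assumption in this sketch; a fixed-point loop around the cyclic subgraph would be needed instead.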

This should be a well-known problem in streaming/dataflow computing and I don't want to reinvent the wheel. I'd prefer a Python solution; a search leads me to Trellis and pypes. Trellis is no longer developed but seems to support cycles, while pypes does not. I'm also not sure how actively developed pypes is.

There may be further options worth exploring, but I am not particularly knowledgeable about any of them.

Does anyone have experience handling this type of problem, or can you recommend libraries for it?

Edit: I have found other libraries:

    • (Java)

    • The python.org wiki has a page on "Flow Based Programming"
