gearman
 

Gearman provides a generic framework for farming out work to other machines. It lets you dispatch function calls to machines that are better suited to do the work, do work in parallel, load balance processing, and call functions between languages. The first implementation came from the folks at Danga Interactive (LiveJournal/SixApart). This wiki was set up to provide a single place to organize all information related to Gearman. Content is being updated regularly, so please check back often. You may also want to check out other forms of communication if you would like to learn more or get involved!


How do I use Gearman?

A basic Gearman setup consists of three parts: a client, a worker, and a job server. The client is responsible for creating a job and requesting that it be run, the job server finds a suitable worker that can run the job, and the worker actually performs the work. Gearman provides client and worker APIs that your applications call to talk with the Gearman job server (also known as gearmand). To explain this in more detail, let's look at a simple application: reversing the order of characters in a given string.

You would start off by writing a client interface; let's call it “reverse_client”. The “reverse_client” code is responsible for sending off the job and returning the result. It does this using the Gearman client interface by sending some data associated with a function name, in this case “reverse”. Code to do this in C would be (with error handling omitted for brevity):

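/* Initialize a client and point it at the local job server. */
gearman_client_create(&client);
gearman_client_add_server(&client, "127.0.0.1", 0);
/* Send "Hello world!" to the "reverse" function and wait for the result. */
result= gearman_client_do(&client, "reverse", "Hello world!",
                          strlen("Hello world!"), &result_size, &ret);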

This code initializes a client instance, configures it to use a local job server, and then sends a job out to run the “reverse” function with the arguments “Hello world!”. The function name and arguments are completely arbitrary as far as Gearman is concerned, so you could send any packed binary structure that is appropriate for your application. At this point the Gearman client library packages up the job into a Gearman request and sends it to gearmand (the job server) to find an appropriate worker that can run the “reverse” function. Let's now look at how the worker code will look:

void *reverse_function(gearman_job_st *job, void *cb_arg,
                       size_t *result_size, gearman_return_t *ret_ptr)
{
  const uint8_t *workload;
  uint8_t *result;
  size_t x;
  size_t y;

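  /* The workload is the raw bytes sent by the client; the result is the same size. */
  workload= gearman_job_workload(job);
  *result_size= gearman_job_workload_size(job);
  result= malloc(*result_size);
  /* Copy bytes from the end of the workload to the start of the result. */
  for (y= 0, x= *result_size; x; x--, y++)
    result[y]= workload[x - 1];
  *ret_ptr= GEARMAN_SUCCESS;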
  return result;
}

gearman_worker_create(&worker);
gearman_worker_add_server(&worker, "127.0.0.1", 0);
gearman_worker_add_function(&worker, "reverse", 0, reverse_function, NULL);
while (1) gearman_worker_work(&worker);

This code defines a function “reverse_function” that takes a string input and reverses the string into a new output buffer. A worker instance is set up to connect to the same local job server as the client and then registers this function under the name “reverse”. When the job server receives a job to be run, it looks at the list of workers that have registered the name “reverse” and forwards the job to one of the free workers. The Gearman worker library then takes this request, runs the function “reverse_function”, and sends the result of that function back through the job server to the original caller.
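The reversal step itself does not depend on Gearman, so it can be pulled out and checked on its own. The following is a minimal sketch in plain C; “reverse_bytes” is a hypothetical helper name introduced here for illustration and is not part of the Gearman C library. Unlike the worker above, it checks the result of malloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the Gearman library): returns a newly
   allocated copy of `in` with its `n` bytes in reverse order, or NULL if
   allocation fails. The caller owns the buffer and must free() it. */
uint8_t *reverse_bytes(const uint8_t *in, size_t n)
{
  uint8_t *out;
  size_t x;
  size_t y;

  assert(in != NULL || n == 0);

  /* Allocate at least one byte so an empty input still yields a
     valid pointer that can be passed to free(). */
  out= malloc(n ? n : 1);
  if (out == NULL)
    return NULL;

  /* x counts down over the input while y counts up over the output. */
  for (y= 0, x= n; x; x--, y++)
    out[y]= in[x - 1];

  return out;
}
```

A worker callback like “reverse_function” could call such a helper with the job's workload and workload size, then hand the returned buffer back as its result.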

As you can see, the client and worker APIs provided (along with gearmand) deal with the job management and network communication, and all you need to do is write the application-specific parts of the client and worker. More detailed examples of the “reverse” function can be found in the “examples/” directory of the C implementation.

That seems like a lot of work to run a function. How is this useful?

There are a number of ways this can be useful. The simplest answer is that you can use Gearman as an interface between a client and a worker written in different languages. If you want your PHP web application to call a function written in C, you can use the PHP client API with the C worker API and put gearmand in the middle. Of course, there are more efficient ways of doing this (like writing a PHP extension in C), but you may want a PHP client and a Python worker, or perhaps a Perl client and a Java worker. You can mix and match any of the supported language interfaces. Is your favorite language not supported yet? Get involved with the project; it's probably fairly easy for either you or one of the existing Gearman developers to put a language wrapper on top of the C library.

The next step in making Gearman useful is to put the worker code on a separate machine (or set of machines) better suited to do the work. Say your PHP web application wants to do image conversion, but this is too much processing to run on the web server itself. You could instead ship the image off to a separate set of worker machines for conversion, so the load does not impact the performance of your web servers and other PHP scripts. By doing this, you get a natural form of load balancing as well, since the job server only sends new jobs to idle workers. If all the workers running on a given machine are busy, you don't need to worry about new jobs being sent there. This also makes scale-out with multi-core servers quite simple. Do you have 16 cores on a worker machine? Start up 16 instances of your worker (or perhaps more if they are not CPU bound). It is also seamless to add new machines for more workers; doing so just gives the job servers a larger pool of workers to choose from.
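The rule that new jobs only go to idle workers registered for the requested function can be pictured with a small toy model. This is a sketch in plain C under simplifying assumptions, not gearmand's actual implementation; “toy_worker” and “pick_idle_worker” are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of a registered worker. Illustrative only; this does not
   correspond to any gearmand data structure. */
typedef struct
{
  const char *function_name; /* name the worker registered, e.g. "reverse" */
  int busy;                  /* non-zero while the worker is running a job */
} toy_worker;

/* Returns the index of the first idle worker registered for
   `function_name`, or -1 if every matching worker is busy or no
   worker has registered that name. */
int pick_idle_worker(const toy_worker *workers, size_t count,
                     const char *function_name)
{
  size_t i;

  assert(workers != NULL || count == 0);
  assert(function_name != NULL);

  for (i= 0; i < count; i++)
  {
    if (!workers[i].busy &&
        strcmp(workers[i].function_name, function_name) == 0)
      return (int)i;
  }

  return -1;
}
```

In this model, a busy worker is simply skipped, so adding more worker entries to the pool raises the number of jobs that can run at once without any change to the selection logic.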

Now you're probably asking: what if the job servers die? You can run multiple job servers and have the clients and workers connect to more than one of them. This way, if the first job server dies, clients and workers automatically fail over to the second job server. You probably don't want to run too many job servers, but having two or three is a good idea for redundancy. Here is a diagram of how a large Gearman installation may look:

From here, you can scale out your clients and workers as needed. You can draw your own physical (or virtual) machine lines where capacity allows, potentially distributing load to any number of machines. For more detailed plans on specific uses and installations, see the section on use cases.
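The failover behavior described earlier, where clients and workers move on to another job server when one dies, can also be sketched as a toy model. This is plain C for illustration only and makes no claim about how the Gearman libraries choose a server internally; “toy_server” and “pick_job_server” are hypothetical names, and the “alive” flag stands in for whether a real connection attempt would succeed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a configured job server. Illustrative only. */
typedef struct
{
  const char *host; /* address the client or worker was configured with */
  int alive;        /* non-zero if a connection attempt would succeed */
} toy_server;

/* Walks the configured servers in order and returns the index of the
   first one that is alive, or -1 if none can be reached. */
int pick_job_server(const toy_server *servers, size_t count)
{
  size_t i;

  assert(servers != NULL || count == 0);

  for (i= 0; i < count; i++)
  {
    if (servers[i].alive)
      return (int)i;
  }

  return -1;
}
```

With two configured servers, the first is used while it is alive; once it is marked dead, the same call returns the second, which is the sense in which two or three job servers provide redundancy.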

 
gearman.txt · Last modified: 2009/01/04 03:14 by eday
 