Next Steps
The 使用 Celery 的第一步 guide is intentionally minimal. In this guide I'll demonstrate what Celery offers in more detail, including how to add Celery support for your application and library.
This document doesn't cover all of Celery's features and best practices, so it's recommended that you also read the User Guide.
Using Celery in your Application
Our Project
Project layout
src/
    proj/__init__.py
        /celery.py
        /tasks.py
proj/celery.py
from celery import Celery

app = Celery('proj',
             broker='amqp://',
             backend='rpc://',
             include=['proj.tasks'])

# Optional configuration, see the application user guide.
app.conf.update(
    result_expires=3600,
)

if __name__ == '__main__':
    app.start()
In this module you created our Celery instance (sometimes referred to as the app). To use Celery within your project you simply import this instance.
The broker argument specifies the URL of the broker to use. See Choosing a Broker for more information.
The backend argument specifies the result backend to use. It's used to keep track of task state and results. While results are disabled by default I use the RPC result backend here because I demonstrate how retrieving results works later. You may want to use a different backend for your application. They all have different strengths and weaknesses. If you don't need results, it's better to disable them. Results can also be disabled for individual tasks by setting the @task(ignore_result=True) option. See Keeping Results for more information.
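For example, a task whose return value you don't care about could be declared like this in proj/tasks.py (a small sketch; the log_visit task is only illustrative):

@app.task(ignore_result=True)
def log_visit(url):
    # the return value is discarded, so nothing is written to the result backend
    print('visited', url)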
The include argument is a list of modules to import when the worker starts. You need to add our tasks module here so that the worker is able to find our tasks.
proj/tasks.py
from .celery import app


@app.task
def add(x, y):
    return x + y


@app.task
def mul(x, y):
    return x * y


@app.task
def xsum(numbers):
    return sum(numbers)
Starting the worker
The celery program can be used to start the worker (you need to run the worker in the directory above proj; according to the example project layout, that directory is src):
$ celery -A proj worker -l INFO
When the worker starts you should see a banner and some messages:
--------------- celery@halcyon.local v4.0 (latentcall)
--- ***** -----
-- ******* ---- [Configuration]
- *** --- * --- . broker: amqp://guest@localhost:5672//
- ** ---------- . app: __main__:0x1012d8590
- ** ---------- . concurrency: 8 (processes)
- ** ---------- . events: OFF (enable -E to monitor this worker)
- ** ----------
- *** --- * --- [Queues]
-- ******* ---- . celery: exchange:celery(direct) binding:celery
--- ***** -----
[2012-06-08 16:23:51,078: WARNING/MainProcess] celery@halcyon.local has started.
-- The broker is the URL you specified in the broker argument in our celery module. You can also specify a different broker on the command-line by using the -b option.
-- Concurrency is the number of prefork worker processes used to process your tasks concurrently. When all of these are busy doing work, new tasks will have to wait for one of the tasks to finish before it can be processed.
The default concurrency number is the number of CPUs on that machine (including cores). You can specify a custom number using the celery worker -c option. There's no recommended value, as the optimal number depends on a number of factors, but if your tasks are mostly I/O-bound then you can try to increase it. Experimentation has shown that adding more than twice the number of CPUs is rarely effective, and likely to degrade performance instead.
Including the default prefork pool, Celery also supports using Eventlet, Gevent, and running in a single thread (see Concurrency).
-- Events is an option that causes Celery to send monitoring messages (events) for actions occurring in the worker. These can be used by monitor programs like celery events, and Flower -- the real-time Celery monitor, which you can read about in the Monitoring and Management guide.
-- Queues is the list of queues that the worker will consume tasks from. The worker can be told to consume from several queues at once, and this is used to route messages to specific workers as a means for Quality of Service, separation of concerns, and prioritization, all described in the Routing Guide.
You can get a complete list of command-line arguments by passing in the --help flag:
$ celery worker --help
These options are described in more detail in the Workers Guide.
Stopping the worker
To stop the worker simply hit Control-c. A list of signals supported by the worker is detailed in the Workers Guide.
In the background
In production you'll want to run the worker in the background, described in detail in the daemonization tutorial.
The daemonization scripts use the celery multi command to start one or more workers in the background:
$ celery multi start w1 -A proj -l INFO
celery multi v4.0.0 (latentcall)
> Starting nodes...
> w1.halcyon.local: OK
You can restart it too:
$ celery multi restart w1 -A proj -l INFO
celery multi v4.0.0 (latentcall)
> Stopping nodes...
> w1.halcyon.local: TERM -> 64024
> Waiting for 1 node.....
> w1.halcyon.local: OK
> Restarting node w1.halcyon.local: OK
celery multi v4.0.0 (latentcall)
> Stopping nodes...
> w1.halcyon.local: TERM -> 64052
or stop it:
$ celery multi stop w1 -A proj -l INFO
The stop command is asynchronous so it won't wait for the worker to shut down. You'll probably want to use the stopwait command instead, which ensures that all currently executing tasks are completed before exiting:
$ celery multi stopwait w1 -A proj -l INFO
Note
celery multi doesn't store information about workers so you need to use the same command-line arguments when restarting. Only the same pidfile and logfile arguments must be used when stopping.
By default it'll create pid and log files in the current directory. To protect against multiple workers launching on top of each other you're encouraged to put these in a dedicated directory:
$ mkdir -p /var/run/celery
$ mkdir -p /var/log/celery
$ celery multi start w1 -A proj -l INFO --pidfile=/var/run/celery/%n.pid \
--logfile=/var/log/celery/%n%I.log
With the multi command you can start multiple workers, and there's a powerful command-line syntax to specify arguments for different workers too, for example:
$ celery multi start 10 -A proj -l INFO -Q:1-3 images,video -Q:4,5 data \
-Q default -L:4,5 debug
For more examples see the multi module in the API reference.
About the --app argument
The --app argument specifies the Celery app instance to use, in the form of module.path:attribute.
But it also supports a shortcut form. If only a package name is specified, it'll try to search for the app instance, in the following order:
With --app=proj:

1. an attribute named proj.app, or
2. an attribute named proj.celery, or
3. any attribute in the module proj where the value is a Celery application, or

If none of these are found it'll try a submodule named proj.celery:

4. an attribute named proj.celery.app, or
5. an attribute named proj.celery.celery, or
6. any attribute in the module proj.celery where the value is a Celery application.
This scheme mimics the practices used in the documentation -- that is, proj:app for a single contained module, and proj.celery:app for larger projects.
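One optional way to take advantage of the first lookup rule -- not something the example project requires -- is to re-export the app from the package's __init__.py, so that --app=proj resolves immediately via the proj.app attribute (a sketch):

# proj/__init__.py (optional)
from .celery import app  # exposes the Celery instance as the attribute proj.app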
Calling Tasks
You can call a task using the delay() method:
>>> from proj.tasks import add
>>> add.delay(2, 2)
This method is actually a star-argument shortcut to another method called apply_async():
>>> add.apply_async((2, 2))
The latter enables you to specify execution options like the time to run (countdown), the queue it should be sent to, and so on:
>>> add.apply_async((2, 2), queue='lopri', countdown=10)
In the above example the task will be sent to a queue named lopri and the task will execute, at the earliest, 10 seconds after the message was sent.
Applying the task directly will execute the task in the current process, so that no message is sent:
>>> add(2, 2)
4
These three methods -- delay(), apply_async(), and applying (__call__) -- make up the Celery calling API, which is also used for signatures.
A more detailed overview of the Calling API can be found in the Calling User Guide.
Every task invocation will be given a unique identifier (a UUID) -- this is the task id.
The delay and apply_async methods return an AsyncResult instance, which can be used to keep track of the task's execution state. But for this you need to enable a result backend so that the state can be stored somewhere.
Results are disabled by default because there is no result backend that suits every application; to choose one you need to consider the drawbacks of each individual backend. For many tasks keeping the return value isn't even very useful, so it's a sensible default to have. Also note that result backends aren't used for monitoring tasks and workers: for that Celery uses dedicated event messages (see the Monitoring and Management Guide).
If you have a result backend configured you can retrieve the return value of a task:
>>> res = add.delay(2, 2)
>>> res.get(timeout=1)
4
You can find the task's id by looking at the id attribute:
>>> res.id
d6b3aea2-fb9b-4ebc-8da4-848818db9114
You can also inspect the exception and traceback if the task raised an exception; in fact result.get() will propagate any errors by default:
>>> res = add.delay(2, '2')
>>> res.get(timeout=1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "celery/result.py", line 221, in get
return self.backend.wait_for_pending(
File "celery/backends/asynchronous.py", line 195, in wait_for_pending
return result.maybe_throw(callback=callback, propagate=propagate)
File "celery/result.py", line 333, in maybe_throw
self.throw(value, self._to_remote_traceback(tb))
File "celery/result.py", line 326, in throw
self.on_ready.throw(*args, **kwargs)
File "vine/promises.py", line 244, in throw
reraise(type(exc), exc, tb)
File "vine/five.py", line 195, in reraise
raise value
TypeError: unsupported operand type(s) for +: 'int' and 'str'
If you don't wish for the errors to propagate, you can disable that by passing propagate:
>>> res.get(propagate=False)
TypeError("unsupported operand type(s) for +: 'int' and 'str'")
In this case it'll return the exception instance raised instead -- so to check whether the task succeeded or failed, you'll have to use the corresponding methods on the result instance:
>>> res.failed()
True
>>> res.successful()
False
So how does it know if the task has failed or not? It can find out by looking at the task's state:
>>> res.state
'FAILURE'
A task can only be in a single state, but it can progress through several states. The stages of a typical task can be:
PENDING -> STARTED -> SUCCESS
The started state is a special state that's only recorded if the task_track_started setting is enabled, or if the @task(track_started=True) option is set for the task.
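For example, to have the STARTED state recorded you could enable it globally through the setting, or for a single task through the decorator option (a small sketch; slow_add is only illustrative):

app.conf.task_track_started = True   # record STARTED for every task

@app.task(track_started=True)        # or enable it only for this task
def slow_add(x, y):
    return x + y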
The pending state is actually not a recorded state, but rather the default state for any task id that's unknown, as you can see in this example:
>>> from proj.celery import app
>>> res = app.AsyncResult('this-id-does-not-exist')
>>> res.state
'PENDING'
If the task is retried the stages can become even more complex. To demonstrate, for a task that's retried two times the stages would be:
PENDING -> STARTED -> RETRY -> STARTED -> RETRY -> STARTED -> SUCCESS
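Retries are usually triggered from inside the task body with self.retry(); a minimal sketch added to proj/tasks.py (the fetch task and its error handling are only illustrative):

import urllib.request

@app.task(bind=True, max_retries=2)
def fetch(self, url):
    try:
        return urllib.request.urlopen(url).read()
    except OSError as exc:
        # re-queues the message; each attempt shows up as a RETRY -> STARTED cycle
        raise self.retry(exc=exc, countdown=5)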
To read more about task states you should see the States section in the tasks user guide.
Calling tasks is described in detail in the Calling Guide.
Canvas: Designing Work-flows
You just learned how to call a task using the task's delay method, and this is often all you need. But sometimes you may want to pass the signature of a task invocation to another process or as an argument to another function, for which Celery uses something called signatures.
A signature wraps the arguments and execution options of a single task invocation in such a way that it can be passed to functions or even serialized and sent across the wire.
You can create a signature for the add task using the arguments (2, 2), and a countdown of 10 seconds like this:
>>> add.signature((2, 2), countdown=10)
tasks.add(2, 2)
There's also a shortcut using star arguments:
>>> add.s(2, 2)
tasks.add(2, 2)
And there's that calling API again…
Signature instances also support the calling API, meaning they have delay and apply_async methods.
But there's a difference in that the signature may already have an argument signature specified. The add task takes two arguments, so a signature specifying two arguments would make a complete signature:
>>> s1 = add.s(2, 2)
>>> res = s1.delay()
>>> res.get()
4
But, you can also make incomplete signatures to create what we call partials:
# incomplete partial: add(?, 2)
>>> s2 = add.s(2)
s2 is now a partial signature that needs another argument to be complete, and this can be resolved when calling the signature:
# resolves the partial: add(8, 2)
>>> res = s2.delay(8)
>>> res.get()
10
Here you added the argument 8 that was prepended to the existing argument 2, forming a complete signature of add(8, 2).
Keyword arguments can also be added later; these are then merged with any existing keyword arguments, but with new arguments taking precedence:
>>> s3 = add.s(2, 2, debug=True)
>>> s3.delay(debug=False) # debug is now False.
As stated, signatures support the calling API, meaning that:
sig.apply_async(args=(), kwargs={}, **options)
    Calls the signature with optional partial arguments and partial keyword arguments. Also supports partial execution options.
sig.delay(*args, **kwargs)
    Star argument version of apply_async. Any arguments will be prepended to the arguments in the signature, and keyword arguments are merged with any existing keys.
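Putting the two together, the partial s2 from above can be completed either way (a short sketch):

>>> s2 = add.s(2)                            # partial: add(?, 2)
>>> s2.delay(8)                              # runs add(8, 2)
>>> s2.apply_async(args=(8,), countdown=5)   # same call, executed at the earliest 5 seconds later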
So this all seems very useful, but what can you actually do with these? To get to that I must introduce the canvas primitives…
The Primitives
These primitives are signature objects themselves, so they can be combined in any number of ways to compose complex work-flows.
Note
These examples retrieve results, so to try them out you need to configure a result backend. The example project above already does that (see the backend argument to Celery).
Let's look at some examples:
Groups
A group calls a list of tasks in parallel, and it returns a special result instance that lets you inspect the results as a group, and retrieve the return values in order.
>>> from celery import group
>>> from proj.tasks import add
>>> group(add.s(i, i) for i in range(10))().get()
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
Partial group
>>> g = group(add.s(i) for i in range(10))
>>> g(10).get()
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
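The result instance returned by a group is a GroupResult, so besides get() you can inspect the group as a whole; a short sketch using the same add task:

>>> res = group(add.s(i, i) for i in range(10))()
>>> res.ready()             # True once every task in the group has finished
>>> res.completed_count()   # number of subtasks that completed successfully
>>> res.get()
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]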
Chains
Tasks can be linked together so that after one task returns the other is called:
>>> from celery import chain
>>> from proj.tasks import add, mul
# (4 + 4) * 8
>>> chain(add.s(4, 4) | mul.s(8))().get()
64
or a partial chain:
>>> # (? + 4) * 8
>>> g = chain(add.s(4) | mul.s(8))
>>> g(4).get()
64
Chains can also be written like this:
>>> (add.s(4, 4) | mul.s(8))().get()
64
Chords
A chord is a group with a callback:
>>> from celery import chord
>>> from proj.tasks import add, xsum
>>> chord((add.s(i, i) for i in range(10)), xsum.s())().get()
90
A group chained to another task will be automatically converted to a chord:
>>> (group(add.s(i, i) for i in range(10)) | xsum.s())().get()
90
Since these primitives are all of the signature type they can be combined almost however you want, for example:
>>> upload_document.s(file) | group(apply_filter.s() for filter in filters)
Be sure to read more about work-flows in the Canvas user guide.
Routing
Celery supports all of the routing facilities provided by AMQP, but it also supports simple routing where messages are sent to named queues.
The task_routes setting enables you to route tasks by name and keep everything centralized in one location:
app.conf.update(
    task_routes = {
        'proj.tasks.add': {'queue': 'hipri'},
    },
)
You can also specify the queue at runtime with the queue argument to apply_async:
>>> from proj.tasks import add
>>> add.apply_async((2, 2), queue='hipri')
You can then make a worker consume from this queue by specifying the celery worker -Q option:
$ celery -A proj worker -Q hipri
You may specify multiple queues by using a comma-separated list. For example, you can make the worker consume from both the default queue and the hipri queue, where the default queue is named celery for historical reasons:
$ celery -A proj worker -Q hipri,celery
The order of the queues doesn't matter as the worker will give equal weight to the queues.
To learn more about routing, including taking advantage of the full power of AMQP routing, see the Routing Guide.
Remote Control
If you're using RabbitMQ (AMQP), Redis, or Qpid as the broker then you can control and inspect the worker at runtime.
For example you can see what tasks the worker is currently working on:
$ celery -A proj inspect active
This is implemented by using broadcast messaging, so all remote control commands are received by every worker in the cluster.
You can also specify one or more workers to act on the request using the --destination option. This is a comma-separated list of worker host names:
$ celery -A proj inspect active --destination=celery@example.com
If a destination isn't provided then every worker will act and reply to the request.
The celery inspect command contains commands that don't change anything in the worker; it only returns information and statistics about what's going on inside the worker. For a list of inspect commands you can execute:
$ celery -A proj inspect --help
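The same information is also available from Python through the app's control interface; a minimal sketch using the example project's app:

>>> from proj.celery import app
>>> i = app.control.inspect()   # optionally pass a list of worker names to target
>>> i.active()                  # tasks currently being executed
>>> i.scheduled()               # eta/countdown tasks the workers have reserved
>>> i.registered()              # task names each worker knows about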
Then there's the celery control command, which contains commands that actually change things in the worker at runtime:
$ celery -A proj control --help
For example you can force workers to enable event messages (used for monitoring tasks and workers):
$ celery -A proj control enable_events
When events are enabled you can then start the event dumper to see what the workers are doing:
$ celery -A proj events --dump
or you can start the curses interface:
$ celery -A proj events
When you're finished monitoring you can disable events again:
$ celery -A proj control disable_events
The celery status command also uses remote control commands and shows a list of online workers in the cluster:
$ celery -A proj status
You can read more about the celery command and monitoring in the Monitoring Guide.
Timezone
All times and dates, internally and in messages, use the UTC timezone. When the worker receives a message, for example with a countdown set, it converts that UTC time to local time. If you wish to use a different timezone than the system timezone then you must configure that using the timezone setting:
app.conf.timezone = 'Europe/London'
Optimization
The default configuration isn't optimized for throughput. By default, it tries to walk the middle way between many short tasks and fewer long tasks, a compromise between throughput and fair scheduling.
If you have strict fair scheduling requirements, or want to optimize for throughput then you should read the Optimizing Guide.
What to do now?
Now that you have read this document you should continue to the User Guide.
There's also an API reference if you're so inclined.