Tracking queries, object and Session Changes with Events

SQLAlchemy features an extensive Event Listening system used throughout the Core and ORM. Within the ORM, there are a wide variety of event listener hooks, which are documented at an API level at ORM Events. This collection of events has grown over the years to include many very useful new events as well as some older events that aren’t as relevant as they once were. This section introduces the major event hooks and when they might be used.

Execute Events

Added in version 1.4: The Session now features a single comprehensive hook designed to intercept all SELECT statements made on behalf of the ORM as well as bulk UPDATE and DELETE statements. This hook supersedes the previous QueryEvents.before_compile() event as well as QueryEvents.before_compile_update() and QueryEvents.before_compile_delete().

Session features a comprehensive system by which all queries invoked via the Session.execute() method, which includes all SELECT statements emitted by Query as well as all SELECT statements emitted on behalf of column and relationship loaders, may be intercepted and modified. The system makes use of the SessionEvents.do_orm_execute() event hook as well as the ORMExecuteState object to represent the event state.

Basic Query Interception

SessionEvents.do_orm_execute() is firstly useful for any kind of interception of a query, which includes those emitted by Query with 1.x style as well as when an ORM-enabled 2.0 style select(), update() or delete() construct is delivered to Session.execute(). The ORMExecuteState construct provides accessors to allow modifications to statements, parameters, and options:

from sqlalchemy import event
from sqlalchemy.orm import sessionmaker

# ``engine`` is assumed to be an existing Engine
Session = sessionmaker(engine)


@event.listens_for(Session, "do_orm_execute")
def _do_orm_execute(orm_execute_state):
    if orm_execute_state.is_select:
        # add populate_existing for all SELECT statements
        orm_execute_state.update_execution_options(populate_existing=True)

        # check if the SELECT is against a certain entity and add an
        # ORDER BY if so
        col_descriptions = orm_execute_state.statement.column_descriptions

        if col_descriptions[0]["entity"] is MyEntity:
            orm_execute_state.statement = orm_execute_state.statement.order_by(
                MyEntity.name
            )

The above example illustrates some simple modifications to SELECT statements. At this level, the SessionEvents.do_orm_execute() event hook intends to replace the previous use of the QueryEvents.before_compile() event, which was not fired off consistently for various kinds of loaders; additionally, the QueryEvents.before_compile() only applies to 1.x style use with Query and not with 2.0 style use of Session.execute().

Adding global WHERE / ON criteria

One of the most requested query-extension features is the ability to add WHERE criteria to all occurrences of an entity in all queries. This is achievable by making use of the with_loader_criteria() query option, which may be used on its own, or is ideally suited to be used within the SessionEvents.do_orm_execute() event:

from sqlalchemy import event
from sqlalchemy.orm import sessionmaker, with_loader_criteria

Session = sessionmaker(engine)


@event.listens_for(Session, "do_orm_execute")
def _do_orm_execute(orm_execute_state):
    if (
        orm_execute_state.is_select
        and not orm_execute_state.is_column_load
        and not orm_execute_state.is_relationship_load
    ):
        # with_loader_criteria() takes the target entity first,
        # followed by the WHERE criteria to apply to it
        orm_execute_state.statement = orm_execute_state.statement.options(
            with_loader_criteria(MyEntity, MyEntity.public == True)
        )

Above, an option is added to all SELECT statements that limits all queries against MyEntity to rows where public == True. The criteria are applied to all loads of that class within the scope of the immediate query. By default, the with_loader_criteria() option also propagates to relationship loaders, so it applies to subsequent relationship loads as well, including lazy loads, selectinloads, etc.

For a series of classes that all feature some common column structure, if the classes are composed using a declarative mixin, the mixin class itself may be used in conjunction with the with_loader_criteria() option by making use of a Python lambda. The Python lambda will be invoked at query compilation time against the specific entities which match the criteria. Given a series of classes based on a mixin called HasTimestamp:

import datetime


class HasTimestamp:
    timestamp = mapped_column(DateTime, default=datetime.datetime.now)


class SomeEntity(HasTimestamp, Base):
    __tablename__ = "some_entity"
    id = mapped_column(Integer, primary_key=True)


class SomeOtherEntity(HasTimestamp, Base):
    __tablename__ = "some_other_entity"
    id = mapped_column(Integer, primary_key=True)

The above classes SomeEntity and SomeOtherEntity will each have a column timestamp that defaults to the current date and time. An event may be used to intercept all objects that extend from HasTimestamp and filter their timestamp column on a date that is no older than one month ago:

@event.listens_for(Session, "do_orm_execute")
def _do_orm_execute(orm_execute_state):
    if (
        orm_execute_state.is_select
        and not orm_execute_state.is_column_load
        and not orm_execute_state.is_relationship_load
    ):
        # timedelta has no "months" unit; approximate one month as 30 days
        one_month_ago = datetime.datetime.today() - datetime.timedelta(days=30)

        orm_execute_state.statement = orm_execute_state.statement.options(
            with_loader_criteria(
                HasTimestamp,
                lambda cls: cls.timestamp >= one_month_ago,
                include_aliases=True,
            )
        )

Warning

The lambda passed to with_loader_criteria() is invoked only once per unique class. Custom functions should not be invoked within this lambda. See Using Lambdas to add significant speed gains to statement production for an overview of the “lambda SQL” feature, which is for advanced use only.

See also

ORM Query Events - includes working examples of the above with_loader_criteria() recipes.

Re-Executing Statements

Deep Alchemy

The statement re-execution feature involves a slightly intricate recursive sequence, and is intended to solve the fairly hard problem of being able to re-route the execution of a SQL statement into various non-SQL contexts. The twin examples of “dogpile caching” and “horizontal sharding”, linked below, should be used as a guide for when this rather advanced feature is appropriate.

The ORMExecuteState is capable of controlling the execution of the given statement; this includes the ability to either not invoke the statement at all, allowing a pre-constructed result set retrieved from a cache to be returned instead, as well as the ability to invoke the same statement repeatedly with different state, such as invoking it against multiple database connections and then merging the results together in memory. Both of these advanced patterns are demonstrated in SQLAlchemy’s example suite as detailed below.

When inside the SessionEvents.do_orm_execute() event hook, the ORMExecuteState.invoke_statement() method may be used to invoke the statement using a new nested invocation of Session.execute(), which will then preempt the subsequent handling of the current execution in progress and instead return the Result returned by the inner execution. The event handlers thus far invoked for the SessionEvents.do_orm_execute() hook within this process will be skipped within this nested call as well.

The ORMExecuteState.invoke_statement() method returns a Result object; this object then features the ability for it to be “frozen” into a cacheable format and “unfrozen” into a new Result object, as well as for its data to be merged with that of other Result objects.

E.g., using SessionEvents.do_orm_execute() to implement a cache:

from sqlalchemy.orm import loading

cache = {}


@event.listens_for(Session, "do_orm_execute")
def _do_orm_execute(orm_execute_state):
    if "my_cache_key" in orm_execute_state.execution_options:
        cache_key = orm_execute_state.execution_options["my_cache_key"]

        if cache_key in cache:
            frozen_result = cache[cache_key]
        else:
            frozen_result = orm_execute_state.invoke_statement().freeze()
            cache[cache_key] = frozen_result

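        # merge_frozen_result() hands back a FrozenResult; calling it
        # produces the live Result to return from the hook
        merged_result = loading.merge_frozen_result(
            orm_execute_state.session,
            orm_execute_state.statement,
            frozen_result,
            load=False,
        )
        return merged_result()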

With the above hook in place, an example of using the cache would look like:

stmt = (
    select(User).where(User.name == "sandy").execution_options(my_cache_key="key_sandy")
)

result = session.execute(stmt)

Above, a custom execution option is passed to Select.execution_options() in order to establish a “cache key” that will then be intercepted by the SessionEvents.do_orm_execute() hook. This cache key is then matched to a FrozenResult object that may be present in the cache, and if present, the object is re-used. The recipe makes use of the Result.freeze() method to “freeze” a Result object, which above will contain ORM results, such that it can be stored in a cache and used multiple times. In order to return a live result from the “frozen” result, the merge_frozen_result() function is used to merge the “frozen” data from the result object into the current session.

The above example is implemented as a complete example in Dogpile Caching.

The ORMExecuteState.invoke_statement() method may also be called multiple times, passing along different information to the ORMExecuteState.invoke_statement.bind_arguments parameter such that the Session will make use of different Engine objects each time. This will return a different Result object each time; these results can be merged together using the Result.merge() method. This is the technique employed by the Horizontal Sharding extension; see its source code for details.

See also

Dogpile Caching

Horizontal Sharding

Transaction Events

Transaction events allow an application to be notified when transaction boundaries occur at the Session level as well as when the Session changes the transactional state on Connection objects.

SessionEvents.after_transaction_create(), SessionEvents.after_transaction_end() - these events track the logical transaction scopes of the Session in a way that is not specific to individual database connections. These events are intended to help with integration of transaction-tracking systems such as zope.sqlalchemy. Use these events when the application needs to align some external scope with the transactional scope of the Session. These hooks mirror the “nested” transactional behavior of the Session, in that they track logical “subtransactions” as well as “nested” (e.g. SAVEPOINT) transactions.
SessionEvents.before_commit(), SessionEvents.after_commit(), SessionEvents.after_begin(), SessionEvents.after_rollback(), SessionEvents.after_soft_rollback() - These events allow tracking of transaction events from the perspective of database connections. SessionEvents.after_begin() in particular is a per-connection event; a Session that maintains more than one connection will emit this event for each connection individually as those connections become used within the current transaction. The rollback and commit events then refer to when the DBAPI connections themselves have received rollback or commit instructions directly.
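One way to get a feel for these hooks is to record them as they fire during a single unit of work. The sketch below assumes an in-memory SQLite engine; the `log` list and the `SessionFactory` name are illustrative only.

```python
from sqlalchemy import create_engine, event, text
from sqlalchemy.orm import sessionmaker

engine = create_engine("sqlite://")
SessionFactory = sessionmaker(engine)

log = []  # names of the events, in the order they fire


@event.listens_for(SessionFactory, "after_transaction_create")
def _after_transaction_create(session, transaction):
    log.append("after_transaction_create")


@event.listens_for(SessionFactory, "after_begin")
def _after_begin(session, transaction, connection):
    # Fires once per Connection that joins the transaction
    log.append("after_begin")


@event.listens_for(SessionFactory, "before_commit")
def _before_commit(session):
    log.append("before_commit")


@event.listens_for(SessionFactory, "after_commit")
def _after_commit(session):
    log.append("after_commit")


@event.listens_for(SessionFactory, "after_transaction_end")
def _after_transaction_end(session, transaction):
    log.append("after_transaction_end")


with SessionFactory() as session:
    session.execute(text("SELECT 1"))  # begins the transaction on first use
    session.commit()

print(log)
```

The logical transaction hooks bracket the connection-level ones, which is the relationship an external transaction manager would typically rely on.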

Attribute Change Events

The attribute change events allow interception of when specific attributes on an object are modified. These events include AttributeEvents.set(), AttributeEvents.append(), and AttributeEvents.remove(). These events are extremely useful, particularly for per-object validation operations; however, it is often much more convenient to use a “validator” hook, which uses these hooks behind the scenes; see Simple Validators for background on this. The attribute events are also behind the mechanics of backreferences. An example illustrating use of attribute events is in Attribute Instrumentation.
