跳转至

5.10. 希尔排序

5.10. The Shell Sort

希尔排序(有时称为 递减增量排序)通过将原始列表分解为多个较小的子列表来改进插入排序,每个子列表都使用插入排序进行排序。这些子列表的独特选择方式是希尔排序的关键。希尔排序不是将列表分解为连续项的子列表,而是使用一个增量 \(i\)(有时称为 间隔),通过选择所有相隔 \(i\) 项的项来创建子列表。

这可以在 图 6 中看到。这个列表有九个项。如果我们使用增量三,则会有三个子列表,每个子列表可以通过插入排序进行排序。完成这些排序后,我们得到 图 7 中显示的列表。虽然 图 7 中显示的列表尚未完全排序,但发生了一些非常有趣的事情。通过排序子列表,我们已经将项移动到更接近它们实际位置的地方。

Image title
图 6: 增量为三的希尔排序

Image title
图 7: 排序每个子列表后的希尔排序

图 8 显示了使用增量为一的最终插入排序——换句话说,就是标准的插入排序。注意,通过执行早期的子列表排序,我们现在减少了将列表排序到最终顺序所需的总移动操作次数。在这种情况下,我们仅需进行四次额外的移动来完成这个过程。

Image title
图 8: 希尔排序:增量为 1 的最终插入排序

Image title
图 9: 希尔排序的初始子列表

我们之前提到过,增量的选择方式是希尔排序的独特之处。ActiveCode 1 中显示的函数使用了不同的增量集合。在这种情况下,我们从 \(\frac {n}{2}\) 个子列表开始。在下一次遍历中,排序 \(\frac {n}{4}\) 个子列表。最终,使用基本的插入排序对单个列表进行排序。图 9 显示了使用此增量的示例中的第一个子列表。

以下对 shell_sort 函数的调用显示了每次增量后的部分排序列表,最终排序为增量为一的插入排序。

活动: 5.10.1 希尔排序实现
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
def shell_sort(a_list):
    sublist_count = len(a_list) // 2
    while sublist_count > 0:
        for pos_start in range(sublist_count):
            gap_insertion_sort(a_list, pos_start, sublist_count)
        print("增量为", sublist_count, "后的列表为", a_list)
        sublist_count = sublist_count // 2

def gap_insertion_sort(a_list, start, gap):
    for i in range(start + gap, len(a_list), gap):
        cur_val = a_list[i]
        cur_pos = i
        while cur_pos >= gap and a_list[cur_pos - gap] > cur_val:
            a_list[cur_pos] = a_list[cur_pos - gap]
            cur_pos = cur_pos - gap
        a_list[cur_pos] = cur_val


a_list = [54, 26, 93, 17, 77, 31, 44, 55, 20]
shell_sort(a_list)
print(a_list)



有关更多细节,CodeLens 5 允许您逐步分析算法。

追踪希尔排序
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
def shell_sort(a_list):
    sublist_count = len(a_list) // 2
    while sublist_count > 0:
        for pos_start in range(sublist_count):
            gap_insertion_sort(a_list, pos_start, sublist_count)
        print("增量为", sublist_count, "后的列表为", a_list)
        sublist_count = sublist_count // 2


def gap_insertion_sort(a_list, start, gap):
    for i in range(start + gap, len(a_list), gap):
        cur_val = a_list[i]
        cur_pos = i
        while cur_pos >= gap and a_list[cur_pos - gap] > cur_val:
            a_list[cur_pos] = a_list[cur_pos - gap]
            cur_pos = cur_pos - gap
        a_list[cur_pos] = cur_val


a_list = [54, 26, 93, 17, 77, 31, 44, 55, 20]
shell_sort(a_list)
print(a_list)

乍一看,你可能会认为希尔排序不可能比插入排序更好,因为它在最后一步执行了完整的插入排序。然而,事实证明,这个最终的插入排序实际上不需要进行很多比较(或移动),因为列表已经通过之前的增量插入排序进行了预排序。换句话说,每次遍历都会产生一个比上一次“更排序”的列表。这使得最终的遍历非常高效。

虽然希尔排序的一般分析超出了本书的范围,但我们可以说它的复杂度介于 \(O(n)\)\(O(n^{2})\) 之间,具体取决于上述描述的行为。对于 Listing 5 中显示的增量,性能是 \(O(n^{2})\)。通过更改增量,例如使用 \(2^{k}-1\)(1, 3, 7, 15, 31 等),希尔排序的性能可以达到 \(O(n^{\frac {3}{2}})\)

自我检测

给定以下数字列表:[5, 16, 20, 12, 3, 8, 9, 17, 19, 7] 哪个答案说明了增量为 3 时,所有交换完成后的列表内容?

  • 选项 a: [5, 3, 8, 7, 16, 19, 9, 17, 20, 12]
  • 选项 b: [3, 7, 5, 8, 9, 12, 19, 16, 20, 17]
  • 选项 c: [3, 5, 7, 8, 9, 12, 16, 17, 19, 20]
  • 选项 d: [5, 16, 20, 3, 8, 12, 9, 17, 20, 7]

  • 正确答案: a

  • 反馈 a: 每组由每隔 3 个位置表示的数字已正确排序。

  • 反馈 b: 这个解决方案适用于增量为 2 的情况。
  • 反馈 c: 这是完全排序后的列表,你已经排序得太远了。
  • 反馈 d: 增量为 3 表明每隔 3 个数字的组,如 0, 3, 6, 9 和 1, 4, 7 以及 2, 5, 8 是排序的,而不是每组三个。

The Shell sort, sometimes called the diminishing increment sort, improves on the insertion sort by breaking the original list into a number of smaller sublists, each of which is sorted using an insertion sort. The unique way that these sublists are chosen is the key to the Shell sort. Instead of breaking the list into sublists of contiguous items, the Shell sort uses an increment \(i\), sometimes called the gap, to create a sublist by choosing all items that are \(i\) items apart.

This can be seen in Figure 6. This list has nine items. If we use an increment of three, there are three sublists, each of which can be sorted by an insertion sort. After completing these sorts, we get the list shown in Figure 7. Although the list shown in Figure 7 is not completely sorted, something very interesting has happened. By sorting the sublists, we have moved the items closer to where they actually belong.

Image title
Figure 6: A Shell Sort with Increments of Three

Image title
Figure 7: A Shell Sort after Sorting Each Sublist

Figure 8 shows a final insertion sort using an increment of one—in other words, a standard insertion sort. Note that by performing the earlier sublist sorts, we have now reduced the total number of shifting operations necessary to put the list in its final order. For this case, we need only four more shifts to complete the process.

Image title
Figure 8: ShellSort: A Final Insertion Sort with Increment of 1

Image title
Figure 9: Initial Sublists for a Shell Sort

We said earlier that the way in which the increments are chosen is the unique feature of the Shell sort. The function shown in ActiveCode 1 uses a different set of increments. In this case, we begin with \(\frac {n}{2}\) sublists. On the next pass, \(\frac {n}{4}\) sublists are sorted. Eventually, a single list is sorted with the basic insertion sort. Figure 9 shows the first sublists for our example using this increment.

The following invocation of the shell_sort function shows the partially sorted lists after each increment, with the final sort being an insertion sort with an increment of one.

Activity: 5.10.1 Shell Sort Implementation
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
def shell_sort(a_list):
    sublist_count = len(a_list) // 2
    while sublist_count > 0:
        for pos_start in range(sublist_count):
            gap_insertion_sort(a_list, pos_start, sublist_count)
        print("After increments of size", sublist_count, "the list is", a_list)
        sublist_count = sublist_count // 2

def gap_insertion_sort(a_list, start, gap):
    for i in range(start + gap, len(a_list), gap):
        cur_val = a_list[i]
        cur_pos = i
        while cur_pos >= gap and a_list[cur_pos - gap] > cur_val:
            a_list[cur_pos] = a_list[cur_pos - gap]
            cur_pos = cur_pos - gap
        a_list[cur_pos] = cur_val


a_list = [54, 26, 93, 17, 77, 31, 44, 55, 20]
shell_sort(a_list)
print(a_list)



For more detail, CodeLens 5 allows you to step through the algorithm.

Tracing the Shell Sort
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
def shell_sort(a_list):
    sublist_count = len(a_list) // 2
    while sublist_count > 0:
        for pos_start in range(sublist_count):
            gap_insertion_sort(a_list, pos_start, sublist_count)
        print("After increments of size", sublist_count, "the list is", a_list)
        sublist_count = sublist_count // 2


def gap_insertion_sort(a_list, start, gap):
    for i in range(start + gap, len(a_list), gap):
        cur_val = a_list[i]
        cur_pos = i
        while cur_pos >= gap and a_list[cur_pos - gap] > cur_val:
            a_list[cur_pos] = a_list[cur_pos - gap]
            cur_pos = cur_pos - gap
        a_list[cur_pos] = cur_val


a_list = [54, 26, 93, 17, 77, 31, 44, 55, 20]
shell_sort(a_list)
print(alist)

At first glance you may think that a Shell sort cannot be better than an insertion sort since it does a complete insertion sort as the last step. It turns out, however, that this final insertion sort does not need to do very many comparisons (or shifts) since the list has been presorted by earlier incremental insertion sorts, as described above. In other words, each pass produces a list that is “more sorted” than the previous one. This makes the final pass very efficient.

Although a general analysis of the Shell sort is well beyond the scope of this text, we can say that it tends to fall somewhere between \(O(n)\) and \(O(n^{2})\), based on the behavior described above. For the increments shown in Listing 5, the performance is \(O(n^{2})\). By changing the increment, for example using \(2^{k}-1\) (1, 3, 7, 15, 31, and so on), a Shell sort can perform at \(O(n^{\frac {3}{2}})\).

Self Check

Given the following list of numbers: [5, 16, 20, 12, 3, 8, 9, 17, 19, 7] Which answer illustrates the contents of the list after all swapping is complete for a gap size of 3?

  • answer a: [5, 3, 8, 7, 16, 19, 9, 17, 20, 12]
  • answer b: [3, 7, 5, 8, 9, 12, 19, 16, 20, 17]
  • answer c: [3, 5, 7, 8, 9, 12, 16, 17, 19, 20]
  • answer d: [5, 16, 20, 3, 8, 12, 9, 17, 20, 7]

correct: a

  • feedback a: Each group of numbers represented by index positions 3 apart are sorted correctly.
  • feedback b: This solution is for a gap size of two.
  • feedback c: This is list completely sorted, you have gone too far.
  • feedback d: The gap size of three indicates that the group represented by every third number e.g. 0, 3, 6, 9 and 1, 4, 7 and 2, 5, 8 are sorted not groups of 3.

最后更新: 2024年9月13日
创建日期: 2024年9月9日