跳转至

5.9. 插入排序

5.9. The Insertion Sort

插入排序虽然仍然是 \(O(n^{2})\),但其工作方式略有不同。它始终保持列表中较低位置的已排序子列表。每次新项插入到之前的已排序子列表中,使得已排序子列表增加一个项。图 4 展示了插入排序的过程。阴影部分表示算法每次遍历时的已排序子列表。

Image title
图 4: 插入排序:完整排序

我们从假设一个包含一个项(位置 \(0\))的列表已经排序开始。在每次遍历中(每个项 1 到 \(n-1\)),当前项会与已排序子列表中的项进行比较。当我们回顾已排序子列表时,将大于当前项的项向右移动。当遇到较小的项或子列表的末尾时,可以将当前项插入。

图 5 详细展示了第五次遍历。在算法的这个点上,有一个已排序的五项子列表:17, 26, 54, 77 和 93。我们希望将 31 插入到已排序项中。第一次与 93 的比较会将 93 向右移动。77 和 54 也会被移动。当遇到项 26 时,移动过程停止,31 被放置在空位置。现在我们有了一个包含六项的已排序子列表。

Image title
图 5: 插入排序:第五次排序

insertion_sort 的实现(ActiveCode 1)显示了排序 \(n\) 个项需要再次 \(n-1\) 次遍历。迭代从位置 1 开始,移动到位置 \(n-1\),因为这些项需要插入到已排序的子列表中。第 8 行执行了将值上移一个位置的移动操作,为插入腾出空间。记住,这不是像之前的算法那样的完全交换。

插入排序的最大比较次数是前 \(n-1\) 个整数的和。这个复杂度同样是 \(O(n^{2})\)。然而,在最佳情况下,每次遍历只需要一次比较。这对于已排序的列表来说是成立的。

关于移动操作与交换操作的一个注意点也很重要。一般来说,移动操作的处理工作量大约是交换操作的三分之一,因为只执行了一次赋值。在基准测试中,插入排序通常会显示出非常好的性能。

活动: 5.9.1 插入排序
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
def insertion_sort(a_list):
    for i in range(1, len(a_list)):
        cur_val = a_list[i]
        cur_pos = i

        while cur_pos > 0 and a_list[cur_pos - 1] > cur_val:
            a_list[cur_pos] = a_list[cur_pos - 1]
            cur_pos = cur_pos - 1
        a_list[cur_pos] = cur_val


a_list = [54, 26, 93, 17, 77, 31, 44, 55, 20]
insertion_sort(a_list)
print(a_list)



有关更多细节,CodeLens 4 允许您逐步分析算法。

追踪插入排序
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
def insertion_sort(a_list):
    for i in range(1, len(a_list)):
        cur_val = a_list[i]
        cur_pos = i

        while cur_pos > 0 and a_list[cur_pos - 1] > cur_val:
            a_list[cur_pos] = a_list[cur_pos - 1]
            cur_pos = cur_pos - 1
        a_list[cur_pos] = cur_val


a_list = [54, 26, 93, 17, 77, 31, 44, 55, 20]
insertion_sort(a_list)
print(a_list)

自我检测

假设你有以下数字列表需要排序:
[15, 5, 4, 18, 12, 19, 14, 10, 8, 20] 哪个列表表示插入排序经过三次完整遍历后的部分排序列表?

  • 选项 a: [4, 5, 12, 15, 14, 10, 8, 18, 19, 20]
  • 选项 b: [15, 5, 4, 10, 12, 8, 14, 18, 19, 20]
  • 选项 c: [4, 5, 15, 18, 12, 19, 14, 10, 8, 20]
  • 选项 d: [15, 5, 4, 18, 12, 19, 14, 8, 10, 20]

  • 正确答案: c

  • 反馈 a: 这是冒泡排序的结果。

  • 反馈 b: 这是选择排序的结果。
  • 反馈 c: 插入排序在列表的开始处进行操作。每次遍历产生一个更长的已排序列表。
  • 反馈 d: 插入排序在列表的前端进行操作,而不是在末尾。

The insertion sort, although still \(O(n^{2})\), works in a slightly different way. It always maintains a sorted sublist in the lower positions of the list. Each new item is then inserted back into the previous sublist such that the sorted sublist is one item larger. Figure 4 shows the insertion sorting process. The shaded items represent the ordered sublists as the algorithm makes each pass.

Image title
Figure 4: Insertion Sort: Complete

We begin by assuming that a list with one item (position \(0\)) is already sorted. On each pass, one for each item 1 through \(n-1\), the current item is checked against those in the already sorted sublist. As we look back into the already sorted sublist, we shift those items that are greater to the right. When we reach a smaller item or the end of the sublist, the current item can be inserted.

Figure 5 shows the fifth pass in detail. At this point in the algorithm, here is a sorted sublist of five items: 17, 26, 54, 77, and 93. We want to insert 31 back into the already sorted items. The first comparison against 93 causes 93 to be shifted to the right. 77 and 54 are also shifted. When the item 26 is encountered, the shifting process stops and 31 is placed in the open position. Now we have a sorted sublist of six items.

Image title
Figure 5: Insertion Sort: Fifth Pass of the Sort

The implementation of insertion_sort (ActiveCode 1) shows that there are again \(n-1\) passes to sort \(n\) items. The iteration starts at position 1 and moves through position \(n-1\), as these are the items that need to be inserted back into the sorted sublists. Line 8 performs the shift operation that moves a value up one position in the list, making room behind it for the insertion. Remember that this is not a complete exchange as was performed in the previous algorithms.

The maximum number of comparisons for an insertion sort is the sum of the first \(n-1\) integers. Again, this is \(O(n^{2})\). However, in the best case, only one comparison needs to be done on each pass. This would be the case for an already sorted list.

One note about shifting versus exchanging is also important. In general, a shift operation requires approximately a third of the processing work of an exchange since only one assignment is performed. In benchmark studies, insertion sort will show very good performance.

Activity: 5.9.1 Insertion Sort
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
def insertion_sort(a_list):
    for i in range(1, len(a_list)):
        cur_val = a_list[i]
        cur_pos = i

        while cur_pos > 0 and a_list[cur_pos - 1] > cur_val:
            a_list[cur_pos] = a_list[cur_pos - 1]
            cur_pos = cur_pos - 1
        a_list[cur_pos] = cur_val


a_list = [54, 26, 93, 17, 77, 31, 44, 55, 20]
insertion_sort(a_list)
print(a_list)



For more detail, CodeLens 4 allows you to step through the algorithm.

Tracing the Insertion Sort
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
def insertion_sort(a_list):
    for i in range(1, len(a_list)):
        cur_val = a_list[i]
        cur_pos = i

        while cur_pos > 0 and a_list[cur_pos - 1] > cur_val:
            a_list[cur_pos] = a_list[cur_pos - 1]
            cur_pos = cur_pos - 1
        a_list[cur_pos] = cur_val


a_list = [54, 26, 93, 17, 77, 31, 44, 55, 20]
insertion_sort(a_list)
print(a_list)

Self Check

Suppose you have the following list of numbers to sort:
[15, 5, 4, 18, 12, 19, 14, 10, 8, 20] which list represents the partially sorted list after three complete passes of insertion sort?

  • answer a: [4, 5, 12, 15, 14, 10, 8, 18, 19, 20]
  • answer b: [15, 5, 4, 10, 12, 8, 14, 18, 19, 20]
  • answer c: [4, 5, 15, 18, 12, 19, 14, 10, 8, 20]
  • answer d: [15, 5, 4, 18, 12, 19, 14, 8, 10, 20]

  • correct: c

  • feedback a: This is a bubble sort.

  • feedback b: This is the result of selection sort.
  • feedback c: Insertion sort works at the start of the list. Each pass produces a longer sorted list.
  • feedback d: Insertion sort works on the front of the list not the end.

最后更新: 2024年9月13日
创建日期: 2024年9月9日