排序算法专题

CS61B排序:选择排序,堆排序,归并排序,插入排序,希尔排序,快排

Sort

inversion(逆序对)

  • An inversion is a pair of elements that are out of order with respect to <.

一种看待sort的方法:

  • Given a sequence of elements with Z inversions.
  • Perform a sequence of operations that reduces inversions to 0.

Sort中,我们关注两种复杂度

  • time complexity
  • space complexity (extra memory usage)

Selection Sort

θ(N^2^)

  • Find smallest item.
  • Swap this item to the front and ‘fix’ it.
  • Repeat for unfixed items until all items are fixed.

Heap Sort

基于Selection Sort,但是加速了找smallest item这个过程

必须用max-heap,why?

  • min-heap也可以work,但是我们如果想直接在该数组中建立堆(without extra output array)来节省space complexity,min-heap不能做到这点

naive(朴素版)heap sort

  • Insert all items into a max heap, and discard input array. Create output array.
  • Repeat N times:
    • Delete largest item from the max heap.
    • Put largest item at the end of the unused part of the output array.

runtime

  • time complexity:θ(NlogN)
  • space complexity:θ(N) 因为开辟了额外的输出数组,长度为N
    • 这比选择排序worse,但是我们下面的策略可以改进

In-place Heapsort

no extra array,直接在原数组中建立堆(heapify),之后和朴素版的heapsort就一致了

为什么可以这样?因为按照层序从后往前的过程中,我们sink之后就保证以该位置为root的一个子堆满足heap的性质,因此我们从小到大地建立了完整有效地堆

  • Bottom-up heapify input array.
    • To bottom-up heapify, just sink nodes in reverse level order.
    • After sink(k), guaranteed that tree rooted at position k is a heap.
  • Repeat N times:
    • Delete largest item from the max heap, swapping root with last item in the heap.

in-place heap sort Demo

runtime

  • time complexity:θ(NlogN)
  • space complexity:θ(1)

Merge Sort

chapt 8里面讲过,依次挨个地合并两个有序的数组

看这个Demo就理解了 Merge Sort Demo

  • Split items into 2 roughly even pieces.
  • Mergesort each half (steps not shown, this is a recursive algorithm!)
  • Merge the two sorted halves to form the final result.
    • Compare input[i] < input[j] (if necessary).
    • Copy smaller item and increment p and i or j.

insertion sort

naive

  • Starting with an empty output sequence.
  • Add each item from input, inserting into output at right point.

in-place

  • 在该数组内用swap,而不是再开一个数组
  • Repeat for i = 0 to N - 1:
    • Designate item i as the traveling item.
    • Swap item backwards until traveller is in the right place among all previously examined items.

note:

对于差不多有序的序列来说很快

runtime与inversion(逆序对)的个数成正比

一些经验表示,对于<15个元素的序列,insertion sort是所有排序中最快(粗略的解释:Divide and conquer algorithms like heapsort / mergesort spend too much time dividing, but insertion sort goes straight to the conquest.)

image-20230219213456835

Shell Sort

利用insertion对元素少的序列很快这一特点

  • 将序列中,每间隔d的index分为一组,对该组进行insertion sort
  • 不断d--,直到d=1,整个序列恰恰都在一组,完成排序

Quick Sort

一个好的quick sort:需要有

  • pick good pivot
  • good partition strategy

Quicksorting N items: (Demo)

The Core Idea of Tony’s Sort: Partitioning

To partition an array a[] on element x=a[i] is to rearrange a[] so that: (partition作用的那个i,就叫做pivot)

  • x moves to position j (may be the same as i)
  • All entries to the left of x are <= x.
  • All entries to the right of x are >= x.

实现:

  • 可以用BST,把pivot当做root,比他小的都在left,比他大的都在right
  • simple but not fastest: 3 scan
    • Create another array. Scan and copy all the red items to the first R spaces. Then scan for and copy the white item. Then scan and copy the blue items to the last B spaces.

observations:

  • 当我们partition(k=a[i])之后,k就在它的最终位置上不动了了(左边的都比他小,右边的都比他大)
  • Can sort two halves separately, e.g. through recursive use of partitioning.左右两边完全独立了

步骤:

  • Pick the pivot to be partitioned: e.g.Partition on leftmost item.
  • Quicksort left half.
  • Quicksort right half.

runtime

  • 值得一提的是,quick sort的速度与选取的pivot最终落在的位置有很大关系

Best Case: Pivot Always Lands in the Middle

递归深度logN

image-20230220200918361

Worst Case: Pivot Always Lands at Beginning of Array

递归深度N

image-20230220201459150

一个有趣的观察:

  • Quick Sort的过程其实就是BST建立的过程,我们可以看到每次选取的pivot都是子树的一个根,向下建立BST,so crazy!
  • 回想Random insertion into a BST takes O(N log N) time.
image-20230220202301322

问题来了,N^2^与NlogN的差别可谓巨大,所以我们该怎么避免worst case的出现?

Avoiding the Quicksort Worst Case

小心两种特殊的序列,他们可能会使你的策略陷入N^2^

  • Bad ordering: Array already in sorted order.
  • Bad elements: Array with all duplicates.

we mainly focus on

  • How you select your pivot.
  • How you partition around that pivot.

Philosophy 1: Randomness

随机性是一个好的Quick Sort所必需的,对于一些确定的or伪随机的,总会存在潜在的危险. See McIlroy’s “A Killer Adversary for Quicksort

  • Strategy #1: Pick pivots randomly.
  • Strategy #2: Shuffle(搅乱) before you sort.

Philosophy 2b: Smarter Pivot Selection

Use the median (or an approximation) as our pivot.也就是选取的pivot正好落在中间

利用partition来寻找中值pivot

  • 一直partition,让partition(k)之后k的位置往中间靠,直到k落在length/2的位置
image-20230220210311996

worst case:θ(N^2^) i.e.[1 2 3 4 5 6 7 8 9 10 … N]

average case:θ(N)

Hoare Partitioning(双指针法)

Demo

Create L and G pointers at left and right ends.

  • L pointer is a friend to small items,and hates large or equal items.(注意对于equal的我们不管,不然对于全重复元素的序列我们会陷入worst case)

  • G pointer is a friend to large items,and hates small or equal items.

  • Walk pointers towards each other,stopping on a hated item.

    • When both pointers have stopped,swap and move pointers by one.

    • When pointers cross,you are done walking.

  • Swap pivot with G.

Sorting Properties

stability

  • Equivalent items don’t ‘cross over’ when being stably sorted.

排序算法专题
http://example.com/2022/01/22/排序算法专题/
Author
Jianhui Yin
Posted on
January 22, 2022
Licensed under