Blog#116: Heap Sort: A Beginner's Guide to Sorting Data Like a Pro

image.png

Hi, I'm Tuan, a Full-stack Web Developer from Tokyo 😊. Follow my blog to not miss out on useful and interesting articles in the future.

Have you ever wanted to sort a large amount of data quickly and efficiently? Look no further than heap sort! Heap sort is a type of sorting algorithm that can sort data in an array in as little as O(n log n) time. But what does that mean, and how can you use it in the real world? In this article, we'll take a beginner-friendly approach to explaining heap sort and show you five examples of how to use it in JavaScript.

What is Heap Sort?

Imagine you have a big pile of legos. You want to sort them by color, but you don't want to take them all out of the pile and sort them one by one. That would take a long time! Instead, you could use a method called "heap sort" to sort the legos more quickly.

In computer science, a heap is a special kind of data structure that can be used to sort data. It's like a big tree where each "node" (or lego) has a value, and the value of each node is either greater than or equal to (in a "max heap") or less than or equal to (in a "min heap") the values of its children. By organizing the data in this way, it becomes much faster and easier to find the smallest or largest value in the heap.

How Does Heap Sort Work?

Heap sort is a sorting algorithm that uses a data structure called heap to sort data. It follows two steps:

  1. Build a heap out of the data.
  2. Repeatedly extract the maximum element from the heap and move it to the end of the array, which effectively sorts the data in increasing order.

The heap is a special type of data structure that satisfies the heap property, meaning that the value of each node is either greater than or equal to (in a "max heap") or less than or equal to (in a "min heap") the values of its children. By organizing the data in this way, it becomes much faster and easier to find the smallest or largest value in the heap.

In the example I provided, the heap is built using the buildHeap function, which starts by initializing the heap with the middle element of the array, and then repeatedly "heapifying" the array by comparing the value of each node with its children and swapping them if necessary.

Once the heap is built, the heapSort function is used to sort the array by repeatedly extracting the maximum element from the heap and moving it to the end of the array, and then re-heapifying the remaining elements. This process is repeated until the entire array is sorted.

To make it more clear, let's consider the following example:

let data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5];

In this example, data is an array of integers that we want to sort in ascending order.

First, we will use the buildHeap function to build a max heap out of the data:

function buildHeap(arr) {
    let n = arr.length;
    // Starting from the middle element of the array
    for (let i = Math.floor(n / 2) - 1; i >= 0; i--) {
        heapify(arr, n, i);
    }
}

This function starts by initializing the heap with the middle element of the array, and then repeatedly "heapifying" the array by comparing the value of each node with its children and swapping them if necessary.

function heapify(arr, n, i) {
    let largest = i;
    let left = 2 * i + 1;
    let right = 2 * i + 2;

    // If the left child is larger than the root
    if (left < n && arr[left] > arr[largest]) {
        largest = left;
    }

    // If the right child is larger than the root
    if (right < n && arr[right] > arr[largest]) {
        largest = right;
    }

    // If the largest is not the root
    if (largest !== i) {
        [arr[i], arr[largest]] = [arr[largest], arr[i]];
        heapify(arr, n, largest);
    }
}
buildHeap(data);
console.log(data); // [9, 6, 5, 5, 5, 3, 2, 4, 1, 1, 3]

As we can see, the first element of the array is 9, which is the largest number in the array.

Then, we use the heapSort function to sort the array by repeatedly extracting the maximum element from the heap and moving it to the end of the array, and then re-heapifying the remaining elements.

function heapSort(arr) {
    buildHeap(arr);
    let n = arr.length;

    for (let i = n - 1; i >= 0; i--) {
        // swap first and last element 
        [arr[0], arr[i]] = [arr[i], arr[0]];
        heapify(arr, i, 0);
    }
    return arr;
}
console.log(heapSort(data)); // [1, 1, 2, 3, 3, 4, 5, 5, 5, 6, 9]

As we can see the array is sorted in ascending order.

Final code:

let data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5];

function buildHeap(arr) {
  let n = arr.length;
  // Starting from the middle element of the array
  for (let i = Math.floor(n / 2) - 1; i >= 0; i--) {
    heapify(arr, n, i);
  }
}

function heapify(arr, n, i) {
  let largest = i;
  let left = 2 * i + 1;
  let right = 2 * i + 2;

  // If the left child is larger than the root
  if (left < n && arr[left] > arr[largest]) {
    largest = left;
  }

  // If the right child is larger than the root
  if (right < n && arr[right] > arr[largest]) {
    largest = right;
  }

  // If the largest is not the root
  if (largest !== i) {
    [arr[i], arr[largest]] = [arr[largest], arr[i]];
    heapify(arr, n, largest);
  }
}

buildHeap(data);
console.log(data); // [9, 6, 5, 5, 5, 3, 2, 4, 1, 1, 3]

function heapSort(arr) {
  buildHeap(arr);
  let n = arr.length;

  for (let i = n - 1; i >= 0; i--) {
    // swap first and last element
    [arr[0], arr[i]] = [arr[i], arr[0]];
    heapify(arr, i, 0);
  }
  return arr;
}

console.log(heapSort(data)); // [1, 1, 2, 3, 3, 4, 5, 5, 5, 6, 9]

Use Cases

Now that you understand how heap sort works, let's take a look at five examples of how you can use it in the real world.

Sorting a large dataset

One of the most common use cases for heap sort is sorting a large dataset. Since heap sort has a time complexity of O(n log n), it can handle large amounts of data quickly and efficiently.

let data = generateLargeDataset();
console.log(heapSort(data));

Priority queues

Heaps are often used to implement priority queues, which are data structures where each element has a "priority" and elements are retrieved in order of priority. In the case of max heap, the element with the highest priority will be at the root of the heap, and can be easily extracted.

class PriorityQueue {
    constructor() {
        this.heap = [];
    }
    insert(value, priority) {
        this.heap.push({value, priority});
        this.buildHeap();
    }
    extractMax() {
        let max = this.heap[0];
        this.heap[0] = this.heap.pop();
        this.heapify(0);
        return max;
    }
}

let queue = new PriorityQueue();
queue.insert("high priority task", 10);
queue.insert("low priority task", 1);
console.log(queue.extractMax()); // {value: "high priority task", priority: 10}

Top K elements

Heap sort can also be used to quickly find the top K elements in a dataset.

let data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5];
let k = 3;
let topK = heapSort(data).slice(data.length-k,data.length);
console.log(topK); // [6, 5, 5]

Sorting a stream of data

Heap sort can also be used to sort a stream of data as it comes in, rather than waiting for all the data to be collected before sorting.

let data = new Stream();
let heap = new Heap();
data.on("data", (val) => {
    heap.insert(val);
});
data.on("end", () => {
    console.log(heap.sort());
});

Sorting a very large file

Heap sort can also be used to sort a very large file that doesn't fit in memory. By reading chunks of the file into memory, sorting them with heap sort, and then writing the chunks back to the file, you can sort the entire file without ever loading it all into memory at once.

let file = fs.createReadStream("largeFile.txt");
let chunk = "";
file.on("data", (data) => {
  chunk += data;
  if (chunk.length > 1e6) {
    // if chunk is larger than 1MB
    let sortedChunk = heapSort(chunk.split("\n"));
    fs.appendFileSync("sortedLargeFile.txt", sortedChunk.join("\n"));
    chunk = "";
  }
});
file.on("end", () => {
  let sortedChunk = heapSort(chunk.split("\n"));
  fs.appendFileSync("sortedLargeFile.txt", sortedChunk.join("\n"));
  console.log("File sorted!");
});

Conclusion

Heap sort is a powerful sorting algorithm that can handle large amounts of data quickly and efficiently. By understanding how it works and seeing some examples of how to use it, you can start incorporating heap sort into your own projects. From sorting large datasets, to finding the top K elements, heap sort has a variety of use cases, and can be easily implemented in JavaScript.

And Finally

As always, I hope you enjoyed this article and learned something new. Thank you and see you in the next articles!

If you liked this article, please give me a like and subscribe to support me. Thank you. 😊


The main goal of this article is to help you improve your English level. I will use Simple English to introduce to you the concepts related to software development. In terms of IT knowledge, it might have been explained better and more clearly on the internet, but remember that the main target of this article is still to LEARN ENGLISH.

NGUYỄN ANH TUẤN

Xin chào, mình là Tuấn, một kỹ sư phần mềm đang làm việc tại Tokyo. Đây là blog cá nhân nơi mình chia sẻ kiến thức và kinh nghiệm trong quá trình phát triển bản thân. Hy vọng blog sẽ là nguồn cảm hứng và động lực cho các bạn. Hãy cùng mình học hỏi và trưởng thành mỗi ngày nhé!

Đăng nhận xét

Mới hơn Cũ hơn