Merge K Sorted Lists

Problem

Merge k sorted linked lists and return the merged result as one sorted list. The definition of the ListNode class and the signature of the MergeKLists method can’t be modified.

Below is my submission to LeetCode. It was accepted, and the run time is not bad (better than 55% of submissions), but I am curious why mine is not in the top 10%. There must be something that can be improved in terms of algorithm performance. Can you please suggest something in this regard?

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace LeetCode23_MergeKSortedList
{

    /// <summary>
    /// This class definition is given as input. And can't be modified.
    /// </summary>
    public class ListNode
    {
        public int val;
        public ListNode next;
        public ListNode(int x) { val = x; }
    }

    /// <summary>
    /// Just so that we can add an Id field
    /// </summary>
    public class ListNodeWithId : ListNode
    {
        public Guid id { get; set; } = Guid.NewGuid();
        public ListNodeWithId (ListNode n): base (n.val)
        {
            next = n.next;
        }
    }
    /// <summary>
    /// To be able to handle duplicate valued nodes
    /// </summary>
    public class DuplicateKeyComparer : IComparer<ListNodeWithId>
    { 
        public int Compare(ListNodeWithId x, ListNodeWithId y)
        {            
            int result = x.val.CompareTo(y.val);

            if (result == 0)
                return x.id.CompareTo(y.id);   // Handle equal valued node as distinct still
            else
                return result;
        }

    }
    public class Solution
    {
        public ListNode MergeKLists(ListNode[] lists)
        {         
            var listSet = new SortedSet<ListNodeWithId>(new DuplicateKeyComparer());
            foreach (var list in lists.Where(x => x != null))
                listSet.Add(new ListNodeWithId(list));

            ListNode mergedListHead = null;
            ListNode mergedListCurrent = null;

            while (listSet.Count() > 0)
            {
                var min = listSet.First();

                int minVal = min.val;
                listSet.Remove(min);

                var nextNode = min.next;
                if ( nextNode != null)
                    listSet.Add(new ListNodeWithId(nextNode));

                var mergedListNode = new ListNode(minVal);

                if (mergedListCurrent != null)
                {
                    mergedListCurrent.next = mergedListNode;
                    mergedListCurrent = mergedListCurrent.next;
                }
                else
                {
                    mergedListHead = mergedListNode;
                    mergedListCurrent = mergedListNode;
                }
            }

            return mergedListHead;
        }
    }
}

Solution

    return x.id.CompareTo(y.id);   // Handle equal valued node as distinct still

Your tie-breaker is a high-entropy GUID. The contest inputs may have long runs of monotonic non-decreasing or non-increasing values, for which a boring sequential serial number id might do better at preserving runs than a GUID.
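A minimal sketch of that idea — a hypothetical ListNodeWithSeq that stamps each node with an incrementing serial number instead of a GUID (the names here are mine, not from your submission; the counter is not thread-safe, which is fine for a single-threaded merge):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant of ListNodeWithId: a cheap, monotonically
// increasing serial number replaces the GUID as the tie-breaker.
public class ListNodeWithSeq
{
    private static long counter;          // shared sequence source
    public int val;
    public ListNodeWithSeq next;
    public long Seq { get; } = counter++; // stamped in creation order

    public ListNodeWithSeq(int x) { val = x; }
}

public class SeqComparer : IComparer<ListNodeWithSeq>
{
    public int Compare(ListNodeWithSeq x, ListNodeWithSeq y)
    {
        int result = x.val.CompareTo(y.val);
        // Equal values stay distinct, ordered by creation sequence,
        // so earlier-created nodes sort first.
        return result != 0 ? result : x.Seq.CompareTo(y.Seq);
    }
}
```

Beyond run preservation, a long comparison is also cheaper than a Guid comparison on the tie path.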

Rather than listSet, whose Set suffix is redundant with the declaration, I would prefer the simple name lists.

I can’t say I’m crazy about trivial definitions like this:

            int minVal = min.val;

At this point:

            var mergedListNode = new ListNode(min.val);

you have a perfectly nice min list node, right? But you discard it, and create a duplicate copy of it? Why? Seems like you’re just keeping the GC busy.
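For illustration, here is a sketch of splicing the existing node in by pointer rewiring instead of copying it (the names are hypothetical; since ListNodeWithId derives from ListNode, the removed node itself can serve as an output node):

```csharp
using System;

public class ListNode
{
    public int val;
    public ListNode next;
    public ListNode(int x) { val = x; }
}

public static class SpliceDemo
{
    // Sketch: append an existing node to the output list by rewiring
    // pointers -- zero allocations -- instead of constructing a copy.
    // The caller must save (or re-queue) the node's old successor
    // before it is detached, just as your code already does.
    public static ListNode AppendInPlace(ListNode tail, ListNode min)
    {
        tail.next = min;   // splice the node itself, no new ListNode
        min.next = null;   // detach from its source list
        return min;        // the new tail of the merged list
    }
}
```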

I wouldn’t mind seeing this happen before the while begins:

            else
            {
                mergedListHead = mergedListNode;
                mergedListCurrent = mergedListNode;
            }
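That is, a sentinel ("dummy head") created before the loop removes the if/else entirely: always append to tail.next, and return dummy.next at the end. A small sketch of the pattern in isolation (the names are mine):

```csharp
using System;
using System.Collections.Generic;

public class ListNode
{
    public int val;
    public ListNode next;
    public ListNode(int x) { val = x; }
}

public static class DummyHeadDemo
{
    // Sketch of the sentinel pattern: one throwaway node before the
    // loop means the loop body never has to ask "is this the head?".
    public static ListNode BuildFrom(IEnumerable<int> sortedValues)
    {
        var dummy = new ListNode(0);   // value never used
        var tail = dummy;
        foreach (var v in sortedValues)
        {
            tail.next = new ListNode(v);
            tail = tail.next;
        }
        return dummy.next;             // real head, or null if input empty
    }
}
```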

You asked about speed. I think your algorithm is a perfectly good one – relying on SortedSet makes sense. But you do lots of node copying and reallocation, and you do lots of SortedSet operations. It’s all in line with N log N, which is good; I’m just saying it’s more operations than necessary. Recall that Timsort is fast because it preserves runs. (BTW, it’s not clear to me that stability is buying you anything – that keeping an id on the side is helping you.)

Let’s try to be lazy. Consider a 2-way merge where the first list has values 1 up to 1e6 (or they could even all be 1e6), and the second list has values from 1e6 + 1 up to 2e6 (or they’re all 2e6). The minimum amount of work we could possibly do is a linear scan of the first list, a splice of its final node onto the front of the second list, a scan to the end, and return – no SortedSet operations, no deallocation or new allocations. What would you need to attain high speed in that special case? Either peek at the minimum value remaining in the set before adding a node back into it, or else some way of obtaining the first two values so you can keep looping while you’re below that 2nd-lowest starting value. Good luck!
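Here is one way that "peek before re-inserting" idea might look, combined with the splicing and sentinel suggestions above. All the names (RunMergeSolution, Entry) are hypothetical, and I'm assuming the input nodes may be rewired; after popping the minimum list, its whole leading run – every node whose value is still at or below the smallest remaining head – is spliced in before the SortedSet is touched again, so your 2-way example costs O(n) pointer splices and only a handful of set operations:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ListNode
{
    public int val;
    public ListNode next;
    public ListNode(int x) { val = x; }
}

// Hypothetical run-preserving k-way merge. Entry wraps an ORIGINAL
// list head plus a serial-number tie-breaker; output is built by
// rewiring .next pointers, never by copying nodes.
public class RunMergeSolution
{
    private class Entry
    {
        public long Seq;
        public ListNode Node;
    }

    private class EntryComparer : IComparer<Entry>
    {
        public int Compare(Entry x, Entry y)
        {
            int r = x.Node.val.CompareTo(y.Node.val);
            return r != 0 ? r : x.Seq.CompareTo(y.Seq); // keep duplicates distinct
        }
    }

    public ListNode MergeKLists(ListNode[] lists)
    {
        long seq = 0;
        var set = new SortedSet<Entry>(new EntryComparer());
        foreach (var head in lists.Where(l => l != null))
            set.Add(new Entry { Seq = seq++, Node = head });

        var dummy = new ListNode(0);   // sentinel; dummy.next is the real head
        var tail = dummy;

        while (set.Count > 0)
        {
            var min = set.Min;
            set.Remove(min);

            // Peek at the runner-up; with nothing left in the set,
            // the rest of this list can be taken in one go.
            int? ceiling = set.Count > 0 ? set.Min.Node.val : (int?)null;

            var node = min.Node;
            tail.next = node;                      // splice, don't copy
            while (node.next != null &&
                   (ceiling == null || node.next.val <= ceiling))
                node = node.next;                  // consume the whole run
            tail = node;

            if (node.next != null)
            {
                min.Node = node.next;              // re-queue the remainder
                min.Seq = seq++;                   // (reusing the Entry object)
                set.Add(min);
            }
        }

        tail.next = null;
        return dummy.next;
    }
}
```

On adversarial fully-interleaved inputs this degrades gracefully to the same O(N log k) behavior as your version, just without the per-node allocations.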
