30 April 2023
Goal: Finish implementing a DemuxSort class and see it working
Completing the implementation of
DemuxSort took longer than anticipated because I forgot to put a looping construct in the enumerator and spun my wheels debugging. In any case, it’s done!
For a manual test I initialized three scanners and passed them to the sorting class, ordering by timestamp. It worked! However, I was puzzled when I tried sorting by a different column and direction. The results were chaotic. BUT! That’s because the scanner sources need to be ordered in the same way that the demux sorter is ordering. Otherwise the algorithm doesn’t make sense. You can’t divide and conquer if your strategy changes between the diving and the conquering! My puzzled looks near the end of the video can be resolved.
My puzzlement led me to want to put
DemuxSort under test next time, but I don’t think I will. It would be nice to have things under test, but I think that might deflate my motivation. Instead, I’d like to get back to implementing the pass 1..n code.
Of note, the divide and conquer strategy seems useful for a general case – even if the dataset to be sorted isn’t so large to require out-of-core sorting. That is, it may be useful to still operate on chunks of data (eg, 4kb) and sort those, then put the sorted pages behind a demux sort interface. In that case, I’d think that the
DemuxSort implementation I have is nice because it takes a scanner instance and the scanner can worry about the details of what and how it’s scanning. Though, I should be sure that scanners reading from disk are reading 4kb at a time so that we get enough data for each IO.
Written by Dan Drust on 30 April 2023
Continue Reading: Database Daily: Pass 0 - Divide …