COIS Project Programming Part
Heapsort COIS Project Programming Part For all of these make sure you show testing of each relevant cases, both testing your code *and* comparing to what the tree should look like. Whatever your output, it’s up to you to show the testing in a readable way. (This is a bit different from earlier assignments where linear structures print off nicely to show their state).
|HeapSort Code + Testing (ensure you show the state of the heap before sorting, the state after)|
Obviously a heapsort just… sorts the result.
|Heap working correctly|
|AVL: modified to support generics, with testing|
|AVL: Nodes support parent|
Splay (2 marks total)
|Splay: Find (this is the same as any BST search/find)|
|Splay: modified to support generics, with testing|
2-3-4 (note: insert is given to you in the lab)
|2-3-4: modified to support generics, with testing|
You may want more columns to show building with small/medium/large number of elements. As usual aim for something that takes on the order of 10 seconds at least as your large data set.
|Building Tree with series of inserts||Find||Traverse|
|Not in 2023||N/A||N/A|
|Theory 1 O(1) for heap insert|
|Theory 2: Min max priority Queue|
|Theory 3: Showing that a Red-Black equivalent to a 2-3-4 tree|
|Theory 4: B tree is a file system|
You are going to test and implement your own Heap, AVL, Splay, and 2-3-4 trees (note that heaps and AVL is mostly in the lab). Your implementations need to support generics, (using icomparable) so you need to test with sortable objects or simple version of those you make yourself, but I would test these with integers first. Your AVL tree should have nodes keeping track of both their children and their parent.
- AVL Tree.
Red Black Tree Normally I ask for Red-Black Trees in this assignment, but I don’ think there’s time in summer 2023
- Splay Tree.
- A 2-3-4 Tree. (These are also called 2-4 Trees or 2,4 trees) – this is also mostly in a lab.
It’s up to you to write test cases to show that the trees are working properly for small numbers of inputs (~ 20 elements are fine).
- Traverse (Breadth first + one of the depth first traversals, I suggest in-order). For AVL and Splay these traversals are the same (exactly the same) as the BST ones, since they are both BSTs.
- Find (note that in the splay tree “Find” puts the found node at the root)
- FindAndDelete (this should search for a specific entry, if it exists delete it and then properly fix the tree, if it doesn’t exist it can do nothing)
You will likely need to make several methods which make this work (rotations for example), show tests for each of those as well.
AVL, Red black, and 2-3-4 trees largely solve the same problem: At any point the tree should be close to balanced. So, if you insert a bunch of unbalanced data which one is fastest keeping it balanced? If you insert random data which is faster about keeping it balanced? The point of the balance is so search is relatively efficient, so once you have a tree, how long does it take to find things. I will leave it to you to design the experiment here – but it’s straightforward.
Splay Trees and heaps on the other hand, are a bit different, since they solve a different problems.
Splay Trees are a BST where the most recently accessed node is the root. This is useful for lots of things, leaderboards, text processing, and rather importantly: caching. The problem is that designing an experiment to evaluate caching behaviour is pretty subtle: high level it’s just store a bunch of data and then read that data back with an access pattern that is biased toward certain frequently accessed values. In words that’s easy, building a valid mathematical model of that is beyond what you have time for. So run your splay tree through the same testing as your AVL tree above, but understand that the results don’t really mean much.
Heap: This is used for problems like a priority queue, the ‘kth’ element of a list, or a sequence from k1… k2 (where k1 and k2 would be in the middle of a list somewhere, say the 10th to 20th smallest elements in a list of 40). This is neat, but there’s nothing to compare it to. So, build a large heap and compare how long it takes to build compared to the AVL/RB/2-3-4 but beyond that, you can’t really say too much, you could compare to your circular priority queues but as with the splay tree case, you’re sort of out of time for that sort of complex stuff. You should implement heapsort and test that.
- The main feature of a binary heap is that on average an insert is O(1) whereas a Binary Search Tree is O(log(n)), explain why this is, and show the timing for this, with your examples in (1) and (2) from the programming questions.
- Design a Min/max priority queue. You only need to design (not program) a data type that supports the following operations:
- insert, delete the maximum, and delete the minimum (all in logarithmic time);
- and find the maximum and find the minimum (both in constant time). Hint: Use two heaps
- Theory Question (Note: I realise there are answers to this on the web). Show how Red-Black Tree insertions are equivalent to 2-3-4 tree insertions.
- How would you use a B tree combined with a linked list to make a filesystem? (This is essentially FAT from MS-DOS)