MIT 6.830 Lab5
这个Lab要求实现SimpleDB所用的B+树索引结构,包括对B+树的增删查操作。
给定一个field和一个page,要从这个page向下递归找到tuple所在的叶子节点。
具体的实现思路为:
- 获取数据页的类型
- 判断该数据页是否为叶子节点,若是则结束递归,返回结果
- 若不是则说明该页为非叶子节点,将页面进行类型转换
- 获取非叶子节点索引的迭代器
- 对非叶子节点的entry进行迭代,若field为空,则直接找到最左的叶子节点即可
- 找到第一个大于等于field的entry,然后递归其左子节点
- 若到了最后一个页面,则递归其右子节点
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
| private BTreeLeafPage findLeafPage(TransactionId tid, Map<PageId, Page> dirtypages, BTreePageId pid, Permissions perm,Field f) throws DbException, TransactionAbortedException { if (pid.pgcateg() == BTreePageId.LEAF){ BTreeLeafPage page = (BTreeLeafPage) getPage(tid, dirtypages, pid, perm); return page; } BTreeInternalPage page = (BTreeInternalPage) getPage(tid, dirtypages, pid, perm); Iterator<BTreeEntry> iterator = page.iterator(); BTreeEntry next = null; if (f == null){ if (iterator.hasNext()){ next = iterator.next(); return findLeafPage(tid, dirtypages, next.getLeftChild(), perm, f); } return null; } while (iterator.hasNext()){ next = iterator.next(); Field key = next.getKey(); if (f.compare(Op.LESS_THAN_OR_EQ, key)){ return findLeafPage(tid, dirtypages, next.getLeftChild(), perm, f); } } if (next != null){ return findLeafPage(tid, dirtypages, next.getRightChild(), perm, f); } return null; }
BTreeLeafPage findLeafPage(TransactionId tid, BTreePageId pid, Field f) throws DbException, TransactionAbortedException { return findLeafPage(tid, new HashMap<>(), pid, Permissions.READ_ONLY, f); }
|
在对B+树作插入节点的操作时,需要考虑slot是否已满。若已满,则要分裂当前slot中的节点,具体地,可分为分裂非叶子节点和分裂叶子节点两种情况。
思路为:
- 新建一个leaf Page,作为新的页面
- 将满页面的entry复制到新页面中,边复制边删除
- 检查之前的满页面是否有右兄弟,若有则更新指针
- 更新脏页
- 更新兄弟指针
- 找出父节点并创建entry进行插入,最后更新脏页
- 根据field找出要插入的页面并返回
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
| public BTreeLeafPage splitLeafPage(TransactionId tid, Map<PageId, Page> dirtypages, BTreeLeafPage page, Field field) throws DbException, IOException, TransactionAbortedException { int mid = page.getNumTuples()/2; BTreeLeafPage newPage = (BTreeLeafPage) getEmptyPage(tid, dirtypages, BTreePageId.LEAF); Iterator<Tuple> iterator = page.reverseIterator(); while (iterator.hasNext() && mid > 0){ Tuple next = iterator.next(); page.deleteTuple(next); newPage.insertTuple(next); mid--; } Tuple tuple = iterator.next(); BTreeInternalPage parentPage = getParentWithEmptySlots(tid, dirtypages, page.getParentId(), field); assert tuple != null; BTreeEntry insertEntry = new BTreeEntry(tuple.getField(keyField), page.getId(), newPage.getId()); parentPage.insertEntry(insertEntry); if (page.getRightSiblingId() != null){ BTreeLeafPage right = (BTreeLeafPage) getPage(tid, dirtypages, page.getRightSiblingId(), Permissions.READ_WRITE); right.setLeftSiblingId(newPage.getId()); dirtypages.put(right.getId(), right); } newPage.setRightSiblingId(page.getRightSiblingId()); newPage.setLeftSiblingId(page.getId()); page.setRightSiblingId(newPage.getId()); page.setParentId(parentPage.getId()); newPage.setParentId(parentPage.getId()); dirtypages.put(parentPage.getId(), parentPage); dirtypages.put(page.getId(), page); if (tuple.getField(keyField).compare(Op.GREATER_THAN_OR_EQ, field)){ return page; } return newPage; }
|
思路为:
- 新建一个internal Page,作为新页面
- 将满页面的entry复制到新页面,边复制边删除
- 将中间节点挤出去
- 更新脏页
- 更新左右孩子指针
- 更新左右页面的孩子指针
- 根据中间节点获取父节点,将tuple插入到父节点中,并更新脏页和指针
- 根据field找到要插入的页面并返回
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
| public BTreeInternalPage splitInternalPage(TransactionId tid, Map<PageId, Page> dirtypages, BTreeInternalPage page, Field field) throws DbException, IOException, TransactionAbortedException { int mid = page.getNumEntries()/2; BTreeInternalPage newPage = (BTreeInternalPage) getEmptyPage(tid, dirtypages, BTreePageId.INTERNAL); Iterator<BTreeEntry> iterator = page.reverseIterator(); while (iterator.hasNext() && mid > 0){ BTreeEntry next = iterator.next(); page.deleteKeyAndRightChild(next); newPage.insertEntry(next); mid--; } BTreeEntry tuple = iterator.next(); page.deleteKeyAndRightChild(tuple); tuple.setLeftChild(page.getId()); tuple.setRightChild(newPage.getId()); updateParentPointers(tid, dirtypages, newPage); BTreeInternalPage parentPage = getParentWithEmptySlots(tid, dirtypages, page.getParentId(), field); parentPage.insertEntry(tuple); page.setParentId(parentPage.getId()); newPage.setParentId(parentPage.getId()); dirtypages.put(parentPage.getId(), parentPage); dirtypages.put(newPage.getId(), newPage); dirtypages.put(page.getId(), page); if (tuple.getKey().compare(Op.GREATER_THAN_OR_EQ, field)){ return page; } return newPage; }
|
删除操作涉及到后续slot中的节点分配,最好是让所有slot的空间分配较为合理,既不会太空也不会很快达到分裂的程度。
因此有两种重新分配的策略,一种是两个页面的元组数量求平均,另一种是一个页面留下总容量的一半,这里采用前者来实现。
1 2 3 4 5 6 7 8 9 10 11 12 13 14
| public void stealFromLeafPage(BTreeLeafPage page, BTreeLeafPage sibling, BTreeInternalPage parent, BTreeEntry entry, boolean isRightSibling) throws DbException { int mid = (page.getNumTuples() + sibling.getNumTuples()) / 2 - page.getNumTuples(); Iterator<Tuple> iterator = isRightSibling ? sibling.iterator() : sibling.reverseIterator(); Tuple next = null; while (mid > 0){ next = iterator.next(); sibling.deleteTuple(next); page.insertTuple(next); mid--; } entry.setKey(next.getField(keyField)); parent.updateEntry(entry); }
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
| public void stealFromLeftInternalPage(TransactionId tid, Map<PageId, Page> dirtypages, BTreeInternalPage page, BTreeInternalPage leftSibling, BTreeInternalPage parent, BTreeEntry parentEntry) throws DbException, TransactionAbortedException { int mid = (page.getNumEntries() + leftSibling.getNumEntries()) / 2 - page.getNumEntries(); if (mid <= 0) return; Iterator<BTreeEntry> entryIterator = leftSibling.reverseIterator(); Iterator<BTreeEntry> iterator = page.iterator(); BTreeEntry last = entryIterator.next(); BTreeEntry first = iterator.next(); BTreePageId leftChild = last.getRightChild(); BTreePageId rightChild = first.getLeftChild(); BTreeEntry insertEntry = new BTreeEntry(parentEntry.getKey(), leftChild, rightChild); page.insertEntry(insertEntry); if (mid == 1){ leftSibling.deleteKeyAndRightChild(last); parentEntry.setKey(last.getKey()); parent.updateEntry(parentEntry); updateParentPointers(tid, dirtypages, page); dirtypages.put(parent.getId(), parent); dirtypages.put(page.getId(), page); dirtypages.put(leftSibling.getId(), leftSibling); return; } else { leftSibling.deleteKeyAndRightChild(last); page.insertEntry(last); mid -= 2; } while (mid > 0){ BTreeEntry next = entryIterator.next(); leftSibling.deleteKeyAndRightChild(next); page.insertEntry(next); mid--; } BTreeEntry tuple = entryIterator.next(); leftSibling.deleteKeyAndRightChild(tuple); parentEntry.setKey(tuple.getKey()); parent.updateEntry(parentEntry); updateParentPointers(tid, dirtypages, page); dirtypages.put(parent.getId(), parent); dirtypages.put(page.getId(), page); dirtypages.put(leftSibling.getId(), leftSibling); }
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
| public void stealFromRightInternalPage(TransactionId tid, Map<PageId, Page> dirtypages, BTreeInternalPage page, BTreeInternalPage rightSibling, BTreeInternalPage parent, BTreeEntry parentEntry) throws DbException, TransactionAbortedException { int mid = (page.getNumEntries() + rightSibling.getNumEntries()) / 2 - page.getNumEntries(); if (mid <= 0) return; Iterator<BTreeEntry> entryIterator = rightSibling.iterator(); Iterator<BTreeEntry> iterator = page.reverseIterator(); BTreeEntry first = entryIterator.next(); BTreeEntry last = iterator.next(); BTreePageId rightChild = first.getLeftChild(); BTreePageId leftChild = last.getRightChild(); BTreeEntry insertEntry = new BTreeEntry(parentEntry.getKey(), leftChild, rightChild); page.insertEntry(insertEntry); if (mid == 1){ rightSibling.deleteKeyAndLeftChild(first); parentEntry.setKey(first.getKey()); parent.updateEntry(parentEntry); updateParentPointers(tid, dirtypages, page); dirtypages.put(parent.getId(), parent); dirtypages.put(page.getId(), page); dirtypages.put(rightSibling.getId(), rightSibling); return; } else { rightSibling.deleteKeyAndRightChild(first); page.insertEntry(first); mid -= 2; } while (mid > 0){ BTreeEntry next = entryIterator.next(); rightSibling.deleteKeyAndLeftChild(next); page.insertEntry(next); mid--; } BTreeEntry tuple = entryIterator.next(); rightSibling.deleteKeyAndLeftChild(tuple); parentEntry.setKey(tuple.getKey()); parent.updateEntry(parentEntry); updateParentPointers(tid, dirtypages, page); dirtypages.put(parent.getId(), parent); dirtypages.put(page.getId(), page); dirtypages.put(rightSibling.getId(), rightSibling); }
|
页面的合并需要注意指针的更新。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
| public void mergeLeafPages(TransactionId tid, Map<PageId, Page> dirtypages, BTreeLeafPage leftPage, BTreeLeafPage rightPage, BTreeInternalPage parent, BTreeEntry parentEntry) throws DbException, IOException, TransactionAbortedException { Iterator<Tuple> iterator = rightPage.iterator(); leftPage.setRightSiblingId(rightPage.getRightSiblingId()); if (rightPage.getRightSiblingId() != null){ BTreeLeafPage leafPage = (BTreeLeafPage) getPage(tid, dirtypages, rightPage.getRightSiblingId(), Permissions.READ_WRITE); leafPage.setLeftSiblingId(leafPage.getId()); dirtypages.put(leafPage.getId(), leafPage); } while (iterator.hasNext()){ Tuple next = iterator.next(); rightPage.deleteTuple(next); leftPage.insertTuple(next); } setEmptyPage(tid, dirtypages, rightPage.getId().getPageNumber()); deleteParentEntry(tid, dirtypages, leftPage, parent, parentEntry); dirtypages.remove(rightPage.getId()); dirtypages.put(leftPage.getId(), leftPage); dirtypages.put(parent.getId(), parent); }
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
| public void mergeInternalPages(TransactionId tid, Map<PageId, Page> dirtypages, BTreeInternalPage leftPage, BTreeInternalPage rightPage, BTreeInternalPage parent, BTreeEntry parentEntry) throws DbException, IOException, TransactionAbortedException { Iterator<BTreeEntry> rightIterator = rightPage.iterator(); Iterator<BTreeEntry> leftIterator = leftPage.reverseIterator(); BTreeEntry first = rightIterator.next(); BTreeEntry last =leftIterator.next(); BTreePageId rightChild = first.getLeftChild(); BTreePageId leftChild = last.getRightChild(); BTreeEntry insertEntry = new BTreeEntry(parentEntry.getKey(), leftChild, rightChild); leftPage.insertEntry(insertEntry); rightPage.deleteKeyAndLeftChild(first); leftPage.insertEntry(first); while (rightIterator.hasNext()){ BTreeEntry next = rightIterator.next(); rightPage.deleteKeyAndLeftChild(next); leftPage.insertEntry(next); } updateParentPointers(tid, dirtypages, leftPage); setEmptyPage(tid, dirtypages, rightPage.getId().getPageNumber()); deleteParentEntry(tid, dirtypages, leftPage, parent, parentEntry); dirtypages.remove(rightPage.getId()); dirtypages.put(leftPage.getId(), leftPage); dirtypages.put(parent.getId(), parent); }
|