\section{Constant time \LCA{} (\RMQ{})}%
\label{sec:constant-lca}

\subsubsection{Step 1: Reduction}%
\label{sec:first-euler-tour}
We start by reducing the problem to $\pm 1$~\textsc{RMQ}, in which adjacent elements of the array differ by exactly one.
Walk an \emph{Euler Tour} of the tree: \todo{Add figure}

Since all operations are linear, with the exception of the recursive call, we get
$T(n) = T(\frac{2}{3} n) + O(n) = O(n)$.
Linear time!

\chapter{Temporal Structures}%
\label{ch:time}

\topics{Partial persistence, full persistence, functional persistence, partial retroactivity, full retroactivity.}

Before tackling the real problem, we look at an easier one.

\subsection{List Labeling}
We want to store integer labels in a list such that insert/delete queries around a given node take constant time, and the labels along the list stay in strictly monotone order.
Let the \emph{label space} be the size of the labels as a function of the number of elements we want to store in the list.
Table~\ref{tab:list-labeling} shows the best known update bounds for different sizes of the label space.

\todo{Come back to this}

\begin{table}[b]
  \centering
  \begin{tabular}{c l}

\chapter{Geometry}

\topics{Orthogonal Range Search, Range Trees, Layered Range Trees, Fractional Cascading}

We look at problems involving geometry, for instance queries in 2D space:
given a set of points, which points lie in an axis-aligned rectangle?

In general, geometric data structures are all about data in higher dimensions.
We differentiate between static structures and dynamic structures.

\section{Line Sweep}
We look at the problem of maintaining line segments in 2D\@; we would like to store the order of the lines and the intersections.
In Chapter~\ref{ch:time} we looked at time traveling data structures.
We can use these to shave off a dimension in geometry problems by pretending that one axis is time.

By walking along the x-axis we can maintain a BBST with the points.
At each time $t$ where something happens, that is, a segment starts, a segment ends, or a crossing occurs, we translate the event into an operation on the BBST\@.
For the static version we need a persistent BBST\@.
This allows queries at a specific time, which is our $x$ coordinate.
Now, if we would like to know which line is above a point $(x, y)$, we translate the $x$ coordinate to a time $t$,
and query for the successor of $y$ in the BBST at time $t$.
We get $O(\log n)$ queries after $O(n \log n)$ preprocessing (building the persistent BBST).

The dynamic version is similar, but we need a retroactive BBST, since we must be able to insert segment-begin and segment-end events in the past.
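As a concrete, heavily simplified illustration of the static version, here is a sketch in Python of a path-copying persistent BST driven by a sweep over \emph{horizontal} segments; \texttt{segment\_above} answers ``which segment is directly above $(x, y)$?''. The restriction to horizontal segments and all names are our simplifications: a real implementation would path-copy a \emph{balanced} BST and order sloped segments by their $y$ value at the sweep line.

\begin{verbatim}
import bisect

class Node:
    """Immutable BST node; an insert or delete copies only the
    path from the root to the affected node (path copying)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def insert(t, key):
    if t is None:
        return Node(key)
    if key < t.key:
        return Node(t.key, insert(t.left, key), t.right)
    return Node(t.key, t.left, insert(t.right, key))

def delete(t, key):
    if t is None:
        return None
    if key < t.key:
        return Node(t.key, delete(t.left, key), t.right)
    if key > t.key:
        return Node(t.key, t.left, delete(t.right, key))
    if t.right is None:
        return t.left
    m = t.right
    while m.left is not None:      # splice in the successor of t.key
        m = m.left
    return Node(m.key, t.left, delete(t.right, m.key))

def successor(t, y):
    """Smallest key > y in this version, or None."""
    best = None
    while t is not None:
        if t.key > y:
            best, t = t.key, t.left
        else:
            t = t.right
    return best

def build_versions(segments):
    """segments: (x1, x2, y) triples with x1 < x2 and constant y.
    Returns event times and one BST version per event."""
    events = sorted((x, kind, y) for x1, x2, y in segments
                                 for x, kind in ((x1, 0), (x2, 1)))
    times, versions, root = [], [], None
    for x, kind, y in events:
        root = insert(root, y) if kind == 0 else delete(root, y)
        times.append(x)
        versions.append(root)
    return times, versions

def segment_above(times, versions, x, y):
    i = bisect.bisect_right(times, x) - 1   # latest event at time <= x
    return successor(versions[i], y) if i >= 0 else None
\end{verbatim}

Each update allocates only the copied path, so all versions share structure; with a balanced tree underneath this gives the claimed $O(n \log n)$ preprocessing.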
\section{Orthogonal Range Search}%
\label{sec:range-tree}
In this problem we want to maintain $n$ points in $d$-dimensional space,
and answer queries asking for the points in a $d$-dimensional box $[a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_d, b_d]$.
We may want existence, a count, or the actual points.
The time bound we are initially aiming for is $O(\log^d n + k)$, where $k$ is the number of points we return (the $+k$ term is unavoidable when reporting).
Again we differentiate between the dynamic and the static version.

\subsection{$d=1$}
We store the points in a BBST where all the points are leaves (this only doubles the space).
For a range search $[a, b]$ we find
the nodes $a' = pred(a)$ and $b' = succ(b)$ and their \code{LCA}{a', b'}, and report the points in the subtrees hanging between $a'$ and $b'$.
Since both $pred$ and $succ$ take $O(\log n)$ time, and the subtrees between $a'$ and $b'$ contain $O(k)$ nodes in total, we get queries in $O(\log n + k)$ time.

\subsection{$d=2$}
We store a one-dimensional tree on the $x$ coordinates of all points, as in the $d=1$ case.
This time, however, we augment the nodes of the tree with a new tree containing the same points, but sorted on $y$.
That is, each node is itself the root of a tree ordered on $x$ which contains some set of points $P$ as leaves,
and the node also points to an auxiliary tree which stores $P$ ordered on $y$.
The y-trees are independent and have no cross links.

We note that each point is in $O(\log n)$ trees: the main tree, and one for each ancestor node, of which there are $O(\log n)$.
On a query, we find the $O(\log n)$ subtrees of the main tree that together contain exactly the points with $x$ coordinates in the range.
Then we go through their y-trees and do a range search on $y$ in each.
This gives us $O(\log^2 n + k)$ query time.

The space requirement is only $O(n \log n)$, and construction time is also $O(n \log n)$.
Observe that the space recurrence is $S(n) = 2S(n/2) + O(n)$, the same as the time recurrence for \algo{Merge-Sort}.

\subsection{$d=D$}
The approach taken in the $d=2$ case generalizes to any dimension.
We end up with $O(\log^D n + k)$ query, $\Theta(n \log^{D-1} n)$ space and construction time,
and $\Theta(\log^D n)$ update (in the dynamic setting).


\section{Layered Range Tree}%
\label{sec:layered-range-tree}
We observe that the approach taken in the previous section is wasteful: for $d=2$ we search for the same y-interval in $O(\log n)$ trees.
We want to take advantage of this by reusing the searches.

Instead of having the nodes of the x-tree store another tree, they now only point to an array of their points, sorted on $y$.
The idea is to do a single real search on $y$, in the root node (whose array contains all the points).
Every array entry stores, for each child, a pointer to the first entry of that child's array whose $y$ value is at least its own.
Now, when we walk down from the root towards \code{LCA}{a', b'} (the pred.\ and succ.), we follow these pointers into the child arrays,
so that at every node we know our position at once.
When we reach the subtrees whose points we want to output, we therefore already have the y-interval of points we are interested in,
and since such a subtree is completely covered on $x$, these are exactly the points we are looking for,
so we get this search ``for free''.

Note that we depend on the fact that a child node stores a subset of the points that its parent stores.

We start with a binary search in the root's $y$ array, which takes $O(\log n)$ time, and then we walk down to the leaves, a walk of length $O(\log n)$.
At each step we output points, of which there are $k$ in total.
Hence we end up with $O(\log n + \log n + k) = O(\log n + k)$ query time ($O(\log^{d-1} n + k)$ in general).
The space is the same as previously: $O(n \log n)$ ($O(n \log^{d-1} n)$ in general).
The construction time is also the same, since the pointer setup can be done in linear time, so we get the same recurrence as \algo{Merge-Sort} yet again.

Unfortunately, this does not generalize to higher dimensions: we can only shave off one $\log$ factor with this approach.
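To make the pointer-following concrete, here is a small sketch in Python (the name \texttt{cascade\_pointers} is ours) of the cascading pointers from one node's y-sorted array into a child's array. A layered range tree stores such a pointer array at every node of the x-tree, so after one real binary search at the root, every ``search'' below is a single pointer lookup:

\begin{verbatim}
import bisect

def cascade_pointers(parent, child):
    """ptr[i] = index of the first element of child that is >= parent[i];
    child must be a sorted subset of the sorted list parent.
    One extra slot handles the past-the-end search position."""
    ptr, j = [], 0
    for y in parent:
        while j < len(child) and child[j] < y:
            j += 1
        ptr.append(j)
    ptr.append(len(child))
    return ptr

parent = [1, 3, 4, 6, 8, 9]     # all points of a node, sorted on y
child  = [1, 4, 8]              # one child's subset, also sorted on y
ptr = cascade_pointers(parent, child)

q = 5                           # lower end of the y-range
i = bisect.bisect_left(parent, q)               # the one real search
assert ptr[i] == bisect.bisect_left(child, q)   # child position is free
\end{verbatim}

The assertion holds exactly because the child's points are a subset of the parent's: every child element $\geq q$ is also $\geq$ the parent's first element $\geq q$.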
\section{Weight-Balanced Trees}
\newcommand{\bba}{BB[$\alpha$]}
We would like to use range trees in a dynamic setting.
The tree we look at is the \bba{} tree.
A weight-balanced tree is similar to a \emph{height}-balanced tree, which we already know: AVL trees and Red-Black trees are examples of height-balanced trees.
With weight-balanced trees we instead balance the weight of the subtrees, that is, their number of nodes, rather than their height.

More formally, we have the following invariant:
\begin{align*}
  \forall x\ &size(left(x)) \geq \alpha\ size(x)\\
  &size(right(x)) \geq \alpha\ size(x)\quad\text{where }\alpha \in (0, 1/2)
\end{align*}

A curious property of this invariant is that it implies height balance: $h \leq \log_{\frac{1}{1 - \alpha}} n$.

On update we simply insert at the leaves and update the sizes upward in the tree,
assuming all internal nodes store their subtree sizes explicitly.
When a node becomes unbalanced, we simply rebuild the entire subtree below it from scratch.
While this might seem slow, we can use an amortization scheme where we charge a subtree's rebuild time to the $\Theta(k)$ updates
that must have hit it since it was last rebuilt: a subtree of size $k$ needs many changes before its root becomes unbalanced.
The details are a little messy, but the bottom line is that we get $O(\log n)$ amortized updates.

We can apply this to the range tree from Section~\ref{sec:range-tree} to get $O(\log^d n)$ amortized updates.

We would also like to use the layered approach from Section~\ref{sec:layered-range-tree} to shave off a $\log$ factor,
but it turns out that rebuilding the arrays is problematic.
However, we only need something array-like in the root node of the tree, since we
only binary search there; we never use random access into the arrays of internal nodes, as we only follow pointers.
We can therefore replace the root array with a BBST and the internal arrays with linked lists.
We end up with the same query time, $O(\log^{d-1} n + k)$, since the procedure is exactly the same, but also get $O(\log^d n)$ updates.
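As a sketch of how little machinery the rebuild scheme needs, here is a toy Python version (entirely ours); \texttt{ALPHA} is an arbitrary choice in $(0, 1/2)$, and we use the common convention weight $=$ size $+ 1$ so that empty subtrees need no special case:

\begin{verbatim}
ALPHA = 0.25   # any constant in (0, 1/2); 0.25 is an arbitrary choice

class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.size = key, None, None, 1

def size(t):
    return t.size if t else 0

def flatten(t, out):
    """In-order traversal into a sorted list of keys."""
    if t:
        flatten(t.left, out); out.append(t.key); flatten(t.right, out)

def build(keys, lo, hi):
    """Rebuild a perfectly balanced subtree from sorted keys[lo:hi]."""
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    t = Node(keys[mid])
    t.left, t.right = build(keys, lo, mid), build(keys, mid + 1, hi)
    t.size = hi - lo
    return t

def insert(t, key):
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
    else:
        t.right = insert(t.right, key)
    t.size = 1 + size(t.left) + size(t.right)
    # Weight-balance invariant: both children need weight >= ALPHA * weight(t).
    if min(size(t.left), size(t.right)) + 1 < ALPHA * (t.size + 1):
        keys = []
        flatten(t, keys)
        t = build(keys, 0, len(keys))   # rebuild from scratch
    return t

root = None
for k in [5, 2, 8, 1, 9, 3, 7, 4, 6, 0]:
    root = insert(root, k)
\end{verbatim}

After a rebuild a subtree of size $k$ is perfectly balanced, so on the order of $k$ further updates must land in it before its root can violate the invariant again; this is why the $O(k)$ rebuilds amortize to $O(\log n)$ per update.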
\chapter{Connectivity in Dynamic Graphs}

\topics{Dynamic connectivity on trees, Euler tour trees.}

Before starting, we point out that this chapter contains fewer proofs and more stated results.

We would like to solve the problem of connectivity queries.
We maintain a graph subject to updates (edge insertions and deletions), and we answer queries of the form ``are $u$ and $v$ connected?''
As in previous sections we split the problem into two variants: fully dynamic and partially dynamic.

\begin{definition}{Fully Dynamic}
  Connectivity queries in which the updates can be both edge insertions and edge deletions.
\end{definition}

\begin{definition}{Partially Dynamic}
  Connectivity queries in which the updates can be either edge insertions \emph{or} edge deletions, but not both.
  Insertions only is called \emph{incremental}; deletions only is called \emph{decremental}.
\end{definition}

Unless specified otherwise, we consider fully dynamic connectivity.

\section{General Results}

\subsubsection{Trees}
We can handle connectivity queries for trees in $O(\log n)$ time, using Link-Cut trees or Euler-Tour trees (Section~\ref{sec:euler-tour}).
If we limit ourselves to decremental connectivity, constant time is possible.

\subsubsection{Plane Graphs}
A plane graph is a planar graph with a fixed embedding; that is, edges know which faces they divide, and updates specify the face the new element is inserted into.
As with trees, $O(\log n)$ per operation is possible.

\subsubsection{General Graphs}
Is $O(\log n)$ per operation possible?
This is an open problem, but we know how to get $O(\log^2 n)$ (amortized) updates and $O(\frac{\log n}{\log\log n})$ queries.
If we are willing to accept slower updates for faster queries, $O(\sqrt{n})$ update and $O(1)$ query is possible.

For the incremental case, we can get $\Theta(\alpha(m, n))$ amortized, where $\alpha$ is the inverse Ackermann function, by using \algo{Union-Find}.

Decremental connectivity is possible in $O(m \log n + n \text{ polylog } n + \text{\# queries})$ total time, where $m$ is the number of edges and $n$ is the number of vertices.

There is also a known fundamental lower bound: either updates or queries have to cost $\Omega(\log n)$.

\section{Euler-Tour Trees}%
\label{sec:euler-tour}
We now look at the specifics of the result on tree connectivity, namely that $O(\log n)$ per operation is possible.
We have already seen Euler tours, in Section~\ref{sec:first-euler-tour}.
The general idea is to traverse the tree and write down each node every time we visit it.
Then we build a BBST over the written-down visits, ordered by their position in the list.
Each node in the tree stores pointers to its first and last visit in the BBST\@.
The Euler-Tour tree supports the following operations:

\begin{enumerate}
  \item \algo{Make-Tree}: make a new isolated tree
  \item \code{Find-Root}{v}: find the root of $v$'s tree
  \item \code{Link}{u, v}: attach $u$ as a child of $v$
  \item \code{Cut}{v}: detach $v$ (and its subtree) from its parent
\end{enumerate}

We look at how each operation is implemented.
Before we proceed, we remind ourselves of two operations that a BBST supports in $O(\log n)$ time:
\code{Split}{x}, which turns the tree into two trees, one with the keys $< x$ and the other with the keys $> x$;
and \code{Concat}{x, y}, which turns two trees $x$ and $y$ with $\forall x_i,y_i\ x_i < y_i$ into one tree with the keys $x \cup y$.

\subsection{Make-Tree}
This is trivial: the tree for a singleton is the singleton itself.

\subsection{Find-Root}
Note that the root of the tree is not the root of the BBST\@.
We start at the first tour visit of $v$, walk up to the root of the BBST, and then down to the rightmost node.
The rightmost node is the last visit of the tour, which is the root of the \emph{actual} tree we are looking for, since an Euler tour starts and ends at the root.
This takes $O(\log n)$ time.

\subsection{Link}
We find the last occurrence of $v$ in the BBST, and insert the tour of $u$ there.
We also need to make sure that $u$ and $v$ themselves occur as they should after concatenating in $u$'s tour: a single split and two concats.

Note that $u$ has to be the root of its tree. What do we do if it is not?
We can \emph{reroot} the tree: pick up the node we want as the new root, so that the rest of the tree ``falls down''.
This is a cyclic shift of the Euler tour, and can be done with one split and one concat,
by splitting at the first occurrence of the new root and concatenating the front part onto the end.

\subsection{Cut}
We find the first and last occurrences of $v$ in the tree and cut at those two places, since $v$'s subtree
is a contiguous interval of the Euler tour.

Then we concat the first and last parts together, and remove one of the two now-adjacent occurrences of $parent(v)$:
two splits and one concat.

\subsubsection{Running Time}
Since all operations consist of walking up or down the BBST, splitting, or concatenating, each of which takes $O(\log n)$ time, we get $O(\log n)$ per operation.
Connectivity queries can be answered by comparing \code{Find-Root}{u} and \code{Find-Root}{v}.
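The tour manipulations are easy to see on a plain list. The toy sketch below (ours) stores the tour as a Python list, so every operation is a slice; an actual Euler-Tour tree stores the same sequence in a BBST, turning each slice into an $O(\log n)$ split or concat, and finds a node's visits via the stored pointers rather than \texttt{index}:

\begin{verbatim}
def first(tour, v):                 # index of v's first visit
    return tour.index(v)

def last(tour, v):                  # index of v's last visit
    return len(tour) - 1 - tour[::-1].index(v)

def reroot(tour, u):
    """Cyclic shift so the tour starts and ends at u."""
    t = tour[:-1]                   # drop the old root's closing visit
    i = t.index(u)
    t = t[i:] + t[:i]
    return t + [t[0]]               # close the tour at the new root

def cut(tour, v):
    """Detach v (assumed not the root): its visits are one interval."""
    i, j = first(tour, v), last(tour, v)
    rest = tour[:i] + tour[j + 1:]
    del rest[i]                     # v's parent now appeared twice in a row
    return rest, tour[i:j + 1]

def link(u_tour, v_tour, v):
    """Attach the tree of u_tour (rooted at its first node) below v."""
    j = last(v_tour, v)
    return v_tour[:j + 1] + u_tour + v_tour[j:]   # v re-appears after u

tour = ["r", "a", "c", "a", "r", "b", "r"]  # tour of r with children a(c), b
rest, sub = cut(tour, "a")                  # rest == ["r", "b", "r"]
print(link(sub, rest, "b"))   # ['r', 'b', 'a', 'c', 'a', 'b', 'r']
print(reroot(tour, "b"))      # ['b', 'r', 'a', 'c', 'a', 'r', 'b']
\end{verbatim}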
\section{Fully Dynamic Graphs}
We look at how to obtain $O(\log^2 n)$ amortized updates for fully dynamic graphs.
We maintain a spanning forest of the graph, using Euler-Tour trees.
Edge insertion then corresponds to \algo{Link}.
Edge deletion has two cases: if the deleted edge is not in the spanning forest we maintain, nothing changes.
If it \emph{is}, we run into trouble, since deleting the edge does not necessarily disconnect the graph:
there might be another edge that we did not use in the spanning tree, precisely because the two sides were already connected through
the edge we are now deleting.
If we know that no such replacement edge exists, we can simply \algo{Cut} the tree, and we are done.

The way we handle this is to assign \emph{levels} to edges, and store $O(\log n)$ levels of spanning forests, where going down a level may lose some edges.
All edges start at level $\log n$; an edge's level only decreases over time, and is always at least 1.

Let $G_i$ be the subgraph of edges with level $\leq i$.
Note that $G_{\log n} = G$.

Let $F_i$ be a spanning forest of $G_i$, stored using Euler-Tour trees.
Note that $F_{\log n}$ answers the connectivity queries in $G$, since this forest spans the entire graph
and supports connectivity queries in $O(\log n)$ time.

We maintain the following invariants:

\subsubsection{Invariant 1}
Every connected component of $G_i$ has $\leq 2^i$ vertices.

\subsubsection{Invariant 2}
The forests nest: $F_0 \subseteq F_1 \subseteq \cdots \subseteq F_{\log n}$,
and are given by $F_i = F_{\log n} \cap G_i$.
There is really only one forest, and $F_i$ is just the part of it with the lower levels.
This also means that $F_{\log n}$ is a minimum spanning forest with respect to edge levels.

\subsubsection{Insertion}
On insertion of $e = (u, v)$ we set $e.level = \log n$, and add $e$ to $u$'s and $v$'s incidence lists.
If $u$ and $v$ are not already connected, we also add $e$ to $F_{\log n}$.
This makes for $O(\log n)$ insertion.

\subsubsection{Removal}
This is the hard part.
We start by removing $e$ from the incidence lists of its endpoints.
This can be done in constant time if the edge stores pointers to its positions in those lists.
Then, if $e \notin F_{\log n}$, we are done (if $e$ is in any forest it is in $F_{\log n}$, since the forests nest).
Otherwise, we have to delete $e$ from $F_{e.level}, \dots, F_{\log n}$, which are exactly the forests $e$ lives in.
All of these are Euler-Tour trees, and there are at most $O(\log n)$ of them, for a total cost of $O(\log^2 n)$.

Now we have to look for a replacement edge. We know by Invariant 2 that there is no replacement edge with a lower level,
since such an edge would be in the forest instead of $e$.
So if there is a replacement edge, it has level $\geq e.level$.
We search upwards from $e.level$ to $\log n$.
For each level $i$ we let $T_u$ and $T_v$ be the trees of $F_i$ containing $u$ and $v$.
Without loss of generality, let $|T_u| \leq |T_v|$.
By Invariant 1 the sizes of these trees are limited: $|T_u| + |T_v| \leq 2^i$, since they were connected before deleting $e$.
This means that $|T_u| \leq 2^{i-1}$, so we \emph{can} push all edges in $T_u$ down to level $i-1$ without destroying Invariant 1.
We will use this as the charging scheme to get the amortized running time we want.

We look at all edges $e' = (x, y)$, $x \in T_u$, at level $i$.
Such an edge is either internal to $T_u$, or goes to $T_v$, like $e$ does.
Why can it not go to a third tree $T_w$?
Before deleting $e$, the tree $T_u \cup \{e\} \cup T_v$ spanned the connected component of $G_i$ containing $x$;
since $e'$ has level $i$ it lies in $G_i$, so its other endpoint $y$ belongs to that same component, and hence to $T_u$ or $T_v$.
A third tree $T_w$ is simply not reachable in $G_i$.

If $e'$ is internal to $T_u$ it does not help us, so we set $e'.level = i - 1$, which we just argued we can afford.
If $e'$ goes to $T_v$ we are done, since it is a replacement edge; we insert it into $F_i, \dots, F_{\log n}$.

Overall we pay $O(\log^2 n + \log n \cdot \text{\# level decreases})$,
but the number of level decreases is bounded by the number of insertions times $\log n$,
since edge levels only decrease and lie between 1 and $\log n$.
We can charge each insertion $O(\log n)$, making the amortized cost of a delete $O(\log^2 n)$, which is what we wanted.

The last complication is that we need to augment the trees with subtree sizes at every node, in order to make the comparison $|T_u| \leq |T_v|$ in constant time,
and we must somehow find all edges at a certain level.
To handle the latter, every internal node of the Euler-Tour trees stores one bit per level, signaling whether any node in its subtree
has an incident level-$i$ edge.
In addition, the adjacency lists of the nodes are split into one list per level, instead of one list for all edges.
This makes finding the next level-$i$ edge $O(\log n)$.
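The moving parts fit together as in the following toy sketch in Python (entirely ours). Forests are plain edge sets and components are recomputed by BFS, so it is far too slow, but it mirrors the control flow: levels, the upward search, pushing the smaller tree's level-$i$ tree edges down before scanning its non-tree edges (the order that keeps Invariant 2 intact), and the charged push-downs:

\begin{verbatim}
import math
from collections import deque

class ToyHDT:
    """Toy mirror of the level scheme: forests are edge sets, components
    are found by BFS. Every step the real structure does with an
    Euler-tour tree in O(log n) is brute force here."""

    def __init__(self, n):
        self.top = math.ceil(math.log2(max(2, n)))   # levels 1..log n
        self.level = {}      # edge (u, v) with u < v  ->  current level
        self.tree = set()    # edges of F_{log n}

    def _component(self, u, i):
        """Vertex set of u's tree in F_i (tree edges of level <= i)."""
        seen, queue = {u}, deque([u])
        while queue:
            x = queue.popleft()
            for (a, b) in self.tree:
                if self.level[(a, b)] <= i and x in (a, b):
                    y = b if x == a else a
                    if y not in seen:
                        seen.add(y)
                        queue.append(y)
        return seen

    def connected(self, u, v):
        return v in self._component(u, self.top)

    def insert(self, u, v):
        if not self.connected(u, v):
            self.tree.add((min(u, v), max(u, v)))
        self.level[(min(u, v), max(u, v))] = self.top

    def delete(self, u, v):
        e = (min(u, v), max(u, v))
        lv = self.level.pop(e)
        if e not in self.tree:
            return                            # non-tree edge: nothing changes
        self.tree.discard(e)
        for i in range(lv, self.top + 1):     # search upward for a replacement
            Tu, Tv = self._component(u, i), self._component(v, i)
            if len(Tu) > len(Tv):
                Tu, Tv = Tv, Tu
            for f in list(self.level):        # push Tu's level-i tree edges down
                if (self.level[f] == i and f in self.tree
                        and f[0] in Tu and f[1] in Tu):
                    self.level[f] = i - 1
            for f in list(self.level):        # scan level-i non-tree edges
                if self.level[f] != i or f in self.tree:
                    continue
                if f[0] in Tu or f[1] in Tu:
                    if f[0] in Tv or f[1] in Tv:
                        self.tree.add(f)      # replacement found
                        return
                    self.level[f] = i - 1     # internal to Tu: charged push-down

g = ToyHDT(8)
g.insert(0, 1); g.insert(1, 2); g.insert(0, 2)  # triangle; (0,2) is non-tree
g.delete(0, 1)                                  # (0,2) is found as replacement
assert g.connected(0, 1) and g.connected(0, 2)
\end{verbatim}

The push-downs are exactly what the amortization charges: each edge can be pushed at most $\log n$ times, once per level.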
\section{Other Results}
We list some other related results in the field of connectivity.

\subsection{$k$-Connectivity}
2-edge connectivity can be maintained in $O(\log^4 n)$ time per operation, and 2-vertex connectivity in $O(\log^5 n)$ time.

\subsection{Minimum Spanning Forest}
The general problem of maintaining a minimum spanning forest can be solved dynamically in $O(\log^4 n)$ time per operation.

\chapter{Lower Bounds}
