diff options
Diffstat (limited to 'book.tex')
 -rw-r--r--  book.tex | 259
 1 file changed, 257 insertions, 2 deletions
@@ -7,6 +7,7 @@
 \usepackage{booktabs}
 \usepackage{amsmath}
 \usepackage{amsfonts}
+\usepackage{amssymb}
 \usepackage{mathtools}
 \usepackage{tikz}
 \usetikzlibrary{positioning,calc,shapes}
@@ -602,8 +603,262 @@ Linear time!
 \chapter{Temporal Structures}
-Partial persistency, full persistency, funcitonal persistency,
-pareial retroactivity, full retroactivity.
+\topics{Partial persistency, full persistency, functional persistency, partial retroactivity, full retroactivity.}
+
+When working with temporal data structures, our machine model is the \emph{Pointer Machine}.
+
+\begin{definition}{Pointer Machine}
+  A computational model in which we have \emph{Nodes} with $O(1)$ fields.
+  The fields can be values or \emph{Pointers}, which point to other nodes.
+  Field access, node creation, and data operations all take $O(1)$ time.
+  The pointer machine does not have arrays.
+\end{definition}
+
+
+\section{Persistence}
+A persistent data structure is a data structure that ``never changes'':
+a mutating operation on the structure produces a new structure.
+This means that every operation on the data structure is relative to a specific version of the structure.
+
+There are four levels of persistence:
+\begin{enumerate}
+  \item Partial persistence: Only the latest version is editable. The versions are linearly ordered through time.
+  \item Full persistence: Any version can be updated to produce a new version. The versions form a tree.
+  \item Confluent persistence: Versions can be merged to produce a new version. The version graph is now a DAG\@.
+  \item Functional: Nodes in the data structure are never modified, only new ones are created. The current version of the structure is maintained by a pointer.
+\end{enumerate}
+
+
+\subsection{Partial Persistence}
+Any pointer machine data structure with $\leq p = O(1)$ pointers into any node in any version can be made partially persistent,
+with $O(1)$ amortized multiplicative overhead and $O(1)$ space per change.
+We show how this can be done.
+
+Each node stores \emph{Back Pointers}, which point to all nodes that have a pointer to this node.
+Since there are at most $p$ such pointers, this is $O(1)$ extra space.
+In addition, each node stores a history of modifications as $(\textit{version}, \textit{field}, \textit{value})$ tuples.
+The history has size $\leq 2p$ (using the fact that $p = O(1)$).
+On a field read we read the value explicitly stored in the node, and then apply all the modifications that were made up to the version we are reading from.
+When the history is full, we create a new node with the entire modification history applied to its fields, and an empty history.
+Now we need to update the back pointers of the nodes we point to, since these nodes also store back pointers.
+This is easy, since the back pointers are not persistent.
+But we also need to update the forward pointers of other nodes (the nodes in our back pointer list),
+and these pointers are persistent, so we must recurse.
+
+It is not trivial to see that this recursion terminates.
+What if the nodes we update get a full history from updating their pointer to us,
+which makes us update our pointers, making our history full again?
+
+We can use the \emph{Potential Method} for amortized analysis.
+Recall that the \emph{Amortized cost} is the actual cost plus the change in potential.
+Let $\Phi = c \cdot \Sigma$, where $\Sigma$ is the number of modifications stored in the latest versions of all nodes.
+Observe that when we make a change, the amortized cost is
+$$\leq c + c + \big[{-2cp} + p \cdot \text{recursions}\big]\text{?}$$
+The first $c$ is for computation, and the second $c$ is the added potential, since we just added one modification to a node.
+The optional bracketed term (marked ``?'') accounts for a recursion, if there is one:
+copying a node empties its full history of $2p$ modifications, releasing $2cp$ potential, and triggers $\leq p$ recursive pointer updates.
+If there is no recursion, we have constant time.
+If there is a recursion, we replace it by the same expression, which causes the $-2cp$ to cancel.
+We may end up with yet another recursion, but whenever one happens, it again cancels a $-2cp$ term.
+
+In conclusion, we use $O(3p) = O(1)$ extra space for each node, and pay $O(1)$ extra time on updates,
+to get partial persistence for any pointer machine data structure.
+
+
+\subsection{Full Persistence}
+We take a similar approach to full persistence as we did with partial persistence.
+The first difference is that we need \emph{all} pointers to be bi-directional, and not only the field pointers, as previously.
+The second and most important difference is that the versions of the structure no longer form a line, but a tree.
+To get around this problem we linearize the version tree:
+traverse the tree, and write out the first and last visit of each node\todo{add figure}.
+
+However, we need this structure to be dynamic,
+so we use an \emph{Order Maintenance data structure},
+which supports insertion before or after a given item in $O(1)$ time (like a linked list),
+and supports relative order queries between two items in $O(1)$ time.
+
+Let $p$ still be the in-degree of nodes.
+Let $d$ be the maximum number of fields in a node.
+We now allow $2(d + p + 1)$ modifications in a node's history.
+
+A single node's history still consists of the same triples, but the $version$s can no longer be compared by $\leq$,
+since the versions are in a tree. This is what we need the $O(1)$ relative ordering for.
+
+On a field read we can simply go through all modifications in the node (since there is a constant number of them),
+and use relative ordering to find the latest update of the field we are reading, made at an ancestor of the current version.
+
+Updates when the history is full are handled differently than in partial persistence.
+We would like to split the history tree into two parts of the same size,
+and apply all of the modifications from the root to the newly cut out subtree to the new node.
+This way the original node loses the modifications in the new subtree, and the new node loses
+the modifications that are left in the original node.
+Now we need to update pointers.
+We have $d$ fields, $p$ back pointers, and $d + p + 1$ items in the history, all of which could be pointers,
+which makes $\leq 2d + 2p + 1$ pointers.
+The amortization scheme still works out (although the potential function is slightly different), and we get $O(1)$ amortized updates.
+
+\subsubsection{Confluent and Functional Persistence}
+Confluent persistence and functional persistence were not covered in class.
+
+
+\section{Temporal Data Structures}
+
+Temporal data structures allow all operations to specify a time at which the operation is to be performed.
+The key difference between temporal and persistent structures is that in a temporal structure previous alterations to the structure
+should ``propagate'' through time, and update the current structure.
+A persistent data structure would not update the current structure, but simply make a new data structure for each of the paths taken through time.
+
+\subsection{Retroactivity}
+We need three operations:
+\begin{enumerate}
+  \item \code{Insert}{t, $\ op(\dots)$}: retroactively perform the specified operation at time $t$.
+  \item \code{Delete}{t}: retroactively undo the operation at time $t$.
+  \item \code{Query}{t, $\ op(\dots)$}: execute the query at time $t$.
+\end{enumerate}
+
+We differentiate between \emph{Partial retroactivity}, in which we are only allowed to \algo{Query} in the present,
+and \emph{Full retroactivity}, in which we may query at any time.
+
+\subsubsection{A Simple Case}
+If the updates we perform are commutative ($x, y = y, x$) and invertible ($x, x^{-1} = \varnothing$)
+then partial retroactivity is easy:
+insert operations can be done in the present, since the ordering does not matter, and
+delete operations can be done as inverse insert operations.
+
+An example is summing a sequence of numbers.
+If we sum $1 + 3$ and would like to insert a $+2$ in between, we can just put it at the end.
+If we want to delete the $+3$, we can add a $-3$ at the end.
+
+\subsection{Full Retroactivity}
+What do we need for full retroactivity?
+
+\begin{definition}{Decomposable Search Problem}
+\label{sec:dsp}
+  $\text{query}(x, A \cup B) = f(\text{query}(x, A), \text{query}(x, B))$
+\end{definition}
+
+\begin{example}
+  \algo{Nearest-Neighbour}, \algo{Successor}, and \algo{Point-Location} are all decomposable search problems.
+\end{example}
+
+If what we query is a DSP, we can achieve full retroactivity with an $O(\log m)$ factor overhead,
+where $m$ is the number of retroactive operations.
+We can do this by using a \emph{Segment Tree} over the operations.
+We think of each operation as living in a time interval, from when it is made until it is deleted.
+Then we can store the operation in the nodes of the maximal subtrees that cover its interval.
+Each node in the segment tree has its own copy of the underlying data structure which we are making retroactive.
+Since there are $O(\log m)$ nodes an element can be in, and the individual results can be joined (via $f$, since the problem is a DSP),
+we get $O(\log m)$ multiplicative overhead.
+
+
+\subsection{General Transformations}
+What can we do in general?
+The most obvious method is the \emph{Rollback Method}:
+when we want to make a retroactive change, we roll back the data structure to the given time,
+make the change, and replay all changes that come after.
+This makes for an $O(r)$ multiplicative overhead, where $r$ is the number of time units we go into the past.
+Unfortunately, depending on the exact model used, this is the best we can do in general.
+
+There is a lower bound: $\Omega(r)$ overhead can be necessary for retroactivity.
+This means that the most efficient way to make retroactive changes is to go back, make the change,
+and redo whatever comes after --- the rollback method!
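As an illustration of the rollback method, here is a minimal Python sketch (ours, not from the text; the class and method names are hypothetical). It records the full operation history keyed by time, so a retroactive insert or delete is trivial to record, but every present-time query replays the whole history on a fresh structure --- exactly the $O(r)$ redo cost of rolling back and replaying.

```python
class RollbackRetroactive:
    """Partially retroactive wrapper using the rollback method.

    Retroactive changes are cheap to record, but each query rebuilds the
    structure by replaying every recorded operation in time order.
    """

    def __init__(self, make_structure):
        self.make_structure = make_structure  # factory for an empty structure
        self.history = {}                     # time -> operation (one op per time)

    def insert(self, t, op):
        """Retroactively perform `op` at time t."""
        self.history[t] = op

    def delete(self, t):
        """Retroactively undo the operation recorded at time t."""
        del self.history[t]

    def query_now(self):
        """Roll back to time 0 and redo all operations in time order."""
        s = self.make_structure()
        for t in sorted(self.history):
            self.history[t](s)
        return s


# Example: retroactively inserting an append "in the past".
q = RollbackRetroactive(list)
q.insert(1, lambda s: s.append(5))
q.insert(3, lambda s: s.append(7))
q.insert(2, lambda s: s.append(2))   # lands between the two earlier appends
print(q.query_now())                 # -> [5, 2, 7]
```

With commutative and invertible updates the replay is unnecessary, as in the simple summing case above, but in general the $\Omega(r)$ lower bound shows the replay cannot always be avoided.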
+
+To see why the $\Omega(r)$ lower bound holds, consider a very simple computer with two registers $X$ and $Y$, and with the following operations:
+$X = x$, $Y \mathrel{+}= \Delta$, and $Y = XY$.
+On query, the machine returns $Y$, and all operations are $O(1)$.
+The operation sequence
+$$\big<Y \mathrel{+}= a_n,\ Y = XY,\ Y \mathrel{+}= a_{n-1},\ \dots,\ Y \mathrel{+}= a_0\big>$$
+computes the polynomial $\sum_{i=0}^{n} a_{i}x^{i}$ by Horner's rule.
+We can use this to compute the polynomial for any $x$ by retroactively inserting $X = x$ at $t = 0$.
+However, it is not possible to re-evaluate such a polynomial using field operations any faster than just evaluating it again,
+so the retroactive change must cost $\Omega(n)$.
+
+
+\subsection{Priority Queue}
+We now look at an example of a retroactive priority queue that supports
+\algo{Insert} and \algo{Delete-Min}, and is partially retroactive.
+We assume keys are only inserted once.
+
+We can plot the lifetime of the queue in 2D, where the x dimension is time and the y dimension is key value.
+\todo{add plot}
+Keys in the queue are plotted as points when they are inserted, and are extended as horizontal rays while they remain in the queue.
+On \algo{Delete-Min}, we shoot a ray from the x-axis at the time of the delete upward until it hits a horizontal ray.
+This makes $\rceil$ patterns.
+
+Let $Q_{t}$ be the set of keys in the queue at time $t$.
+What happens if we \code{Insert}{t, \text{insert}$(k)$}?
+The new key's ray will extend to the right and hit a deletion ray, which will delete it.
+This will make the element that ray previously deleted not be deleted after all,
+so its ray will extend further to the right and hit another deletion ray, and so on.
+In effect, the element that ends up not getting deleted is
+$\max\{k, k' \mid k' \text{\ deleted at time} \geq t\}$, but this relation is hard to maintain directly.
+
+In order to make things easier we define a \emph{Bridge}.
+
+\begin{definition}{Bridge}
+  A Bridge is a time $t$ such that $Q_t \subseteq Q_{\text{now}}$.
+\end{definition}
+
+Now if $t'$ is the bridge preceding $t$, we see that
+$$\max\{k' \mid k' \text{\ deleted at time} \geq t\} = \max\{k' \notin Q_{\text{now}} \mid k' \text{\ inserted at time} \geq t'\}$$
+In other terms, the largest key deleted after a time $t$ is the largest key inserted after the previous bridge that is not in the final set.
+
+Now we can store $Q_{\text{now}}$ as a BBST keyed on values, and all insertions in a BBST keyed on time.
+The nodes of the latter tree are also augmented with the value of the largest insert in their subtree that is not in $Q_{\text{now}}$.
+At last, we store a third BBST with \emph{all} the updates, ordered on time, and augmented as follows:
+$$
+\text{Augmentation} =
+\begin{cases}
+  0 & \text{if \code{insert}{k}, and\ } k \in Q_\text{now}\\
+  +1 & \text{if \code{insert}{k}, and\ } k \notin Q_\text{now}\\
+  -1 & \text{if \algo{Delete-Min}}\\
+\end{cases}
+$$
+In addition the internal nodes store subtree sums and min/max prefix sums.
+We do this in order to detect bridges, since a bridge is exactly a prefix summing to $0$.
+When we have to find out which element to insert into $Q_{\text{now}}$, we can walk up from the node in the insertion BST,
+and find the max to the right, since this BST stores this value for all subtrees.
+This takes $O(\log n)$ time.
+
+
+\subsection{Other Structures}
+We list other structures that can be made retroactive.
+A queue can be made partially retroactive with $O(1)$ overhead and fully retroactive with $O(\log m)$ overhead.
+A deque and \algo{Union-Find} can also be made fully retroactive with $O(\log m)$ overhead.
+The priority queue, which we just made partially retroactive with $O(\log m)$ overhead,
+can be made fully retroactive with $O(\sqrt{m} \log m)$ overhead.
+Successor queries can be answered with $O(\log m)$ overhead in the partially retroactive setting,
+and since successor search is a decomposable search problem (see~\ref{sec:dsp}) we can pay a $\log$ factor to make it fully retroactive,
+with $O(\log^2 m)$ overhead. However, it is also possible to get full retroactivity with only $O(\log m)$ overhead.
+
+
+\section{List Order Maintenance}
+Before tackling the real problem, we look at an easier problem.
+
+\subsection{List Labeling}
+We want to maintain integer labels for the elements of a list under insertions and deletions at a given position,
+such that the labels are in a strictly monotone ordering (order queries then reduce to constant-time label comparisons).
+The \emph{Label Space} is the size of the labels as a function of the number of elements we want to store in the list.
+Table~\ref{tab:list-labeling} shows the best known updates for different sizes of the label space.
+
+\begin{table}[b]
+  \centering
+  \begin{tabular}{c l}
+    Label Space & Best known update\\\midrule
+    $(1+\epsilon)n \dots n \log n$ & $O(\log^2 n)$ \\
+    $n^{1 + \epsilon} \dots n^{O(1)}$ & $O(\log n)$ \\
+    $2^n$ & $O(1)$
+  \end{tabular}
+  \caption{%
+\label{tab:list-labeling}%
+    Label space vs.\ best known update for \algo{List-Labeling}}
+\end{table}
+
+
+
+
+
+\chapter{Geometry}
 \chapter{Connectivity in Dynamic Graphs}
