path: root/book.tex
Diffstat (limited to 'book.tex')
-rw-r--r--  book.tex  259
1 file changed, 257 insertions(+), 2 deletions(-)
diff --git a/book.tex b/book.tex
index 209b474..2a62342 100644
--- a/book.tex
+++ b/book.tex
@@ -7,6 +7,7 @@
\usepackage{booktabs}
\usepackage{amsmath}
\usepackage{amsfonts}
+\usepackage{amssymb}
\usepackage{mathtools}
\usepackage{tikz}
\usetikzlibrary{positioning,calc,shapes}
@@ -602,8 +603,262 @@ Linear time!
\chapter{Temporal Structures}
-Partial persistency, full persistency, funcitonal persistency,
-pareial retroactivity, full retroactivity.
+\topics{Partial persistency, full persistency, functional persistency, partial retroactivity, full retroactivity.}
+
+When working with temporal data structures, our machine model is the \emph{Pointer Machine}
+
+\begin{definition}{Pointer Machine}
+  A computational model in which we have \emph{Nodes} with $O(1)$ fields.
+  The fields can be values or \emph{Pointers}, which point to other nodes.
+  Field access, node creation, and data operations all take $O(1)$ time.
+ The pointer machine does not have arrays.
+\end{definition}
+
+
+\section{Persistence}
+A persistent data structure is a data structure that ``never changes'':
+every mutating operation produces a new structure instead of modifying the old one.
+This means that every operation on the data structure is relative to a specific version of the structure.
+
+There are four levels of persistence:
+\begin{enumerate}
+  \item Partial persistence: Only the latest version is editable. The versions are linearly ordered through time.
+  \item Full persistence: Any version can be updated to produce a new version. The versions form a tree.
+  \item Confluent persistence: Versions can be merged to produce a new version. The version graph is now a DAG\@.
+  \item Functional: Nodes in the data structure are never modified, only new ones are created. The version of the structure is identified by a pointer.
+\end{enumerate}
+
+
+\subsection{Partial Persistence}
+Any data structure using the pointer machine with $\leq p = O(1)$ pointers to any node in any version can be made partially persistent,
+with $O(1)$ amortized multiplicative overhead and $O(1)$ space per change.
+We show how this can be done.
+
+Each node stores \emph{Back Pointers}, which point to all nodes that have a pointer to this node.
+Since there are at most $p$ such pointers, this is $O(1)$ extra space.
+In addition, each node stores a history of modifications as $(version, field, value)$ tuples.
+The history is of size $\leq 2p$ (using the fact that $p=O(1)$).
+On field reads we read the value explicitly stored in the node, and then apply all the modifications that were made up to the version
+we are reading from.
+When the history is full, we create a new node with the entire modification history applied to its fields, and an empty history.
+Now we need to update the back pointers of the nodes we point to, since these nodes also store back pointers.
+This is easy, since the back pointers are not persistent.
+But we also need to update the pointers (not back pointers) of other nodes (namely the nodes in our back pointer list),
+and these pointers are persistent, so we must recurse.
+
+It is not trivial to see that this recursion terminates.
+What if the nodes we update get a full history from updating their pointer to us,
+forcing us to update our pointers and filling our history again?
+
+We can use the \emph{Potential Method} for amortized analysis.
+Remember that the \emph{Amortized cost} is the actual cost plus the change in potential.
+Let $\Phi = c \cdot \Sigma$, where $\Sigma$ is the total number of modifications stored in the latest versions of all nodes.
+Observe that when we make a change, the amortized cost is
+$$\leq c + c + \big[-2cp + p \cdot \text{recursions}\big],$$
+where the bracketed term is only present if the history overflows.
+The first $c$ is for computation, and the second $c$ is the added potential, since we
+just added one modification to a node; the $-2cp$ is the potential released by emptying a full history of $2p$ modifications.
+If there is no recursion, we have constant time.
+If there is a recursion, we substitute the same expression for it, which causes the $-2cp$ to cancel.
+We may end up with yet another recursion, but whenever one happens, it releases its own $-2cp$ term, so the total stays $O(1)$ amortized.
+
+In conclusion, we use $O(3p) = O(1)$ extra space for each node, and pay $O(1)$ amortized extra time on updates, to
+get partial persistence for any pointer-machine data structure.
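The modification-log idea above can be sketched in a few lines. This is a hedged toy illustration (the names `PartialNode` and `MAX_MODS` are invented here): it tracks a single node's bounded history and the overflow step, while omitting back pointers and the recursive pointer updates.

```python
# Sketch of the "fat node" behind partial persistence: each node stores its
# base fields plus a bounded modification log of (version, field, value)
# tuples. Reads replay the log up to the requested version; when the log
# overflows, a fresh node is built with the log applied. Back pointers and
# recursive pointer updates are omitted.

MAX_MODS = 4  # stands in for the 2p bound from the analysis

class PartialNode:
    def __init__(self, fields):
        self.fields = dict(fields)  # field values as of this node's creation
        self.mods = []              # (version, field, value), version-ordered

    def read(self, field, version):
        value = self.fields[field]
        for v, f, val in self.mods:
            if v <= version and f == field:
                value = val  # later log entries override earlier ones
        return value

    def write(self, field, value, version):
        """Write in the latest version; may return a replacement node."""
        if len(self.mods) < MAX_MODS:
            self.mods.append((version, field, value))
            return self
        # Log is full: build a new node with all modifications applied.
        # In the real structure, readers of old versions keep using the
        # old node; only pointers into the latest version are redirected.
        fresh = PartialNode(self.fields)
        for _, f, val in self.mods:
            fresh.fields[f] = val
        fresh.fields[field] = value
        return fresh

node = PartialNode({"key": 1})
node = node.write("key", 2, version=1)
assert node.read("key", version=0) == 1   # old version still readable
assert node.read("key", version=1) == 2
```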
+
+
+\subsection{Full Persistence}
+We take a similar approach to full persistency as we did with partial persistency.
+The first difference is that we need \emph{all} pointers to be bi-directional, and not only the field pointers, as previously.
+The second and most important difference is that the versions of the structure no longer form a line, but a tree.
+In order to get around this problem we linearize the version tree:
+traverse the tree, and write out the first and last visit of each node\todo{add figure}.
+
+However, we need this structure to be dynamic,
+so we use an \emph{Order Maintenance data structure},
+which supports insertion before or after a given item in $O(1)$ (like a linked list),
+and supports relative order query of two items in $O(1)$ time.
+
+Let $p$ still be the in-degree of nodes.
+Let $d$ be the maximum number of fields in a node.
+We now allow $2(d + p + 1)$ modifications in a node's history.
+
+A single node's history still consists of the same triples, but the $version$s can no longer be compared by $\leq$,
+since the versions form a tree. This is what we need the $O(1)$ relative ordering for.
+
+On a field read we can simply go through all modifications in the node (since there is a constant number of them),
+and use relative ordering to find the latest update of the field we are reading that came from an ancestor of the current version.
+
+Updates when the history is full are handled differently from the approach taken in partial persistence.
+We would like to split the history tree into two parts of roughly the same size,
+and apply all of the modifications from the root down to the newly cut out subtree to the new node.
+This way the original node loses the modifications in the new subtree, and the new node loses
+the modifications that are left in the original node.
+Now we need to update pointers.
+We have $d$ fields, $p$ back pointers, and $d + p + 1$ items in the history, all of which could be pointers,
+which makes $\leq 2d + 2p + 1$ pointers.
+The amortization scheme still works out (although the potential function is slightly different), and we get $O(1)$ amortized updates.
+
+Confluent and functional persistence were not covered in class.
+
+
+\section{Temporal Data Structures}
+
+Temporal data structures allow all operations to specify a time in which the operation is to be performed.
+The key difference between temporal and persistent structures is that in a temporal structure previous alterations to the structure
+should ``propagate'' through time, and update the current structure.
+A persistent data structure would not update the current structure, but simply make a new data structure for each of the paths taken through time.
+
+\subsection{Retroactivity}
+We need three operations:
+\begin{enumerate}
+ \item \code{Insert}{t, $\ op(\dots)$}: retroactively perform the specified operation at time $t$.
+ \item \code{Delete}{t}: retroactively undo the operation at time $t$.
+ \item \code{Query}{t, $\ op(\dots)$}: execute the query at time $t$.
+\end{enumerate}
+
+We differentiate between \emph{Partial retroactivity}, in which we are only allowed to \algo{Query} in the present,
+and \emph{Full retroactivity}, in which we may query whenever we want.
+
+\subsubsection{A Simple Case}
+If the updates we perform are commutative ($x \circ y = y \circ x$) and invertible ($x \circ x^{-1} = \varnothing$),
+then partial retroactivity is easy:
+insert operations can be performed in the present, since the ordering does not matter, and
+delete operations can be performed as inverse insert operations.
+
+An example is if we were to sum a sequence of numbers.
+If we sum $1 + 3$ and would like to insert a $+2$ in between, we can put it at the end.
+If we want to delete the $+3$, we can add a $-3$ at the end.
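The running-sum example can be sketched directly. `RetroactiveSum` is an invented name for a minimal partially retroactive structure built purely on commutativity and invertibility of addition.

```python
# Hedged sketch: a partially retroactive "running sum". Insert(t, +x) just
# adds x in the present (commutative, so order is irrelevant); Delete(t)
# applies the inverse of the operation at time t. Only the present can be
# queried, which is exactly partial retroactivity.

class RetroactiveSum:
    def __init__(self):
        self.total = 0
        self.ops = {}  # time -> operand, so delete(t) can find its inverse

    def insert(self, t, x):
        self.ops[t] = x
        self.total += x          # order does not matter: commutative

    def delete(self, t):
        self.total -= self.ops.pop(t)  # apply the inverse operation

    def query_now(self):
        return self.total

s = RetroactiveSum()
s.insert(0, 1)
s.insert(2, 3)
s.insert(1, 2)   # retroactively "in between": same as appending +2
assert s.query_now() == 6
s.delete(2)      # retroactively undo the +3
assert s.query_now() == 3
```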
+
+\subsection{Full Retroactivity}
+What do we need for full retroactivity?
+
+\begin{definition}{Decomposable Search Problem}
+\label{sec:dsp}
+ $\text{query}(x, A\cup B) = f(\text{query}(x, A), \text{query}(x, B))$
+\end{definition}
+
+\begin{example}
+ \algo{Nearest-Neighbour}, \algo{Successor}, and \algo{Point-Location} are all decomposable search problems.
+\end{example}
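As a small illustration of the definition (function names are invented here), \algo{Successor} is decomposable with $f = \min$, taking \texttt{None} to mean ``no successor'':

```python
# Successor is a decomposable search problem: the successor of x in A ∪ B
# is the smaller of the successors in A and B, so f = min (with None
# standing for "no successor").

def successor(x, s):
    candidates = [y for y in s if y > x]
    return min(candidates) if candidates else None

def combine(a, b):  # the f from the DSP definition
    return min((v for v in (a, b) if v is not None), default=None)

A, B = {1, 5, 9}, {3, 7}
assert successor(4, A | B) == combine(successor(4, A), successor(4, B)) == 5
```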
+
+If what we query is a DSP, we can achieve full retroactivity with an $O(\log m)$ factor overhead,
+where $m$ is the number of retroactive operations.
+We can do this by using a \emph{Segment Tree} over the operations.
+We think of each operation as living in a time interval, from when it is made until it is deleted.
+Then we can store the operations in the nodes of the maximal subtrees that cover the interval.
+Each node in the segment tree has its own copy of the underlying data structure which we are making retroactive.
+Since there are $O(\log m)$ nodes an element can be in, and the individual results can be combined (via $f$, since the problem is a DSP),
+we get $O(\log m)$ multiplicative overhead.
+
+
+\subsection{General Transformations}
+What can we do in general?
+The most obvious method is the \emph{Rollback Method}:
+when we want to make a retroactive change, we roll back the data structure to the given time,
+make the change, and replay all changes that come after it.
+This gives an $O(r)$ multiplicative overhead, where $r$ is the number of time units in the past.
+Unfortunately, depending on the exact model used, this is the best we can do in general.
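A minimal sketch of the rollback method on a toy structure (all names here are invented): keep the full operation log, splice retroactive changes into it, and replay. For clarity this sketch replays the whole log on every query rather than only the suffix after time $t$.

```python
# Rollback method sketch: the log is the single source of truth; a
# retroactive insert splices an operation in at time t, a retroactive
# delete removes it, and queries rebuild state by replaying the log.
import bisect

class RollbackCounter:
    """Toy structure: a counter with additive updates, made retroactive."""

    def __init__(self):
        self.log = []  # (time, delta), kept sorted by time

    def retro_insert(self, t, delta):
        bisect.insort(self.log, (t, delta))

    def retro_delete(self, t):
        self.log = [(u, d) for u, d in self.log if u != t]

    def query_now(self):
        # Replay the history; a real rollback implementation would only
        # replay the operations after the changed time.
        return sum(d for _, d in self.log)

c = RollbackCounter()
c.retro_insert(10, 5)
c.retro_insert(5, 2)   # lands before the +5 in the timeline
assert c.query_now() == 7
c.retro_delete(10)
assert c.query_now() == 2
```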
+
+There is a lower bound: $\Omega(r)$ overhead can be necessary for retroactivity.
+This means that the most efficient way to make retroactive changes is to go back, make the change,
+and redo whatever comes after --- the rollback method!
+
+To see why this is the case, consider a very simple computer with two registers $X$ and $Y$, and with the following operations:
+$X=x$, $Y\mathrel{+}=\Delta$, and $Y = XY$.
+On query, the machine returns $Y$, and all operations are $O(1)$.
+The operation sequence
+$$\big<Y \mathrel{+}= a_n,\ Y=XY,\ Y\mathrel{+}= a_{n-1},\ \dots,\ Y\mathrel{+}= a_0\big>$$
+computes the polynomial $\sum_{i=0}^{n} a_{i}x^{i}$ by Horner's rule.
+We can use this to compute the polynomial for any $x$ by retroactively inserting $X=x$ at $t=0$.
+However, it is not possible to re-evaluate such a polynomial using field operations any faster than evaluating it from scratch.
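The machine above can be simulated in a few lines to see the Horner evaluation; `run` is an invented helper that replays the operation sequence after a retroactive $X = x$ at $t = 0$:

```python
# Simulating the X/Y machine: replaying
# <Y += a_n, Y = XY, Y += a_{n-1}, ..., Y += a_0>
# with X set to x evaluates sum_{i=0}^{n} a_i x^i by Horner's rule.

def run(x, coeffs):
    """coeffs = [a_n, ..., a_0]; returns the polynomial evaluated at x."""
    X, Y = x, 0
    for i, a in enumerate(coeffs):
        if i > 0:
            Y = X * Y   # the Y = XY operation between consecutive adds
        Y += a          # the Y += a_i operation
    return Y

# p(x) = 2x^2 + 3x + 1 at x = 5: 2*25 + 15 + 1 = 66
assert run(5, [2, 3, 1]) == 66
```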
+
+
+\subsection{Priority Queue}
+We now look at an example of a retroactive priority queue that supports
+\algo{Insert}, \algo{Delete-Min}, and is partially retroactive.
+We assume keys are only inserted once.
+
+We can plot the lifetime of the queue in 2D, where the $x$ dimension is time and the $y$ dimension is key value.
+\todo{add plot}
+Keys in the queue are plotted as points when they are inserted, and are extended to the right as horizontal rays.
+On \algo{Delete-Min}, we shoot a ray from the $x$-axis at the time of the delete upward until it hits a horizontal ray.
+This makes $\rceil$ patterns.
+
+Let $Q_{t}$ be the set of keys in the queue at time $t$.
+What happens if we \code{Insert}{t, \text{insert}$(k)$}?
+Its ray will extend to the right and hit a deletion ray, which will delete it.
+This will make the element that ray previously deleted to not be deleted after all,
+so its ray will extend further to the right and hit another deletion ray.
+In effect, the element that ends up not getting deleted is
+$\max\{k, k' \mid k' \text{\ deleted at time} \geq t\}$, but this relation is hard to maintain.
+
+In order to make things easier we define a \emph{Bridge}.
+
+\begin{definition}{Bridge}
+  A Bridge is a time $t$ such that $Q_t \subseteq Q_{\text{now}}$.
+\end{definition}
+
+Now if $t'$ is the bridge preceding $t$ we see that
+$$\max\{k'\mid k' \text{\ deleted at time}\geq t\} = \max\{k'\notin Q_{\text{now}} \mid k' \text{\ inserted at time}\geq t'\}$$
+In other terms, the largest key deleted after a time $t$ is the largest key inserted after the previous bridge that is not in the final set.
+
+Now we can store $Q_{\text{now}}$ as a BBST on values, and all insertions in a BBST on time.
+The latter tree's nodes are also augmented with the value of the largest insert in their subtree that is not in $Q_{\text{now}}$.
+At last, we store a third BBST with \emph{all} the updates, ordered on time, and also augmented as follows:
+$$
+\text{Augmentation} =
+\begin{cases}
+  0 & \text{if \code{insert}{k}, and\ } k \in Q_\text{now}\\
+  +1 & \text{if \code{insert}{k}, and\ } k \notin Q_\text{now}\\
+  -1 & \text{if \algo{Delete-Min}}\\
+\end{cases}
+$$
+In addition, the internal nodes store subtree sums and min/max prefix sums.
+We do this in order to detect bridges, since a bridge is a prefix summing to 0.
+When we have to find out which element to insert into $Q_{\text{now}}$, we can walk up from the bridge in the insertion BST
+and find the max to the right, since this BST stores this value for all subtrees.
+This takes $O(\log n)$ time.
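The bridge condition can be illustrated with a plain prefix-sum scan; this is an $O(n)$ stand-in for the $O(\log n)$ BBST walk (function name invented), where `weights` encodes the augmentation values above:

```python
# Bridge detection via prefix sums: each update contributes
# 0 (insert(k), k in Q_now), +1 (insert(k), k not in Q_now), or
# -1 (Delete-Min). A bridge is a time whose prefix sums to 0; the
# augmented BBST finds it in O(log n), here we scan for clarity.

def bridge_before(weights, t):
    """Latest index <= t whose prefix sum is 0, or -1 if none."""
    best, prefix = -1, 0
    for i, w in enumerate(weights[: t + 1]):
        prefix += w
        if prefix == 0:
            best = i
    return best

# insert(7) survives (0), insert(3) later deleted (+1),
# Delete-Min (-1), insert(9) survives (0):
weights = [0, 1, -1, 0]
assert bridge_before(weights, 3) == 3   # prefix through index 3 sums to 0
assert bridge_before(weights, 1) == 0   # the +1 at index 1 breaks the bridge
```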
+
+
+\subsection{Other Structures}
+We list other structures that can be made retroactive.
+A queue can be made partially retroactive with $O(1)$ overhead and fully retroactive with $O(\log m)$ overhead.
+A deque and \algo{Union-Find} can also be made fully retroactive with $O(\log m)$ overhead.
+The priority queue, which we just made partially retroactive with $O(\log m)$ overhead,
+can be made fully retroactive with $O(\sqrt{m} \log m)$ overhead.
+Successor queries can be made partially retroactive with $O(\log m)$ overhead,
+and since successor is a decomposable search problem (see~\ref{sec:dsp}) we can pay a $\log$ factor to make it fully retroactive,
+with $O(\log^2 m)$ overhead. However, it is also possible to get full retroactivity with only $O(\log m)$ overhead.
+
+
+\section{List Order Maintenance}
+Before tackling the real problem, we look at an easier problem.
+
+\subsection{List Labeling}
+We want to store integer labels in a list, such that inserting or deleting an element at a given position is supported,
+and the labels appear in strictly monotone order along the list.
+\emph{Label Space} is the size of the labels as a function of the number of elements in the list we want to store.
+Table~\ref{tab:list-labeling} shows the best known updates for different sizes of the label space.
+
+\begin{table}[b]
+ \centering
+ \begin{tabular}{c l}
+ Label Space & Best known update\\\midrule
+ $(1+\epsilon)n\dots n \log n$ & $O(\log^2 n)$ \\
+ $n^{1 + \epsilon} \dots n^{O(1)}$ & $O(\log n)$ \\
+ $2^n$ & $O(1)$
+ \end{tabular}
+ \caption{%
+\label{tab:list-labeling}%
+ Table showing label space vs best known update for \algo{List-Labeling}}
+\end{table}
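The $2^n$ row of Table~\ref{tab:list-labeling} can be illustrated by midpoint insertion (helper name invented): with exponentially large label space, every insertion between two labels finds room, so each update is $O(1)$ and no relabeling is needed for $n$ insertions.

```python
# With label space 2^n, inserting between neighbors can always take the
# midpoint: each insertion at worst halves a gap, and a gap of size 2^n
# survives n halvings, so no relabeling is ever triggered.

def label_between(lo, hi):
    assert hi - lo > 1, "gap exhausted; smaller label spaces must relabel"
    return (lo + hi) // 2

labels = [0, 2 ** 8]  # sentinels bounding a small label space
labels.insert(1, label_between(labels[0], labels[1]))   # midpoint 128
labels.insert(1, label_between(labels[0], labels[1]))   # midpoint 64
assert labels == sorted(labels) and labels[1:3] == [64, 128]
```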
+
+
+
+
+
+\chapter{Geometry}
\chapter{Connectivity in Dynamic Graphs}