Reasoning about RoboCup soccer narratives
Hannaneh Hajishirzi, Julia Hockenmaier, et al.
UAI 2011
Sequencing by hybridization is a method of reconstructing a long DNA string - that is, figuring out its nucleotide sequence - from knowledge of its short substrings. Unique reconstruction is not always possible, and the goal of this paper is to study the number of reconstructions of a random string. For a given string, the number of reconstructions is determined by the pattern of repeated substrings; in an appropriate limit substrings will occur at most twice, so the pattern of repeats is given by a pairing: a string of length 2n in which each symbol occurs twice. A pairing induces a 2-in, 2-out graph, whose directed edges are defined by successive symbols of the pairing - for example the pairing ABBCAC induces the graph with edges AB, BB, BC, and so forth - and the number of reconstructions is simply the number of Euler circuits in this 2-in, 2-out graph. The original problem is thus transformed into one about pairings: to find the number fk(n) of n-symbol pairings having k Euler circuits. We show how to compute this function, in closed form, for any fixed k, and we present the functions explicitly for k=1,...,9. The key is a decomposition theorem: the Euler "circuit number" of a pairing is the product of the circuit numbers of "component" sub-pairings. These components come from connected components of the "interlace graph", which has the pairing's symbols as vertices, and edges when symbols are "interlaced". (A and B are interlaced if the pairing has the form ABAB or BABA.) We carry these results back to the original question about DNA strings, and provide a total variation distance upper bound for the approximation error. We perform an asymptotic enumeration of 2-in, 2-out digraphs to show that, for a typical random n-pairing, the number of Euler circuits is of order no smaller than 2n/n, and the expected number is asymptotically at least e-1/22n-1/n. Since any n-pairing has at most 2n-1 Euler circuits, this pinpoints the exponential growth rate. © 2000 Elsevier Science B.V.
Hannaneh Hajishirzi, Julia Hockenmaier, et al.
UAI 2011
Peter Wendt
Electronic Imaging: Advanced Devices and Systems 1990
John R. Kender, Rick Kjeldsen
IEEE Transactions on Pattern Analysis and Machine Intelligence
J.P. Locquet, J. Perret, et al.
SPIE Optical Science, Engineering, and Instrumentation 1998