LATEX-L@LISTSERV.UNI-HEIDELBERG.DE

 Options: Use Classic View Use Monospaced Font Show Text Part by Default Show All Mail Headers Topic: [<< First] [< Prev] [Next >] [Last >>]

 Re: Trees, was Re: Mapping Functions Versions for All and Some Lars Hellström <[log in to unmask]> Fri, 3 Feb 2012 12:59:22 +0100 text/plain (151 lines) Bruno Le Floch skrev 2012-02-02 19.24: > Hello Lars, > > The Church Boolean discussion is not forgotten, just on the > back-burner with so many things at the same time. It seemed to bring > some performance improvements, but there were some awkward parts to it > in the way I initially tried to implement them. In your last email on > the topic a few months back, you had a better approach some working > code, which I didn't get to benchmark carefully. I definitely need to > get back to boolean expressions, since at the moment > (...)&&(...)||(...)&&(...) treats&& and || with the same priority, > which is wrong. Whereas with the way I did infix expression parsing, it would be difficult not to give them different priorities. :-) > Perhaps that will mean switching to Church booleans, > perhaps not: there are a lot of issues to consider, including > bootstrapping expl3, having a nice interaction with (future) fp > expression, no breaking change, etc. > >> PS: Since I'm posting anyway, I suppose I should mention this too in case >> anyone is interested: After my autumn experiments with Church booleans, I >> went on to implement a fully expandable package for to> text> mappings using 2-3-trees. I didn't quite finish it before getting >> sidetracked by other projects (in this case, actual research), but insertion >> of entries is there and works. Next would have been popping off entries; >> together the two would suffice for making a prioriqueue, which I would >> imagine can be useful in places. > > I'm interested in what you have to say about 2-3 trees. I implemented > sorting based on splay trees, which I think I will also use for > Unicode character property lookup in l3regex eventually. None of that > code is in the svn yet. > > What do you mean by converting one to a? I started out wanting to encode a "sparse array". Lookup keys are integers (since these are easy to compare), whereas values can be arbitrary pieces of balanced text. > Could you give a couple of examples if you have time? OK. The following (somewhat artificial) piece of code builds a tree by staring with an empty tree and then inserting some entires into it (in mock-random order, to illustrate that the order is arbitrary: \ttt_empty_tree:n     { \ttt_insert_entry:nnnn{0}{noll} }     { \ttt_insert_entry:nnnn{1}{ett} }     { \ttt_insert_entry:nnnn{4}{fyra} }     { \ttt_insert_entry:nnnn{6}{sex} }     { \ttt_insert_entry:nnnn{8}{\r{a}tta} }     { \ttt_insert_entry:nnnn{9}{nio} }     { \ttt_insert_entry:nnnn{10}{tio} }     { \ttt_insert_entry:nnnn{12}{tolv} }     { \ttt_insert_entry:nnnn{2}{tv\aa} }     { \ttt_insert_entry:nnnn{3}{tre} }     { \ttt_insert_entry:nnnn{5}{fem} }     { \ttt_insert_entry:nnnn{7}{sju} }     { \ttt_insert_entry:nnnn{11}{elva} }     { \ttt_insert_entry:nnnn{13}{tretton} }     { \cs_set:Npn\l_my_tree } (The argument specifiers reflect the argument structure present when the macro is expanded. All but the first command above have one "n" argument for the tree being built, and the value of that is of course not explicit in the code.) After that, \l_my_tree is the \long macro: ->d{t{l{0}{noll}}{1}{ett}{l{2}{tv\aa }}{3}{tre}{L{4}{fyra}{5}{fem}}}{6}{sex}{t{L{7}{sju}{8}{\r {a}tta}}{9}{nio}{l{10}{tio}}{11}{elva}{L{12}{tolv}{13}{tretton}}}. Note the \aa in the entry for 2; this shows that the values have not undergone expansion when the tree was being built. The tree structure is   0---+       1   2---+----+       3 |   4 | |   +---+ |   5 |            6---   7 |   +---+ |   8 | |       9 | 10---+----+      11 12 |   +---+ 13 Overhead compared to a flat list is $3n-2$ tokens ($n$ letters and $2n-2$ braces), where $n$ is the number of nodes in the tree. The relation between $n$ and the number of entries $N$ is that $$n-1 \leq N \leq 2n$$. Looking up an entry is straightforward; the commands \message{Looking~up~8:} \exp_args:Nno\ttt_lookup_entry:nnSF{8}{\l_my_tree}    {\typeout}{\message{--NotFound--}} \message{Looking~up~2:} \exp_args:Nno\ttt_lookup_entry:nnSF{2}{\l_my_tree}    {\typeout}{\message{--NotFound--}} \message{Looking~up~22:} \exp_args:Nno\ttt_lookup_entry:nnSF{22}{\l_my_tree}    {\typeout}{\message{--NotFound--}} produces Looking up 8: \r {a}tta Looking up 2: tv\r a Looking up 22: --NotFound-- (\typeout expands the \aa, but it is called as \typeout{tv\aa}). The nonstandard "S" argument specifier might be read as "success": it is in spirit similar to T (true branch), but has a different syntax (extra argument(s) is supplied), so a ...SF macro would not be a Church boolean. I also implemented nonexact searches (not so different for a tree as the above whose keys are dense in an interval, but one can pick a search key outside that interval): \message{Infimum~of~22~is} \def\messagetypeout:nn#1#2{\message{#1:}\typeout{#2}} % Helper for showing                                                        % result \exp_args:Nno\ttt_get_infimum:nnSF{22}{\l_my_tree}    {\messagetypeout:nn}{\message{--NotFound--}} produces Infimum of 22 is 13: tretton All operations complete in $O(\log n)$ expansions. One might argue that the true asymptotic complexity is more like $$O(S\log n) \geq O(n\log n)$$, where $S$ is the total number of tokens in the tree, but the term contributing this is for TeX grabbing tokens of a macro argument, the constant factor of which is very small. Hence the practical (wall-clock) complexity is more likely to also be $O(\log n)$. :-) Lars Hellström