## LATEX-L@LISTSERV.UNI-HEIDELBERG.DE

 Options: Use Forum View Use Monospaced Font Show Text Part by Default Condense Mail Headers Message: [<< First] [< Prev] [Next >] [Last >>] Topic: [<< First] [< Prev] [Next >] [Last >>] Author: [<< First] [< Prev] [Next >] [Last >>]

Bruno Le Floch skrev 2012-02-02 19.24:
> Hello Lars,
>
> The Church Boolean discussion is not forgotten, just on the
> back-burner with so many things at the same time. It seemed to bring
> some performance improvements, but there were some awkward parts to it
> in the way I initially tried to implement them. In your last email on
> the topic a few months back, you had a better approach some working
> code, which I didn't get to benchmark carefully. I definitely need to
> get back to boolean expressions, since at the moment
> (...)&&(...)||(...)&&(...) treats&&  and || with the same priority,
> which is wrong.

Whereas with the way I did infix expression parsing, it would be difficult
not to give them different priorities. :-)

> Perhaps that will mean switching to Church booleans,
> perhaps not: there are a lot of issues to consider, including
> bootstrapping expl3, having a nice interaction with (future) fp
> expression, no breaking change, etc.
>
>> PS: Since I'm posting anyway, I suppose I should mention this too in case
>> anyone is interested: After my autumn experiments with Church booleans, I
>> went on to implement a fully expandable package for<integer>  to<balanced
>> text>  mappings using 2-3-trees. I didn't quite finish it before getting
>> sidetracked by other projects (in this case, actual research), but insertion
>> of entries is there and works. Next would have been popping off entries;
>> together the two would suffice for making a prioriqueue, which I would
>> imagine can be useful in places.
>
> I'm interested in what you have to say about 2-3 trees. I implemented
> sorting based on splay trees, which I think I will also use for
> Unicode character property lookup in l3regex eventually. None of that
> code is in the svn yet.
>
> What do you mean by converting one<integer>  to a<balanced text>?

I started out wanting to encode a "sparse array". Lookup keys are integers
(since these are easy to compare), whereas values can be arbitrary pieces of
balanced text.

> Could you give a couple of examples if you have time?

OK. The following (somewhat artificial) piece of code builds a tree by
staring with an empty tree and then inserting some entires into it (in
mock-random order, to illustrate that the order is arbitrary:

\ttt_empty_tree:n
{ \ttt_insert_entry:nnnn{0}{noll} }
{ \ttt_insert_entry:nnnn{1}{ett} }
{ \ttt_insert_entry:nnnn{4}{fyra} }
{ \ttt_insert_entry:nnnn{6}{sex} }
{ \ttt_insert_entry:nnnn{8}{\r{a}tta} }
{ \ttt_insert_entry:nnnn{9}{nio} }
{ \ttt_insert_entry:nnnn{10}{tio} }
{ \ttt_insert_entry:nnnn{12}{tolv} }
{ \ttt_insert_entry:nnnn{2}{tv\aa} }
{ \ttt_insert_entry:nnnn{3}{tre} }
{ \ttt_insert_entry:nnnn{5}{fem} }
{ \ttt_insert_entry:nnnn{7}{sju} }
{ \ttt_insert_entry:nnnn{11}{elva} }
{ \ttt_insert_entry:nnnn{13}{tretton} }
{ \cs_set:Npn\l_my_tree }

(The argument specifiers reflect the argument structure present when the
macro is expanded. All but the first command above have one "n" argument for
the tree being built, and the value of that is of course not explicit in the
code.)

After that, \l_my_tree is the \long macro:
->d{t{l{0}{noll}}{1}{ett}{l{2}{tv\aa
}}{3}{tre}{L{4}{fyra}{5}{fem}}}{6}{sex}{t{L{7}{sju}{8}{\r
{a}tta}}{9}{nio}{l{10}{tio}}{11}{elva}{L{12}{tolv}{13}{tretton}}}.

Note the \aa in the entry for 2; this shows that the values have not
undergone expansion when the tree was being built.

The tree structure is

0---+
1
2---+----+
3    |
4   |    |
+---+    |
5        |
6---
7        |
+---+    |
8   |    |
9    |
10---+----+
11
12   |
+---+
13

Overhead compared to a flat list is $3n-2$ tokens ($n$ letters and $2n-2$
braces), where $n$ is the number of nodes in the tree. The relation between
$n$ and the number of entries $N$ is that $$n-1 \leq N \leq 2n$$.

Looking up an entry is straightforward; the commands

\message{Looking~up~8:}
\exp_args:Nno\ttt_lookup_entry:nnSF{8}{\l_my_tree}
{\typeout}{\message{--NotFound--}}
\message{Looking~up~2:}
\exp_args:Nno\ttt_lookup_entry:nnSF{2}{\l_my_tree}
{\typeout}{\message{--NotFound--}}
\message{Looking~up~22:}
\exp_args:Nno\ttt_lookup_entry:nnSF{22}{\l_my_tree}
{\typeout}{\message{--NotFound--}}

produces

Looking up 8:
\r {a}tta
Looking up 2:
tv\r a
Looking up 22: --NotFound--

(\typeout expands the \aa, but it is called as \typeout{tv\aa}). The
nonstandard "S" argument specifier might be read as "success": it is in
spirit similar to T (true branch), but has a different syntax (extra
argument(s) is supplied), so a ...SF macro would not be a Church boolean.

I also implemented nonexact searches (not so different for a tree as the
above whose keys are dense in an interval, but one can pick a search key
outside that interval):

\message{Infimum~of~22~is}
\def\messagetypeout:nn#1#2{\message{#1:}\typeout{#2}} % Helper for showing
% result
\exp_args:Nno\ttt_get_infimum:nnSF{22}{\l_my_tree}
{\messagetypeout:nn}{\message{--NotFound--}}

produces

Infimum of 22 is 13:
tretton

All operations complete in $O(\log n)$ expansions. One might argue that the
true asymptotic complexity is more like $$O(S\log n) \geq O(n\log n)$$,
where $S$ is the total number of tokens in the tree, but the term
contributing this is for TeX grabbing tokens of a macro argument, the
constant factor of which is very small. Hence the practical (wall-clock)
complexity is more likely to also be $O(\log n)$. :-)

Lars Hellström