LISTSERV - LATEX-L Archives - LISTSERV.UNI-HEIDELBERG.DE

LATEX-L Archives

Mailing list for the LaTeX3 project

LATEX-L@LISTSERV.UNI-HEIDELBERG.DE

	LISTSERV Archives
	LATEX-L Home

	Log In
	Register

	Subscribe or Unsubscribe

	Search Archives

Options:	Use Forum View Use Monospaced Font Show Text Part by Default Condense Mail Headers
Message:	[<< First] [< Prev] [Next >] [Last >>]
Topic:	[<< First] [< Prev] [Next >] [Last >>]
Author:	[<< First] [< Prev] [Next >] [Last >>]

Sender:	Mailing list for the LaTeX3 project <[log in to unmask]>
Date:	Fri, 3 Feb 2012 12:59:22 +0100
Reply-To:	Mailing list for the LaTeX3 project <[log in to unmask]>
Message-ID:	<[log in to unmask]>
Subject:	Re: Trees, was Re: Mapping Functions Versions for All and Some
MIME-Version:	1.0
Content-Transfer-Encoding:	8bit
In-Reply-To:	<[log in to unmask]>
Content-Type:	text/plain; charset=ISO-8859-1; format=flowed
From:	Lars Hellström <[log in to unmask]>
Parts/Attachments:	text/plain (151 lines)

Bruno Le Floch skrev 2012-02-02 19.24:
> Hello Lars,
>
> The Church Boolean discussion is not forgotten, just on the
> back-burner with so many things at the same time. It seemed to bring
> some performance improvements, but there were some awkward parts to it
> in the way I initially tried to implement them. In your last email on
> the topic a few months back, you had a better approach some working
> code, which I didn't get to benchmark carefully. I definitely need to
> get back to boolean expressions, since at the moment
> (...)&&(...)||(...)&&(...) treats&&  and || with the same priority,
> which is wrong.

Whereas with the way I did infix expression parsing, it would be difficult 
not to give them different priorities. :-)

> Perhaps that will mean switching to Church booleans,
> perhaps not: there are a lot of issues to consider, including
> bootstrapping expl3, having a nice interaction with (future) fp
> expression, no breaking change, etc.
>
>> PS: Since I'm posting anyway, I suppose I should mention this too in case
>> anyone is interested: After my autumn experiments with Church booleans, I
>> went on to implement a fully expandable package for<integer>  to<balanced
>> text>  mappings using 2-3-trees. I didn't quite finish it before getting
>> sidetracked by other projects (in this case, actual research), but insertion
>> of entries is there and works. Next would have been popping off entries;
>> together the two would suffice for making a prioriqueue, which I would
>> imagine can be useful in places.
>
> I'm interested in what you have to say about 2-3 trees. I implemented
> sorting based on splay trees, which I think I will also use for
> Unicode character property lookup in l3regex eventually. None of that
> code is in the svn yet.
>
> What do you mean by converting one<integer>  to a<balanced text>?

I started out wanting to encode a "sparse array". Lookup keys are integers 
(since these are easy to compare), whereas values can be arbitrary pieces of 
balanced text.

> Could you give a couple of examples if you have time?

OK. The following (somewhat artificial) piece of code builds a tree by 
staring with an empty tree and then inserting some entires into it (in 
mock-random order, to illustrate that the order is arbitrary:

\ttt_empty_tree:n
    { \ttt_insert_entry:nnnn{0}{noll} }
    { \ttt_insert_entry:nnnn{1}{ett} }
    { \ttt_insert_entry:nnnn{4}{fyra} }
    { \ttt_insert_entry:nnnn{6}{sex} }
    { \ttt_insert_entry:nnnn{8}{\r{a}tta} }
    { \ttt_insert_entry:nnnn{9}{nio} }
    { \ttt_insert_entry:nnnn{10}{tio} }
    { \ttt_insert_entry:nnnn{12}{tolv} }
    { \ttt_insert_entry:nnnn{2}{tv\aa} }
    { \ttt_insert_entry:nnnn{3}{tre} }
    { \ttt_insert_entry:nnnn{5}{fem} }
    { \ttt_insert_entry:nnnn{7}{sju} }
    { \ttt_insert_entry:nnnn{11}{elva} }
    { \ttt_insert_entry:nnnn{13}{tretton} }
    { \cs_set:Npn\l_my_tree }

(The argument specifiers reflect the argument structure present when the 
macro is expanded. All but the first command above have one "n" argument for 
the tree being built, and the value of that is of course not explicit in the 
code.)

After that, \l_my_tree is the \long macro:
->d{t{l{0}{noll}}{1}{ett}{l{2}{tv\aa 
}}{3}{tre}{L{4}{fyra}{5}{fem}}}{6}{sex}{t{L{7}{sju}{8}{\r 
{a}tta}}{9}{nio}{l{10}{tio}}{11}{elva}{L{12}{tolv}{13}{tretton}}}.

Note the \aa in the entry for 2; this shows that the values have not 
undergone expansion when the tree was being built.

The tree structure is

  0---+
      1
  2---+----+
      3    |
  4   |    |
  +---+    |
  5        |
           6---
  7        |
  +---+    |
  8   |    |
      9    |
10---+----+
     11
12   |
  +---+
13

Overhead compared to a flat list is $3n-2$ tokens ($n$ letters and $2n-2$ 
braces), where $n$ is the number of nodes in the tree. The relation between 
$n$ and the number of entries $N$ is that \(n-1 \leq N \leq 2n\).

Looking up an entry is straightforward; the commands

\message{Looking~up~8:}
\exp_args:Nno\ttt_lookup_entry:nnSF{8}{\l_my_tree}
   {\typeout}{\message{--NotFound--}}
\message{Looking~up~2:}
\exp_args:Nno\ttt_lookup_entry:nnSF{2}{\l_my_tree}
   {\typeout}{\message{--NotFound--}}
\message{Looking~up~22:}
\exp_args:Nno\ttt_lookup_entry:nnSF{22}{\l_my_tree}
   {\typeout}{\message{--NotFound--}}

produces

Looking up 8:
\r {a}tta
Looking up 2:
tv\r a
Looking up 22: --NotFound--

(\typeout expands the \aa, but it is called as \typeout{tv\aa}). The 
nonstandard "S" argument specifier might be read as "success": it is in 
spirit similar to T (true branch), but has a different syntax (extra 
argument(s) is supplied), so a ...SF macro would not be a Church boolean.

I also implemented nonexact searches (not so different for a tree as the 
above whose keys are dense in an interval, but one can pick a search key 
outside that interval):

\message{Infimum~of~22~is}
\def\messagetypeout:nn#1#2{\message{#1:}\typeout{#2}} % Helper for showing
                                                       % result
\exp_args:Nno\ttt_get_infimum:nnSF{22}{\l_my_tree}
   {\messagetypeout:nn}{\message{--NotFound--}}

produces

Infimum of 22 is 13:
tretton

All operations complete in $O(\log n)$ expansions. One might argue that the 
true asymptotic complexity is more like \(O(S\log n) \geq O(n\log n)\), 
where $S$ is the total number of tokens in the tree, but the term 
contributing this is for TeX grabbing tokens of a macro argument, the 
constant factor of which is very small. Hence the practical (wall-clock) 
complexity is more likely to also be $O(\log n)$. :-)


Lars Hellström

ATOM RSS1 RSS2

LISTSERV.UNI-HEIDELBERG.DE
Universität Heidelberg \| Impressum \| Datenschutzerklärung