[squeak-dev] Why doesn't ECompletion / OCompletion use a Trie?
Juan Vuletich
juan at jvuletich.org
Wed Jun 15 11:40:05 UTC 2011
Hi Levente,
Thanks for your answer. I thought I had missed something, as I think a
trie would do a much better job. Your answer tells me I had already got
it, and I still believe a trie is better.
(comments inline)
Levente Uzonyi wrote:
> On Tue, 14 Jun 2011, Juan Vuletich wrote:
>
>> Hi Folks,
>>
>> I've been playing a bit with ECompletion / OCompletion on Cuis
>> (thanks, Levente!). What surprises me is the use of a single level
>> Dictionary with 2 character prefixes as the data structure to hold
>> the string dictionary. Does anybody know why a Trie (
>> http://en.wikipedia.org/wiki/Trie ) wasn't used?
>
> OCompletion suggests the recently used methods, therefore the length
> of the list for each prefix is limited to 40 by default. So looking up
> selectors beginning with a two letter prefix is fast enough (I
> increased it to 400 in my image and it's still fast enough). 40 is not
> always sufficient, sometimes the method you're looking for is not in
> the list of the three recently used methods, but it works for most cases.
Having the list limited to 40 might be ok when you have a 2 or 3 letter
prefix, but the majority of the methods will never be suggested, even if
you keep typing a longer prefix. At some moment, the list will be empty,
and maybe you still need to finish typing a rather long (seldom used,
hard to remember) selector. When trying OCompletion, I found this very
annoying. For instance, in Pharo 1.2, the selectors table holds less
than 6,000 selectors, while the system has over 32,000 distinct
selectors. This means that over 80% of the selectors will never be
suggested.
Showing the most popular selectors first is quite nice. But having a
hard, hidden boundary on what will be suggested and what not is not that
nice.
> With a trie you'd have to sort the words matching the prefix by access
> time for every lookup which may be less efficient, especially for
> small prefixes. A trie would also have to hold all possible selectors,
> while the lists limited to 40 avoid this.
The trie will get faster as the prefixes are longer, beating easily the
table, that will have to filter a much larger set of words. Besides, the
sorting of the matching results of the trie will not be too bad, if I do
filtering and sorting at the same time, keeping (for example) just the
40 latests words. (Note to Juan: do it!).
And as I said before, holding all possible selectors is much better than
holding (maybe) 20% of them.
Besides, the trie is better for natural language dictionaries, where the
sorting by date feature is most likely not good and the set of words can
get much larger. My current implementation can also match words
regardless of case and accents, and suggest a preferred spelling (for
example, with correct accents).
Not to mention that code is much simpler, smaller and general...
> Btw, I'm thinking about integrating code completion into Squeak
> itself. The idea is to add hooks to the system for code completion and
> provide an implementation for common things, like menus, key bindings,
> etc. There would also be a simple symbol table based completion tool
> integrated by default, like Cmd+Q, but with menus, like in Pharo. Then
> stripped down versions of E/OCompletion could be loaded as a
> replacement. I'm also thinking about implementing a mix of ECompletion
> and OCompletion which would show the recently used selectors first
> like OCompletion, but filters/sorts the suggestions based on type
> information like ECompletion.
Sounds great. Maybe my implementation can be one of the alternatives.
Cheers,
Juan Vuletich
> Levente
More information about the Squeak-dev
mailing list
|