[squeak-dev] The XML libraries run fine on Squeak 6 alpha

Marcel Taeumel marcel.taeumel at hpi.de
Tue Oct 6 07:22:14 UTC 2020


Hi Levente,

thanks for this review. The points (4), (3), and (2) are very serious. There needs to be a lot of work done before this can be treated as a valuable addition to Trunk's XML implementation. Especially backwards compatibility must be considered.

I am -1 then. Sorry for not taking a look at the code before. I would still like to see improvements happen in the Trunk's XML implementation.

Best,
Marcel
Am 05.10.2020 21:53:34 schrieb Levente Uzonyi <leves at caesar.elte.hu>:
-1

After a bit of analysis, I came to the conclusion that it would be a bad
idea to include those packages in the Trunk. Why?

1. There are more important things than XML
2. These packages are huge
3. The code is not backwards compatible
4. It would introduce parallel collection hierarchies which would cause
confusion
5. Some undeclared variables


Details:

1. There are more important things than XML

I think there's not much to say here. Among the serialization formats,
JSON, YAML are far more popular than XML nowadays. Yet, we have neither
supported in the Trunk (apart from a minimal hence incomplete JSON
implementation in WebClient. It's really amazing how small it is though.)


2. These packages are huge

The current version of the XML-Parser in Trunk has the following stats:

classes: 26
methods: 379
linesOfCode: 2069

It's obviously incomplete though.
The proposed packages have the following stats:

classes: 758 (21.5% of all classes in the Trunk including the XML packages)
methods: 10129 (14%)
linesOfCode: 73897 (13.7%)

That's 25x-35x more than the current implementation.
When the packages are loaded in the current Trunk image, they take up
21.5%, 14%, 13.7% of all classes, methods, linesOfCode, respectively.
One could say that the extensive tests are responsible for the bloat, but
no, it's probably related to the complexity of XML. Without tests, the
numbers are:

classes: 512 (15.6%)
methods: 6744 (9.8%)
linesOfCode: 32886 (6.6%)

I don't think XML is important enough in general to justify those numbers.


3. The code is not backwards compatible

The code is not a drop-in replacement. The interface is different. Code
using the current package would have to be migrated.


4. It would introduce parallel collection hierarchies which would cause
confusion

The code uses various collections similar to those in the Trunk but not
compatible with them and not in the same class hierarchy:

BitmapCharcterSet - An independent character set implementation. Not a
subclass of CharacterSet.

StandardOrderedDictioanry (and its 5 subclasses) - An alternative to
OrderedDictionary implementation. A subclass of Collection, so not part of
the HashedCollection hierarchy (for good reasons).

If these were part of the Trunk, one could be wondering which one to use:
OrderedDictionary or OrderPreservingDictionary? What's the difference?
Maybe StandardOrderedDictionary?

5. Some undeclared variables

The code is written with compatibility in mind, so it supports Pharo and
Squeak. Even if you use Squeak, code related to Zinc, Pharo's HTTP library
will still be loaded directly referencing Zinc's classes.


Without going any deeper, I think the best would be to not include these
packages in the Trunk.
I'd go even further: I would remove the current XML-Parser as well but
provide an easy way to load either version. That way the image would be
smaller, simpler.
If we had the CI infrastructure (we probably have, but I'm not sure), we
could even test whether these packages work as expected, and treat them
as official external packages.

There's currently one method that depends on the XML-Parser package in
Trunk: SugarLauncher >> #gconfPropertiesAt:.
It uses the current XML-Parser API, so even if we decide to include the
more complete XML implementation in Trunk, we will still have to change
it to make it work.


Levente

On Mon, 5 Oct 2020, Eliot Miranda wrote:

> Hi All,
>
> votes for & against inclusion of Monty’s XML in trunk/6 beta?
>
> +1
>
> _,,,^..^,,,_ (phone)
>
>> On Oct 4, 2020, at 12:44 PM, monty wrote:
>>
>> Installing the (head)s of:
>> XMLParser (which loads "XMLWriter" too)
>> XMLParser-XPath (be careful not to install the old, unfinished "XPath" project)
>> XMLParser-StAX
>> XMLParser-HTML
>> from the SqueakMap works fine; all the tests still pass. XMLParser installation still pops up a warning you have to click through b/c of a conflict with the old XMLParser in the image, but it's fine other than that.
>>
>> Supposedly there are conflicts with XStreams, but I have nothing to do with that.
>>
>> An aside: some tests are skipped by default because they rely on external resources, like the file system or HTTP, which makes it easier to distinguish real regressions from something like a network outage. The skipping is more obvious on Pharo because their TestCase has a #skip message that the TestRunner UI supports. On Squeak these tests just silently pass. To run them, send XMLSkippableTest #stopSkippingAll. Unfortunately b/c of the W3C rate limiting, the tests using xhtml DTDs timeout if you run them more than once. You could say they should be cached (the default resolver does just that), but the purpose of these tests is to test real HTTP retrieval, so that defeats the purpose.
>> ___
>> montyos.wordpress.com
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20201006/86ea29c8/attachment-0001.html>


More information about the Squeak-dev mailing list