[squeak-dev] Development methodology (was: tedious programming-in-the-debugger error needs fixing)

Fri Oct 9 19:54:37 UTC 2020

Hi Chris,

Thanks for still participating!

On Fri, Oct 9, 2020 at 01:33, Chris Muller <asqueaker at gmail.com> wrote:
>>
>> What exactly do you think is so massive about Git?
>
>
> I wanted to grok git by approaching it via its public API.  With my new GraphQL expertise, I figured I could spank out a client prototype in an evening.  Then I found the schema:
>
>    https://docs.github.com/en/free-pro-team@latest/graphql/overview/public-schema
>

Oh, the GitHub GraphQL API is definitely not Git! Git has nothing to
say about GraphQL, nor does it say anything about issues, apps, gists,
reactions, pull requests, ... just to name a few.

>
> A lot of extra, ignorable features, basically is the definition of over-engineered.  Don't get me wrong, it's a great tool for developers with your level of expertise.  I'm more of a "user", though, the extra stuff is harder for me.
>

Git was built with some technical goals in mind and a sane user
interface appeared only gradually over its first years. That supports
your impression.

Alternative user interfaces have been suggested on top of the data
model. https://gitless.com/ for example. The Git Browser uses some of
its concepts, such as not having a staging area and selecting what's
in and out of a commit while creating the new version, just like
Monticello. Gitless is just a different command line UI; everything
that goes on in the repository stays the same. Likewise, TortoiseGit
and the Git Browser are just different graphical user interfaces, the
former for files, the latter for objects.

Conclusion: the "extraneous" features of the tools serve a purpose in
their specific context. Since we have a different context, we will
build different tools of course. That does not remove any of the
advantages of using Git as a repository, or versions database if you
will.

Git is a database with advanced tool support available out in the
world for various needs, such as collaboration supported by platforms
like GitHub. Using such a platform is only an option, but it can be a
useful one.

I would like to make a pun on the "ignorable" features: the Git index
or staging area, for example, is a means to achieve the partial commit
feature that Eliot mentioned Vanessa has added to Monticello (i.e.
the "ignore" feature). But the Git command line user interface lives
in a world of bytes, text, and files (instead of specialized objects)
and command lines (instead of graphical tools). You could say that
this ignore variable of the MCSaveDialog is the negated equivalent of
the index.
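
To illustrate the "negated equivalent" idea, here is a minimal sketch in
Python. It is not how Git or Monticello store anything; the change names
and both helper functions are made up for illustration. It only shows
that "stage what goes in" and "ignore what stays out" describe the same
partial commit from opposite ends.

```python
def staged_from_ignored(changes, ignored):
    """Monticello-style: start with everything, drop what is ignored."""
    return {change for change in changes if change not in ignored}


def ignored_from_staged(changes, staged):
    """Git-style: start with nothing staged; the rest is implicitly left out."""
    return {change for change in changes if change not in staged}


# Hypothetical pending changes in a working copy.
changes = {"Foo>>bar", "Foo>>baz", "Qux class definition"}
ignored = {"Foo>>baz"}

staged = staged_from_ignored(changes, ignored)
```

Feeding `staged` back into `ignored_from_staged` gives the original
ignore set again, which is all the "negation" amounts to.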

>
> Support for branching was added to Monticello in 2012.  See MCVersionNameTest>>#testBranches.  Eliot used them during development of cog or spur, but I'm not aware of this feature having been all that critical for Squeak.  We tend to like just one master branch with occasional releases.  But, it's there and basically achieves the needed functionality.
>

But it is a hidden feature because nothing in the UI indicates its
existence, right? This makes it harder to use. Also, it is piggy-backed
on a different concept (version names); that's why I consider this a
workaround for the limitations of the data model.

Also your statement does not address the problems of branching in a
multi-package project that I wrote about in the earlier long message.

>
> I thought the social benefits (exposure and growth, I guess) were via exposure to the github user base.  If we hosted ourselves, would it be any different than the "deserted island" situation we have now?  That's all what I was referring to.  I thought the purpose was for the exposure.  If you have to host yourself anyway then we're basically down to a tool comparison?
>

You do not *have to* host yourself. :-) But you could if you don't
trust the corporate (yet very open-source-supporting) GitHub.

The social benefits of exposure do exist too, but they were not the
focus of this thread. The focus was an improved development process
with regard to easier tracking of contributions. For example, if you
re-work your inbox submission and create another version, it will open
another thread on the mailing list, with no traceable connection to
the previous conversation. Not so with pull requests on GitHub (or
whatever they are called on the alternatives to GitHub): if you update
your submission on the same branch (even if you replace the commits
that you sent before), the old conversation is still attached to the
pull request.

By the way, that is something you could trace via that exhaustive
GitHub GraphQL API if you wanted to build a Squeak interface to GitHub
pull requests. ;-) Sure it seems overwhelming, but who says that you
have to consume and support it all? It just offers a lot of options,
at your service... I don't think this is a bad thing.
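
As a rough sketch of "consuming only what you need", the following
Python snippet builds a GraphQL request that asks for nothing but the
title and conversation of one pull request. The owner, repository name
and number are placeholders, and the helper name build_request is made
up; the snippet only constructs the JSON payload and does not send it.
Actually running the query would additionally need an authenticated
POST to GitHub's GraphQL endpoint.

```python
import json

# Only the handful of fields needed to follow a pull request conversation.
QUERY = """
query($owner: String!, $name: String!, $number: Int!) {
  repository(owner: $owner, name: $name) {
    pullRequest(number: $number) {
      title
      comments(first: 20) {
        nodes { author { login } body }
      }
    }
  }
}
"""


def build_request(owner, name, number):
    """Return the JSON body for a GraphQL POST (hypothetical helper)."""
    variables = {"owner": owner, "name": name, "number": number}
    return json.dumps({"query": QUERY, "variables": variables})


# Placeholder values, not a real repository.
payload = build_request("example-owner", "example-repo", 1)
```

Everything else in the schema can simply be left out of the query.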

>>
>> Chris Muller-3 wrote
>>
>> > For example, you could submit an
>> > improvement that allows original contributors of Inbox items to move them
>> > to Treated themself.
>>
>> How? Only Trunk committers have access to the Squeaksource treating backend,
>> so neither the code nor the tool is available to normal users for
>> improvement. Guest users cannot even delete versions from the inbox
>> repository, can they?
>
>
> It's public read, anyone can access the code and submit improvements to the Inbox.  But, I agree, it'd be nice if there were a way for strangers to contribute more obviously.  One idea would be for each SqueakSource repository to have its own internal "Inbox" Repository to support some of these features...
>

That would be a nice first step towards something like pull requests
for projects other than Trunk. But you didn't address that submitters
cannot move their own contributions to Treated, even if they wanted
to.

>
> The point is Monticello is relatively small, simple, and malleable, and this makes it feasible to improve, even if getting it implemented requires writing email.
>

Tools on top of the Git data model would also be malleable. Writing
email is not the primary issue here (except for the generational
favoring of platforms over mailing lists maybe). Integrating the
conversation with the code contributions is. As we said, if we were to
use GitHub, we should make sure to tie in the conversations there into
the mailing lists, just like it has been done for OpenSmalltalk-VM.

> Monticello does suffer from some scalability issues that will eventually need to be addressed.

When is the eventual time?

Git is one possible solution to the scalability problem. One solution
where you don't have to reinvent the hosting software. One solution
that already has a pure-Smalltalk implementation.

>
> But the redundancy among versions is a feature, not a flaw.  To this day, planes have poor internet access, this redundancy is about availability.  In the Magma-based, there is only one instance of each MCDefinition shared amongst all Versions, but not everyone set that up on their laptop (I made it as easy as I could).
>

Git uses similar value object sharing in its data model. :-) And if
you clone a Git repository, you have it right on your laptop. If you
download a single Monticello version and go offline, you just have one
snapshot and an ancestry where you cannot look at the past snapshots.
With a Git repository, you typically have everything at your hands.
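
A simplified sketch of that kind of sharing, in Python: objects are
stored under the hash of their content, so two versions that contain
the same definition refer to the same stored object. The ObjectStore
class and the sample definitions are made up for illustration, and real
Git hashes a small typed header together with the content rather than
the bare bytes.

```python
import hashlib


class ObjectStore:
    """Toy content-addressed store (illustrative, not Git's on-disk format)."""

    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        # Identical content maps to the same key, so it is stored only once.
        self.objects.setdefault(key, data)
        return key


store = ObjectStore()

# Two hypothetical versions of a package; only one method differs.
version1 = [store.put(b"Foo>>bar ^ 1"), store.put(b"Foo>>baz ^ 2")]
version2 = [store.put(b"Foo>>bar ^ 1"), store.put(b"Foo>>baz ^ 3")]
```

The two versions reference four definitions in total, but the store
holds only three objects, because the unchanged one is shared.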

Of course you shouldn't clone the repository just to install some
package (as opposed to developing it). The download-to-install use case
is typically satisfied by explicit releases where you put an archive
or installer somewhere. Monticello also fills this role currently,
next to SqueakMap. In the Git world, there are typically web
interfaces that allow you to download just one particular snapshot.
Metacello uses the particular HTTP interface of GitHub to download a
zip, for example.

>
> What I had wanted to do was start by sucking their GraphQL schema into my fantastic new GraphQL Engine and mapping their types to the Monticello types.  Basically appear to BE a Git server, but mapped behind the scenes to legacy MC repos that could be accessed via the legacy way, for users that wanted to.
>

Hmm, this breaks down because that schema is not just about Git, it is
also about all of GitHub as indicated above. So your server that
implements the whole schema would really be another GitHub server, not
another Git server.

The basic objects of Git are just blobs, trees, commits, tags, and
refs. But this is not the abstraction level of an MCDefinition. There
is no point in having an MCTreeDefinition. Instead you would map your
MCSomethingDefinition into a tree of blobs and store that in the Git
repository. Blobs, trees, and commits would be handled by the
MCRepository subclass and something like the MCWriters and MCReaders
instead. Does this make sense to you?
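
To make the mapping more concrete, here is a small Python sketch of how
blobs and a flat tree get their ids in Git's SHA-1 object format: the
id is the hash of a typed header plus the object body. The "one file
per method" layout and the file names are assumptions made up for this
example, not how any existing Squeak tool lays out its files.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash '<kind> <size>' + NUL + body, as Git does for loose objects."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content: bytes) -> str:
    return git_object_id("blob", content)


def tree_id(entries: dict) -> str:
    """entries maps file name -> blob id (hex); regular files only."""
    body = b""
    for name in sorted(entries, key=lambda n: n.encode()):
        # Each entry: mode, space, name, NUL, then the raw 20-byte id.
        body += b"100644 " + name.encode() + b"\0" + bytes.fromhex(entries[name])
    return git_object_id("tree", body)


# Hypothetical layout: one blob per method of a single class.
methods = {"bar.st": b"bar\n\t^ 1\n", "baz.st": b"baz\n\t^ 2\n"}
definition_tree = tree_id({name: blob_id(src) for name, src in methods.items()})
```

A commit would then point at such a tree (or a tree of trees), and the
reader/writer layer would translate between these objects and
definitions.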

To be a Git server, you need to be able to manipulate repositories and
to provide the fetch and push interface. For repositories, the
Smalltalk code is already there. For the interface, only the client
side of the protocols has been implemented in Smalltalk so far.

Kind regards,
Jakob