Hi Chris,
Thanks for still participating!
On Fri, Oct 9, 2020 at 01:33, Chris Muller asqueaker@gmail.com wrote:
What exactly do you think is so massive about Git?
I wanted to grok git by approaching it via its public API. With my new GraphQL expertise, I figured I could spank out a client prototype in an evening. Then I found the schema:
https://docs.github.com/en/free-pro-team@latest/graphql/overview/public-sche...
Oh, the GitHub GraphQL API is definitely not Git! Git has nothing to say about GraphQL, nor does it say anything about issues, apps, gists, reactions, pull requests, ... just to name a few.
A lot of extra, ignorable features is basically the definition of over-engineered. Don't get me wrong, it's a great tool for developers with your level of expertise. I'm more of a "user", though, so the extra stuff is harder for me.
Git was built with some technical goals in mind and a sane user interface appeared only gradually over its first years. That supports your impression.
Alternative user interfaces have been suggested on top of the data model, https://gitless.com/ for example. The Git Browser uses some of its concepts, such as not having a staging area and selecting what's in and out of a commit while creating the new version, just like Monticello. Gitless is just a different command line UI; everything that is going on in the repository is just the same. Likewise, TortoiseGit and the Git Browser are just different graphical user interfaces, the former for files, the latter for objects.
Conclusion: the "extraneous" features of the tools serve a purpose in their specific context. Since we have a different context, we will build different tools of course. That does not remove any of the advantages of using Git as a repository, or versions database if you will.
A database with advanced tool support out in the world for various needs, such as collaboration supported by platforms like GitHub. Using such a platform is optional, but it can be a useful option.
I would like to make a pun of the "ignorable" features: the Git index or staging area, for example, is a means to achieve the partial commit feature that Eliot mentioned Vanessa has added to Monticello (i.e. the "ignore" feature). But the Git command line user interface lives in a world of bytes, text, and files (instead of specialized objects) and command lines (instead of graphical tools). You could say that the ignore variable of the MCSaveDialog is the negated equivalent of the index.
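To illustrate the "negated equivalent" remark, here is a minimal sketch in Python using plain sets. The definition names are made up, and neither function is actual Monticello or Git code; they only model which changes end up in the new version under each selection style.

```python
# Hypothetical changed definitions in a working copy (names are made up).
changed = {"Foo>>bar", "Foo>>baz", "Qux class>>new"}

# Monticello style: the user marks what to leave OUT of the new version.
ignored = {"Foo>>baz"}

# Git style: the user marks what goes IN (the index / staging area).
# Here the index is simply the complement of the ignore set.
staged = changed - ignored


def monticello_commit(changed, ignored):
    """Everything that changed, minus what was explicitly ignored."""
    return changed - ignored


def git_commit(changed, staged):
    """Only the changes that were explicitly staged."""
    return changed & staged


# Both selection styles describe the same partial commit.
assert monticello_commit(changed, ignored) == git_commit(changed, staged)
```

Both approaches pick the same subset of changes; they differ only in whether the user names the included or the excluded part.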
Support for branching was added to Monticello in 2012. See MCVersionNameTest>>#testBranches. Eliot used them during development of Cog or Spur, but I'm not aware of this feature having been all that critical for Squeak. We tend to like just one master branch with occasional releases. But it's there and basically achieves the needed functionality.
But it is a hidden feature because nothing in the UI indicates its existence, right? This makes it harder to use. Also, it is piggy-backed on a different concept (version names); that's why I consider this a workaround for the limitations of the data model.
Also your statement does not address the problems of branching in a multi-package project that I wrote about in the earlier long message.
I thought the social benefits (exposure and growth, I guess) were via exposure to the github user base. If we hosted ourselves, would it be any different than the "deserted island" situation we have now? That's all what I was referring to. I thought the purpose was for the exposure. If you have to host yourself anyway then we're basically down to a tool comparison?
You do not *have to* host yourself. :-) But you could if you don't trust the corporate (yet very open-source-supporting) GitHub.
The social benefits of exposure do exist as well, but they were not the focus of this thread. The focus was an improved development process with regard to easier tracking of contributions. For example, if you rework your inbox submission and create another version, it will open another thread on the mailing list, with no traceable connection to the previous conversation. Not so with pull requests on GitHub (or whatever they are called on the alternatives to GitHub): if you update your submission on the same branch (even if you replace the commits that you sent before), the old conversation is still attached to the pull request.
By the way, that is something you could trace via that exhaustive GitHub GraphQL API if you wanted to build a Squeak interface to GitHub pull requests. ;-) Sure it seems overwhelming, but who says that you have to consume and support it all? It just offers a lot of options, at your service... I don't think this is a bad thing.
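As a rough illustration of consuming only a small slice of that schema, the following Python sketch builds the request body for fetching the conversation of one pull request. The owner, repository name, and pull request number are placeholders, and the request is only constructed, not sent; sending it would require a POST to https://api.github.com/graphql with an access token.

```python
import json

# A small query against the GitHub GraphQL schema: title and comments of
# a single pull request. Only a handful of types are touched.
QUERY = """
query($owner: String!, $name: String!, $number: Int!) {
  repository(owner: $owner, name: $name) {
    pullRequest(number: $number) {
      title
      comments(first: 20) {
        nodes { author { login } body createdAt }
      }
    }
  }
}
"""


def build_request(owner, name, number):
    """Return the JSON body for a GraphQL POST request (not sent here)."""
    variables = {"owner": owner, "name": name, "number": number}
    return json.dumps({"query": QUERY, "variables": variables})


# Placeholder values, not a real repository.
body = build_request("example-owner", "example-repo", 1)
```

A Squeak client would do the same thing with its own JSON and HTTP classes; the point is only that a client can ignore the rest of the schema.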
Chris Muller-3 wrote
For example, you could submit an improvement that allows original contributors of Inbox items to move them to Treated themselves.
How? Only Trunk committers have access to the Squeaksource treating backend, so neither the code nor the tool is available to normal users for improvement. Guest users cannot even delete versions from the inbox repository, can they?
It's public read, anyone can access the code and submit improvements to the Inbox. But, I agree, it'd be nice if there were a way for strangers to contribute more obviously. One idea would be for each SqueakSource repository to have its own internal "Inbox" Repository to support some of these features...
That would be a nice first step towards something like pull requests for projects other than Trunk. But you didn't address that submitters cannot move their own contributions to Treated, even if they wanted to.
The point is Monticello is relatively small, simple, and malleable, and this makes it feasible to improve, even if getting it implemented requires writing email.
Tools on top of the Git data model would also be malleable. Writing email is not the primary issue here (except for the generational favoring of platforms over mailing lists maybe). Integrating the conversation with the code contributions is. As we said, if we were to use GitHub, we should make sure to tie the conversations there into the mailing lists, just like it has been done for OpenSmalltalk-VM.
Monticello does suffer from some scalability issues that will eventually need to be addressed.
When is the eventual time?
Git is one possible solution to the scalability problem. One solution where you don't have to reinvent the hosting software. One solution that already has a pure-Smalltalk implementation.
But the redundancy among versions is a feature, not a flaw. To this day, planes have poor internet access; this redundancy is about availability. In the Magma-based repository, there is only one instance of each MCDefinition shared amongst all Versions, but not everyone has set that up on their laptop (I made it as easy as I could).
Git uses similar value object sharing in its data model. :-) And if you clone a Git repository, you have it right on your laptop. If you download a single Monticello version and go offline, you just have one snapshot and an ancestry where you cannot look at the past snapshots. With a Git repository, you typically have everything at your hands.
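The value object sharing can be sketched in a few lines of Python. Git names a blob by the SHA-1 of a short header plus the content, so identical content always gets the same id and is stored once. The ObjectStore class and the method sources below are made up for illustration; a real repository also compresses objects and packs them.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the id Git assigns to a blob with this content."""
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()


class ObjectStore:
    """Toy content-addressed store (hypothetical, in memory only)."""

    def __init__(self):
        self.objects = {}

    def put(self, content: bytes) -> str:
        oid = git_blob_id(content)
        self.objects[oid] = content  # storing the same content twice is a no-op
        return oid


store = ObjectStore()
# Two versions of a package; only Foo>>baz differs between them.
v1 = {"Foo>>bar": store.put(b"bar ^ 1"), "Foo>>baz": store.put(b"baz ^ 2")}
v2 = {"Foo>>bar": store.put(b"bar ^ 1"), "Foo>>baz": store.put(b"baz ^ 3")}

# The unchanged method is shared: four puts, but only three stored objects.
assert v1["Foo>>bar"] == v2["Foo>>bar"]
assert len(store.objects) == 3
```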
Of course you shouldn't clone the repository just to install some package (as opposed to develop it). The download-to-install use case is typically satisfied by explicit releases where you put an archive or installer somewhere. Monticello also fills this role currently, next to SqueakMap. In the Git world, there are typically web interfaces that allow you to download just one particular snapshot. Metacello uses the particular HTTP interface of GitHub to download a zip, for example.
What I had wanted to do was start by sucking in their GraphQL schema into my fantastic new GraphQL Engine, and map their types to the Monticello types. Basically appear to BE a Git server, but mapped behind the scenes to legacy MC repos that could be accessed via the legacy way, for users that wanted to.
Hmm this breaks down because that schema is not just about Git, it is also about all of GitHub as indicated above. So your server that implements the whole schema would really be another GitHub server, not another Git server.
The basic objects of Git are just blobs, trees, commits, tags, and refs. But this is not the abstraction level of an MCDefinition. There is no point to have an MCTreeDefinition. Instead you would map your MCSomethingDefinition into a tree of blobs and store that in the Git repository. Blobs, trees, and commits would be handled by the MCRepository subclass and something like the MCWriters and MCReaders instead. Does this make sense to you?
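To make the mapping a bit more concrete, here is a sketch in Python of how method sources could become blobs, a class could become a tree of those blobs, and a package could become a tree of class trees. The file layout (bar.st, Foo.class) is a hypothetical example and not necessarily what any existing Squeak tool writes; the object encoding itself follows Git's format (header, then for trees a sorted list of mode, name, and raw id).

```python
import hashlib


def object_id(kind: str, body: bytes) -> bytes:
    """Raw 20-byte id of a Git object of the given kind ('blob' or 'tree')."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).digest()


def tree_body(entries: dict) -> bytes:
    """Serialize a tree. entries maps name -> (mode, raw object id)."""
    def sort_key(item):
        name, (mode, _) = item
        # Git sorts subtrees as if their name ended with a slash.
        return (name + "/" if mode == "40000" else name).encode()

    return b"".join(
        f"{mode} {name}\0".encode() + oid
        for name, (mode, oid) in sorted(entries.items(), key=sort_key)
    )


# Hypothetical layout: one file per method, one directory per class.
methods = {"bar.st": b"bar ^ 1", "baz.st": b"baz ^ 2"}
class_tree = tree_body(
    {name: ("100644", object_id("blob", src)) for name, src in methods.items()}
)
package_tree = tree_body({"Foo.class": ("40000", object_id("tree", class_tree))})
package_id = object_id("tree", package_tree).hex()
```

The reader and writer classes would do the translation between definitions and such trees, while the repository class would only deal with the resulting objects and ids.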
To be a Git server, you need to be able to manipulate repositories and to provide the fetch and push interface. For repositories, the Smalltalk code is already there. For the interface, only the client side of the protocols is implemented in Smalltalk so far.
Kind regards, Jakob