package universes and filters question

lex at cc.gatech.edu
Tue Aug 10 20:20:26 UTC 2004


goran.krampe at bluefish.se wrote:
> For example - the idea of being able to set up private "submaps" (or
> whatever) or even public ones is a feature that I *want* to offer - I
> just *also* want to avoid all the negatives of such a model, see below.
> 

Yes, but then you keep coming back to:

	All in all it is very, very simple.
	We need ONE model. We need ONE map.


I don't see how to have both.


> Perhaps a good way to go forward here is to start talking about use
> cases? What are the use cases we want to support?

That might help, but I think we all have the idea. Let me quickly try.

The main functionality I have aimed for is that people can open a tool,
see a list of things that are available to install, select things to
install, and have it happen.  The user interaction should be minimized
beyond selection of what they want to be in their image.

Developers should be able to register packages to be available for other
users to install.

Users should be able to select from different groups of packages that
have been assembled under different assumptions.  Most notably, users
should be able to build an image based only on stable packages that have
only had bugfixes for a while and which have all been tested with each
other loaded.  Thus, the system needs to support having different
policies for different groups of packages.

Finally, ideally, it should be possible to use the technology even if
you don't want to use any particular public server for some reason. 
These reasons include private material, strictly enforced policies, and
random experiments that the central server's admin happens to disapprove
of.


> > In the meantime, I certainly suggest that people keep posting stuff on
> > SqueakMap no matter how much they use Universes.  They are complementary
> > tools.
> 
> Well, I would say they are competing tools. I may be wrong.

I see no reason they compete.  Freshmeat and apt are not in competition,
for example.  Neither are Google and Yahoo.




> > Please consider carefully this one issue: does *everything*, down to
> > every last minor variation of every package anyone posts for any purpose
> > whatsoever, really need to be in the same index of packages?  As a mind
> > experiment for what this map will be like, imagine from the Linux world
> > that someone made a big index holding every package in every version of
> > Slackware, Debian, Redhat, Gentoo, and OpenBSD.  My mental picture of
> > this has a lot of minor variations of most packages.  There will be a
> > dozen or more gcc-2.95's.
> 
> The comparison isn't "fair" IMHO. I could argue the other way and say
> "Does all the 12000 packages (or how many they are now) in Debian really
> need to be in one repository?".

You asked the question wrong.  It should be: "do all 12,000,000 Linux
packages need to be in one repository?"  Debian's answer is no.  Pick
12,000 packages that work together.


> Btw - "same index of packages" above doesn't really say anything about
> the architecture of it. 

Agreed.  I am talking mostly about the model, and it can be implemented
in different ways.



> This kind of reasoning doesn't help me. The first scenario you wrote was
> "designed" to sound scary :) didn't work as a comparison. The second
> "scenario" doesn't sound hard at all to maintain within a single logical
> map (still disregarding any architecture questions).

It seems hard to me, but I look forward to seeing what you come up with.





> To me it is obvious that there should just be a SINGLE SMPackage called
> SharedStreams - because there *is* only one such logical package. Then
> it has multiple releases - and those have attributes. Simple, easy.

I disagree.  Why make this assumption?  This is a design issue just like
any other, and actually it seems very reasonable that "IRC" gives you
different packages in different universes.  A 2.8 universe would give
you the original one, and a 3.8 one would likely give you the new
"enhanced" client.


> And the universe you describe above just turns out to be a *view* (=sub
> selection of the model based on various criteria). 

It is not just a sub-selection, because the same package in different
universes will have different attributes.  The name, description,
version, and dependencies can all change depending on what universe you
are in.

Or alternatively, you say that no package is in multiple universes, but
then you have to post multiple copies of a package that are tweaked in
various ways.
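
To make the difference from a plain filter concrete, here is a rough
sketch.  It is Python rather than Squeak, it is not Universes code, and
every name, version, and dependency in it is made up for illustration.
The point is only that each universe carries its own record for a given
package name, so a lookup cannot be expressed as a sub-selection of one
shared record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """One catalog entry as a single universe publishes it (hypothetical)."""
    name: str
    version: str
    description: str
    depends: tuple = ()


# Hypothetical catalogs: each universe holds its own record for "IRC".
universe_2_8 = {
    "IRC": Package("IRC", "1.0", "original IRC client"),
}
universe_3_8 = {
    "IRC": Package("IRC", "2.3", "enhanced IRC client",
                   depends=("SharedStreams",)),
}


def lookup(universe, name):
    """Return the record that this particular universe publishes for name."""
    return universe[name]


old = lookup(universe_2_8, "IRC")
new = lookup(universe_3_8, "IRC")
```

The same name yields two different records, with different descriptions,
versions, and dependencies, depending on which universe you ask.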



> It would be simple to filter out all packages marked as beta or better,
> for my Squeak version with at least one published release available etc.
> Tada.

That does not get you all the way there.  Aside from the above issue,
you also need to convince people to actually publish packages that will
appear under the criteria.



> > I ask about this one issue because if you let go of the idea of having
> > absolutely everything in one index, then the rest of the universes
> > approach seems to mesh very nicely with where Squeak Map seems to be
> > going.  It becomes straightforward to let maps be merged into larger
> > maps, and to let non-central maps be locally administered with their own
> > accounts and policies.  We could then have simple dependencies, stable
> > releases, mixin servers, private servers, and local update policies, all
> > without requiring any further cleverness.
> 
> Lex - you make it sound sooo easy! But it isn't. I do want to
> investigate how my "tree model" perhaps could be turned into a model
> where maps can be mixed more freely BUT... it comes with lots of
> effects:

Well, please give it a shot.  In the Universes system, I have
accepted two things: the local copy is always somewhat stale, and the
individual elements of a compound map may get updated at different
rates.  Once I accept these, I push any inconsistency problems up to
the user tools, e.g. "Package Foo depends on Bar which is not
available."  Maybe these assumptions would be useful for distributed
SqueakMaps.
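
A minimal sketch of that approach, in Python rather than Squeak and with
invented function and package names (this is not how Universes is
actually coded): the catalogs are merged without any consistency check,
and a separate step reports whatever dependencies turn out to be
missing.

```python
def merge_catalogs(*catalogs):
    """Combine catalogs that map a package name to its dependency names.

    Later catalogs override earlier ones.  Nothing is validated here,
    because each part may be stale or updated on its own schedule.
    """
    merged = {}
    for catalog in catalogs:
        merged.update(catalog)
    return merged


def missing_dependencies(catalog):
    """Return one message per dependency that the catalog cannot satisfy."""
    problems = []
    for name in sorted(catalog):
        for dep in catalog[name]:
            if dep not in catalog:
                problems.append(
                    f"Package {name} depends on {dep} which is not available.")
    return problems


central = {"Foo": ["Bar"], "Baz": []}   # possibly stale local copy
mixin = {"Qux": ["Baz"]}                # maintained separately
compound = merge_catalogs(central, mixin)
report = missing_dependencies(compound)
```

The merge itself never fails; the user tool decides what to do with the
report.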


> If you look at the current SM model there are various crosslinks in the
> model, just a few examples:
> 
[...examples...]
> Now... if the above crosslinks weren't there at all - if the SM model
> was just a bunch of independent object "trees" - then merging them would
> be trivial, because the information would be "independent".
> 
> Now that is not true, and I have said this over and over. Julian told me
> that, ok, but the "foreign" links - can't they simply be encoded as the
> UUID so that they can be ripped out, put on another server and then
> merged back together and then they get resolved? Hehe... sure. And what
> happens when the thing it refers to isn't there anymore? etc. etc.

Maybe you should reconsider names instead of UUIDs?  Or maybe every
UUID should be transmitted along with a suggested name?

Anyway, I wonder if broken links are really such a problem, even with
UUIDs.  "Package foo depends on some package that is not available". 
Maybe you can just assume that links sometimes break and deal with it,
just like we deal with undeclared variables in a running Squeak image.
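
Here is one way the "UUID plus suggested name" idea could look.  It is a
Python sketch under my own assumptions; the Link class and the describe
function are hypothetical and do not correspond to anything in SqueakMap
or Universes.  A link that cannot be resolved is not an error, it just
falls back to the name that travelled with it.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """A cross-reference carrying both an identifier and a readable hint."""
    target_id: uuid.UUID
    suggested_name: str


def describe(link, packages):
    """Resolve a link against a dict of UUID -> package name.

    A missing target is tolerated: the suggested name is shown instead.
    """
    name = packages.get(link.target_id)
    if name is not None:
        return name
    return f"{link.suggested_name} (not available)"


bar_id = uuid.uuid4()
gone_id = uuid.uuid4()
packages = {bar_id: "Bar"}

resolved = describe(Link(bar_id, "Bar"), packages)
dangling = describe(Link(gone_id, "Quux"), packages)
```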



> In short - I am saying for the 11th time: Splitting an object model over
> multiple servers and then somehow magically being able to modify them in
> a distributed fashion and just merging them back together... it is a
> hard problem.

It is an issue.  But if you design it into your model it does not have
to be bad.




> Now, instead of going over and over like a broken record, this is what I
> plan to do:
[... no consideration of map-per-universe...]

Okay, then.  Good luck.



> But I am very afraid of the effects when people start setting up their
> own little maps all over the place with lots of duplication, redundancy,
> synch problems, servers being down/up, servers simply being unknown
> etc... well. Obviously this doesn't scare you at all, which I don't
> understand why. These are the things I can already hear:

Well, no, they do not bother me.  Theory aside, it seems to work well in
Debian.


But here is some theory to perhaps ease your worries:

	1. Maps matter precisely to the people who know about them.  Unknown
maps don't matter to people who don't know about them. There would still
be a central development map that all the mutually-compatible public
stuff is in, and that would be the only one that many users ever use.

	2. Most users go to one map -- either a development map or one of the
stable ones -- and download all the packages they ever download via that
one map.  They know precisely where to look for stuff, and many users
will never know about servers at all.

	3. The servers only hold the catalogs, not necessarily the packages
themselves.  It is usually fine to use a slightly old catalog.  If a
server is down, the client tools still work.
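
Point 3 can be sketched as follows.  This is Python, not the actual
Universes client, and load_catalog, the cache layout, and the package
data are all assumptions made for the example.  The client asks the
server for a fresh catalog and quietly falls back to its last cached
copy when the server cannot be reached.

```python
def load_catalog(fetch, cache):
    """Return (catalog, is_stale).

    fetch is any callable that returns a catalog or raises OSError when
    the server is unreachable.  cache is a dict holding the last good copy.
    """
    try:
        catalog = fetch()
    except OSError:
        # Server is down: a slightly old catalog is usually good enough.
        return cache.get("catalog", {}), True
    cache["catalog"] = catalog
    return catalog, False


def server_up():
    return {"Foo": "1.1"}


def server_down():
    # ConnectionError is a subclass of OSError.
    raise ConnectionError("server unreachable")


cache = {"catalog": {"Foo": "1.0"}}
fresh, fresh_is_stale = load_catalog(server_up, cache)
cached, cached_is_stale = load_catalog(server_down, cache)
```

Since the catalog only describes packages, the packages themselves may
live somewhere else entirely.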
	

Given this theory, your questions are easy to answer:


> "Hey! Where did you find that? Oh, I didn't know about *that* server....
> Hmmm, it isn't up now, do you have a copy you can email me?"

The first answer would usually be "in the package browser", so the rest
of your scenario would be so rare as to be irrelevant.



> "Hi! I just posted my little Application on my own map *here*. Bye!"

"What a dork!"

In more detail: why would anyone do this?  Hardly anyone will
see this guy's application, and that would be his own fault.


> "Hmmm, does anyone have a list of all known maps at this point in time?
> Sure, here is my list, but I heard that Ned has a bigger list with more
> stuff on it."

"Go look on the Swiki's All Known Maps page."


> "How many packages are there for Squeak? Well, we don't really know. "

There are:

	UUniverse universeForSystem packages size  -->  431
	
packages (counting each version separately) in the version of Squeak I am
using.  For the stable release, there are:

	universe := UUniverse stableUniverse.
	client := UUniverseClient forUniverse: universe.
	client sendMessage: UMRequestPackages new.
	client processIO.  "repeat a few times"
	universe packages size.   -->  428

If you don't want to count multiple versions individually, then use
#packageNames instead of #packages.

If you literally mean all packages in the world, then learn to live
without knowing.  This is fundamentally unknowable so long as people
keep things private from each other.


> "Ok,
> but can someone tell me where to look to find ZZZ? Sure, you can look
> here, and here, and here, and perhaps over there..."

"Open a package browser.  If it's not there, then don't bother."

If that sounds harsh, what do you suggest telling people who run into
the same situation on SqueakMap?


> Am I the
> ONLY one who remembers how it was before SM? Am I the only Debian user
> who has been hunting for .deb packages on web sites, looking around for
> entries to put in my sources.list, wondering where to find package XXX
> that actually works on Debian?

I only need to do this in cases where it is against policy for them to
put it in the central servers.  What do you propose people do in
SqueakMap when it is against policy to post something on there?




> > In summary, I would be happy to have one tool or toolset we all agree on
> > using.  But while the really great Grand Unified Tools are still being
> > developed, shouldn't we use simple existing solutions and make do as
> > well as possible?
> 
> Of course we should. I just don't see why you portray yourself as the
> proponent for something "simple" and "existing" when in fact SM is the
> thing that exists and is simple. :) Really.

SqueakMap does not meet the use cases I gave above, while the simpler
Universes system does.  Thus I think this is a fair assessment.  SM, of
course, also does other things that Universes does not do, but those
things are not required for the use cases I gave.

Lex


