[squeak-dev] The Inbox: WebClient-Core-ct.126.mcz

Thiede, Christoph Christoph.Thiede at student.hpi.uni-potsdam.de
Mon Oct 26 13:25:45 UTC 2020


Hi Tom, Hi Tobias,


> Maybe I also missed an update that cleaned the diff but couldn't find a "perfect" one in the inbox thus far :)


Is WebClient-Core-ct.128 not yet perfect enough for you? :-)

> And It won't work in 4 weeks time anyways

You can still use an access token as a password - actually, I never trust any Squeak image with my real GitHub password, because Squeak does not store passwords in protected form.

> Or, you know, for wherever your GitHub client is implemented[1], make sure to first go to the authz url when you have a password?

Still an edge case. Does this read nicely to you?
https://github.com/Metacello/metacello/blob/3b7d6814d155088910d6c7b17e85d89ba55f078e/repository/Metacello-Platform.squeak.package/MetacelloSqueakPlatform.class/instance/httpGet.username.pass.do..st

Best,
Christoph
________________________________
From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of Tom Beckmann <tomjonabc at gmail.com>
Sent: Monday, 26 October 2020 10:59:24
To: The general-purpose Squeak developers list
Subject: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz

Hi Tobias,

> On 26.10.2020, at 08:39, Tom Beckmann <tomjonabc at gmail.com> wrote:
>
> Hey everyone,
>
> to sum up my understanding: the goal of this change is to change this pattern
>
> WebClient new
>     httpGet: 'https://api.github.com/repos/myuser/myprivaterepo/zipball/master'
>     do: [:req | req headerAt: 'Authorization' put: 'Basic ', 'myuser:mypasswd' base64Encoded]
>
> to
>
> WebClient new
>     preAuthenticateMethod: #basic;
>     user: 'myuser' password: 'mypasswd';
>     httpGet: 'https://api.github.com/repos/myuser/myprivaterepo/zipball/master'
>
> I dare say the fact that this pattern is required cannot be disputed since a number of services, such as Github, require the user to authenticate in this way.

Sure, but it won't work for me anyway, since I have 2FA.
:D

Even with 2FA, `'Basic ', 'myuser:myPersonalAccessToken' base64Encoded` is currently probably the easiest way to access the endpoint with most clients. The way [1] reads, you are correct that even this will no longer work, and we would need to switch to `'token myPersonalAccessToken'` as the Authorization header. We should investigate soon whether this would also affect the GitBrowser.
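
A minimal sketch of the two Authorization header forms mentioned above, written as shell functions. The function names are hypothetical, and the `base64` command-line tool is assumed to be installed (it is common, but not part of POSIX).

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical and not part of any
# tool discussed in this thread.

# Build the value of a Basic Authorization header from a user name and a
# password (or a personal access token used in place of the password).
basic_auth_header() {
    # printf avoids the trailing newline that echo would add to the input;
    # tr strips the line wrapping some base64 implementations insert.
    printf 'Basic %s' "$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')"
}

# Build the token-style header value from a personal access token.
token_auth_header() {
    printf 'token %s' "$1"
}
```

For example, `basic_auth_header myuser mypasswd` prints `Basic bXl1c2VyOm15cGFzc3dk`, which is the same string the Smalltalk expression `'Basic ', 'myuser:mypasswd' base64Encoded` is meant to produce.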

Best,
Tom

[1] https://developer.github.com/changes/2020-02-14-deprecating-password-auth/


On Mon, Oct 26, 2020 at 10:37 AM Tobias Pape <Das.Linux at gmx.de> wrote:

> On 25.10.2020, at 23:00, Thiede, Christoph <Christoph.Thiede at student.hpi.uni-potsdam.de> wrote:
>
> Hi Tobias,
>
> > The API should stick to the HTTP RFCs and use 401 to say: You need authentication.
>
> Hm, but this would be an information leak on the server-side. :-)

somewhat, and I get what GitHub does…


>
> > Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
>
> Sorry about that. Still, even in curl, anyauth is an opt-in feature, not an opt-out. So my proposal for adding #preAuthenticationMethod as an opt-in feature would be equivalent to adding an #anyAuth property as an opt-out. Why should WebClient be less powerful than curl? I see it can be abused, but Squeak already contains a lot of dangerous protocols that still can be useful in particular situations. Just insert a warning into the method comment and it'll be OK I think.

I don't think it works that way.


>
> > Except when you use  "https://api.github.com/authorizations" first, which 401s.
>
> True; but still, this would require either a change in the design or an edge-case implementation because the WebClient connection logic is not aware of whether GitHub or BitBucket or whatever else should be contacted.

Or, you know, for wherever your GitHub client is implemented[1], make sure to first go to the authz url when you have a password?

[1]: Yes, I think you/we/one should have a GitHub client if working with an API. It is not just a simple "web site" anymore.


>
> > Otherwise, encode it in the URL?
>
> Do you mean like http://username:password@www.example.com? At the moment, WebClient is not treating this differently than WebClient >> #username and #password. Is this the correct behavior? curl, again, uses preauth in this situation, and Pharo does this, too. Unfortunately, I could not find a clear answer to this question in RFC1738 ...

ahh yes, :D.
I just imagined a browser, which would ask for the password on a 401, but send it anyway if given in the URL.


=-=-=
Think of it this way: you do not present your passport at every door where you ring the bell, but only at those where you are asked for it. ;)
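
The "only when asked" rule can be sketched as a small shell function. This is an illustration under stated assumptions, not the actual logic of WebClient or curl: the name `should_send_credentials` is hypothetical, and only the Basic scheme is handled.

```shell
#!/bin/sh
# Decide whether to resend a request with credentials.
#   $1: HTTP status code of the first, unauthenticated response
#   $2: value of its WWW-Authenticate header (may be empty)
# Succeeds (exit status 0) only when the server asked for Basic credentials.
# Hypothetical helper for illustration.
should_send_credentials() {
    # Anything other than a 401 is not a challenge, so nothing is sent.
    [ "$1" = "401" ] || return 1
    case "$2" in
        # Scheme names are case-insensitive; a realm usually follows.
        [Bb][Aa][Ss][Ii][Cc]|[Bb][Aa][Ss][Ii][Cc]' '*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With a 401 and `Basic realm="SwaSource - XP aware"` the check succeeds; with a 200 or a 404 it fails and no credentials leave the client. The latter is the case GitHub produces for private repositories, which is why this flow alone does not reach them.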



>
> > Don't rely on just my "judgement" ;)
>
> Your arguments are highly appreciated! I'm just trying to figure out your motivations ... Yepp, some >=3rd opinions would be a good thing. :-)


:)

best regards
        -tobias
>
> Best,
> Christoph
> From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of Tobias Pape <Das.Linux at gmx.de>
> Sent: Sunday, 25 October 2020 22:19:49
> To: The general-purpose Squeak developers list
> Subject: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
>
> Hi
>
> > On 25.10.2020, at 22:03, Thiede, Christoph <Christoph.Thiede at student.hpi.uni-potsdam.de> wrote:
> >
> > Hi Tobias,
> >
> > sorry for the long delay. I was on holiday a few days and did not manage to return earlier to this interesting topic ...
> >
> >
> > > No, that's wrong. Curl will only send auth data if you provide it.
> >
> > > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
> >
> > > The point is, you instruct Curl to _provide credentials unconditionally_.
> >
> > I think this is exactly the point. "curl -u user:pw" reads as "Download a resource, 'specify[ing] the user name and password to use for server authentication'" (cited from the man page).
>
> Yes, that means _unconditionally_, even if the source does not need it.
> This is called an information leak.
>
>
> > If I do
> > WebClient httpDo: [:client | client username: 'user'; password: 'pw'; get: 'https://example.com/rest/whoami'],
> > this reads exactly the same to me. Otherwise, #username: and #password: might better be renamed to #optionalUsername/#lazyUsername/#usernameIfAsked etc.
> >
>
> Nope.
>
> > Apart from this, I have tested the behavior for Pharo, too, where the default web client is ZnClient: And it works like curl, too, rather than like our WebClient, i.e. the following works without any extra low-level instructions:
> >
> > ZnClient new
> > url: 'https://api.github.com/repos/LinqLover/openHAB-configuration/zipball/master';
> > username: 'LinqLover' password: 'mytoken';
> > downloadTo: 'foo.zip'
> >
> > > So you always know beforehand which resources need authentication?
> > > Neat, I don't :D
> >
> > I suppose we are having different use cases in mind: You are thinking of a generic browser application while I am thinking of a specific API client implementation. Is this correct?
>
> No. The API should stick to the HTTP RFCs and use 401 to say: You need authentication.
>
> > If I'm developing a REST API client, I do have to know whether a resource requires authentication or whether it doesn't. This is usually specified in the API documentation. Why add another level of uncertainty by using this trial-and-error strategy?
>
> Because preemptive auth is wrong.
>
> Sadly you omitted the anyauth stuff that actually works how Authentication in HTTP is spec'ed.
> Yes, it is "one request more". Yes, it is the right thing to do.
>
> Just because it is convenient and people are doing it does not mean it is good.
> In fact, the whole "curl as API-consumer" is fine, but sticking "-u" to each and every request is a security nightmare second only to "curl ... | sudo bash".
>
> >
> >
> > In the context of my Metacello PR, the problem is that following your approach of specifying the Authorization header would mess up all the different layers of abstraction that are not aware of web client implementations and headers but only of a URL and a username/password pair. I had hoped that I could pass a constant string 'Basic' to the Authorization header for all cases where the WebClient is invoked, but unfortunately, GitHub does not understand this either: the header must contain the password even on the first request.
>
> Except when you use  "https://api.github.com/authorizations" first, which 401s.
>
>
> > It would lead to a fair amount of ugly spaghetti code if I had to implement an edge case for GitHub in the relevant method (MetacelloSqueakPlatform class >> #downloadZipArchive:to:username:pass:); for this reason, I would find it really helpful to turn on preAuth at this place.
>
> > Do you dislike this feature in general, even when turned off by default?
>
> Yes. WebClient is not just an API-consumer. It ought to be safe.
> Otherwise, encode it in the URL?
>
> > I'm not even asking to make this an opt-out feature; this inbox version only implements it as an opt-in.
>
> I don't know.
>
> I think there has been too little input from others here.
> Don't rely on just my "judgement" ;)
>
> Best regards
>         -Tobias
>
>
> >
> > Best,
> > Christoph
> >
> > From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of Tobias Pape <Das.Linux at gmx.de>
> > Sent: Tuesday, 13 October 2020 10:04:23
> > To: The general-purpose Squeak developers list
> > Subject: Re: [squeak-dev] The Inbox: WebClient-Core-ct.126.mcz
> >
> > Hi
> >
> > > On 12.10.2020, at 23:42, Thiede, Christoph <Christoph.Thiede at student.hpi.uni-potsdam.de> wrote:
> > >
> > > Hi Tobias,
> > >
> > > okay, I see this authorization pattern now, so you mentioned two ways to work around the current limitations:
> > > First, by GETting https://api.github.com/authorizations before, or second, by passing the Authorization header manually.
> > > Is this correct?
> >
> > Yes. However, the second one is the one GitHub "recommends"
> >
> >
> > >
> > > However, both of these approaches lack the simplicity typical of REST, i.e. "making a single HTTP request without dealing with complex call protocols or low-level connectivity code". To give an illustration of my use case, please see this PR on Metacello: https://github.com/Metacello/metacello/pull/534
> > > IMHO it would be a shame if you could not access a popular REST API like api.github.com in Squeak using a single message send to the WebClient/WebSocket class.
> >
> > There is no such thing as simplicity when a REST-Based resource-provider supports both authenticated and unauthenticated access.
> > If you cannot know beforehand, no single-request stuff is gonna help. No dice.
> >
> >
> > >
> > > > > Why not?
> > > >
> > > > It leaks credentials unnecessarily.
> > >
> > > Ah, good point! But this pattern (EAFP for web connections) is not really state of the art, is it? As mentioned, curl, for example, sends the authentication data in the very first request, which is a tool I would tend to *call* state of the art.
> >
> > No, that's wrong. Curl will only send auth data if you provide it.
> >
> > Doing "curl -u user:pw" is the same as using WebClient httpGet:do: and adding the Authorization header manually
> >
> >
> > The sequence is split manually:
> > ```
> > $ curl https://api.github.com/repos/krono/debug/zipball/master
> > {
> >   "message": "Not Found",
> >   "documentation_url": "https://docs.github.com/rest/reference/repos#download-a-repository-archive"
> > }
> > # Well, I'm left to guess. Maybe exists, maybe not.
> > $ curl -u krono https://api.github.com/repos/krono/debug/zipball/master
> >
> > ```
> > (In this case, I can't even show what's going on as I use 2FA, which means single-request REST will _never_ work on private repos.)
> >
> > The point is, you instruct Curl to _provide credentials unconditionally_.
> > The "heavy lifting" of finding out when to do that is not done by curl but by users of curl.
> >
> > Look:
> >
> > ```
> > $ curl http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > $
> > # Well, no response?
> > $ curl  -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 401
> > < Date: Tue, 13 Oct 2020 07:43:04 GMT
> > < Server: nginx/1.14.2
> > < Content-Length: 0
> > < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> > <
> > * Connection #0 to host www.hpi.uni-potsdam.de left intact
> > ```
> >
> > That's the 401 we're looking for. We have found that the resource needs authentication.
> >
> > Sidenote: Curl can do the roundtrip (man curl, search anyauth):
> >
> > ```
> > $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/
> > Enter host password for user 'topa':
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 401
> > < Date: Tue, 13 Oct 2020 07:46:05 GMT
> > < Server: nginx/1.14.2
> > < Content-Length: 0
> > < WWW-Authenticate: Basic realm="SwaSource - XP aware"
> > <
> > * Connection #0 to host www.hpi.uni-potsdam.de left intact
> > * Issue another request to this URL: 'http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpaware/'
> > * Found bundle for host www.hpi.uni-potsdam.de: 0x7fb8c8c0b1b0 [can pipeline]
> > * Re-using existing connection! (#0) with host www.hpi.uni-potsdam.de
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > * Server auth using Basic with user 'topa'
> > > GET /hirschfeld/squeaksource/xpaware/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > Authorization: Basic *******************
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 200
> > < Date: Tue, 13 Oct 2020 07:46:05 GMT
> > < Server: nginx/1.14.2
> > < Content-Type: text/html
> > < Content-Length: 15131
> > < Vary: Accept-Encoding
> > ```
> >
> > And in this case it does _not_ send auth in the first request but only in the second.
> >
> > Sidenote2: If the first request comes back 200, no second one is issued, no credentials leak:
> >
> > ```
> > $ curl -u topa --anyauth -v http://www.hpi.uni-potsdam.de/hirschfeld/squeaksource/xpforums/
> > Enter host password for user 'topa':
> > *   Trying 2001:638:807:204::8d59:e178...
> > * TCP_NODELAY set
> > * Connected to www.hpi.uni-potsdam.de (2001:638:807:204::8d59:e178) port 80 (#0)
> > > GET /hirschfeld/squeaksource/xpforums/ HTTP/1.1
> > > Host: www.hpi.uni-potsdam.de
> > > User-Agent: curl/7.54.0
> > > Accept: */*
> > >
> > < HTTP/1.1 200
> > < Date: Tue, 13 Oct 2020 07:46:56 GMT
> > < Server: nginx/1.14.2
> > < Content-Type: text/html
> > < Content-Length: 75860
> > < Vary: Accept-Encoding
> > ```
> >
> >
> >
> >
> >
> > > And speed is another point, given the fact that internet connections in Squeak are really slow ...
> > > Why do you call this behavior a leak? The application developer will not pass authentication data to the web client unless they expect the server to consume these data anyway.
> >
> > So you always know beforehand which resources need authentication?
> > Neat, I don't :D
> >
> > > If you deem it necessary, we could turn off the pre-authentication as soon as the client was redirected to another server ...
> >
> > What happens here is that we're bending over backwards because GitHub decided to be a bad player.
> >
> > I mean, on most sites you visit in browsers, no auth data is sent _unless_ you are asked to (redirect to a login) or you _explicitly_ click on a login link.
> >
> > If you want preemptive auth, do it with WebClient httpGet:do:.
> >
> >
> >
> > >
> > > > I understand that the method is maybe not the most common style, but I think that functional changes should in such cases not be mixed with style changes.
> > >
> > > Alright, please see WebClient-Core-ct.128. But maybe we should consider using prettyDiff for the mailing list notifications as a default? Just an idea.
> >
> > I personally find prettydiffs useless, but that's just me.
> >
> > Best regards
> >         -Tobias


