[squeak-dev] Interesting retrieval error with XMLDOMParser (solution found, needs guru attention)
gettimothy
gettimothy at zoho.com
Thu Sep 23 17:16:21 UTC 2021
I think I found a fix.
for
(WebClient httpGet: 'https://w1.weather.gov/xml/current_obs/index.xml')
httpGet: calls httpGet:do:, where the WebClient sets parameters on the WebRequest (the two headerAt:put: lines for User-Agent and Accept-Encoding below).
httpGet: urlString do: aBlock
    "GET the response from the given url"
    "(WebClient httpGet: 'http://www.squeak.org') content"
    | request |
    self initializeFromUrl: urlString.
    request := self requestWithUrl: urlString.
    request method: 'GET'.
    userAgent ifNotNil: [:ua | request headerAt: 'User-Agent' put: ua].
    self contentDecoders ifNotNil: [:decoders | request headerAt: 'Accept-Encoding' put: decoders].
    aBlock value: request.
    ^self sendRequest: request
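A minimal workspace sketch for checking which User-Agent the request carries by the time the block passed to httpGet:do: runs. It assumes that WebRequest answers a header's value (or an empty string) to headerAt: and that the response answers its status to code; I have not verified either against every WebClient version, so treat it as an illustration only.

```smalltalk
| client response |
client := WebClient new.
response := client
    httpGet: 'https://w1.weather.gov/xml/current_obs/index.xml'
    do: [:request |
        "httpGet:do: has already set the headers when this block is evaluated."
        Transcript
            show: 'User-Agent: ', (request headerAt: 'User-Agent');
            cr].
"Assumption: the WebResponse answers its HTTP status code to #code."
Transcript show: 'Status: ', response code printString; cr.
```

Running the same kind of check against the WebRequest held by the XML request wrapper should show whether the header is missing there.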
If I take the userAgent part of that and put it into XMLHTTPWebClientRequest>>basicSend, like this:
basicSend
    self webClientClient userAgent ifNotNil: [:ua | webClientRequest headerAt: 'User-Agent' put: ua].
    " self webClientClient contentDecoders ifNotNil: [:decoders | webClientRequest headerAt: 'Accept-Encoding' put: decoders]." <-- note commented out
    ^ self responseClass
        request: self
        webClientResponse:
            (self webClientClient
                "#sendRequest: unfortunately requires #initializeFromUrl:
                to be sent first"
                initializeFromUrl: self url;
                sendRequest: self webClientRequest)
With that change, the following calls all work:
(XMLDOMParser parseURL: 'https://www.w3schools.com/xml/simple.xml') explore.
(XMLDOMParser onURL: 'https://w1.weather.gov/xml/current_obs/index.xml' upToLimit:nil) parseDocument; explore.
(XMLDOMParser onURL: 'https://w1.weather.gov/xml/current_obs/index.xml') parseDocument; explore.
(XMLDOMParser parseURL: 'https://w1.weather.gov/xml/current_obs/index.xml') explore.
To summarize: for
(WebClient httpGet: 'https://w1.weather.gov/xml/current_obs/index.xml') content.
The WebClient correctly sets the headers for the WebRequest.
For the XMLHTTPWebClientRequest, which holds both a WebClient and a WebRequest as instance variables,
the WebRequest headers are not set properly.
While I do not know enough about these systems to say definitively, it seems to me that the place to set them correctly is when the instance variable for WebRequest is created.
cheers.