[squeak-dev] Interesting retrieval error with XMLDOMParser (solution found, needs guru attention)

gettimothy gettimothy at zoho.com
Thu Sep 23 17:16:21 UTC 2021


I think I found a fix.

For

(WebClient httpGet: 'https://w1.weather.gov/xml/current_obs/index.xml')

httpGet: calls httpGet:do:, where the WebClient sets parameters on the WebRequest (the two headerAt:put: lines below).



httpGet: urlString do: aBlock
    "GET the response from the given url"
    "(WebClient httpGet: 'http://www.squeak.org') content"

    | request |
    self initializeFromUrl: urlString.
    request := self requestWithUrl: urlString.
    request method: 'GET'.
    userAgent ifNotNil: [:ua | request headerAt: 'User-Agent' put: ua].
    self contentDecoders ifNotNil: [:decoders | request headerAt: 'Accept-Encoding' put: decoders].
    aBlock value: request.
    ^ self sendRequest: request
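To see which headers this method puts on the request, the block argument of httpGet:do: can be used as a hook, since it runs after the headers are set and before the request is sent. This is only a workspace sketch: it assumes the WebClient package is loaded and the URL is reachable, and the Transcript output is just for inspection.

```smalltalk
| client response |
client := WebClient new.
response := client
    httpGet: 'https://w1.weather.gov/xml/current_obs/index.xml'
    do: [:request |
        "By the time this block runs, httpGet:do: has already set these
        headers (when userAgent and contentDecoders are not nil)."
        Transcript
            show: 'User-Agent: ', (request headerAt: 'User-Agent') printString; cr;
            show: 'Accept-Encoding: ', (request headerAt: 'Accept-Encoding') printString; cr].
"Show the HTTP status code of the response."
Transcript show: 'Status: ', response code printString; cr.
```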





If I take the userAgent part of that method and add it to XMLHTTPWebClientRequest>>basicSend, like this:



basicSend
    self webClientClient userAgent ifNotNil: [:ua | webClientRequest headerAt: 'User-Agent' put: ua].
    "self webClientClient contentDecoders ifNotNil: [:decoders | webClientRequest headerAt: 'Accept-Encoding' put: decoders]."   <-- note commented out

    ^ self responseClass
        request: self
        webClientResponse:
            (self webClientClient
                "#sendRequest: unfortunately requires #initializeFromUrl:
                to be sent first"
                initializeFromUrl: self url;
                sendRequest: self webClientRequest)





With that change, the following calls work:

(XMLDOMParser parseURL: 'https://www.w3schools.com/xml/simple.xml') explore.
(XMLDOMParser onURL: 'https://w1.weather.gov/xml/current_obs/index.xml' upToLimit: nil) parseDocument; explore.
(XMLDOMParser onURL: 'https://w1.weather.gov/xml/current_obs/index.xml') parseDocument; explore.
(XMLDOMParser parseURL: 'https://w1.weather.gov/xml/current_obs/index.xml') explore.





To summarize:

For

(WebClient httpGet: 'https://w1.weather.gov/xml/current_obs/index.xml') content.

the WebClient correctly sets the headers on the WebRequest.

For XMLHTTPWebClientRequest, which wraps both a WebClient and a WebRequest as instance variables, the WebRequest headers are not set properly.
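One way to check whether the missing User-Agent is really what the server objects to is to send the same GET twice through WebClient, once as-is and once with the header removed in the block. This is a hedged workspace sketch, not a confirmed diagnosis: it assumes WebRequest understands #removeHeader: and that WebClient does not add the header again later in #sendRequest:.

```smalltalk
| url withAgent withoutAgent |
url := 'https://w1.weather.gov/xml/current_obs/index.xml'.
"Default path: httpGet:do: has already added the User-Agent header."
withAgent := WebClient new httpGet: url do: [:request | ].
"Assumption: removing the header here approximates the request
that is sent when no User-Agent has been set."
withoutAgent := WebClient new
    httpGet: url
    do: [:request | request removeHeader: 'User-Agent'].
"Compare the two HTTP status codes."
Transcript
    show: 'with User-Agent: ', withAgent code printString; cr;
    show: 'without User-Agent: ', withoutAgent code printString; cr.
```

If the two status codes differ, that would support setting the header wherever the WebRequest is created.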





While I do not know enough about these systems to say definitively, it seems to me that the place to set them correctly is when the instance variable for WebRequest is created.



cheers.