[squeak-dev] HTML parser (again) (again)

Sean P. DeNigris sean at clipperadams.com
Fri Oct 29 07:40:13 UTC 2010


All the threads about HTML parsing on this list seem to die off unresolved.
What options are available in current Squeak, and how do they differ?

Here's my experience:
1. HTML (Squeaksource) - parser pulled out of Scamper.  Loads in Squeak 4.1,
seems awkward, but maybe I don't know how to use it.  Here's a snippet I
wrote that worked:
  doc := HtmlParser parse: htmlString.
  thread := doc body subEntities detect: [ :e | e id = 'thread' ].
  subject_header := thread contents detect: [ :e | e id = 'message_heading' ].
  subject := (subject_header contents at: 1) text.
2. HTML & CSS Validating Parser (Squeaksource) - It loads, but I don't have
the slightest clue how to use it.  I found references to people using it. 
They must be Alan Kay's close relatives, or live in machine world like the
Lawnmower Man because I couldn't find a shred of documentation or even one
class that looked plausible as a starting point.
3. Soup
  * In 4.1
     - "Installer squeaksource project: 'Soup'; install: 'Soup'" -> "depends on... RxMatcher" warning
     - "Installer squeaksource project: 'Soup'; install: 'Network-Protocols'", then Soup -> "NonBooleanReceiver: proceed for truth" (see [1] for log).

Also, two general points that would put turbo boosters behind the community:
<rant>
1. I know this is totally ungrateful, but please, if you design a library
and are nice enough to release it for free to the community, take 10 seconds
and at least add an XxxInfo category and class with a simple example in the
class comment (if you don't have time for HelpSystem, etc.).  Spending hours
trying to use a library and never getting it to work can be worse than not
having the library at all.
2. If you ask questions, and the community rallies behind you on the mailing
list, IRC, etc., and you solve the problem - hooray!  Please show your
gratitude and pay it forward by sharing the solution on the list, so that
others in the same situation can benefit, and we can all learn without
exhausting the time and energy of the gurus, who otherwise have to explain
the same thing over and over.
</rant>

Thank you.
Sean

[1] Log: NonBooleanReceiver: proceed for truth.
28 October 2010 9:31:15.328 pm

VM: Mac OS - Smalltalk
Image: Squeak4.1 [latest update: #9957]

SecurityManager state:
Restricted: false
FileAccess: true
SocketAccess: true
Working Dir /Users/sean/Squeak/Fresh Images/Squeak4.1
Trusted Dir /foobar/tooBar/forSqueak/bogus
Untrusted Dir /Users/sean/Library/Preferences/Squeak/Internet/My Squeak

HTTPSocket(Object)>>mustBeBooleanIn:
	Receiver: a HTTPSocket[connected]
	Arguments and temporary variables: 
		context: 	[] in HTTPSocket class>>httpGetDocument:args:accept:request:
		proceedValue: 	nil
	Receiver's instance variables: 
		semaphore: 	a Semaphore()
		socketHandle: 	#[183 120 202 76 0 0 0 0 144 223 51 0]
		readSemaphore: 	a Semaphore()
		writeSemaphore: 	a Semaphore()
		primitiveOnlySupportsOneSemaphore: 	false
		headerTokens: 	nil
		headers: 	nil
		responseCode: 	nil

HTTPSocket(Object)>>mustBeBoolean
	Receiver: a HTTPSocket[connected]
	Arguments and temporary variables: 

	Receiver's instance variables: 
		semaphore: 	a Semaphore()
		socketHandle: 	#[183 120 202 76 0 0 0 0 144 223 51 0]
		readSemaphore: 	a Semaphore()
		writeSemaphore: 	a Semaphore()
		primitiveOnlySupportsOneSemaphore: 	false
		headerTokens: 	nil
		headers: 	nil
		responseCode: 	nil

[] in HTTPSocket class>>httpGetDocument:args:accept:request:
	Receiver: HTTPSocket
	Arguments and temporary variables: 
<<error during printing>
	Receiver's instance variables: 
		superclass: 	Socket
		methodDict: 	a MethodDictionary(#contentType->(HTTPSocket>>#contentType "a Compi...etc...
		format: 	146
		instanceVariables: 	#('headerTokens' 'headers' 'responseCode')
		organization: 	('accessing' contentType contentType: contentsLength: getHeader: ...etc...
		subclasses: 	nil
		name: 	#HTTPSocket
		classPool: 	a Dictionary(#HTTPBlabEmail->'' #HTTPPort->80 #HTTPProxyCredentials-...etc...
		sharedPools: 	nil
		environment: 	Smalltalk globals "a SystemDictionary with lots of globals"
		category: 	#'Network-Protocols'

SmallInteger(Integer)>>timesRepeat:
	Receiver: 3
	Arguments and temporary variables: 
		aBlock: 	[closure] in HTTPSocket class>>httpGetDocument:args:accept:request:
		count: 	1
	Receiver's instance variables: 
3

--- The full stack ---
HTTPSocket(Object)>>mustBeBooleanIn:
HTTPSocket(Object)>>mustBeBoolean
[] in HTTPSocket class>>httpGetDocument:args:accept:request:
SmallInteger(Integer)>>timesRepeat:
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HTTPSocket class>>httpGetDocument:args:accept:request:
HTTPSocket class>>httpGet:args:accept:request:
HTTPSocket class>>httpGet:args:user:passwd:
[] in MCHttpRepository>>allFileNames
[] in [] in MCHttpRepository>>displayProgress:during:
BlockClosure>>on:do:
[] in MCHttpRepository>>displayProgress:during:
[] in [] in ProgressInitiationException>>defaultMorphicAction
BlockClosure>>on:do:
[] in ProgressInitiationException>>defaultMorphicAction
BlockClosure>>ensure:
ProgressInitiationException>>defaultMorphicAction
ProgressInitiationException>>defaultAction
UndefinedObject>>handleSignal:
MethodContext(ContextPart)>>handleSignal:
MethodContext(ContextPart)>>handleSignal:
MethodContext(ContextPart)>>handleSignal:
ProgressInitiationException(Exception)>>signal
ProgressInitiationException>>display:at:from:to:during:
ProgressInitiationException class>>display:at:from:to:during:
MorphicUIManager>>displayProgress:at:from:to:during:
MCHttpRepository>>displayProgress:during:
MCHttpRepository>>allFileNames
MCHttpRepository(MCFileBasedRepository)>>allFileNamesOrCache
MCHttpRepository(MCFileBasedRepository)>>readableFileNames
InstallerMonticello>>mcThing
[] in InstallerMonticello>>basicInstall
[] in BlockClosure>>valueSupplyingAnswers:
BlockClosure>>on:do:
BlockClosure>>valueSupplyingAnswers:
BlockClosure>>valueSuppressingMessages:supplyingAnswers:
InstallerMonticello(Installer)>>withAnswersDo:
InstallerMonticello>>basicInstall
[] in InstallerMonticello(Installer)>>installLogging
InstallerMonticello(Installer)>>logErrorDuring:
InstallerMonticello(Installer)>>installLogging
InstallerMonticello(Installer)>>install
InstallerMonticello(Installer)>>install:
UndefinedObject>>DoIt
Compiler>>evaluate:in:to:notifying:ifFail:logged:
[] in SmalltalkEditor(TextEditor)>>evaluateSelection
BlockClosure>>on:do:
SmalltalkEditor(TextEditor)>>evaluateSelection
SmalltalkEditor(TextEditor)>>doIt
SmalltalkEditor(TextEditor)>>doIt:
...etc...
-- 
View this message in context: http://forum.world.st/HTML-parser-again-again-tp3018595p3018595.html