[squeak-dev] HTML parser (again) (again)
Sean P. DeNigris
sean at clipperadams.com
Fri Oct 29 07:40:13 UTC 2010
All the threads in the mailing list seem to die off unresolved. What are the
options available in current Squeak, and what are the differences?
Here's my experience:
1. HTML (Squeaksource) - parser pulled out of Scamper. It loads in Squeak 4.1
but seems awkward, though maybe I just don't know how to use it. Here's a
snippet I wrote that worked:
doc := HtmlParser parse: htmlString.
thread := doc body subEntities detect: [ :e | e id = 'thread' ].
subject_header := thread contents detect: [ :e | e id = 'message_heading' ].
subject := (subject_header contents at: 1) text.
2. HTML & CSS Validating Parser (Squeaksource) - It loads, but I don't have
the slightest clue how to use it. I found references to people using it.
They must be Alan Kay's close relatives, or live in machine world like the
Lawnmower Man because I couldn't find a shred of documentation or even one
class that looked plausible as a starting point.
3. Soup
* In 4.1
- "Installer squeaksource project: 'Soup'; install: 'Soup'" -> "depends on... RxMatcher" warning
- "Installer squeaksource project: 'Soup'; install: 'Network-Protocols'", then Soup -> "NonBooleanReceiver: proceed for truth" (see [1] for log).
Also, two general points that would put turbo boosters behind the community:
<rant>
1. I know this is totally ungrateful, but please, if you design a library
and are nice enough to release it for free to the community, take 10 seconds
and at least add an XxxInfo category and class with a simple example in the
class comment (if you don't have time for HelpSystem, etc.). Spending hours
trying to use a library and never getting it to work can be worse than not
having one at all.
2. If you ask questions, and the community rallies behind you on the mailing
list, IRC, etc., and you solve the problem - hooray! Please show your
gratitude and pay it forward by sharing the solution on the list, so that
others in the same situation can benefit, and we can all learn without
exhausting the time and energy of the gurus having to explain the same thing
over and over.
</rant>
Thank you.
Sean
[1] Log: NonBooleanReceiver: proceed for truth.
28 October 2010 9:31:15.328 pm
VM: Mac OS - Smalltalk
Image: Squeak4.1 [latest update: #9957]
SecurityManager state:
Restricted: false
FileAccess: true
SocketAccess: true
Working Dir /Users/sean/Squeak/Fresh Images/Squeak4.1
Trusted Dir /foobar/tooBar/forSqueak/bogus
Untrusted Dir /Users/sean/Library/Preferences/Squeak/Internet/My Squeak
HTTPSocket(Object)>>mustBeBooleanIn:
Receiver: a HTTPSocket[connected]
Arguments and temporary variables:
context: [] in HTTPSocket class>>httpGetDocument:args:accept:request:
proceedValue: nil
Receiver's instance variables:
semaphore: a Semaphore()
socketHandle: #[183 120 202 76 0 0 0 0 144 223 51 0]
readSemaphore: a Semaphore()
writeSemaphore: a Semaphore()
primitiveOnlySupportsOneSemaphore: false
headerTokens: nil
headers: nil
responseCode: nil
HTTPSocket(Object)>>mustBeBoolean
Receiver: a HTTPSocket[connected]
Arguments and temporary variables:
Receiver's instance variables:
semaphore: a Semaphore()
socketHandle: #[183 120 202 76 0 0 0 0 144 223 51 0]
readSemaphore: a Semaphore()
writeSemaphore: a Semaphore()
primitiveOnlySupportsOneSemaphore: false
headerTokens: nil
headers: nil
responseCode: nil
[] in HTTPSocket class>>httpGetDocument:args:accept:request:
Receiver: HTTPSocket
Arguments and temporary variables:
<<error during printing>
Receiver's instance variables:
superclass: Socket
methodDict: a MethodDictionary(#contentType->(HTTPSocket>>#contentType "a Compi...etc...
format: 146
instanceVariables: #('headerTokens' 'headers' 'responseCode')
organization: ('accessing' contentType contentType: contentsLength: getHeader: ...etc...
subclasses: nil
name: #HTTPSocket
classPool: a Dictionary(#HTTPBlabEmail->'' #HTTPPort->80 #HTTPProxyCredentials-...etc...
sharedPools: nil
environment: Smalltalk globals "a SystemDictionary with lots of globals"
category: #'Network-Protocols'
SmallInteger(Integer)>>timesRepeat:
Receiver: 3
Arguments and temporary variables:
aBlock: [closure] in HTTPSocket class>>httpGetDocument:args:accept:request:
count: 1
Receiver's instance variables:
3
--- The full stack ---
HTTPSocket(Object)>>mustBeBooleanIn:
HTTPSocket(Object)>>mustBeBoolean
[] in HTTPSocket class>>httpGetDocument:args:accept:request:
SmallInteger(Integer)>>timesRepeat:
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HTTPSocket class>>httpGetDocument:args:accept:request:
HTTPSocket class>>httpGet:args:accept:request:
HTTPSocket class>>httpGet:args:user:passwd:
[] in MCHttpRepository>>allFileNames
[] in [] in MCHttpRepository>>displayProgress:during:
BlockClosure>>on:do:
[] in MCHttpRepository>>displayProgress:during:
[] in [] in ProgressInitiationException>>defaultMorphicAction
BlockClosure>>on:do:
[] in ProgressInitiationException>>defaultMorphicAction
BlockClosure>>ensure:
ProgressInitiationException>>defaultMorphicAction
ProgressInitiationException>>defaultAction
UndefinedObject>>handleSignal:
MethodContext(ContextPart)>>handleSignal:
MethodContext(ContextPart)>>handleSignal:
MethodContext(ContextPart)>>handleSignal:
ProgressInitiationException(Exception)>>signal
ProgressInitiationException>>display:at:from:to:during:
ProgressInitiationException class>>display:at:from:to:during:
MorphicUIManager>>displayProgress:at:from:to:during:
MCHttpRepository>>displayProgress:during:
MCHttpRepository>>allFileNames
MCHttpRepository(MCFileBasedRepository)>>allFileNamesOrCache
MCHttpRepository(MCFileBasedRepository)>>readableFileNames
InstallerMonticello>>mcThing
[] in InstallerMonticello>>basicInstall
[] in BlockClosure>>valueSupplyingAnswers:
BlockClosure>>on:do:
BlockClosure>>valueSupplyingAnswers:
BlockClosure>>valueSuppressingMessages:supplyingAnswers:
InstallerMonticello(Installer)>>withAnswersDo:
InstallerMonticello>>basicInstall
[] in InstallerMonticello(Installer)>>installLogging
InstallerMonticello(Installer)>>logErrorDuring:
InstallerMonticello(Installer)>>installLogging
InstallerMonticello(Installer)>>install
InstallerMonticello(Installer)>>install:
UndefinedObject>>DoIt
Compiler>>evaluate:in:to:notifying:ifFail:logged:
[] in SmalltalkEditor(TextEditor)>>evaluateSelection
BlockClosure>>on:do:
SmalltalkEditor(TextEditor)>>evaluateSelection
SmalltalkEditor(TextEditor)>>doIt
SmalltalkEditor(TextEditor)>>doIt:
...etc...
--
View this message in context: http://forum.world.st/HTML-parser-again-again-tp3018595p3018595.html
Sent from the Squeak - Dev mailing list archive at Nabble.com.