[squeak-dev] Re: Extracting data from web pages using Squeak

Edgar J. De Cleene edgardec2001 at yahoo.com.ar
Mon Jun 16 19:36:38 UTC 2008




On 6/16/08 3:41 PM, "Edgar J. De Cleene" <edgardec2001 at yahoo.com.ar>
wrote:

> Here is a crude HTML text viewer.
> It works in SqueakLightII:
> http://ftp.squeak.org/various_images/SqueakLight/SqueakLightII.7069.zip
> First load
> Network-HTML-md.4.mcz
> then file in
> HTMLScrollableField.st
> and in a Workspace do it:
> HTMLScrollableField help
> 
> Edgar


Here is the other file.

-------------- next part --------------
'From SqueakLight of 16 August 2006 [latest update: #422] on 5 April 2007 at 9:05:41 am'!
ScrollableField subclass: #HTMLScrollableField
	instanceVariableNames: 'visitedUrls'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'SqueakRos'!

!HTMLScrollableField methodsFor: 'accessing' stamp: 'edc 12/29/2005 11:56'!
visitedUrls
	"Answer the value of visitedUrls"

	^ visitedUrls! !

!HTMLScrollableField methodsFor: 'accessing' stamp: 'edc 12/29/2005 11:56'!
visitedUrls: anObject
	"Set the value of visitedUrls"

	visitedUrls := anObject! !


!HTMLScrollableField methodsFor: 'as yet unclassified' stamp: 'edc 9/30/2006 09:51'!
jumpToUrl: url
	"Fetch the page named by the last path segment of url, resolved against
	the directory of the first visited URL, and display it as formatted text."

	| stream aText root last thisPage |
	last := (visitedUrls at: 1) lastIndexOf: $/.
	root := (visitedUrls at: 1) copyFrom: 1 to: last.
	last := (url lastIndexOf: $/) + 1.
	thisPage := url copyFrom: last to: url size.
	thisPage := root , thisPage.
	stream := (HTTPSocket httpGet: thisPage accept: 'application/octet-stream') contents.
	aText := (HtmlParser parse: stream) formattedText.
	self setMyText: aText.
	self visitedUrls add: thisPage! !

"-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- "!

HTMLScrollableField class
	instanceVariableNames: ''!

!HTMLScrollableField class methodsFor: 'as yet unclassified' stamp: 'edc 12/29/2005 11:57'!
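default
	"Answer a new pale yellow instance inset from the edges of the World,
	with an empty visitedUrls collection."
	| newInstance r |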
	newInstance := self newStandAlone.
	newInstance
		color: (Color
				r: 0.972
				g: 0.972
				b: 0.662).
	r := Rectangle
				left: World left + 50
				right: World right - 50
				top: World top + 100
				bottom: World bottom - 50.
	newInstance bounds: r.
	newInstance visitedUrls: OrderedCollection new.
	^ newInstance! !

!HTMLScrollableField class methodsFor: 'as yet unclassified' stamp: 'edc 4/5/2007 08:59'!
help
	"HTMLScrollableField help"
	| stream aText sf |
	stream := (HTTPSocket httpGet: 'http://wiki.squeak.org/squeak//5871' accept: 'application/octet-stream') contents.
	
	aText := (HtmlParser parse: stream) formattedText.
	sf := HTMLScrollableField default.
	sf setMyText: aText.
	sf visitedUrls add: 'http://www.worldwideschool.org/library/books/lit/sciencefiction/TheMysteriousIsland/'.
	^sf openInWorld! !

!HTMLScrollableField class methodsFor: 'as yet unclassified' stamp: 'edc 5/12/2006 12:08'!
isla
	"HTMLScrollableField isla"
	| stream aText sf |
	stream := (HTTPSocket httpGet: 'http://www.worldwideschool.org/library/books/lit/sciencefiction/TheMysteriousIsland/chap1.html' accept: 'application/octet-stream') contents.
	
	aText := (HtmlParser parse: stream) formattedText.
	sf := HTMLScrollableField default.
	sf setMyText: aText.
	sf visitedUrls add: 'http://www.worldwideschool.org/library/books/lit/sciencefiction/TheMysteriousIsland/'.
	^sf openInWorld! !

!HTMLScrollableField class methodsFor: 'as yet unclassified' stamp: 'edc 9/28/2006 12:13'!
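url: url
	"Fetch url, render its HTML as formatted text, and open it in a new field.
	The url is recorded as the first visited URL."
	| stream aText sf |
	stream := (HTTPSocket httpGet: url accept: 'application/octet-stream') contents.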
	
	aText := (HtmlParser parse: stream) formattedText.
	sf := HTMLScrollableField default.
	sf setMyText: aText.
	sf visitedUrls add: url.
	^sf openInWorld! !
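
The way jumpToUrl: builds the address of the next page is easy to miss, so here is a minimal Workspace sketch of just that step, with no network access. It only uses the String messages already used in the method (lastIndexOf:, copyFrom:to: and ,). The name 'chap2.html' is a made-up link target for illustration, not something taken from the original file.

```smalltalk
"Sketch of the URL resolution done in jumpToUrl:. Select all and print it."
| first url last root thisPage |
"The first visited URL, as registered by the class-side url: method."
first := 'http://www.worldwideschool.org/library/books/lit/sciencefiction/TheMysteriousIsland/chap1.html'.
"Hypothetical link target; an assumption for this example."
url := 'chap2.html'.
"Keep everything up to and including the last slash of the first URL."
last := first lastIndexOf: $/.
root := first copyFrom: 1 to: last.
"Keep only the file name of the link; lastIndexOf: answers 0 when there is no slash."
last := (url lastIndexOf: $/) + 1.
thisPage := root , (url copyFrom: last to: url size).
thisPage
```

Because only the part after the last slash of the link is kept, a link into a subdirectory or to another host is reduced to its bare file name and looked up next to the first visited page. That seems good enough for a book whose chapters all sit in one directory, which is what the isla example uses.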

