[squeak-dev] The Trunk: Collections-mt.886.mcz

Levente Uzonyi leves at caesar.elte.hu
Mon Apr 20 11:51:55 UTC 2020


On Mon, 20 Apr 2020, Marcel Taeumel wrote:

> > Still strange that such things are in Collections.
> 
> There has been a discussion on this list about CollectionsExtras. Maybe we should follow up on that.

An HTML to text converter wouldn't fit into CollectionsExtras either. It 
has nothing to do with collections.


Levente

> 
> Best,
> Marcel
>
>       On 18.04.2020 15:29:14, Thiede, Christoph <christoph.thiede at student.hpi.uni-potsdam.de> wrote:
>
>       Great idea!
> 
> 
> > Still strange that such things are in Collections.
> 
> ________________________________
> From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of Jakob Reschke <forums.jakob at resfarm.de>
> Sent: Friday, 17 April 2020 19:45:38
> To: squeak-dev at lists.squeakfoundation.org
> Subject: Re: [squeak-dev] The Trunk: Collections-mt.886.mcz
> Still strange that such things are in Collections.
> 
> On Fri, 17 Apr 2020 at 16:56, <commits at source.squeak.org> wrote:
> >
> > Marcel Taeumel uploaded a new version of Collections to project The Trunk:
> > http://source.squeak.org/trunk/Collections-mt.886.mcz
> >
> > ==================== Summary ====================
> >
> > Name: Collections-mt.886
> > Author: mt
> > Time: 17 April 2020, 4:56:30.33186 pm
> > UUID: c7d64e56-0d06-e34f-8a61-8f5a7eb9277d
> > Ancestors: Collections-eem.885
> >
> > To our HTML-to-Text converter, add support for <img> tags. Either download an image (or picture) from the Web or evaluate some code to retrieve either a Form or a Morph. As documented in #httpGetImage:, this complements the support of "code://" in TextURL.
> >
> > =============== Diff against Collections-eem.885 ===============
> >
> > Item was added:
> > + ----- Method: HtmlReadWriter>>httpGetImage: (in category 'private') -----
> > + httpGetImage: url
> > +       "To not add a direct dependency to WebClient, provide this hook for getting an image from an HTTP url. Maybe we can have this via an AppRegistry at some point. Maybe extend WebBrowser."
> > +
> > +       (url beginsWith: 'code://') ifTrue: [
> > +               "Same support for Smalltalk expressions as in TextURL >> #actOnClickFor:."
> > +               ^ ([Compiler evaluate: (url allButFirst: 7)] ifError: [nil])
> > +                       ifNotNil: [:object | object isForm ifTrue: [object] ifFalse: [nil]]].
> > +
> > +       ^ (Smalltalk classNamed: 'WebClient') ifNotNil: [:client |
> > +               ([client httpGet: url] ifError: [nil]) ifNotNil: [:response |
> > +                       response code = 200 ifFalse: [nil] ifTrue: [
> > +                               [Form fromBinaryStream: response content asByteArray readStream]
> > +                                       ifError: [nil]]]]!
> >
> > Item was added:
> > + ----- Method: HtmlReadWriter>>mapImgTag: (in category 'mapping') -----
> > + mapImgTag: aTag
> > +
> > +       | result startIndex stopIndex attribute src form |
> > +       result := OrderedCollection new.
> > +
> > +       "<img src=""https://squeak.org/img/downloads/image.png"">"
> > +       attribute := 'src'.
> > +       startIndex := aTag findString: attribute.
> > +       startIndex > 0 ifTrue: [
> > +               startIndex := aTag findString: '"' startingAt: startIndex+attribute size.
> > +               startIndex > 0
> > +                       ifTrue: [stopIndex := aTag findString: '"' startingAt: startIndex+1]
> > +                       ifFalse: [
> > +                               "URLs without quotes..."
> > +                               startIndex := aTag findString: '=' startingAt: startIndex+attribute size.
> > +                               stopIndex := aTag findString: '>' startingAt: startIndex+1].
> > +               src := aTag copyFrom: startIndex+1 to: stopIndex-1.
> > +               form := (self httpGetImage: src) ifNil: [(Form dotOfSize: 12 color: Color veryLightGray)].
> > +               result
> > +                       add: form asTextAnchor;
> > +                       add: (TextColor color: Color transparent)].
> > +       ^ result!
> >
> > Item was changed:
> >   ----- Method: HtmlReadWriter>>mapTagToAttribute: (in category 'mapping') -----
> >   mapTagToAttribute: aTag
> >
> >         aTag = '<b>' ifTrue: [^ {TextEmphasis bold}].
> >         aTag = '<i>' ifTrue: [^ {TextEmphasis italic}].
> >         aTag = '<u>' ifTrue: [^ {TextEmphasis underlined}].
> >         aTag = '<s>' ifTrue: [^ {TextEmphasis struckOut}].
> >         aTag = '<code>' ifTrue: [^ self mapCodeTag].
> >         aTag = '<pre>' ifTrue: [self breakLines: false. ^ {}].
> >         (#('<div' '<span' '<center>' ) anySatisfy: [:ea | aTag beginsWith: ea])
> >                 ifTrue: [^(self mapAlignmentTag: aTag) union: (self mapContainerTag: aTag)].
> >         (aTag beginsWith: '<font') ifTrue: [^ self mapFontTag: aTag].
> >         (aTag beginsWith: '<a') ifTrue: [^ self mapATag: aTag].
> > +       (aTag beginsWith: '<img') ifTrue: [^ self mapImgTag: aTag].
> >
> >         "h1, h2, h3, ..."
> >         (aTag second = $h and: [aTag third isDigit])
> >                 ifTrue: [^ {TextEmphasis bold}].
> >
> >         ^ {}!
> >
> > Item was changed:
> >   ----- Method: HtmlReadWriter>>processEmptyTag: (in category 'reading') -----
> >   processEmptyTag: aTag
> >
> >         (aTag beginsWith: '<br') ifTrue: [
> >                 self addCharacter: Character cr.
> >                 ^ self].
> >
> > +       (aTag beginsWith: '<img') ifTrue:[
> > +               ^ self processStartTag: aTag].
> > +
> >         (self isTagIgnored: aTag)
> >                 ifTrue: [^ self].
> >
> >         "TODO... what?"!
> >
> > Item was added:
> > + ----- Method: HtmlReadWriter>>processEndTagEagerly: (in category 'reading') -----
> > + processEndTagEagerly: aTag
> > +       "Not all tags need an end tag. Simulate that here."
> > +
> > +       (aTag beginsWith: '<img')
> > +               ifTrue: [^ self processEndTag: '</img>'].!
> >
> > Item was changed:
> >   ----- Method: HtmlReadWriter>>processStartTag: (in category 'reading') -----
> >   processStartTag: aTag
> >
> >         | index |
> >         (self isTagIgnored: aTag) ifTrue: [^ self].
> >
> >         index := count - offset.
> >
> >         aTag = '<br>' ifTrue: [
> >                 self addCharacter: Character cr.
> >                 ^ self].
> > +
> >         (aTag beginsWith: '<img') ifTrue: [
> > +               self addString: Character startOfHeader asString.
> > +               offset := offset + 1.
> > +               index := index - 1].
> > -               self addString: '[image]'.
> > -               ^ self].
> >
> >         self processRunStackTop. "To add all attributes before the next tag adds some."
> >
> >         "Copy attr list and add new attr."
> >         runStack push: ({runStack top first copy addAll: (self mapTagToAttribute: aTag); yourself. index + 1 . index + 1}).
> > +
> > +       "For tags such as <img>, we should simulate the closing tag because there won't be any."
> > +       self processEndTagEagerly: aTag.!
> > -       !
> >
> >
> 
> 
> 
>
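
For anyone who wants to try the quoted change, here is a minimal workspace sketch of the "code://" path described in the commit summary. It assumes String>>asTextFromHtml is present in the image and routes through HtmlReadWriter; the HTML string and the variable names are made up for illustration. Note that #httpGetImage: as quoted only accepts the evaluated result if it answers true to #isForm.

```smalltalk
"Sketch only: convert a small HTML snippet containing an <img> tag to a Text.
 Assumption: String>>asTextFromHtml exists and uses HtmlReadWriter."
| html text |
"The src uses the code:// scheme, so the expression after the prefix is
 evaluated instead of being fetched over HTTP."
html := 'A red dot: <img src="code://Form dotOfSize: 24 color: Color red"> end.'.
text := html asTextFromHtml.
"Inspect the result to look at the text anchor and its attributes."
text inspect.
```

If the expression fails or does not answer a Form, the quoted #mapImgTag: falls back to a light gray dot, so the conversion should still produce a Text.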

