[squeak-dev] The Inbox: HelpSystem-Core-ct.129.mcz

commits at source.squeak.org
Mon Mar 2 09:33:00 UTC 2020


Christoph Thiede uploaded a new version of HelpSystem-Core to project The Inbox:
http://source.squeak.org/inbox/HelpSystem-Core-ct.129.mcz

==================== Summary ====================

Name: HelpSystem-Core-ct.129
Author: ct
Time: 2 March 2020, 10:32:56.341949 am
UUID: 65f1da58-ae12-c14a-a12c-bc9a3b7a08b3
Ancestors: HelpSystem-Core-mt.119

Improves parsing of HTML help topics

- Detect relative links and convert them to absolute URLs
- Add support for a cleanseBlock that will be applied to the HTML body source
- Trim leading and trailing blanks from the text
- Cache contents

Small refactoring:

- Remove unnecessary duplicate parse logic from #subtopics
- Again in #subtopics, pass the instance variable fooBlock instead of the result of [self fooBlock]. Don't manifest default values ...
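
For illustration only, the root-relative link handling described above can be tried in a workspace roughly as follows. This is a minimal sketch, not the code of this commit: the example.com URL and the resolve block are hypothetical, and plain strings stand in for the TextURL attributes that the actual method walks over.

```smalltalk
"Sketch: derive scheme://host from a topic URL and prefix root-relative links with it.
The URL and the resolve block are made up for this example."
| url rootUrl resolve |
url := 'http://example.com/docs/index.html'.

"Read the scheme up to '://', then the host up to the next slash."
rootUrl := url readStream in: [:urlStream |
	| scheme host |
	scheme := urlStream upToAll: '://'.
	host := urlStream upTo: $/.
	scheme , '://' , host].

"Links starting with a slash are resolved against the root; everything else is left alone."
resolve := [:link |
	(link beginsWith: '/')
		ifTrue: [rootUrl , link]
		ifFalse: [link]].

resolve value: '/help/intro.html'. "'http://example.com/help/intro.html'"
```

Links that already carry a scheme pass through unchanged in this sketch; the '..' case from the diff is intentionally left out here.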

=============== Diff against HelpSystem-Core-mt.119 ===============

Item was changed:
  AbstractHelpTopic subclass: #HtmlHelpTopic
+ 	instanceVariableNames: 'url level selectBlock convertBlock cleanseBlock document contents subtopicUrls subtopics'
- 	instanceVariableNames: 'url document selectBlock convertBlock subtopicUrls subtopics level'
  	classVariableNames: ''
  	poolDictionaries: ''
  	category: 'HelpSystem-Core-Model'!

Item was added:
+ ----- Method: HtmlHelpTopic>>cleanseBlock (in category 'accessing') -----
+ cleanseBlock
+ 	"Answer the block that will be applied to the HTML body source in order to filter relevant information."
+ 
+ 	^ cleanseBlock ifNil: [ [:contents | contents] ]!

Item was added:
+ ----- Method: HtmlHelpTopic>>cleanseBlock: (in category 'accessing') -----
+ cleanseBlock: aBlock
+ 	"Indicate the block that will be applied to the HTML body source in order to filter relevant information."
+ 
+ 	cleanseBlock := aBlock.!

Item was changed:
  ----- Method: HtmlHelpTopic>>contents (in category 'accessing') -----
  contents
  
+ 	| start end source text rootUrl |
+ 	contents ifNotNil: [^ contents].
+ 	
+ 	start := self document findString: '<body'.
- 	| start end |
- 	start := (self document findString: '<body').
  	start := (self document findString: '>' startingAt: start) + 1.
  	end := self document findString: '</body>' startingAt: start.
  	
  	start > end ifTrue: [^ self document].
  	
+ 	source := self document copyFrom: start to: end - 1.
+ 	source := self cleanseBlock value: source.
+ 	text := (source copyReplaceAll: String cr with: '<br>')
+ 		asTextFromHtml.
+ 	
+ 	"Convert relative URLs (https://www.w3.org/TR/WD-html40-970917/htmlweb.html#h-5.1.2)"
+ 	rootUrl := url readStream in: [:urlStream |
+ 		| host scheme |
+ 		scheme := urlStream upToAll: '://'.
+ 		host := urlStream upTo: $/.
+ 		scheme , '://' , host].
+ 	(text runs gather: #yourself) withoutDuplicates
+ 		select: [:attribute | attribute isKindOf: TextURL]
+ 		thenDo: [:attribute |
+ 			(attribute info beginsWith: '..')
+ 				ifTrue: [attribute url: self url , (attribute info skip: 2)].
+ 			(attribute info beginsWith: '/')
+ 				ifTrue: [attribute url: rootUrl , attribute info]].
+ 	
+ 	^ contents := text withBlanksTrimmed!
- 	^ ((self document copyFrom: start to: end - 1)
- 		copyReplaceAll: String cr with: '<br>') 
- 		asTextFromHtml!

Item was changed:
  ----- Method: HtmlHelpTopic>>subtopics (in category 'accessing') -----
  subtopics
  
- 	| start end urls |
  	subtopics ifNotNil: [^ subtopics].
  	
- 	urls := OrderedCollection new.
- 	
- 	start := self document findString: '<a '.
- 	[start > 0] whileTrue: [
- 		start := self document findString: 'href' startingAt: start.
- 		start := (self document findString: '"' startingAt: start) + 1.
- 		end := self document findString: '"' startingAt: start.
- 		urls addIfNotPresent: (self document copyFrom: start to: end - 1).
- 		start := self document findString: '<a ' startingAt: start.].
- 	
  	subtopics := (self subtopicUrls collect: [:aUrl | self class new
  			level: self level + 1;
  			url: aUrl;
+ 			selectBlock: selectBlock;
+ 			convertBlock: convertBlock;
+ 			cleanseBlock: cleanseBlock]).
- 			selectBlock: self selectBlock;
- 			convertBlock: self convertBlock]).
  		
  	Project current uiProcess == Processor activeProcess
  		ifTrue: [self fetchSubtopics].
  	
  	^ subtopics!


