[squeak-dev] The Inbox: HelpSystem-Core-ct.129.mcz
commits at source.squeak.org
commits at source.squeak.org
Mon Mar 2 09:33:00 UTC 2020
Christoph Thiede uploaded a new version of HelpSystem-Core to project The Inbox:
http://source.squeak.org/inbox/HelpSystem-Core-ct.129.mcz
==================== Summary ====================
Name: HelpSystem-Core-ct.129
Author: ct
Time: 2 March 2020, 10:32:56.341949 am
UUID: 65f1da58-ae12-c14a-a12c-bc9a3b7a08b3
Ancestors: HelpSystem-Core-mt.119
Improves parsing of html help topics
- Detect relative links and convert them to absolute version
- Add support for a cleanseBlock that will be applied to the html body source
- Trim leading and trailing blanks from the text
- Cache contents
Small refactoring:
- Remove unnecessary duplicate parse logic from #subtopics
- Again in #subtopics, don't pass the result of [self fooBlock] but the instance variable fooBlock instead. Don't manifest default values ...
=============== Diff against HelpSystem-Core-mt.119 ===============
Item was changed:
AbstractHelpTopic subclass: #HtmlHelpTopic
+ instanceVariableNames: 'url level selectBlock convertBlock cleanseBlock document contents subtopicUrls subtopics'
- instanceVariableNames: 'url document selectBlock convertBlock subtopicUrls subtopics level'
classVariableNames: ''
poolDictionaries: ''
category: 'HelpSystem-Core-Model'!
Item was added:
+ ----- Method: HtmlHelpTopic>>cleanseBlock (in category 'accessing') -----
+ cleanseBlock
+ "Answer the block that will be applied to the HTML body source in order to filter relevant information."
+
+ ^ cleanseBlock ifNil: [ [:contents | contents] ]!
Item was added:
+ ----- Method: HtmlHelpTopic>>cleanseBlock: (in category 'accessing') -----
+ cleanseBlock: aBlock
+ "Indicate the block that will be applied to the HTML body source in order to filter relevant information."
+
+ cleanseBlock := aBlock.!
Item was changed:
----- Method: HtmlHelpTopic>>contents (in category 'accessing') -----
contents
+ | start end source text rootUrl |
+ contents ifNotNil: [^ contents].
+
+ start := self document findString: '<body'.
- | start end |
- start := (self document findString: '<body').
start := (self document findString: '>' startingAt: start) + 1.
end := self document findString: '</body>' startingAt: start.
start > end ifTrue: [^ self document].
+ source := self document copyFrom: start to: end - 1.
+ source := self cleanseBlock value: source.
+ text := (source copyReplaceAll: String cr with: '<br>')
+ asTextFromHtml.
+
+ "Convert relative URLs (https://www.w3.org/TR/WD-html40-970917/htmlweb.html#h-5.1.2)"
+ rootUrl := url readStream in: [:urlStream |
+ | host scheme |
+ scheme := urlStream upToAll: '://'.
+ host := urlStream upTo: $/.
+ scheme , '://' , host].
+ (text runs gather: #yourself) withoutDuplicates
+ select: [:attribute | attribute isKindOf: TextURL]
+ thenDo: [:attribute |
+ (attribute info beginsWith: '..')
+ ifTrue: [attribute url: self url , (attribute info skip: 2)].
+ (attribute info beginsWith: '/')
+ ifTrue: [attribute url: rootUrl , attribute info]].
+
+ ^ contents := text withBlanksTrimmed!
- ^ ((self document copyFrom: start to: end - 1)
- copyReplaceAll: String cr with: '<br>')
- asTextFromHtml!
Item was changed:
----- Method: HtmlHelpTopic>>subtopics (in category 'accessing') -----
subtopics
- | start end urls |
subtopics ifNotNil: [^ subtopics].
- urls := OrderedCollection new.
-
- start := self document findString: '<a '.
- [start > 0] whileTrue: [
- start := self document findString: 'href' startingAt: start.
- start := (self document findString: '"' startingAt: start) + 1.
- end := self document findString: '"' startingAt: start.
- urls addIfNotPresent: (self document copyFrom: start to: end - 1).
- start := self document findString: '<a ' startingAt: start.].
-
subtopics := (self subtopicUrls collect: [:aUrl | self class new
level: self level + 1;
url: aUrl;
+ selectBlock: selectBlock;
+ convertBlock: convertBlock;
+ cleanseBlock: cleanseBlock]).
- selectBlock: self selectBlock;
- convertBlock: self convertBlock]).
Project current uiProcess == Processor activeProcess
ifTrue: [self fetchSubtopics].
^ subtopics!
More information about the Squeak-dev
mailing list
|