<body><div id="__MailbirdStyleContent" style="font-size: 10pt;font-family: Arial;color: #000000">
Credits go to Toni Mattis (https://github.com/amintos) for the idea and implementation! :-)<div><br></div><div>Thanks!</div><div class="mb_sig"></div><blockquote class="history_container" type="cite" style="border-left-style:solid;border-width:1px; margin-top:20px; margin-left:0px;padding-left:10px;">
<p style="color: #AAAAAA; margin-top: 10px;">On 04.07.2019 16:32:51, commits@source.squeak.org &lt;commits@source.squeak.org&gt; wrote:</p><div style="font-family:Arial,Helvetica,sans-serif">A new version of Collections was added to project The Inbox:<br>http://source.squeak.org/inbox/Collections-mt.838.mcz<br><br>==================== Summary ====================<br><br>Name: Collections-mt.838<br>Author: mt<br>Time: 4 July 2019, 4:32:40.854026 pm<br>UUID: ac4ab442-79c0-d246-8dec-914be7ee5356<br>Ancestors: Collections-pre.837<br><br>To String, adds simple analysis of natural language in source code. No word stemming.<br><br>1) Refactor #findTokens: to look like #lines (i.e. #linesDo: and #lineIndicesDo:).<br>2) Add #findFeaturesDo: like #findTokens:do: and #linesDo:.<br><br>Try this:<br><br>HTTPDownloadRequest name findFeatures.<br>(Morph >> #drawOn:) getSource asString findFeatures.<br><br>Where can that be useful?<br><br>- Automatic insertion of "*" for search terms like "WeakDictionary" to also find WeakIdentityDictionary etc.<br>- Prefix emphasis for lists of class names in code browsers: MCAddition, MCAncestry, etc.<br><br>=============== Diff against Collections-pre.837 ===============<br><br>Item was added:<br>+ ----- Method: String>>findFeatureIndicesDo: (in category 'accessing - features') -----<br>+ findFeatureIndicesDo: aBlock<br>+ "State machine that separates camelCase, UPPERCase, number/operator combinations and skips colons"<br>+ | last state char "0 = start, 1 = a, 2 = A, 3 = AA, 4 = num, 5 = op" |<br>+ <br>+ state := 0.<br>+ last := 1.<br>+ <br>+ 1 to: self size do: [ :index |<br>+ char := self at: index.<br>+ "a"<br>+ char isLowercase ifTrue: [<br>+ (state &lt; 3) ifTrue: [state := 1]. "*a -> a"<br>
+ (state == 3) ifTrue: [<br>+ "AAa -> A + Aa (camel case follows uppercase)"<br>+ aBlock value: last value: index - 2.<br>+ last := index - 1.<br>+ state := 2].<br>+ (state > 3) ifTrue: [<br>+ "+a -> + | a (letter follows non-letter)" <br>+ aBlock value: last value: index - 1.<br>+ last := index.<br>+ state := 1]] <br>+ ifFalse: [<br>+ char isUppercase ifTrue: [<br>+ (state == 0)<br>+ ifTrue: [state := 2] "start -> A"<br>+ ifFalse: [<br>+ (state &lt; 2 or: [state > 3]) ifTrue: [<br>+ "*A -> * | A (uppercase begins, flush before)"<br>+ aBlock value: last value: index - 1.<br>+ last := index.<br>+ state := 2] ifFalse: [<br>+ "AA -> AA (uppercase continues)"<br>+ state := 3]]]<br>+ ifFalse: [<br>+ ("char == $: or:" char isSeparator) ifTrue: [<br>+ "skip colon/whitespace"<br>+ (state > 0) ifTrue: [<br>+ aBlock value: last value: index - 1.<br>+ state := 0].<br>+ last := index + 1]<br>+ ifFalse: [<br>+ char isDigit ifTrue: [<br>+ (state == 0)<br>+ ifTrue: [state := 4]<br>+ ifFalse: [<br>+ (state ~= 4) ifTrue: [<br>+ aBlock value: last value: index - 1.<br>+ last := index.<br>+ state := 4]]]<br>+ ifFalse: [<br>+ (state == 0)<br>+ ifTrue: [state := 5]<br>+ ifFalse: [<br>+ (state &lt; 5) ifTrue: [<br>+ aBlock value: last value: index - 1.<br>+ last := index.<br>+ state := 5]]]]]]].<br>+ last &lt;= self size ifTrue: [<br>+ aBlock value: last value: self size]!<br><br>Item was added:<br>+ ----- Method: String>>findFeatures (in category 'accessing - features') -----<br>+ findFeatures<br>+ <br>+ ^ Array streamContents: [:features |<br>+ self findFeaturesDo: [:feature | features nextPut: feature]]!<br><br>Item was added:<br>+ ----- Method: String>>findFeaturesDo: (in category 'accessing - features') -----<br>+ findFeaturesDo: aBlock<br>+ "Simple analysis for natural language in source code. 
No support for word stemming."<br>+ <br>+ self findFeatureIndicesDo: [:start :end |<br>+ (self at: start) isLetter ifTrue: [<br>+ aBlock value: (self copyFrom: start to: end) asLowercase]].!<br><br>Item was changed:<br> ----- Method: String>>findTokens: (in category 'accessing') -----<br> findTokens: delimiters<br>+ "Answer the collection of tokens that result from parsing self."<br>+ <br>+ ^ OrderedCollection streamContents: [:tokens |<br>+ self<br>+ findTokens: delimiters<br>+ do: [:token | tokens nextPut: token]]!<br>- "Answer the collection of tokens that result from parsing self. Return strings between the delimiters. Any character in the Collection delimiters marks a border. Several delimiters in a row are considered as just one separation. Also, allow delimiters to be a single character."<br>- <br>- | tokens keyStart keyStop separators |<br>- <br>- tokens := OrderedCollection new.<br>- separators := delimiters isCharacter <br>- ifTrue: [Array with: delimiters]<br>- ifFalse: [delimiters].<br>- keyStop := 1.<br>- [keyStop &lt;= self size] whileTrue:<br>- [keyStart := self skipDelimiters: separators startingAt: keyStop.<br>- keyStop := self findDelimiters: separators startingAt: keyStart.<br>- keyStart &lt; keyStop<br>- ifTrue: [tokens add: (self copyFrom: keyStart to: (keyStop - 1))]].<br>- ^tokens!<br><br>Item was added:<br>+ ----- Method: String>>findTokens:do: (in category 'accessing') -----<br>+ findTokens: delimiters do: aBlock<br>+ <br>+ self<br>+ findTokens: delimiters<br>+ indicesDo: [:start :end | aBlock value: (self copyFrom: start to: end)].!<br><br>Item was added:<br>+ ----- Method: String>>findTokens:indicesDo: (in category 'accessing') -----<br>+ findTokens: delimiters indicesDo: aBlock<br>+ "Parse self to find tokens between delimiters. Any character in the Collection delimiters marks a border. Several delimiters in a row are considered as just one separation. Also, allow delimiters to be a single character. 
Similar to #lineIndicesDo:."<br>+ <br>+ | tokens keyStart keyStop separators |<br>+ separators := delimiters isCharacter <br>+ ifTrue: [Array with: delimiters]<br>+ ifFalse: [delimiters].<br>+ keyStop := 1.<br>+ [keyStop &lt;= self size] whileTrue: [<br>+ keyStart := self skipDelimiters: separators startingAt: keyStop.<br>+ keyStop := self findDelimiters: separators startingAt: keyStart.<br>+ keyStart &lt; keyStop<br>+ ifTrue: [aBlock value: keyStart value: keyStop - 1]].!<br><br><br></div></blockquote>
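<div><br></div><div>For readers who want to try the change, here is a minimal workspace sketch, assuming Collections-mt.838 is loaded into the image. The results in the comments were traced by hand through the state machine in the quoted diff, not copied from a run, and the temporaries term and pattern are illustrative names only. The last part shows one possible way to approach the "*" insertion use case mentioned in the commit message; it is not part of the commit.</div>
<pre>
"Features are split at case changes and answered in lowercase."
'HTTPDownloadRequest' findFeatures.      "#('http' 'download' 'request')"
'WeakIdentityDictionary' findFeatures.   "#('weak' 'identity' 'dictionary')"

"Non-letter runs such as the colon are reported by #findFeatureIndicesDo:
 but filtered out by #findFeaturesDo:."
'drawOn: aCanvas' findFeatures.          "#('draw' 'on' 'a' 'canvas')"

"Hypothetical wildcard pattern built from the feature boundaries."
| term pattern |
term := 'WeakDictionary'.
pattern := String streamContents: [:out |
	term findFeatureIndicesDo: [:start :end |
		out nextPutAll: (term copyFrom: start to: end); nextPut: $*]].
pattern.                                 "'Weak*Dictionary*'"
pattern match: 'WeakIdentityDictionary'. "true"
</pre>
<div>The pattern is built with #findFeatureIndicesDo: rather than #findFeatures because the latter answers lowercased copies, while the index-based variant keeps the original spelling of the search term.</div>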
</div></body>