[Goodie][Beta]Regular Expressions Plugin (RePlugin 3.3beta)

Andrew C. Greenberg werdna at mucow.com
Sat Aug 17 03:10:43 UTC 2002


A non-text attachment was scrubbed...
Name: RePlugin3.3.1.cs
Type: application/text
Size: 440863 bytes
Desc: not available
Url : http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20020816/44aa6bc3/RePlugin3.3.1.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RePlugin.gz
Type: application/x-gzip
Size: 51327 bytes
Desc: not available
Url : http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20020816/44aa6bc3/RePlugin.bin
-------------- next part --------------


Attached, please find a changeset and machine-independent source code 
for the 3.3beta version of RePlugin.  This is an "early adopters" 
version to get the code in the hands of VM builders who might assemble 
and test on various platforms (I built an internal plugin on MacOSX that 
passed all tests in the SUnit suite -- that's it so far).  I'd be 
obliged for reports and copies of the 3.3beta plugin.

Features of the new plugin:

	1) Brand new and improved interface to the plugin via class Re.
	2) Interfaces with the latest and greatest of Philip Hazel's PCRE 
(2.9)
	3) Fixes many problems arising from bit rot -- this version should 
build under the latest version of the compiler and under VMMaker.
	4) Many bugs fixed.  Much faster.  Extensive SUnit suite.

It is advised that old and new users alike read the pretty extensive 
documentation in class "Re." The documentation on my website is not 
entirely current with this version.  I hope to be posting a clean and 
friendly version in the next few weeks, including copies of the 
real-deal plugins produced by our friends and colleagues.

------------------------

'From Squeak3.3alpha of 12 January 2002 [latest update: #4934] on 16 
August 2002 at 10:56:49 pm'!
"Change Set:		RePlugin3.3
Date:			16 August 2002
Author:			acg

Perl-Style Regular Expressions in Smalltalk
by Andrew C. Greenberg

Version 3.3beta

I.  Regular Expressions in General

	Regular expressions are a language for specifying text to ease the 
searching and manipulation of text.  A complete discussion of regular 
expressions is beyond the scope of this document.  See Jeffrey Friedl, 
Mastering Regular Expressions (O'Reilly), for a relatively complete 
treatment.  The regular expressions supported by this package are similar 
to those presently used in Perl 5.05 and Python, and are based upon Philip 
Hazel's excellent PCRE libraries (incorporated almost without change, 
subject to a free license described in Re aLicenseComment).  Thanks are 
due to Markus Kohler and Stephen Pair for their assistance in the 
initial ports of early versions of the Plugin.

The expressions available in this package are summarized in 
Re aRegexComment, Re anOptionsComment and Re aGlobalSearchComment.

A more detailed description of RePlugin is available by downloading the 
file 'RePluginDoco,' which can be obtained from 
http://www.gate.net/~werdna/RePlugin.html, into your default directory, 
and then executing

		Utilities reconstructTextWindowsFromFileNamed: 'RePluginDoco'

II. Overview of the 'Package.'

	The following new classes are provided:

		Class         Description of Instances
		------------  ----------------------------------------------------
		Re            A regular expression matching engine
		ReMatch       Result of a search using Re
		RePattern     Deprecated engine class from earlier plugin versions
		RePlugin      The Plugin 'Glue' to the PCRE Library.

		String        Various new messages were added to String, which are
		              the principal means for users to access the package.

PluginCodeGenerator has been deleted from the package.


III. Some Examples.

	A. Simple Matching and Querying of Matches

	To search a string for matches of a regular expression, use String 
matchRe:

		'just trying to catch some zzz''s before noon' matchRe: 'z+'

which returns true if matched, and false otherwise.  If more information 
from a positive search result is desired, the method reMatch: will answer 
a ReMatch object corresponding to the result.

		'just trying to catch some zzz''s before noon' reMatch: 'z+'

The match object can be queried in various ways.  For example, to obtain 
details when parenthetical phrases of a regular expression are captured:

		|m|
		m _ 'Andy was born on 10/02/1957, and not soon enough!!'
			reMatch: '(\d\d)/(\d\d)/((19|20)?\d\d)'.
		m matches

answers with:
	
		('10' '02' '1957' '19' )

The first message answers a ReMatch m representing the result of a 
search of the string for matches of the regular expression (nil would be 
returned if no match was found).  The second message answers a collection 
of the parenthetical subgroups matched: the day and month fields, the 
year, and the century prefix captured by the innermost group.
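
Since the pattern syntax is described above as similar to Python's, the 
capture-group behaviour can be cross-checked outside Squeak.  The sketch 
below uses Python's standard re module as a stand-in; it is not RePlugin's 
API, and the helper name date_groups is made up for illustration.

```python
import re

# Same pattern as the Smalltalk example above. Python's re module is only
# a stand-in for PCRE here (an assumption of this sketch).
DATE = re.compile(r'(\d\d)/(\d\d)/((19|20)?\d\d)')

def date_groups(text):
    # Hypothetical helper: return the captured subgroups as a tuple, or
    # None when nothing matches (mirroring the nil answered by reMatch:).
    m = DATE.search(text)
    return None if m is None else m.groups()

print(date_groups('Andy was born on 10/02/1957, and not soon enough'))
print(date_groups('no date in this string'))
```

In Python, an optional group that does not take part in the match (for 
instance the century prefix in '10/02/57') is reported as None; how 
RePlugin reports that case is not stated here.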

	B. Global Matching and String Substitutions

	You can perform global searches to repeatedly search a string for 
non-overlapping occurrences of a pattern by using collectRe:.  For 
example,

		'this is a test' collectRe: '\w+'

can be used to gather a collection of all words in the search string, 
answering:

		OrderedCollection ('this' 'is' 'a' 'test' )
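
The global collection above amounts to scanning left to right and keeping 
each non-overlapping match.  As a rough cross-check, here is the same idea 
in Python's re module; this is an analogue under the stated similarity of 
pattern syntax, not RePlugin's implementation.

```python
import re

# Walk the string left to right and keep the full text of every
# non-overlapping match of the pattern.
words = [m.group(0) for m in re.finditer(r'\w+', 'this is a test')]
print(words)
```

Taking group(0) of each match keeps the whole matched text even when the 
pattern contains parenthetical subgroups.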

For slightly more complex collections, you can use #reMatch:andCollect:.  
Additionally, you can perform global searches with text substitutions 
using reMatch:andReplace:.  For example,

		'this is a test' reMatch: '\w+' andReplace: [:m | '<', (m match), '>']

can be used to replace every word in the search string with the word 
enclosed in angle brackets, answering:

		'<this> <is> <a> <test>'
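
The substitution above passes each match to a block and splices the 
block's answer into the result.  A minimal Python sketch of the same 
pattern follows, using re.sub with a callback; the function name bracket 
is hypothetical, and Python's re module is again only a stand-in for the 
plugin.

```python
import re

def bracket(m):
    # m.group(0) is the whole matched text, playing the role of
    # (m match) in the Smalltalk block above.
    return '<' + m.group(0) + '>'

# Replace every non-overlapping match with the callback's return value.
result = re.sub(r'\w+', bracket, 'this is a test')
print(result)
```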

Further examples and documentation can be found in the references above, 
and in the comments and definitions set forth in ReMatch, RePattern and 
String.
"!

