<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p><font face="Georgia">If you just want to replace it yourself, try
this:</font></p>
<br>
<table style="color: rgb(0, 0, 0); font-family: Times; font-size:
medium; font-style: normal; font-variant-ligatures: normal;
font-variant-caps: normal; font-weight: 400; letter-spacing:
normal; orphans: 2; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: 2;
word-spacing: 0px; -webkit-text-stroke-width: 0px;
text-decoration-style: initial; text-decoration-color: initial;
width: 2499px; height: 614px;">
<tbody>
<tr>
<td rowspan="2" id="fancyText7" style="width: 1494px;">
<div style="font-family: Georgia, Times, serif; height:
608px; overflow: auto;">
<div class="codd" style="background-color: rgb(255, 255,
238); padding: 4px; max-width: 1000px;">'From
Squeak3.4alpha of ''11 November 2002'' [latest update:
#5109] on 16 November 2002 at 8:06:43 pm'</div>
<div class="ceven" style="background-color: rgb(238, 255,
255); padding: 4px; max-width: 1000px;"><span
style="color: blue;">"Change Set: ISO8859<br>
Date: 15 November 2002<br>
Author: Boris Gaertner<br>
<br>
Jean-Marie Zajac pointed out that accented characters
in ISO-8859-1 encoding are not displayed as expected.
Scamper is not encoding-aware, but it translates
ISO-8859-1 to the encoding that is used in Squeak.
Unfortunately, due to a subtle bug the translation is
done twice: first, the entire source is translated,
later parsed entities are translated again. This
change set drops the translation of parsed entities. To
make it work, it adds the translation of character
entity references (characters that are written in the
form &amp;#&lt;integer&gt;; or in the form
&amp;&lt;character name&gt;; see sections 5.3.1 and
5.3.2 of the HTML 4.0 specification.)<br>
<br>
Jean-Marie tested a first version and found a new bug,
later he tested a second version that is seemingly ok.
With his test he helped me to understand where the
real problem was buried. Thanks a lot!<span> </span><br>
<br>
"</span></div>
<div class="codd" style="background-color: rgb(255, 255,
238); padding: 4px; max-width: 1000px;"><span
style="font-weight: 700;">HtmlText</span><span> </span>methodsFor:
'private-initialization' stamp: 'BG 11/15/2002 21:40'</div>
<div class="ceven" style="background-color: rgb(238, 255,
255); padding: 4px; max-width: 1000px;"><span
style="font-weight: 700;">initialize:</span><span> </span>source0<br>
super initialize: source0.<br>
self text: source0<span> </span><span style="color:
red; background-color: yellow;">replaceHtml</span>CharRefs.</div>
<div class="codd" style="background-color: rgb(255, 255,
238); padding: 4px; max-width: 1000px;"><span
style="font-weight: 700;">String</span><span> </span>methodsFor:
'internet' stamp: 'BG 11/15/2002 21:18'</div>
<div class="ceven" style="background-color: rgb(238, 255,
255); padding: 4px; max-width: 1000px;"><span
style="color: red; background-color: yellow;">replaceHtml</span><span
style="font-weight: 700;">CharRefs</span><br>
<br>
| pos ampIndex scIndex special specialValue outString
outPos newOutPos |<br>
<br>
outString ← String new: self size.<br>
outPos ← 0.<br>
<br>
pos ← 1.<br>
<br>
[ pos &lt;= self size ] whileTrue: [<span> </span><br>
<span style="color: blue;">"read up to the next
ampersand"</span><br>
ampIndex ← self indexOf: $&amp; startingAt: pos
ifAbsent: [0].<br>
<br>
ampIndex = 0 ifTrue: [<br>
pos = 1 ifTrue: [ ↑self ] ifFalse: [ ampIndex ← self
size+1 ] ].<br>
<br>
newOutPos ← outPos + ampIndex - pos.<br>
outString<br>
replaceFrom: outPos + 1<br>
to: newOutPos<br>
with: self<br>
startingAt: pos.<br>
outPos ← newOutPos.<br>
pos ← ampIndex.<br>
<br>
ampIndex &lt;= self size ifTrue: [<br>
<span style="color: blue;">"find the $;"</span><br>
scIndex ← self indexOf: $; startingAt: ampIndex
ifAbsent: [ self size + 1 ].<br>
<br>
special ← self copyFrom: ampIndex+1 to: scIndex-1.<span> </span><br>
specialValue ← HtmlEntity valueOfHtmlEntity: special.<span> </span><br>
<br>
specialValue<br>
ifNil: [<br>
<span style="color: blue;">"not a recognized entity;
write it back"</span><br>
scIndex &gt; self size
ifTrue: [ scIndex ← self size ].<br>
<br>
newOutPos ← outPos + scIndex - ampIndex + 1.<br>
outString<br>
replaceFrom: outPos+1<br>
to: newOutPos<br>
with: self<br>
startingAt: ampIndex.<br>
outPos ← newOutPos.]<br>
ifNotNil: [<br>
outPos ← outPos + 1.<br>
outString at: outPos put: specialValue isoToSqueak.].<br>
<br>
pos ← scIndex + 1. ]. ].<br>
<br>
<br>
↑outString copyFrom: 1 to: outPos</div>
</div>
</td>
</tr>
</tbody>
</table>
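<p><font face="Georgia">Once that is filed in, a quick check in a
workspace might look like the sketch below. It assumes the change set
above loaded cleanly and that the selectors it relies on
(<code>HtmlEntity valueOfHtmlEntity:</code> and <code>isoToSqueak</code>)
still exist in your image; the sample string is made up for
illustration.</font></p>
<pre>"Assumption: String&gt;&gt;replaceHtmlCharRefs from the change set above is now in the image."
| sample |
sample := 'caf&amp;eacute; &amp;amp; co'.   "hypothetical test string"
"Going by the method above, recognized entities are replaced by single
 characters and anything HtmlEntity does not know is copied through unchanged."
Transcript show: sample replaceHtmlCharRefs; cr.

"Then retry the original expression from the message below:"
FileStream fileNamed: 'some.html' do: [:stream | HtmlParser parse: stream]
</pre>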
<br>
<div class="moz-cite-prefix">On 10/22/17 1:05 PM, Bernhard Pieber
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:5963B789-8CEB-42E1-8599-25C49944934C@pieber.com">
<pre wrap="">Dear Squeakers,
I tried to parse an HTML file like this in a trunk image and ran into a MNU:
FileStream fileNamed: 'some.html' do: [:stream | HtmlParser parse: stream]
In HtmlText&gt;&gt;#initialize the message #replaceHtmlCharRefs is sent. I suppose this method was once in the image. Otherwise HtmlParser would never have worked. How can I find out when it got lost? How would you do it?
Cheers,
Bernhard
</pre>
</blockquote>
<br>
</body>
</html>