So, this (before the change) is one of the tests currently failing in trunk.
With this change (which follows the lead of another test in this class),
we can see that lineEndConvention is completely ignored when reading
chunks - in all cases, nextChunk returns the raw data unchanged.
Should the text be changed when reading in chunk data? The method
#nextChunk calls UTF8TextConverter #nextChunkFromStream:, which calls
MultiByteFileStream #basicUpTo:, which does a raw upTo: on the stream. In
other words, it explicitly ignores any conversion.
Which path is correct, please?
Also, could $! exist inside of an encoded UTF8 character? The code is looking
for raw bytes in the stream that match that character; I'm not familiar
enough with UTF8 to know whether it is possible that a second (or third)
byte in a UTF8 character could match it, in which case doing a #basicUpTo:
would most definitely be wrong.
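On the UTF8 question: by the design of the encoding, every byte of a multi-byte sequence has its high bit set (16r80 to 16rFF), so the ASCII byte for $! (16r21) cannot occur inside one. A minimal workspace sketch of that check follows; the last expression assumes String>>squeakToUtf8 is available in the image, which may not hold everywhere.

```smalltalk
"Sketch only: evaluate the expressions one at a time in a workspace."
| bang |
bang := $! asciiValue.                "33, i.e. 16r21"

"Lead and continuation bytes of multi-byte UTF-8 sequences all lie in 16r80..16rFF."
bang between: 16r80 and: 16rFF.      "false"

"Assumption: String>>squeakToUtf8 exists in this image.
 Encode a non-ASCII character (e-acute, code point 233) and look for byte 33."
(Character value: 233) asString squeakToUtf8 asByteArray includes: bang.
```

If that holds, a raw byte search for $! cannot stop in the middle of a character, although it says nothing about whether line-end conversion should still be applied to the chunk.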
-cbc
On Sat, Aug 16, 2014 at 6:00 PM, <commits(a)source.squeak.org> wrote:
> A new version of MultilingualTests was added to project The Inbox:
> http://source.squeak.org/inbox/MultilingualTests-cbc.19.mcz
>
> ==================== Summary ====================
>
> Name: MultilingualTests-cbc.19
> Author: cbc
> Time: 16 August 2014, 6:00:20.609 pm
> UUID: b3ccb0dc-5adc-3a4f-b722-98deab3ae9be
> Ancestors: MultilingualTests-fbs.18
>
> MultiByteFileStream>>testLineEndingChunk is failing on Windows platforms
> (maybe others). Tweaked test to show which of the 4 is failing.
>
> =============== Diff against MultilingualTests-fbs.18 ===============
>
> Item was changed:
> ----- Method: MultiByteFileStreamTest>>testLineEndingChunk (in category 'testing') -----
> testLineEndingChunk
> + | failures |
> -
> fileName := 'foolinend.txt'.
> + failures := OrderedCollection new.
> MultiByteFileStream forceNewFileNamed: fileName do: [ :file |
> file
> wantsLineEndConversion: false;
> nextPutAll: 'line 1'; cr;
> nextPutAll: 'line 2'; crlf;
> nextPutAll: 'line 3'; lf;
> nextPutAll: 'line 4'; nextPut: $!! ].
> {
> {#cr. 'line 1' , String cr , 'line 2' , String cr , 'line 3' , String cr , 'line 4'}.
> {#lf. 'line 1' , String cr , 'line 2' , String cr , 'line 3' , String cr , 'line 4'}.
> {#crlf. 'line 1' , String cr , 'line 2' , String cr , 'line 3' , String cr , 'line 4'}.
> {nil. 'line 1' , String cr , 'line 2' , String crlf , 'line 3' , String lf , 'line 4'}
> } do: [:lineEndingResult |
> + MultiByteFileStream oldFileNamed: fileName do: [ :file | | actual |
> - MultiByteFileStream oldFileNamed: fileName do: [ :file |
> file lineEndConvention: lineEndingResult first.
> + lineEndingResult last = (actual := file nextChunk) ifFalse: [
> + failures add: (lineEndingResult copyWith: actual).
> + ].
> + ] ].
> + self assert: failures isEmpty!
> - self assert: lineEndingResult last equals: file nextChunk ] ]!
>
>
>
Changes to Trunk (http://source.squeak.org/trunk.html) in the last 24 hours:
http://lists.squeakfoundation.org/pipermail/packages/2014-August/007258.html
Name: Multilingual-ul.200
Ancestors: Multilingual-ul.199
Fix the startup of various file streams.
- ensure that MultiByteFileStream's default line end convention is initialized at startup, and only at startup
- don't initialize the stdioFiles more than once at startup
=============================================
http://lists.squeakfoundation.org/pipermail/packages/2014-August/007259.html
Name: Network-ul.152
Ancestors: Network-nice.151
Patched all methods of Socket which wait for the readSemaphore the same way #waitForDataIfClosed: was patched.
Created a new preference for the maximum timeout, because in some real-time applications it's better to use less than 500ms.
This fixes some random long waits/unexpected behavior for the users of SocketStream (e.g. WebClient, RFB, etc).
=============================================
The "Wiki" link in the header should point to http://wiki.squeak.org/squeak.
I'd really like to see the <>s properly aligned within the circles.
But otherwise, thanks!
frank
On Thu, 14 Aug 2014, commits(a)source.squeak.org wrote:
> A new version of Multilingual was added to project The Inbox:
> http://source.squeak.org/inbox/Multilingual-cbc.200.mcz
>
> ==================== Summary ====================
>
> Name: Multilingual-cbc.200
> Author: cbc
> Time: 14 August 2014, 1:32:39.389 pm
> UUID: 82f3111f-cca2-2941-9d64-1a9132c45dae
> Ancestors: Multilingual-ul.199
>
> Set the variable wantsLineEndConversion to true (in MultiByteFileStream initialize) if the platform line ending is not #cr.
> This results in default file writing to convert all #cr to whatever the default is for that platform. On platforms where #cr is the default, it does no conversions.
I'm pretty sure that it's intentional that no conversion is set by
default. MultiByteFileStream is the default FileStream now, while
CrLfFileStream is (was) a special stream. MultiByteFileStream can convert
line endings if you tell it that you want it to, otherwise it'll behave
like the old StandardFileStream.
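As a rough illustration of opting in per stream, here is a sketch that uses only selectors appearing in the test diff earlier in this thread (#wantsLineEndConversion: and #lineEndConvention:). The file name 'example.txt' is hypothetical, and the exact on-disk result depends on the image version, so treat the comment about CR LF as an expectation rather than a guarantee.

```smalltalk
"Sketch: ask one stream to convert line endings on write.
 'example.txt' is a made-up file name for illustration."
MultiByteFileStream forceNewFileNamed: 'example.txt' do: [ :file |
	file
		wantsLineEndConversion: true;
		lineEndConvention: #crlf;
		nextPutAll: 'line 1'; cr;
		nextPutAll: 'line 2'; cr ].
"With conversion requested, each cr written above should end up as CR LF in the file.
 Without the first two messages the stream writes the bytes as given."
```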
Levente
>
> =============== Diff against Multilingual-ul.199 ===============
>
> Item was changed:
> ----- Method: MultiByteFileStream>>initialize (in category 'initialize-release') -----
> initialize
>
> super initialize.
> + wantsLineEndConversion := (LineEndDefault = #cr) not.
> - wantsLineEndConversion := false.
> self initializeConverter!
>
>
>
Levente Uzonyi uploaded a new version of Network to project The Trunk:
http://source.squeak.org/trunk/Network-ul.152.mcz
==================== Summary ====================
Name: Network-ul.152
Author: ul
Time: 15 August 2014, 4:25:21.446 am
UUID: 3af235fb-434f-4941-a044-fd0d71c5d46f
Ancestors: Network-nice.151
Patched all methods of Socket which wait for the readSemaphore the same way #waitForDataIfClosed: was patched.
Created a new preference for the maximum timeout, because in some real-time applications it's better to use less than 500ms.
This fixes some random long waits/unexpected behavior for the users of SocketStream (e.g. WebClient, RFB, etc).
=============== Diff against Network-nice.151 ===============
Item was changed:
Object subclass: #Socket
instanceVariableNames: 'semaphore socketHandle readSemaphore writeSemaphore'
+ classVariableNames: 'Connected DeadServer InvalidSocket MaximumReadSemaphoreWaitTimeout OtherEndClosed Registry RegistryThreshold TCPSocketType ThisEndClosed UDPSocketType Unconnected WaitingForConnection'
- classVariableNames: 'Connected DeadServer InvalidSocket OtherEndClosed Registry RegistryThreshold TCPSocketType ThisEndClosed UDPSocketType Unconnected WaitingForConnection'
poolDictionaries: ''
category: 'Network-Kernel'!
!Socket commentStamp: 'gk 12/13/2005 00:43' prior: 0!
A Socket represents a network connection point. Current sockets are designed to support the TCP/IP and UDP protocols. Sockets are the lowest level of networking object in Squeak and are not normally used directly. SocketStream is a higher level object wrapping a Socket in a stream like protocol.
ProtocolClient and subclasses are in turn wrappers around a SocketStream to provide support for specific network protocols such as POP, NNTP, HTTP, and FTP.!
Item was added:
+ ----- Method: Socket class>>maximumReadSemaphoreWaitTimeout (in category 'preferences') -----
+ maximumReadSemaphoreWaitTimeout
+
+ <preference: 'Maximum readSemaphore wait timeout.'
+ category: 'general'
+ description: 'The number of milliseconds for which we''ll wait for the readSemaphore of a Socket to signal. This is used by a workaround for a VM bug. Lower values use more CPU, but result in less delay in extremal cases.'
+ type: #Number>
+ ^MaximumReadSemaphoreWaitTimeout ifNil: [ 500 ]!
Item was added:
+ ----- Method: Socket class>>maximumReadSemaphoreWaitTimeout: (in category 'preferences') -----
+ maximumReadSemaphoreWaitTimeout: anInteger
+ "The number of milliseconds for which we'll wait for the readSemaphore to signal. This is used by a workaround for a VM bug."
+
+ MaximumReadSemaphoreWaitTimeout := anInteger!
Item was changed:
----- Method: Socket>>waitForDataFor:ifClosed:ifTimedOut: (in category 'waiting') -----
waitForDataFor: timeout ifClosed: closedBlock ifTimedOut: timedOutBlock
"Wait for the given nr of seconds for data to arrive."
| startTime msecsDelta |
startTime := Time millisecondClockValue.
msecsDelta := (timeout * 1000) truncated.
[(Time millisecondsSince: startTime) < msecsDelta] whileTrue: [
(self primSocketReceiveDataAvailable: socketHandle)
ifTrue: [^self].
self isConnected
ifFalse: [^closedBlock value].
+ "Providing a maximum for the time for waiting is a workaround for a VM bug which causes sockets waiting for data forever in some rare cases, because the semaphore doesn't get signaled. Remove the ""min: self class maximumReadSemaphoreWaitTimeout"" part when the bug is fixed."
self readSemaphore waitTimeoutMSecs:
+ ((msecsDelta - (Time millisecondsSince: startTime) max: 0) min: self class maximumReadSemaphoreWaitTimeout).
- (msecsDelta - (Time millisecondsSince: startTime) max: 0).
].
(self primSocketReceiveDataAvailable: socketHandle)
ifFalse: [
self isConnected
ifTrue: [^timedOutBlock value]
ifFalse: [^closedBlock value]].!
Item was changed:
----- Method: Socket>>waitForDataIfClosed: (in category 'waiting') -----
waitForDataIfClosed: closedBlock
"Wait indefinitely for data to arrive. This method will block until
data is available or the socket is closed."
[
(self primSocketReceiveDataAvailable: socketHandle)
ifTrue: [^self].
self isConnected
ifFalse: [^closedBlock value].
+ "Providing a maximum for the time for waiting is a workaround for a VM bug which causes sockets waiting for data forever in some rare cases, because the semaphore doesn't get signaled. Replace the ""waitTimeoutMSecs: self class maximumReadSemaphoreWaitTimeout"" part with ""wait"" when the bug is fixed."
+ self readSemaphore waitTimeoutMSecs: self class maximumReadSemaphoreWaitTimeout ] repeat
- "This 500 ms waiting is a workaround for a VM bug which causes sockets waiting for data forever randomly, because the semaphore doesn't get signaled. Revert to ""self readSemaphore wait"" when the bug is fixed."
- self readSemaphore waitTimeoutMSecs: 500 ] repeat
!
Item was changed:
----- Method: Socket>>waitForDisconnectionFor: (in category 'waiting') -----
waitForDisconnectionFor: timeout
"Wait for the given nr of seconds for the connection to be broken.
Return true if it is broken by the deadline, false if not.
The client should know the connection is really going to be closed
(e.g., because he has called 'close' to send a close request to the other end)
before calling this method."
| startTime msecsDelta status |
startTime := Time millisecondClockValue.
msecsDelta := (timeout * 1000) truncated.
status := self primSocketConnectionStatus: socketHandle.
[((status == Connected) or: [(status == ThisEndClosed)]) and:
[(Time millisecondsSince: startTime) < msecsDelta]] whileTrue: [
self discardReceivedData.
+ "Providing a maximum for the time for waiting is a workaround for a VM bug which causes sockets waiting for data forever in some rare cases, because the semaphore doesn't get signaled. Remove the ""min: self class maximumReadSemaphoreWaitTimeout"" part when the bug is fixed."
self readSemaphore waitTimeoutMSecs:
+ ((msecsDelta - (Time millisecondsSince: startTime) max: 0) min: self class maximumReadSemaphoreWaitTimeout).
- (msecsDelta - (Time millisecondsSince: startTime) max: 0).
status := self primSocketConnectionStatus: socketHandle].
^ status ~= Connected!
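To make the effect of the new cap concrete, here is a small workspace sketch of the clamping expression used in the patched wait loops, with made-up numbers (all in milliseconds). The value 100 in the last line is only an example of a lower setting, not a recommendation.

```smalltalk
"Sketch: the remaining time is clamped to 0 from below and to the cap from above."
| cap |
cap := 500.   "the default returned by Socket class>>maximumReadSemaphoreWaitTimeout"

((2000 - 300 max: 0) min: cap).    "500: plenty of time left, so the cap applies"
((2000 - 1800 max: 0) min: cap).   "200: less than the cap remains"
((2000 - 2500 max: 0) min: cap).   "0: the deadline has already passed"

"Lowering the cap for a latency-sensitive application, via the setter added in this commit:"
Socket maximumReadSemaphoreWaitTimeout: 100.
```

A lower cap means the loop rechecks the socket more often, which trades some CPU for shorter worst-case delays, as the preference description notes.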
Levente Uzonyi uploaded a new version of Multilingual to project The Trunk:
http://source.squeak.org/trunk/Multilingual-ul.200.mcz
==================== Summary ====================
Name: Multilingual-ul.200
Author: ul
Time: 15 August 2014, 4:11:35.855 am
UUID: 7fac98fd-9abf-4f69-a21f-3e16b5710480
Ancestors: Multilingual-ul.199
Fix the startup of various file streams.
- ensure that MultiByteFileStream's default line end convention is initialized at startup, and only at startup
- don't initialize the stdioFiles more than once at startup
=============== Diff against Multilingual-ul.199 ===============
Item was removed:
- ----- Method: MultiByteFileStream class>>startUp (in category 'class initialization') -----
- startUp
-
- self guessDefaultLineEndConvention.
- !
Item was added:
+ ----- Method: MultiByteFileStream class>>startUp: (in category 'class initialization') -----
+ startUp: resuming
+
+ resuming ifTrue: [ self guessDefaultLineEndConvention ]
+ !
Item was changed:
+ (PackageInfo named: 'Multilingual') postscript: '"Remove CrLfFileStream from the startupList"
+ (Smalltalk classNamed: ''CrLfFileStream'') ifNotNil: [ :class |
+ Smalltalk removeFromStartUpList: class ]'!
- (PackageInfo named: 'Multilingual') postscript: '"Initialize the value of wantsLineEndConversion in all MultiByteFileStreams"
- MultiByteFileStream allSubInstancesDo: [ :each |
- (each instVarNamed: #wantsLineEndConversion) ifNil: [
- each instVarNamed: #wantsLineEndConversion put: false ] ]'!