[squeak-dev] The Inbox: Collections-nice.891.mcz

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Sun May 3 15:23:12 UTC 2020


were equipped of course, hem!

On Sun, 3 May 2020 at 17:22, Nicolas Cellier <
nicolas.cellier.aka.nice at gmail.com> wrote:

> bah excuse the grammar...
> I constantly invert now vs know and where vs were, I wish your brain is
> equipped with auto-correction.
>
> On Sun, 3 May 2020 at 17:20, Nicolas Cellier <
> nicolas.cellier.aka.nice at gmail.com> wrote:
>
>>
>>
>> On Sun, 3 May 2020 at 17:13, Tobias Pape <Das.Linux at gmx.de> wrote:
>>
>>>
>>> > On 03.05.2020, at 15:52, Nicolas Cellier <
>>> nicolas.cellier.aka.nice at gmail.com> wrote:
>>> >
>>> > Hi Subbu,
>>> > Yes those raw bits are somehow like immediates, but not exactly...
>>>
>>> So the name maybe should include "raw"?
>>> :D
>>> -t
>>>
>>
>> Yes, that was the first name that came to my mind.
>> From the abstract superclass POV, those are raw bits.
>> Only subclass really now how to interpret those bits as value objects.
>> I don't know why I then changed my mind...
>> Maybe because RawBits does not convey the meaning of FixedWidth.
>> Note that the class query is #isBits. So the AbstractBitsArray is somehow
>> in line with that.
>>
>> >
>>> > Immediates are objects whose value is encoded directly in the pointer
>>> slot (4 or 8 bytes, depending on the 32-bit or 64-bit VM word size).
>>> > Currently, this covers only SmallInteger and Character, plus SmallFloat
>>> on 64-bit VMs.
>>> >
>>> > Here we have values encoded into slots of 1, 2, 4 or 8 bytes, but not
>>> into an object oriented pointer slot.
>>> > Technically, #(1 2.0 $3) is an Array of immediates, while ((ColorArray
>>> with: Color black) first) is not an immediate...
>>> > So even if it is the same notion of encoded value, it's not an exact
>>> match...
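To make the distinction concrete, here is a minimal sketch in Python rather than Smalltalk (an assumption made purely for illustration; none of these names come from the Collections package). A list stands in for a pointer Array, and the standard `array` module stands in for a fixed-width bits array: each slot holds the raw 8-byte encoding, and a value object is only materialized when an element is read.

```python
from array import array
import struct

# Pointer-style container: each slot refers to a separate float object.
boxed = [1.0, 2.0, 3.0]

# Fixed-width container: each slot holds the raw 8-byte IEEE 754 encoding.
packed = array("d", boxed)

raw = packed.tobytes()          # the contiguous raw bits, 3 slots of 8 bytes
slot_width = packed.itemsize    # bytes per element for typecode "d"

# Reading an element decodes the raw bits of one slot into a value object.
second = struct.unpack_from("=d", raw, 1 * slot_width)[0]
```

The container itself only knows the slot width; it is the typecode (the analogue of the concrete subclass) that decides how the bits are interpreted.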
>>> >
>>> > Concerning the use cases, I effectively want to use such bit arrays
>>> for fast data transfer.
>>> > For example, it is useful for FFI: I use this kind of array exclusively
>>> for Smallapack...
>>> > But it is also useful when reading big files in Matlab, National
>>> Instruments TDMS or HDF5 format.
>>> > It really helps to have all the possible flavours for common
>>> elementary types of values.
>>> > Otherwise, I have to use an intermediate ByteArray, or pointers to
>>> external heap via FFI (like I did in Smallapack).
>>> >
>>> > More often than not, the data transfer can handle offset and stride via
>>> BitBlt tricks (unless we have an odd layout).
>>> > This enables extracting a single "column" or block of data from a big
>>> file with a single copy.
>>> > I may need to extend BitBlt to cope with all the available bit-widths,
>>> not just 8 (byte) or 32 (word) though.
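The offset-and-stride idea can be sketched as follows. This is not BitBlt and not Squeak code: it is a small Python illustration using only the standard library, and the helper name `extract_column` and the record layout (time, voltage, current) are hypothetical.

```python
from array import array

def extract_column(buffer, typecode, offset, stride):
    """Copy every `stride`-th element of `buffer`, starting at element `offset`."""
    # Reinterpret the raw bytes as fixed-width slots of the given type.
    elements = memoryview(buffer).cast(typecode)
    # One strided slice selects the column; the result is a new packed array.
    return array(typecode, elements[offset::stride].tolist())

# Three records of (time, voltage, current), stored as interleaved 32-bit floats.
records = array("f", [0.0, 1.5, 0.25,
                      1.0, 2.5, 0.5,
                      2.0, 3.5, 0.75])

voltage = extract_column(records.tobytes(), "f", offset=1, stride=3)
```

Here the offset picks the field within a record and the stride is the record length in elements, so a single pass copies one "column" out of the interleaved data.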
>>> >
>>> > Also, those formats offer packed and contiguous memory layout which is
>>> an advantage too when dealing with large chunks of data.
>>> > Especially if we have vectorized primitives operating on the arrays.
>>> >
>>> > Also, creating non-immediate objects on the fly through #at: and
>>> #at:put: is very efficient if the VM has a generation scavenger, because
>>> those objects are generally short-lived.
>>> > Retaining all the pointers to a whole collection of non-immediate
>>> objects, by contrast, puts a lot of pressure on the garbage collector.
>>> >
>>> > The advantage diminishes somewhat with the advent of 64-bit VMs: most
>>> values can be immediates, so we have quasi-contiguous data with a few
>>> exceptions, and not so much GC pressure.
>>> > But still, the primitives can operate on raw bits, without having to
>>> handle the immediate tag or exceptional (non-immediate) values.
>>> >
>>> > As an anecdote, in the 90s, I started to experience some crashes in
>>> Objectworks/VisualWorks when handling large Arrays of Float.
>>> > The console would only report: *out of memory*.
>>> > With increasing processor speed, the memory was exhausted before the
>>> low space monitoring process had a chance to handle the situation.
>>> > I then decided to handle all my Arrays of Float (Double) through some
>>> UninterpretedBytes and ad-hoc primitives for at: at:put:
>>> > Since then, I have never gone back to pointer-oriented arrays: if we
>>> want Smalltalk to scale, we need those basic objects  :)
>>> >
>>> >
>>> > On Sun, 3 May 2020 at 06:50, K K Subbu <kksubbu.ml at gmail.com> wrote:
>>> > On 02/05/20 5:41 pm, commits at source.squeak.org wrote:
>>> > > Nicolas Cellier uploaded a new version of Collections to project The
>>> Inbox:
>>> > > http://source.squeak.org/inbox/Collections-nice.891.mcz
>>> > >
>>> > > ==================== Summary ====================
>>> > >
>>> > > Name: Collections-nice.891
>>> > > Author: nice
>>> > > Time: 2 May 2020, 7:40:45.298967 pm
>>> > > UUID: 08510be0-8293-6744-959d-c1d41bc13ae1
>>> > > Ancestors: Collections-nice.890
>>> > >
>>> > > Experimental - For discussion
>>> > >
>>> > > Group some (most) non-pointer collections under an abstract
>>> FixedBitWifthArray.
>>> > > I know, the name is hard to pronounce and thus ugly: it's open to
>>> discussion.
>>> > >
>>> > > This enables factorization of some methods, for example the trick
>>> for atAllPut:
>>> > > Also notice that most methods are shared between FloatArray and
>>> Float64Array.
>>> >
>>> > How about ImmediateWord/ImmediateObject and an ImmediateArray (an
>>> array
>>> > consisting only of Immediate elements)? It would be consistent with
>>> > isImmediateClass method.
>>> >
>>> > An object chunk could be checked at loading time to see if it needs to
>>> > be converted from immediate to pointers or vice versa. In the typical
>>> > case, this will be a nop. But if the image is moved to a different
>>> host
>>> > type (say from 64b to 32b or from x86 to ARM), then some immediate
>>> > numbers may be converted into pointers or vice versa. If this
>>> increases
>>> > loading time for large images, then the image may be saved locally.
>>> >
>>> > This is just a strawman. I haven't really thought through all its
>>> > implications.
>>> >
>>> > Regards .. Subbu
>>> >
>>> >
>>>
>>>
>>>
>>>