[Vm-dev] Primitive to crop a ByteArray?
bert at freudenbergs.de
Thu Nov 8 22:51:12 UTC 2012
On 08.11.2012, at 22:42, Mariano Martinez Peck <marianopeck at gmail.com> wrote:
> On Thu, Nov 8, 2012 at 4:48 PM, Mariano Martinez Peck
> <marianopeck at gmail.com> wrote:
>> On Thu, Nov 8, 2012 at 4:38 PM, Bert Freudenberg <bert at freudenbergs.de> wrote:
>>> On 2012-11-08, at 16:22, Mariano Martinez Peck <marianopeck at gmail.com> wrote:
>>>> Hi guys. I have the following scenario. I have a buffer (ByteArray)
>>>> of size N that I pass by FFI to a function. This function puts data in
>>>> the array and answers the number of bytes M that it wrote, with M <= N.
>>>> Finally, I need to copy the array of size N into one of the exact size M.
>>>> To do that, I am using #copyFrom:to:. If the ByteArray is large (which
>>>> could be the case), this function takes significant time because it
>>>> needs to allocate space for the new large resulting array. So... is
>>>> there a destructive primitive with which I can "crop" the existing array,
>>>> modify its size field, and mark the remaining bytes as "free space for
>>>> the heap"?
>>>> Do we have a primitive for that?
>>> We do not have any primitive that changes an object's size.
>>> However, if your problem is indeed the time for allocating the new array, then maybe there should be a primitive for that? E.g. one that copies a portion of one array to a new object. This would avoid having to initialize the memory of the new array - this is what's taking time, otherwise allocation is normally constant in time.
>>> OTOH it seems unusual that you would have to use extremely large buffers where initialization time matters. E.g. if you were to use this for reading from a file or stream then it might make more sense to use many smaller buffers rather than a single huge one, no?
>> Hi Bert. I should have explained better. The library I am wrapping is
>> LZ4, a fast compressor: http://code.google.com/p/lz4/
>> The function to compress looks like this:
>> int LZ4_compress (const char* source, char* dest, int isize);
>> LZ4_compress() :
>> Compresses 'isize' bytes from 'source' into 'dest'.
>> Destination buffer must be already allocated,
>> and must be sized to handle worst cases situations (input data not compressible)
>> Worst case size evaluation is provided by macro LZ4_compressBound()
>> isize : is the input size. Max supported value is ~1.9GB
>> return : the number of bytes written in buffer dest
>> So, from the image side, I am doing (summarized):
>> dest := ByteArray new: (self funcCompressBound: aByteArray size) + 4.
>> bytesCompressedOrError := self funcCompress: aByteArray
>>     byteArrayDestination: dest
>>     isize: aByteArray size
>>     maxOutputSize: maxOutputSize.
>> compressed := dest copyFrom: 1 to: bytesCompressedOrError + 4.
>> But a normal scenario of compression is when the bytearray is indeed
>> quite large. The function requires that "dest" is already allocated.
>> And then I need to do the #copyFrom:to:, and that is what I was trying
>> to avoid.
> Maybe there is something faster than #copyFrom:to: for this case?
Not yet. There was a discussion some time ago to add an "uninitialized allocation" primitive (i.e. a ByteArray/WordArray not initialized to 0) but I think we didn't follow up on that.
Are you actually keeping that huge array in memory? If not, e.g. you are sending it to disk or network, then just keeping a count around (like OrderedCollection does), or putting it into a read stream, would suffice.
- Bert -
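
The two suggestions above (keeping a count, or wrapping the buffer in a read stream) could be sketched roughly as follows. This is only an illustration, not code from the thread: `dest` and `count` are hypothetical stand-ins for the oversized FFI output buffer and the byte count returned by the C function, and the sketch assumes that `ReadStream class>>on:from:to:` streams over the original collection rather than a copy, which is worth checking in your own Squeak or Pharo image.

```smalltalk
"Hypothetical stand-ins: 'dest' plays the oversized FFI output buffer,
 'count' the number of valid bytes reported by the C function."
| dest count view sink |
dest := ByteArray new: 1024.
dest replaceFrom: 1 to: 5 with: #[10 20 30 40 50] startingAt: 1.
count := 5.

"Option 1: a read stream limited to the valid prefix of the buffer.
 Assumption: this does not copy the underlying ByteArray."
view := ReadStream on: dest from: 1 to: count.

"Option 2: keep only the count and write just the valid prefix to an
 output stream (here an in-memory stream stands in for a file or socket)."
sink := WriteStream on: (ByteArray new: 16).
sink next: count putAll: dest startingAt: 1.
```

Either way the large buffer is never cropped; consumers simply stop after `count` bytes, so no second large allocation is needed.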