[Vm-dev] Primitive to crop a ByteArray?
leves at elte.hu
Fri Nov 9 02:21:34 UTC 2012
On Thu, 8 Nov 2012, Mariano Martinez Peck wrote:
> On Thu, Nov 8, 2012 at 4:48 PM, Mariano Martinez Peck
> <marianopeck at gmail.com> wrote:
>> On Thu, Nov 8, 2012 at 4:38 PM, Bert Freudenberg <bert at freudenbergs.de> wrote:
>>> On 2012-11-08, at 16:22, Mariano Martinez Peck <marianopeck at gmail.com> wrote:
>>>> Hi guys. I have the following scenario. I have a buffer (ByteArray)
>>>> that I pass by FFI to a function of size N. This function puts data in
>>>> the array and answers me the M number of bytes that it put. M <= N.
>>>> Finally, I need to copy the array of size N to the actual size M.
>>>> To do that, I am using #copyFrom:to:. If the ByteArray is large (which
>>>> could be the case), this function takes significant time because it
>>>> needs to allocate space for the new large resulting array. So...is
>>>> there a destructive primitive where I can "crop" the existing array,
>>>> modify its size field and mark the remaining bytes as "free space for
>>>> the heap".
>>>> Do we have a primitive for that?
>>> We do not have any primitive that changes an object's size.
>>> However, if your problem is indeed the time for allocating the new array, then maybe there should be a primitive for that? E.g. one that copies a portion of one array to a new object. This would avoid having to initialize the memory of the new array - this is what's taking time, otherwise allocation is normally constant in time.
>>> OTOH it seems unusual that you would have to use extremely large buffers where initialization time matters. E.g. if you were to use this for reading from a file or stream then it might make more sense to use many smaller buffers rather than a single huge one, no?
>> Hi Bert. I should have explained better. The library I am wrapping is
>> LZ4, a fast compressor: http://code.google.com/p/lz4/
>> The function to compress looks like this:
>> int LZ4_compress (const char* source, char* dest, int isize);
>> LZ4_compress() :
>> Compresses 'isize' bytes from 'source' into 'dest'.
>> Destination buffer must be already allocated,
>> and must be sized to handle worst case situations (input data not
>> compressible).
>> Worst case size evaluation is provided by macro LZ4_compressBound()
>> isize : is the input size. Max supported value is ~1.9GB
>> return : the number of bytes written in buffer dest
>> So, from the image side, I am doing (summarized):
>> dest := ByteArray new: (self funcCompressBound: aByteArray size) + 4.
>> bytesCompressedOrError := self funcCompress: aByteArray
>>     byteArrayDestination: dest
>>     isize: aByteArray size
>>     maxOutputSize: maxOutputSize.
>> compressed := dest copyFrom: 1 to: bytesCompressedOrError + 4.
>> But a normal scenario of compression is when the bytearray is indeed
>> quite large. The function requires that "dest" is already allocated.
>> And then I need to do the #copyFrom:to:, and that is what I was trying
>> to avoid.
> maybe there is something faster than #copyFrom:to: for this case?
Do you think #copyFrom:to: is slow because it shows up in the profiler? If
yes, then try running the same code without calling the compression
function to see how "slow" #copyFrom:to: really is.
Also, calling #funcCompressBound: is just silly. Reimplement it in
Smalltalk, it's easy and probably more efficient.
If it turns out that it's really #copyFrom:to: that's slow, then consider
compressing your data in smaller chunks (at most a few MiB each). In this
case you can even precalculate the upper bound.
P.S.: And please don't propose VM changes for such a marginal issue.
P.P.S.: I still doubt that using LZ4 will give you any benefit.
>>> - Bert -