There has been interest in Magma's read and query performance lately, so I thought I would share the results of a recent benchmark test.
It's an actual application which does very heavy reading and writing to a Magma repository. This test performed 24GB of repository work over two days. My goal was to determine, once the persistent model had grown to many times the size of RAM and HD access appeared to become the limiting factor:
- how fast is Magma, in terms of "objects per second" and in "kilobytes per second"?
- how fast is this relative to the speed when the repository was empty?
Conclusion:
- Magma started at 4K objects per second (empty repository), 282K bytes per second.
- Finished with 2.3K objects per second (6GB repository), 750K bytes per second.
- Memory consumption by the image never exceeded 300MB.
It is important to note that these times are from the client MagmaSession point of view, including the full server roundtrip plus materialization. Also, as can be seen from the attached data, there were many requests which brought back only one or two objects; while this dramatically lowers the overall reported throughput, it is a real-world scenario for applications.
Verbose description:
When the test first started the repository was tiny, and there were only ~4K server requests during a 5-minute monitored period. The total number of objects read across all ~4K requests was ~17M, for an average throughput of 4K objects per second (ops). As the model grew in size and complexity, more and more objects were required during the 5-minute monitored intervals to perform application-posting of the same number of input records; the count doubled to ~8K server requests per 5-minute period, yet only a few more objects were brought back (21M total), for an average of 2500 ops. 32 hours after that, the read rate had dropped to about 2300 ops. This is due to two factors:
- the objects were larger (e.g., more pointers to other objects)
- they were less clustered (having been replaced with objects which could not be co-located with the original object)
The fact that objects got bigger (e.g., more pointers to other objects) is corroborated by another stat: the number of *bytes* per second read off the HD in order to access those objects. Initially, when the test first started, there were about 4K *requests* for objects in one of the first five-minute periods, read at a rate of 282K bytes per second. 24 hours later, there were only twice as many requests for objects, but they were read and materialized at a rate of 750K bytes per second.
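For readers who want to reproduce this kind of monitoring, the rates above are simple interval averages: a count accumulated over a fixed monitoring window divided by the window length. A minimal sketch (the function name and sample figures below are illustrative, not taken from Magma's instrumentation):

```python
# Illustrative only: deriving "per second" throughput figures from
# counts gathered over a fixed monitoring interval, as in the
# 5-minute monitored periods described above.

def throughput_per_second(total_count, interval_seconds):
    """Average per-second rate over one monitored interval."""
    return total_count / interval_seconds

# A 5-minute (300 s) interval in which 1.2 million objects were
# read and materialized averages 4,000 objects per second:
objects_per_second = throughput_per_second(1_200_000, 300)
print(objects_per_second)  # 4000.0

# The same calculation applies to bytes read off the HD; e.g.
# ~84.6 MB over the same interval averages ~282K bytes per second:
bytes_per_second = throughput_per_second(84_600_000, 300)
print(bytes_per_second)  # 282000.0
```

Note that, as described above, these are end-to-end client-side averages (server roundtrip plus materialization), so many small requests pull the figure down even when the server itself is not the bottleneck.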
- Chris
Hi Chris, what kind of application is that? At work I'm working on a ten-year-old production system with an Oracle DBMS, and it's 1 GB in size.
Thanks for sharing your analysis again, Facu
Magma mailing list Magma@lists.squeakfoundation.org http://lists.squeakfoundation.org/mailman/listinfo/magma
It's a proprietary data-aggregation tool that permutes every combination of data-attributes of an input-source at multiple levels. Even small inputs into the system can lead to very large output graphs. It's a signature test application for Magma: building and accessing a fairly large and complex data structure, and something I think would be very difficult to do, at least abstractly, with an RDBMS.
Thank you Chris, it's good to know that there is a real-world Magma application 24 GB in size.
See you, Facu
It's not a real-world Magma application. As I said, it was just an internal benchmark test.