[squeak-dev] Please consider Apache Integration of Services

henry henry at callistohouse.club
Fri Nov 10 22:11:06 UTC 2017


Note the second quote below: Storm can be implemented in any language. Hello, Smalltalk!

At the risk of throwing a rock in the pool, I must also acknowledge the unique offerings we have in Smalltalk spaces. The challenge, as I see it, is the lack of a coordinated effort to adopt the common interfaces used in industry. I would hold up the history of the Cryptography team as an example of how different folks came together to create a shared library. It has held up over time and has been ported to multiple Smalltalk environments. Using that as a reference, if the various cloud and BigData applications were seen as worthy of good integration, all the great data-manipulation and presentation tools that Smalltalk environments offer could finally be put on the table in corporate and advanced data-processing environments.

Alright, some of you may accept what I have been saying, while those still contemplating it will be curious about what work it entails. Allow me to present a few of the projects that work together under the Apache Software Foundation. They are truly leaders in BigData computing. The core architecture commonly used in BigData and Cloud is called the Lambda Architecture, consisting of a fault-tolerant event streaming source, such as Apache Kafka [1], a batch-processing pipe and NoSQL database coordinator, such as Apache Cassandra [2], and a real-time processing pipe, for example Apache Storm [3]. There are also analytics on the other side of storage, such as queries running from Cassandra through to Hadoop. What I would like to highlight is Apache Storm.

Apache Storm is quite simple in concept, though more complex to run on hardware. One thing to keep in mind is that the fault-tolerance requirement has forced both Kafka and Storm to be replication-centric in a decentralized way. They tend to use Apache ZooKeeper [4] to monitor progress through durable queues of data. Storm in particular is a way to consume streams of events, including the ability to join and filter them. Here are two blurbs: one about Storm's architecture, and the other about how other languages can implement pieces of a Storm architecture, especially bolts.

My question is: why can't, and why shouldn't, Smalltalks (Pharo, Squeak, GemStone, Smalltalk Express, Swift, ST/X, Dolphin) participate? Working software engineers are building this stuff, and lots of time and money are being spent converting critical data in industry. Now seems to be the time to jump aboard, as much computing will be done in these areas, and we can compete! Here are the blurbs about Storm, followed by links to Kafka [1], Cassandra [2] and Storm [3].

------
"There are just three abstractions in Storm: spouts, bolts, and topologies. A spout is a source of streams in a computation. Typically a spout reads from a queueing broker such as Kestrel, RabbitMQ, or Kafka, but a spout can also generate its own stream or read from somewhere like the Twitter streaming API. Spout implementations already exist for most queueing systems.

A bolt processes any number of input streams and produces any number of new output streams. Most of the logic of a computation goes into bolts, such as functions, filters, streaming joins, streaming aggregations, talking to databases, and so on.

A topology is a network of spouts and bolts, with each edge in the network representing a bolt subscribing to the output stream of some other spout or bolt. A topology is an arbitrarily complex multi-stage stream computation. Topologies run indefinitely when deployed."
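
To make the three abstractions quoted above concrete, here is a minimal single-process sketch in Python. It is not Storm's API: the `Topology` class, its method names, and the example spout and bolts are all hypothetical, and it ignores distribution, parallelism, and fault tolerance entirely. It only shows bolts subscribing to the output streams of spouts or other bolts.

```python
from collections import defaultdict


class Topology:
    """Toy, in-process stand-in for a topology (hypothetical, not Storm's API)."""

    def __init__(self):
        self.spouts = {}                      # name -> callable yielding tuples
        self.bolts = {}                       # name -> callable(tuple) yielding tuples
        self.subscribers = defaultdict(list)  # source name -> subscribed bolt names

    def add_spout(self, name, source):
        self.spouts[name] = source

    def add_bolt(self, name, func, inputs):
        # Each entry in `inputs` is an edge: this bolt subscribes to that component.
        self.bolts[name] = func
        for source_name in inputs:
            self.subscribers[source_name].append(name)

    def run(self):
        """Drain every spout once; return the tuples emitted by each component."""
        emitted = defaultdict(list)
        for name, source in self.spouts.items():
            for tup in source():
                self._deliver(name, tup, emitted)
        return dict(emitted)

    def _deliver(self, component, tup, emitted):
        # Record the tuple, then push it to every bolt subscribed to `component`.
        emitted[component].append(tup)
        for bolt_name in self.subscribers[component]:
            for out in self.bolts[bolt_name](tup):
                self._deliver(bolt_name, out, emitted)


def sentences():
    # Spout: a source of tuples (here a fixed list instead of a queueing broker).
    yield ("hello smalltalk",)
    yield ("hello storm",)


def split(tup):
    # Bolt: one input tuple in, several output tuples out.
    for word in tup[0].split():
        yield (word,)


def only_hello(tup):
    # Bolt: a filter that passes through matching tuples only.
    if tup[0] == "hello":
        yield tup


topology = Topology()
topology.add_spout("sentences", sentences)
topology.add_bolt("split", split, inputs=["sentences"])
topology.add_bolt("filter", only_hello, inputs=["split"])
result = topology.run()
```

A real topology runs indefinitely across a cluster; this sketch drains a finite spout once so the flow of tuples is easy to inspect in `result`.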

------
"Storm was designed from the ground up to be usable with any programming language. At the core of Storm is a [Thrift](http://thrift.apache.org/) [definition](https://github.com/apache/storm/blob/master/storm-core/src/storm.thrift) for defining and submitting topologies. Since Thrift can be used in any language, topologies can be defined and submitted from any language.

Similarly, spouts and bolts can be defined in any language. Non-JVM spouts and bolts communicate to Storm over a [JSON-based protocol](http://storm.apache.org/documentation/Multilang-protocol.html) over stdin/stdout. Adapters that implement this protocol exist for [Ruby](https://github.com/apache/storm/blob/master/storm-multilang/ruby/src/main/resources/resources/storm.rb), [Python](https://github.com/apache/storm/blob/master/storm-multilang/python/src/main/resources/resources/storm.py), [Javascript](https://github.com/apache/storm/blob/master/storm-multilang/javascript/src/main/resources/resources/storm.js), [Perl](https://github.com/dan-blanchard/io-storm)."
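
As a rough illustration of what such an adapter involves, here is a minimal Python sketch of a bolt speaking the JSON protocol, based on my reading of the multilang documentation linked above: every message is a JSON document followed by a line containing `end`, the component writes an empty PID file and replies with its PID during the handshake, and it answers tuples with `emit` and `ack` commands. The function names and the upper-casing logic are hypothetical, and error handling, logging, and failure reporting are left out. A Smalltalk adapter would need to cover the same steps.

```python
import io
import json
import os
import tempfile


def read_message(stream):
    """Read one JSON message; messages are terminated by a line containing 'end'."""
    lines = []
    while True:
        line = stream.readline()
        if not line:
            raise EOFError("input stream closed")
        if line.strip() == "end":
            return json.loads("".join(lines))
        lines.append(line)


def send_message(stream, message):
    """Write one JSON message followed by the 'end' terminator line."""
    stream.write(json.dumps(message) + "\nend\n")
    stream.flush()


def handshake(in_stream, out_stream):
    """Read the setup message, drop an empty PID file in pidDir, report our PID."""
    setup = read_message(in_stream)
    pid = os.getpid()
    open(os.path.join(setup["pidDir"], str(pid)), "w").close()
    send_message(out_stream, {"pid": pid})
    return setup


def run_bolt(in_stream, out_stream):
    """Hypothetical bolt: upper-case the first field of each tuple, then ack it."""
    handshake(in_stream, out_stream)
    while True:
        try:
            msg = read_message(in_stream)
        except EOFError:
            return
        if isinstance(msg, list):
            continue  # task ids sent back in response to an earlier emit
        if msg.get("stream") == "__heartbeat":
            send_message(out_stream, {"command": "sync"})
            continue
        send_message(out_stream, {"command": "emit",
                                  "anchors": [msg["id"]],
                                  "tuple": [msg["tuple"][0].upper()]})
        send_message(out_stream, {"command": "ack", "id": msg["id"]})


def demo():
    """Drive the bolt with in-memory streams standing in for stdin/stdout."""
    with tempfile.TemporaryDirectory() as pid_dir:
        incoming = [
            {"conf": {}, "pidDir": pid_dir, "context": {}},
            {"id": "1", "comp": "words", "stream": "default",
             "task": 3, "tuple": ["hello"]},
        ]
        in_stream = io.StringIO(
            "".join(json.dumps(m) + "\nend\n" for m in incoming))
        out_stream = io.StringIO()
        run_bolt(in_stream, out_stream)
        pid_file = os.path.join(pid_dir, str(os.getpid()))
        pid_file_written = os.path.exists(pid_file)
    out_stream.seek(0)
    replies = []
    while True:
        try:
            replies.append(read_message(out_stream))
        except EOFError:
            return pid_file_written, replies
```

Under Storm, the same bolt would be started as a subprocess and driven with `run_bolt(sys.stdin, sys.stdout)`; `demo()` only exists so the exchange can be inspected without a cluster.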

------

[1] https://kafka.apache.org/
[2] http://cassandra.apache.org/
[3] http://storm.apache.org/
[4] https://zookeeper.apache.org/

- HH