[squeak-dev] Ask objects to group themselves by similar meanings of words....

Thiede, Christoph Christoph.Thiede at student.hpi.uni-potsdam.de
Sun Apr 5 11:35:26 UTC 2020


Sounds like an interesting project! A few years ago, I had a lot of fun with a somewhat similar project, setting up an object model for Latin grammar. It was really pleasant, and Latin is such a delightfully logical language! Unfortunately, I did not yet know Squeak/Smalltalk when I was working on that project ...


I wish you much joy and success with your project. Carpe Squicum! :-)


Best,

Christoph

________________________________
From: gettimothy <gettimothy at zoho.com>
Sent: Friday, 3 April 2020 22:08:19
To: Thiede, Christoph
Cc: squeak-dev
Subject: Re: AW: [squeak-dev] Ask objects to group themselves by similar meanings of words....


Hi Christoph

Thanks for the reply.

The "embeddings" is the "interesting idea" I was looking for. I was sort of hoping Squeak already had something like that. (:


What I am doing is teaching myself Latin (see, write, hear, say) via a Seaside app.

The imported objects will take care of the "see" part.
I can probably get them to import, or link to, how they sound via Wiktionary.

I would like the objects I imported to organize themselves in different ways. Alphabetical order is the obvious one, and the one the Wikipedia pages use.

I think having these things "cluster about meaning/concept" and "cluster about sound" would be a useful pedagogical approach.


For example:
Spherical things: ball, sun, planet, potato
Colors
Animals
Big things
Small things



When I import the medical roots (https://en.wikipedia.org/wiki/List_of_medical_roots,_suffixes_and_prefixes)
and, say, you are studying the skeletal system in anatomy, you would get the material clustered around bones.


Thanks for your reply.


---- On Fri, 03 Apr 2020 15:41:57 -0400 Thiede, Christoph <Christoph.Thiede at student.hpi.uni-potsdam.de> wrote ----


So if I understand you correctly, your question is not actually related to Squeak/Smalltalk at all, but rather to the general problem of comparing English words by their meaning? Off-topic, but still an interesting topic :)


I can only give you a few rough keywords, maybe one of them can help you, and maybe you were already ten steps ahead of me :-)


If you only care about similarity by letters, the simplest solution might be something like calculating the Longest Common Prefix of two strings and comparing the result with a threshold. (That term is googlable :)) However, this won't help you with pairs such as "acentric - acrocentric" unless you use some kind of fuzzy matching.
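
To make the prefix idea concrete, here is a minimal sketch in Python (chosen only because the idea is language-neutral; nothing here is Squeak-specific). The function names and the threshold of 4 are made up for illustration, and the fuzzy variant uses `difflib.SequenceMatcher` from the standard library as just one possible way to do fuzzy matching.

```python
from difflib import SequenceMatcher


def common_prefix_length(a: str, b: str) -> int:
    """Count how many leading characters the two strings share."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def similar_by_prefix(a: str, b: str, threshold: int = 4) -> bool:
    """Treat two words as similar if their shared prefix is long enough.

    The default threshold of 4 is an arbitrary, illustrative choice.
    """
    return common_prefix_length(a.lower(), b.lower()) >= threshold


def fuzzy_ratio(a: str, b: str) -> float:
    """Return a similarity score between 0.0 and 1.0 based on matching blocks."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


if __name__ == "__main__":
    # "abacus" and "abaciscus" share the prefix "abac".
    print(similar_by_prefix("abacus", "abaciscus"))
    # "acentric" and "acrocentric" share only "ac", so the prefix test fails,
    # while the fuzzy score still rates them as close.
    print(similar_by_prefix("acentric", "acrocentric"))
    print(fuzzy_ratio("acentric", "acrocentric"))
```

The prefix test is cheap but only sees the beginning of the word; the fuzzy score also picks up shared parts in the middle or at the end, at the cost of needing a second threshold to tune.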


If you actually care about semantic similarity, one approach could be a gigantic dictionary of synonyms. I'm sure there are many relevant databases on the web.

The problem with synonyms is that they can only compare words in a binary way: two words either are synonyms or they are not. But are "centrifugal" and "centripetal" actually synonyms? It totally depends on the perspective. Maybe you won't be happy with this approach.

A more sophisticated approach is word embeddings. The rough idea is to map each word to a large vector in which each component quantifies how strongly the word is related to a specific topic. Words with similar meanings then end up with similar vectors. There's a lot of research in this field ...
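
A toy sketch of the embedding idea, again in Python. Everything in it is an assumption for illustration: the three "topics" (roundness, animal-ness, size) and all the numbers are invented by hand, whereas real embeddings have hundreds of components learned from large text corpora. Cosine similarity is a common way to compare such vectors, though not the only one.

```python
import math

# Hand-made toy vectors, NOT real embeddings.
# Components (invented): [roundness, animal-ness, size]
EMBEDDINGS = {
    "ball":   [0.9, 0.0, 0.2],
    "sun":    [0.9, 0.0, 1.0],
    "planet": [0.9, 0.0, 0.9],
    "dog":    [0.1, 0.9, 0.3],
    "horse":  [0.0, 0.9, 0.6],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, k=2):
    """Return the k other words whose vectors are closest to the given word."""
    target = EMBEDDINGS[word]
    scored = [
        (cosine_similarity(target, vector), other)
        for other, vector in EMBEDDINGS.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]


if __name__ == "__main__":
    print(most_similar("sun"))       # the round, non-animal words rank first
    print(most_similar("dog", k=1))  # the other animal ranks first
```

Unlike a synonym dictionary, this gives a graded score for every pair of words, so the same data can be clustered along different dimensions.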


PS: What are you trying to do with these results, eventually? :-)


Best,

Christoph

________________________________
From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of gettimothy via Squeak-dev <squeak-dev at lists.squeakfoundation.org>
Sent: Friday, 3 April 2020 21:00:57
To: squeak-dev
Subject: [squeak-dev] Ask objects to group themselves by similar meanings of words....

Hi folks

I have extracted the various Greek and Latin roots from https://en.wikipedia.org/wiki/List_of_Greek_and_Latin_roots_in_English/A–G into Squeak objects.
Each object corresponds to one row in the tables at that link.

For example, I have one object for:

<tr>
<td><b>abac-</b><sup id="cite_ref-2" class="reference"><a href="#cite_note-2">[2]</a></sup></td>
<td>slab</td>
<td>Greek</td>
<td><span lang="grc"><a href="https://en.wiktionary.org/wiki/%E1%BC%84%CE%B2%CE%B1%CE%BE#Ancient_Greek" class="extiw" title="wikt:ἄβαξ">ἄβαξ, ἄβακος</a></span> (<span title="Ancient Greek transliteration" lang="grc-Latn"><i>ábax, ábakos</i></span>), <span lang="grc"><a href="https://en.wiktionary.org/wiki/%E1%BC%80%CE%B2%CE%B1%CE%BA%CE%AF%CF%83%CE%BA%CE%BF%CF%82#Ancient_Greek" class="extiw" title="wikt:ἀβακίσκος">ἀβακίσκος</a></span> (<span title="Ancient Greek transliteration" lang="grc-Latn"><i>abakískos</i></span>)</td>
<td>abaciscus, <a href="/wiki/Abacus" title="Abacus">abacus</a>, <a href="/wiki/Abax" class="mw-redirect" title="Abax">abax</a>
</td></tr>


The cells are put into accessors corresponding to the headers of the table:

Root, Meaning, Origin, Etymology, English examples.

MyObject
      root -> abac
      meaning -> slab
      language -> greek
      etymology -> blah
      examples -> more-blah


Focusing on the "English examples" column, I am interested in something like:

LatinRoots select:[:each | each english_examples  "have same or similar meanings"]

If anybody has pointers to projects that have grappled with that problem I would appreciate a link.

Answers like "Your question is completely nonsensical" are OK, too (:

Thanks for your time.






