FloatArray : puzzled about varying speed
Herbert König
herbertkoenig at gmx.net
Wed Feb 14 09:31:53 UTC 2007
Hello Folks,
I have a method which mainly calculates with FloatArray using *=, /=
and friends. It's mainly sums of products: (inputs * coefficients) sum.
When I call it the first time, it finishes in a second; the next time I
call it, it takes 8 seconds.
The puzzling thing is, I get it back to speed if I re-initialize the
coefficients to new random numbers. Inputs and coefficients are in the
0.0 to 1.0 range. This does not change during the calculations, as
after every recalculation I normalise the coefficients using:
coefficients /= coefficients sum.
I use some 800 samples as values and I have 100 Arrays of 150
coefficients. So 800*100*150 calculations.
The only thing I can think of is that the Arrays of 150 inputs are
mainly populated with zeroes, with only 1 to 10 nonzero values. So a lot of
the coefficients end up as '7.00649232162409e-45'.
Wikipedia says the closest-to-zero number in a 32-bit float is around
1e-38, so maybe the above numbers get special (and time-consuming)
treatment in the primitives?
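One way to check the premise of this question is to classify the reported value as a 32-bit float. The sketch below is written in Python rather than Squeak, and the helper names `to_float32` and `is_subnormal32` are made up for illustration. It only shows where the value sits in the IEEE 754 single-precision range; it does not show how the FloatArray primitives treat such values, so whether they cause the slowdown remains a hypothesis.

```python
import struct

FLT_MIN_NORMAL = 2.0 ** -126      # smallest positive normal float32, about 1.18e-38
FLT_MIN_SUBNORMAL = 2.0 ** -149   # smallest positive subnormal float32, about 1.4e-45


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def is_subnormal32(x):
    """True if x, stored as a 32-bit float, is nonzero but below the normal range."""
    y = abs(to_float32(x))
    return 0.0 < y < FLT_MIN_NORMAL


value = 7.00649232162409e-45                   # the coefficient quoted in the post
print(is_subnormal32(value))                   # True
print(to_float32(value) / FLT_MIN_SUBNORMAL)   # 5.0, i.e. value = 5 * 2**-149
print(is_subnormal32(1e-30))                   # False
```

So the quoted coefficient lies below the 1e-38 limit for normal numbers and is a subnormal (denormalized) value, only five steps above zero, while 1E-30 is still an ordinary normal float.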
I suspect this because if I add 1E-30 to every coefficient during
normalization, I get a continuous speed of 2 seconds.
But in this case I get a significant number of incremental GCs
(verified the GC percentage in all 3 cases).
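The effect of adding a small constant can be sketched outside Squeak as well. The Python below emulates 32-bit rounding with `struct`; the function name `normalize_with_floor`, the parameter `floor`, and the order of operations (add the constant first, then divide by the sum) are assumptions for illustration, not the actual normalizeCoefficients method.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def normalize_with_floor(coefficients, floor=1e-30):
    """Add a small constant to every coefficient, then divide by the sum.

    Each intermediate result is rounded to float32 to mimic a FloatArray.
    """
    shifted = [to_float32(c + floor) for c in coefficients]
    total = to_float32(sum(shifted))
    return [to_float32(c / total) for c in shifted]


# Hypothetical coefficient vector: two ordinary values, one subnormal, one zero.
coefficients = [0.5, 0.5, 7.00649232162409e-45, 0.0]
result = normalize_with_floor(coefficients)

# Every entry now stays at or above the smallest normal float32 (2**-126).
print(min(result) >= 2.0 ** -126)   # True
```

With this floor no coefficient can drop below roughly 1e-30, so none of them ends up in the subnormal range after normalization.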
Is this some bug to put on Mantis? Some other measure to take? I'm
completely at a loss here.
MessageTallys follow for all three cases.
Thanks
Herbert mailto:herbertkoenig at gmx.net
First call to MessageTally:
- 953 tallies, 953 msec.
**Tree**
100.0% {953ms} RingSofmTrainer(SofmTrainer)>>trainEpochVariableLearnRate
59.3% {565ms} RingSOFM(SelfOrganizingFeatureMap)>>learnOneStepAtVariableRate
|22.6% {215ms} Neuron>>sofmLearnAtLearnRate:
| |7.7% {73ms} FloatArray>>*=
| |6.2% {59ms} FloatArray>>+=
| |5.4% {51ms} FloatArray>>-
| | |4.1% {39ms} FloatArray>>-=
| |3.4% {32ms} primitives
|13.9% {132ms} Neuron>>normalizeCoefficients
| |7.8% {74ms} FloatArray>>/=
| |6.1% {58ms} primitives
|9.5% {91ms} OrderedCollection(SequenceableCollection)>>withIndexDo:
| |8.5% {81ms} primitives
|8.3% {79ms} Array(Collection)>>max
| |5.2% {50ms} Float(Magnitude)>>max:
| |2.9% {28ms} Array(Collection)>>inject:into:
| | 2.9% {28ms} Array(SequenceableCollection)>>do:
|4.2% {40ms} OrderedCollection>>at:
34.6% {330ms} RingSOFM(SelfOrganizingFeatureMap)>>calculateOutputs
|21.8% {208ms} Array(SequenceableCollection)>>withIndexDo:
|12.4% {118ms} Neuron>>calculateOutput
| 8.4% {80ms} FloatArray>>*
| |6.5% {62ms} FloatArray>>*=
| 4.0% {38ms} primitives
3.5% {33ms} OrderedCollection(Collection)>>remove:
3.5% {33ms} OrderedCollection>>remove:ifAbsent:
3.4% {32ms} primitives
**Leaves**
30.3% {289ms} Array(SequenceableCollection)>>withIndexDo:
14.2% {135ms} FloatArray>>*=
7.8% {74ms} FloatArray>>/=
6.2% {59ms} FloatArray>>+=
6.1% {58ms} Neuron>>normalizeCoefficients
5.9% {56ms} OrderedCollection>>at:
5.4% {51ms} Float(Magnitude)>>max:
4.1% {39ms} FloatArray>>-=
4.0% {38ms} Neuron>>calculateOutput
3.4% {32ms} OrderedCollection>>remove:ifAbsent:
3.4% {32ms} Neuron>>sofmLearnAtLearnRate:
2.9% {28ms} Array(SequenceableCollection)>>do:
**Memory**
old +0 bytes
young +161,996 bytes
used +161,996 bytes
free -161,996 bytes
**GCs**
full 0 totalling 0ms (0.0% uptime)
incr 155 totalling 39ms (4.0% uptime), avg 0.0ms
tenures 0
root table 0 overflows
********************************************************************
Second call to MessageTally:
- 8215 tallies, 8215 msec.
**Tree**
100.0% {8215ms} RingSofmTrainer(SofmTrainer)>>trainEpochVariableLearnRate
82.4% {6769ms} RingSOFM(SelfOrganizingFeatureMap)>>learnOneStepAtVariableRate
|35.2% {2892ms} Neuron>>sofmLearnAtLearnRate:
| |17.2% {1413ms} FloatArray>>+=
| |17.1% {1405ms} FloatArray>>*=
|28.9% {2374ms} Neuron>>normalizeCoefficients
| |19.6% {1610ms} primitives
| |9.3% {764ms} FloatArray>>/=
|16.7% {1372ms} OrderedCollection(SequenceableCollection)>>withIndexDo:
| 16.6% {1364ms} primitives
16.8% {1380ms} RingSOFM(SelfOrganizingFeatureMap)>>calculateOutputs
15.2% {1249ms} Array(SequenceableCollection)>>withIndexDo:
**Leaves**
31.8% {2612ms} OrderedCollection(SequenceableCollection)>>withIndexDo:
19.6% {1610ms} Neuron>>normalizeCoefficients
17.9% {1470ms} FloatArray>>*=
17.2% {1413ms} FloatArray>>+=
9.3% {764ms} FloatArray>>/=
**Memory**
old +0 bytes
young +169,864 bytes
used +169,864 bytes
free -169,864 bytes
**GCs**
full 0 totalling 0ms (0.0% uptime)
incr 203 totalling 54ms (1.0% uptime), avg 0.0ms
tenures 0
root table 0 overflows
***********************************************************************
nth call to MessageTally adding a small constant in normalisation:
- 2096 tallies, 2100 msec.
**Tree**
100.0% {2100ms} RingSofmTrainer(SofmTrainer)>>trainEpochVariableLearnRate
83.5% {1754ms} RingSOFM(SelfOrganizingFeatureMap)>>learnOneStepAtVariableRate
|57.2% {1201ms} Neuron>>normalizeCoefficients
| |51.5% {1082ms} FloatArray>>+=
| | |47.8% {1004ms} Fraction>>asFloat
| | | |39.3% {825ms} LargePositiveInteger(Integer)>>asFloat
| | | | |38.5% {809ms} primitives
| | | |3.5% {74ms} SmallInteger(Magnitude)>>max:
| | |3.7% {78ms} primitives
| |2.9% {61ms} primitives
| |2.8% {59ms} FloatArray>>/=
|10.5% {221ms} Neuron>>sofmLearnAtLearnRate:
| |3.6% {76ms} FloatArray>>-
| | |2.9% {61ms} FloatArray>>-=
| |3.2% {67ms} FloatArray>>*=
| |2.8% {59ms} FloatArray>>+=
|9.6% {202ms} OrderedCollection(SequenceableCollection)>>withIndexDo:
| |8.9% {187ms} primitives
|3.0% {63ms} Array(Collection)>>max
|2.4% {50ms} OrderedCollection>>at:
13.7% {288ms} RingSOFM(SelfOrganizingFeatureMap)>>calculateOutputs
8.6% {181ms} Array(SequenceableCollection)>>withIndexDo:
4.9% {103ms} Neuron>>calculateOutput
3.5% {74ms} FloatArray>>*
2.3% {48ms} FloatArray>>*=
**Leaves**
38.5% {809ms} LargePositiveInteger(Integer)>>asFloat
17.5% {368ms} Array(SequenceableCollection)>>withIndexDo:
6.5% {137ms} FloatArray>>+=
6.2% {130ms} SmallInteger(Magnitude)>>max:
5.6% {118ms} FloatArray>>*=
3.3% {69ms} OrderedCollection>>at:
2.9% {61ms} FloatArray>>-=
2.9% {61ms} Neuron>>normalizeCoefficients
2.8% {59ms} FloatArray>>/=
**Memory**
old +0 bytes
young +14,744 bytes
used +14,744 bytes
free -14,744 bytes
**GCs**
full 0 totalling 0ms (0.0% uptime)
incr 972 totalling 320ms (15.0% uptime), avg 0.0ms
tenures 0
root table 0 overflows