msvcrt printf bug


Re: msvcrt printf bug

tei.andu
I fixed the bug in my float to string code. Here it is: http://pastebin.com/z9hYEWF1
Tested for all possible inputs against mingw sprintf with _XOPEN_SOURCE defined to 1. All input values match to the last decimal.
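A minimal sketch of that kind of exhaustive comparison harness, for reference; my_ftoa() here is a hypothetical stand-in for the converter under test (the pastebin code), and the %.9e reference format is only an assumption, since the original comparison used mingw sprintf with an unspecified format:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the float-to-string routine under test. */
void my_ftoa(float value, char *out, size_t outsize);

int main(void)
{
    char mine[64], ref[64];
    unsigned long mismatches = 0;

    /* Enumerate every one of the 2^32 single-precision bit patterns. */
    for (uint32_t bits = 0; ; bits++) {
        float x;
        memcpy(&x, &bits, sizeof x);

        my_ftoa(x, mine, sizeof mine);
        snprintf(ref, sizeof ref, "%.9e", x);      /* reference conversion */

        if (strcmp(mine, ref) != 0 && ++mismatches <= 10)
            printf("mismatch at 0x%08lx: %s vs %s\n",
                   (unsigned long)bits, mine, ref);

        if (bits == UINT32_MAX)
            break;
    }
    printf("%lu mismatches\n", mismatches);
    return 0;
}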


Re: msvcrt printf bug

Keith Marshall-3

On 19/01/17 16:24, [hidden email] wrote:
> Tested ... with _XOPEN_SOURCE defined to 1.

Aside: this is an odd choice for _XOPEN_SOURCE.  Typically, it
would be 400, 500, 600, or 700, depending on the level of POSIX.1
subset support you want; see include/_mingw.h for details.
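For instance, to request the POSIX.1-2008 (SUS v4) feature level, the macro has to be defined before the first system header is included:

/* Feature-test macros must precede all system #includes. */
#define _XOPEN_SOURCE 700   /* POSIX.1-2008 / SUS v4 feature level */

#include <stdio.h>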

--
Regards,
Keith.

Public key available from keys.gnupg.net
Key fingerprint: C19E C018 1547 DE50 E1D4 8F53 C0AD 36C6 347E 5A3F


Re: msvcrt printf bug

Emanuel Falkenauer
In reply to this post by Peter Rockett
Hi Peter,

On 19-Jan-17 12:35, Peter Rockett wrote:
> Guys
>
> At the risk of wading into a thread that has become very heated, could I
> drag this back to technical matters.

Excellent idea indeed.

> I think everybody (apart maybe from the OP) agrees how floating point
> numbers behave. Keith makes a good point about rounding. Can I toss in
> another feature that changing the compiler optimisation level often
> reorders instructions meaning that rounding errors accumulate in
> different ways. So changing the optimisation level often slightly
> changes the numerical answers. :-\

I agree that it could well (or even should?) be the case... but it's not
in my case - to my own pleasant surprise.

I can build with -O3 to get the most juice for releases, or with -O0 to
debug... my logs are still the same (spare for actual bugs). I even
compile with -mtune=native -march=native -mpopcnt -mmmx -mssse3 -msse4.2
(native: I'm on Xeons), although I doubt very much the Borland compiler
knows anything about those optimizations... and yet the latter's logs
are still identical to MinGW's.
Honestly it beats me as well... but I'm sure glad it's the case! :-)

> Emanuel - The one thing I cannot grasp is that you have built s/w with a
> range of toolchains, but you are very focussed on obtaining exactly the
> same numerical answers - seemingly to the level of false precision - for
> each build.

Argh... no, Peter: I am (and always was!) well aware of the inherent
IMprecision of floats - it is thoroughly managed in the algorithm and is
sufficient for that particular purpose (e.g. nobody cares whether a task
should be scheduled a millisecond sooner or later... when most of them
take hours!).

> I am struggling to see the reason for this, especially as
> you are talking about a stochastic (GA) algorithm. Why is this such a
> big issue for you? You mentioned you work in aerospace. Is this some
> sort of ultra safety conscious aerospace certification thing?

No - we optimize the scheduling of processes, not the internals of the
processes themselves. There are two reasons for my obsession with
consistency across compilers, as I explained a few posts back:
(1) we edit and _debug_ in Borland, for the simple reason that the
environment is frankly very good, so our productivity is excellent(*)...
but we actually release (ship) MinGW builds, because they are
dramatically faster than Borland's. Now in order to debug in Borland a
glitch spotted in our MinGW releases, we must be able to reproduce
exactly the same glitch in the former. On the other hand, given the
nature of what we do (GAs), and the fact that some glitches don't show
up before hours of computation... well I think you've got it already
(2) I found that it really is an excellent QC practice to make sure the
two builds behave exactly the same, because each time it was NOT the
case, there was a bug somewhere. Each and every time, no exceptions.

(*) Before sending me a barrage of complaints, please be aware that I do
have NetBeans... but (1) I find its editor simply not on par with
Borland's, and (2) trying to attach our DLL to debug, the list of PIDs
was just empty (if any of you has advice on how to solve that, I
would be grateful - because the raw gdb is really painful).

All the best,

Emanuel

> P.
>



Re: msvcrt printf bug

KHMan
On 1/20/2017 9:53 AM, Emanuel Falkenauer wrote:

> On 19-Jan-17 12:35, Peter Rockett wrote:
> [snip snip]
>> I think everybody (apart maybe from the OP) agrees how floating point
>> numbers behave. Keith makes a good point about rounding. Can I toss in
>> another feature that changing the compiler optimisation level often
>> reorders instructions meaning that rounding errors accumulate in
>> different ways. So changing the optimisation level often slightly
>> changes the numerical answers. :-\
>
> I agree that it could well (or even should?) be the case... but it's not
> in my case - to my own pleasant surprise.
>
> I can build with -O3 to get the most juice for releases, or with -O0 to
> debug... my logs are still the same (spare for actual bugs). I even
> compile with -mtune=native -march=native -mpopcnt -mmmx -mssse3 -msse4.2
> (native: I'm on Xeons), although I doubt very much the Borland compiler
> knows anything about those optimizations... and yet the latter's logs
> are still identical to MinGW's.
> Honestly it beats me as well... but I'm sure glad it's the case! :-)

Since nobody has filled this in...

In the past it has been pointed out to me that gcc by default
respects the possibility that (a+b != b+a) for floating point, so
it does not attempt those kinds of reordering. So one ought to get
consistent results from the FPU.

By default -O[0123s] is still math-strict [1]. -Ofast, a host of
other math options and some CPU-specific options break strictness
for more speed. If -O[0123s] misbehaves, I guess it should be
investigated as a potential bug.

Agner Fog's manual 1, section 8 [2] gives a table of optimizations
performed, including floating point optimizations and many
compilers including gcc and Borland. So it is likely that gcc did
all optimizations while being math-strict.

[1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
[2] http://www.agner.org/optimize/
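As a concrete illustration of the reordering point (an example of mine, not from the references): floating-point addition is not associative, which is exactly the kind of rewrite a value-safe compiler must not perform at the default -O levels.

#include <stdio.h>

int main(void)
{
    double a = 0.1, b = 0.2, c = 0.3;

    /* The two groupings round differently, so a compiler honouring
       IEEE semantics may not silently reassociate the sum. */
    printf("%.17g\n", (a + b) + c);   /* prints 0.60000000000000009 */
    printf("%.17g\n", a + (b + c));   /* prints 0.59999999999999998 */
    return 0;
}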

--
Cheers,
Kein-Hong Man (esq.)
Selangor, Malaysia



Re: msvcrt printf bug

Alan W. Irwin
In reply to this post by Emanuel Falkenauer
On 2017-01-20 02:53+0100 Emanuel Falkenauer wrote:

>> I think everybody (apart maybe from the OP) agrees how floating point
>> numbers behave. Keith makes a good point about rounding. Can I toss in
>> another feature that changing the compiler optimisation level often
>> reorders instructions meaning that rounding errors accumulate in
>> different ways. So changing the optimisation level often slightly
>> changes the numerical answers. :-\
>
> I agree that it could well (or even should?) be the case... but it's not
> in my case - to my own pleasant surprise.
>
> I can build with -O3 to get the most juice for releases, or with -O0 to
> debug... my logs are still the same (spare for actual bugs). I even
> compile with -mtune=native -march=native -mpopcnt -mmmx -mssse3 -msse4.2
> (native: I'm on Xeons), although I doubt very much the Borland compiler
> knows anything about those optimizations... and yet the latter's logs
> are still identical to MinGW's.
> Honestly it beats me as well... but I'm sure glad it's the case! :-)

Interesting thread.

@Emanuel:

Floating-point representations and calculations are exact for a
subset of floating point numbers (e.g., positive or negative powers of
two times an integer). But if you are using arbitrary floating-point
numbers in your calculations, then I frankly cannot understand your
result since time and again I have seen examples where compiler
optimization changes the answer in the 15th place (if you are really
lucky; it is often much worse than that if you have any
significance loss, which is a common problem if you are solving
systems of linear equations). Also consider a number similar to
0.4321499999999999999999999 which rounds to either 0.4321 or 0.4322
depending on minute floating-point errors.  If such numbers appear in
your log output even heavy rounding is not going to make your logs
agree for different optimization levels.  Perhaps your logs have no
arbitrary floating-point numbers in them and instead actually contain
exact floating-point answers (such as integers, half-integers, etc.)
or no floating-point decimal output at all?  For example, your log
could simply say that a certain category of answer was achieved
without giving exact details, and you should indeed get that same
result regardless of optimization level if you are well-protected
(e.g., make no logical decisions based on floating-point comparisons)
against floating-point errors.
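A small illustration of that rounding-boundary effect (my own example, not Alan's): two doubles that agree to roughly 15 significant digits can still round differently at four decimal places.

#include <stdio.h>

int main(void)
{
    /* These values agree to about 15 significant digits, but sit on
       opposite sides of the 0.43215 rounding boundary. */
    double just_below = 0.4321499999999999;
    double just_above = 0.4321500000000001;

    printf("%.4f\n", just_below);   /* 0.4321 */
    printf("%.4f\n", just_above);   /* 0.4322 */
    return 0;
}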

>
>> Emanuel - The one thing I cannot grasp is that you have built s/w with a
>> range of toolchains, but you are very focussed on obtaining exactly the
>> same numerical answers - seemingly to the level of false precision - for
>> each build.

@ Everybody:

Here I have to agree with Emanuel that sometimes such a result is
desirable for testing purposes.  One example of this I have run into
is the positions and velocities of the planets (planetary ephemerides)
that are distributed by JPL in both binary and ascii forms.  So when
converting between the two forms for debugging purposes you would like
to start with the binary form (64-bit double precision IEEE floating
point numbers) and be able to convert to ascii form and back again
with no bit flips at all in those binary results.  It turns out that
with gcc and the C library on x86_64 Linux hardware
(where intermediate floating-point results are stored in 80-bit
registers), this result was indeed obtained. Apparently the C library
converted the binary format to ascii decimal format with sufficient
additional (likely 80-bit) precision so that the result was exact to
something like 20 places.  And if my ascii representation included
those ascii guard digits that was sufficient so the result could be
converted back exactly to the 64-bit floating-point representation. I
could also take the ascii result distributed by JPL (which apparently also had
sufficient guard digits likely because they were using x86_64 hardware
with a decent C library that took advantage of that 80-bit
floating-point precision) and convert those results to their
distributed binary form exactly.  So that was a very nice round-trip
test result for extremely large masses of floating-point numbers.
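The round trip Alan describes can be sketched in a few lines, assuming the C library's printf and strtod conversions are correctly rounded (which is the very property under discussion in this thread); 17 significant decimal digits are always enough to recover an IEEE binary64 value exactly:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Print a double with 17 significant digits, read it back, and report
   whether the bit pattern survived the round trip. */
static int roundtrips(double x)
{
    char buf[64];
    snprintf(buf, sizeof buf, "%.17g", x);
    double y = strtod(buf, NULL);
    return memcmp(&x, &y, sizeof x) == 0;
}

int main(void)
{
    printf("%d\n", roundtrips(1.0 / 3.0));          /* expect 1 */
    printf("%d\n", roundtrips(2.718281828459045));  /* expect 1 */
    return 0;
}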

By the way, I tried the same round-trip binary to ascii to binary
ephemeris test using MinGW gcc on Wine, and the upshot was I
discovered a bug (#28422) in the scanf family of functions implemented
for Wine (they were actually using 32-bit floating-point numbers for
the conversion at the time so it was a significant bug) that they have
subsequently fixed.  And after that Wine fix my round-trip test worked
for that platform as well.

In sum, if you have some scanf-type conversion from ascii to binary
representation of floating point or some printf-type conversion from
binary to ascii representation of floating point that is not done
using the maximum possible precision for the hardware, types like
Emanuel and me who are keen on testing will come back to haunt you!
:-)

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________


Re: msvcrt printf bug

Emanuel Falkenauer
In reply to this post by KHMan
Hi KHMan,

Thank you for your well documented explanation - it shows how it could
be that my "stubbornness" is actually grounded in reality. :-)  Gosh,
good Lord I never even knew about -Ofast! :-D
And I guess kudos also to the IEEE: these "standards bodies" are often
seen as stifling creativity, but in this case they're just invaluable:
I'd hate our clients reporting different results just because they'd be
running on AMDs!

All the best,

Emanuel

P.S. Your "(a+b != b+a) for floating point" example is excellent: that's
indeed the kind of "gymnastics" we often do to manage the floats'
inherent IMprecision.

On 20-Jan-17 04:34, KHMan wrote:

> On 1/20/2017 9:53 AM, Emanuel Falkenauer wrote:
>> On 19-Jan-17 12:35, Peter Rockett wrote:
>> [snip snip]
>>> I think everybody (apart maybe from the OP) agrees how floating point
>>> numbers behave. Keith makes a good point about rounding. Can I toss in
>>> another feature that changing the compiler optimisation level often
>>> reorders instructions meaning that rounding errors accumulate in
>>> different ways. So changing the optimisation level often slightly
>>> changes the numerical answers. :-\
>> I agree that it could well (or even should?) be the case... but it's not
>> in my case - to my own pleasant surprise.
>>
>> I can build with -O3 to get the most juice for releases, or with -O0 to
>> debug... my logs are still the same (spare for actual bugs). I even
>> compile with -mtune=native -march=native -mpopcnt -mmmx -mssse3 -msse4.2
>> (native: I'm on Xeons), although I doubt very much the Borland compiler
>> knows anything about those optimizations... and yet the latter's logs
>> are still identical to MinGW's.
>> Honestly it beats me as well... but I'm sure glad it's the case! :-)
> Since nobody has filled this in...
>
> In the past it has been pointed out to me that gcc by default
> respects the possibility that (a+b != b+a) for floating point, so
> it does not attempt those kinds of reordering. So one ought to get
> consistent results from the FPU.
>
> By default -O[0123s] is still math-strict [1]. -Ofast, a host of
> other math options and some CPU-specific options break strictness
> for more speed. If -O[0123s] misbehaves, I guess it should be
> investigated as a potential bug.
>
> Agner Fog's manual 1, section 8 [2] gives a table of optimizations
> performed, including floating point optimizations and many
> compilers including gcc and Borland. So it is likely that gcc did
> all optimizations while being math-strict.
>
> [1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
> [2] http://www.agner.org/optimize/
>



Re: [OT] comparing floating point results of different compilers

Emanuel Falkenauer
In reply to this post by Tuomo Latto-3
Hi Tuomo,

Extremely well put: that's exactly what we do (i.e. checking the RESULTS).

Cheers,

Emanuel


On 18-01-17 10:47, Tuomo Latto wrote:

> On 18.01.2017 10:48, Alberto Luaces wrote:
>> Emanuel Falkenauer writes:
>>
>>> Now to debug in Embarcadero a problem found in a MinGW build, we must
>>> be sure that the two versions behave exactly the same, and it's anyway
>>> a good QC practice to make sure that that is indeed the case. In order
>>> to do that, we have developed a whole tracing system where one version
>>> writes the details of its progress into a log and the other version
>>> then reads it to spot any differences in behavior. The system already
>>> saved us years in debugging... but floats have always been a problem:
>>> Embarcadero and MinGW don't (s)print(f) them the same! The last digit
>>> in the log is different so often, that many times the logging becomes
>>> largely useless: millions of BOGUS differences show up.  Usually we
>>> resign ourselves to use sprintf("%.4f",...) or such, and skip by hand
>>> what are clearly bogus "differences" remaining - but that always
>>> leaves the uncertainty that the differences COULD actually be real,
>>> i.e. that the exact float values are really different in the two
>>> builds.
>> Honest question: how can you compare the outputs of a program compiled
>> by two different compilers?  How can you make sure that all the
>> computations are carried in the same order?  That cannot be done even
>> when using different optimization flags on the same compiler.
> But the fact that programs produced by different compilers work the same
> way, produce the same effects and, in general, even work at all,
> proves that they are the same to the relevant degree. Remember, we are
> not talking about the exact choice or ordering of CPU instructions
> for a single task, but are, in fact, talking about the results of those
> instructions, the completion of that task.
> Logging isn't a separate functionality but integrated to the program
> and thus taken into account in the analysis of the data flow.
> If a compiler optimizes results away or compromises their integrity
> or precision without being asked to, it is faulty.
>
> The way a compiler knows whether a value in a variable is a "result"
> or not is by looking if (and how) the data is being used - in this case
> probably by if it is taken outside the FPU (memory, GP registers, etc.).
> If it is not, it's an intermediate value that can be discarded once
> not needed, optimized away, or, if aggressively optimized, replaced
> by a more efficient way. On the other hand if it is used, then it is
> a result required in the program that the compiler aims to create,
> a target to satisfy, even if it optimizes how it reaches that target.
>
>> Are you by chance comparing the result of an iterative process that
>> should converge to a final value?
> The ftoa calculation is such a process that could be useful to have it
> provide the same answers when compared across compilers, like for Emanuel.
> However, the input float values for the process should still be equal
> up to some precision for all environments, which is what he is trying to
> verify by comparing the logs.
>
>



Re: msvcrt printf bug

Emanuel Falkenauer
In reply to this post by Alan W. Irwin
Hello Alan,

On 20-01-17 04:55, Alan W. Irwin wrote:
> Interesting thread.

Glad you enjoy it. To be very honest, I never expected it to be
blown into such "interstellar" proportions... but, well, there ARE some
pretty fundamental computational issues at stake.

> @Emanuel:
>
> Floating-point representations and calculations are exact for a
> subset of floating point numbers (e.g., positive or negative powers of
> two times an integer). But if you are using arbitrary floating-point
> numbers in your calculations, then I frankly cannot understand your
> result since time and again I have seen examples where compiler
> optimization changes the answer in the 15 place (if you are really
> lucky and it is often much worse than that if you have any
> significance loss which is a common problem if you are solving
> solutions of linear equations). Also consider a number similar to
> 0.4321499999999999999999999 which rounds to either 0.4321 or 0.4322
> depending on minute floating-point errors.  If such numbers appear in
> your log output even heavy rounding is not going to make your logs
> agree for different optimization levels.  Perhaps your logs have no
> arbitrary floating-point numbers in them and instead actually contain
> exact floating-point answers (such as integers, half-integers, etc.)
> or no floating-point decimal output at all?  For example, your log
> could simply say that a certain category of answer was achieved
> without giving exact details, and you should indeed get that same
> result regardless of optimization level if you are well-protected
> (e.g., make no logical decisions based on floating-point comparisons)
> against floating-point errors.

I do see what you're getting at, and frankly I grappled with those same
issues myself as well (huh: about 20 years ago). But I realized one
FUNDAMENTAL thing: the "minute floating-point errors" you refer to are
NOT RANDOM - they're products of well-defined computations done by the
FPU, which these days is normally IEEE-compliant, at least as basic
operations are concerned (I'm not talking about sqrts and such, which
might well be proprietary). Ergo they MUST yield EXACTLY the same
results (bit-wise), or the IEEE standard is useless. I mean... it's not
as if a FPU had a Schrödinger cat embedded!
Yes, sometimes the results are a bit surprising (e.g. your
0.4321499999999999999999999 instead of 0.43215)... but that doesn't
bother me a bit AS LONG AS the same "surprise" is produced by every
build I produce with whatever toolchain (MinGW, Borland... I even made
our code bitwise-compatible with MS Visual C++! [To no avail, btw: it's
slightly SLOWER than the excellent MinGW craft.]).

Sure, those "surprising" results need to be managed (which we do in our
algorithms) - but apart from that, the behavior should be EXACTLY the
same across toolchains - and KHMan's latest post confirms that
conviction. Ergo it's perfectly reasonable to print those values in my
comparison logs - AS LONG AS the same binary float is always printed the
same decimal way. That was an annoying problem for us... until now.
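One way to sidestep decimal-formatting differences in such comparison logs altogether (a sketch of an alternative, not something proposed in the thread) is to log the C99 %a hexadecimal form, which a conforming printf renders exactly from the stored bits:

#include <stdio.h>

int main(void)
{
    double x = 0.1;

    /* %a shows the exact binary value in hexadecimal floating form, so
       two builds holding the same bits log the same string. */
    printf("%a\n", x);      /* e.g. 0x1.999999999999ap-4 */
    printf("%.17g\n", x);   /* decimal form: 0.10000000000000001 */
    return 0;
}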

>
>>> Emanuel - The one thing I cannot grasp is that you have built s/w with a
>>> range of toolchains, but you are very focussed on obtaining exactly the
>>> same numerical answers - seemingly to the level of false precision - for
>>> each build.
> @ Everybody:
>
> Here I have to agree with Emanual that sometimes such a result is
> desireable for testing purposes.  One example of this I have run into
> is the positions and velocities of the planets (planetary ephemerides)
> that are distributed by JPL in both binary and ascii forms.  So when
> converting between the two forms for debugging purposes you would like
> to start with the binary form (64-bit double precision IEEE floating
> point numbers) and be able to convert to ascii form and back again
> with no bit flips at all in those binary results.  It turns out that
> with gcc on Linux and the Linux C library that with x86_64 hardware
> (where intermediate floating-point results are stored in 80-bit
> registers) that this result was obtained. Apparently the C library
> converted the binary format to ascii decimal format with sufficient
> additional (likely 80-bit) precision so that the result was exact to
> something like 20 places.  And if my ascii representation included
> those ascii guard digits that was sufficient so the result could be
> converted back exactly to the 64-bit floating-point representation. I
> could also take the ascii result distributed by JPL (which apparently also had
> sufficient guard digits likely because they were using x86_64 hardware
> with a decent C library that took advantage of that 80-bit
> floating-point precision) and convert those results to their
> distributed binary form exactly.  So that was a very nice round-trip
> test result for extremely large masses of floating-point numbers.
>
> By the way, I tried the same round-trip binary to ascii to binary
> ephemeris test using MinGW gcc on Wine, and the upshot was I
> discovered a bug (#28422) in the scanf family of functions implemented
> for Wine (they were actually using 32-bit floating-point numbers for
> the conversion at the time so it was a significant bug) that they have
> subsequently fixed.  And after that Wine fix my round-trip test worked
> for that platform as well.

Extra-cool! :-)

> In sum, if you have some scanf-type conversion from ascii to binary
> representation of floating point or some printf-type conversion from
> binary to ascii representation of floating point that is not done
> using the maximum possible precision for the hardware, types like
> Emanual and me who are keen on testing will come back to haunt you!
> :-)

Oh yeah: haunting indeed!  :-)

Cheers,

Emanuel


> Alan
> __________________________
> Alan W. Irwin
>
> Astronomical research affiliation with Department of Physics and Astronomy,
> University of Victoria (astrowww.phys.uvic.ca).


Re: msvcrt printf bug

KHMan
In reply to this post by Emanuel Falkenauer
On 1/20/2017 12:07 PM, Emanuel Falkenauer wrote:

> Hi KHMan,
>
> Thank you for your well documented explanation - it shows how it could
> be that my "stubbornness" is actually grounded in reality. :-)  Gosh,
> good Lord I never even knew about -Ofast! :-D
> And I guess kudos also to the IEEE: these "standards bodies" are often
> seen as stifling creativity, but in this case they're just invaluable:
> I'd hate our clients reporting different results just because they'd be
> running on AMDs!
>
> All the best,
>
> Emanuel
>
> P.S. Your "(a+b != b+a) for floating point" example is excellent: that's
> indeed the kind of "gymnastics" we often do to manage the floats'
> inherent IMprecision.

My memory failed me and I think I should correct the above. IIRC I
tripped on a+(b+c) != (a+b)+c. I remembered well that I blundered,
but did not recall correctly what I blundered on... It popped out
of my slow neurons about an hour after I posted -- fast memory
retrieval is not reliable. I'm not sure though what the IEEE
standards say about a+b and b+a equivalence, if any.

In any case, Agner Fog is a much better point of reference for
these sorts of things...

> On 20-Jan-17 04:34, KHMan wrote:
>> On 1/20/2017 9:53 AM, Emanuel Falkenauer wrote:
>>> On 19-Jan-17 12:35, Peter Rockett wrote:
>>> [snip snip]
>>>> I think everybody (apart maybe from the OP) agrees how floating point
>>>> numbers behave. Keith makes a good point about rounding. Can I toss in
>>>> another feature that changing the compiler optimisation level often
>>>> reorders instructions meaning that rounding errors accumulate in
>>>> different ways. So changing the optimisation level often slightly
>>>> changes the numerical answers. :-\
>>> I agree that it could well (or even should?) be the case... but it's not
>>> in my case - to my own pleasant surprise.
>>>
>>> I can build with -O3 to get the most juice for releases, or with -O0 to
>>> debug... my logs are still the same (spare for actual bugs). I even
>>> compile with -mtune=native -march=native -mpopcnt -mmmx -mssse3 -msse4.2
>>> (native: I'm on Xeons), although I doubt very much the Borland compiler
>>> knows anything about those optimizations... and yet the latter's logs
>>> are still identical to MinGW's.
>>> Honestly it beats me as well... but I'm sure glad it's the case! :-)
>> Since nobody has filled this in...
>>
>> In the past it has been pointed out to me that gcc by default
>> respects the possibility that (a+b != b+a) for floating point, so
>> it does not attempt those kinds of reordering. So one ought to get
>> consistent results from the FPU.
>>
>> By default -O[0123s] is still math-strict [1]. -Ofast, a host of
>> other math options and some CPU-specific options break strictness
>> for more speed. If -O[0123s] misbehaves, I guess it should be
>> investigated as a potential bug.
>>
>> Agner Fog's manual 1, section 8 [2] gives a table of optimizations
>> performed, including floating point optimizations and many
>> compilers including gcc and Borland. So it is likely that gcc did
>> all optimizations while being math-strict.
>>
>> [1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>> [2] http://www.agner.org/optimize/
>>


--
Cheers,
Kein-Hong Man (esq.)
Selangor, Malaysia



Re: msvcrt printf bug

Emanuel Falkenauer
Hi again, KHMan,

On 20-Jan-17 09:20, KHMan wrote:

> On 1/20/2017 12:07 PM, Emanuel Falkenauer wrote:
>> Hi KHMan,
>>
>> Thank you for your well documented explanation - it shows how it could
>> be that my "stubbornness" is actually grounded in reality. :-)  Gosh,
>> good Lord I never even knew about -Ofast! :-D
>> And I guess kudos also to the IEEE: these "standards bodies" are often
>> seen as stifling creativity, but in this case they're just invaluable:
>> I'd hate our clients reporting different results just because they'd be
>> running on AMDs!
>>
>> All the best,
>>
>> Emanuel
>>
>> P.S. Your "(a+b != b+a) for floating point" example is excellent: that's
>> indeed the kind of "gymnastics" we often do to manage the floats'
>> inherent IMprecision.
> My memory failed me and I think I should correct the above. IIRC I
> tripped on a+(b+c) != (a+b)+c. I remembered well that I blundered,
> but did not recall correctly what I blundered on... It popped out
> of my slow neurons about an hour after I posted -- fast memory
> retrieval is not reliable. I'm not sure though what the IEEE
> standards say about a+b and b+a equivalence, if any.
>
> In any case, Agner Fog is a much better point of reference for
> these sort of things...

Don't worry, I'm sure everybody "in the know" understood: you need to
first add the small values so their SUM actually counts in the large
one, rather than adding them to the large one one by one only to be
rounded out without a trace... ;-)
I'm pretty sure IEEE says NOTHING about a+b v. b+a... but it sure (i.e.
hopefully!!) DOES standardize what should be the outcome in each case -
and that's all I need.
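A small worked example of that "sum the small values first" point (my sketch, not Emanuel's code):

#include <stdio.h>

int main(void)
{
    double big = 1.0e16;   /* adjacent doubles here are 2.0 apart */

    /* Added one at a time, each 1.0 is rounded away without a trace. */
    double s1 = big;
    for (int i = 0; i < 16; i++)
        s1 += 1.0;

    /* Summed first, the small values survive as a single contribution. */
    double s2 = 0.0;
    for (int i = 0; i < 16; i++)
        s2 += 1.0;
    s2 += big;

    /* Prints 10000000000000000.0 then 10000000000000016.0 */
    printf("%.1f\n%.1f\n", s1, s2);
    return 0;
}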

Best,

Emanuel

>
>> On 20-Jan-17 04:34, KHMan wrote:
>>> On 1/20/2017 9:53 AM, Emanuel Falkenauer wrote:
>>>> On 19-Jan-17 12:35, Peter Rockett wrote:
>>>> [snip snip]
>>>>> I think everybody (apart maybe from the OP) agrees how floating point
>>>>> numbers behave. Keith makes a good point about rounding. Can I toss in
>>>>> another feature that changing the compiler optimisation level often
>>>>> reorders instructions meaning that rounding errors accumulate in
>>>>> different ways. So changing the optimisation level often slightly
>>>>> changes the numerical answers. :-\
>>>> I agree that it could well (or even should?) be the case... but it's not
>>>> in my case - to my own pleasant surprise.
>>>>
>>>> I can build with -O3 to get the most juice for releases, or with -O0 to
>>>> debug... my logs are still the same (spare for actual bugs). I even
>>>> compile with -mtune=native -march=native -mpopcnt -mmmx -mssse3 -msse4.2
>>>> (native: I'm on Xeons), although I doubt very much the Borland compiler
>>>> knows anything about those optimizations... and yet the latter's logs
>>>> are still identical to MinGW's.
>>>> Honestly it beats me as well... but I'm sure glad it's the case! :-)
>>> Since nobody has filled this in...
>>>
>>> In the past it has been pointed out to me that gcc by default
>>> respects the possibility that (a+b != b+a) for floating point, so
>>> it does not attempt those kinds of reordering. So one ought to get
>>> consistent results from the FPU.
>>>
>>> By default -O[0123s] is still math-strict [1]. -Ofast, a host of
>>> other math options and some CPU-specific options break strictness
>>> for more speed. If -O[0123s] misbehaves, I guess it should be
>>> investigated as a potential bug.
>>>
>>> Agner Fog's manual 1, section 8 [2] gives a table of optimizations
>>> performed, including floating point optimizations and many
>>> compilers including gcc and Borland. So it is likely that gcc did
>>> all optimizations while being math-strict.
>>>
>>> [1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>>> [2] http://www.agner.org/optimize/
>>>
>



Re: msvcrt printf bug

Alan W. Irwin
In reply to this post by Emanuel Falkenauer
On 2017-01-20 08:17+0100 Emanuel Falkenauer wrote:

I said:

>> @Emanuel:
>>
>> Floating-point representations and calculations are exact for a
>> subset of floating point numbers (e.g., positive or negative powers of
>> two times an integer). But if you are using arbitrary floating-point
>> numbers in your calculations, then I frankly cannot understand your
>> result since time and again I have seen examples where compiler
>> optimization changes the answer in the 15 place (if you are really
>> lucky and it is often much worse than that if you have any
>> significance loss which is a common problem if you are solving
>> solutions of linear equations). Also consider a number similar to
>> 0.4321499999999999999999999 which rounds to either 0.4321 or 0.4322
>> depending on minute floating-point errors.  If such numbers appear in
>> your log output even heavy rounding is not going to make your logs
>> agree for different optimization levels.  Perhaps your logs have no
>> arbitrary floating-point numbers in them and instead actually contain
>> exact floating-point answers (such as integers, half-integers, etc.)
>> or no floating-point decimal output at all?  For example, your log
>> could simply say that a certain category of answer was achieved
>> without giving exact details, and you should indeed get that same
>> result regardless of optimization level if you are well-protected
>> (e.g., make no logical decisions based on floating-point comparisons)
>> against floating-point errors.
Emanuel replied:

>
> I do see what you're getting at, and frankly I grappled with those same
> issues myself as well (huh: about 20 years ago). But I realized one
> FUNDAMENTAL thing: the "minute floating-point errors" you refer to are
> NOT RANDOM - they're products of well-defined computations done by the
> FPU, which these days is normally IEEE-compliant, at least as basic
> operations are concerned (I'm not talking about sqrts and such, which
> might well be proprietary). Ergo they MUST yield EXACTLY the same
> results (bit-wise), or the IEEE standard is useless. I mean... it's not
> as if a FPU had a Schrödinger cat embedded!
> Yes, sometimes the results are a bit surprising (e.g. your
> 0.4321499999999999999999999 instead of 0.43215)... but that doesn't
> bother me a bit AS LONG AS the same "surprise" is produced by every
> build I produce with whatever toolchain (MinGW, Borland... I even made
> our code bitwise-compatible with MS Visual C++! [To no avail, btw: it's
> slightly SLOWER than the excellent MinGW craft.]).
>
> Sure, those "surprising" results need to be managed (which we do in our
> algorithms) - but apart from that, the behavior should be EXACTLY the
> same across toolchains - and the latest KHMan's post confirms that
> conviction. Ergo it's perfectly reasonable to print those values in my
> comparison logs - AS LONG AS the same binary float is always printed the
> same decimal way. That was an annoying problem for us... until now.

Hi Emanuel:

I took KHMan's post with a bit of a "grain of salt" because those
references did not say explicitly (although they could be interpreted
that way) that you are guaranteed the same floating-point result with
Linux gcc regardless of -O optimization level.  And historically the
PLplot results were a definite counter-example to that idea.  There,
our PostScript 64-bit floating-point result for plots (heavily rounded
to 4 places, hence my example above) did differ (but only
occasionally) in the last of those 4 places from one optimization
level to another, and at that time I ascribed that result to the above
model of the (rare) propagation of small errors in the ~15th figure to
heavily rounded results.

But since you don't seem to be encountering such issues yourself with
modern Linux gcc, I tried the experiment of generating PLplot results
for all our C examples with our PostScript device with the Linux gcc
(Debian 4.9.2-10) compiler using both -O0 and -O3, and indeed the
results are identical (for 33 standard plplot examples representing
several hundred pages of plots so it is an extensive test)!  So it
appears the conclusion must be that floating-point calculations are
done in much more consistent ways now with gcc for both your case and
this recent PLplot test, and I intend to double-check that
(unexpected for me) conclusion for the timeephem and FreeEOS projects
where results are not so heavily rounded as those of PLplot.

However, for whatever reason (e.g., everyone might now be using the
same IEEE floating-point calculation algorithms) assuming this
floating-point consistency is now true for most compilers (so long as
no optimization is done other than the normal -O0 through -O3
options), that is an important floating-point breakthrough that would
make comparisons of results from one compiler to others _much_ simpler
(as you have already discovered for your software project).
Furthermore, if care was used to not gratuitously mess with the order
of computations from one software release to the next, this
breakthrough should also simplify consistency testing of results for
various software versions.

Finally, to drag this post back on topic for this list, I haven't paid
much attention to earlier posts in this thread, but assuming from your
subject line this floating-point consistency is not currently
available for any unpatched official release of the MinGW version of
gcc, this inconsistency is something I would strongly encourage the
MinGW developers to address to aid such important cross-platform
consistency testing.

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________


Re: msvcrt printf bug

Alan W. Irwin
In reply to this post by Emanuel Falkenauer
On 2017-01-20 09:55+0100 Emanuel Falkenauer wrote:

> I'm pretty sure IEEE says NOTHING about a+b v. b+a... but it sure (i.e.
> hopefully!!) DOES standardize what should be the outcome in each case -
> and that's all I need.

I agree.  Floating-point consistency is extremely valuable.  We
definitely did not have that with gcc a few years back, but we now
apparently do (at least with Linux gcc), and that is a wonderful new
discovery for me today.

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________


Re: msvcrt printf bug

Peter Rockett
In reply to this post by Emanuel Falkenauer
On 20/01/17 01:53, Emanuel Falkenauer wrote:
> Hi Peter,
...snip

> Emanuel - The one thing I cannot grasp is that you have built s/w with a
> range of toolchains, but you are very focussed on obtaining exactly the
> same numerical answers - seemingly to the level of false precision - for
> each build.
> ...snip
>
>> I am struggling to see the reason for this, especially as
>> you are talking about a stochastic (GA) algorithm. Why is this such a
>> big issue for you? You mentioned you work in aerospace. Is this some
>> sort of ultra safety conscious aerospace certification thing?
> No - we optimize the scheduling of processes, not the internals of the
> processes themselves. There are two reasons for my obsession with
> consistency across compilers, as I explained a few posts back:
> (1) we edit and _debug_ in Borland, for the simple reason that the
> environment is frankly very good, so our productivity is excellent(*)...
> but we actually release (ship) MinGW builds, because they are
> dramatically faster than Borland's. Now in order to debug in Borland a
> glitch spotted in our MinGW releases, we must be able to reproduce
> exactly the same glitch in the former. On the other hand, given the
> nature of what we do (GAs), and the fact that some glitches don't show
> up before hours of computation... well I think you've got it already
> (2) I found that it really is an excellent QC practice to make sure the
> two builds behave exactly the same, because each time it was NOT the
> case, there was a bug somewhere. Each and every time, no exceptions.
>
> (*) Before sending me a barrage of complaints, please be aware that I do
> have NetBeans... but (1) I find its editor simply not on par with
> Borland's, and (2) trying to attach our DLL to debug, the list of PIDs
> was just empty (if some of you has an advice on how to solve that, I
> would be grateful - because the raw gdb is really painful).
>
> All the best,
>
> Emanuel
Emanuel - Thanks for clarifying. I now understand.

At the risk of getting flamed by the command line warriors on this list,
have you looked at the CodeBlocks or CodeLite IDEs for running MinGW
directly?  Both have excellent graphical interfaces to gdb. (In fact, I
have no idea how to run gdb from the command line! Never done it...)
It seems that working solely with MinGW would save you a lot of the QC
grief you seem to be having now.

(In passing, a similar problem seems to present very frequently with
folks writing/debugging/testing programs in Matlab and then porting to
C(++) for the production code. The porting seems to consume a massive
amount of programmer time.)

P.




Re: msvcrt printf bug

Peter Rockett
In reply to this post by Keith Marshall-3
On 19/01/17 14:09, Keith Marshall wrote:
> On 19/01/17 11:19, Peter Rockett wrote:
>> On 19/01/17 08:21, [hidden email] wrote:
>>> Keith, have a look here please:
>>> http://www.exploringbinary.com/quick-and-dirty-floating-point-to-decimal-conversion
>>> Quote from the article:
>>> Every binary floating-point number has an exact decimal equivalent
>>> <http://www.exploringbinary.com/number-of-decimal-digits-in-a-binary-fraction/>,
>>> which can be expressed as a decimal string of finite length.
>>>
>>> When I started this, I didn't know about this article. Also another
>>> in-depth look at float to decimal conversion from the same author:
>>> http://www.exploringbinary.com/maximum-number-of-decimal-digits-in-binary-floating-point-numbers
>>> <http://www.exploringbinary.com/maximum-number-of-decimal-digits-in-binary-floating-point-numbers/>
>> I think the above is exactly the sort of misleading website Keith was
>> complaining about.
> Indeed, yes.  I stumbled across that same website several years
> ago: I immediately dismissed it as the gigantic pile of bovine
> manure which it so clearly represents.  Sadly, there are far too
> many mathematically challenged programmers who are gullible enough
> to accept such crap at face value; (and far too much similarly
> ill-informed nonsense pervading the internet).
>
>> If you take the binary representation of DBL_MAX, replace the least
>> significant digit in the mantissa with a zero and calculate the
>> decimal equivalent, you will get another 309 digit number. But
>> compared to the decimal equivalent of DBL_MAX, I think you will find
>> that the bottom ~294 digits will have changed. If you got this number
>> from a calculation, what does it tell you about the reliability of
>> the bottom 294 digits if the smallest possible change to the binary
>> number produces such a massive shift?
> It tells us, as I've said before, that those excess ~294 digits are
> meaningless garbage, having no mathematical significance whatsoever.
Thinking about this further, I believe there is another level of subtlety here.

Again taking the DBL_MAX example from http://www.exploringbinary.com/number-of-decimal-digits-in-a-binary-fraction/, the logic goes:

DBL_MAX is 1.1111111111111111111111111111111111111111111111111111 x 2^1023

And so:

1.1111111111111111111111111111111111111111111111111111 x 2^1023

=
(2 - 2^-52) x 2^1023

=
17976931348623...58368 (309 digits worth)

The second step - the conversion from (2 - 2^-52) x 2^1023 to this monster 309-digit decimal number, and therefore the validity of the trailing 294 digits - is unquestionably correct. All 294 digits are perfectly good! The logical flaw in the above argument is actually the very first step, which equates the IEEE-format binary number to (2 - 2^-52) x 2^1023. This is not an equation, it is a logical equivalence (it should be a '<=>' symbol)! (How do you assign a discrete data structure to a number?) It thus follows that any reasoning about the IEEE-format binary number using the decimal equivalent has to be tempered by some key conditions, because there has been a fundamental change in the problem in the first step.

Given the subtlety of this, I am not surprised many people miss the point and erroneously assert things like "every binary floating-point number has an exact decimal equivalent". The equivalent number written in terms of powers of 2 does indeed have an exact decimal equivalent; but this does not extend 'upstream' to the original binary number. The second stage of the reasoning is absolutely correct. But a correct piece of logic following a flawed logical step is still false. The net conclusion, of course, is that we still only have 15-16 significant digits and any further digits are false precision due to the finite width of the binary mantissa.
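A quick way to see this numerically (a sketch in standard C; nextafter moves the value by one unit in the last place):

#include <float.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* The exact decimal expansion of the stored DBL_MAX bit pattern:
       309 digits, every one of which follows from the bits. */
    printf("%.0f\n", DBL_MAX);

    /* Lowering the value by a single ulp changes all but roughly the
       first 15 of those 309 digits. */
    printf("%.0f\n", nextafter(DBL_MAX, 0.0));
    return 0;
}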

From some unpromising beginnings, I have actually found this a very useful thread. I now have a clearer formal understanding of why so many people get floating point numbers wrong. I have also been corrected about recent versions of gcc not reordering instructions during optimisation - that is really useful to know!

Peter


Put another way, if you do arithmetic at this end of the floating
point scale, the smallest change you can make is ~10^{294}. Thus only
~15 of the decimal digits are significant - the rest are completely
uncertain. Passing to the decimal equivalents ultimately clouds the
issue. Floating point arithmetic is done in binary, not using decimal
equivalents.

I suspect the OP's conceptual problem lies in viewing every float in
splendid isolation rather than as part of a computational system.
This is why printing to false precision has not attracted much uptake
here. There is a fundamental difference between digits as such and
digits that actually carry useful information!
Exactly so.  Nicely elucidated, Peter.  Thank you.

Or another take: If you plot possible floating point representations
on a real number line, you will have gaps between the points. The OP
is trying to print out numbers that fall in the gaps!
And, taken to its logical conclusion, there are infinitely many
numbers falling in each gap: each of these is an equally speculative
misrepresentation of the reality conveyed by the actual bits encoding the number at one end of the gap or the other.
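A minimal sketch of just how wide those gaps get at the top of the double range (assuming IEEE 754 doubles):

#include <stdio.h>
#include <float.h>
#include <math.h>

int main(void)
{
    /* The gap between DBL_MAX and its nearest neighbour below is 2^971,
       roughly 2.0e292; of the 309 decimal digits, only the leading
       handful carry any information. */
    double gap = DBL_MAX - nextafter(DBL_MAX, 0.0);
    printf("gap at DBL_MAX = %.17g\n", gap);

    /* Anything much smaller than half that gap simply vanishes. */
    printf("%d\n", DBL_MAX - 1.0e290 == DBL_MAX);   /* prints 1 */
    return 0;
}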

-- 
Regards,
Keith.




Re: msvcrt printf bug

Emanuel Falkenauer
In reply to this post by Peter Rockett
Peter,

> At the risk of getting flamed by the command line warriors on this list,
> have you looked at the CodeBlocks or CodeLite IDEs for running MinGW
> directly?  Both have excellent graphical interfaces to gdb. (In fact, I
> have no idea how to run gdb from the command line! Never done it...)

Thanks for the info, I'll have a look.

> Seems working solely with mingw will save you a lot of the QC grief you
> seem to be having now.

No no, quite to the contrary: we WANT to have as many ways to do QC as
we can find!

We're mere mortals (so prone to err), and this algorithm is horrendously
complex (hundreds of thousands of lines, parallel execution, you name
it) - bugs are inevitable. But real people actually rely on our results
to carry out real production - millions are at stake. Ergo QC represents
a large part of our development effort... and checking for consistency
across toolchains is one more (and relatively easy) way to spot many
bugs that might go unnoticed when each build is taken separately.

Best,

Emanuel



Re: msvcrt printf bug

KHMan
In reply to this post by Peter Rockett
On 1/21/2017 6:18 AM, [hidden email] wrote:

> Peter Rockett wrote:
>> Guys
>>
>> At the risk of wading into a thread that has become very heated,
>> could I drag this back to technical matters.
>
> With similar trepidation...
>
>> I think everybody (apart maybe from the OP) agrees how floating point
>> numbers behave. Keith makes a good point about rounding. Can I toss
>> in another feature that changing the compiler optimisation level
>> often reorders instructions meaning that rounding errors accumulate
>> in different ways. So changing the optimisation level often slightly
>> changes the numerical answers. :-\
>
> Another aspect which I think has been mentioned in passing a couple of
> times, but not expanded on, is that how the processor's floating-point
> registers are utilised can also have an effect. These registers are
> typically higher precision than the in-memory representation of floats,
> so can store intermediate results more accurately, but there are a
> limited number of them - so sometimes some intermediate results have to
> be written out to memory (at reduced precision) and read back later.
> [snip snip]

AFAIK it is only with the 8087 registers -- just about the only
company who did this was Intel. It didn't really work out; I think
everybody else quickly stuck with 64-bit registers. One thing is
that register spills would store values to memory at 64 bits, so
your application's results could change when it is compiled into
different register allocations. With uniform 64-bit registers, managing
rounding errors and the like becomes the application's responsibility.

Any modern 64-bit x86 app would be using SSE2, with 64-bit
floating-point registers in line with other CPUs and GPUs. That
simplifies things a lot. But GPUs might not be fully IEEE compliant,
in that they might not handle denormals etc. I don't know what the
reproducibility expectations for CPUs versus GPUs are these days.

One newish thing that can change results versus regular FPU code
is FMA3. I would be interested in any views, Wikipedia is a bit
thin on its FMA3 page and I have been too lazy to read (and/or
reread) the Intel/AMD arch manuals...
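Not an authoritative answer, but here is a minimal sketch of the kind of difference FMA makes; it assumes a C99 <math.h> fma() and no automatic contraction of x*y+z by the compiler (build with -ffp-contract=off if in doubt):

#include <stdio.h>
#include <math.h>

int main(void)
{
    double x = 1.0 + ldexp(1.0, -27);     /* 1 + 2^-27, exactly representable */
    double y = x;
    double z = -(1.0 + ldexp(1.0, -26));  /* -(1 + 2^-26), also exact */

    /* x*y is exactly 1 + 2^-26 + 2^-54; rounding the product to double
       drops the 2^-54 term, so multiply-then-add gives 0. */
    double separate = x * y + z;

    /* fma() rounds only once, after the exact multiply-add, so the
       2^-54 survives. */
    double fused = fma(x, y, z);

    printf("separate = %g\n", separate);  /* 0 */
    printf("fused    = %g\n", fused);     /* ~5.55e-17, i.e. 2^-54 */
    return 0;
}

Whether a compiler quietly contracts x*y+z into an FMA is exactly the sort of thing that makes results differ from one build to the next.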

--
Cheers,
Kein-Hong Man (esq.)
Selangor, Malaysia



Re: msvcrt printf bug

Emanuel Falkenauer
Hi KHMan,

Excellent explanations, thx!

I agree it would be cool to see to what extent GPU computations are
actually consistent with FPU's (I'd bet not really: it's a relatively
new thing), but so far we haven't found a use for GPUs for our purposes
anyway (one day, maybe?).

Best,

Emanuel

On 21-Jan-17 01:54, KHMan wrote:

> On 1/21/2017 6:18 AM, [hidden email] wrote:
>> Peter Rockett wrote:
>>> Guys
>>>
>>> At the risk of wading into a thread that has become very heated,
>>> could I drag this back to technical matters.
>> With similar trepidation...
>>
>>> I think everybody (apart maybe from the OP) agrees how floating point
>>> numbers behave. Keith makes a good point about rounding. Can I toss
>>> in another feature that changing the compiler optimisation level
>>> often reorders instructions meaning that rounding errors accumulate
>>> in different ways. So changing the optimisation level often slightly
>>> changes the numerical answers. :-\
>> Another aspect which I think has been mentioned in passing a couple of
>> times, but not expanded on, is that how the processor's floating-point
>> registers are utilised can also have an effect. These registers are
>> typically higher precision than the in-memory representation of floats,
>> so can store intermediate results more accurately, but there are a
>> limited number of them - so sometimes some intermediate results have to
>> be written out to memory (at reduced precision) and read back later.
>> [snip snip]
> AFAIK it is only with the 8087 registers -- just about the only
> company who did this was Intel. It didn't really work out; I think
> everybody else quickly stuck with 64-bit registers. One thing is
> that register spills would store values to memory at 64 bits, so
> your application's results could change when it is compiled into
> different register allocations. With uniform 64-bit registers, managing
> rounding errors and the like becomes the application's responsibility.
>
> Any modern 64-bit x86 app would be using SSE2, with 64-bit
> floating-point registers in line with other CPUs and GPUs. That
> simplifies things a lot. But GPUs might not be fully IEEE compliant,
> in that they might not handle denormals etc. I don't know what the
> reproducibility expectations for CPUs versus GPUs are these days.
>
> One newish thing that can change results versus regular FPU code
> is FMA3. I would be interested in any views, Wikipedia is a bit
> thin on its FMA3 page and I have been too lazy to read (and/or
> reread) the Intel/AMD arch manuals...
>



Re: msvcrt printf bug

Emanuel Falkenauer
In reply to this post by Alan W. Irwin
Hi Alan,

> [...]
> But since you don't seem to be encountering such issues yourself with
> modern Linux gcc,

Oops, I'm blushing a bit... but anyway: I actually use a native
_Windows_ port of MinGW.  :-D

> I tried the experiment of generating PLplot results [...] using both -O0 and -O3, and indeed the
> results are identical (for 33 standard plplot examples representing
> several hundred pages of plots so it is an extensive test)!

Just excellent!  :-)

> So it
> appears the conclusion must be that floating-point calculations are
> done in much more consistent ways now with gcc for both your case and
> this recent PLplot test , and I intend to double-check that
> (unexpected for me) conclusion for the timeephem and FreeEOS projects
> where results are not so heavily rounded as those of PLplot.

Please do, and keep us posted.

> However, for whatever reason (e.g., everyone might now be using the
> same IEEE floating-point calculation algorithms) assuming this
> floating-point consistency is now true for most compilers (so long as
> no optimization is done other than the normal -O0 through -O3
> options), that is an important floating-point breakthrough that would
> make comparisons of results from one compiler to others _much_ simpler
> (as you have already discovered for your software project).
> Furthermore, if care was used to not gratuitously mess with the order
> of computations from one software release to the next, this
> breakthrough should also simplify consistency testing of results for
> various software versions.

Indeed, I find it extremely valuable.

> Finally, to drag this post back on topic for this list, I haven't paid
> much attention to earlier posts in this thread, but assuming from your
> subject line this floating-point consistency is not currently
> available for any unpatched official release of the MinGW version of
> gcc, this inconsistency is something I would strongly encourage the
> MinGW developers to address to aid such important cross-platform
> consistency testing.

Fully agree. Oops (once again!), I actually have no idea whether the
precise MinGW compiler we use is a "patched official release"... but
maybe it is:

C:\>gcc --version
gcc (i686-win32-sjlj-rev3, Built by MinGW-W64 project) 4.9.1
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


Best,

Emanuel



Re: msvcrt printf bug

Keith Marshall-3

On 21/01/17 02:44, Emanuel Falkenauer wrote:
> Fully agree. Oops (once again!), I actually have no idea whether the
> precise MinGW compiler we use is a "patched official release"... but
> maybe it is:
>
> C:\>gcc --version
> gcc (i686-win32-sjlj-rev3, Built by MinGW-W64 project) 4.9.1

Couldn't possibly say ... you aren't even using an official MinGW
product; i.e. your compiler isn't produced by MinGW.org, who are the
owners of the MinGW trademark.  In fact, technically, the compiler
you are using isn't supported here!

-- 
Regards,
Keith.

Public key available from keys.gnupg.net
Key fingerprint: C19E C018 1547 DE50 E1D4 8F53 C0AD 36C6 347E 5A3F


Re: msvcrt printf bug

K. Frank
In reply to this post by tei.andu
Hello List (and Alexandru)!

I see that I'm coming late to the party, but I would like to chime in.

On Sat, Jan 14, 2017 at 9:02 AM,  <[hidden email]> wrote:
>
> Hello,
> I encountered a loss of precision printing a float with C printf.
> ...

There are a couple of common misconceptions running through
this thread that I would like to correct.

Short story:  Floating-point numbers are well-defined, precise,
accurate, exact mathematical objects.  Floating-point arithmetic
(when held to the standard of the "laws of arithmetic") is not
always exact.

Unfortunately, when you conflate these two issues and conclude
that floating-point numbers are these mysterious, fuzzy, squishy
things that can never under any circumstances be exact, you create
the kind of floating-point FUD that runs through this thread, this
email list, and the internet in general.

(This kind of FUD shows up in a number of forms.  You'll see,
for example:  "Never test for equality between two floating-point
numbers."  "Never test a floating-point number for equality with
zero."  "Floating-point numbers are fuzzy and inexact, so equality
of floating-point numbers is meaningless."  When you see this on
the internet (and you will), don't believe it!  Now, it's true that
when you have two floating-point numbers that are the results
of two independent chains of floating-point calculations it
generally won't make sense to test for exact equality, but
there are many cases where it does.)
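For instance, a minimal sketch (assuming IEEE 754 doubles) of comparisons that are perfectly reliable:

#include <stdio.h>

int main(void)
{
    /* 0.25 and 0.5 are exactly representable and the addition is exact,
       so testing for equality is perfectly meaningful here. */
    double quarter = 0.25;
    printf("%d\n", quarter + quarter == 0.5);   /* 1 */

    /* Small integers and their products are exact too. */
    printf("%d\n", 3.0 * 7.0 == 21.0);          /* 1 */

    /* And comparing against zero to guard a division is perfectly sane. */
    double divisor = 0.0;
    printf("%d\n", divisor == 0.0);             /* 1 */
    return 0;
}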

Worse yet (as is typical with FUD), when people call out this
FUD (on the internet, not just this list), they get attacked
with scorn and ad hominem arguments.  So, to Alexandru Tei (the
original poster) and Emanuel Falkenauer, stand your ground, you
are right!  (And don't let the personal attacks get you down.)

Let me start with an analogy:

Consider the number 157/50.  It is a well-defined, precise, accurate,
exact mathematical object.  It's a rational number, which makes it
also a real number (but it's not, for example, an integer).  There
is nothing fuzzy or imprecise or inaccurate about it.

It has as its decimal representation 3.14 (and is precisely equal to
3.14).  Now if we use it as an approximation to the real number pi,
we find that it is an inexact value for pi -- it is only an approximation.

But the fact that 3.14 is not exactly equal to pi doesn't make 3.14
somehow squishy or inaccurate.  3.14 is exactly equal to 314 divided
by 100 and is exactly equal to the average of 3.13 and 3.15.  It's
just not exactly equal to pi (among many other things).

Now it is true that the vast majority of floating-point numbers
running around in our computers are the results of performing
floating-point arithmetic, and the large majority of these numbers
are inexact approximations to the "correct" values (where by correct
I mean the real-number results that would be obtained by performing
real arithmetic on the floating-point operands).  And anybody
performing substantive numerical calculations on a computer needs
to understand this, and should be tutored in it if they don't.

(By the way, Alexandru asked nothing about floating-point calculations
in his original post, and everything he has said in this thread indicates
that he does understand how floating-point calculations work, so I have no
reason to think that he needs to be tutored in the fact that floating-point
arithmetic can be inexact.)

Alexandru asked about printing out floating-point numbers.  People
have called this the "decimal" or "ASCII" representation of a
floating-point number.  I will stick to calling it the "decimal
representation," by which I will mean a decimal fraction, of
potentially arbitrary precision, that approximates a given
floating-point number.

In keeping with my point that floating-point numbers are well-defined,
precise mathematical values, it is the case that every floating-point
number is exactly equal to a single, specific decimal fraction.

Alexandru complains that msvcrt doesn't use as its decimal representation
of a floating-point number the decimal fraction that is exactly equal to it.
This is a perfectly legitimate complaint.
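To make that concrete, here is a minimal sketch; the digits in the comment assume IEEE 754 double and a printf, like glibc's, that doesn't give up after 17 significant digits (msvcrt, by contrast, stops short and pads the tail with zeros):

#include <stdio.h>

int main(void)
{
    /* The double nearest to 0.1 is exactly
       0.1000000000000000055511151231257827021181583404541015625,
       and a printf in the glibc mould will show every one of those digits. */
    printf("%.55f\n", 0.1);
    return 0;
}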

Now an implementation can use as its decimal representation of floating-point
numbers whatever it wants -- it's a quality-of-implementation issue.
The implementation could always print out floating-point numbers with
two significant decimal digits.  Or it could use ten significant digits,
and add three to the last significant digit just for fun.  But there is
a preferred, canonically distinguished decimal representation for floating-point
numbers -- use the  unique decimal fraction (or ASCII string or whatever you
want to call it) that is exactly equal to the floating-point number.  "Exactly
equal" -- what could get more canonical than that?

In fairness, I don't consider this to be a particularly important
quality-of-implementation issue.  I do prefer that my implementations
use the canonically distinguished decimal representation, but I don't care
enough to have retrofitted mingw to do so (or to set the _XOPEN_SOURCE
flag).  But it isn't hard to get this right (i.e., to use the canonically
distinguished representation), and apparently both glibc and Embarcadero/Borland
have done so.
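For completeness, the "retrofit" is essentially a one-liner.  This sketch assumes the MinGW runtime switches printf() to its own formatter when a POSIX feature-test macro such as _XOPEN_SOURCE (or __USE_MINGW_ANSI_STDIO) is defined before any header is included; check your runtime's _mingw.h for the exact rules:

#define _XOPEN_SOURCE 700   /* must precede every #include */
#include <stdio.h>

int main(void)
{
    /* With MinGW's own formatter the trailing digits should agree with
       glibc; routed through msvcrt they are just padded with zeros. */
    printf("%.25e\n", 1.0 / 3.0);
    return 0;
}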

I would argue that the glibc/Borland implementation is clearly better in
this regard than that of msvcrt, and that there is no basis on which one
could argue that the msvcrt implementation is better.  (Again, in fairness,
microsoft probably felt that it was a better use of a crack engineer's
time to more smoothly animate a tool-tip fade-out than to implement the
better decimal representation, and from a business perspective, they
were probably right.  But it's not that hard, so they could have done
both.)

On Mon, Jan 16, 2017 at 3:17 PM, Keith Marshall <[hidden email]> wrote:
> On 16/01/17 16:51, Earnie wrote:
>> ...
> Regardless, it is a bug to emit more significant digits than the
> underlying data format is capable of representing ... a bug by
> which both glibc and our implementation are, sadly, afflicted;
> that the OP attempts to attribute any significance whatsoever to
> those superfluous digits is indicative of an all too common gap
> in knowledge ... garbage is garbage, whatever form it may take.

Here Keith claims that the glibc/Borland implementation is actually
a bug.  This is the kind of FUD we need to defend against.

I create a floating-point number however I choose, perhaps by
twiddling bits.  (And perhaps not by performing a floating-point
operation.)  The number that I have created is perfectly well-defined
and precise, and it is not a bug to be able to print out the decimal
representation to which it is exactly equal.  An implementation that
lets me print out the exactly correct decimal representation is better
than an implementation that does not.
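A minimal sketch of what I mean, assuming IEEE 754 doubles and a printf that emits enough digits:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    /* Assemble a double directly from its bits: sign 0, biased exponent
       0x3FF, mantissa with only the lowest bit set, i.e. 1 + 2^-52. */
    uint64_t bits = (UINT64_C(0x3FF) << 52) | UINT64_C(1);
    double d;
    memcpy(&d, &bits, sizeof d);

    /* No arithmetic was performed, yet d is a perfectly definite number
       with a perfectly definite decimal expansion. */
    printf("%.20g\n", d);   /* 1.0000000000000002220... under glibc */
    return 0;
}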

The floating-point number that I created may well have a well-defined,
precise -- and even useful -- mathematical meaning, Keith's assertion
that "garbage is garbage," notwithstanding.

To reiterate this point ...

On Wed, Jan 18, 2017 at 4:39 PM, Keith Marshall <[hidden email]> wrote:
> ...
> On 18/01/17 10:00, [hidden email] wrote:
>> Emanuel, thank you very much for stepping in. I am extremely happy
>> that you found my code useful.
>
> Great that he finds it useful; depressing that neither of you cares
> in the slightest about accuracy; rather, you are both chasing the
> grail of "consistent inaccuracy".

Representing a floating-point number with the decimal fraction
(or ASCII string or whatever you want to call it) that is exactly,
mathematically equal to that floating-point number is, quite
simply, accuracy, rather than inaccuracy.

Granted, there are times when it may not be important or useful
to do so, but there are times when it is.

> ...
>> I will use cygwin when I need a more accurate printf.
> ...
> Yes, I deliberately said "consistently inaccurate"; see, cygwin's
> printf() is ABSOLUTELY NOT more accurate than MinGW's, (or even
> Microsoft's, probably, for that matter).  You keep stating these
> (sadly all too widely accepted) myths:

Alexandru is right here, and is stating truths.  The printf() that
emits the decimal fraction that is exactly, mathematically equal
to the floating-point number being printed is in a very legitimate
and substantive sense more accurate than the one that does not.

>> Every valid floating point representation that is not NaN or inf
>> corresponds to an exact, non recurring fraction representation in
>> decimal.
>
> In the general case, this is utter and absolute nonsense!

On the contrary, Alexandru is completely correct here.

(Note:  "utter and absolute nonsense"  <--  FUD alert!)

Alexandru's statement is sensible, relevant, and mathematically
completely correct: every finite binary fraction has a terminating
decimal expansion, since 2^-n = 5^n / 10^n.

>> There is no reason why printf shouldn't print that exact
>> representation when needed, as the glibc printf does.

Absolutely correct, Alexandru.  If I or Alexandru or Emanuel wants
the exact representation it's a plus that the implementation provides
it for us.

>
> Pragmatically, there is every reason.  For a binary representation
> with N binary digits of precision, the equivalent REPRESENTABLE
> decimal precision is limited to a MAXIMUM of N * log10(2) decimal
> digits;

Again, you conflate the inaccuracy of some floating-point calculations
with individual floating point numbers themselves.  Individual
floating-point numbers have mathematically well-defined, precise,
accurate values (and some of us want to print those values out).

Let me repeat this point in the context of a comment of Peter's:

On Thu, Jan 19, 2017 at 6:19 AM, Peter Rockett <[hidden email]> wrote:
> On 19/01/17 08:21, [hidden email] wrote:
> ...
> I suspect the OP's conceptual problem lies in viewing every float in
> splendid isolation rather than as part of a computational system.

On the contrary, the conceptual problem underlying the FUD in this
thread is conflation of the properties of the overall computational
system with the individual floating-point numbers, and attributing
to the individual floating-point numbers, which are well-defined and
exact, the inexactness of some floating-point operations.

Floating-point numbers make perfect sense and are perfectly well-defined
"in splendid isolation" and to assume that all floating-point numbers of
legitimate interest are the results of inexact floating-point computations
is simply wrong.

> ...
> Or another take: If you plot possible floating point representations on a
> real number line, you will have gaps between the points. The OP is trying
> print out numbers that fall in the gaps!

When I read Alexandru's original post, it appears to me that he is trying
to print out individual, specific floating-point numbers.  That's his use
case.  I see nothing to suggest that he is trying to print out values in
the gaps.  (Alexandru clearly knows that floating-point numbers are discrete
points on the real number line with gaps between them.)

On Sun, Jan 15, 2017 at 10:08 PM, KHMan <[hidden email]> wrote:
> On 1/16/2017 8:56 AM, John Brown wrote:
>> ...
> I do not think there are canonical conversion algorithms that must
> always be upheld, so I did not have an expectation that glibc must
> be canonical.

There is a canonically distinguished conversion algorithm -- it's the
one that produces the decimal representation that is mathematically
equal to the floating-point number.  To repeat myself: Mathematical
equality, what's more canonical than that?

But, of course, this algorithm does not need to be upheld.  I am quite
sure that it is not required by either the c or c++ standard, and I
am pretty sure that IEEE 754 is silent on this matter.  (I also don't
think that this issue is that important.  But it is legitimate, and
an implementation that does uphold this conversion algorithm is a
better implementation.)

> The glibc result is one data point, msvcrt is also one data point.
> He claims to have his own float to string, but knowing digits of
> precision limitations and the platform difference, why is he so
> strident in knocking msvcrt? Curious. I won't score that, so we
> are left with two data points running what are probably
> non-identical algorithms.

But it's more than just data points.  We have a canonical representation.
From this thread, we have three data points -- msvcrt, glibc, and
Borland (plus Alexandru's roll-your-own) -- and (apparently, as I
haven't tested them myself) glibc and Borland (and Alexandru's)
produce the canonical representation, while msvcrt doesn't, so
msvcrt is not canonical and is also the odd man out.

> ...
> For that expectation we pretty much need everyone to be using the same
> conversion algorithm.

Yes, and we probably won't have everyone using the same conversion
algorithm (for example, msvcrt).  Well that's what standards are for,
and some things don't get standardized.  But if everyone were to use
the same algorithm (for example, the canonical decimal representation),
then this whole thread would be much simpler, and life would be easier
for Emanuel.

There are some side comments I would like to make:

Two distinct, but related issues have been discussed.  The first is
whether printf() should print out the exact decimal representation
of a floating-point number (Alexandru), and the second is whether
different implementations should print out the same representation
(Emanuel).  Both are desirable goals, and if you get the first (for
all implementations), you get the second.

My preference, of course, would be to have all implementations print
out the exact representation (when asked to).  But you could, say,
have printf() print out the (specific-floating-point-number-dependent)
minimum number of digits for which you get "round-trip consistency"
(i.e., floating-point number --> printf() --> ASCII --> scanf() -->
back to the same floating-point number).  That would be reasonable,
and would solve Emanuel's problem.  (Or you could print out the minimum
number of "consistency" digits, and swap the last two digits just for fun.
That would be less reasonable, but would also solve Emanuel's problem.)
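A minimal sketch of that round-trip test, using a fixed 17 significant digits (DBL_DECIMAL_DIG from C11; just write 17 on older compilers) rather than the per-number minimum, and assuming a correctly rounded strtod():

#include <stdio.h>
#include <stdlib.h>
#include <float.h>

int main(void)
{
    double original = 0.1 * 3.0;   /* any double will do */
    char buf[64];

    /* 17 significant digits always pin down an IEEE double uniquely. */
    snprintf(buf, sizeof buf, "%.*g", DBL_DECIMAL_DIG, original);

    double recovered = strtod(buf, NULL);
    printf("%s round-trips: %s\n", buf, recovered == original ? "yes" : "no");
    return 0;
}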

My point is that a canonically distinguished representation exists,
so, even if you only care about the second issue, it's easier to
get various implementations to hew to that canonical representation,
rather than to some well-defined, but semi-arbitrary representation
that I (or someone else) might make up.

In retort to Keith's claim that such a standard across implementations,
such as my "swap the last two digits" standard, would be chasing "consistent
inaccuracy" (Of, course, the canonical representation would be "consistent
accuracy."), Emanuel has presented a perfectly logical and valid use case
for this.  Sure, there are other ways Emanuel could achieve his goal of
cross-checking the output of different builds, but this is a good one.

More importantly, all of us (including Emanuel) agree that he has no
right to expect the floating-point results of his different builds to
be the exactly the same.  (Nothing in the c or c++ standard, nor in
IEEE 754 requires this.)  However, he enjoys the happy accident that
the floating-point results do agree, so it's unfortunate that printf()
from mingw/msvcrt and Borland print out different values, and that he
therefore has to go through additional work to cross-check his results.

That he can use Alexandru's code or set the _XOPEN_SOURCE flag to
resolve this issue is a good thing, but the fact that he has to take this
extra step is a minor negative.

Last, and quite tangential, Kein-Hong took a dig at Intel and the 8087:

On Fri, Jan 20, 2017 at 7:54 PM, KHMan <[hidden email]> wrote:
> On 1/21/2017 6:18 AM, [hidden email] wrote:
> ...
> AFAIK it is only with 8087 registers -- just about the only
> company who did this was Intel. Didn't really worked out,

Actually, it worked out quite well for some important use cases,
and kudos to Intel for doing this.

Often in numerical analysis you perform linear algebra on large
systems.  Often with large systems round-off error accumulates
excessively, and when the systems are ill-conditioned, the round-off
error is further amplified.

Often, as part of the linear-algebra algorithm, you compute a sum
of products, that is, the inner product of two vectors.  It turns
out (and, if you're a numerical analyst, you can prove, given conditions
on the linear systems involved), that you do not need to perform the
entire calculation with higher precision to get dramatically more
accurate results -- you can get most of the benefit just using higher
precision to compute the inner products.  The inner-product computation
(a chain of multiply-accumulates) fits trivially in the 8087 floating-point
registers (without register spill), and the use of the 8087's 80-bit
extended-precision just on these inner-product computations yields
dramatically more accurate results in many cases.
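A minimal sketch of the idea (dot_extended and dot_plain are just illustrative names; long double is the 80-bit x87 format with gcc on x86, whereas with MSVC it is plain double and buys nothing):

#include <stdio.h>

/* Accumulate the inner product in extended precision, round once at the end. */
static double dot_extended(const double *a, const double *b, int n)
{
    long double acc = 0.0L;
    for (int i = 0; i < n; ++i)
        acc += (long double)a[i] * b[i];
    return (double)acc;
}

/* Plain double accumulation, for comparison. */
static double dot_plain(const double *a, const double *b, int n)
{
    double acc = 0.0;
    for (int i = 0; i < n; ++i)
        acc += a[i] * b[i];
    return acc;
}

int main(void)
{
    /* A deliberately nasty toy case: the large terms cancel and the small
       contribution is at the mercy of the working precision. */
    double a[] = { 1e16, 1.0, -1e16 };
    double b[] = { 1.0,  0.5,  1.0  };

    printf("plain    : %g\n", dot_plain(a, b, 3));     /* 0   */
    printf("extended : %g\n", dot_extended(a, b, 3));  /* 0.5 */
    return 0;
}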

There are various engineering considerations for not using extended-precision
registers (some legitimate, some less so), but Intel, for whatever reason,
decided that they wanted to do floating-point right; they hired Kahan to
help them, and we're all the better for it.

Look, I understand the frustration felt by many commenters here, particularly
that expressed by Keith and Kein-Hong.  Stack Overflow and forums and bulletin
boards and chat rooms (and even classrooms, where they use things like, you
know, "blackboards" and "chalk") are filled with naive confusion about
floating-point arithmetic, with questions of the sort "I did x = y / z, and
w = x * z, and y and w don't test equal.  How can this be?  Mercy!  It must
be a compiler bug!"  And now you have to tutor another generation of novice
programmers in how floating-point arithmetic works.  It gets old.

But it's counter-productive to tutor them with misinformation.  Floating-point
numbers are what they are and are mathematically perfectly well defined.
Floating-point arithmetic is inexact when understood as an approximation
to real-number arithmetic.  Let's tutor the next generation in what actually
happens: Two perfectly well-defined floating-point operands go into a
floating-point operation, and out comes a perfectly well-defined (if you're
using a well-defined standard such as IEEE 754) floating-point result that
in general, is not equal to the real-number result you would have obtained
if you had used real-number arithmetic on the floating-point operands.
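The classroom example, as a minimal sketch (the digits assume IEEE 754 double):

#include <stdio.h>

int main(void)
{
    double a = 0.1, b = 0.2, c = 0.3;

    /* a, b, c and a+b are all perfectly definite doubles; a+b just isn't
       the double nearest to the real number 0.3, so the comparison fails. */
    printf("a + b == c ? %d\n", a + b == c);   /* 0 */
    printf("a + b = %.17g\n", a + b);          /* 0.30000000000000004 */
    printf("c     = %.17g\n", c);              /* 0.29999999999999999 */
    return 0;
}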

But, to repeat Emanuel's comment, "it's not as if a FPU had a Schrödinger cat
embedded!"  The floating-point operation doesn't sometimes give you one
(inherently fuzzy) floating-point result and sometimes another (inherently
fuzzy) result.  It gives you a single, consistent, perfectly well-defined
(if you're using a well-defined standard) result that makes perfectly good
mathematical sense.  Furthermore, it's also not as if the data bus connecting
memory to the FPU has an embedded Schrödinger cat, and that these (squishy,
fuzzy, inexact) floating-point numbers get fuzzed up somehow by cat hair as
they travel around inside our computers.

The way floating-point numbers -- and floating-point arithmetic -- really work
is completely precise and mathematically well defined, even if it's subtle
and complex -- and different from real-number arithmetic.  And how it really
works -- not FUD -- is what we need to help the next generation of people
doing substantive numerical calculations learn.


Happy Floating-Point Hacking!


K. Frank
