putwchar not working


putwchar not working

Chan Oak
Does anyone know why putwchar is not working with MinGW?
I am only able to use wprintf for wide-character printing.

------------------------------------------------------------------------------

_______________________________________________
MinGW-users mailing list
[hidden email]

This list observes the Etiquette found at
http://www.mingw.org/Mailing_Lists.
We ask that you be polite and do the same.  Disregard for the list etiquette may cause your account to be moderated.

_______________________________________________
You may change your MinGW Account Options or unsubscribe at:
https://lists.sourceforge.net/lists/listinfo/mingw-users
Also: mailto:[hidden email]?subject=unsubscribe

Re: putwchar not working

Yongwei Wu
On 7 September 2016 at 13:56, Chan Oak <[hidden email]> wrote:
>
> Does anyone know why putwchar is not working with mingw?
> I only able to use wprintf for wide character printing

This is a problem in MSVCRT.DLL on Windows Vista and later. It was
reported a few years ago, but no one is working on it. It is not just
putwchar: outputting non-ASCII characters with putchar/putwchar has
problems in general.

I wrote a blog post about it:

https://yongweiwu.wordpress.com/2016/05/27/msvcrt-dll-console-io-bug/

Best regards,

Yongwei

--
Yongwei Wu
URL: http://wyw.dcweb.cn/


Re: putwchar not working

Eli Zaretskii
> From: Yongwei Wu <[hidden email]>
> Date: Wed, 7 Sep 2016 17:07:54 +0800
>
> On 7 September 2016 at 13:56, Chan Oak <[hidden email]> wrote:
> >
> > Does anyone know why putwchar is not working with mingw?
> > I only able to use wprintf for wide character printing
>
> Problem of the MSVCRT.DLL on Windows Vista and later. Reported a few
> years ago but no one is working on it. It is not just putwchar.
> Outputting non-ASCII characters has problems with putchar/putwchar, in
> general.
>
> I had a blog on this one:
>
> https://yongweiwu.wordpress.com/2016/05/27/msvcrt-dll-console-io-bug/

AFAIK, the only reliable way of writing non-ASCII text to the Windows
console is by using WriteConsoleW.  Did you try that?


Re: putwchar not working

Hisham Sueyllam
I tried both suggestions (the one from Wu and the one from Eli). Nothing worked, at least for Arabic, including piping through Perl. If I redirect the Arabic output to a file, I can open it and see it displayed correctly in any editor (I used Emacs), but I get garbage when piping through Perl, or when using puts, putchar, or anything else. I have 64-bit Windows 10 and GCC 5.3.0 (the latest).
By the way, if anyone is curious, the Arabic string is my first name:

#include <stdio.h>

int main() {
  puts(u8"هشام\n");
  return 0;
}

 
hisham...



Re: putwchar not working

Eli Zaretskii

> Date: Wed, 7 Sep 2016 17:07:27 +0000 (UTC)
> From: Hisham Sueyllam <[hidden email]>
>
> I tried both Suggestions [the one from Wu and this one suggested by Eli]. Nothing worked at least for Arabic,
> including piping through perl. By redirecting the Arabic output to a file I can open it and see it correct using any
> editor [I used emacs], but when piping through perl I get garbage or using puts or putchar or whatever. I have
> Windows 10 64-bit and gcc 5.3.0 (the latest)
> By the way if anyone is curious, the Arabic string is my first name:
>
> #include <stdio.h>
>
> int main() {
> puts(u8"هشام\n");
> return 0;
> }

Your email is encoded in UTF-8, and you use Emacs, so I strongly
suspect the Arabic string in your source file was also encoded in
UTF-8.  That will never work; you need to encode it in your system's
ANSI codepage, I think.  Windows includes only very sporadic support
for UTF-8, and in particular cannot use it in multibyte strings.


Re: putwchar not working

Hisham Sueyllam
Same thing when I encode the string using ANSI. As in:

#include <stdio.h>

int main() {
  puts("\uFEEB\uFEB8\uFE8E\uFEE1\n");
  return 0;
}

However, by inspecting the properties of the command window, I found that its code page is 720, which is the Arabic code page used by DOS. Windows, however, uses a different code page for Arabic, which I believe is 1256 or something like that; you can check on Wikipedia by searching for "code page 720". I am not sure, but this might be the reason...
hisham...



Re: putwchar not working

Yongwei Wu
In reply to this post by Eli Zaretskii
On 7 September 2016 at 22:42, Eli Zaretskii <[hidden email]> wrote:

>> From: Yongwei Wu <[hidden email]>
>> Date: Wed, 7 Sep 2016 17:07:54 +0800
>>
>> On 7 September 2016 at 13:56, Chan Oak <[hidden email]> wrote:
>> >
>> > Does anyone know why putwchar is not working with mingw?
>> > I only able to use wprintf for wide character printing
>>
>> Problem of the MSVCRT.DLL on Windows Vista and later. Reported a few
>> years ago but no one is working on it. It is not just putwchar.
>> Outputting non-ASCII characters has problems with putchar/putwchar, in
>> general.
>>
>> I had a blog on this one:
>>
>> https://yongweiwu.wordpress.com/2016/05/27/msvcrt-dll-console-io-bug/
>
> AFAIK, the only reliable way of writing non-ASCII text to the Windows
> console is by using WriteConsoleW.  Did you try that?

I have many ways to work around the issue for myself. However, I do
not have time to fix all the applications that are compilable by
MinGW. The only practical solutions (though still difficult), IMHO,
are:

* Make Microsoft fix the issue in MSVCRT
* Make MinGW not use MSVCRT for I/O functions

--
Yongwei Wu
URL: http://wyw.dcweb.cn/


Re: putwchar not working

Yongwei Wu
In reply to this post by Eli Zaretskii
On 8 September 2016 at 01:42, Eli Zaretskii <[hidden email]> wrote:

>> Date: Wed, 7 Sep 2016 17:07:27 +0000 (UTC)
>> From: Hisham Sueyllam <[hidden email]>
>>
>> I tried both Suggestions [the one from Wu and this one suggested by Eli]. Nothing worked at least for Arabic,
>> including piping through perl. By redirecting the Arabic output to a file I can open it and see it correct using any
>> editor [I used emacs], but when piping through perl I get garbage or using puts or putchar or whatever. I have
>> Windows 10 64-bit and gcc 5.3.0 (the latest)
>> By the way if anyone is curious, the Arabic string is my first name:
>>
>> #include <stdio.h>
>>
>> int main() {
>> puts(u8"هشام\n");
>> return 0;
>> }
>
> Your email is encoded in UTF-8, and you use Emacs, so I strongly
> suspect the Arabic string in your source file was also encoded in
> UTF-8.  That will never work; you need to encode it in your system's
> ANSI codepage, I think.  Windows includes only a very sporadic support
> for UTF-8, and in particular cannot use it in multibyte strings.

The problem is that GCC treats the source file as UTF-8, but the puts
function can accept only an ‘ANSI’ string. The conclusion is the same
as yours: removing ‘u8’ and saving the file in the ‘ANSI’ code page
should work.

As discussed earlier, putws does not work. Otherwise, putws(L"هشام\n")
is the logical solution on Windows.

--
Yongwei Wu
URL: http://wyw.dcweb.cn/


Re: putwchar not working

Hisham Sueyllam
I totally agree and the following simple experiment proves it:

#include <stdio.h>

int main() {
  char s[] = "\xEC\xAC\x9F\xEA\n";
  char * t = s;
  while (*t != '\0') {
    putchar (*t);
    t++;
  }
  return 0;
}

In this program, s holds my first name in Arabic, encoded in code page 720. According to the command window properties, this is the current code page on my computer. However, I still get garbage.
When I redirect the output to a file and then open that file in Notepad++, I get garbage, but when I change the encoding to "OEM 720", it displays correctly...
 
hisham...



Re: putwchar not working

Eli Zaretskii
In reply to this post by Yongwei Wu
> From: Yongwei Wu <[hidden email]>
> Date: Thu, 8 Sep 2016 10:52:46 +0800
>
> > AFAIK, the only reliable way of writing non-ASCII text to the Windows
> > console is by using WriteConsoleW.  Did you try that?
>
> I have many ways to work around the issue for myself. However, I do
> not have time to fix all the applications that are compilable by
> MinGW. The only practical solutions (though still difficult), IMHO,
> are:
>
> * Make Microsoft fix the issue in MSVCRT
> * Make MinGW not use MSVCRT for I/O functions

My point is that whatever you do, it won't solve the problem completely.
Even if putwchar eventually does what the MS VS static libraries
do, some basic problems will remain.  For example, you cannot display
anything with putwchar that is beyond the BMP, because a single
wchar_t argument (16 bits on Windows) can only express characters in
the BMP.  And that is only one of the fundamental problems with
non-ASCII support on the Windows console.


Re: putwchar not working

Eli Zaretskii
In reply to this post by Yongwei Wu
> From: Yongwei Wu <[hidden email]>
> Date: Thu, 8 Sep 2016 11:02:50 +0800
> Cc: Hisham Sueyllam <[hidden email]>
>
> putws(L"هشام\n") is the logical solution on Windows.

You assume that the console font supports Arabic.  That is not
necessarily true.


Re: putwchar not working

Yongwei Wu
In reply to this post by Eli Zaretskii
On 9 September 2016 at 00:17, Eli Zaretskii <[hidden email]> wrote:

>> From: Yongwei Wu <[hidden email]>
>> Date: Thu, 8 Sep 2016 10:52:46 +0800
>>
>> > AFAIK, the only reliable way of writing non-ASCII text to the Windows
>> > console is by using WriteConsoleW.  Did you try that?
>>
>> I have many ways to work around the issue for myself. However, I do
>> not have time to fix all the applications that are compilable by
>> MinGW. The only practical solutions (though still difficult), IMHO,
>> are:
>>
>> * Make Microsoft fix the issue in MSVCRT
>> * Make MinGW not use MSVCRT for I/O functions
>
> My point is whatever you do, it won't solve the problem completely.
> Even if putwchar will eventually do what the MS VS static libraries
> do, some basic problem will remain.  For example, you cannot display
> anything with putwchar that is beyond the BMP, because a single
> wchar_t argument can only express characters in the BMP.  And that is
> only one of the fundamental problems with non-ASCII support on the
> Windows console.

Not supporting characters beyond the BMP does not look a big problem
to me. To me, the big problem is anything compiled by MinGW is broken
in non-ASCII support, and, in comparison, MSVC looks much better....

--
Yongwei Wu
URL: http://wyw.dcweb.cn/


Re: putwchar not working

Sergio NNX
Hi.

We are in the process of updating (or upgrading) the runtime and w32api to the latest Windows Kit and we welcome all patches.

> Not supporting characters beyond the BMP does not look a big problem
> to me. To me, the big problem is anything compiled by MinGW is broken
> in non-ASCII support, and, in comparison, MSVC looks much better....

Can you email us your patch(es)? We are more than happy to apply them and run a few tests.

Cheers.


Re: putwchar not working

Eli Zaretskii
In reply to this post by Yongwei Wu
> From: Yongwei Wu <[hidden email]>
> Date: Fri, 9 Sep 2016 09:35:02 +0800
> Cc: MinGW Users List <[hidden email]>
>
> > My point is whatever you do, it won't solve the problem completely.
> > Even if putwchar will eventually do what the MS VS static libraries
> > do, some basic problem will remain.  For example, you cannot display
> > anything with putwchar that is beyond the BMP, because a single
> > wchar_t argument can only express characters in the BMP.  And that is
> > only one of the fundamental problems with non-ASCII support on the
> > Windows console.
>
> Not supporting characters beyond the BMP does not look a big problem
> to me.

Maybe for you personally it isn't.  But being able to support just the
BMP in the year 2016 is a huge disadvantage IME.  We already have a
lot of character sets entirely in Unicode blocks beyond the BMP, and
each new version of the Unicode standard adds more.  Moreover, Windows
itself adds fonts to support these new character sets with each new
Windows version.  Being unable to support that is far from a Good
Thing.

> To me, the big problem is anything compiled by MinGW is broken
> in non-ASCII support, and, in comparison, MSVC looks much better....

Not everything compiled by MinGW is broken in this regard.  I think
Texinfo's info.exe isn't, for example (it uses WriteConsoleW).  You
just need to avoid the CRT functions like putwchar and all the rest,
because (a) they only support the BMP, and (b) they only "work" with
non-ASCII characters supported by the current codepage.  The latter
part means that (1) you need to call setlocale each time you want a
character outside of the current codepage, and (2) worse, you cannot
support Unicode at all, because UTF-8 and other Unicode encodings can
never be a codeset specified in a setlocale call.

The upshot of all this is that any program which needs to support
non-ASCII characters in a sound way has no choice but to avoid the CRT
functions entirely, and instead use the "wide" Win32 APIs (such as
WriteConsoleW) directly and convert characters from and to UTF-16 by
hand, using MultiByteToWideChar and WideCharToMultiByte.  Any program
that relies on CRT functions for that is fundamentally broken on
Windows, and the main reason is NOT the MinGW libraries.


Re: putwchar not working

Yongwei Wu
On Friday, 9 September 2016, Eli Zaretskii <[hidden email]> wrote:

>
> > From: Yongwei Wu <[hidden email]>
> > Date: Fri, 9 Sep 2016 09:35:02 +0800
> > Cc: MinGW Users List <[hidden email]>
> >
> > > My point is whatever you do, it won't solve the problem completely.
> > > Even if putwchar will eventually do what the MS VS static libraries
> > > do, some basic problem will remain.  For example, you cannot display
> > > anything with putwchar that is beyond the BMP, because a single
> > > wchar_t argument can only express characters in the BMP.  And that is
> > > only one of the fundamental problems with non-ASCII support on the
> > > Windows console.
> >
> > Not supporting characters beyond the BMP does not look a big problem
> > to me.
>
> Maybe for you personally it isn't.  But being able to support just the
> BMP in the year 2016 is a huge disadvantage IME.  We have already a
> lot of character sets entirely in Unicode blocks beyond the BMP, and
> each new version of the Unicode standard adds more.  Moreover, Windows
> itself adds fonts to support these new character sets with each new
> Windows version.  Being unable to support that is far from a Good
> Thing.

It is a luxury to talk about this, when we have problems dealing with the BMP.

> > To me, the big problem is anything compiled by MinGW is broken
> > in non-ASCII support, and, in comparison, MSVC looks much better....
>
> Not everything compiled by MinGW is broken in this regard.  I think
> Texinfo's info.exe isn't, for example (it uses WriteConsoleW).  You
> just need to avoid the CRT functions like putwchar and all the rest,
> because (a) they only support the BMP, and (b) they only "work" with
> non-ASCII characters supported by the current codepage.  The latter
> part means that (1) you need to call setlocale each time you want a
> character outside of the current codepage, and (2) worse, you cannot
> support Unicode at all, because UTF-8 and other Unicode encodings can
> never be a codeset specified in a setlocale call.
>
> The upshot of all this is that any program which needs to support
> non-ASCII characters in a sound way has no choice but to avoid the CRT
> functions entirely, and instead use the "wide" Win32 APIs (such as
> WriteConsoleW) directly and convert characters from and to UTF-16 by
> hand, using MultiByteToWideChar and WideCharToMultiByte.  Any program
> that relies on CRT functions for that is fundamentally broken on
> Windows, and the main reason is NOT the MinGW libraries.

I see your point, but I need to point out that it means completely
different porting efforts, if the original application uses putchar
etc. Just take the example of GCC: Oh, it outputs garbage information
if it detects the environment is Chinese! I have to force the language
to English to make it work. (I am not sure whether the current version
of MinGW GCC ships with Chinese localization, since I am mainly on Mac
now. It used to be the case....)

Do you want to tell the GCC guys that they need to replace all
occurrences of putchar etc.? (Maybe the easier approach is to rid MinGW
of all localization.)

Best regards,

Yongwei


Re: putwchar not working

Eli Zaretskii
> From: Yongwei Wu <[hidden email]>
> Date: Fri, 9 Sep 2016 22:41:53 +0800
> Cc: "[hidden email]" <[hidden email]>
>
> > Maybe for you personally it isn't.  But being able to support just the
> > BMP in the year 2016 is a huge disadvantage IME.  We have already a
> > lot of character sets entirely in Unicode blocks beyond the BMP, and
> > each new version of the Unicode standard adds more.  Moreover, Windows
> > itself adds fonts to support these new character sets with each new
> > Windows version.  Being unable to support that is far from a Good
> > Thing.
>
> It is a luxury to talk about this, when we have problems dealing with the BMP.

When using the wide APIs, there are no such problems.

> > The upshot of all this is that any program which needs to support
> > non-ASCII characters in a sound way has no choice but to avoid the CRT
> > functions entirely, and instead use the "wide" Win32 APIs (such as
> > WriteConsoleW) directly and convert characters from and to UTF-16 by
> > hand, using MultiByteToWideChar and WideCharToMultiByte.  Any program
> > that relies on CRT functions for that is fundamentally broken on
> > Windows, and the main reason is NOT the MinGW libraries.
>
> I see your point, but I need to point out that it means completely
> different porting efforts, if the original application uses putchar
> etc. Just take the example of GCC: Oh, it outputs garbage information
> if it detects the environment is Chinese! I have to force the language
> to English to make it work. (I am not sure whether the current version
> of MinGW GCC ships with Chinese localization, since I am mainly on Mac
> now. It used to be the case....)
>
> Do you want to tell the GCC guys that they need to replace all
> occurrences of putchar etc.? (Maybe the easier approach is rid MinGW
> of all localization.)

Localization the gettext way has a very limited utility on Windows
anyway, because most Posix programs nowadays assume UTF-8, which is
compatible with "char *" strings, something that on Windows is
possible only with single-byte codepages and a small number of
codepages that support DBCS.  If we want localization on Windows that
really works, the only practical way is to provide replacements for
setlocale and all the multibyte and wide-character functions in the
CRT to support UTF-8 and work with 32-bit wchar_t data type.  Anything
less than that is no more than an illusion of non-ASCII support, and
breaks as soon as you try it in a locale outside Western Europe.
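
To make the point about wchar_t width concrete, here is a minimal sketch of how a single code point maps to UTF-16 code units. The name utf16_encode is made up for this illustration and is not a CRT or Win32 function. A code point beyond the BMP needs two 16-bit units (a surrogate pair), so it cannot fit in one 16-bit wchar_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of any CRT): encode one Unicode code point
   as UTF-16.  Returns the number of 16-bit units written (1 or 2), or 0 if
   cp is not a valid Unicode scalar value. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                       /* out of range, or a lone surrogate */
    if (cp < 0x10000) {                 /* inside the BMP: one unit suffices */
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                      /* 20 significant bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

With this sketch, U+FEEB (inside the BMP) yields one unit, while U+1F600 yields the pair D83D DE00.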

So if we want to fix MSVCRT in this area, we need to come up with a
much larger library than just a replacement for putwchar.  If someone
volunteers to do that job, I think it would allow the MinGW project to
become a much better environment for porting Posix programs that deal
with non-ASCII than it is today, including localization in the likes
of GCC.
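
For reference, the route quoted earlier in this message (convert to UTF-16 by hand, then call WriteConsoleW) might look roughly like the sketch below. It assumes the input is valid UTF-8; print_utf8 is a name invented here, and the fallback to fputs when stdout is not a console (or when not building for Windows) is my own choice, not something prescribed in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: print a NUL-terminated UTF-8 string.
   Returns 0 on success, -1 on failure. */
int print_utf8(const char *utf8)
{
    assert(utf8 != NULL);
#ifdef _WIN32
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD mode;
    /* GetConsoleMode succeeds only if the handle is a real console. */
    if (out != NULL && out != INVALID_HANDLE_VALUE && GetConsoleMode(out, &mode)) {
        /* First call: ask how many UTF-16 units are needed (including NUL). */
        int n = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, NULL, 0);
        if (n <= 0)
            return -1;
        wchar_t *wide = malloc((size_t)n * sizeof *wide);
        if (wide == NULL)
            return -1;
        MultiByteToWideChar(CP_UTF8, 0, utf8, -1, wide, n);
        fflush(stdout);                 /* keep ordering with buffered stdio */
        DWORD written = 0;
        BOOL ok = WriteConsoleW(out, wide, (DWORD)(n - 1), &written, NULL);
        free(wide);
        return ok ? 0 : -1;
    }
#endif
    /* Not a Windows console (redirected output, or another OS): raw bytes. */
    return fputs(utf8, stdout) == EOF ? -1 : 0;
}
```

A caller would use print_utf8("...") wherever it currently calls fputs or printf("%s", ...) on UTF-8 text. Whether a given glyph actually shows up still depends on the console font.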


Re: putwchar not working

Hisham Sueyllam
Well, as an experiment I tried mingw64, and it works fine: it prints both Arabic and English Unicode text in the command window, as shown in the attached PNG image.
Here is the listing of the program that worked. It prints an 'a' next to my first name in Arabic, to show that it handles both Latin and non-Latin characters (my guess is any valid UTF-8 character, or maybe just the BMP).
Note, however, that from the options menu I changed the font from Lucida Console (the default, which did not work) to Courier New.

So I am not sure how they did it in the mingw64 project, but clearly this takes the blame off Microsoft...

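#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <wchar.h>
#include <locale.h>

int main(void)
{
    /* Universal character names (\uXXXX, or \UXXXXXXXX for code points
       beyond the BMP) are converted to the execution character set,
       which GCC takes to be UTF-8 unless -fexec-charset says otherwise. */
    char s1[] = "a\uFEEB\uFEB8\uFE8E\uFEE1";
    /* The u8 prefix (C11) requests UTF-8 regardless of that setting. */
    char s2[] = u8"a\uFEEB\uFEB8\uFE8E\uFEE1";
    //setlocale(LC_ALL, "en_US.utf8");
    printf("%s\n", s1);
    printf("%s\n", s2);
    return 0;
}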
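
For anyone curious which bytes s2 actually holds, here is a rough sketch of UTF-8 encoding for a single code point. utf8_encode is an illustrative name, not a library function. Each of the four Arabic code points in the listing lies between U+0800 and U+FFFF and so takes three bytes; together with the 'a', that makes 13 bytes before the terminating NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
   Returns the number of bytes written (1 to 4), or 0 if cp is not a
   valid Unicode scalar value. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x80) {                    /* ASCII: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                   /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                 /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    /* Beyond the BMP: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

For example, U+FEEB encodes as the three bytes EF BB AB.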

 
hisham...



ExampleWchar_Mingw64.png (26K) Download Attachment

Re: putwchar not working

Hisham Sueyllam
Well, I hate to flood the list, but I thought you might find this interesting. Since changing the font did the trick for mingw64, I thought I would do the same for the command prompt I use to run mingw32 programs. So from the system menu of the command prompt I also changed the font to Courier New. Unfortunately, the program in the previous message still displayed garbage. But remembering that the code page is 720, I tried the following, with my Arabic first name encoded in code page 720 instead of UTF-8 as in the previous program:

#include <stdio.h>

int main(void) {
  /* Four Arabic letters encoded in code page 720, one byte each. */
  char s[] = "\xEC\xAC\x9F\xEA\n";
  printf("%s", s);
  return 0;
}

and it almost worked. I say almost because it printed the 4 letters that make up my first name correctly, but with 2 problems:
1- They run from left to right (Arabic is written from right to left, for those who do not know)
2- The letters have the wrong shape (for those unfamiliar with Arabic, the shape of a letter depends on its position in the word: beginning, middle, or end)
See the attached image and compare the output with the one in the previous message's attachment...
Well, I am not boasting about mingw64, but neither of those 2 bugs appeared there; my name was printed as it should be, from right to left...
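
As a side note on what those four bytes stand for, here is a small sketch that maps them to Unicode code points. The table is deliberately partial and reflects my reading of the code page 720 layout, so treat the values as an assumption to check against Microsoft's published table; cp720_to_unicode and cp720_decode are names invented for this illustration. Note that the results are base letters (U+06xx), whereas the earlier listing used presentation forms (U+FExx), so here the choice of shape and the right-to-left ordering are left to whatever displays the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, partial CP720 lookup: ASCII plus the four bytes used in
   the program above.  Values are assumed from the CP720 layout. */
uint32_t cp720_to_unicode(unsigned char b)
{
    if (b < 0x80)
        return b;                       /* ASCII range maps to itself */
    switch (b) {
    case 0x9F: return 0x0627;           /* ARABIC LETTER ALEF  */
    case 0xAC: return 0x0634;           /* ARABIC LETTER SHEEN */
    case 0xEA: return 0x0645;           /* ARABIC LETTER MEEM  */
    case 0xEC: return 0x0647;           /* ARABIC LETTER HEH   */
    default:   return 0xFFFD;           /* not covered by this sketch */
    }
}

/* Decode n bytes into n code points, in the same (logical) order. */
void cp720_decode(const unsigned char *in, size_t n, uint32_t *out)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = cp720_to_unicode(in[i]);
}
```

Feeding it the bytes EC AC 9F EA gives U+0647 U+0634 U+0627 U+0645 under that assumption.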
 
hisham...



widePrint2_Ansi.png (9K) Download Attachment

Re: putwchar not working

Hisham Sueyllam
Just one correction: mingw64 gives the same result as mingw32. The correct right-to-left display and correct letter shapes happened in the msys2 terminal, so I guess the same terminal would also work with mingw32...
 
hisham...



Re: putwchar not working

Hisham Sueyllam
It might also be worth mentioning that after changing the font of the Cygwin terminal to Courier New (from the Options item on the system menu), the UTF-8 representation of my name in Arabic gets printed correctly [from right to left and with the right letter shapes]...
 
hisham...

