What is OS4's default character set?
Home away from home
Assuming I am using a standard (British or USA) English installation of AmigaOS4, does it USUALLY use the ISO-8859-1 (Latin 1) character set?

Wikipedia states that AmigaDOS (1.x) uses ISO-8859-1 (Latin 1), so I'm guessing that's what future versions of AmigaOS stuck to for English users?

BTW, I see a Libs:Charsets folder - is there a standard Amiga library which allows converting between charsets? The SDKBrowser wasn't much help in answering this question.

Author of the PortablE programming language.
Re: What is OS4's default character set?
Just popping in
That depends on your locale settings. For plain English the system charset should be ISO-8859-1, but e.g. for Czech or other Slavic languages it will be ISO-8859-2.

These two lines will give you the currently used charset:

LONG default_charset = GetDiskFontCtrl(DFCTRL_CHARSET);
char *charset = (char *)ObtainCharsetInfo(DFCS_NUMBER, default_charset, DFCS_MIMENAME);

Both are functions of diskfont.library.

Conversion is best done by codesets.library.

Why stop it now, just when I am hating it?

Thore Böckelmann
Re: What is OS4's default character set?
Just can't stay away
@ChrisH

You can use the iconv functions from newlib.library to convert between charsets.
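
A minimal sketch of the iconv route, assuming the POSIX-style iconv_open()/iconv()/iconv_close() interface (I believe newlib's iconv follows it, but check your SDK headers). The helper name convert_charset and its buffer handling are hypothetical, made up for illustration, and it only covers stateless charsets such as the ISO-8859 family and UTF-8:

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of any library): converts the NUL-terminated
 * string `in` from charset `from` to charset `to`, writing a NUL-terminated
 * result into `out`. Returns 0 on success, -1 on any failure (unknown
 * charset, unconvertible input, or output buffer too small). */
int convert_charset(const char *from, const char *to,
                    const char *in, char *out, size_t out_size)
{
    if (out_size == 0)
        return -1;

    iconv_t cd = iconv_open(to, from);   /* note the order: target first */
    if (cd == (iconv_t)-1)
        return -1;

    char *src = (char *)in;              /* iconv() takes a non-const pointer */
    size_t src_left = strlen(in);
    char *dst = out;
    size_t dst_left = out_size - 1;      /* reserve room for the terminator */

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;

    *dst = '\0';
    return 0;
}
```

Usage would be something like convert_charset("ISO-8859-1", "UTF-8", text, buffer, sizeof buffer), with the source charset name taken from whatever the system reports rather than hard-coded.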

Re: What is OS4's default character set?
Amigans Defender
@ChrisH

Don't assume a charset. I use ISO-8859-15, otherwise the Euro symbol is missing (I know we don't use the Euro here, but that doesn't mean I don't want or need to type it...!).

There are various ways of finding the current charset. I use the same as tboeckel's code above, although somebody did mention that I shouldn't be using that, but the alternative looked convoluted, not that I can remember what it was.
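
To illustrate why the choice between ISO-8859-1 and ISO-8859-15 matters for the Euro symbol, here is a small sketch that maps a byte to its Unicode code point under each charset. The function names are hypothetical; the eight differing positions are those defined by the ISO-8859-15 standard:

```c
#include <stdint.h>

/* ISO-8859-1: every byte maps to the Unicode code point of the same number. */
uint32_t latin1_to_unicode(uint8_t byte)
{
    return byte;
}

/* ISO-8859-15: identical to ISO-8859-1 except for eight positions. */
uint32_t latin9_to_unicode(uint8_t byte)
{
    switch (byte) {
    case 0xA4: return 0x20AC; /* euro sign (currency sign in ISO-8859-1) */
    case 0xA6: return 0x0160; /* S with caron */
    case 0xA8: return 0x0161; /* s with caron */
    case 0xB4: return 0x017D; /* Z with caron */
    case 0xB8: return 0x017E; /* z with caron */
    case 0xBC: return 0x0152; /* OE ligature */
    case 0xBD: return 0x0153; /* oe ligature */
    case 0xBE: return 0x0178; /* Y with diaeresis */
    default:   return byte;   /* same as ISO-8859-1 */
    }
}
```

So a byte 0xA4 typed on an ISO-8859-15 system is a Euro sign, while a program that assumes ISO-8859-1 would treat it as the generic currency sign.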

Re: What is OS4's default character set?
Home away from home
struct Locale *locale = ILocale->OpenLocale(NULL);

uint32 charset = locale->loc_CodeSet;

where

uint32 loc_CodeSet
Specifies the code set required by this locale. Before V50, this
value was always 0. Since V50, this is the IANA charset number
(see L:CharSets/character-sets). For compatibility, 0 should be
handled as equal to 4, both meaning ISO-8859-1 Latin1.
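
A minimal sketch of the "0 equals 4" rule from the autodoc text above, in plain portable C. The function names and the tiny lookup table are hypothetical; the numbers are IANA MIBenum values, and on a real system the name lookup would go through the OS rather than a hard-coded table:

```c
#include <stddef.h>
#include <stdint.h>

/* Treat the pre-V50 value 0 as 4 (ISO-8859-1), as the autodoc asks. */
uint32_t normalize_code_set(uint32_t code_set)
{
    return code_set == 0 ? 4 : code_set;
}

/* Illustrative subset of the IANA registry; returns NULL for anything
 * not in this small table. */
const char *code_set_to_mime_name(uint32_t code_set)
{
    switch (normalize_code_set(code_set)) {
    case 3:   return "US-ASCII";
    case 4:   return "ISO-8859-1";
    case 5:   return "ISO-8859-2";
    case 12:  return "ISO-8859-9";
    case 106: return "UTF-8";
    case 111: return "ISO-8859-15";
    default:  return NULL;
    }
}
```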


Re: What is OS4's default character set?
Home away from home

or


Starting with V50, locale.library maintains a global environment
variable called "Charset" which contains the MIME name of the
current default charset as used in the system. This is the name
of the charset associated with the Locale structure returned by
OpenLocale(NULL).
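
A rough sketch of reading that variable, assuming the C runtime's getenv() can see AmigaOS global environment variables (I think the usual OS4 C libraries do this, but that is an assumption; dos.library's GetVar() is the native route). The function name and the ISO-8859-1 fallback are my own choices for illustration:

```c
#include <stdlib.h>

/* Returns the MIME name stored in the "Charset" environment variable,
 * or "ISO-8859-1" when the variable is missing or empty (e.g. before V50). */
const char *charset_from_env(void)
{
    const char *name = getenv("Charset");

    if (name == NULL || name[0] == '\0')
        return "ISO-8859-1";
    return name;
}
```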

Re: What is OS4's default character set?
Just popping in
I want to use my AmigaOS4.1 in English but set the fonts to Turkish. I still haven't figured this out.

Re: What is OS4's default character set?
Home away from home
PS: It took me two minutes from not knowing, to RTFM, to finding out, why autodoc authors even bother?


Re: What is OS4's default character set?
Home away from home
@broadblues Quote:
charset = locale->loc_CodeSet;

On its own a "MIBenum" number doesn't look terribly useful. I'll have to see if there is a way to get a meaningful name from it... (Maybe GetDiskFontCtrl() will do the job.)

Quote:
PS: It took me two minutes from not knowing, to RTFM, to finding out, why autodoc authors even bother?

You didn't even answer my main question (i.e. is ISO-8859-1 the default for English), so no need to be grumpy. Wikipedia was literally the ONLY website with any information.

Author of the PortablE programming language.
Re: What is OS4's default character set?
Home away from home
@ChrisH

Quote:

You didn't even answer my main question (i.e. is ISO-8859-1 the default for English),


There is no "default"; always use Locale to determine the charset. For example, once ancalimon creates his own custom locale for English in the Turkish character set, just the fact that it's using English would mess you up.

And as the other Chris said, many English users use ISO-8859-15 these days.


Quote:

so no need to be grumpy.


You beat me to the edit, where I was about to say that I'm only saying this because all the contributors to the thread are established and experienced developers, who should at least know how to read the autodocs. I of course wouldn't say it to a newbie dev, and if it came over as overly grumpy then sorry.

Quote:

Wikipedia was literally the ONLY website with any information.


I would trust wiki.amigaos.net over Wikipedia in such matters any day.


Re: What is OS4's default character set?
Home away from home
@tboeckel Quote:
For plain english the system charset should be ISO-8859-1

Thanks! It's amazing how this seems to be assumed as common knowledge, but doesn't actually seem to be stated anywhere (apart from Wikipedia, the unreliable font of all knowledge).

Quote:
LONG default_charset = GetDiskFontCtrl(DFCTRL_CHARSET);
char *charset = (char *)ObtainCharsetInfo(DFCS_NUMBER, default_charset, DFCS_MIMENAME);

Both are functions of diskfont.library.

Thanks. I may use that eventually - at the moment I just want to get something working with the common case (aka ISO-8859-1).

Quote:
Conversion is best done by codesets.library.

Which means I probably won't

Author of the PortablE programming language.
Re: What is OS4's default character set?
Home away from home
@ChrisH

Quote:

I'll have to see if there is a way to get a meaningful name from it... (Maybe GetDiskFontCtrl() will do the job.)


No, put the number from struct Locale into ObtainCharsetInfo()


Re: What is OS4's default character set?
Home away from home
@tboeckel Quote:
LONG default_charset = GetDiskFontCtrl(DFCTRL_CHARSET);
char *charset = (char *)ObtainCharsetInfo(DFCS_NUMBER, default_charset, DFCS_MIMENAME);

Both of those appear to be V50, so looks like I'll still have to assume ISO-8859-1 for AmigaOS 3.x (and probably MOS+AROS until I can be bothered to find out how they do it).

Author of the PortablE programming language.
Re: What is OS4's default character set?
Just popping in
Take a look at the source of codesets.library, function getSystemCodeset():

http://sourceforge.net/p/codesetslib/ ... EAD/tree/trunk/src/init.c

There you will find how codesets.lib supports all systems to obtain the currently active charset. Eventually it falls back to ISO-8859-1 if all other attempts fail.

Why stop it now, just when I am hating it?

Thore Böckelmann
Re: What is OS4's default character set?
Amigans Defender
@ChrisH

Quote:
Thanks. I may use that eventually - at the moment I just want to get something working with the common case (aka ISO-8859-1).


No. DO NOT ASSUME A CERTAIN CHARSET IS IN USE.

Re: What is OS4's default character set?
Home away from home
@ChrisH

What are you interested in the character set for?

If you want consistency, you should store any text string as UTF-8 and convert it to the character set used by the user.

At least if it is a language file.

In an 8-bit character set, codes 0 to 127 are plain English (7-bit ASCII), while codes 128 to 255 hold language-specific characters. The symbols assigned to those upper codes are not the same between languages; they are controlled by the code set that the user has selected.

If it's English you want, it makes little difference which charset you use, apart from the "€" symbol.

The codeset defines which symbol the OS should show for each code, depending on the language. It maps each code to the corresponding value in the UTF-32 table used by the fonts.

Therefore there is no "default character set"; character sets are irrelevant when it comes to 7-bit ASCII.
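
A small sketch of that last point: if a string contains only 7-bit ASCII, the selected charset makes no difference, so a program can check for that case first. The function name is hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when every byte of the NUL-terminated string is in the
 * range 0..127, i.e. the text reads the same in any ASCII-based charset. */
bool is_7bit_ascii(const char *text)
{
    for (const unsigned char *p = (const unsigned char *)text; *p != '\0'; ++p) {
        if (*p > 127)
            return false;
    }
    return true;
}
```

Only when this returns false does the program need to know which charset the bytes above 127 belong to.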


Edited by LiveForIt on 2014/11/10 15:57:30
(NutsAboutAmiga)

Basilisk II for AmigaOS4
AmigaInputAnywhere
Excalibur
and other tools and apps.
Re: What is OS4's default character set?
Home away from home
@ancalimon

Quote:

I want to use my AmigaOS4.1 in English but set the fonts to Turkish. I still haven't figured this out.


You will need to set your keyboard map to Turkish to get your charset, then your preferred language to English to get your language. They may or may not produce the effect you requIre?Iim testIng that concept as I type and these arenit typos! There agIn I donit know If the odd characters from they board wIll get trasmItted by AWeb????

BeIng unable to read turkIsh Iim unable to verIfy If that results In turkIsh language beIng dIsplayed correctly?




Re: What is OS4's default character set?
Just popping in
@All,

The only "common" information is... RKRM 3rd Edition based,

ISO-Latin-1 (this is ISO-8859-1 through ISO-8859-15 collectively)

You can only trust the character codes up to code 127 (DEL).

Anything above character 127 is subject to change at the user's whims.

Additionally ... I am working with UTF-8 as the codeset of choice for my own projects.

Use Locale.Library to get the MIBenum value and then query the on-disk reference file mapping them to names if you looked into S: and L:

DiskFont.Library will only tell you about what is currently displayed (and I am having fun and games with *multiple* Keymaps along with chording whole typed words for presenting small menus of options... 3000+ "daily Kanji" with readings anywhere from 1 through to 8 syllables for common and upto 16 syllables for uncommon readings, each "syllable" is equal to 2 or 3 English Letters...and that is only for the Japanese).

I wonder how anyone will cope when the "system default" is set for Unicode and there is no "upper limit" for Character codes (when a 32bit CodePoint IS reasonable).

Assumptions == Screwups of the worst kind... good to ask and definitely double-check before cutting code out of the frypan :P

Re: What is OS4's default character set?
Home away from home
Thanks again for everyone's suggestions on how to determine the current character set... even though it wasn't originally my intention to ask for that! Having finally got through a large list of things which were more necessary for my new program to function, I've now looked through those suggestions again, and implemented a hopefully good way of getting the current character set.

@Chris
Quote:
No. DO NOT ASSUME A CERTAIN CHARSET IS IN USE.

I don't see what's wrong with writing a "stub" function (which always returns ISO-8859-1), until my program becomes functional enough that it's worth finding out how to do it properly. Us solo programmers need to pick our fights carefully, and avoid extra work which isn't strictly necessary:
http://www.lispcast.com/how-to-write-software
(I agree with virtually everything he writes, apart from the part where he says to spend ages ensuring you write something 100% perfect the first time around.)

@broadblues
Thanks for your suggestion of "locale->loc_CodeSet". At the moment I'm using that first, and only if it fails for some reason do I fall-back to using "GetDiskFontCtrl(DFCTRL_CHARSET)".

Quote:
I would trust wiki.amigaos.net over Wikipedia in such matters any day.

Of course. But Google didn't find the info I was after on wiki.amigaos.net .

@tboeckel
Thanks for both of your suggestions. getSystemCodeset() was helpful in seeing how to do it on MorphOS & AROS.

Quote:
Conversion is best done by codesets.library.

I ended-up writing my own code to convert to/from other charsets, and read/write UTF-8 (the latter being somewhat time consuming since I wasn't familiar with how UTF-8 worked before). One benefit of doing it with my own code is that it will work on Windows/etc without any extra effort. Another benefit is that I can convert to/from UTF-8 while simultaneously converting encoded XML characters (rather than doing it less efficiently in two separate passes).
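
For anyone curious what the simplest case of such hand-written conversion looks like, here is a minimal sketch (not ChrisH's actual code, which isn't shown in this thread) of ISO-8859-1 to UTF-8. It relies on the fact that ISO-8859-1 bytes equal their Unicode code points, so bytes below 0x80 are copied and the rest become two-byte sequences. The function name is hypothetical:

```c
#include <stddef.h>

/* Converts a NUL-terminated ISO-8859-1 string to UTF-8.
 * Returns the number of bytes written (excluding the terminator),
 * or (size_t)-1 if `out` is too small. */
size_t latin1_to_utf8(const char *in, char *out, size_t out_size)
{
    size_t used = 0;

    for (const unsigned char *p = (const unsigned char *)in; *p != '\0'; ++p) {
        size_t need = (*p < 0x80) ? 1 : 2;

        if (used + need + 1 > out_size)      /* +1 keeps room for the NUL */
            return (size_t)-1;

        if (*p < 0x80) {
            out[used++] = (char)*p;                       /* plain ASCII */
        } else {
            out[used++] = (char)(0xC0 | (*p >> 6));       /* lead byte */
            out[used++] = (char)(0x80 | (*p & 0x3F));     /* continuation */
        }
    }

    if (used + 1 > out_size)                 /* empty input, zero-size buffer */
        return (size_t)-1;
    out[used] = '\0';
    return used;
}
```

An output buffer of twice the input length plus one is always large enough, since no ISO-8859-1 byte needs more than two UTF-8 bytes.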

Author of the PortablE programming language.
Re: What is OS4's default character set?
Home away from home
@LiveForIt Quote:
What are you interested in the character set for?

Text downloaded from the internet comes in all sorts of encodings, and displaying them correctly is tricky.

Quote:
If you want consistency, you should store any text string as UTF-8 and convert it to the character set used by the user.

That is in fact what I settled on doing, otherwise things get too complicated. Luckily XML tends to be UTF8 in the first place.

@Belxjander
Quote:
DiskFont.Library will only tell you about what is currently displayed

I'm afraid I don't understand how that might be a problem. Would "locale->loc_CodeSet" (after "locale=OpenLocale(NULL);") be better than "GetDiskFontCtrl(DFCTRL_CHARSET)"?

Quote:
I wonder how anyone will cope when the "system default" is set for Unicode and there is no "upper limit" for Character codes (when a 32bit CodePoint IS reasonable).

I don't see how AmigaOS can support Unicode as the system default. Even using UTF-8 would cause problems for many programs, which assume 1 byte is 1 character.

About the only solution I *can* see for AmigaOS, would be to have new functions which were explicitly UTF-8 (possibly also allowing UTF-16), and then have all legacy OS functions automatically convert UTF-8 to/from a "legacy character set". Anything which can't be converted would get replaced by a question mark or whatever (which apparently isn't advised for security reasons, but I can't see a better solution).
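
A sketch of what that "convert to a legacy character set, replace the rest with a question mark" step could look like, using ISO-8859-1 as the legacy charset. This is purely illustrative: the function names are hypothetical and nothing like this exists in AmigaOS as far as I know. Malformed and overlong UTF-8 sequences are also replaced rather than passed through:

```c
#include <stddef.h>
#include <stdint.h>

/* Decodes one UTF-8 sequence at *pp and advances *pp past it.
 * Returns the code point, or 0xFFFD for malformed or overlong input. */
static uint32_t next_code_point(const unsigned char **pp)
{
    static const uint32_t min_cp[] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = *pp;
    uint32_t cp;
    int extra;

    if (p[0] < 0x80)                { cp = p[0];        extra = 0; }
    else if ((p[0] & 0xE0) == 0xC0) { cp = p[0] & 0x1F; extra = 1; }
    else if ((p[0] & 0xF0) == 0xE0) { cp = p[0] & 0x0F; extra = 2; }
    else if ((p[0] & 0xF8) == 0xF0) { cp = p[0] & 0x07; extra = 3; }
    else { *pp = p + 1; return 0xFFFD; }   /* stray continuation byte etc. */

    ++p;
    for (int i = 0; i < extra; ++i, ++p) {
        if ((*p & 0xC0) != 0x80) {         /* truncated sequence (or NUL) */
            *pp = p;
            return 0xFFFD;
        }
        cp = (cp << 6) | (uint32_t)(*p & 0x3F);
    }
    *pp = p;
    return (cp < min_cp[extra]) ? 0xFFFD : cp;   /* reject overlong forms */
}

/* Converts NUL-terminated UTF-8 to ISO-8859-1, writing '?' for anything
 * that has no ISO-8859-1 equivalent. Returns the number of bytes written
 * (excluding the terminator), or (size_t)-1 if `out` is too small. */
size_t utf8_to_latin1(const char *in, char *out, size_t out_size)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t used = 0;

    if (out_size == 0)
        return (size_t)-1;

    while (*p != '\0') {
        if (used + 1 >= out_size)          /* need room for char + NUL */
            return (size_t)-1;
        uint32_t cp = next_code_point(&p);
        out[used++] = (cp <= 0xFF) ? (char)(unsigned char)cp : '?';
    }
    out[used] = '\0';
    return used;
}
```

The information loss is obvious here: a Euro sign and a Kanji both come out as the same "?", which is why this can only ever be a compatibility path for legacy functions.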

Author of the PortablE programming language.