I think it's better to use UTF-7 because UTF-8 conflicts with ESC code sequences in AmigaDOS.
UTF-7 could maybe be added as a patch to AmigaOS to display text, but you must have some kind of way to type the text as well; that's the biggest problem.
Edited by LiveForIt on 2007/7/13 12:28:44
(NutsAboutAmiga)
Basilisk II for AmigaOS4 AmigaInputAnywhere Excalibur and other tools and apps.
funny, I was thinking about asking the same this morning too.. basically just because I was jumping from one wikipedia link to another while I was searching some information regarding your "port for dummies" guide!
Or else a deeper integration must be done where the ESC code sequences become embedded inside UTF-8, but then the shell system must be completely updated.
Quote:
Or else a deeper integration must be done where the ESC code sequences become embedded inside UTF-8, but then the shell system must be completely updated.
You want <CrsrUp> to be encoded in UTF-8? That's not possible; the con-handler only supports <CSI> but not <ESC>[. <CSI> is encoded differently in UTF-8, <ESC>[ is not.
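To make the byte-level difference concrete, here is a minimal Python sketch. It assumes CSI means the single C1 control character U+009B (byte 0x9B in an 8-bit charset) and that <ESC>[ means the two ASCII characters 0x1B 0x5B; the variable names are made up for this example.

```python
# CSI as a single C1 control character (U+009B) versus the
# two-character ASCII form ESC [ (U+001B U+005B).
CSI = "\x9b"
ESC_BRACKET = "\x1b["

# In an 8-bit charset such as ISO-8859-1, CSI is the single byte 0x9B.
csi_latin1 = CSI.encode("iso-8859-1")
# In UTF-8, U+009B is encoded as two bytes.
csi_utf8 = CSI.encode("utf-8")

# ESC [ is pure ASCII, so it is byte-identical in both encodings.
esc_latin1 = ESC_BRACKET.encode("iso-8859-1")
esc_utf8 = ESC_BRACKET.encode("utf-8")

print(csi_latin1.hex(), csi_utf8.hex())
print(esc_latin1.hex(), esc_utf8.hex())
```

The first line prints `9b c29b`, the second `1b5b 1b5b`: only the CSI form changes when the stream is switched to UTF-8.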
That should not be necessary; when you have files from different countries, they should use their correct symbols.
If I have downloaded a file from Sweden or Finland or some other country, it should use the original symbols in its file name, not the symbols from the CHARSET I'm currently using.
I was saying that the con-handler needs UTF-8 support; "<ESC>[" should be the same as CSI.
If you convert ?????? to UTF-8 and insert it into the current shell, some of the resulting byte values are interpreted as ESC code sequences. This happens because the con-handler expects ASCII code, and therefore assumes the UTF-8 encoded ?????? symbols are ESC codes.
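A small Python sketch of how such a clash can arise, assuming the console treats the byte 0x9B as CSI. The helper name `utf8_bytes_seen_as_csi` and the sample characters are chosen for illustration only; they are not the characters from the post above.

```python
CSI_BYTE = 0x9B  # 8-bit CSI as a byte-oriented console would see it

def utf8_bytes_seen_as_csi(text: str) -> list:
    """Return the characters of `text` whose UTF-8 encoding contains 0x9B."""
    return [ch for ch in text if CSI_BYTE in ch.encode("utf-8")]

# Illustrative sample: three Swedish letters plus one letter that clashes.
sample = "åäöÛ"
for ch in sample:
    print(ch, ch.encode("utf-8").hex())

print(utf8_bytes_seen_as_csi(sample))
```

Of these four characters only "Û" (UTF-8 bytes C3 9B) contains the CSI byte, so a console that scans raw bytes would start parsing an escape sequence in the middle of that character.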
In order to clean this up, ESC codes must be in UTF-8 format.
As for input.handler, data coming from there must be converted using a CHARSET, for example ISO-8859-7, and the result should be UTF-8 before it is written to the terminal.
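The conversion step could look roughly like this in Python. This is only a sketch of the idea, not how input.handler works; the function name `input_to_utf8` is hypothetical and the charset is passed in as a plain codec name.

```python
def input_to_utf8(raw: bytes, charset: str = "iso8859_7") -> bytes:
    """Decode bytes typed in an 8-bit charset and re-encode them as UTF-8."""
    return raw.decode(charset).encode("utf-8")

# 0xE1 is GREEK SMALL LETTER ALPHA in ISO-8859-7.
typed = bytes([0xE1])
print(input_to_utf8(typed).hex())
```

The single input byte becomes the two-byte UTF-8 sequence for the same letter, so the terminal only ever has to deal with one encoding.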
The symbol "ESC [" should be the format used for new applications; CSI should be made obsolete or only used when older programs expect that format.
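As a sketch of what that migration could mean for legacy output, the Python snippet below rewrites 8-bit CSI into ESC [ while transcoding to UTF-8. It assumes the legacy stream is ISO-8859-1; the function name `legacy_to_utf8` is made up for this example.

```python
def legacy_to_utf8(raw: bytes, charset: str = "iso-8859-1") -> bytes:
    """Turn legacy 8-bit console output into UTF-8 with 7-bit escapes."""
    text = raw.decode(charset)             # byte 0x9B becomes U+009B
    text = text.replace("\x9b", "\x1b[")   # CSI -> ESC [
    return text.encode("utf-8")

# CSI 1 m (bold on) followed by "Û", whose UTF-8 form contains byte 0x9B.
legacy = b"\x9b1m" + "Û".encode("iso-8859-1")
converted = legacy_to_utf8(legacy)
print(converted)
```

After the conversion every remaining 0x9B byte is part of a multi-byte character, never a control code, so a UTF-8 aware console no longer has to guess.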
I think the funny part is tetisoft asking why would you need utf8 at all ;) Same as memory protection right? who needs it anyway ?!
hum yea i forgot.. amiga is a retro nostalgia thingy.. and having all those features would make it a reasonably modern environment.. god forbid ;)
but wait, you got lots of shiny icons don't you all? who cares about the rest when one get plenty of nice glowy icons ? ;)
maybe you could paint utf8 glyphs on png icons and use them to display some sentence.. i know it could take a while to line up all the icons properly to be able to read the sentence, and you would also need to use another OS to take screenshots of all the glyphs before copying and pasting them into icons .... but it would be so retro fun..
now that's definitely a good project idea for glowicon-revolution 14
Quote:
If I have downloaded a file from Sweden or Finland or some other country, it should use original symbols in its file name, not the symbols from CHARSET I'm currently using.
I have no problems with Swedish or Finnish file names, because Swedish, Finnish and German can all be displayed with ISO-8859-1 or -15. When you want e.g. Chinese filenames, you would need a new filesystem first.
AmigaOS filenames are documented to be in ISO-8859-1 and case insensitive ("HÄKKINEN" overwrites "Häkkinen"). When you have downloaded a Chinese file and want to delete it, how are you going to do that when you don't know how to type the Chinese filename in the shell? Cheating with Tab completion or using WB is not allowed ;)
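A rough Python illustration of that case-insensitive behaviour. The function name `same_amiga_name` is made up, the sample names are illustrative, and Python's `str.upper()` is only an approximation; this is not the folding code of any actual Amiga filesystem.

```python
def same_amiga_name(a: bytes, b: bytes) -> bool:
    """Compare two ISO-8859-1 encoded names without regard to case."""
    # Assumption: upper-casing the decoded text approximates the
    # filesystem's case folding; real filesystems may fold a
    # different set of characters.
    return a.decode("iso-8859-1").upper() == b.decode("iso-8859-1").upper()

old = "HÄKKINEN".encode("iso-8859-1")
new = "Häkkinen".encode("iso-8859-1")
print(same_amiga_name(old, new))
print(same_amiga_name(old, b"Hakkinen"))
```

Two names that differ only in case compare as equal, which is why writing one would replace the other; a name without the umlaut stays distinct.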
Quote:
I was saying the con-handler needs UTF-8 support; ESC[ should be the same as CSI
Then file a Bugzilla enhancement request. But don't forget that using ESC[ escape codes instead of CSI escape codes in AmigaDOS would break compatibility with all existing Unix implementations which expect CSI codes from an Amiga which is attached to the serial port or logged in via the net.
Quote:
I think the funny part is tetisoft asking why would you need utf8 at all ;)
I think the funny part is that you miss that it was me who added UTF-8 support to e.g. locale.library, TypeManager, keymap.library, CharsetConvert. But maybe you didn't understand what I wanted. To rephrase it, where exactly and for what exactly do you need UTF-8? I'm asking because it's likely that I'll answer "You need UTF-8 in AmigaOS component A? Then we would need UTF-8 support in component B first"...
Quote:
Same as memory protection right? who needs it anyway ?!
I like sarcastic comments. Especially those which can be answered with "It was me who implemented write protection of .rodata sections in OS4".
To sum up the discussion, I would not even think about using UTF-8 file names before the user is able to type UTF-8 file names, I would not even think about allowing the user to type UTF-8 before AmigaOS can display UTF-8, I would not even think about allowing AmigaOS to display UTF-8 until most parts of AmigaOS are charset aware (the caller of the text display function is responsible for knowing whether the to-be-displayed text is Cyrillic, Greek or UTF-8), ...
but hey, all paths need to start somewhere right ? well, what can i say?.. i guess all efforts are much appreciated.. but that still doesn't make os4 more usable yet. i'll probably still have to wait a lot, until it reaches a state where i can use it on a daily basis..
i guess utf8 support and memory protection and such aren't to be considered luxurious or elite features nowadays ..
but, otoh, you os4 dev ppl are lucky, not many of the ppl left in the community really understand those concepts of memory protection, so they don't care much about it but .. there's a whole world outside the community you know ? ;)
well at least you got me watching ;) keep going, maybe within the next 5 years .. or maybe not, you tell me ;)
As for when I need UTF-8: most notably when chatting via IRC. It's irritating that it gets more and more common to get garbage chars when friends type Swedish characters.
I guess it would help to display some webpages better too.
It should be possible to add UTF-8 support in WookieChat using codesets.library. Maybe you should ask jahc for it.
Something like codesets.library is only required for software which has to work on AmigaOS 3.x; on OS4 it's not required. OS4 even includes two charset conversion facilities: the iconv functions in newlib and diskfont.library ObtainCharsetInfo() DFCS_MAPTABLE. The main problem is displaying the Unicode texts: although that has been supported since AmigaOS 2.0 or 2.1 (bullet.library API; on OS4 it's easier to use through the diskfont.library Open/CloseOutlineFont() and E*() functions), hardly anything uses it. In ReAction and MUI programs you have the additional problem that you can't render into the window yourself and the standard classes are limited to 8-bit charsets, so you'd have to (re)implement all gadgets, etc.
Quote:
To sum up the discussion, I would not even think about using UTF-8 file names before the user is able to type UTF-8 file names,
Even if you could type them you couldn't use them: the only filesystem which allows UTF-8 names is only available to the OS4 beta testers ... Since FFS2, SFS, etc. reject everything which includes chars between 128 and 159, you can't create files with UTF-8 names.
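To see which UTF-8 names would run into that restriction, here is a small Python check. It simply assumes that a name is refused as soon as one of its bytes lies in the range 128 to 159; the function name `rejected_bytes` is hypothetical and this is not actual filesystem code.

```python
def rejected_bytes(name: str) -> list:
    """Bytes of the UTF-8 encoded name that fall in the range 128..159."""
    return [b for b in name.encode("utf-8") if 128 <= b <= 159]

# Illustrative names: plain ASCII, a Latin-1 capital letter, a CJK character.
for name in ["readme", "Å", "文"]:
    hits = rejected_bytes(name)
    print(name, name.encode("utf-8").hex(), [hex(b) for b in hits])
```

Plain ASCII names pass, while "Å" (C3 85) and "文" (E6 96 87) both contain continuation bytes in the forbidden range, so under this assumption neither could be created.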
Yes, but there are WookieChat ports for OS3, MorphOS and AROS. It will save jahc much work if he uses codesets.library. What is the problem with displaying Swedish characters using the Swedish charset, once the UTF-8 string has been converted to the normal Swedish Amiga charset?
Quote:
As for when I need UTF-8: most notably when chatting via IRC. It's irritating that it gets more and more common to get garbage chars when friends type Swedish characters.
I've heard that the IRC protocol doesn't include any MIME specification of the used charset. The user is responsible for knowing which charset is used by the other user and for sending the text he typed in the charset which is expected by the other user.
Or in other words, you have to tell the other users that they should send ISO-8859-1 or -15. Or you use an IRC client which is able to decode UTF-8 and convert it to the current OS4 system default charset before displaying the text. No, you don't need an IRC client which can display full Unicode, you don't even need an IRC client which can display any 8-bit charset; a simple conversion from UTF-8 to the current system default charset is enough to talk Swedish with Swedish friends.
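Such a conversion could be sketched in Python as follows. The function name `irc_line_to_local`, the default charset and the fallback rule (treat anything that is not valid UTF-8 as already being in the local charset) are assumptions for this example, not the behaviour of any existing IRC client.

```python
def irc_line_to_local(raw: bytes, local_charset: str = "iso8859_15") -> bytes:
    """Convert a received IRC line to the local 8-bit charset."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: assume the sender already used the local charset.
        return raw
    # Characters missing from the local charset are replaced with "?".
    return text.encode(local_charset, errors="replace")

print(irc_line_to_local("Hallå".encode("utf-8")))
print(irc_line_to_local("Hallå".encode("iso8859_15")))
print(irc_line_to_local("中".encode("utf-8")))
```

Swedish text arrives intact whether the sender used UTF-8 or the 8-bit charset, and characters outside the local charset degrade to a question mark instead of garbage bytes.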
Quote:
I guess it would help to display some webpages better too.
You don't need UTF-8 support in OS4 to be able to display most UTF-8 encoded webpages; a UTF-8 decoder plus conversion to the current system default 8-bit charset would be enough for the majority of cases IMHO (Swedish users prefer Swedish websites, which can normally be displayed without problems with their Swedish system default fonts in their Swedish system default charset).
Quote:
If wookie and ibrowse would support UTF8.
Quoting the IBrowse history.txt file from the OS4Final CD: Quote:
-or- Added preliminary support for pages using UTF-8 encoding, mapping back to windows-1252 for now
Please ask the WookieChat author about an option to decode UTF-8 to the current OS4 system default charset before displaying the received text.