PTDQ - faster C2P
Quite a regular
Just released.

PTDQ is a video system for AGA Amigas that provides a chunky-to-planar method which is faster than the traditional ones. It is the higher-quality brother of PTDS (formerly PED81C), another system based on the same core principle.

SIMPLIFIED COMPARISON CHART
----------------+------------+----------------+--------------+---------+------
                | horizontal | maximum number | color choice | visual  |
system          | resolution |      of colors | freedom      | quality | speed
----------------+------------+----------------+--------------+---------+------
PTDQ            | full       |            256 | **           | **      | **
PTDS            | half       |             81 | *            | *       | ***
traditional C2P | full       |            256 | ***          | ***     | *

          
C2P PERFORMANCE COMPARISON SUMMARY
---------+---------+------------------+---------------
         |         | C2P conversions  | C2P conversion
 machine | routine | per second       | time (frames)
---------+---------+------------------+---------------
 A1200   | PTDQ    | 12.670  (+2.493) | 3.946 (-0.966)
         | K030    | 10.177           | 4.912
---------+---------+------------------+---------------
 A1200B  | PTDQ    | 50.098 (+13.425) | 0.998 (-0.365)
         | K030    | 36.673           | 1.363
---------+---------+------------------+---------------
 A1200PS | PTDQ    | 86.340 (+12.494) | 0.579 (-0.098)
         | K040    | 73.846           | 0.677
---------+---------+------------------+---------------
 A1200TF | PTDQ    | 67.689  (+4.699) | 0.738 (-0.055)
         | K040    | 62.990           | 0.793
---------+---------+------------------+---------------
 A4000CS | PTDQ    | 72.690  (+7.614) | 0.687 (-0.081)
         | K040    | 65.076           | 0.768

A1200:     Amiga 1200
A1200B:    Amiga 1200, Blizzard 1230 IV, 68030 50 MHz, 60 ns RAM
A1200PS:   Amiga 1200, PiStorm32-lite, Raspberry Pi CM4, firmware v1.04
A1200TF:   Amiga 1200, TerribleFire TF1260, 68060 50 MHz, firmware 68090
A4000CS:   Amiga 4000, NTSC PAL-jumpered, CyberStorm MK III, 68060 50 MHz

PTDQ:      PTDQ_DoC2P()
K030/K040: Kalms c2p1x1_8_c5_030_2() / c2p1x1_8_c5_040()
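
The two numeric columns of the performance table appear to be linked by the frame rate: dividing 50 (PAL frames per second, an assumption that fits the published figures to within about 0.001) by the conversions per second gives the conversion time in frames. A minimal Python sketch of that relation follows; the helper name is hypothetical and the figures are copied from the table.

```python
PAL_FPS = 50  # assumption: "frames" in the table are PAL frames (50 per second)

def frames_per_conversion(conversions_per_second, fps=PAL_FPS):
    """Time taken by one C2P conversion, expressed in frames."""
    return fps / conversions_per_second

# (machine, PTDQ conversions/s, Kalms conversions/s), copied from the table
results = [
    ("A1200",   12.670, 10.177),
    ("A1200B",  50.098, 36.673),
    ("A1200PS", 86.340, 73.846),
    ("A1200TF", 67.689, 62.990),
    ("A4000CS", 72.690, 65.076),
]

for machine, ptdq, kalms in results:
    print(f"{machine:8} PTDQ {frames_per_conversion(ptdq):.3f} frames, "
          f"Kalms {frames_per_conversion(kalms):.3f} frames, "
          f"gain {ptdq - kalms:+.3f} conversions/s")
```

Small differences in the last digit against the table are expected, since the table was presumably produced from unrounded measurements.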


Video: https://www.youtube.com/watch?v=witR2EE9No8

[The video quality of the real machine output is heavily affected by the fact that the scandoubler did not support SHRES (so a real-time software trick was used to somehow produce the colors, although it is only a visual illusion and causes a sort of rasterline effect), the monitor did not support progressive PAL and the video was captured with an ancient phone at just 24.917 Hz. YouTube's compression degraded the video quality.]

Full details are provided in documentation included in the archive that can be downloaded from https://retream.itch.io/ptdq.

RETREAM - retro dreams for Amiga, Commodore 64 and PC
Re: PTDQ - faster C2P
Not too shy to talk
hypothetically could this be used in an operating system app, especially, a web browser that has an off screen buffer where the html/css has been rendered into a chunky bitmap, and then the actual browser needs to blit sections of that into the real screen buffer to show the rendering in a window

Re: PTDQ - faster C2P
Quite a regular
@NinjaCyborg

Technically yes, but the LORES resolution and the lack of clarity wouldn't be suitable for web pages.

RETREAM - retro dreams for Amiga, Commodore 64 and PC
Re: PTDQ - faster C2P
Not too shy to talk
it doesn't work at hires 256 colours on AGA?

Re: PTDQ - faster C2P
Quite a regular
@NinjaCyborg

No, because it uses SHRES to simulate LORES dots.

RETREAM - retro dreams for Amiga, Commodore 64 and PC
Re: PTDQ - faster C2P
Just popping in
It looks really great and fast :) But which games benefit from it? And how does it work? Can I just load it, install it and start something like Wolf3D to get more fps?

I'm just not that much into this :)

Re: PTDQ - faster C2P
Quite a regular
@manga303

It's for developers, so only future games/demos will benefit from it - if any will use it at all, that is.

RETREAM - retro dreams for Amiga, Commodore 64 and PC
Re: PTDQ - faster C2P
Quite a regular
This video shows the Amiga AGA chipset combining 3 full screen 8-bit layers (or playfields, if you prefer) using various 8-bit alpha values.





LAYERS

Background:
* PTDQ system
* 320x200 dots
* max 256 colors

Middleground:
* PTDS system
* 160x200 logical dots, 319x200 physical dots
* max 16 non-transparent colors
* each base color can have an arbitrary 8-bit alpha (actually used: 0 for complete transparency, 128 for dark colors, 255 for bright colors)
* "native" chunky dots (i.e. each byte in the layer buffer corresponds to a dot)
* triple buffer

Foreground:
* PTDQ system
* 320x200 dots
* max 81 non-transparent colors
* each base color can have an arbitrary 8-bit alpha (actually used: 0 for complete transparency, 192 for see-through graphics, 255 for solid graphics)


NOTES

* The color model is RGBW for all the layers, but each layer could use a color model of its own without making any difference performance-wise.
* If the middleground had used PTDQ, its size would have been 320x200 dots and its maximum number of non-transparent colors would have been 81. However, that would have required the PTDQ C2P conversion.
* If the middleground did not use 100% transparent dots, its maximum number of colors would have been 81.
* If the foreground did not use 100% transparent dots, its maximum number of colors would have been 256.
* The display size is actually 319x200 dots to hide the leftmost column of dots, as PTDS requires a 1-dot shift to the right for the even bitplanes.
* The ball is rendered by scaling and flipping in real time a 128x128 chunky bitmap.
* The ball is wiped by means of both CPU and Blitter. The logic that handles the geometry still needs to be refined in order to provide a massive speedup.
* For convenience, the video has been recorded with WinUAE.
* On a stock Amiga 1200, the demo runs at 50 fps except when the ball covers most of the screen (in that case, the frame rate drops proportionally to the size of the ball); the slowdowns will be greatly reduced once the wiping is optimized.
* On an accelerated Amiga, the demo runs at a steady 50 fps.
* YouTube's encoding reduced the saturation of colors.


CREDITS

The graphics have been obtained by processing the following pictures:
* boing ball: from OBLIGEMENT / http://obligement.free.fr/gfx4/boingb ... tree_amigacorporation.png
* Earth: from https://cronianverse.fandom.com/wiki/Earth
* plate: from Freepik / https://www.freepik.com/premium-ai-ima ... -background_277147660.htm
* porthole: by winwood1, from Wallpapers.com / https://wallpapers.com/png/round-porth ... 024-zmsra0nm6gp8hvuv.html


Edited by saimo on 2025/9/16 7:46:03
Edited by saimo on 2025/9/16 7:50:26
Edited by saimo on 2025/9/16 7:55:05
RETREAM - retro dreams for Amiga, Commodore 64 and PC
Re: PTDQ - faster C2P
Quite a regular
Time for a new demo. This one also shows the AGA chipset combining 3 full screen 8-bit layers (or playfields, if you prefer) using various 8-bit alpha values.
It isn't available for download yet, as I still have to finish writing some more technical details.





LAYERS

Common:
* 320x256 dots
* PTDQ system
* RGBWa color model

Background:
* maximum 256 colors
* horizontal scrolling

Middleground:
* maximum 256 colors
* colors use 8-bit alpha values for the fade-in and cross-fading effects
* vertical scrolling

Foreground:
* maximum 81 non-transparent colors
* triple buffer

NOTES

* All the layers reside in CHIP RAM.
* If the foreground did not use 100% transparent dots, its maximum number of colors would have been 256.
* The horizontal scrolling requires the background layer to be at least twice as wide as the screen plus 32 dots (i.e. 16 bytes, which in total waste 16*256 = 4096 bytes of CHIP RAM). Making the layer an additional D dots wider (where D is divisible by 64) would waste a further D/4 bytes of CHIP RAM per line.
* The background layer can be made to scroll also vertically without problems.
* The Thalion logo is scaled in real time on a chunky raster which gets written to the foreground layer by means of the PTDQ C2P conversion routine PTDQ_DoC2P_R().
* The 24-bit palettes for the fade-in and cross-fading effects are pre-calculated at startup; the effects are obtained by writing each frame a whole palette to the COLORxx registers with the CPU during the vertical blanking.
* The music is a tracker module played by means of P6112.
* On a stock Amiga 1200, the demo runs at 50 fps.
* YouTube's encoding degraded the quality.
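
The note about pre-calculated palettes can be illustrated with a small Python sketch. It is only an assumption about how such tables might be built: the post does not say which interpolation the demo uses, so plain linear interpolation per 8-bit channel is used here, and the function names are hypothetical.

```python
def blend_channel(a, b, step, steps):
    """Linear interpolation between two 8-bit channel values."""
    return (a * (steps - step) + b * step) // steps

def blend_color(src, dst, step, steps):
    """Interpolate two 24-bit 0xRRGGBB colors channel by channel."""
    out = 0
    for shift in (16, 8, 0):
        a = (src >> shift) & 0xFF
        b = (dst >> shift) & 0xFF
        out |= blend_channel(a, b, step, steps) << shift
    return out

def crossfade_palettes(src_palette, dst_palette, steps):
    """Pre-calculate steps+1 palettes going from src_palette to dst_palette."""
    return [
        [blend_color(s, d, step, steps) for s, d in zip(src_palette, dst_palette)]
        for step in range(steps + 1)
    ]

# Tiny 2-color example; a real table would hold 256 colors per palette.
tables = crossfade_palettes([0x000000, 0xFF8040], [0xFFFFFF, 0x000000], steps=4)
for palette in tables:
    print(" ".join(f"{color:06X}" for color in palette))
```

A fade-in is the same thing with an all-black source palette. Building the tables once at startup trades memory for per-frame CPU time, which matches the approach described in the note.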


Edited by saimo on 2025/9/21 21:13:54
Edited by saimo on 2025/9/21 21:40:09
RETREAM - retro dreams for Amiga, Commodore 64 and PC
Re: PTDQ - faster C2P
Quite a regular
I have uploaded the demo now. You can get it from https://retream.itch.io/ptdq.
The delay was due to the fact that I wanted to add the following section to the documentation.

PERFORMANCE

The following calculations evaluate the performance on an Amiga without FAST
RAM.

Given that the screen is 1280 SHRES pixels wide, 256 lines tall and 8 bitplanes
deep, the amount of bytes fetched each frame by the bitplanes DMA is
1280/8*256*8 = 327680. However, thanks to the AGA 64-bit data fetch, the number
of reads per frame is 327680/8 = 40960.
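
The bitplane DMA figure can be reproduced with a few lines of Python. This is only a sketch of the arithmetic above; the function name is hypothetical, and it assumes that every fetch moves 8 bytes (64 bits).

```python
def bitplane_dma(width_pixels, lines, depth, fetch_bytes=8):
    """Return (bytes fetched per frame, number of DMA reads per frame)."""
    bytes_per_plane_line = width_pixels // 8       # 1 bit per pixel per bitplane
    bytes_per_frame = bytes_per_plane_line * lines * depth
    reads_per_frame = bytes_per_frame // fetch_bytes
    return bytes_per_frame, reads_per_frame

# 1280 SHRES pixels, 256 lines, 8 bitplanes, 64-bit fetches
print(bitplane_dma(1280, 256, 8))
```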

The pulse effect applied to the Thalion logo redraws every frame an area of
64x46 dots, which amount to 64*46 = 2944 bytes (as each dot corresponds to a
byte).
Given that the scaling reads and writes the dots by byte (from many tests, it
turned out to be the fastest solution on a stock Amiga 1200), it needs 2944*2 =
5888 memory accesses. Such accesses are performed one after another, without
other instructions causing delays. However, the CPU is never granted access to
the CHIP bus twice in a row, so the accesses actually take 5888*2 = 11776 color
clocks.
After the scaling, the data needs to be read, C2P-converted and then written to
the foreground bitplanes, for a total of 2944*2 = 5888 bytes. To convert 32
bytes, PTDQ_DoC2P_R() performs 9 memory accesses in parallel with other
operations and 7 accesses right after another access, thus needing 9+7*2 = 23
color clocks. Therefore, ignoring for simplicity a marginal overhead, it needs
(5888/32)*23 = 4232 color clocks.
In all, the pulse effect needs 11776+4232 = 16008 color clocks.
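
The pulse effect estimate can be written as a small Python model. It is a sketch under the same simplifications as the text (back-to-back CPU accesses cost 2 color clocks each, the marginal overhead is ignored); the constant and function names are hypothetical.

```python
BACK_TO_BACK_COST = 2   # color clocks per CPU access that directly follows another
OVERLAPPED_COST = 1     # color clocks per access that overlaps other operations

def scaling_clocks(width, height):
    """Byte-wise scaling: one read and one write per dot, all back to back."""
    accesses = width * height * 2
    return accesses * BACK_TO_BACK_COST

def c2p_clocks(width, height):
    """C2P cost, using the per-32-byte access pattern given for PTDQ_DoC2P_R()."""
    bytes_moved = width * height * 2                      # chunky read + planar write
    clocks_per_32_bytes = 9 * OVERLAPPED_COST + 7 * BACK_TO_BACK_COST
    return bytes_moved // 32 * clocks_per_32_bytes

def pulse_clocks(width, height):
    return scaling_clocks(width, height) + c2p_clocks(width, height)

print(scaling_clocks(64, 46), c2p_clocks(64, 46), pulse_clocks(64, 46))
```

Changing the area passed to `pulse_clocks()` gives a quick feel for how much a larger redrawn region would cost on the CHIP bus.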

When a fade-in or cross-fading effect takes place, the CPU also writes 24-bit
values to all the COLORxx registers. So it needs to:
 * read 256*4 = 1024 bytes from CHIP RAM,
 * write 256*4 = 1024 bytes to COLORxx and
 * perform 8*2 = 16 writes to BPLCON3.
Both reads and writes are made by longword (thus setting 2 COLORxx registers at
a time), however the writes internally execute as two word writes, for a total
of 3 accesses per 2 COLORxx registers. Given that all these accesses are
consecutive, they take 3*2 = 6 color clocks. The fact that the high and the
low order bits of the colors have to be written separately means that for each
color 2 accesses are needed, thus cancelling the advantage of processing the
colors by longword. As a result, 256*6 = 1536 color clocks are needed.
The writes to BPLCON3 are spaced a little bit by means of a few instructions,
so, for simplicity, it can be assumed that each of them needs 1 color clock.
In all, a fade-in or cross-fading effect needs 1536+16 = 1552 color clocks.
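
The need for two passes comes from the fact that each AGA COLORxx register holds only 12 bits, so a 24-bit color is written as a high-nibble word and a low-nibble word (the latter with the LOCT bit of BPLCON3 set). The Python sketch below shows that split and the cost estimate above; the function names are hypothetical, and the per-color cost of 6 color clocks is taken from the text rather than derived.

```python
def split_rgb24(color):
    """Split 0xRRGGBB into the two 12-bit words written to a COLORxx register."""
    r, g, b = (color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF
    high = ((r >> 4) << 8) | ((g >> 4) << 4) | (b >> 4)      # high nibbles
    low = ((r & 0xF) << 8) | ((g & 0xF) << 4) | (b & 0xF)    # low nibbles (LOCT set)
    return high, low

def palette_update_clocks(colors=256, banks=8):
    """Color clocks for one full palette update, per the estimate above."""
    color_clocks = colors * 6       # 6 color clocks per color (high + low pass)
    bplcon3_clocks = banks * 2      # one BPLCON3 write per bank and per pass
    return color_clocks + bplcon3_clocks

high, low = split_rgb24(0x123456)
print(f"{high:03X} {low:03X}", palette_update_clocks())
```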

When the staggered lines option is on, the Copper alters BPLCON1 once per line
by means of a WAIT and a MOVE instruction, which need 4 memory accesses, for a
total of 256*4 = 1024 additional accesses. Moreover, due to the horizontal
scrolling, the CPU needs to update the values of the MOVE instructions in the
Copperlist, performing 256 consecutive writes which require 256*2 = 512 color
clocks.
In all, the staggered lines need 1024+512 = 1536 color clocks.
(Note: using a Copper loop to have just two MOVE instructions that can be
quickly modified by the CPU does not provide any speed benefit, as each jump,
executed by a MOVE to COPJMP2, would require 2 more reads, thus cancelling the
gain on the CPU side.)
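
To make the "4 memory accesses per line" concrete, here is a Python sketch that builds such a Copperlist fragment as a list of 16-bit words. It is an illustration, not the demo's actual list: the first display line, the horizontal wait position and the two BPLCON1 values are hypothetical, and the extra WAIT that a real list needs when crossing raster line 255 is left out.

```python
BPLCON1 = 0x102  # register offset from the custom chip base ($DFF000)

def copper_wait(vpos, hpos=0x06):
    """WAIT instruction: two words (beam position with bit 0 set, then mask)."""
    return [((vpos & 0xFF) << 8) | (hpos & 0xFE) | 1, 0xFFFE]

def copper_move(register, value):
    """MOVE instruction: two words (register offset, then data)."""
    return [register & 0x01FE, value & 0xFFFF]

def staggered_copperlist(first_line, lines, even_value, odd_value):
    """One WAIT + one MOVE to BPLCON1 per line, alternating two scroll values."""
    words = []
    for i in range(lines):
        words += copper_wait(first_line + i)
        words += copper_move(BPLCON1, odd_value if i & 1 else even_value)
    return words

# Hypothetical parameters: first line $2C, scroll values $0000 / $0011
words = staggered_copperlist(0x2C, 256, 0x0000, 0x0011)
copper_reads = len(words)       # the Copper reads one word per memory access
cpu_clocks = 256 * 2            # one back-to-back CPU write per MOVE value
print(copper_reads, cpu_clocks, copper_reads + cpu_clocks)
```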

The demo also plays music, which requires some color clocks for the audio data
DMA. Due to how music playback works, there is no simple way to calculate its
load on the CHIP bus. However, it is possible to calculate the load in a very
heavy case (and certainly worse than the actual one) by assuming that all the 4
channels play continuously at almost the maximum frequency allowed, i.e., for
simplicity, 28500 Hz. That means that 28500*4 = 114000 bytes need to be read
per second, i.e. about 114000/50 = 2280 bytes per frame. Given that the data is
fetched by words, those amount to 2280/2 = 1140 memory accesses per frame.
(Note: if the quality were half of the one considered, which is closer to
reality, the accesses would be 1140/2 = 570, but such value would not make much
difference anyway.)
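
The worst-case audio estimate reduces to one formula, sketched here in Python. The function name is hypothetical; the 50 frames per second and the 2-byte fetch size are the values used in the text.

```python
def audio_dma_accesses(sample_rate_hz, channels=4, fps=50, fetch_bytes=2):
    """Audio DMA memory accesses per frame, with 8-bit samples fetched by word."""
    bytes_per_second = sample_rate_hz * channels
    bytes_per_frame = bytes_per_second // fps
    return bytes_per_frame // fetch_bytes

print(audio_dma_accesses(28500))   # heavy case considered above
print(audio_dma_accesses(14250))   # half the sample rate
```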

Putting all the figures together, the total number of color clocks needed is
40960+16008+1552+1140 = 59660 if the staggered lines are off and 59660+1536 =
61196 if the staggered lines are on. Considering that a PAL Amiga has 313*227 =
71051 color clocks per (long) frame, those figures represent respectively
100*59660/71051 = 84.0% and 100*61196/71051 = 86.1% of the color clocks
available in a frame.
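
The whole budget can be tallied in a few lines of Python, which makes it easy to see how each contribution weighs on the frame. This is a sketch that simply reuses the figures derived above (the dictionary keys and function name are hypothetical); it yields roughly 84% and 86% of the frame.

```python
COLOR_CLOCKS_PER_FRAME = 313 * 227   # PAL long frame

loads = {
    "bitplane DMA": 40960,
    "pulse effect": 16008,
    "palette fade": 1552,
    "audio DMA (heavy case)": 1140,
}
STAGGERED_LINES = 1536

def bus_load_percent(color_clocks, budget=COLOR_CLOCKS_PER_FRAME):
    """Share of the frame's color clocks taken by the given load."""
    return 100 * color_clocks / budget

total_off = sum(loads.values())
total_on = total_off + STAGGERED_LINES

for name, clocks in loads.items():
    print(f"{name:24} {clocks:6} {bus_load_percent(clocks):5.1f}%")
print(f"{'total, staggered off':24} {total_off:6} {bus_load_percent(total_off):5.1f}%")
print(f"{'total, staggered on':24} {total_on:6} {bus_load_percent(total_on):5.1f}%")
```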

It might seem that there is some room to perform other operations, but the
above calculations do not take into account the reads performed by the CPU to
fetch the instructions (which do occur often as the 68020 has only a tiny
instruction cache of 256 bytes); moreover, the CPU is not just reading and
writing data, but also performing other operations (which only partially
overlap with writes).
Therefore, unfortunately, there is not really much more that can be done in a
frame time.

RETREAM - retro dreams for Amiga, Commodore 64 and PC