I spent the last few days working on sprite loading and display.
My code is 90% complete but, of course, completely untested for now.
It will probably take some time before I post new pictures that include sprites.
The sprite code amounts to 20% of the whole design, and 70% of the generated logic in the FPGA.
Once I have finished that, I still have a lot of small details and things to finish.
The TODO list is maintained inside the SVN anyway.
2010/05/04
2010/05/01
Changes...
Today, I managed to fix the following bugs:
- Color blending overflow detection had a problem with "half" color; this is now fixed and the colors look good.
- Tested a Mode 7 screen from a game: the coordinate system is now fixed.
Some timing adjustments may be necessary, but everything seems to be fine otherwise.
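As an illustration of the overflow handling in the first item, here is a minimal Python model of blending one color channel. It is a software sketch, not the FPGA code: the name `blend_channel`, the 5-bit channel width, and the order of operations (clamp negatives, halve, then clamp to the maximum) are assumptions. The point it shows is that with "half" enabled the overflow check has to look at the halved value, otherwise 31 + 31 would be clamped to 31 first and then halved to 15.

```python
MAX_5BIT = 31  # assumed channel width: 5 bits per color component


def blend_channel(main, sub, subtract=False, half=False):
    """Blend one color channel (hypothetical model, not the HDL)."""
    # Work on a wider intermediate so the carry/borrow is not lost.
    result = main - sub if subtract else main + sub
    if result < 0:
        result = 0          # underflow on subtraction clamps to black
    if half:
        result >>= 1        # halve BEFORE checking for overflow
    return min(result, MAX_5BIT)


print(blend_channel(31, 31))             # 31 (clamped)
print(blend_channel(31, 31, half=True))  # 31 (62 halved, no clamp needed)
```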
I also decided to write a prototype of the sprite system.
To decide the priority of the sprites when drawing ONE pixel, I must have simultaneous access to the 34-tile sprite cache for the current pixel, do a priority check on the valid pixels, and take the pixel with the highest priority.
This results in having the following information per sprite:
- Palette
- Priority
- Start X
- 16x2 bits of sprite info (8 pixels @ 4 BPP for 1 line)
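The per-pixel selection described above can be sketched in Python as follows. This is a sequential software model of what the hardware does in parallel, under stated assumptions: the names `SpriteTile` and `select_sprite_pixel` are hypothetical, color index 0 is treated as transparent, a larger `priority` value wins, and ties go to the earlier cache slot.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SpriteTile:
    palette: int
    priority: int
    start_x: int
    pixels: List[int]  # 8 pixels, one 4-bit color index each (0 = transparent, assumed)


def select_sprite_pixel(tiles: List[SpriteTile], x: int) -> Optional[Tuple[int, int, int]]:
    """Return (palette, priority, color) of the winning sprite pixel at x, or None."""
    best = None
    for tile in tiles:
        offset = x - tile.start_x
        if not 0 <= offset < 8:
            continue                      # this tile does not cover pixel x
        color = tile.pixels[offset]
        if color == 0:
            continue                      # transparent pixel is not valid
        # Strict '>' keeps the earlier slot on a tie (assumption).
        if best is None or tile.priority > best[1]:
            best = (tile.palette, tile.priority, color)
    return best


tiles = [
    SpriteTile(palette=1, priority=0, start_x=10, pixels=[1] * 8),
    SpriteTile(palette=2, priority=3, start_x=12, pixels=[0, 5, 5, 5, 5, 5, 5, 5]),
]
print(select_sprite_pixel(tiles, 12))  # (1, 0, 1): second tile is transparent here
print(select_sprite_pixel(tiles, 13))  # (2, 3, 5): higher priority wins
```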
As the sprite units are also loaded while the display is being performed, I do bank switching and have to store TWICE that amount of memory.
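To get a feel for why this is expensive in flip-flops, here is a rough storage estimate in Python. Only the 32 bits of pixel data, the 34 tiles, and the two banks come from the text; the field widths for palette (3 bits), priority (2 bits), and Start X (9 bits) are assumptions based on the usual SNES sprite attribute layout.

```python
PIXEL_BITS = 16 * 2      # 8 pixels at 4 BPP for one line (from the list above)
PALETTE_BITS = 3         # assumed width
PRIORITY_BITS = 2        # assumed width
START_X_BITS = 9         # assumed width
TILES = 34               # size of the sprite tile cache
BANKS = 2                # one bank displayed while the other is loaded

bits_per_tile = PIXEL_BITS + PALETTE_BITS + PRIORITY_BITS + START_X_BITS
total_bits = bits_per_tile * TILES * BANKS

print(bits_per_tile)  # 46
print(total_bits)     # 3128 register bits under these assumptions
```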
One possible trick is to do the LOADING of the sprites into a real MEMORY rather than flip-flops,
and then do the transfer internally during the H-Blank.
(I would need 34 cycles; that seems possible.)
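A minimal Python sketch of that idea: tiles are loaded into a staging RAM during the line, then copied into the display registers one slot per cycle during H-Blank. The class and method names are hypothetical, and clearing the staging slot after the copy is an assumption made for the model.

```python
TILE_SLOTS = 34


class SpriteLineBuffer:
    """Hypothetical model: staging RAM plus display registers."""

    def __init__(self):
        self.staging_ram = [None] * TILE_SLOTS   # written while the line is displayed
        self.active_regs = [None] * TILE_SLOTS   # read by the pixel priority logic

    def load(self, slot, entry):
        # Loading never touches the registers that are currently displayed.
        self.staging_ram[slot] = entry

    def hblank_transfer(self):
        """Copy one slot per cycle; returns the number of cycles used."""
        cycles = 0
        for slot in range(TILE_SLOTS):
            self.active_regs[slot] = self.staging_ram[slot]
            self.staging_ram[slot] = None        # ready for the next line (assumption)
            cycles += 1
        return cycles


buf = SpriteLineBuffer()
buf.load(0, "tile for next line")
cycles_used = buf.hblank_transfer()
print(cycles_used)  # 34
```

Compared with two full register banks, only one set of registers is needed; the second copy lives in RAM.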
Anyway, for now the sprite system is just HUGE.
It is not complete yet (the state machine for the tile loading is not done, but it will not be that huge), but I already reach 2835 slices of my Virtex-4. That is about 18% of my SX35 Virtex just for the sprite system...
Even if I later manage to create a different architecture, I will only reduce that by less than 50%, most likely 40%, even if I remove 50% of the internal registers.
Basically, once the PPU is finished, expect the sprite system to consume around 40% of the PPU design.
In the best case, I could reduce the sprite unit to X / BPP0123.
Use RAM to do the transfer during H-Blank, and use another RAM to store the palette/priority of the visible sprites.
Because RAM needs one cycle for the access compared to my preloaded registers,
I would need to read the sprites one pixel ahead of the BGs.
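The one-cycle latency and the read-ahead can be modeled in a few lines of Python. This is a sketch under assumptions: `SyncRam` and `scan_line` are hypothetical names, and the RAM is modeled as a simple synchronous memory whose output appears one clock after the address is presented.

```python
class SyncRam:
    """Synchronous RAM model: data shows up one clock after the address."""

    def __init__(self, contents):
        self.contents = list(contents)
        self.data_out = None

    def clock(self, address):
        self.data_out = self.contents[address]


def scan_line(contents, read_ahead):
    """Return what the pixel mixer sees for each x (read_ahead is 0 or 1)."""
    ram = SyncRam(contents)
    last = len(contents) - 1
    if read_ahead:
        ram.clock(0)                           # fetch pixel 0 one cycle early
    seen = []
    for x in range(len(contents)):
        seen.append(ram.data_out)              # value visible during pixel x
        ram.clock(min(x + read_ahead, last))   # address presented during pixel x
    return seen


print(scan_line([10, 11, 12, 13], read_ahead=0))  # [None, 10, 11, 12]: one pixel late
print(scan_line([10, 11, 12, 13], read_ahead=1))  # [10, 11, 12, 13]: aligned
```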
Then I may be lucky enough to reduce this design to 40% of its original size.
It is something I may do later on, when everything is working fine...
But for now, I prefer an easy design with no timing issues.
Pixel / Priority => Output.