** Sega MD 3D engine update 7 ** Sped up rendering by another 15-20%! Massive unrolling of the line drawing hot path has given a good pickup in rendering, although further improvements are needed, particularly for small triangles, as most of the overhead is not in...

17,409 views • 1 month ago • via X (Twitter)

Related videos

** Sega Genesis 3D Engine Day 3 ** Blast Processing kicks in!! Converted the line drawing and triangle drawing to 68k assembly and gained 25% in the line drawing and 33% in the triangle drawing vs C (so far). The scheduler calculating the vertices is still in C; just the drawing routines are now in assembly. Possibly another 5-10% will come from moving the scheduler to ASM as well.

The line and triangle drawing C code took some beating, which was surprising, particularly getting the line drawing faster in ASM than in C. I guess it's not too surprising, as the C code was basically handling pointers and integers, so I had to get exotic to find the speedups. Initially my ASM line draw was 100% slower than the C code. The first-pass conversion from C to ASM is always rough, as it's more about getting it working and refining later, but 100% slower on the first pass was brutal. After 4 hrs of tweaking it was 25% faster than the C code, so that's a good start, and there might be more to come.

So I added more squares and more triangles for load testing at the same frame rate. It's still not anything cohesive, but right now I'm focusing on rendering performance to see how much can be pushed. As rendering will be the biggest CPU soak, the faster that goes the better. The smaller triangles take about 1/3 the CPU time of the larger triangles at present, so there's definitely a fill rate limit vs the setup time to consider. The same goes for the line-framed squares: smaller ones would allow a lot more. #SGDK #SegaGenesis #SegaMegadrive
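For readers wondering what a pointer-and-integer line drawing routine looks like in C, here is a minimal sketch of the classic integer Bresenham algorithm. It is not the author's routine: the one-byte-per-pixel framebuffer `fb`, its 64x64 size, and the names `plot` and `draw_line` are all assumptions made for illustration, and the real Mega Drive code would write into 4-bit packed tile data instead.

```c
#include <stdint.h>
#include <stdlib.h>

#define FB_W 64
#define FB_H 64

/* Hypothetical 1-byte-per-pixel framebuffer, for illustration only. */
static uint8_t fb[FB_H][FB_W];

/* Clipped pixel write. */
static void plot(int x, int y, uint8_t color)
{
    if (x >= 0 && x < FB_W && y >= 0 && y < FB_H)
        fb[y][x] = color;
}

/* Integer-only Bresenham line: adds and compares only in the loop. */
void draw_line(int x0, int y0, int x1, int y1, uint8_t color)
{
    int dx = abs(x1 - x0);
    int dy = -abs(y1 - y0);
    int sx = (x0 < x1) ? 1 : -1;
    int sy = (y0 < y1) ? 1 : -1;
    int err = dx + dy;              /* combined error term */

    for (;;) {
        plot(x0, y0, color);
        if (x0 == x1 && y0 == y1)
            break;
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }  /* step in x */
        if (e2 <= dx) { err += dx; y0 += sy; }  /* step in y */
    }
}
```

A loop this small leaves a compiler little to waste, which may be part of why the C version was hard to beat; hand-written assembly would typically win by specialising per octant and unrolling the inner loop.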

Shannon Birt

19,781 views • 2 months ago

*** Mega Parodius Sega Megadrive Update *** Lately I've been having some fun seeing how far we can push the Parodius game engine. This is not arcade accurate, just for testing: I decreased the bullet timers and upped the bullet count to see what would happen to CPU usage when things are made a lot busier! Maybe something like this could be an insane mode in the options.

Since the last update we are a lot more optimised under the hood. The C-based sprite engine, particularly the visibility checking, was sped up and is 25% faster overall. Then I rewrote the sprite engine completely in 68k assembly, which was 30% faster again. It took about 3 days and 1500 lines of assembly; thankfully the gains were worth it. All these gains will be backported to S.O.T.A as well, as it uses the same sprite engine. Usage is around 19% of CPU per frame with a full sprite load - 33% CPU left in the busiest frame currently.

Parodius uses a "write the sprite list every frame" type of engine, so priority and meta-sprite objects can be handled with ease, albeit it's still a bit slower than static-allocation sprite engines such as the one in Lufthoheit, but it's a bit easier to code for in the long run. All collision checks are being done; we are using a spatial grid system to get them done faster than a brute-force approach. To reach the maximum sprite count, destruction is disabled in the video.

We hit 80 sprites onscreen in this sequence; the sprite counter is on the left side. We use about 16 sprites in the top HUD and water line, about 21 for player attacks / missiles / shots / options, and up to 35 bullets + enemies. No Lazer usage here, as the Lazers use zero sprites thanks to the raster tricks; this video is all about the sprites! Improved the water line when the Catboss is active: as he is sprites + foreground, we are doing some tile rotation tricks (e.g. bit scrolling) to give the impression of parallax at that point.

Vector Orbitex has updated the stage 1 music again and has made things even higher quality; he's mixed PCM and FM channels together to create higher-quality orchestral hits, for example. Pyron has started converting assets from stages 2/3, with some amazing work there. He's well ahead of me at present, which is a good thing. Still lots of incomplete animations, missing logic etc. It's slow going with RL getting in the way haha, and 2 other projects!! Still, it's fun!! #SGDK #SegaMegadrive #SegaGenesis #Parodius
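The spatial grid idea mentioned in the post can be sketched roughly as follows. This is an illustrative C version under assumptions, not the game's code: the 32x32 pixel cell size, the 320x224 play area, the cap of 8 objects per cell, and the names `Box`, `grid_insert` and `grid_query` are all hypothetical. Targets are registered in the cells they overlap, and a shot is only tested against objects in its own cells rather than against every object.

```c
#include <stdint.h>
#include <string.h>

#define CELL_SHIFT   5      /* 32x32 pixel cells (assumed size) */
#define GRID_W       10     /* 320 / 32 */
#define GRID_H       7      /* 224 / 32 */
#define MAX_PER_CELL 8      /* assumed cap; extra objects are dropped */
#define NO_HIT       (-1)

typedef struct { int16_t x, y, w, h; } Box;

static uint8_t cell_count[GRID_H][GRID_W];
static uint8_t cell_items[GRID_H][GRID_W][MAX_PER_CELL];

/* Map a pixel coordinate to a cell index, clamped to the grid. */
static int cell_of(int px, int max_cell)
{
    if (px < 0) return 0;
    int c = px >> CELL_SHIFT;
    return c > max_cell ? max_cell : c;
}

static int overlaps(const Box *a, const Box *b)
{
    return a->x < b->x + b->w && b->x < a->x + a->w &&
           a->y < b->y + b->h && b->y < a->y + a->h;
}

/* Empty the grid; called once per frame before re-inserting. */
void grid_clear(void)
{
    memset(cell_count, 0, sizeof cell_count);
}

/* Register object `id` in every cell its box touches. */
void grid_insert(uint8_t id, const Box *b)
{
    int cx1 = cell_of(b->x + b->w - 1, GRID_W - 1);
    int cy1 = cell_of(b->y + b->h - 1, GRID_H - 1);
    for (int cy = cell_of(b->y, GRID_H - 1); cy <= cy1; cy++)
        for (int cx = cell_of(b->x, GRID_W - 1); cx <= cx1; cx++)
            if (cell_count[cy][cx] < MAX_PER_CELL)
                cell_items[cy][cx][cell_count[cy][cx]++] = id;
}

/* Return the id of the first registered box overlapping `probe`, or NO_HIT. */
int grid_query(const Box *probe, const Box *boxes)
{
    int cx1 = cell_of(probe->x + probe->w - 1, GRID_W - 1);
    int cy1 = cell_of(probe->y + probe->h - 1, GRID_H - 1);
    for (int cy = cell_of(probe->y, GRID_H - 1); cy <= cy1; cy++)
        for (int cx = cell_of(probe->x, GRID_W - 1); cx <= cx1; cx++)
            for (int i = 0; i < cell_count[cy][cx]; i++) {
                int id = cell_items[cy][cx][i];
                if (overlaps(probe, &boxes[id]))
                    return id;
            }
    return NO_HIT;
}
```

With N shots and M targets, brute force costs N x M box tests per frame, while the grid usually leaves each shot with only a handful of candidates. The cost is rebuilding the grid every frame.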

Shannon Birt

16,084 views • 6 months ago

** MEGA Parodius Scaling Effects Part 1 ** One of the big challenges with the Parodius Megadrive port is Stage 8's boss - the puffer-fish *Pooyan* with his full-screen scaling effect. The goal is to be very close to the arcade (with extras on top), so I thought let's tackle it head on to see how close we can get. I was also keen to jump into another scaling code rabbit hole haha. Pyron pulled out all the stops and got me the source frames and reworked the BG tiles for this test - a big thank you to him. Vector Orbitex is busy working on Stage 2 tracks, so the team is working hard all round on this port.

The MD has no sprite / background GFX scaling hardware. However, the VDP's vertical scroll can be updated per scanline to help with vertical scaling on backgrounds, but there is a CPU cost to manage all the interrupts, so that's not free either. With horizontal scaling there is no help at all, apart from a semi-friendly packed pixel format for the CPU to work with; it's not quite chunky format, but still better than planar format for scaling. So it falls back to the 68k CPU to do all of the horizontal expansion, which is the largest CPU cost: basically drawing strips of 1x, 2x, 3x or 4x wide columns at speed.

So we are one week into this boss's routine, and you can see from the video below that the horizontal scaling is implemented (vertical will be in the next update). We are scaling from 1x to 4x in 74 steps for testing. The column distributions are always a bit painful to do; thankfully they are all worked out now. This is the third scaler I have built, and the goal with this one was to make it really flexible for use in other projects as well; sometimes when you optimise something to the last degree, all the flexibility gets taken out of it.

Currently the scaler updates at 12-25 FPS here. I set some rules about which optimisations I would use and which I wouldn't; thankfully we are still a bit ahead of the arcade's animation frame rate, and I may yet find optimisations that fit within the scope. We have vertical scaling and sprite spikes to add yet, so I'm hoping I can find a few more optimisations to offset things when they are implemented.

In a scale frame update we are processing close to 42000 pixels in RAM before using DMA to send them to VRAM, using a 41x16 (656 tile) scale buffer - single buffered for now due to its size in VRAM. So that's nearly 21k in tiles! I had to reorganise RAM a bit to support a buffer of that size for the stage. The scaling function is written in 68k assembly, with a little C code handling the vertical interrupt (so the game logic can actually run, plus DMA updates etc). The DMA routines are in assembly as well and customised for large chunk sizes (big blocks of tiles), which suits the scaler. I had some race conditions to sort out where the CPU was faster than the DMA (sending tiles from RAM to VRAM) and some cases where it wasn't, so it had to be balanced. We may be able to add more detail to the top and bottom of the background yet, but it's low priority for now until all the other bits are in!! #SGDK #SegaMegadrive #Genesis #Parodius
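To make the "column distribution" and the buffer numbers above concrete, here is a small C sketch. It is an assumption-laden illustration rather than the author's scaler: `column_widths` and `scale_row` are hypothetical names, pixels are stored one per byte for clarity (the real hardware packs two 4-bit pixels per byte), and a production version would presumably precompute the width tables for each of the 74 steps instead of dividing at run time. The enum simply checks the arithmetic quoted in the post, given 8x8 tiles at 4 bits per pixel.

```c
#include <stdint.h>

/* Buffer arithmetic from the post: 41x16 tiles of 8x8 pixels at 4bpp. */
enum {
    SCALE_TILES  = 41 * 16,           /* 656 tiles */
    SCALE_BYTES  = SCALE_TILES * 32,  /* 20,992 bytes, "nearly 21k" */
    SCALE_PIXELS = SCALE_TILES * 64   /* 41,984 pixels, "close to 42000" */
};

/*
 * Spread dst_w output columns over src_w source columns as evenly as
 * possible. Each entry ends up as floor or ceil of dst_w / src_w, and
 * the entries always sum to exactly dst_w.
 */
void column_widths(int src_w, int dst_w, uint8_t *widths)
{
    int acc = 0;
    for (int i = 0; i < src_w; i++) {
        acc += dst_w;
        widths[i] = (uint8_t)(acc / src_w);
        acc -= widths[i] * src_w;   /* carry the remainder forward */
    }
}

/* Expand one row by repeating each source pixel widths[i] times. */
int scale_row(const uint8_t *src, int src_w,
              const uint8_t *widths, uint8_t *dst)
{
    int out = 0;
    for (int i = 0; i < src_w; i++)
        for (int k = 0; k < widths[i]; k++)
            dst[out++] = src[i];
    return out;                      /* number of pixels written */
}
```

For example, scaling 4 source columns to 10 output columns yields widths of 2, 3, 2, 3, so the widened columns are interleaved rather than bunched at one edge.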

Shannon Birt

25,495 views • 5 months ago

** MEGA Parodius Scaling Effects Part 2 ** Well - this is the BIG one - literally!! Huge thanks to my team Pyron & Vector Orbitex for their efforts. Pyron has provided all the source frames for Puyon and his spikes / explosion, plus a lot of analysis video on how the spikes behave / move, which was really helpful. Pyron also used his CRT setup for this video, as we felt an emulator video would not do it justice - running on 100% real MD hardware FTW. Vector has provided the catchy boss music and it's sounding great - as always! The coding on this has been a bit insane. Things done since the part 1 post:

Implemented double buffering - the last video was single buffered - so VRAM is very tight now; we only have about 40/2048 tiles free. For the longest time I didn't think it would fit, but I found a VRAM jigsaw puzzle that made it work in the end. Double buffering has cleaned up the stability of the animation and now matches the arcade scaling effect, albeit at 2x the VRAM cost.

Vertical scaling implemented - the vertical scaler was taken from Lufthoheit (my other shooter) and it's heavily optimised to reduce CPU usage. During the scaling, the vertical scaler partitions the 68k processor registers into 2 sets: 4 registers are allocated to feeding the fast horizontal interrupt (h-int) that drives the vertical scaling, and the remaining 12 registers are for the horizontal scaler running in the background. This setup is much faster than the normal backup / restore register methods, as we do not need to back up / restore registers in the h-int, which would double CPU costs. The catch is that the moment any background routine tries to use the h-int's registers it would break things, so it has to be carefully timed.

Added the spike projectiles - this was very tricky to get close to the arcade. They are basically semi heat-seeking missiles and hence needed code that works out angle differences to the player at speed. There's no time for arctan or similar, so it uses faster lookup tables to work out the angles. They speed up over time and get larger, and what's more we can't keep all the scales in VRAM - we have room for 2 spike buffers only. Working out scaled coordinates for the circular launch of the spikes was also a real pain.

Added fish damage - shock frame and explosion frames from the arcade. Very proud of the fact that we have the full arcade-quality explosion, which is fully scaled as well.

Added temporal masking - which is fancy wording for: don't draw anything to the buffer if there is nothing there to scale. So empty corners and edges can be optimised out to lower CPU costs and ROM costs. I had to make some scripting for this and work out which areas did not need drawing at all in the frame, which should be force-cleared by the CPU, and which should just be copied from ROM. This reduced ROM size by 30 kb, and with a bit more work we could extend that to 60 kb and get a bit more speedup in doing so.

Added a frame limiter. In the last video update we let the 68k burn hot and just pump out frames as fast as it could; here we match the arcade animation rate, which does leave the CPU idling at times, particularly in the smaller frames. Even at the large sizes we could be running the animation 25% faster; the issue is that it would speed up the game logic and make it less arcade accurate. We had some real bullet / spike hell simulations going without the limiter, but yeah, we had to tone it down a bit. Maybe in a hardcore mode we could let it run wild though!

Code is 95% 68k assembly with about 5% C code (v-int, as it's a cold path) driving things. This sort of thing needs all the speed it can get!!

Well, now after all that I can return to finish off Level 1 haha - just a wee sidetrack there. No doubt we will polish the stage 8 boss some more in time too!! #SegaMegadrive #SegaGenesis #Parodius #SGDK
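One common way to get an angle toward the player without calling arctan is an octant fold plus a tiny ratio table, sketched below in C. This is only a guess at the general approach, not the game's actual tables: the 256-units-per-turn angle format, the 9-entry `atan_tab`, and the names `angle_to` and `steer` are assumptions. Angle 0 points along +x and angles grow toward +y (downward on screen).

```c
#include <stdint.h>

/* atan(r/8) for r = 0..8, in 256-per-turn units (so 45 degrees = 32). */
static const uint8_t atan_tab[9] = { 0, 5, 10, 15, 19, 23, 26, 29, 32 };

/* Approximate direction from (0,0) to (dx,dy), 0..255. */
uint8_t angle_to(int dx, int dy)
{
    int ax = dx < 0 ? -dx : dx;
    int ay = dy < 0 ? -dy : dy;
    int a;                               /* first-quadrant angle, 0..64 */

    if (ax == 0 && ay == 0)
        return 0;                        /* no direction; pick 0 */
    if (ax >= ay)
        a = atan_tab[(ay * 8) / ax];     /* shallow: 0..32 */
    else
        a = 64 - atan_tab[(ax * 8) / ay];/* steep: mirror around 45 deg */

    if (dx >= 0 && dy >= 0) return (uint8_t)a;
    if (dx <  0 && dy >= 0) return (uint8_t)(128 - a);
    if (dx <  0 && dy <  0) return (uint8_t)(128 + a);
    return (uint8_t)(256 - a);           /* dx >= 0, dy < 0 */
}

/* Turn `current` toward `target` by at most max_turn units per call. */
uint8_t steer(uint8_t current, uint8_t target, int max_turn)
{
    int d = (target - current) & 0xFF;   /* difference, wrapped to 0..255 */
    if (d > 128) d -= 256;               /* take the shorter way round */
    if (d >  max_turn) d =  max_turn;
    if (d < -max_turn) d = -max_turn;
    return (uint8_t)(current + d);
}
```

Limiting the turn per frame is what would make a projectile "semi" homing: it bends toward the player but can still be dodged. The sketch still performs one integer division per lookup; a table indexed directly by quantised `dx` and `dy` would avoid that at the cost of more ROM.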

Shannon Birt

26,344 views • 4 months ago

*** Parodius Demo Update *** After seeing my good friend Pyron's amazing Parodius artwork, I could not let that go to waste. It's a crying shame Konami's shooters did not make it to the MD at the time; time to rectify that (in time). The video shows some of the progress so far.

Stage 1 scrolling uses the streaming scroller from Lufthoheit to handle the tile management for the scrolling automatically, rather than using different sets etc. It's great when I can grab bits of code and apply them to new projects; it speeds things up!! Lufthoheit's new stage is coming along well too and has its first enemies in, so hopefully I can show something of that soon. The VRAM view is on the right; it seems we will have plenty of VRAM left for objects without resorting to much manual management. ~1/2 of VRAM is left.

It looks like the biggest challenge for this level is the Cat Ship mid boss - the end boss seems easier in comparison for a few reasons. I've analysed the way it was done on PCE and SNES and will come up with the best option for MD. More to come on that soon. At present we will target mainly assembly, so performance won't be an issue. I'm not sure we need a sprite multiplexor for > 80 sprites, but we will use one anyway so as not to have regrets / changes later on, and for the H-int color effects the multiplexor can also do.

We have an A-level musician volunteering as well, for those concerned about the audio side of things - it won't disappoint, I'm sure. As this is now my 3rd / 4th project it will be done as time permits, so no guarantees on timelines, sorry - we are all hoping to do well by the community however. #SGDK #Konami #Parodius #MegaDrive #SegaGenesis

Shannon Birt

12,526 views • 9 months ago

*** Parodius Update - LAZERS !! *** How many sprites do we have to use on the Sega Genesis to make 4 x 160 pixel wide lazers?? 10? 20? ZERO!! Which is really great, as when the options overlap, all with Lazers, using sprites would otherwise have led to severe sprite overload / dropout. So yes, we have some raster trickery going on. I think this might take the cake for the most exotic effect I've added just for a power-up type haha. Breaking down the effect:

1.) Lazer tile rows (the tile rows the lazers are currently on) are copied to an offscreen area of the plane. I used the DMA VRAM to VRAM copy mode for this. You can see these offscreen rows just above the game video's window in the Plane A tilemap view.

2.) Then a solid block of Lazer-colored tiles is drawn at the horizontal position of the Lazer in the offscreen buffer we just copied to - these are 8x8 tiles.

3.) Then onscreen, as the display is being drawn, we change to the Lazer's row at the correct pixel line offset for exactly 1 scanline, then change back to the normal screen's next row. I'm using the horizontal interrupt system developed in SOTA for handling the interrupts for this.

The technique has been done before on the NES (Salamander tech demo); as usual some of the best techniques come from the 8-bit systems. Still, it presented challenges getting it working on the MD. These irregularly spaced interrupts on the Genesis are a little bit of a problem too. All the height sorting has to be done on the frame before.

What's very interesting about the VRAM to VRAM copy DMA (Copy DMA) is that it runs asynchronously to the CPU. Normally a ROM/RAM to VRAM write DMA (Write DMA) halts the CPU during the send operation, but with a Copy DMA I can pipeline one Copy DMA setup while the previous one is still copying. So even though the Copy DMA is 1/2 the speed of a Write DMA, with the pipelining and no stalling it's probably on par or better in some cases, where a copy makes sense to use.

This still needs some fine tuning; I might make smaller lazers with more breaks etc. The Lazers can be full screen width or as narrow as we want, so there's a bit of fine tuning to come here. Pyron Vector Orbitex #Parodius #SegaMegadrive #SegaGenesis #SGDK
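The remark that the height sorting has to be done on the previous frame suggests preparing an ordered list of raster split points ahead of time, so the interrupt handler can walk it top to bottom. A hedged C sketch of that preparation step follows; the `SplitEvent` structure and the names `sort_events` and `event_gaps` are hypothetical, and how the resulting gaps would be fed to the VDP's line interrupt counter is deliberately left out.

```c
#include <stdint.h>

#define MAX_LASERS 4

typedef struct {
    uint8_t scanline;   /* screen line where the one-line swap happens */
    uint8_t laser_id;   /* which laser this split belongs to */
} SplitEvent;

/* Insertion sort by scanline: cheap for a handful of events. */
void sort_events(SplitEvent *ev, int n)
{
    for (int i = 1; i < n; i++) {
        SplitEvent key = ev[i];
        int j = i - 1;
        while (j >= 0 && ev[j].scanline > key.scanline) {
            ev[j + 1] = ev[j];
            j--;
        }
        ev[j + 1] = key;
    }
}

/*
 * For sorted events, compute the number of lines between consecutive
 * splits (the first gap is measured from line 0). These irregular gaps
 * are what a handler would need in order to schedule the next split.
 */
void event_gaps(const SplitEvent *ev, int n, uint8_t *gaps)
{
    uint8_t prev = 0;
    for (int i = 0; i < n; i++) {
        gaps[i] = (uint8_t)(ev[i].scanline - prev);
        prev = ev[i].scanline;
    }
}
```

Doing this during the previous frame keeps the per-scanline handler down to reading the next table entry, which matters when interrupts can land only a few lines apart.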

Shannon Birt

16,162 views • 9 months ago