Differences
Shows the differences between two versions of the page.
cursos:ensamblador:gfx2_direccionamiento [21-01-2024 17:07] – [Calculating pixel positions by composition] sromero | cursos:ensamblador:gfx2_direccionamiento [21-01-2024 17:22] (current) – [Optimizations for Get_Pixel_Offset_HR] sromero | ||
---|---|---|---|
Line 970: | Line 970: | ||
;--- End of new code --- | ;--- End of new code --- | ||
- | ret | ||
- | </code> | ||
- | |||
- | Sometimes a routine can be rewritten to be slightly more efficient: \\ //Dean Belfield//, on his site //L Break Into Program//, provides the following optimized routine, which takes 117 t-states at the cost of not returning the relative position of the pixel: | ||
- | |||
- | \\ | ||
- | <code z80> | ||
- | ; Get screen address | ||
- | ; B = Y pixel position | ||
- | ; C = X pixel position | ||
- | ; Returns address in HL | ||
- | ; | ||
- | Get_Pixel_Address: | ||
- | ld a, b ; Calculate Y2,Y1,Y0 | ||
- | and %00000111 | ||
- | or %01000000 | ||
- | ld h, a ; Store in H | ||
- | ld a, b ; Calculate Y7,Y6 | ||
- | rra ; Shift to position | ||
- | rra | ||
- | rra | ||
- | and %00011000 | ||
- | or h ; OR with Y2,Y1,Y0 | ||
- | ld h, a ; Store in H | ||
- | ld a, b ; Calculate Y5,Y4,Y3 | ||
- | rla ; Shift to position | ||
- | rla | ||
- | and %11100000 | ||
- | ld l, a ; Store in L | ||
- | ld a, c ; Calculate X4, | ||
- | rra ; Shift into position | ||
- | rra | ||
- | rra | ||
- | and %00011111 | ||
- | or l ; OR with Y5,Y4,Y3 | ||
- | ld l, a ; Store in L | ||
- | ret | ||
- | </code> | ||
- | |||
- | Finally, //David Black//, on his site //Overtaken by events//, offers the following routine, which takes 105 t-states and 26 bytes: | ||
- | |||
- | \\ | ||
- | <code z80> | ||
- | ; Get screen address | ||
- | ; B = Y pixel position | ||
- | ; C = X position as a byte column (0-31); no shift is applied to C | ||
- | ; Returns address in HL | ||
- | Get_Screen_Address: | ||
- | ld a,b ; Work on the upper byte of the address | ||
- | and %00000111 | ||
- | or %01000000 | ||
- | ld h,a ; store in h | ||
- | ld a,b ; get bits Y7, Y6 | ||
- | rra ; move them into place | ||
- | rra | ||
- | rra | ||
- | and %00011000 | ||
- | or h ; a = 0 1 0 Y7 Y6 Y2 Y1 Y0 | ||
- | ld h,a ; calculation of h is now complete | ||
- | ld a,b ; get y | ||
- | rla | ||
- | rla | ||
- | and %11100000 | ||
- | ld l,a ; store in l | ||
- | ld a,c | ||
- | and %00011111 | ||
- | or l ; a = Y5 Y4 Y3 X4 X3 X2 X1 X0 | ||
- | ld l,a ; calculation of l is complete | ||
ret | ret | ||
</code> | </code> | ||
Line 2002: | Line 1934: | ||
| | ||
+ | |||
+ | \\ | ||
+ | ===== Optimizations for Get_Pixel_Offset_HR ===== | ||
+ | |||
+ | Sometimes a routine can be rewritten to be slightly more efficient, and graphics-related routines (both " | ||
+ | |||
+ | \\ //Dean Belfield//, on his site //L Break Into Program//, provides the following optimized routine for obtaining the memory address of a pixel from its (x,y) coordinates. It takes 117 t-states, at the cost of not returning the relative position of the pixel: | ||
+ | |||
+ | \\ | ||
+ | <code z80> | ||
+ | ; Get screen address - by Dean Belfield | ||
+ | ; | ||
+ | ; B = Y pixel position | ||
+ | ; C = X pixel position | ||
+ | ; Returns address in HL | ||
+ | Get_Pixel_Address: | ||
+ | ld a, b ; Calculate Y2,Y1,Y0 | ||
+ | and %00000111 | ||
+ | or %01000000 | ||
+ | ld h, a ; Store in H | ||
+ | ld a, b ; Calculate Y7,Y6 | ||
+ | rra ; Shift to position | ||
+ | rra | ||
+ | rra | ||
+ | and %00011000 | ||
+ | or h ; OR with Y2,Y1,Y0 | ||
+ | ld h, a ; Store in H | ||
+ | ld a, b ; Calculate Y5,Y4,Y3 | ||
+ | rla ; Shift to position | ||
+ | rla | ||
+ | and %11100000 | ||
+ | ld l, a ; Store in L | ||
+ | ld a, c ; Calculate X4, | ||
+ | rra ; Shift into position | ||
+ | rra | ||
+ | rra | ||
+ | and %00011111 | ||
+ | or l ; OR with Y5,Y4,Y3 | ||
+ | ld l, a ; Store in L | ||
+ | ret | ||
+ | </code> | ||
+ | |||
+ | Finally, //David Black//, on his site //Overtaken by events//, offers the following routine, which takes 105 t-states and 26 bytes: | ||
+ | |||
+ | \\ | ||
+ | <code z80> | ||
+ | ; Get screen address - by David Black | ||
+ | ; B = Y pixel position | ||
+ | ; C = X position as a byte column (0-31); no shift is applied to C | ||
+ | ; Returns address in HL | ||
+ | Get_Screen_Address: | ||
+ | ld a,b ; Work on the upper byte of the address | ||
+ | and %00000111 | ||
+ | or %01000000 | ||
+ | ld h,a ; store in h | ||
+ | ld a,b ; get bits Y7, Y6 | ||
+ | rra ; move them into place | ||
+ | rra | ||
+ | rra | ||
+ | and %00011000 | ||
+ | or h ; a = 0 1 0 Y7 Y6 Y2 Y1 Y0 | ||
+ | ld h,a ; calculation of h is now complete | ||
+ | ld a,b ; get y | ||
+ | rla | ||
+ | rla | ||
+ | and %11100000 | ||
+ | ld l,a ; store in l | ||
+ | ld a,c | ||
+ | and %00011111 | ||
+ | or l ; a = Y5 Y4 Y3 X4 X3 X2 X1 X0 | ||
+ | ld l,a ; calculation of l is complete | ||
+ | ret | ||
+ | </code> | ||
+ | |||
+ | Using lookup tables, the same site also shows the following two approaches by //Patrick Prendergast// | ||
+ | |||
+ | <code z80> | ||
+ | ; Store the LUT table in the format "y5 y4 y3 y7 y6 y2 y1 y0" | ||
+ | ; Lower 5 bits where you need them for y and upper 3 bits to mask | ||
+ | ; out to OR with X (which are replaced with 010 anyway). | ||
+ | ; This way you'd only need 192 bytes for the table, which could be | ||
+ | ; page-aligned for speed. You'd be looking at 69 cycles per request | ||
+ | ; and 16 + 192 for the code + table. | ||
+ | ; | ||
+ | ; By Patrick Prendergast. | ||
+ | |||
+ | ; b = y, c = x | ||
+ | getScreenAddress: | ||
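+ | ld h,tbl >> 8 ; H = table page (tbl must start on a 256-byte boundary) | ||
+ | ld l,b ; HL -> tbl[y] | ||
+ | ld h,(hl) ; H = y5 y4 y3 y7 y6 y2 y1 y0 | ||
+ | ld a,%11100000 | ||
+ | and h ; A = y5 y4 y3 0 0 0 0 0 | ||
+ | or c ; merge the column bits (C must be 0-31) | ||
+ | ld l,a ; low byte done | ||
+ | ld a,%00011111 | ||
+ | and h ; A = 0 0 0 y7 y6 y2 y1 y0 | ||
+ | or %01000000 ; add the 010 screen prefix | ||
+ | ld h,a ; high byte done | ||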
+ | ret | ||
+ | |||
+ | tbl: ; y5 y4 y3 y7 y6 y2 y1 y0 | ||
+ | .db 0, | ||
+ | | ||
+ | |||
+ | ; Option 2: if you are willing to [potentially] sacrifice | ||
+ | ; some space for speed, you can divide the table so that | ||
+ | ; you have the low and high bytes of your address list in | ||
+ | ; 2 independent tables and have them both page aligned - | ||
+ | ; with the low byte first in memory. | ||
+ | ; This would completely remove the need to calc y*2 to get | ||
+ | ; to your table offset. | ||
+ | ; This would require 64 bytes of padding after the 1st table | ||
+ | ; (due to both tables being page aligned) meaning you would | ||
+ | ; need 448 bytes all up. That being said the 64 bytes of | ||
+ | ; padding space is not needed so you can include any other | ||
+ | ; data you might need there so it's not wasted. | ||
+ | ; Then you would only need 47 cycles to lookup your address! | ||
+ | ; | ||
+ | ; By Patrick Prendergast. | ||
+ | |||
+ | ; b = y, c = x | ||
+ | getScreenAddress: | ||
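+ | ld h,tblLow >> 8 ; H = page of the low-byte table | ||
+ | ld l,b ; HL -> tblLow[y] | ||
+ | ld a,(hl) ; A = low byte for column 0 | ||
+ | inc h ; next page: HL -> tblHigh[y] | ||
+ | ld h,(hl) ; H = high byte | ||
+ | or c ; merge the column bits (C must be 0-31) | ||
+ | ld l,a ; HL = screen address | ||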
+ | ret | ||
+ | |||
+ | ALIGN 256 | ||
+ | tblLow: ; (ADDR & 0xFF) | ||
+ | .db 0, | ||
+ | |||
+ | ALIGN 256 | ||
+ | tblHigh: ; (ADDR >> 8) | ||
+ | .db 64, | ||
+ | </code> | ||
+ | |||
+ | These routines are really fast: the second costs only 47 t-states per address calculation, at the price of taking up more space because the high and low parts of the precomputed table are stored separately, | ||
\\ | \\ | ||
Line 2020: | Line 2094: | ||
* [[http:// | * [[http:// | ||
* [[http:// | * [[http:// | ||
+ | * [[https:// | ||
\\ | \\ | ||
**[ [[.: | **[ [[.: | ||
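
The `.db` table data in the listings above is cut off in this diff view, so it may help to see how such tables could be regenerated and cross-checked off-target. The following Python sketch is not part of the original page: every function name in it (`screen_address`, `build_packed_table`, `build_split_tables`, `lookup_packed`, `lookup_split`, `db_lines`) is hypothetical, and it assumes that the X input is a byte column (0-31), which is what the table-based routines require because they OR `c` in without masking or shifting.

```python
# Off-target model of the ZX Spectrum screen address layout used above.
# Assumption: "col" is a byte column (0-31), not a pixel X coordinate.

def screen_address(y, col):
    """Address of the display byte at pixel row y (0-191), column col (0-31)."""
    high = 0x40 | ((y >> 3) & 0x18) | (y & 0x07)  # 0 1 0 Y7 Y6 Y2 Y1 Y0
    low = ((y << 2) & 0xE0) | (col & 0x1F)        # Y5 Y4 Y3 C4 C3 C2 C1 C0
    return (high << 8) | low

def build_packed_table():
    """192 entries in the 'y5 y4 y3 y7 y6 y2 y1 y0' format (first approach)."""
    return [((y << 2) & 0xE0) | ((y >> 3) & 0x18) | (y & 0x07)
            for y in range(192)]

def build_split_tables():
    """Separate low-byte and high-byte tables for column 0 (second approach)."""
    low = [screen_address(y, 0) & 0xFF for y in range(192)]
    high = [screen_address(y, 0) >> 8 for y in range(192)]
    return low, high

def lookup_packed(table, y, col):
    """Mirror of the first table routine: split one packed entry into H and L."""
    entry = table[y]
    low = (entry & 0xE0) | col
    high = (entry & 0x1F) | 0x40
    return (high << 8) | low

def lookup_split(low, high, y, col):
    """Mirror of the second table routine: two direct table reads."""
    return (high[y] << 8) | (low[y] | col)

def db_lines(values, per_line=16):
    """Format values as '.db' rows that could be pasted into the source."""
    return ["    .db " + ",".join(str(v) for v in values[i:i + per_line])
            for i in range(0, len(values), per_line)]

if __name__ == "__main__":
    tbl_low, tbl_high = build_split_tables()
    print("tblLow:")
    print("\n".join(db_lines(tbl_low)))
    print("tblHigh:")
    print("\n".join(db_lines(tbl_high)))
```

Run as a script, it prints `.db` rows for the two split tables; the first entries (0 for the low table, 64 for the high table) agree with the visible start of the truncated data above. Whether the generated rows match the rest of the original tables cannot be confirmed from this diff.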