ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ[ Rage Technologies, Inc. ]ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿ ³ ³ ³ - Members - ³ ³ ] Myth: Ideas / Coder [ ³ ³ ] Night Stalker: Coder [ ³ ³ ] SKoRPiON: Musician [ ³ ³ ³ ³ - Support Board - ³ ³ ] Shadow Lands: (407) 851-2313, run by Night Stalker [ ³ ³ ³ ÄÄÁÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÁÄÄ How to use Fixed Point (16.16) Math (Part 1 of 2) - by Night Stalker ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ[ Revised version 1.1, dated March 12th, 1995 ]ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ [1] Introduction Allright, a simple question: what exactly *IS* fixed point math? Fixed point math is a very simple way to speed up any program that uses floating point. The 16.16 describes how many bits are before and after the decimal point. (In this case, 65535.999984741211 is the largest possible number. If you want to know how to get this value, set the decimal portion to all 1's (in binary), and you get 65535 with 16 bits. Take 65535 and divide it by 65536 -- that should answer your question. Thanks goes to Jonathan K. Butlerfor pointing this out.) There are other variants such as 24.8, and 8.24. It depends on what your application needs in regards to precision. Even with a math coprocessor (as it is becoming a standard nowadays), floating point speeds are slow. I will go into much detail about how fixed point can be used in your programs, as well as some trigonometric functions. I'll also warn you ahead of time. This document contains Intel 80386+ assembly code for performance. It was designed to compile with Watcom C/C++, and I'm sure it could be ported to other 32-bit compilers. If you're going to try to port it to a 16-bit compiler, I wish you best of luck. :) But first, I must give credit to where credit is due. The code I am referencing was created by David Boeren. If you wish to reach him, E-mail me, and I will give you his address. (I'd rather not post it publicly, although I don't think he'd mind...) [2] Fixed point basics ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ First, we'll start out on how we actually set our program up to handle fixed point. You really only need one definition to use fixed point math: ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ typedef long Fixed32; // 16.16 FixedPoint ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ Then whenever you want a fixed point number, just use 'Fixed32' instead of 'float'. Allright, you've got fixed point now. How do you actually get a number into this format? This is a little trickier. If you're using an integer or a char, you simply shift left over the decimal point. Longs are truncated because only 2 bytes fit in the upper portion of a Fixed32. The following #defines will help you out from converting to and from any format to Fixed32 and back: ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ #define INT_TO_FIXED(x) ((x) << 16) #define DOUBLE_TO_FIXED(x) ((long)(x * 65536.0 + 0.5)) #define FIXED_TO_INT(x) ((x) >> 16) #define FIXED_TO_DOUBLE(x) (((double)x) / 65536.0) #define ROUND_FIXED_TO_INT(x) (((x) + 0x8000) >> 16) ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ I also use the following #defines as well to access some well-known numbers: ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ #define ONE INT_TO_FIXED(1) #define FIXED_PI 205887L #define FIXED_2PI 411775L #define FIXED_E 178144L #define FIXED_ROOT2 74804L #define FIXED_ROOT3 113512L #define FIXED_GOLDEN 106039L ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ Notice that FIXED_2PI is not equal to 2 * PI. Why? Well, 2 * FIXED_PI is technically close enough, but FIXED_2PI is closer. :) [3] Fixed point math ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ Okay, you understand fixed point and now you want to do the basics of math. How do we accomplish adding, subtracting, multipling, and dividing? Adding and subtracting are simple. You simply add and subtract like you would normal floats. (ie: c = a + b; d -= c; etc, etc.) Multiplication and division require a little more. Here is the code to multiply and divide fixed point numbers: ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ Fixed32 FixedMul(Fixed32 num1, Fixed32 num2); Fixed32 FixedDiv(Fixed32 numer, Fixed32 denom); #pragma aux FixedMul = \ "imul edx" \ "add eax, 8000h" \ "adc edx, 0" \ "shrd eax, edx, 16" \ parm caller [eax] [edx] \ value [eax] \ modify [eax edx]; #pragma aux FixedDiv = \ "xor eax, eax" \ "shrd eax, edx, 16" \ "sar edx, 16" \ "idiv ebx" \ parm caller [edx] [ebx] \ value [eax] \ modify [eax ebx edx]; ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ If you're up on your assembly code, you'll notice that FixedDiv() doesn't do any rounding. Just take that into note. If some of you have never done assembly in Watcom, I'll try to help. Watcom's way of adding assembly into your programs starts with a #pragma. The standard format is: ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ #pragma aux = \ " " \ " " \ "<...>" \ parm caller [ ] ... \ value [ ] \ modify [ ...]; ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ Watcom uses 'parm caller' to know where to put the variables that are passed to an inline assembly function. For example in FixedDiv(), the numerator is passed into the EDX register, and the denominator is passed into the EBX register. The 'value' command is used to determine what register contains the value to return to the program when the inline code has completed. The 'modify' command tells Watcom what registers to push onto the stack (since you will be modifying them). Watcom allows omitting of 'parm caller' and 'value' if there is no need for them. If you have a function which doesn't return a value, then omit 'value'. The same goes for 'parm caller'. Unfortunately, I've only had success at putting inline code into my .H (or .HPP) files. If anyone knows the answer to this mystery, *PLEASE* tell me. Thanks. :) [4] More fixed point math ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ Moving right along... now that you know the basic math, we can start on how to do some of the harder math functions: squaring, one over (1/x), and square root. ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ Fixed32 FixedSquare(Fixed32 n); Fixed32 OneOver(Fixed32 n); Fixed32 FixedSqrtLP(Fixed32 n); // Low Precision (8.8) Fixed32 FixedSqrtHP(Fixed32 n); // High Precision (8.16) // This is faster than using FixedMul for squares. #pragma aux FixedSquare = \ "imul eax" \ "add eax, 8000h" \ "adc edx, 0" \ "shrd eax, edx, 16" \ parm caller [eax] \ value [eax] \ modify [eax edx]; // This is faster than using FixedDiv. #pragma aux OneOver = \ "xor eax, eax" \ "mov edx, 1" \ "idiv ebx" \ parm caller [ebx] \ value [eax] \ modify [eax ebx edx]; #pragma aux FixedSqrtLP = \ " xor eax, eax" \ " mov ebx, 40000000h" \ "sqrtLP1: mov edx, ecx" \ " sub edx, ebx" \ " jl sqrtLP2" \ " sub edx, eax" \ " jl sqrtLP2" \ " mov ecx,edx" \ " shr eax, 1" \ " or eax, ebx" \ " shr ebx, 2" \ " jnz sqrtLP1" \ " shl eax, 8" \ " jmp sqrtLP3" \ "sqrtLP2: shr eax, 1" \ " shr ebx, 2" \ " jnz sqrtLP1" \ " shl eax, 8" \ "sqrtLP3: nop" \ parm caller [ecx] \ value [eax] \ modify [eax ebx ecx edx]; #pragma aux FixedSqrtHP = \ " xor eax, eax" \ " mov ebx, 40000000h" \ "sqrtHP1: mov edx, ecx" \ " sub edx, ebx" \ " jb sqrtHP2" \ " sub edx, eax" \ " jb sqrtHP2" \ " mov ecx,edx" \ " shr eax, 1" \ " or eax, ebx" \ " shr ebx, 2" \ " jnz sqrtHP1" \ " jz sqrtHP5" \ "sqrtHP2: shr eax, 1" \ " shr ebx, 2" \ " jnz sqrtHP1" \ "sqrtHP5: mov ebx, 00004000h" \ " shl eax, 16" \ " shl ecx, 16" \ "sqrtHP3: mov edx, ecx" \ " sub edx, ebx" \ " jb sqrtHP4" \ " sub edx, eax" \ " jb sqrtHP4" \ " mov ecx, edx" \ " shr eax, 1" \ " or eax, ebx" \ " shr ebx, 2" \ " jnz sqrtHP3" \ " jmp sqrtHP6" \ "sqrtHP4: shr eax, 1" \ " shr ebx, 2" \ " jnz sqrtHP3" \ "sqrtHP6: nop" \ parm caller [ecx] \ value [eax] \ modify [eax ebx ecx edx]; ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ I hope you don't gawk at all that assembly code. Not to worry, there isn't any more. :) Note that the high precision square root takes more time to perform than it does for the lower precision. One note here regarding the inline assembly. As you can see, jumps are possible in the code. However, when you have a jump to the last command (see 'sqrtHP6:' above..), you need an instruction or Watcom will yell at you. NOP was the only logical choice I found, since it does just what it says, "No Operation". I also have a formula to do logarithm (log), but you need to be able to calculate a natural logarithm (ln) before you can do it. The basic formula is: ln(n) log n = ----- x ln(x) However, I have no clue on how to calculate a natural logarithm. If anyone can offer any ideas, I'll be glad to write one. [5] Fixed point trigonometric functions ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ Allright, this is the last section: How to do trigonometric functions in fixed point. Well, I found it easier to use lookup tables since the need for fixed point in the first place was for speed. Why bother bogging down the processor with floating point cosines and sines and converting them to fixed point? First off, let's define what an 'Iangle' is. It technically stands for 'Integer angle'. Since it's faster to reference an integer than it is to reference a floating point, we'll use 'Iangles'. ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ typedef unsigned short Iangle; // Integer angle (0..1023) ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ This means if you want to travel through a complete sine (or cosine) loop, you must pass through 1024 'Iangles'. Don't get this confused with degrees or radians. Neither one of them comes close to this definition. If you're still a little confused, you might want to look at the table generation code below to see what an 'Iangle' actually is. Below is the code to do sine, cosine, and tangent formulas: ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ #define MAX_TRIG 1024 void CosSin(Iangle theta, Fixed32 *Cos, Fixed32 *Sin) { theta &= (MAX_TRIG - 1); *Sin = SinTab[theta]; *Cos = CosTab[theta]; } Fixed32 Tan(Iangle theta) { // This shifting stuff is for better accuracy. theta &= (MAX_TRIG - 1); return (FixedDiv(SinTab[theta] << 16, CosTab[theta]) >> 16); } ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ The code to generate the lookup tables is below. It creates two .HPP files, COSTAB.HPP and SINTAB.HPP which can simply be #included into your source code. The 'MAGIC' value I used was derived from being able to fit an entire 360ø sine and cosine inside an array of 1024 fixed point numbers. These tables eat up 8K of RAM, so I figured 1024 was plenty. Remember, fixed point numbers are 4 bytes a piece: (1024 entries * 4 bytes per entry) = 4096 bytes * 2 tables = 8192 bytes. ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ #include #include #define MAGIC 162.974 #define MAX_TRIG 1024 #define DOUBLE_TO_FIXED(x) ((long)(x * 65536.0 + 0.5)) typedef long Fixed32; void main() { FILE *f; int i, pos = 0; Fixed32 z; f = fopen("sintab.hpp", "wt"); fputs("Fixed32 SinTab[MAX_TRIG] = {\n", f); for (i = 0; i < 1023; i++) { if (pos == 0) fputs(" ", f); z = DOUBLE_TO_FIXED(sin((double)i / MAGIC)); fprintf(f, "%10lu, ", z); pos += 12; if (pos > 70) { fputs("\n", f); pos = 0; } } z = DOUBLE_TO_FIXED(sin(1023.0 / MAGIC)); fprintf(f, "%10lu };\n ", z); fclose(f); pos = 0; f = fopen("costab.hpp", "wt"); fputs("Fixed32 CosTab[MAX_TRIG] = {\n", f); for (i = 0; i < 1023; i++) { if (pos == 0) fputs(" ", f); z = DOUBLE_TO_FIXED(cos((double)i / MAGIC)); fprintf(f, "%10lu, ", z); pos += 12; if (pos > 70) { fputs("\n", f); pos = 0; } } z = DOUBLE_TO_FIXED(cos(1023.0 / MAGIC)); fprintf(f, "%10lu };\n ", z); fclose(f); } ÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ [6] Conclusion ÄÄÄÄÄÄÄÄÄÄÄÄÄÄ This is the first part of two files on fixed point math. The next file covers doing Vector and Matrix calculations. Enjoy the speed improvements in your program, and as always, if you ever need any help, feel free to ask us. We may be working on a kick-butt demo, but we can always spare a couple of minutes to help someone out. :) - Night Stalker ------------------------------------------------------------------------------- Look for other Rage Technologies, Inc. stuff coming soon: - Our first major demo, "Transvectoring". The theme is to show off our new 3-D engine with lightsourcing and texture mapping... REALLY fast. Also to show what objects beyond 3D really look like. For example, a 4D or a 5D cube. Maybe more. Expected release date: Mid '95 (?) - Night Hawk 0.2à BBS. The first BBS software to show that ANSI is dead, and RIP is a thing of the past. Features include: True multitasking, full video and audio routines, and more. Expected release date: Early/Mid '96. ------------------------------------------------------------------------------- Other news: - Shadow Lands is still not up. Blame Night Stalker. He's too lazy to sell his old 486/33 to run the board on a DX4-100. We'll let you know when he gets off his duff and has Shadow Lands online. - Rage Technologies, Inc. has a mailing list. If you'd like to get ahold of any one of us, send E-mail to: ragetech@trappen.vsl.ist.ucf.edu - Rage Technologies, Inc. also has an experimental FTP server running. If you would like to get any Rage products, simply anonymous FTP to: trappen.vsl.ist.ucf.edu. All Rage files are located in /pub/ragetech.

Back to the3D Game Programming Resource page.