Assembly language program for square root

An object file contains machine code, whose instruction encodings are commonly called opcodes. This code is relocatable and usually not directly executable.

There are various formats for object files, and the same object code can be packaged in different object file formats. We can use hexdump to see the actual opcodes. In order to think things through, I wrote a basic, verbose function in C to find the square root of 64, or of any other perfect square.
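A minimal sketch of what such a basic, verbose C function might look like. The original function is not shown here, so the name `perfect_sqrt` and the choice to return -1 for inputs that are not perfect squares are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical reference function (not the author's original).
 * Returns the square root of n when n is a perfect square,
 * or -1 when no integer root exists. */
int64_t perfect_sqrt(uint32_t n)
{
    /* Try every candidate root in turn; i is 64-bit so i * i cannot overflow. */
    for (uint64_t i = 0; i * i <= n; i++) {
        if (i * i == n) {
            return (int64_t)i;
        }
    }
    return -1;
}
```

Calling `perfect_sqrt(64)` walks the candidates 0 through 8 and stops at 8. The plain counting loop is deliberately simple so that each step maps onto a handful of assembly instructions.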

With this as a guide, I then went through the laborious process of translating it into x86-64 assembly for the Mac. Every implementation on every architecture is very specific. Here, knowledge of C translates over nicely. In C, main is just a plain old function, and it has a couple of parameters of its own. If you follow the docs, you see that argc will end up in rdi, and argv (a pointer) will end up in rsi.
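To make the register mapping concrete, here is a small C sketch with the same parameter list as main. The helper name `number_from_args` and the fallback value of 64 are hypothetical. The register comments assume the System V AMD64 calling convention, which passes the first two integer or pointer arguments in rdi and rsi.

```c
#include <stdlib.h>

/* Hypothetical helper shaped like main(int argc, char **argv).
 * Under the System V AMD64 convention (an assumption about the target):
 *   argc, the first integer argument,  arrives in rdi (its low half, edi)
 *   argv, the second (pointer) argument, arrives in rsi */
unsigned long number_from_args(int argc, char **argv)
{
    if (argc < 2) {
        return 64;                      /* assumed default when no argument is given */
    }
    return strtoul(argv[1], NULL, 10);  /* argv[1] is the pointer stored at rsi + 8 */
}
```

In assembly, the equivalent of `argv[1]` is a load from 8 bytes past the address in rsi, since each pointer in the argv array is 8 bytes wide on x86-64.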

The idea is that the number of times you loop through is n, which is your square root. Hope this helps.
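The answer's code is not preserved here, so the following is only one plausible reading of the loop-counting idea: subtract successive odd numbers (1, 3, 5, ...) and count the subtractions. It relies on the fact that the sum of the first n odd numbers is n squared. The name `sqrt_by_counting` is hypothetical.

```c
#include <stdint.h>

/* Sketch of a loop-counting square root (an assumed reading of the answer).
 * Since 1 + 3 + ... + (2k - 1) = k * k, the number of odd values that can be
 * subtracted before the remainder runs out is floor(sqrt(n)). */
uint32_t sqrt_by_counting(uint32_t n)
{
    uint32_t count = 0;  /* loop iterations so far, i.e. the root */
    uint32_t odd = 1;    /* next odd number to subtract */

    while (n >= odd) {
        n -= odd;
        odd += 2;
        count++;
    }
    return count;
}
```

The loop body uses only a compare, a subtract, and two adds, so it translates almost line for line into MIPS or x86 assembly. The cost is that the iteration count grows with the square root of the input.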

You guys are all wrong. You can use sqrt.

That's the floating-point square root instruction. This would be a valid answer if you included instructions to convert from integer to double and back, and said something about truncating vs. How slow is sqrt. On modern x86, converting to FP and back for the hardware sqrt is better than an integer loop for any but the simplest cases.
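As a rough illustration of the convert, take the root, convert back sequence on x86-64, here is a C sketch using SSE2 intrinsics. It assumes an x86-64 compiler that provides `<emmintrin.h>`, and the function name `isqrt_fp` is hypothetical. Each intrinsic corresponds to one of the instructions named in the comments.

```c
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics; assumes an x86-64 target */

/* Sketch: integer square root through the hardware FP square root.
 * A 32-bit input is exactly representable as a double, so truncating
 * the double result yields floor(sqrt(n)). */
uint32_t isqrt_fp(uint32_t n)
{
    /* cvtsi2sd: integer to double (widened to 64 bits so the value stays non-negative) */
    __m128d x = _mm_cvtsi64_sd(_mm_setzero_pd(), (long long)n);
    /* sqrtsd: square root of the low double */
    __m128d r = _mm_sqrt_sd(x, x);
    /* cvttsd2si: double to integer, truncating toward zero */
    return (uint32_t)_mm_cvttsd_si32(r);
}
```

The truncating conversion is what makes the result a floor. A rounding conversion would turn an input of 99 into 10 instead of 9.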



Looking back, it does seem extremely odd for me to initially shift EAX right by 3. Looking at that number you've given, that could be very bad. Another possibility could be an AND with 0x, shifting that right, and adding it to the original value. I only did one iteration because, in my trials, the estimate was usually off by only a few bits in the last place. In retrospect, I should probably check better.

I noticed, however, that my solution also has problems: by short-cutting the loop, the initial cx value might be too high. Maybe you have to use the fixed rep after all, but I'm thinking about a better solution.

You can use that to adjust the final estimation. I edited my answer accordingly. Note that I had to replace ecx with edx in the loop, because shift operations only allow cl as a register parameter.

I have a difficult time imagining an application for which this level of inaccuracy would be acceptable, but if it's even close to acceptable, I'd use an even simpler approximation:

    approx proc num:dword
        mov eax, num
        bsr ecx, eax        ; ecx = index of the highest set bit
        mov eax, 1
        shr ecx, 1          ; halve the bit index
        shl eax, cl         ; eax = 2^(index / 2)
        ret
    approx endp

This produces a result very quickly, and it's at least sort of in the right general range, anyway:

    fsqrt: approx:

That's obviously still not very accurate, but at least it's sort of close.
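For readers more comfortable with C, the approx routine above can be mirrored roughly as follows. The helper `msb_index` stands in for the bsr instruction, and both function names are hypothetical. The zero check is an addition, since bsr leaves its destination undefined for a zero input.

```c
#include <stdint.h>

/* Stand-in for bsr: index of the highest set bit (n must be non-zero). */
static unsigned msb_index(uint32_t n)
{
    unsigned index = 0;
    while (n >>= 1) {
        index++;
    }
    return index;
}

/* Sketch of the one-step approximation: 2^(msb / 2). */
uint32_t approx_sqrt(uint32_t n)
{
    if (n == 0) {
        return 0;  /* added guard; the assembly version does not handle 0 */
    }
    return 1u << (msb_index(n) / 2);
}
```

The result is always a power of two, so `approx_sqrt(64)` is exactly 8, while `approx_sqrt(100)` is also 8 and an input of 1000000 gives 512 rather than 1000.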

If we want to improve that a little, we can add one more iteration of the same basic idea: remove the MSB that's set in the input, approximate the square root of what's left, and add that to our result:

    approx proc num:dword
        mov eax, num
        bsr ecx, eax        ; ecx = index of the highest set bit
        mov ebx, 1
        push ecx
        shr ecx, 1
        shl ebx, cl         ; ebx = first estimate, 2^(index / 2)
        mov esi, 1
        pop ecx
        shl esi, cl         ; esi = value of the MSB
        sub eax, esi        ; remove the MSB from the input
        bsr ecx, eax        ; repeat on what's left
        shr ecx, 1
        mov edi, 1
        shl edi, cl
        or ebx, edi         ; combine the two estimates
        mov eax, ebx
        ret
    approx endp

For the test run, the result now looks like this:

    fsqrt: approx:

This obviously isn't very accurate yet, but at least I can imagine cases where it might easily be accurate enough, and it is pretty fast.
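A C sketch of the same two-step idea, for comparison. The names `top_bit_index` and `approx_sqrt2` are hypothetical. The checks for a zero input and a zero remainder are additions, because bsr is undefined for zero and the assembly version does not guard against it.

```c
#include <stdint.h>

/* Stand-in for bsr: index of the highest set bit (n must be non-zero). */
static unsigned top_bit_index(uint32_t n)
{
    unsigned index = 0;
    while (n >>= 1) {
        index++;
    }
    return index;
}

/* Sketch of the two-step approximation: estimate from the MSB,
 * then OR in an estimate for what remains after removing the MSB. */
uint32_t approx_sqrt2(uint32_t n)
{
    if (n == 0) {
        return 0;                            /* added guard */
    }
    unsigned msb = top_bit_index(n);
    uint32_t result = 1u << (msb / 2);       /* first estimate */
    uint32_t rest = n - (1u << msb);         /* input with its MSB removed */

    if (rest != 0) {                         /* added guard */
        result |= 1u << (top_bit_index(rest) / 2);
    }
    return result;
}
```

For an input of 100 this gives 8 | 4 = 12, which overshoots the true root of 10. The second step helps on some inputs and hurts on others, consistent with the answer's caveat that the routine isn't very accurate yet.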

If you want a routine that's accurate and nearly competitive with fsqrt, you might consider the binary reducing algorithm instead:

    isqrt proc num:dword
        mov eax, num
        xor ebx, ebx        ; ebx = result so far
        bsr ecx, eax
        and cl, 0feh        ; round the bit index down to an even number
        mov edx, 1
        shl edx, cl         ; edx = highest power of 4 <= num
    refine:
        mov esi, ebx
        add esi, edx        ; esi = result + bit
        cmp esi, eax
        ja @f               ; too big: skip the subtraction
        sub eax, esi
        shr ebx, 1
        add ebx, edx
        jmp next
    @@:
        shr ebx, 1
    next:
        shr edx, 2
        jnz refine
        mov eax, ebx
        ret
    isqrt endp

Depending upon circumstances, calling conventions, etc.
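The isqrt routine above follows the well-known bit-by-bit (digit-by-digit) integer square root. A C sketch of that technique, under the hypothetical name `isqrt_binary`, may make the control flow easier to follow. The starting bit is found with a loop instead of bsr, which also makes an input of 0 safe.

```c
#include <stdint.h>

/* Sketch of the bit-by-bit integer square root: returns floor(sqrt(n)). */
uint32_t isqrt_binary(uint32_t n)
{
    uint32_t result = 0;
    uint32_t bit = 1u << 30;       /* largest power of 4 that fits in 32 bits */

    while (bit > n) {              /* drop to the highest power of 4 <= n */
        bit >>= 2;
    }

    while (bit != 0) {
        if (n >= result + bit) {   /* this bit belongs in the root */
            n -= result + bit;
            result = (result >> 1) + bit;
        } else {
            result >>= 1;
        }
        bit >>= 2;                 /* move to the next lower pair of bits */
    }
    return result;
}
```

Each iteration settles one bit of the root, so a 32-bit input needs at most 16 passes through the loop, and the result is exact rather than approximate.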

I had a few ideas, but none of the simple ones improves the run time.

It does the same thing, but with an XMM register. I'll see if I can use that instead.


