safe_strlen,safe_strcpy

ecube · February 26, 2008, 07:23:55 AM

the two procedures below are my attempts at a safe strlen, and strcpy, and they do this by

*forcing you to specify the max supported size for the destination(need to add this for the source aswell possibly)
*doesn't crash if source is NULL
*doesn't crash if the specified size is a negative number

unlike lstrcpyn it doesn't copy the number of bytes you specify for the destination buffer, that's just a max range it'll look for a null terminator in

anyway none of this is really fascinating or remotely new, but what i'm having trouble against and would like to prevent is the procedures not crashing if someone where to enter an invalid entry like a number/dword instead of a address ptr. the code below for instance crashes.

Code Select


.586
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib
include \masm32\include\user32.inc
includelib \masm32\lib\user32.lib
include \masm32\include\masm32.inc
includelib \masm32\lib\masm32.lib

     CTEXT MACRO text:VARARG
            local TxtName
              .data
               TxtName BYTE text,0
              .code
            EXITM <ADDR TxtName>
     ENDM
strlen_safe proto C :DWORD,:DWORD
strcpy_safe proto C :DWORD,:DWORD,:DWORD
.data?
mybuf byte 255 dup(?)
trythis byte 255 dup(?)
.code
start:
invoke strcpy_safe,addr mybuf,CTEXT("hello"),255
invoke MessageBox,0,addr mybuf,NULL,MB_OK
invoke strcpy_safe,333,13,255
invoke ExitProcess,0

strlen_safe proc C iInPut:DWORD,iMaxSize:DWORD
xor eax,eax
xor edx,edx
mov edi,iInPut
cmp edi,NULL
je @failed
mov ecx,iMaxSize
@check:
mov dh,byte ptr[edi]
inc edi
cmp dh,0
je @goodstring
inc eax
cmp ecx,0
jle @failed
dec ecx
jmp @check
@goodstring:
ret
@failed:
mov eax,-1
ret
strlen_safe endp

strcpy_safe proc C iDestin:DWORD,iSource:DWORD,iDestMaxSize:DWORD
xor eax,eax
xor edx,edx
mov edi,iDestin
mov esi,iSource
cmp iSource,0
je @failed
mov ecx,iDestMaxSize
@check:
cmp ecx,0
jle @failed
dec ecx
mov bh,byte ptr[esi]
inc esi
mov [edi],bh
inc edi
cmp bh,0
je @goodstring
inc eax
jmp @check
@goodstring:
ret
@failed:
mov eax,-1
ret
strcpy_safe endp
end start

asmfan · February 26, 2008, 09:20:40 AM

What kind of safety of strlen are you talking about? :lol
cmp reg, 0 -> test reg, reg
Don't do signed comparison (jle) on unsigned vars - it fails on strings >= 2GiB. Anyway bad technique.
Don't use partial registers (bh, ah, etc.) or at least use lower byte to be compatible to x86-64.

Verdict: useless (strlen_s) slow (both) code. I bet C/C++ compiler could do it better for you.

ecube · February 26, 2008, 10:10:38 PM

Who uses 2 gig strings? and thanks for your opinion on how "slow" my procedures are as that has absolutely nothing to do with this thread. Infact the only question I asked you managed to avoid entirely, so again THANKS for your input but "Verdict: useless" .

u · February 26, 2008, 10:38:59 PM

IsBadReadPtr(). Or try/catch blocks. Assuming it's safety-first, and ok-coders are kicked out of the door. No-one ever would provide an integer as an argument to these procs.

ecube · February 26, 2008, 10:47:27 PM

thanks alot ultrano, I read somewhere the IsBadReadPtr etc.. procedures aren't good to use, I think I can find the article if you want to take a look, if not thanks for response anyway, i'll try out the try/catch with macros, since speed isn't a concern just yeah safety.

u · February 26, 2008, 11:12:30 PM

Yeah, http://blogs.msdn.com/oldnewthing/
but I've never ever managed to reproduce the problems he describes. Been on win2k SP5 forever, I guess.
Then his statement that due to this, try{ compute_stuff(); }catch{ ouch_it_was_gonna_crash_due_to_unpaged_yet_memory(); }
is understandable but understandably should be a bug, not a design of the OS! So, I must safely assume that MS did their job and fixed that bug. Otherwise it'd be crashing hell, and MS would be stating "heh, it's in our specs that your app will crash, bear with it" ... impossible :).
It's one of the articles that kinda baffled me - a famous reliable source teaching bullshit. I take everything with a spoonful of salt since then, I guess :).

jj2007 · February 26, 2008, 11:14:51 PM

Quote from: E^cube on February 26, 2008, 07:23:55 AM
*doesn't crash if source is NULL
*doesn't crash if the specified size is a negative number

Frankly speaking I think it's good practice to let code crash into your face until your app's logic is OK. When it crashes, there must be a design error - so let me know about it. Let it crash.

u · February 27, 2008, 12:04:29 AM

Quote from: Ultrano on February 26, 2008, 11:12:30 PM
So, I must safely assume that MS did their job and fixed that bug.

Ha-ha, checked the latest MSDN, and MS couldn't get their lazy asses to implement something that would work at least with normal parameters (anything that isn't 1.99 GB in size). Instead, they scrap it all, marking it as "unreliable" and adding more confusion as "another thread could free this memory". Heck, if another thread just frees memory like that and my current thread has access to this pointer, then there's something seriously wrong with my own code (some EnterCriticalSection/ cmpxchg locking must be used).

Quote(from a comment on that article)
Wednesday, September 27, 2006 12:32 PM by A
"When the stack grows into the guard page, a guard page exception is raised,"

Technically, calling IsBad*Ptr() on the stack is safe because the guard page exceptions are handled entirely within the kernel. User-mode exception handlers cannot see or intercept them (contrary to what Larry Osterman's article claims).

So, if that is true (and I experimentally verified it here), then Raymond's article is bollocks and only could be correct about PAGE_NOACCESS. But PAGE_NOACCESS causes GPF, so that DLL must be doing SEH tricks, potentially messing with your own SEH chain. If you use someone's code that does this without a big screaming banner in the docs, then you've been essentially shitted on right from the start, and avoiding IsBad*Ptr won't help you - all it takes is one SEH addition of your own at an imperfect place, and your code can go berserk anytime.

Back to the topic, jj2007's commend is spot-on. When you're developing software, you must not let such errors slip-by. Something caused them, and hiding the reason when the code isn't dozens of thousands of asm lines yet is generating hard-to-find problems for the future. This is the reason there are ASSERT() macros in C++, and you can easily make such macros in masm :).

hutch-- · February 27, 2008, 01:19:35 AM

Cube,

i agree with the guys here, code that goes BANG in your face always tells you if something is wrong, stuff that is wrapped in multiple error handlers, SEH and the like can be very hard to debug. You can of course put some testing in the algo if its not too big or too slow to check for invalid input data, zero length string, defective addresses passed and so on but you face an old problem, you can never make code idiot proof as there will always be a beter idiot.

Make it fast, easy to use and documented properly and there is little better you can do. The assumption that the user is an idiot and has to be patronised leads to often terrible, slow and unreliable code.

ChrisLeslie · February 27, 2008, 10:24:46 AM

Quotei agree with the guys here, code that goes BANG in your face always tells you if something is wrong, stuff that is wrapped in multiple error handlers, SEH and the like can be very hard to debug.

Hutch, I think that to dismiss code that goes bang so frivolously is relegating the package that you and Iczilion spent so much time developing into the status of a toy. I would never trust an application to be used in a serious situation without a reasonable level of error checking. Maybe old retired programmers programing for fun can afford to take risks with their assembler playthings, but a larger application that mysteriously goes bang IS the situation that is hard to debug. For me, going "bang" could mean serious loss of data, set backs in tight schedules or even research projects lost. Cubie has an attitude that I respect in attempting a level of security if that matters to him as it does for me. His code may be slow but at least it is straight forward honest and readable and that is my style too. Further though may bring up better ways of error checking. Optimising for speed can come later, and only if it is needed which usually is not.

Chris

hutch-- · February 27, 2008, 11:22:28 AM

Chris,

i am not sire you got what I intended, code that goes BANG in the development stage forced the developer to write code that does not go BANG. Get a piece of code correct and it does not need extra junk appended to it. The next stage is the beta testing stage where it gets thrashed to death and if it passes that it becomes release software.

There is a difference between safe code and code written for idiots, safe code HAS passed all the tests, the other is a form of patronising from one end of the development area to another (iIE: They are too dumb to do it properly so we will do in an idiot proof wrapper.)

ecube · February 27, 2008, 11:27:43 AM

Yeah my goals are for safe procedures in terms of security and stability. Programming is already critical enough as it is, one little bug/issue can cause your program to simply not-work, which is serious. And while some of you guru's may argue "well if you were a better programmer..." kind of thing, doesn't hold weight if you look at what you're dealing with. While it helps to test as best as you can at home, when you release it publically your program is still entering into the jungle where they're a lot of unknowns and other dangers that could result from different environments(os') and different population makeup(other software) which could break yours. Not only that but the way I see it with the countless "exploits" and such for big name software released daily it's obvious things need to change. So yes while my code is slow atm i'm just focused on writing safer coder that is as dummie proof as possible for myself and others. PHP is a great example of what i'm after as you don't need to specify the data type, just give it a number or a string or use none strings with string procedures, etc.. it doesn't matter it handles it all safely and without panic. If I can accomplish any of this in ASM that'd be great :D

Tedd · February 27, 2008, 12:54:29 PM

Just to poke my nose in.. :bg

Firstly, coding for safety is a good thing! I don't agree with the view that if it's wrong it should go bang - that only works when it would be wrong every time (and so can be caught during development), otherwise it would appear to work fine most of the time and then go bang at some critical point. As for such things being patronising, sometimes maybe, but the truth is that coders are inherently lazy and if they can get away with not checking for errors or other such nonsense because "it should be safe" then they will. And hence the origin of so many exploits and bugs.

That said, there can be such a thing as being over cautious. Checking the max length of the string is pretty sensible, as is failing gracefully for invalid parameters (i.e. null pointers) - however, null pointers are one of those things where it's acceptable to go 'bang' (they generally indicate purposeful stupidity or deeper problems elsewhere, so covering them up isn't necessarily a good thing.) I'm not sure about checking for a 'negative' size though - as you know, negatives are just big positives, so essentially you're placing a (very generous!) limit on the maximum size; in practice though, who would knowingly pass a negative size, and if they did, would they not get an access violation long before they get close to that limit anyway? You'd get the same result from passing in 7f000000h as the size, so the extra check doesn't really protect as much as it suggests.

As for entering a number instead of a pointer - again, there's no difference as far as the processor is concerned. What you really mean to do is check that the pointer they give you references memory they have access to, otherwise it's not a valid pointer. So you have three options: IsBadReadPtr can work, though it has its problems (namely the possibility of messing up guard pages, which can cause system instability); allow the access violation, but set up a handler to catch the exception and deal with it sensibly; or allow the access violation, and the OS vomits up a nice exception to the user. I'd be inclined to treat it the same as a null pointer - allow the exception to bubble up to the OS, from where it can be debugged and fixed (rather than hidden.)

The problem with error checking is that you can expand your code to the extent that the majority of it is purely for catching those errors. The truth is though, the extra errors occur so rarely, if at all. There's a trade-off - definitely check those errors that occur frequently, and critical ones, and after that any others you can do cheaply (i.e. once only.)

The level of 'critical' depends on the context your program is running in. Most user applications can die horribly without any side-effects, so you can avoid too much of the error checking when the majority of it is effectively useless. Of course, a system service should be treated with more care and therefore more extensive error checking is necessary. Which suggests two implementations - 'safe_' and 'verysafe_' - or "ifdef PARANOID" could include the extra checking? :bdg

It is impossible to guard against the arbitrary - there will always be a better idiot, and if they purposely want to mess up then they will, you can't stop them and you shouldn't waste effort trying.

dacid · December 24, 2008, 09:40:00 AM

strsafe

Security (General) Technical Articles
Strsafe.h: Safer String Handling in C

http://msdn.microsoft.com/en-us/library/ms995353.aspx

Using the Strsafe.h Functions
Poor buffer handling is implicated in many security issues that involve buffer overruns. The functions defined in Strsafe.h provide additional processing for proper buffer handling in your code....

http://msdn.microsoft.com/en-us/library/ms647466.aspx

News:

safe_strlen,safe_strcpy

ecube

ecube

ecube

ChrisLeslie

ecube

dacid