General-Purpose string functions

Started by Mark Jones, July 25, 2006, 08:38:57 PM


Mark Jones

Hi, here are three general-purpose ASCII string functions: compare, copy, and length. They are easy to understand and use, and should be faster than the Windows API equivalents.


align 16
szComp proc szDest:DWORD,szSource:DWORD
    mov ecx,szSource
    mov edx,szDest
@@:
    mov al,byte ptr[ecx]            ; fetch one byte from each string
    inc ecx                         ; advance (add ecx,1 may be faster on Pentium 4)
    mov ah,byte ptr[edx]
    inc edx
    cmp al,ah                       ; match?
    jnz @bad                        ; no: strings differ (or one ended early)
    test al,al                      ; null?
    jnz @B                          ; no: keep comparing
@good:                              ; both strings ended at the same place
    mov eax,0
    ret
@bad:
    mov eax,1
    ret
szComp ENDP



align 16
szCopy PROC uses esi edi szDest:DWORD,szSource:DWORD
    mov esi,szSource
    mov edi,szDest
    cld                             ; copy forwards
@@:
    cmp byte ptr [esi+00],00        ; test the lowest address first, so bytes
    jz CopyOne                      ; past the terminator never pick the branch
    cmp byte ptr [esi+01],00
    jz CopyTwo
    cmp byte ptr [esi+02],00
    jz CopyThree
    cmp byte ptr [esi+03],00
    jz CopyFour
    movsd                           ; no null: move DWORD [ESI] --> [EDI] & increment
    jmp @B

CopyFour:
    movsd                           ; three chars + null
    ret
CopyTwo:
    movsw                           ; one char + null
    ret
CopyThree:
    movsw                           ; two chars...
CopyOne:
    movsb                           ; ...then the null
    ret
szCopy ENDP



align 16
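szLen PROC uses edi ecx src:DWORD
    mov ecx,-1                      ; maximum scan count
    mov edi,src
    mov al,0                        ; byte to find (null)
    repnz scasb                     ; scan until null; ecx = -(len+2)
    mov eax,ecx
    not eax                         ; eax = len+1
    dec eax                         ; eax = len
    ret
szLen ENDP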
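
For anyone who wants to sanity-check routines like these, it can help to compare them against a plain C model of the intended behavior. The sketch below is only that: the ref_ names are made up for illustration, and it assumes szComp should return 0 for identical strings and 1 otherwise, szCopy copies the terminator as well, and szLen counts the bytes before the terminator.

```c
#include <stddef.h>

/* Hypothetical reference model of szComp:
   0 if the strings are identical, 1 otherwise. */
static int ref_szComp(const char *dest, const char *source)
{
    for (;; ++dest, ++source) {
        if (*dest != *source)
            return 1;           /* mismatch, including one string ending early */
        if (*source == '\0')
            return 0;           /* both strings ended at the same position */
    }
}

/* Hypothetical reference model of szCopy:
   copies source to dest, terminator included. */
static void ref_szCopy(char *dest, const char *source)
{
    while ((*dest++ = *source++) != '\0')
        ;
}

/* Hypothetical reference model of szLen:
   number of bytes before the terminator. */
static size_t ref_szLen(const char *src)
{
    size_t n = 0;
    while (src[n] != '\0')
        ++n;
    return n;
}
```

Useful test inputs are a prefix pair such as "abc" and "abcd", two empty strings, and a source string that ends at each of the four offsets inside a dword.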
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08

Mark Jones

For larger strings, buffers, or blocks of memory, a routine like this can efficiently copy data without any special requirements.


align 16
CopyMem PROC uses ESI EDI dst:DWORD, src:DWORD, leng:DWORD
    mov esi,src
    mov edi,dst
    mov ecx,leng
    shr ecx,2               ; divide length into dwords
    cld                     ; clear direction to copy forwards
    rep movsd               ; copy DWORDs & increment pointers until ecx=0
    mov ecx,leng            ; rep movsd left ecx=0, so reload the length
    and ecx,3               ; 0..3 bytes remain after the dwords
    rep movsb               ; copy them (does nothing when ecx=0)
    ret
CopyMem ENDP

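The split between dword moves and leftover bytes is easy to get wrong, because rep movsd always finishes with ecx at zero and the byte count for the tail has to come from the original length. As a rough illustration (ref_CopyMem is a made-up name, not part of any library), the arithmetic can be modelled in C like this:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a dword-then-tail forward block copy. */
static void ref_CopyMem(void *dst, const void *src, size_t leng)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t dwords = leng >> 2;  /* whole 4-byte units, as in shr ecx,2 */
    size_t tail   = leng & 3;   /* 0..3 bytes left over */

    while (dwords--) {
        memcpy(d, s, 4);        /* one movsd-sized step */
        d += 4;
        s += 4;
    }
    while (tail--)
        *d++ = *s++;            /* one movsb-sized step */
}
```

A 7-byte copy, for instance, is one dword plus three single bytes. Like any forward copy, this is not safe when the destination overlaps the source at a higher address.
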
PBrennick

Mark,
Thank you for these functions.  They will fill a need for sure! Do you happen to have a function similar to lstrcat?  That would make it a complete set.

Paul
The GeneSys Project is available from:
The Repository or My crappy website

jdoe

There is a full set of string functions. ANSI and Unicode.

EDIT :  My contribution to this project has been moved to http://www.masm32.com/board/index.php?topic=5620.0

Mark Jones

Nice job JDoe, here's one like lstrcat. :)

Microsoft actually did a good job on lstrcat; it is marginally faster. But this one is smaller, and it comes with source code. :bg


    align 16    ; by Mark Jones 2006
    OPTION PROLOGUE:NONE
    OPTION EPILOGUE:NONE
szCat PROC szDest:DWORD,szSource:DWORD
    push esi                        ; no prologue is generated here, so save
    push edi                        ; the callee-saved registers by hand
    mov esi,[esp+16]                ; esi = szSource
    mov edi,[esp+12]                ; edi = szDest
    cld                             ; scan/copy forwards
    mov ecx,-1                      ; number of bytes to search (max)
    mov al,00                       ; byte to match (null)
    repnz scasb                     ; repeat until null found at [edi]
    dec edi                         ; back up one char (overwrite null)
@@:
    cmp byte ptr [esi+00],00        ; look for nulls, lowest address first
    jz CopyOne
    cmp byte ptr [esi+01],00        ; when found, break out
    jz CopyTwo
    cmp byte ptr [esi+02],00
    jz CopyThree
    cmp byte ptr [esi+03],00
    jz CopyFour
    movsd                           ; else move DWORD [EDI] <-- [ESI] & increment
    jmp @B

CopyFour:
    movsw                           ; copies four bytes total
CopyTwo:
    movsw                           ; copies two
    jmp Done
CopyThree:
    movsw                           ; two
CopyOne:
    movsb                           ; one
Done:
    pop edi                         ; restore callee-saved registers
    pop esi
    ret 8
szCat ENDP
    OPTION PROLOGUE:PrologDef
    OPTION EPILOGUE:EpilogDef
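
For checking the result, the two phases (find the end of the destination, then copy the source over its terminator) can be written as a small C model. This is just a sketch: ref_szCat is a made-up name, and it assumes the destination buffer has room for both strings plus one terminator.

```c
#include <stddef.h>

/* Hypothetical model of szCat: append source to dest, terminator included.
   Assumes dest has room for strlen(dest) + strlen(source) + 1 bytes. */
static void ref_szCat(char *dest, const char *source)
{
    while (*dest != '\0')       /* the repnz scasb phase: find the terminator */
        ++dest;
    while ((*dest++ = *source++) != '\0')
        ;                       /* overwrite it and copy through the new null */
}
```

Appending an empty string should leave the destination unchanged, which makes a handy edge case alongside sources of length 1 to 4.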


Quote from: AMD XP 2500+
Proctimers.inc: (adding a 64-byte string to a 64-byte string)

lstrcat: 3039 clock cycles
szCat:  3047 clock cycles

lstrcat: 72 bytes of code
szCat:  55 bytes of code

Mark Jones

Hello JDoe, thanks for sharing your string routines, they have been added to the GeneSys library. The library contains some very nice functions now. See the full list of functions in the repository:

http://www.masm32.com/board/index.php?topic=5297.0

:U

jdoe

Quote from: Mark Jones on July 28, 2006, 06:31:18 PM
Hello JDoe, thanks for sharing your string routines, they have been added to the GeneSys library.

It's my pleasure Mark.