
Protected mode problem.

Started by r_miele, December 20, 2004, 07:43:35 PM


r_miele

I made this small protected-mode program and it is giving me a problem. When I try to run it, the computer restarts. It seems to happen on the far jump directly after the PM switch. The code is listed below. Any suggestions? Thanks


.MODEL SMALL
.386p

.STACK

GDT_DESCR STRUC
gdt_size              WORD 0
gdt_location          DWORD 0
GDT_DESCR ENDS

GDT_ENTRY STRUC
segment_size15_0 WORD 0
base_addr15_0 WORD 0
base_addr23_16 BYTE 0
p_dpl_s_type BYTE 0
g_db_0_avl_seg19_16 BYTE 0
base_addr31_24 BYTE 0
GDT_ENTRY ENDS

.DATA

gdt_descriptor GDT_DESCR  <127>

gdt GDT_ENTRY <>, \
<0FFFFh, , , 09Ah, 0CFh>, \
<0FFFFh, , , 092h, 08Fh>

pm_jmp DWORD 0

.CODE

MAIN PROC
ORG 0

;Get linear address of GDT
MOV AX, DS
MOVZX EAX, AX
SHL EAX, 4
ADD EAX, OFFSET gdt

MOV gdt_descriptor.gdt_location, EAX

LGDT gdt_descriptor

; Initialize far pointer for mode change
MOV     WORD PTR pm_jmp, OFFSET START
MOV     WORD PTR pm_jmp[2], 8

; Go to PM
MOV     EAX, CR0
OR      AL, 01h
MOV     CR0, EAX

; Do intersegment jump to set cs and flush instruction queue
JMP     DWORD PTR pm_jmp
START:
CLI
HLT


MAIN ENDP
END     

japheth


- you should set base of GDT descriptor 8, which is used for CS
- descriptor for CS should be a 16-bit descriptor, D-bit should be cleared
- interrupts should be disabled before changing CR0
- instead of using "jmp dword ptr []" you should use 0EAh opcode (jmp ssss:oooo)
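
To make the D-bit point concrete, here is a minimal sketch of how the flags byte of a descriptor breaks down. The equate names DESC_G, DESC_D and LIMIT_HI are hypothetical and do not appear in the posted code; the structure is copied from the program above.

```asm
; Sketch only: DESC_G, DESC_D and LIMIT_HI are made-up names for illustration.
.MODEL SMALL
.386p

GDT_ENTRY STRUC
segment_size15_0    WORD 0
base_addr15_0       WORD 0
base_addr23_16      BYTE 0
p_dpl_s_type        BYTE 0
g_db_0_avl_seg19_16 BYTE 0
base_addr31_24      BYTE 0
GDT_ENTRY ENDS

DESC_G   EQU 80h    ; bit 7: limit is counted in 4 KB units
DESC_D   EQU 40h    ; bit 6: 1 = 32-bit default operand size, 0 = 16-bit
LIMIT_HI EQU 0Fh    ; bits 3..0: limit bits 19..16

.DATA
gdt GDT_ENTRY <>, \
    <0FFFFh, , , 09Ah, DESC_G OR LIMIT_HI>, \           ; 08h: 16-bit code, byte = 08Fh
    <0FFFFh, , , 09Ah, DESC_G OR DESC_D OR LIMIT_HI>    ; 10h: 32-bit code, byte = 0CFh
END
```

The original program used 0CFh for selector 08h, so the CPU would start fetching 32-bit code as soon as the far jump loaded CS, while the assembler had generated 16-bit code.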

r_miele

> - you should set base of GDT descriptor 8, which is used for CS

The base address is set to zero using the default declarations.

> - descriptor for CS should be a 16-bit descriptor, D-bit should be cleared

Why wouldn't it be 32-bit?

> - interrupts should be disabled before changing CR0

Forgot about that one, sorry.

> - instead of using "jmp dword ptr []" you should use 0EAh opcode (jmp ssss:oooo)

I'll try it and post back if it works. Thanks.

r_miele

Nope, encoding the jump directly with the 0EAh opcode didn't fix it. The program is still rebooting the system.

Here is the updated code:

.MODEL SMALL
.386p

.STACK

GDT_DESCR STRUC
gdt_size              WORD 0
gdt_location          DWORD 0
GDT_DESCR ENDS

GDT_ENTRY STRUC
segment_size15_0 WORD 0
base_addr15_0 WORD 0
base_addr23_16 BYTE 0
p_dpl_s_type BYTE 0
g_db_0_avl_seg19_16 BYTE 0
base_addr31_24 BYTE 0
GDT_ENTRY ENDS

PM_JUMP  MACRO SEGMENT,OFFSET
BYTE 0EAh
WORD OFFSET
WORD SEGMENT
ENDM

.DATA

gdt_descriptor GDT_DESCR  <127>

gdt GDT_ENTRY <>, \
<0FFFFh, , , 09Ah, 0CFh>, \
<0FFFFh, , , 092h, 08Fh>

.CODE

MAIN PROC
ORG 0

;Get linear address of GDT
MOV AX, DS
MOVZX EAX, AX
SHL EAX, 4
ADD EAX, OFFSET gdt

MOV gdt_descriptor.gdt_location, EAX

LGDT gdt_descriptor

; Go to PM
CLI
MOV     EAX, CR0
OR      AL, 01h
MOV     CR0, EAX

; Do intersegment jump to set cs and flush instruction queue
PM_JUMP 08h, OFFSET START

START:
CLI
HLT


MAIN ENDP
END     




Could it be that the "OFFSET" operator is returning the real-mode offset of the "START:" label, and that once the processor switches to protected mode this offset is no longer valid?


MichaelW

As the linker warning indicates, your code does not specify a starting address. Even though this program will work OK with the default starting address, not all program layouts will, so you should generally specify one.

If the program is to actually use data, you must initialize DS so it points to the data segment (and not to the PSP as the loader initializes it). You can do this with something like:

mov  ax,@data
mov  ds,ax

Or you can just use the .STARTUP directive which will initialize DS and set the program starting address to the address of the directive.
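
As an illustration, a minimal small-model skeleton using .STARTUP might look like the sketch below. It assumes MASM 6.x and a 16-bit DOS linker; the msg variable is made up for the example.

```asm
; Minimal small-model skeleton (sketch; msg is a hypothetical variable).
.MODEL SMALL
.386p
.STACK

.DATA
msg BYTE "DS is valid here", 13, 10, "$"

.CODE
.STARTUP                    ; sets the entry point, loads DS and SS with DGROUP
    mov  dx, OFFSET msg
    mov  ah, 09h            ; DOS: print the $-terminated string at DS:DX
    int  21h
.EXIT 0                     ; DOS terminate with return code 0
END                         ; no label needed here when .STARTUP is used
```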

The ORG 0 directive serves no useful purpose.

Beyond that, I can see two problems with your PM setup code. In your jump instruction you are encoding a 16-bit immediate offset address that is calculated relative to your program's code segment. For this to work the segment base address in the code segment descriptor must match the absolute address of your program's code segment, and the D bit must be clear. If your program is to use data from PM then the segment base address in the data segment descriptor will need to be set to match the absolute address of your data segment, and DS will need to be loaded with the data selector after the switch to PM. If your program is to use the stack from PM then the SS register will need to be set appropriately. If you use the .STARTUP directive then DS and SS will be initialized to the same value (the segment address of DGROUP) and SP will be adjusted accordingly, so you can load the data selector into SS as well.
eschew obfuscation
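
The data-side setup described in this reply could be sketched roughly as follows. This is a fragment, not a complete program: it assumes the gdt table and GDT_ENTRY structure from the posted code, that .STARTUP (or equivalent) has already made DS = SS = DGROUP, and that GDT entry 2 (selector 10h) is to be used as a DGROUP-based data segment.

```asm
    ; Still in real mode, DS = DGROUP: patch the base of GDT entry 2.
    mov   ax, ds
    movzx eax, ax
    shl   eax, 4                                        ; EAX = linear base of DGROUP
    mov   [gdt+2*sizeof GDT_ENTRY].base_addr15_0, ax    ; base bits 15..0
    shr   eax, 16
    mov   [gdt+2*sizeof GDT_ENTRY].base_addr23_16, al   ; base bits 23..16
    mov   [gdt+2*sizeof GDT_ENTRY].base_addr31_24, ah   ; base bits 31..24

    ; ... LGDT, CLI, set CR0.PE, far jump to selector 08h ...

START:
    mov   ax, 10h       ; selector of GDT entry 2
    mov   ds, ax        ; data offsets now resolve as they did in real mode
    mov   ss, ax        ; valid only because SS and DS shared DGROUP before the switch
```

With the base set this way, offsets such as OFFSET gdt keep working after the switch. A zero-based data descriptor would instead require linear addresses.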

japheth


I made some extensions to your little app and now it works.
While in PM it does some screen magic, and pressing ESC should return it to RM, so you don't have to reboot.


.MODEL SMALL
.386p

.STACK

GDT_DESCR STRUC
gdt_size              WORD 0
gdt_location          DWORD 0
GDT_DESCR ENDS

GDT_ENTRY STRUC
segment_size15_0 WORD 0
base_addr15_0 WORD 0
base_addr23_16 BYTE 0
p_dpl_s_type BYTE 0
g_db_0_avl_seg19_16 BYTE 0
base_addr31_24 BYTE 0
GDT_ENTRY ENDS

PM_JUMP  MACRO _SEGMENT,_OFFSET
BYTE 0EAh
WORD _OFFSET
WORD _SEGMENT
ENDM

.DATA

gdt_descriptor GDT_DESCR  <127>

gdt GDT_ENTRY <>, \
<0FFFFh, , , 09Ah, 08Fh>, \
<0FFFFh, , , 092h, 08Fh>, \
<0FFFFh, , , 092h, 000h> ;a valid 64 kB data descriptor

.CODE

MAIN PROC

mov ax,DGROUP
    mov ds,ax
;Get linear address of GDT
MOV AX, DS
MOVZX EAX, AX
SHL EAX, 4
ADD EAX, OFFSET gdt
MOV gdt_descriptor.gdt_location, EAX

;set descriptor 8 to base of CS
MOV AX, CS
MOVZX EAX, AX
SHL EAX, 4
mov [gdt+1*sizeof GDT_ENTRY].base_addr15_0,ax
    shr eax,16
mov [gdt+1*sizeof GDT_ENTRY].base_addr23_16,al
mov [gdt+1*sizeof GDT_ENTRY].base_addr31_24,ah

LGDT gdt_descriptor

; Go to PM
CLI
MOV     EAX, CR0
OR      AL, 01h
MOV     CR0, EAX

; Do intersegment jump to set cs and flush instruction queue
PM_JUMP 08h, OFFSET START

START:
mov ax,10h
    mov ds,ax
    mov bx,0700h
nextloop:   
mov ax,bx
mov cx,80*24
mov edi,0B8000h
    .while (cx)
    mov [edi],ax
        inc edi
        inc edi
        inc al
        dec cx
    .endw
    inc bl
    in al,64h
    and al,1
jz nextloop
    in al,60h
    cmp al,1 ;ESC pressed?
    jnz nextloop
   
    mov ax,18h
    mov ds,ax
   
    mov eax,cr0 ;back to real mode
    and al,0FEh
    mov cr0,eax
    db 0eah
    dw offset in_rm_again
    dw seg _TEXT
in_rm_again:   
    sti
    mov ax,4c00h
    int 21h

CLI
HLT


MAIN ENDP

END MAIN   


r_miele

Thanks for the help guys! :U

This line:

[gdt+1*sizeof GDT_ENTRY]

I'm not understanding it, could you walk through it please.

Also, I was originally trying to set up the GDT so that there would be a code segment that would consist of the first megabyte of memory and a data segment that encompassed the entire 32-bit memory range. 
Is this possible?

Thanks again.

japheth


r_miele,

this [gdt + 1 * sizeof GDT_ENTRY] is to address the second item in the gdt descriptor table.
MASM knows byte offsets only, no array indices, that's why "sizeof GDT_ENTRY" has to be added.
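
A short fragment may make the addressing clearer. It assumes the gdt table and GDT_ENTRY structure from the code above; CODE_INDEX is a hypothetical equate added for illustration.

```asm
; Fragment: CODE_INDEX is a made-up name; index 0 is the null descriptor.
CODE_INDEX EQU 1

    ; Structured form: byte offset = index * 8, then the field within the entry.
    mov [gdt + CODE_INDEX*sizeof GDT_ENTRY].base_addr15_0, ax

    ; The same store with the offsets written out by hand:
    ; 8 bytes to reach entry 1, plus 2 bytes to reach base_addr15_0.
    mov word ptr [gdt + 8 + 2], ax
```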

> Also, I was originally trying to set up the GDT so that there would be a code segment that would consist of the first megabyte of memory and a data segment that encompassed the entire 32-bit memory range.
> Is this possible?

It is surely possible, but a bit advanced. To use it you would have to switch to a 32-bit code segment (that is, the CS D-bit is set, so EIP is used instead of IP) immediately after switching to protected mode. MASM will force you to place such code in a "use32" code segment.

japheth


This version uses a 32 bit flat code segment:



.MODEL SMALL

.STACK 2048

.386p

GDT_DESCR STRUC
gdt_size              WORD 0
gdt_location          DWORD 0
GDT_DESCR ENDS

GDT_ENTRY STRUC
segment_size15_0 WORD 0
base_addr15_0 WORD 0
base_addr23_16 BYTE 0
p_dpl_s_type BYTE 0
g_db_0_avl_seg19_16 BYTE 0
base_addr31_24 BYTE 0
GDT_ENTRY ENDS

PM_JUMP  MACRO _SEGMENT,_OFFSET
BYTE 0EAh
WORD _OFFSET
WORD _SEGMENT
ENDM

.data

gdt_descriptor GDT_DESCR  <127>

gdt GDT_ENTRY <>, \
<0FFFFh, , , 09Ah, 08Fh>, \ ;08
<0FFFFh, , , 092h, 08Fh>, \     ;10
<0FFFFh, , , 092h, 000h>, \ ;18 a valid 64 kB data descriptor
<0FFFFh, , , 09Ah, 0CFh> ;20 a flat 32 bit code segment

.code

MAIN PROC

mov ax,DGROUP
    mov ds,ax
;Get linear address of GDT
MOV AX, DS
MOVZX EAX, AX
SHL EAX, 4
ADD EAX, OFFSET gdt
MOV gdt_descriptor.gdt_location, EAX

;set descriptor 8 to base of CS
MOV AX, CS
MOVZX EAX, AX
SHL EAX, 4
mov [gdt+1*sizeof GDT_ENTRY].base_addr15_0,ax
    shr eax,16
mov [gdt+1*sizeof GDT_ENTRY].base_addr23_16,al
mov [gdt+1*sizeof GDT_ENTRY].base_addr31_24,ah

;set call to flat 32 bit code
mov ax, _TEXT32
    movzx eax,ax
    shl eax,4
    mov dx, offset start
    movzx edx,dx
    add eax, edx
    mov cs:[xxx], eax

LGDT gdt_descriptor

; Go to PM
CLI
MOV     EAX, CR0
OR      AL, 01h
MOV     CR0, EAX

; Do intersegment jump to set cs and flush instruction queue
    db 66h, 0eah ;jmp fword ptr 20h:start
xxx dd 0
    dw 20h
   
back_in_16_bit::
    mov ax,18h
    mov ds,ax
    mov eax,cr0 ;back to real mode
    and al,0FEh
    mov cr0,eax
    db 0eah
    dw offset in_rm_again
    dw seg _TEXT
in_rm_again:   
    sti
    mov ax,4c00h
    int 21h

MAIN ENDP

_TEXT32 segment use32 dword private 'CODE'

start:
mov ax,10h
    mov ds,ax
    mov bx,0700h
nextloop:   
mov ax,bx
mov cx,80*24
mov edi,0B8000h
    .while (cx)
    mov [edi],ax
        inc edi
        inc edi
        inc al
        dec cx
    .endw
    inc bl
    in al,64h
    and al,1
jz nextloop
    in al,60h
    cmp al,1 ;ESC pressed?
    jnz nextloop
   
    db 0eah
    dw offset back_in_16_bit ;jmp fword ptr 8:back_in_16_bit
    dw 0 ;HIWORD(offset)
    dw 8
   
_TEXT32 ends


END MAIN   



Please note: you may get linker errors with MS link or Borland's tlink.
For 32-bit OMF code like this I would suggest using the Digital Mars OMF linker (it is free),
which is by far the best OMF linker I know. Furthermore, it has no problems with a DGROUP size > 64 kB.



r_miele

Thanks for the quick reply japheth! :thumbu

I have a couple of other questions if you do not mind.


[gdt+1*sizeof GDT_ENTRY]

Correct me if I am wrong, with four entries in the GDT the "SIZEOF" directive will return 256.  gdt will provide the offset of the GDT and the plus one will add one byte to that.  I can't figure out how that is getting to the second element in the array.  I don't know why this is confusing me so badly!

When a small memory model is used do the segments default to 16-bit?

This following code:

;set call to flat 32 bit code
    mov ax, _TEXT32
    movzx eax,ax
    shl eax,4
    mov dx, offset start
    movzx edx,dx
    add eax, edx
    mov cs:[xxx], eax


This is going to calculate the linear address of the 32-bit code segment but I've never seen this syntax before:

    mov cs:[xxx], eax

could you please explain it to me.




This jump:

; Do intersegment jump to set cs and flush instruction queue
    db 66h, 0eah ;jmp fword ptr 20h:start
xxx dd 0
    dw 20h

I'm confused by this jump. The Intel Developer's Manual Vol 3 states "The instruction prefix 66H can be used to select an operand size other than the default, and the prefix 67H can be used to select an address size other than the default."
Is that where the 66h is coming into effect?


Thanks for the help and sorry for so many questions, I want to make sure I understand all that is going on. :U

japheth

> Correct me if I am wrong, with four entries in the GDT the "SIZEOF" directive will return 256.  gdt will provide the
> offset of the GDT and the plus one will add one byte to that

No (I don't understand how you can figure out a size of 256 for anything).

[gdt+1*sizeof GDT_ENTRY]

(can also be written: [gdt + (1*sizeof GDT_ENTRY)])

"sizeof GDT_ENTRY" is 8

so in total it is "offset gdt + 8"


> When a small memory model is used do the segments default to 16-bit?

it depends:

with:
        .286
        .model small

you get defaults of 16 bit, with

        .386
        .model small

you get 32bit defaults. So the .model directive depends on the previous processor directive.


> This is going to calculate the linear address of the 32-bit code segment but I've never seen this syntax before:

This calculation is needed because you wanted a flat, zero-based CS segment (_TEXT32). In this case you have
to calculate the linear address of label start, that is "(segment * 16) + offset"

>  mov cs:[xxx], eax

the linear address of label "start" has to be calculated at run time, because the DOS MZ loader is unable to do that. It's done here so that the intersegment jump to 32-bit code can work.
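
As a worked illustration of "(segment * 16) + offset", the sketch below wraps the calculation in a helper macro. LINEAR_TO_EAX is a hypothetical name, and the numbers in the comment are made up; the real values depend on where DOS loads the program.

```asm
; Worked example with invented load addresses:
;   _TEXT32 loaded at real-mode segment 2000h, OFFSET start = 0010h
;   linear = 2000h * 10h + 0010h = 20010h
;
; LINEAR_TO_EAX is a hypothetical helper, not part of the posted code.
; It leaves the linear address in EAX and clobbers EDX.
LINEAR_TO_EAX MACRO _SEG, _OFF
    mov   ax, _SEG      ; segment value, fixed up by the DOS loader
    movzx eax, ax
    shl   eax, 4        ; segment * 16
    mov   dx, _OFF
    movzx edx, dx
    add   eax, edx      ; + offset within the segment
ENDM

; Possible usage:
;   LINEAR_TO_EAX _TEXT32, <OFFSET start>
;   mov cs:[xxx], eax
```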

> Is that where the 66h is coming into effect?

yes. with the 66h the cpu expects a dword offset in the far jump (which is required here), without it the offset is expected to be a word only.
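
The two encodings can be compared side by side. The macro below is a sketch modeled on the PM_JUMP macro from earlier in the thread; PM_JUMP32 and its parameter names are hypothetical, and it is meant to be used inside a 16-bit code segment.

```asm
; PM_JUMP32 is a made-up name; it emits a far jump with a 32-bit offset
; and defines a label on the offset field so it can be patched at run time.
PM_JUMP32 MACRO _SELECTOR, _PATCH
    BYTE  66h           ; operand-size prefix: the offset that follows is 32 bits
    BYTE  0EAh          ; far jump opcode
_PATCH DWORD 0          ; 32-bit offset, filled in before the jump executes
    WORD  _SELECTOR     ; target code selector
ENDM

; Byte layout for comparison:
;   EA oo oo ss ss              16-bit offset, 5 bytes
;   66 EA oo oo oo oo ss ss     32-bit offset, 8 bytes

; Possible usage, assuming EAX already holds the linear address of the entry point:
;   mov cs:[xxx], eax
;   ...
;   PM_JUMP32 20h, xxx
```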

[EDIT]
I just tried with MASM 6.15 and this more simple coding works as well:
   jmp fword ptr [xxx]
xxx   dd 0
        dw 20h
So there is no need to use the db 66h, db 0eah form
[/EDIT]


BTW: I find it good that someone is interested in this basic stuff. My personal opinion is that an ASM programmer should have to know it, but that isn't true nowadays.











MichaelW

> Correct me if I am wrong, with four entries in the GDT the "SIZEOF" directive will return 256. gdt will provide the offset of the GDT and the plus one will add one byte to that. I can't figure out how that is getting to the second element in the array. I don't know why this is confusing me so badly!

For a structure, SIZEOF returns the number of bytes in the initializers. For the GDT_ENTRY structure SIZEOF would return 8. For the gdt variable in japheth's most recent code, SIZEOF would return 40 (5 entries * 8 bytes per entry).
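
One way to put SIZEOF to work is to let the assembler compute the GDT limit instead of hard-coding it. The sketch below assumes SIZEOF counts every initializer on the continued line, as described above, and defines gdt before gdt_descriptor to avoid a forward reference. GDT_LIMIT is a hypothetical equate.

```asm
; Sketch: GDT_LIMIT is a made-up name; structures are copied from the thread.
.MODEL SMALL
.386p

GDT_DESCR STRUC
gdt_size            WORD 0
gdt_location        DWORD 0
GDT_DESCR ENDS

GDT_ENTRY STRUC
segment_size15_0    WORD 0
base_addr15_0       WORD 0
base_addr23_16      BYTE 0
p_dpl_s_type        BYTE 0
g_db_0_avl_seg19_16 BYTE 0
base_addr31_24      BYTE 0
GDT_ENTRY ENDS

.DATA
gdt GDT_ENTRY <>, \
    <0FFFFh, , , 09Ah, 08Fh>, \
    <0FFFFh, , , 092h, 08Fh>

GDT_LIMIT EQU (SIZEOF gdt) - 1          ; table size in bytes minus one (24 - 1 for three entries)
gdt_descriptor GDT_DESCR <GDT_LIMIT>    ; the limit field wants the last valid byte offset
END
```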

From the MASM 6.0 Programmer's Guide:

The assembler evaluates expressions that contain more than one
operator according to the following rules:

Operations in parentheses are always performed before any
adjacent operations.

Binary operations of highest precedence are performed first.

Operations of equal precedence are performed from left to right.

Unary operations of equal precedence are performed right to left.

The order of precedence for all operators is listed in Table 1.3.
Operators on the same line have equal precedence.

Table 1.3 Operator Precedence
Precedence    Operators
    1          (),[]
    2          LENGTH,  SIZE, WIDTH, MASK
    3          .(structure-field-name operator)
    4          :(segment-override operator), PTR
    5          LROFFSET, OFFSET, SEG, THIS, TYPE
    6          HIGH, HIGHWORD, LOW, LOWWORD
    7          +, - (unary)
    8          *, /, MOD, SHL, SHR
    9          +, - (binary)
    10         EQ, NE, LT, LE, GT, GE
    11         NOT
    12         AND
    13         OR, XOR
    14         OPATTR, SHORT, .TYPE


Multiplication has a higher precedence than addition (higher meaning higher up in the list), so the multiplication is performed first. Some of these operators are unique to MASM, but the parentheses, arithmetic, relational and logical operators have the same relative order of precedence for most expression evaluators.

http://www.jimloy.com/algebra/some.htm

> When a small memory model is used do the segments default to 16-bit?

See Defining Segments with the SEGMENT Directive, and Setting Segment Word Sizes (80386/486 only):

http://webster.cs.ucr.edu/Page_TechDocs/MASMDoc/ProgrammersGuide/Chap_02.htm

r_miele

Thanks for the help guys. :U

I read that chapter MichaelW and it answered a couple of questions for me but brought up a few problems.

I think the root of my major problem with assembly is my inability to define the difference between physical segments and logical segments.

For example, I've always thought that when you use the small memory model you have your data and code stored in two separate physical segments.  After reading that chapter I think I was wrong in that they are stored in two different logical segments.  I also noticed that when I would run a small memory model program in codeview, it would state that my data and code segments were in the same physical segment.

Am I correct, or do I misunderstand this? Any insight would be appreciated.

MichaelW

I think the chapter's reference to logical segments just creates unnecessary confusion. If you were coding in hex, or creating an assembler, or certain types of memory managers, you might need to consider the distinction between physical and logical segments. But for most purposes I think you should just visualize segments as "regions" of physical memory, where the segment address specifies the region and the offset address specifies the location within the region.
> Logical segments contain the three components of a program: code, data, and stack. MASM organizes the three parts for you so they occupy physical segments of memory. The segment registers CS, DS, and SS contain the addresses of the physical memory segments where the logical segments reside.
If you examine a small model program in CodeView you will see that CS, DS, and ES are all set to the same segment address when the program is loaded, but after the normal startup code executes (the code that the .STARTUP directive would generate) DS and SS are set to a different segment address (the segment address of DGROUP), and CS still contains its original value. According to the above quote, the logical code segment is in one physical segment, and the logical data and stack segments are in another physical segment. This arrangement is common to all of the conventional memory models other than TINY (the memory model for a COM file), for which all of the program's logical segments share a single physical segment.

r_miele

Alright, I finally think all of my Assembly knowledge is starting to come together. :bg

I just have to verify a couple of things with you guys.

In the above code supplied by japheth, this section of code:

;set call to flat 32 bit code
MOV AX, _TEXT32
MOVZX EAX, AX
SHL EAX, 4
MOV DX, OFFSET start
MOVZX EDX, DX
ADD EAX, EDX
MOV CS:[xxx], EAX

;Initializing the GDTR.
LGDT gdt_descriptor

; Go to PM
CLI
MOV     EAX, CR0
OR       AL, 01h
MOV     CR0, EAX

; Do intersegment jump to set cs and flush instruction queue
JMP FWORD PTR [xxx] ;jmp fword ptr 20h:start
xxx DWORD 0
WORD 20h


Correct me if I am wrong.  The reason you are declaring "xxx" here and not in a data segment is that when the processor is switched into protected mode it won't be able to access the data segment, because its descriptor hasn't been loaded into the DS register.  Normally you can't declare data in a code segment because the processor will fault when it tries to execute a data declaration, but it is possible here because the processor doesn't actually execute the declaration, it just uses it as the operand of the jump.

This line:

MOV CS:[xxx], EAX

is telling the assembler using a segment override that the "xxx" variable is located in the CS segment and not in the DS segment, correct?  Also, why are you dereferencing "xxx" using brackets, wouldn't it have been the same if you didn't use the brackets?

MOV CS:xxx, EAX


The only straight up question I have on this section of code is I thought the assembler would give an error if you tried to access a variable that has not been declared yet (ie. "XXX").

Thanks again. :thumbu