Monday, 14 November 2022

Tiny Assembler in M2000 (part 1) updated


This example shows how to program machine code in the M2000 Environment. In part 1 the data lives in the BinaryData buffer, and the code uses absolute addresses whenever it moves values to or from BinaryData. The code in buffer alfa uses relative addressing for calls and jumps. So the code can be moved, but the data can't. Part 2 isn't ready yet.

The assembler is built from subroutine calls. A subroutine has the same scope as the module, can create local variables, and can use the stack of values. When we use Number we pop a number from the stack of values, or get an error if the top of the stack is not a number. So when we call OpJMP() with an address, that address is placed on the stack of values. Inside sub OpJMP() we call OpJXX() with two parameters, so the stack now holds those two parameters and, as the third value, the one we passed to OpJMP().
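To make the argument flow concrete, here is a small Python model of it (not M2000 code). The names `push_args`, `number`, `op_jmp` and `op_jxx` are made up for this sketch, and the list `stack` only imitates the stack of values under the assumption that the first argument ends up on top.

```python
# Hypothetical Python model of the shared stack of values (not M2000 code).
stack = []  # the top of the stack is the end of the list


def push_args(*args):
    # Push the arguments so that the first one ends up on top.
    for a in reversed(args):
        stack.append(a)


def number():
    # Pop the top value; raise an error if it is not a number.
    if not stack or not isinstance(stack[-1], (int, float)):
        raise TypeError("no number at top of stack")
    return stack.pop()


def op_jxx():
    # Reads three values: the two opcodes and the address left by the caller.
    op1, op2, addr = number(), number(), number()
    return op1, op2, addr


def op_jmp():
    # The address passed to op_jmp is still on the stack;
    # the two opcodes are pushed on top of it.
    push_args(0xEB, 0xE9)
    return op_jxx()


push_args(0x40)      # plays the role of OpJMP(0x40)
result = op_jmp()    # (0xEB, 0xE9, 0x40)
```

The point of the sketch is that `op_jmp` never names its own parameter: it simply leaves it on the stack for `op_jxx` to read as the third value.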

Sub Op() reads values until the stack is empty, so sometimes we use Stack New {}, which parks the current stack and sets up a new one. At the exit of the block the old stack becomes the current stack again (the one used inside the block is destroyed, along with any values left on it).
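The effect of Stack New {} can be imitated in Python with a context manager. This is only a model under my own assumptions: the class `Values`, its method `stack_new` and the function `op` are hypothetical names, not part of M2000.

```python
from contextlib import contextmanager


class Values:
    """Hypothetical model of the interpreter's current stack of values."""

    def __init__(self):
        self.items = []

    @contextmanager
    def stack_new(self):
        # Park the current stack and work on a fresh, empty one.
        parked, self.items = self.items, []
        try:
            yield
        finally:
            # Restore the old stack; the temporary one is dropped.
            self.items = parked


values = Values()
emitted = []


def op():
    # Like Sub Op(): consume values until the stack is empty.
    while values.items:
        emitted.append(values.items.pop())


values.items.append(0x1234)          # a value the caller still needs
with values.stack_new():
    values.items.extend([0x90, 0x90])
    op()                             # sees only the values pushed in the block
```

After the block, `values.items` still holds 0x1234, which is why a sub that drains the stack can be called safely from inside another sub that has pending values.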

We have a two-pass assembler. The first pass ensures that all labels are known. The second pass uses these labels, so we can use a label at a forward relative address. To pass the address we can use the @Label() function. Functions like @Label() are called simple functions, because they have the same scope as the module (like subs).
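The two-pass idea can be sketched in a few lines of Python. This is a simplified model, not the M2000 program: the tiny `program` list, the `assemble` function and the label name `func_a` are invented for illustration, and an unknown label resolves to 0 in the first pass, which is what Label() does in the program when a label is not yet valid.

```python
# Hypothetical mini "assembler": each item is ("label", name) or ("call", name).
program = [("call", "func_a"), ("label", "func_a"), ("call", "func_a")]


def assemble(program, passes=2):
    labels = {}
    out = bytearray()
    for _ in range(passes):
        out = bytearray()   # every pass rewrites the code from offset 0
        pc = 0
        for kind, name in program:
            if kind == "label":
                labels[name] = pc                 # remember where the label is
            else:
                target = labels.get(name, 0)      # unknown in pass 1 -> 0
                rel = (target - pc - 5) & 0xFFFFFFFF
                out += bytes([0xE8]) + rel.to_bytes(4, "little")  # call rel32
                pc += 5
    return bytes(out), labels


code, labels = assemble(program)
first_pass_only, _ = assemble(program, passes=1)
```

With a single pass the forward call is encoded against address 0 and is wrong; after the second pass both calls point at `func_a`.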

AddressOf ModuleName returns a handler assigned by the interpreter. To use it from machine code, we push the handler as a parameter on the machine stack (the same stack that holds the return addresses) and call the address returned by the function module(), which takes no parameters. Sub CallModule does exactly this: it pushes the handler on the stack and calls the module() entry.
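As a cross-check, the ten bytes that CallModule emits can be reproduced in Python. The function name `emit_call_module` is made up for this sketch; the numbers come from the listing and bytecode further down in this post (handler 0x12, module entry 0x502e9060, first CallModule at code offset 21 when BaseAddress is 0).

```python
def emit_call_module(handler, module_entry, pc, base_address=0):
    # push imm32: opcode 0x68 followed by the handler, little-endian
    push = bytes([0x68]) + handler.to_bytes(4, "little")
    pc += 5
    # call rel32: the displacement is relative to the end of the call
    rel = (module_entry - pc - 5 - base_address) & 0xFFFFFFFF
    call = bytes([0xE8]) + rel.to_bytes(4, "little")
    return push + call


encoded = emit_call_module(0x12, 0x502E9060, pc=21)
```

The result matches the `6812000000E841902E50` fragment of the bytecode, so the handler travels as an ordinary pushed argument and the call itself is relative.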

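For demonstration we have the Call_DispDword function, which follows the StdCall convention: the caller pushes the argument and the function removes it from the stack when it returns (Ret(4)).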


When we set BaseAddress to 0, the program exports the bytecode (like the one after the assembly list) to the clipboard. Paste it into the page linked below to get the listing with line numbers and without color.

Check it here: https://defuse.ca/online-x86-assembler.htm#disassembly2


The following machine code is adapted to address 0x0000, so L001 is 0xF and FuncA is 0x70.

This program counts from 100 down to 1 and, in each iteration, increments the value stored at ExportValue (initially 0xffffec78, that is -5000). In each iteration we call back into M2000 to display a value (the first value is 100). When the eax register reaches zero, jne L001 falls through and the loop exits. The nop instructions are there because the "assembler" leaves space for the second pass to decide whether the jump is a byte or a dword (32-bit) relative jump.
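The byte-or-dword decision can be sketched in Python as follows. This is an illustration under my own assumptions, not the author's OpJXX: `encode_jump` is a hypothetical name, and the near form is passed as a tuple because in x86 the near conditional jumps are two-byte opcodes (0x0F 0x85 for jne), while the near jmp is the single byte 0xE9.

```python
def encode_jump(short_op, near_op, target, pc):
    # Short form: opcode + signed 8-bit displacement from the end of the jump.
    rel = target - pc - 2
    if -128 <= rel <= 127:
        return bytes([short_op, rel & 0xFF])
    # Near form: opcode byte(s) + signed 32-bit displacement.
    rel32 = (target - pc - len(near_op) - 4) & 0xFFFFFFFF
    return bytes(near_op) + rel32.to_bytes(4, "little")


JNE_SHORT, JNE_NEAR = 0x75, (0x0F, 0x85)

# The loop jump of the listing: jne sits at offset 39 and L001 is 0x0F.
loop_jump = encode_jump(JNE_SHORT, JNE_NEAR, target=0x0F, pc=39)

# A target that does not fit in a byte forces the near form.
far_jump = encode_jump(0xEB, (0xE9,), target=0x1000, pc=0)
```

The loop jump comes out as `75 E6`, the same two bytes that appear in the bytecode, because -26 fits in a signed byte.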


About buffers in M2000. We can define a buffer using a type and a multiplier; here the type is byte. So a buffer is an array of a type. We can also define structures and make buffers that are arrays of structures. There are two types of buffers: the normal type and the code type. The normal type is always read/write memory and can be any size (from 1 byte). The code type is read-only while it executes and read/write after execution.

We can write bytes, integers, and longs (all three as unsigned values), or use Uint() to pass negative values (same bits as the returned unsigned value). We can also write strings, either as UTF16LE without the trailing zero (which all BSTR type strings have) or as ANSI using Str$("CAR"), which returns 3 bytes (Len(Str$("CAR"))=1.5 because Len returns the number of words, and 3 bytes are 1.5 words). So if we have a buffer named BDATA with 1024 bytes, Return BDATA, 0:=STR$("CAR") pokes the ANSI codes of the characters (based on the LOCALE value). Because these characters have values lower than 128, they are the same in all LOCALE values, so at offset 0 we have 67, at offset 1 we have 65, and at offset 2 we have 82.

Buffer clear Bdata as byte*1024
Return Bdata, 0:=str$("CAR"), 3:=0xAA22FF22 as long
Print Eval(Bdata, 0)=67, Eval(Bdata, 1)=65, Eval(Bdata, 2)=82
Print Eval(Bdata, 3 as long)
Return Bdata, 4:=1.2422323e-12 as double, 12:=123123.2 as single
Print Eval(Bdata, 4 as double)
Print Eval(Bdata, 12 as single)
Utf8$=string$("Alfa-ΑΛΦA" as UTF8Enc)
Return Bdata, 16:="UTF16LE", 16+14:=UTF8$
Print Eval$(Bdata,16,14), String$(Eval$(Bdata, 30, Len(UTF8$)*2) as Utf8Dec)
a$=Eval$(Bdata)
Print len(a$)*2=1024
Buffer Bdata1 as byte*1024
// copy Bdata to Bdata1
Return Bdata1, 0:=a$
// same as
Return Bdata1, 0:=Eval$(Bdata)
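For readers who do not have M2000 at hand, the byte layout of the first lines above can be reproduced with Python's `struct` module. This is only a model of the memory contents; `bdata` is a plain `bytearray` standing in for the buffer, and little-endian order is assumed because the code runs on x86.

```python
import struct

bdata = bytearray(1024)                      # like Buffer Clear Bdata as byte*1024

# ANSI "CAR" takes 3 bytes; the same text in UTF-16LE takes 6.
bdata[0:3] = "CAR".encode("ascii")
utf16 = "CAR".encode("utf-16-le")

# An unsigned long written at byte offset 3, little-endian.
struct.pack_into("<I", bdata, 3, 0xAA22FF22)

first_three = list(bdata[0:3])               # [67, 65, 82]
(value,) = struct.unpack_from("<I", bdata, 3)

# Same bits, two readings: -5000 as signed is 0xFFFFEC78 as unsigned.
as_unsigned = struct.unpack("<I", struct.pack("<i", -5000))[0]
```

The last line mirrors what Uint() does in the text: the bits stay the same and only the interpretation changes.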


The assembly list


mov    eax,0xffffec78  ; -5000
mov    ds:0x11a295e4,eax  ; *ExportValue <- eax
mov    eax,0x64 ; 100
L001:
push   eax
mov    ds:0x11a295e0,eax  ; *Display_Value<-eax
push   0x12 ;; specific module DisplayDWord
call   0x502e9060 ;; module entry in M2000
inc    DWORD PTR ds:0x11a295e4 ; ExportValue++
pop    eax
dec    eax
jne    L001
nop
nop
nop
nop
mov    ds:0x11a295e0,eax ; *Display_Value<-eax
push   0x13 ;; specific module DisplayCR
call   0x502e9060  ;; module entry in M2000
push   0x11 ;; specific module DisplayDWord_and_wait
call   0x502e9060  ;; module entry in M2000
push   DWORD PTR ds:0x11a295e4 ;; push *ExportValue
call   FuncA
push   0x8
push   0x11a295e0  ; source at Display_Value
push   0x11a295e8  ; destination at DestMem
call   0x774681b0  ; kernel32.RtlMoveMemory
xor    eax,eax
ret
...
FuncA:
push   ebp
mov    ebp,esp
mov    eax,DWORD PTR [ebp+0x8]
mov    ds:0x11a295e0,eax
push   0x11 ;; specific module DisplayDWord_and_wait
call   0x502e9060 ;; module entry in M2000
pop    eax
xor    eax,eax
mov    esp,ebp
pop    ebp
ret    0x4

This is the bytecode: B878ECFFFFA3FCAB9A11B86400000050A3F8AB9A116812000000E841902E50FF05FCAB9A11584875E690909090A3F8AB9A116813000000E824902E506811000000E81A902E50FF35FCAB9A11E81F000000680800000068F8AB9A116800AC9A11E84B81467731C0C300000000000000005589E58B4508A3F8AB9A116811000000E8DB8F2E505831C089EC5DC20400
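The two label values quoted earlier (L001 at 0xF and FuncA at 0x70) can be checked directly against this bytecode. The Python sketch below does that; the helper names `rel8_target` and `rel32_target` are mine, and the offsets 39 and 76 were found by walking the instructions by hand, so treat them as an assumption of this sketch.

```python
import struct

# The bytecode of this post, split at instruction boundaries for readability.
code = bytes.fromhex(
    "B878ECFFFFA3FCAB9A11B864000000"
    "50A3F8AB9A116812000000E841902E50"
    "FF05FCAB9A11584875E690909090"
    "A3F8AB9A116813000000E824902E50"
    "6811000000E81A902E50"
    "FF35FCAB9A11E81F000000"
    "680800000068F8AB9A116800AC9A11"
    "E84B81467731C0C3"
    "0000000000000000"
    "5589E58B4508A3F8AB9A11"
    "6811000000E8DB8F2E50"
    "5831C089EC5DC20400"
)


def rel8_target(code, at):
    # Two-byte short jump at offset `at`: target = end of jump + signed byte.
    (rel,) = struct.unpack_from("<b", code, at + 1)
    return at + 2 + rel


def rel32_target(code, at):
    # Five-byte call/jmp at offset `at`: target = end of instruction + rel32.
    (rel,) = struct.unpack_from("<i", code, at + 1)
    return at + 5 + rel


loop_target = rel8_target(code, 39)    # the jne (0x75) at offset 39
func_a = rel32_target(code, 76)        # the call (0xE8) at offset 76
```

Both displacements are relative to the end of their own instruction, which is why the same bytes work wherever the code buffer is placed.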



The program


rem  https://defuse.ca/online-x86-assembler.htm#disassembly2
// opcodes:
rem https://c9x.me/x86/
flush
Buffer Clear BinaryData as byte*1024
Buffer code alfa as byte*1024
return alfa, 0:=str$(string$(chr$(0x90), 1024))


rem {
General registers
EAX EBX ECX EDX
Index and pointers
ESI EDI EBP EIP ESP
}


Enum opcodes {
inc_eax=0x40, inc_ecx, inc_edx, inc_ebx, // OF, SF, ZF, AF, and PF
inc_esp, inc_ebp, inc_esi, inc_edi,  //... no CF -> use SUB for CF
dec_eax=0x48, dec_ecx, dec_edx, dec_ebx, // OF, SF, ZF, AF, and PF
dec_esp, dec_ebp, dec_esi, dec_edi,  //... no CF -> use SUB for CF
push_eax=0x50, push_ecx, push_edx, push_ebx, push_esp
push_ebp, push_esi, push_edi, pop_eax,pop_ecx, pop_edx,
pop_ebx, pop_esp, pop_ebp,pop_esi, pop_edi
mov_eax_mem=0xA1,  //  eax<-memory
mov_mem_eax=0xA3, // memory<-eax
mov_eax=0xb8, // immediate value to eax
mov_ecx, mov_edx,mov_ebx, mov_esp, mov_ebp, mov_esi, mov_edi
mov_ecx=0xb9, mov_ebp=0xbd // move literal to ecx ebp (eax a line before)
//
mov_mem_reg=0x89, mov_reg_mem=0x8b
ecx=0x0d, esp=0x25, esi=0x31, edi=0x3d, ebp=0x2d, edx=0x15
push_sp=0X68,
call_func=0xE8
nop=0x90, test_eax=0xA9, cmp_eax=0x3d
}
// move the pc (program counter for assembler) to aligned position based on n span (2,4,8, 16...)


function align(&i as long, n as long=16){
=i:if n<1 then exit
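// note: sqrt(n) stands in for log2(n); the two agree exactly at n=4 and n=16 (the default), but not at n=64 and above
i=binary.and(binary.shift(0xffffffff, sqrt(n)), i+n-1)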
=i
}
// call from machine code back to module
// scope is the same as the main module
module DisplayDWord_and_wait {
if not valid(Display_Value) then exit
? "Display--->"
Hex eval(BinaryData, Display_Value as long)
Print "", sint(eval(BinaryData, Display_Value as long))
Print "<--------- press a key"
Refresh
push key$ : Drop
Print "Ok": Refresh
}
// print only a number, read from offset Display_Value in the BinaryData buffer (read/write memory)
module DisplayDWord {
if not valid(Display_Value) then exit
Print eval(BinaryData, Display_Value as long),
}
//  just a new line
module DisplayCR {
Print
}


declare copyMem lib "kernel32.RtlMoveMemory" {long pDest, Long pSource, Long Bytes}


Enum Labels{
Display_Value=0,
ExportValue=4,
DestMem=8,
SourceMem=0
}
// byte, integer, long for buffers are Unsigned.
// to pass a signed use UINT()
// to read unsigned use Eval(BinaryData, 0  as long)
// to read signed use sint(Eval(BinaryData, 0  as long), 4)  // 1 for byte, 2 for integer, 4 for long
Return BinaryData, Display_Value:=0 as long, ExportValue:=0 as long, DestMem:=0 as double
Print Exist(string$(leftpart$(eval$(BinaryData, 100), 0) as utf8dec))
//* example of __stdcall */ push argN : push arg2 : push arg1 and then Call function
//  {filename$,  long Options, Long &tree}
For pass=1 to 2
Print "pass:", pass


//  ************************************************
//
BaseAddress=alfa(0)
//BaseAddress=0  // use this to copy from clipboard to:
// https://defuse.ca/online-x86-assembler.htm#disassembly2
// *************************************************
Long pc=0
Call_Test=align(&pc)
OpLong(mov_eax, UINT(-5000))
OpLong(mov_mem_eax, BinaryData(ExportValue))
OpLong(mov_eax, 100)
L001_label=pc // no align
Op(push_eax)
OpLong(mov_mem_eax, BinaryData(Display_Value))
CallModule(AddressOf DisplayDWord)
OpIncMemory(BinaryData(ExportValue))
Op(pop_eax, dec_eax)
OpJNZ(L001_label)  // for back jump we place the number as is
OpLong(mov_mem_eax, BinaryData(Display_Value))
CallModule(AddressOf DisplayCR)
CallModule(AddressOf DisplayDWord_and_wait)
hex @Label("L002_Label")
// OpJMP(@Label("L002_Label")) //
// OpByte(0x31, 0xC0)  // eax xor eax  // 0
// ret()
L002_Label=pc
OpPushFromMemory(BinaryData(Display_Value))  // p
CallFunctionAtLabel("Call_DispDword")  // call forward label


OpLong(push_sp, 8) ' third parameter first, 8 bytes
OpLong(push_sp,BinaryData(SourceMem))
OpLong(push_sp, BinaryData(DestMem))


OpLong(mov_eax, UINT(1234))
OpLong(mov_mem_eax, BinaryData(Display_Value))
CallModule(AddressOf DisplayCR)
CallModule(AddressOf DisplayDWord_and_wait)
OpCall(copyMem)
// CallFunctionAtLabel("Call_Dummy") // for test only
OpLong(mov_eax, UINT(2345))
OpLong(mov_mem_eax, BinaryData(Display_Value))
CallModule(AddressOf DisplayCR)
CallModule(AddressOf DisplayDWord_and_wait)
OpByte(0x31, 0xC0)  // xor eax, eax (eax = 0)
Ret(0)
Call_Dummy=align(&pc)
Op(0x55, 0x89, 0xe5) // push ebp : mov ebp, esp
GetParam2eax(8)
OpLong(mov_mem_eax, BinaryData(Display_Value))
CallModule(AddressOf DisplayDWord)
CallModule(AddressOf DisplayCR)
GetParam2eax(12)
OpLong(mov_mem_eax, BinaryData(Display_Value))
CallModule(AddressOf DisplayDWord)
CallModule(AddressOf DisplayCR)
GetParam2eax(16)
OpLong(mov_mem_eax, BinaryData(Display_Value))
CallModule(AddressOf DisplayDWord)
CallModule(AddressOf DisplayCR)
Op(0x89, 0xEC, 0x5D) // mov esp, ebp : pop ebp
ret(12)
Call_DispDword=align(&pc) // function stdCall
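Op(0x55, 0x89, 0xe5) // push ebp : mov ebp, esp
op(push_eax)
GetParam2eax(4) // note: [ebp+4] holds the return address; the listing above reads the argument from [ebp+8]
OpLong(mov_mem_eax, BinaryData(0))
CallModule(AddressOf DisplayDWord_and_wait)
Op(pop_eax)
Op(0x89, 0xEC, 0x5D) // mov esp, ebp : pop ebp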
Ret(4)
Next Pass


HEX UINT(COPYMEM)
Print "end of assembly"
if BaseAddress=0 then
document ex$
for i=0 to pc-1
ex$=hex$(eval(alfa,i),1)
next i
clipboard ex$
Print "Set BaseAddress ": break
end if
refresh
// Execute code from alfa (read only at execute), from offset Call_Test
Try ok {
execute code alfa, Call_Test
}
Print "Return Value in eax: ";Error
Print eval(BinaryData, SourceMem as long)
Print eval(BinaryData, DestMem as long)
end
Sub Op()
      while not empty
            Return alfa, pc:=number
            pc++
      end while
End Sub
Sub Ret(n=0)
if n<>0 then
Return alfa, pc:=0xC2, pc+1:=n as integer
pc+=3
else
Return alfa, pc:=0xC3
pc++
end if
End Sub
Sub OpByte()
Return alfa, pc:=number, pc+1:=number
pc+=2
End Sub
Sub OpLong()
Return alfa, pc:=number, pc+1:=number as long
pc+=5
End Sub
Sub AlignPcBytes(bytes)
local i=pc
call void align(&i, bytes)
if pc<i then
stack new {for i=i to pc+1: Op(nop): next}
end if
End Sub
Function Label() // "thislabel"
if not islet then Error {give label as string}
read new label$
if valid(eval(label$)) then = eval(label$)+BaseAddress else = 0
end Function
Sub CallFunctionAtLabel() // "thislabel"
if not islet then Error {give label as string}
read new label$
if valid(eval(label$)) then push eval(label$)+BaseAddress else push 0
OpCall()
end sub
Sub OpCall()
Return alfa, pc:=call_func, pc+1:=uint(number)-pc-5-BaseAddress as long
pc+=5
End Sub
Sub OpJMP()
OpJXX(0xeb, 0xe9)
End Sub
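// note: only the short forms are exercised in this program; in x86 the near (rel32) forms of JZ/JNZ are the two-byte opcodes 0x0F 0x84 and 0x0F 0x85
Sub OpJZ()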
OpJXX(0x74, 0x84)
End Sub
Sub OpJNZ()
OpJXX(0x75, 0x85)
End Sub
Sub OpJXX(op1,op2,addr) // addr is an offset same as pc.
local no=addr-pc-2
if no<128 and no>-129 then '' 8bit
// hex "op1=";op1, UINT(no), pc, addr
Return alfa, pc:=op1, pc+1:=UINT(no)
pc+=2
' stack new {Op(nop, nop,nop)} // so now we have equal
else  // 32 bit  (we have -2 in previous calculation)
Return alfa, pc:=op2, pc+1:=UINT(no-3) as long
pc+=5
end if
End Sub
Sub OpReg2Mem()
Return alfa, pc:=mov_mem_reg, pc+1:=number, pc+2:=UINT(number) as long
pc+=6
End Sub
Sub OpMem2Reg()
Return alfa, pc:=mov_reg_mem, pc+1:=number, pc+2:=UINT(number) as long
pc+=6
End Sub
Sub OpPushFromMemory()
Return alfa, pc:=0xFF, pc+1:=0x35, pc+2:=UINT(number) as long
pc+=6
End sub
Sub OpIncMemory()
Return alfa, pc:=0xFF, pc+1:=0x05, pc+2:=UINT(number) as long
pc+=6
End Sub
Sub CallModule()
      Return alfa, pc:=0x68, pc+1:=UINT(number) as long
      pc+=5
      Return alfa, pc:=0xE8, pc+1:=UINT(module()-pc-5-BaseAddress) as long
      pc+=5
End Sub
sub GetParam2eax(no as long)
if no<128 and no>-129 then '' 8bit
stack new {Op(0x8b, 0x45, UINT(no))}
else  '' 32 bit
stack new {Op(0x8b, 0x85) : Return alfa, pc:=UINT(no) as long: pc+=4}
end if
end sub

