Lots of pain ...
Many bugs, much re-factoring, unit testing and regression testing. Changes to lexer and parser.
I've decided the smart thing to do would be to group instructions together according to shared parser rules. My thinking is that there are a number of instructions which can be handled by exactly the same rule set.
E.g. one group for all instructions that take NO parameters, another group for instructions that take a single parameter that is a memory address, and so on.
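A rough sketch of the grouping idea in Python (not the assembler's actual code). The group names, the `parse_instruction` helper and the "memory operand is anything in square brackets" test are all made up for illustration, and only a handful of mnemonics are listed:

```python
# Hypothetical group names; the real rule sets are not shown in this post.
GROUP_NO_OPERANDS = "no_operands"
GROUP_ONE_MEM = "one_mem"

# Each mnemonic maps to the rule group that parses its operands.
INSTRUCTION_GROUPS = {
    "nop": GROUP_NO_OPERANDS,
    "hlt": GROUP_NO_OPERANDS,
    "clc": GROUP_NO_OPERANDS,
    "stc": GROUP_NO_OPERANDS,
    "lgdt": GROUP_ONE_MEM,
    "lidt": GROUP_ONE_MEM,
    "sgdt": GROUP_ONE_MEM,
    "invlpg": GROUP_ONE_MEM,
}


def parse_no_operands(mnemonic, operands):
    """Rule shared by every instruction that takes no parameters."""
    if operands:
        raise SyntaxError(f"{mnemonic} takes no operands")
    return (mnemonic,)


def parse_one_mem(mnemonic, operands):
    """Rule shared by every instruction that takes one memory operand.

    Assumption: a memory operand is written in square brackets.
    """
    if len(operands) != 1:
        raise SyntaxError(f"{mnemonic} takes exactly one operand")
    op = operands[0]
    if not (op.startswith("[") and op.endswith("]")):
        raise SyntaxError(f"{mnemonic} needs a memory operand")
    return (mnemonic, op[1:-1].strip())


# One parser rule per group instead of one per instruction.
RULES = {
    GROUP_NO_OPERANDS: parse_no_operands,
    GROUP_ONE_MEM: parse_one_mem,
}


def parse_instruction(line):
    """Split a line into mnemonic and operands, then dispatch by group."""
    parts = line.split(None, 1)
    mnemonic = parts[0].lower()
    operand_text = parts[1] if len(parts) > 1 else ""
    operands = [op.strip() for op in operand_text.split(",") if op.strip()]
    group = INSTRUCTION_GROUPS[mnemonic]  # KeyError for unknown mnemonics
    return RULES[group](mnemonic, operands)
```

Adding another no-parameter instruction is then just one more entry in the lookup, with no new parser rule.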
I realized my idea with multiple passes was still flawed: you DO need as many passes as it takes to solve the problem, but my solution for the jumps will still work. I've since implemented FULL multi-pass support, so that after each pass it knows the state of forward references and whether all symbol definitions are complete.
This was necessary to solve things like this:
A equ B
B equ C
C equ D
D equ 2
While doing this I noticed that ML/ML64 handle a lot of things FAR better than JWASM. The example above works in ML, but JWASM gives funny errors if these values are all forward references, i.e. defined after use.
This also breaks JWASM:
A equ B
B equ A
Whereas ML and mine handle this as expected, by hitting the maximum pass warning.
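To make the two cases concrete, here is a minimal Python sketch of pass-based resolution. It is not the assembler's real code: `resolve_equates`, the `MAX_PASSES` value and the "right-hand side is either a number or one other symbol name" simplification are all assumptions for illustration.

```python
MAX_PASSES = 16  # hypothetical limit; the real assembler's limit is not stated


def resolve_equates(equates, max_passes=MAX_PASSES):
    """Resolve `name equ rhs` definitions by repeated passes.

    `equates` maps a symbol name to either an int or the name of
    another symbol, in source order. Returns (values, passes_used,
    unresolved_names).
    """
    values = {}
    unresolved = list(equates)
    for pass_no in range(1, max_passes + 1):
        for name, rhs in equates.items():
            if name in values:
                continue
            if isinstance(rhs, int):
                values[name] = rhs          # literal: known immediately
            elif rhs in values:
                values[name] = values[rhs]  # resolved earlier in this or a previous pass
        # After each pass, check whether every definition is complete.
        unresolved = [n for n in equates if n not in values]
        if not unresolved:
            return values, pass_no, []
    # Still incomplete at the pass limit: caller issues the warning.
    return values, max_passes, unresolved


chain = {"A": "B", "B": "C", "C": "D", "D": 2}
values, passes, unresolved = resolve_equates(chain)
print(values["A"], passes, unresolved)

cycle = {"A": "B", "B": "A"}
values, passes, unresolved = resolve_equates(cycle)
if unresolved:
    print(f"warning: maximum passes ({passes}) reached, unresolved: {unresolved}")
```

With the chain written in that order, each pass resolves one more symbol working backwards from D, so it takes four passes. The A/B cycle never makes progress and ends at the pass limit with both symbols unresolved.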
My solution of only adding symbols to the symbol table on reference works nicely. I've fully implemented ORG, EQU, expression evaluation and a bunch of built-in pre-defined symbols for things like $, $$, true, false, null. I've tested some ORGs, forward references, the offset operator and more.
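One possible reading of "adding symbols on reference", sketched in Python: an entry is created the first time a name is seen, whether that is a use or a definition, and stays flagged as undefined until its definition turns up. The `Symbol` fields and the `SymbolTable` method names are invented for this sketch and are not taken from the actual symbol table sub-system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Symbol:
    name: str
    value: Optional[int] = None
    defined: bool = False
    def_line: Optional[int] = None        # source line of the definition
    first_ref_line: Optional[int] = None  # source line of the first use


class SymbolTable:
    def __init__(self):
        self._symbols = {}

    def reference(self, name, line):
        """Look up a symbol, creating an undefined entry on first use."""
        sym = self._symbols.get(name)
        if sym is None:
            sym = Symbol(name)
            self._symbols[name] = sym
        if sym.first_ref_line is None:
            sym.first_ref_line = line
        return sym

    def define(self, name, value, line):
        """Record a definition, completing any earlier forward reference."""
        sym = self._symbols.get(name)
        if sym is None:
            sym = Symbol(name)
            self._symbols[name] = sym
        sym.value = value
        sym.defined = True
        sym.def_line = line
        return sym

    def forward_references(self):
        """Names that have been used but not yet defined."""
        return sorted(n for n, s in self._symbols.items() if not s.defined)
```

After a pass, an empty `forward_references()` list would mean every referenced symbol has a definition.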
I've added a dump of the symbol table to a .sym file when you build in debug mode with binary output.
The debug mode execution will also demonstrate the parser deciding where to evaluate through recursion or linear-state matching.
I'm starting to look at building up the line number info necessary for debug mode output. As yet I'm not sure what COFF etc. requires for this; I'm assuming it needs a line number + address reference for every instruction, as well as a line number for symbol definitions (which is already stored in the symbol table). Any thoughts?
The one thing I do want to fix here over ML is that the source line number of the actual MACRO must be stored. It annoys me that currently you can't really step into a macro when debugging, and there's no reason why not; it should behave much like a proc.
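The line number + address pairing guessed at above could be collected with something as simple as the following Python sketch. This only shows the bookkeeping and says nothing about the actual COFF record layout; `LineTable` and its methods are hypothetical, and it assumes entries are added in non-decreasing address order, which an ORG that moves backwards would break.

```python
import bisect


class LineTable:
    """Maps instruction start addresses to source line numbers."""

    def __init__(self):
        self._addresses = []  # kept sorted because entries arrive in order
        self._lines = []

    def add(self, address, line):
        """Record that the instruction at `address` came from `line`."""
        if self._addresses and address < self._addresses[-1]:
            raise ValueError("addresses must be added in non-decreasing order")
        self._addresses.append(address)
        self._lines.append(line)

    def line_for(self, address):
        """Return the line of the last instruction starting at or before `address`."""
        i = bisect.bisect_right(self._addresses, address) - 1
        if i < 0:
            return None  # address is before the first recorded instruction
        return self._lines[i]
```

A debugger-style lookup for an address in the middle of an instruction then lands on the line that produced that instruction.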
I've also used all the CPU manuals and the MASM manual to finish capturing every single instruction and directive into the lookups... that was painful.
Attached is the next update, including the usual release/debug version with added info coming from the SYMBOL TABLE sub-system and EXPRESSION system. I've included a test file which has just about every possible expression I could come up with to test it.
Once I can solve the line number debug info and complete a few more opcode group rules, I should be able to start on the first simple OBJ generation with COFF that will actually link, run and debug properly.
(I will need some assistance or advice around what info needs to be captured to .xdata / .pdata etc.)
It's going slower than I would've liked... but at least it's going :) 8000 lines of code and counting...