From: "Rick A. Hohensee" <rickh@Capaccess.org>
To: cola@stump.algebra.com
Subject: osimplay, formerly shasm, is now beta
Date: Sat, 05 Jan 2002 04:25:06 -0500
Cc: lwn@lwn.net

osimplay, formerly shasm, is an x86 macro-assembler, "mid-level-language", or "compembler". It is implemented entirely in GNU Bash 2 without dependence on any external utils. Coverage is roughly 386, real and pmode, no FPU, with Linux syscalls.

osimplay has simple analogues of a nice set of C and Forth features, and some unique features such as the "xray" jump-table construct, without creating any syntactic seam between high-level and low-level. There is no asm("") or CODE/ENDCODE.

osimplay can now build working examples of small Linux ELF executables, and a bootsector, and the sources are included. osimplay is thus at beta development level. It's reasonably usable, and the bugs that arise may now be small enough to not always require the author to fix, although I would love to know about them. This version of osimplay is public domain.

Programmers and would-be programmers who enjoy having their assumptions challenged should find osimplay amusing.

ftp://ftp.gwdg.de/pub/cLIeNUX/interim/osimplay.tgz

rickh@capaccess.org
Rick Hohensee, sole author

long blurb......................................

asmacs begat shasm begat osimpa begat osimplay, and I'm saying osimplay is now beta.

asmacs was just a bunch of m4 macros for Gas that simply transliterated Intel opcode and register names to names I consider massively clearer and/or more convenient. Intel MOVx is = in osimplay, and LMSW is loadmachinestatusword. = is about 25% of most code, and I believe there's one occurrence of LMSW in Linux, and I think that's there out of nostalgia. Main register names in osimplay are A, B, C, D, SP, BP, SI and DI. I found asmacs very helpful, and this simple renaming remains the big win in osimplay.
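
For readers wondering how an assembler can live in a shell script at all, here is a rough sketch of the general idea. It is not taken from the osimplay sources: the function names reg_number and load_imm32 are invented for illustration, and osimplay's own syntax and internals may look nothing like this. It maps the short register names above to x86 register numbers and encodes "load a 32-bit immediate into a register" (Intel MOV r32, imm32) using only shell builtins, printing the bytes as hex text so they can be inspected.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT osimplay source: every name here is invented.

# Map the short register names from the post to x86 register numbers.
reg_number() {
    case "$1" in
        A)  echo 0 ;;
        C)  echo 1 ;;
        D)  echo 2 ;;
        B)  echo 3 ;;
        SP) echo 4 ;;
        BP) echo 5 ;;
        SI) echo 6 ;;
        DI) echo 7 ;;
        *)  echo "unknown register: $1" >&2; return 1 ;;
    esac
}

# Encode "load a 32-bit immediate into a register" (Intel: MOV r32, imm32).
# The encoding is opcode 0xB8 + register number, then the value little-endian.
# Prints the five bytes as hex text rather than writing raw binary.
load_imm32() {
    local reg value n
    reg=$1 value=$2
    n=$(reg_number "$reg") || return 1
    printf '%02x %02x %02x %02x %02x\n' \
        $(( 0xB8 + n )) \
        $(( value & 0xFF )) \
        $(( (value >> 8) & 0xFF )) \
        $(( (value >> 16) & 0xFF )) \
        $(( (value >> 24) & 0xFF ))
}

load_imm32 A 1            # b8 01 00 00 00
load_imm32 DI 0x12345678  # bf 78 56 34 12
```

Everything used here (case, echo, printf, shell arithmetic) is a shell builtin, which is the property the post claims for osimplay. A real assembler would write raw bytes and handle many more instruction forms.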
High-level languages have frozen the evolution of assemblers, and some catch-up is about 35 years overdue.

shasm got rid of most of the need for sized register names like A - AX - AL with "byte" and "cell" keywords. The cell concept also hides some fundamental machine information elegantly, which is why it was seen well before shasm (by 1970 or so) in Forth and BCPL. It is very helpful with the fact that a 386 is two different size machines, 16 bit rmode and 32 bit pmode. The concept may be "forward-compatible" to IA64 also, but I don't know that architecture. shasm also allows source/dest or dest/source (AT&T or Intel) syntaxes by expanding the usual "," arguments-delimiter to "to", "from" or "with". shasm got Slashdotted before it could really produce much working 386 code, but it did produce some shortly thereafter.

shasm and its subsequent versions are 100% GNU Bash 2 shell scripts. That's right, just a recent sh. No dd, sed, etc. "Installing", running, and reading some operator-specific osimplay help on Linux/Bash is...

    tar xzvf osimplay.tgz
    cd osimplay_
    . osimplay
    = h

osimpa was shasm+enthusiasm. osimpa added various rustic imitations of C and Forth constructs to shasm, and a couple of features I suspect are unique, without losing seamless access to assembly. A seam is typified by the asm("") seam between C and Gas in the GNU toolchain. osimpa features include: "allot", data "clump"s, "print", "text", "Linux" (syscalls), "entrance" procedures, "heap" (like .bss), "ELF" (executables only) and so on. In the course of adding all that featurism, shasm real mode support was broken, but writing small Linux utilities became almost convenient.

Deliberately avoided, so as to remain an assembler: data types, structured flow control abstractions like DO/WHILE/FOR/ELSE, and of course there are no Obstacle-Oriented Programming techniqueMethodMechanism()s.
Although I don't do IF/ELSE/ENDIF and so on, osimpa "when" conditional branches are pretty nice for what they are, and osimpa has real execution arrays (jump tables, not heavily tested).

osimplay means writing operating systems is simply childsplay. That is hype, and is thus deliberately outrageous, but there's a sliver of truth to it. It should make playing with OS design easier. osimplay can build anything from a Linux console text editor (a fair wad of the beginnings of one is included) to a mode-changing bootsector (also included, working). In other words, real mode is fixed, pmode is almost convenient, and thus osimplay probably does merit the term "beta".

Result.

Even high-level languages as low-level as C or Forth work from some abstraction back to the machine. osimplay is pure bottom-up, being an attempt at a Forth for one-stack machines. There are two areas where I believe this has been worth the effort. Systems programming suffers at the machine/abstraction seam, and there is no such seam in osimplay. That seam is normally considered the cost of portability, but I believe that cost can be greatly reduced in an assembler-like model closer to the machine than C, and besides, there's plenty of 386s out there. I also suspect that osimplay is relatively easy to learn, particularly to self-teach. No pointers (C), no stack-dancing (Forth), no REPxx (x86), fairly interactive ...

An area where it hasn't been worth the effort is runtime performance. C is impressive, even on x86, which isn't a PDP-11. Even when I can beat Gcc, it's not usually by much, but certain areas (switch/case, recursion, very finely factored code...) still bear a closer look. Conversely, it's not so hard to get close to C in assembly in most cases either. Optimized Gcc is good, but unoptimized Gcc can be pretty, uh, amusing.

Beyond.

osimplay looks pretty CPU-independent, and I believe it could be completely portable (across commodity desktop CPUs) with a few more tricks.
The great genius of C is good portability with excellent performance. Everything else about C is minor, including some mistakes. The same is achievable much more simply, even via a shell script. One lesson of Forth is that simplicity is robust.

I can't find the quote on Google, but I believe Rob Pike once told me in 9fans that UNIX naming tradition is horrid. Whether Mr. Plan 9 said so or not, it is. Linux people are repulsed and enraged by my fits of neologistic frenzy. Forth people obsess over names. There is excellent reason for the latter. Bad names don't matter to machines, but they frequently cause humans to write dysfunctional, often totally self-extraneous code. This effect is self-compounding, and I believe people don't appreciate how bad the situation is. To put it positively, I believe renaming is currently a huge opportunity in computing, starting with assembly, which is the point at which names start to matter. So go get osimplay before I decide the name is wrong again :o) It's a script, so feel free to decide the names are all wrong :o)

Beyond beyond.

C claims portability by only modeling the execution engine of the CPU in the core of the language. Forth also. It would be nice if more operating system mechanism were part of a standard portable language. I personally don't know of such a language with systems-grade performance, and if it exists I doubt it's very general. A compembler can help investigate that, even one written in a unix sh.

osimplay is now a distinct language independent of implementation. Not too distinct though; most of it shouldn't be too alien to good programmers, other than the basic fact that in the current implementation your assembler source is a shell script.

ftp://ftp.gwdg.de/pub/cLIeNUX/interim/osimplay.tgz

and browse the cLIeNUX dirs above that :o) That version of osimplay is public domain.

Rick Hohensee
rickh@capaccess.org
http://linux01.gwdg.de/~rhohen