2022-06-28

HAC as an embedded compiler

Here is an example to illustrate how the HAC compiler is embedded in an Ada application.

The user of the application has typically only an executable (built with a "full Ada" system like GNAT) and an "Ada-pplet" (can be a few lines) that is run from that executable.

Normally, applications have different languages for building the application and for the embedded script system: Excel is written in C/C++ but embeds VBA (Visual Basic for Applications). 3D Studio Max has MaxScript. GIMP uses Script-fu, a variant of Scheme. Others use Lua or Python. Of course nobody would think about shipping a software with a scripting language where you have to fight with pointers and conditional compilation (C/C++).

But... how about an application that is written in a compiled language, but is using the same language for its scripting purposes? Since a few days, it is possible with HAC (the HAC Ada Compiler).

If you build the demo exchange_hac_side.adb (e.g. from GNAT Studio, or via the command "gprbuild hac"), you get an executable, exchange_hac_side on Linux or exchange_hac_side.exe on Windows. If you run it, you get:

Exchange_Native_Side is started.

Native: building a HAC program: exchange_hac_side.adb

   Exchange_HAC_Side is started.

   HAC: I call a parameterless callback.
      Native: Parameterless_Callback is speaking!
   HAC: done calling.

   HAC: I send some stuff through callbacks.
      Native: HAC is saying: [I'm HAC and I say hello!]
      Native: HAC has sent me the numbers: 123 456 789
      Native: HAC has sent me the numbers: 1.23000000000000E+02 4.56000000000000E+02 7.89000000000000E+02

   HAC: message before call: [I'm HAC]
      Native: HAC is saying: [I'm HAC]
   HAC: message after call: [No, I'm Native, you loser!]
   HAC: integer before call: [12]
      Native: HAC has sent me the number: 12; I will send it back squared.
   HAC: integer after call: [144]
   HAC: float before call: [11.0]
      Native: HAC has sent me the number: 1.10000000000000E+01; I will send it back squared.
   HAC: float after call: [121.0]

   HAC: integer before call: [13]
        message before call: [I'm a HAC record field]
           enum before call: [BAT]
      Native: Enum = BAT
      Native: Matrix m:
           1.10 1.20
           1.30 1.40
      Native: Matrix n:
          -2.10 2.20
          -2.30 2.40
      Native: Matrix o = m * n:
          -5.07 5.30
          -5.95 6.22
      Native: Enum = CAT
   HAC: integer after call: [169]
        message after call: [I'm a Native message now (niarg niarg niarg)!]
           enum after call: [CAT]
        matrix product:
          -5.07 5.30
          -5.95 6.22

Messages as "Native" are from the main application.

Messages as "HAC" are from the small HAC program that is run from the application.

A little disadvantage of having the same language on both sides... is that it is the same language! It could be confusing at first glance. Plus, it is not in the tradition. That may be disturbing for a number of people.

A big advantage of having the same language is that you can share some pieces of code. For instance the following package is used for building the demo application, and compiled by HAC when the application is running!

------------------------------------------- -- HAC <-> Native data exchange demo -- ------------------------------------------- -- Package used by both HAC and -- -- Native sides -- ------------------------------------------- package Exchange_Common is type Animal is (ant, bat, cat, dog); subtype Beast is Animal; subtype Insect is Animal range ant .. ant; subtype Mammal is Animal range bat .. dog; end Exchange_Common;


HAC (HAC Ada Compiler) is a quick, small, open-source Ada compiler, covering a subset of the Ada language. HAC is itself fully programmed in Ada.

Web site: http://hacadacompiler.sf.net/ 

Source repositories:

 #1 svn: https://sf.net/p/hacadacompiler/code/HEAD/tree/trunk/

 #2 git: https://github.com/zertovitch/hac

2022-06-25

Ann: HAC v.0.2

HAC (HAC Ada Compiler) is a quick, small, open-source Ada
compiler, covering a subset of the Ada language.
HAC is itself fully programmed in Ada.

Web site: http://hacadacompiler.sf.net/
From there, links to sources, and an executable for Windows.

Source repositories:
  #1 svn: https://sf.net/p/hacadacompiler/code/HEAD/tree/trunk/
  #2 git: https://github.com/zertovitch/hac

* Main improvements since v.0.1:

  - a program run by HAC can exchange data with the
      programm running HAC, through dynamically registered call-backs
      - see package HAC_Sys.Interfacing and demos:
        src/apps/exchange_native_side.adb
        src/apps/exchange_hac_side.adb

  - the compiler checks that all choices in a CASE statement
      are covered

  - the compiler performs more compile-time range checks and
      optimizes away useless run-time checks when it's safe to do so.

Enjoy!

Gautier

2022-06-01

Zip-Ada: Zip64 extensions

On the very top of the Zip-Ada to-do list, for a while, was the support for archive containing large (more than 4 GiB data) or numerous (more than 65535) files – the so-called "Zip64" extensions.

It's now gone from the to-do list and done in the project.

The Zip archive format was defined with 32-bit file sizes and 16-bit number of files. At the time (1989) it was a perfectly sound choice, especially given that the software, PKZIP, was written for 16-bit PC's running on MS-DOS. At some point, these limits began to be problematic for archiving or backup tasks. Although the original Zip format is still the most widely used (including many file types without the .zip name extension, such as .docx, .xlsx, .pptx, .jar), the limits become, slowly, more and more frequently an issue over time. Consequently, PKWARE (the firm behind PKZIP) has introduced, around year 2000, a set of format extensions to overcome those limitations. Given the omnipresence of Zip archives and the difficulty to innovate in software, PKWARE did not opt for a fresh, new, simple, but incompatible format. Instead, they designed a set of extensions, allowed by the flexibility of the Zip format. This seems clumsy and over-engineered, but the design is actually quite clever: a program can create a Zip archive, without size limitations, and without checking in advance whether the Zip64 extensions are needed or not. When the archive is finished, either the Zip64 extensions were not needed and the archive is conform to the year 1989 original format, or they were – then, an unpacker for that archive file has to know about Zip64. Thus, the Zip64 design allowed for a progressive adoption of the new format, since the actual need went progressively from very rare to a bit less rare.

How to implement Zip64 in a new software? There are a few options, which you can combine:

1.    Follow the documentation, appnote.txt
2.    Take inspiration from some open-source software like Info-Zip or 7-Zip
3.    Reverse-engineer data (hex dumps)

All three are very time-consuming (especially, the documentation, aimed at maintaining the openness of the Zip format, is "legally" correct but lacks some indications and practical details).
Even widespread, commercial software (with paying customers and many paid programmers behind it) does not support Zip64 (e.g. Microsoft Word for .docx), or took almost two decades to do so (Microsoft Windows for instance). It reveals something about the time and costs related to the implementation…

Fortunately, a very helpful person called Yaakov not only did an implementation from scratch, but also was kind enough to share his experience in an explanatory and colorful way, with a simple example as an illustration. The article is available here: ZIP64 - Go Big Or Go Home.
I show here (with permission) the key diagram of his article:

Click to enlarge


The Zip64 extensions are in dark background and white letters.
The part with the archived (eventually compressed) files is in pink and brown. In that example, there is only one file in the archive.

As you see, an implementation of Zip64 may be quite tricky, but it is feasible, if we have "the big picture" before our eyes.

Alas, things are a little bit more complicated than it appears when reading the article too fast...

We want, with Zip-Ada, to decode and unpack correctly archives not only made by Zip-Ada, but also by Info-Zip's Zip, 7-Zip, WinZip, and so on (the converse is also true: we want the other guys to be able to unpack our Zip-Ada archives).
For that reason, we need to anticipate all possible ways the data headers have been written, not only our choices. In that respect, we have had some headache with the per-entry "size" Zip64 extensions (the parts in brown and dark green in the chart).
The documentation (4.5.3) describes, for that, a record with a variable number of values:

8 bytes    [A] Original uncompressed file size
8 bytes    [B] Size of compressed data
8 bytes    [C] Offset of local header record
4 bytes    [D] Number of the disk on which this file starts


and defines in plain English some rules about the order and in which circumstances the values are present or not.

The documentation requires both [A] and [B] for the Local Header (part in brown in the chart). Let us figure out the explanation to that rule. For a program that creates the archive, it would not be practical to put only the uncompressed size before compressing the data, because the program would have to shift all the compressed content by 8 bytes in case the compressed size (which is known only after the compression) happened to also exceed 2**32 - 2. Hence, exactly and always two values if the uncompressed size (which is usually known in advance) exceeds the limit.

When it is time for the archiving program to write the Central Directory (the part that concludes the archive, in green, light and dark, in the chart), all information about sizes is known. An archiving program may choose to put only the necessary values in 64 bits. However, traditionally, you would expect that the variable record's contents are self-describing and would contain [A], or: [A] and [B], or: [A] and [B] and [C], or: [A] and [B] and [C] and [D], given the record's size.
With Zip64, it is not the case: each field is optional. This micro-optimization is ridiculous given the fact that the size of archive is breaking the 4 GiB limit. To make things worse, the documentation is not clear about that full optionality.
An archiver (7-Zip does so for instance) may put just [A], the uncompressed size of one first large file that is well compressible, or just [C], the offset of one small file following a file with a large size in the archive. Or it could put [A], [B], [C] in both cases (as we do for Zip-Ada's archive creation). So, we've adapted the archive reader to take all possible cases into consideration.

Thanks to Nicolas Brunot for pointing that (hopefully) ultimate difficulty with Zip64.

Latest sources:
 
Link 1: https://github.com/zertovitch/zip-ada/ 
  With git: git clone https://github.com/zertovitch/zip-ada.git
  As Zip-ball: green button "Code", choice: "Download ZIP"
 
Link 2: https://sourceforge.net/p/unzip-ada/code/HEAD/tree/
  With subversion: svn checkout https://svn.code.sf.net/p/unzip-ada/code/ za
  As Zip-ball: button: "Download Snapshot"

2022-05-14

Ann: HAC v.0.1

HAC (HAC Ada Compiler) is a quick, small, open-source Ada
compiler, covering a subset of the Ada language.
HAC is itself fully programmed in Ada.

Web site: http://hacadacompiler.sf.net/
From there, links to sources and an executable for Windows.

Source repositories:
  #1 svn: https://sf.net/p/hacadacompiler/code/HEAD/tree/trunk/
  #2 git: https://github.com/zertovitch/hac

* Main improvements since v.0.0996:

  - packages and subpackages are now supported
  - modularity: packages and subprograms can be standalone
      library units, stored in individual files with
      GNAT's naming convention, and accessed from other units
      via the WITH clause
  - validity checks were added for a better detection of
      uninitialized variables.

Package examples and modularity tests have been added.
Particularly, a new PDF producer package with a few demos
is located in the ./exm/pdf directory.


hakoch

Enjoy!

Gautier
__
PS: for Windows, there is an integrated editor that embeds HAC:
LEA: http://l-e-a.sf.net
PPS: HAC will be shown at the Ada-Europe conference (presentation + tutorial)
http://www.ada-europe.org/conference2022/

2022-05-10

HAC, packages and PDF

In order to illustrate a feature, what is better than a visual example?
In our case, the feature is the modularity, recently implemented in HAC (the HAC Ada Compiler).
So, a nice exercise was to try with a "real" visual package, PDF_Out (project Ada PDF Writer, link here).
Of course, HAC is still a small compiler project which doesn't cover all the sexy features of Ada - even those of the first version of Ada, Ada 1983, which already had extensive generics, aggregates (on-the-fly composite objects as expressions), default values in records or in subprogram parameters, exceptions or user-defined operators.
The easy way of down-stripping PDF_Out for HAC was to open the source code in the LEA editor (link here), and do frenetically and iteratively:

loop
  punch the compile key (F4)
  exit when the package compiles!
  make the code HAC-compatible (sometimes, it hurts!)
end loop


A few things that needed a downgrade for the HAC 0.1 language subset were:

  • Object-oriented type extension (tagged type for PDF stream). In the simplified version, PDF files are the only supported PDF streams.
  • Object.method notation
  • Default values in records, replaced by explicit initialization.
  • Inclusion of raster images (NB: the removal of that feature was unnecessary).
  • User-defined exceptions (PDF_stream_not_created, ...).
  • User-defined operators ("+", ...).
  • Indefinite page table and offset table.

The result can be seen here:


Click to enlarge

 

You can experiment it yourself by checking out or cloning the latest version of HAC:

#1 svn: https://sf.net/p/hacadacompiler/code/HEAD/tree/trunk/
#2 git: https://github.com/zertovitch/hac

The new PDF examples are located in the subdirectory ./exm/pdf




2022-04-12

HAC: packages - now, the whole package!

Drum rolls: HAC (the HAC Ada Compiler) now supports packages.

OK, there is still a slight limitation: library-level packages cannot contain variables and constants, nor an initialization part (begin .. end). Removing this limitation would be complex to implement and is postponed so far for HAC.

A new modularity demo can be found in the "exm" subdirectory.
The main file is "pkg_demo.adb".
It is a simple test for simulating the build of an application with numerous dependencies.
The dependencies take form of packages depending on two or more other packages, down to
a certain recursion depth.
Each package contains a procedure with the same name: "Do_it", which just writes a string related to the package name. For instance,  X_Pkg_Demo_B3.Do_it will write "[B3]", then call X_Pkg_Demo_B31.Do_it, X_Pkg_Demo_B32.Do_it, and X_Pkg_Demo_B33.Do_it.
The package dependencies can be seen the following chart:



NB: the depth and number of children can be set as you wish.

How to run the demo?
First, we generate the package tree.
For that, assume you have the "hac[.exe]" executable ready in the main directory.
Note: in what follows, replace '/' by '\' if you are on Windows.
In the "exm" directory, run from the command line: "../hac pkg_demo_gen.adb".
Alternatively, from LEA, load "pkg_demo_gen.adb" and run it (F9).
The packages ("x_*" files) will be generated.

Now, from the command line, run "../hac pkg_demo.adb" (or from LEA, load it and punch F9).

The result is:

     Specs:  [S][S1][S11][S12][S13][S2][S21][S22][S23][S3][S31][S32][S33]    
     Mixed:  [M][M1][M11][M12][M13][M2][M21][M22][M23][M3][M31][M32][M33]    
     Bodies: [B][B1][B11][B12][B13][B2][B21][B22][B23][B3][B31][B32][B33]    

    
If you are curious about how HAC does the whole build the 40 units (main procedure and 39 packages),
add the "-v2" option to the build command: "../hac -v2 pkg_demo.adb".
On an i7 machine (with HAC itself built in "Fast" mode), the time elapsed
for the entire build is around 0.08 seconds. Only one core is used.
Since GNAT builds also the library "on the fly" by automatically
finding dependencies, you can see it in action by doing:
"gnatmake -I../src -j0 pkg_demo"
Interestingly, GNAT compiles the packages for each depth successively.
Then you can see the result by doing:
"./pkg_demo"
Of course, it's identical to HAC's. 

---

HAC (the HAC Ada Compiler) is a small, quick, open-source Ada compiler,
covering a subset of the Ada language.
HAC is itself fully programmed in Ada.

Web site: http://hacadacompiler.sf.net/

Source repositories:
#1 svn: https://sf.net/p/hacadacompiler/code/HEAD/tree/trunk/
#2 git: https://github.com/zertovitch/hac
 

Enjoy!

Gautier


2022-03-27

HAC: packages (work in progress)

Since a few commits, HAC parses packages! But so far there are limitations : it's "work in progress".

Nevertheless you can already play a bit with the things already programmed in HAC.

File prc.ads:

with Pkg_1, Pkg_2;

procedure Prc is
  x : Pkg_1.A;
 
  use Pkg_1, Pkg_2;

  xg : G;
  y : B;
  z : Pkg_1.E;

begin
  x := 0;
  y := 0;
  z := 2;
end;

File pkg_1.ads:

package Pkg_1 is

  subtype A is Integer;
  subtype B is Pkg_1.A;
  subtype C is B range 0 .. 1000;
 
  package Sub_Pkg is    
    subtype D is C range 1 .. 100;    
  end Sub_Pkg;  
 
  subtype E is Sub_Pkg.D range 2 .. 10;
 
end Pkg_1;

File pkg_2.ads:

with Pkg_1;

package Pkg_2 is 

  subtype F is Pkg_1.E;
  use Pkg_1.Sub_Pkg;
  subtype G is D;
 
end Pkg_2;

Now, from the command-line (assume you are in the main hac directory), do:

hac -v2 prc.adb

*******[ HAC ]*******   Compiler version: 0.0997 dated 19-Mar-2022.
*******[ HAC ]*******   Caution: HAC is not a complete Ada compiler. Type "hac" for license.
. . . .[ HAC ]. . . .   Compiling main: prc.adb
. . . .[ HAC ]. . . .   .Compiling: pkg_1.ads (specification)
. . . .[ HAC ]. . . .   .Compilation of pkg_1.ads completed
. . . .[ HAC ]. . . .   .Compiling: pkg_2.ads (specification)
. . . .[ HAC ]. . . .   .Compilation of pkg_2.ads completed
. . . .[ HAC ]. . . .   Compilation of prc.adb (main) completed
. . . .[ HAC ]. . . .   --  Compilation of eventual with'ed unit's bodies  --
. . . .[ HAC ]. . . .   Build finished in 0.001255600 seconds.
. . . .[ HAC ]. . . .   Object code size: 12 of 100000 Virtual Machine instructions.
. . . .[ HAC ]. . . .   Starting p-code VM interpreter...
-------[ HAC ]-------   VM interpreter done after 0.045569300 seconds.
Execution of prc.adb completed.
Maximum stack usage: 10 of 4000000 units, around 0%.

If you want a more complex example of how HAC can create the library "on the fly" by compiling recursively everything needed to build the main procedure, it is already in the examples (subdirectory exm):

*******[ HAC ]*******   Compiler version: 0.0997 dated 19-Mar-2022.
*******[ HAC ]*******   Caution: HAC is not a complete Ada compiler. Type "hac" for license.
. . . .[ HAC ]. . . .   Compiling main: unit_a.adb
. . . .[ HAC ]. . . .   .Compiling: unit_b.ads (specification)
. . . .[ HAC ]. . . .   .Compilation of unit_b.ads completed
. . . .[ HAC ]. . . .   .Compiling: unit_c.adb (body)
. . . .[ HAC ]. . . .   .Compilation of unit_c.adb completed
. . . .[ HAC ]. . . .   Compilation of unit_a.adb (main) completed
. . . .[ HAC ]. . . .   --  Compilation of eventual with'ed unit's bodies  --
. . . .[ HAC ]. . . .   .Compiling: unit_b.adb (body)
. . . .[ HAC ]. . . .   ..Compiling: unit_e.adb (body)
. . . .[ HAC ]. . . .   ..Compilation of unit_e.adb completed
. . . .[ HAC ]. . . .   ..Compiling: unit_f.ads (specification)
. . . .[ HAC ]. . . .   ...Compiling: unit_g.ads (specification)
. . . .[ HAC ]. . . .   ...Compilation of unit_g.ads completed
. . . .[ HAC ]. . . .   ..Compilation of unit_f.ads completed
. . . .[ HAC ]. . . .   .Compilation of unit_b.adb completed
. . . .[ HAC ]. . . .   --  Compiled bodies: 1
. . . .[ HAC ]. . . .   .Compiling: unit_f.adb (body)
. . . .[ HAC ]. . . .   .Compilation of unit_f.adb completed
. . . .[ HAC ]. . . .   .Compiling: unit_g.adb (body)
. . . .[ HAC ]. . . .   .Compilation of unit_g.adb completed
. . . .[ HAC ]. . . .   --  Compiled bodies: 2
. . . .[ HAC ]. . . .   Build finished in 0.006483700 seconds.
. . . .[ HAC ]. . . .   Object code size: 350 of 100000 Virtual Machine instructions.
. . . .[ HAC ]. . . .   Starting p-code VM interpreter...
Unit_A: demo of modularity features for subprograms.

------

HAC (HAC Ada Compiler) is a small, quick, open-source Ada compiler,
covering a subset of the Ada language.
HAC is itself fully programmed in Ada.

Web site: http://hacadacompiler.sf.net/

Source repositories:
#1 svn: https://sf.net/p/hacadacompiler/code/HEAD/tree/trunk/
#2 git: https://github.com/zertovitch/hac
 

Enjoy!


Gautier

HAC as an embedded compiler

Here is an example to illustrate how the HAC compiler is embedded in an Ada application. The user of the application has typically only an e...