Amc is an extensible generator of source code from ssimfiles.
Amc reads ssim tables, creates C++ in-memory database code as determined by these tables,
and outputs .h and .cpp files.
By default, ssim tables are loaded from directory "data", and output
is written to directories cpp/gen
and include/gen
.
Amc generates hash tables, arrays, linked lists, dequeues, binary heaps; trees; hierarchical, region-based memory allocators, including single and powers-of-two freelists, fifo, linear array (vector), indirect array (with permanent pointers); inline arrays and fixed-size arrays; Functions to convert any struct to/from a string or a bash command line; Enum support, both for integer and string values; presence masks; big-endian fields; sort functions on custom fields; incremental group-by indexes; tracking of pointers with automatic cascade delete; protection against linear scanning when deleting elements; scheduling constructs (for real-time modules); cycle accounting ('traces'). C++ symbols from ssimfile columns; Statically loaded tables; Subprocess invocation. Asynchronous I/O. Bitsets on top of any array type. Char sets; Fixed string types (Pascal strings, Left-padded strings, Right-padded strings, Fixed-length strings with numeric conversion); Scaled decimal types; Dispatches (any group of ctypes), sharing a common header with a type field, or not. Printing, reading, calling dispatches given both binary and text input. Uniform cursor (iterator) interfaces for bheap, hash, tree, array, lines, files in directory, and more. I'm sure I'm forgetting something.
For each program, these things are generated in-place and
from scratch, and can be iteratively customized.
The resulting code forms a database of source code,
containing a superset of functions that will be used by the final
application. The generated code is verbose, user-readable, properly commented,
is intended to be readable by a human, corresponds directly to the final assembly
code, and uses only a small, conservative subset of C++.
Amc
does not modify or reverse-engineer user code, so it's not a framework
where you have to "plug in" anything. It is a tool for constructing software
based on your specifications.
Amc
loads about 100 ssim tables. The full list can be obtained with
acr_in amc
. The exact actual amc
input can be printed with acr_in amc -data
.
About 20% of these tables are responsible for 80% of the generated code, the rest deal with finer details.
Amc
was initially coded by hand, but once its capabilities became powerful enough, it was used to
generate data structures for its next version. As a result, all of Amc
's internal data structures,
both input, and computational, are defined as ssim tuples and can be queried with acr ns:amc -xref
.
Good algorithms and data structures for most problems are known. The problem is attaching them to an application. Usually the costs associated with using algorithms are:
- Performance cost and complexity cost when using libraries.
- Complexity cost due to symbol renamings (what happens with the C++ template sublanguage or with macro preprocessors).
- Debugging and reliability cost when hand-coding algorithms.
- Maintenance cost due to having too many lines of code (technical debt).
- Unexpected changes in upstream generic libraries.
The motivation for writing generators is that writing code for reusability doesn't work. The reason it doesn't work is that the definition of correctness doesn't lie with the piece of reusable code -- it lies with the final application. And so the reusable code always offers to the application too much and at the same time not enough. You need a singly linked list, but the library uses a doubly linked list. You need an extra index, but the library author didn't anticipate it. You have your own strategy for memory management, but the library insists on its own. And you can't customize the library, since for every feature, there is already some user out there who needs it to stay the same. And when you update to the next version of the library, you get tons of features you didn't ask for. Code written for reusability rarely reaches its intended potential in terms of either performance or utility.
Leaving aside reusability for a moment, as can be seen from real life examples, all high-performance systems are hand-coded due to highly specific requirements, and because it allows writing only what's needed, debugging that, and leaving it to run and do its job indefinitely. Yet hand-coding is difficult and requires a lot of debugging and chasing of memory errors and dangling pointers.
All of this may not matter when the problems are small and requirements negligible, but it really starts to matter when you are pushing against the limits of a single machine. The difference between code that runs in 1 millisecond and 10 milliseconds is eventually the difference between 10 servers and 100 servers.
Thus we have a mini-max problem, which is the first sign of a problem worth solving. On one hand, we want maximally customizable code that does only what we want. On the other hand, we want to write and debug the minimal amount of code.
Software complexity models such as COCOMO estimate the defect rate to grow as an exponential function of number of source lines. This exponent should not be underestimated. It is about 1.2 for a national stock exchange with real-time and embedded constraints, meaning that to write 1,000 lines of code, it costs you 3,981 units of effort. And if you write 100,000 lines of code, you pay a full 1,000,000 units. This exponential nature of the cost of lines of code is closely related to the cost of borrowing money. You basically have to pay back with interest, making every line of code in a very real sense "technical debt".
Massive code bases can slow development to a crawl and destroy projects. I don't want to present a macabre listing of all the software projects that failed because they had too much code; I will just assume that we all agree on the validity of the following equation:
(Project Goodness) = W*(Features) - A*(Amount of Code Written)^B
In other words, we like the features of our system, with some weight W
,
and we pay with interest B
and weight A
for the amount
of code we wrote to get these features.
With that, let's see how far we are able to go in solving this problem, and what kinds of cool tricks we can make our generator do.
To run amc
, just type amc
. All of its inputs come from ssimfiles, so no arguments
are necessary. The default invocation prints something like this:
$ amc
report.amc n_cppfile:123 n_cppline:258301 n_ctype:970 n_func:22524 n_xref:460 n_filemod:0
Amc
processes generates about 1M LOC per second.
Of course, this performance is not reflected in
the final executable, which means that adding new checks to amc is effectively free.
It is important that amc outputs are versioned in git, so we can trace the origin of any change (with git annotate), and continue to make changes to amc without breaking existing code.
When amc is given an argument, it runs in query mode. Instead of modifying source files it simply prints to stdout all internal code sections whose key matches the specified regex (typically it's a ctype name or a function name).
This is the fastest way to check how a certain function is implemented.
amc amc.%
This would generate all functions which are used to compile amc itself. The apparent circularity
exists because at some point, functions implementing amc were written by hand,
and then amc was modified to generate them and save them to cpp/gen/amc_gen.cpp
.
Please see The Algorithm For Generating Any Code
To limit amc's output to prototypes only, specify -proto
:
amc amc.% -proto
It generates 15-25 lines of code per 1 line of ssimfile specification. It is easy enough to check this claim:
$ acr ns:amc -t | wc -l
2010
$ amc amc.% | wc -l
47431
The specification can be edited manually and adjusted frequently with Unix tools such as
sed and perl, or by issuing acr_ed
commands. This makes the cost of
ssim specifications lower than the cost of regular code.
The sandbox mode is provided to help add features to amc without creating a dead-end (where generated code is broken, but cannot be re-generated because amc needs to be rebuilt)
abt -testgen
compiles a debug version of amc, creates the sandbox directory .testgen, runs amc in the .testgen
directory, shows a diff between current and sandboxed changes, builds all unit tests inside the sandbox
and runs the tests. The command succeeds if all of the steps succeed.
Amc source code is located under cpp/amc. The list of all the source files and headers
can also be examined with acr targsrc:amc/%
amc inpt tables are in data/amcdb
and data/dmmeta
; The full list can be obtained
with acr_in amc
.
The Relational Model is a universal way to represent knowledge, first described by Edgar Codd. It is related to Zermelo–Fraenkel set theory.
It is a good foundation. In a relational data model, individual records are represented by tuples (what we call a ctype). Each attribute (what we call a field) is either a basic type, or a reference to some other tuple.
The reference can take several forms -- it could be a direct pointer (Upptr), or a primary key (Pkey). In either case, the whole reason we write compiled language programs instead of solving all problems using SQL and MariaDB, is that reversing the direction of reference lookup -- finding all records that point to a given record -- is expensive. Cheaper cross-references is really the reason why most programs exist in the first place. In database terms, a cross-reference is a group-by; in amc, cross-reference is always incremental -- automatically updated as new records are added or removed from tables.
A cross-reference is established by use of an xref record. Here is a random example from amc, which needs to keep track of targets and their dependencies.
dmmeta.field field:amc.FTarget.c_targdep arg:amc.FTargdep reftype:Ptrary dflt:" comment:"
dmmeta.ptrary field:amc.FTarget.c_targdep unique:Y
dmmeta.xref field:amc.FTarget.c_targdep inscond:true via:amc.FDb.ind_target/dev.Targdep.target
This says: Dear amc, whenever a targdep
record is inserted in my program, find an instance
of target
by using global index ind_target
with key dev.Targdep.target
as the key, and
insert targdep
into an array of pointers rooted in target
.
Whenever a targdep
record is deleted, automatically remove it from the list.
(Removing from a Ptrary is expensive but the last part won't be needed).
The main xref types supported by amc are Ptr, Ptrary, Llist, Thash, Bheap, Atree and Count.
Xrefs can be easily added and removed either by hand (with acr -e
or by editing ssimfiles
directly), or using acr_ed
. In the example above, acr_ed -create -field amc.FTarget.c_targdep -write
would be enough to establish the xref.
There can be any number of xrefs between any two tables. So, if you want a record to be
a member of ten linked lists and eleven binary heaps -- you're free to do so.
Xrefs are exactly analogous to RDBMS indexes; except xrefs can be established between any two
tables, so they are also partitioned indexes
and incremental group by
s at the same time.
I would re-state here the fact that these xrefs, or indexes, are secondary to the data. The information needed to establish the xrefs must be contained in the data itself. This typically means extra information in the leaves.
It doesn't make sense to say "hash of X" or "tree of Y". Elements of X and Y are primary, and exist even if you remove all meaningful access paths to them. Instead, we speak of "access to X". The indexes are roads, and there can be many roads to the same place.
To visualize xrefs, it may be useful to use amc_vis
.
The main tables Amc uses as input are ns, ctype and field. Ns maps to a C++ namespace. Ctype corresponds to a struct. Field corresponds to a struct member or some derived value.
The main attributes of a field are its name, arg, and reftype. Arg is the type, and reftype is a modifier, or 'type constructor'.
dmmeta.field field:amc.FDb.ind_ns arg:amc.FNs reftype:Thash dflt:"" comment:""
dmmeta.field field:dmmeta.Field.field arg:algo.Smallstr100 reftype:Val dflt:" comment:"
Throughout the code base, you will see several string types in use. They are fairly straightforward, and they're all described below.
-
algo::strptr
: Length-delimited string pointer. There are two fields in this struct:char *elems, int n_elems
; The string is just n chars starting at elems; No null-termination is assumed or allowed. It's safe to use strptr in function prototypes when the argument isn't modified by the function -- all other string types can be converted to such temporary strptr and passed to a function. amc's hash find uses strptr as argument type whenever the key type is some kind of string. -
algo::cstring
: Tary of char. Contents can be cast to strptr. Fields are:char *ch, int ch_n, int ch_max
. operator = is defined for cstring, so these can be assigned like values -
algo::tempstr
: cstring, to be used only as the return type of a function, in order to avoid copying data. Has the property that when assigned to cstring, the contents are moved instead ("move constructor semantics").
- DO NOT assign tempstr, cstring, or a temporary variable to a strptr, since strptr is just a pointer.
- DO NOT pass cstring& to a function when strptr is sufficient.
- DO NOT return cstring from functions, it will result in extra alloc/copy/delete. Return tempstr instead.
In addition to these string types, which are fully sufficient for all practical purposes, there are many possible fixed-length string types that are generated by amc, mostly for protocol-handling and database interoperability purposes.
A lot of these are are already defined, but new ones can be built as needed. use "acr smallstr" for the full list. Brief explanation below.
-
algo::SmallstrN
: pascal string with N characters. Both data and length are part of the struct. Don't send this over the wire, because unused portions of the string may contain garbage from the stack. -
algo::RspaceStr{1..255}
: Fixed-length string field padded on the right with spaces. Can be sent over the wire (included in protocol types) -
algo::RnullStr{1..255}
-
algo::LspaceStr{1..255}
: similar to the above -
algo::LnumStr{1..N}_U{32,64}
: Fixed-length string field, padded on the left with '0' (ascii character '0', not the NUL character). Includes functions to convert to/from a u32/u64. Number cannot be negative, because left-padding with 0 prevents that. -
LnumStr{1..N}_U{32,64}_Base{B}
: Same as above, but different base.
The functions amc generates are all global functions. I.e. they are not member
functions of any class. The function name is usually built from the name of the related
field, plus a name. For example, a big-endian u32 called value
will cause a function
named value_Get
to be generated. A linked list field called zd_target
will cause
amc to create functions zd_target_Insert
, zd_target_Prev
, etc. These function names
are readable because the field name contains a hint about its type.
The table dmmeta.fprefix
defines a constraint between field prefix and a field reftype,
as interpreted by amc. If the field's name begins with the prefix (such as zd), followed
by an underscore, then the field must have the specified reftype.
The defined prefixes are:
bh
-> Bheapc
-> Ptr or Ptrarycd, cdl, cs, csl, zd, zdl, zs, zsl
-> Llistcnt
-> Countind
-> Thashp
-> Upptrtr
-> Atree
Additional prefixes may be defined by the user.
The main algorithm for generating any code (not just with amc) is simple:
- Manually write the code that you want to generate and make sure it works.
- Put a print statement around each line.
- Move the resulting code to the appropriate place in your generator.
- Run your generator. Now you have 2 copies of your code: one you started with, and the other that was printed by the generator. If you did everygthing right, you should get a link error now.
- Delete the manually written code.
- Parameterize the generator so that it can generate a whole family of implementations that look similar.
It is usually not a good idea to start programming new features in amc itself. It is very tiresome to debug such code. The code should already have been written by hand, possibly a couple of times, to the point where the duplication occurs, but the different implementations cannot be unified because of either unacceptable performance costs, or too many dependencies. Such code is to be lifted into a generator.
TBD
A scaled decimal type is an integer with an implied decimal point. For instance, a number
with two implied decimal places can be constructed with an fdec
record, specifying nplace:2
.
Attribute fixedfmt
controls whether trailing zeros are left in place when printing.
With fixedfmt:Y
, 0 is printed as 0.00
, otherwise it is just printed as 0.
Example:
dmmeta.field field:algo.U64Dec2.value arg:u64 reftype:Val dflt:"" comment:""
dmmeta.fdec field:algo.U64Dec2.value nplace:2 fixedfmt:Y comment:"0 prints as 0.00"
The scaled decimal type generates functions to convert between the stored (scaled) value
and a numeric value: value_qSetDouble
, value_SetDoubleMaybe
, value_GetDouble
, value_GetInt
.
// Set value of field value.
// The value is rounded to the nearest integer.
// This ensures that truncation of a near-integer value does not occur.
// Example: 1.3 cannot be represented exactly as a double, the actual
// stored value will be 1.29999999. when we apply C truncation,
// we want to end up with 1.3, not 1.2.
void value_qSetDouble(algo::U64Dec2& parent, double val) __attribute__((nothrow));
double value_GetDouble(algo::U64Dec2& parent) __attribute__((nothrow));
// Return integer portion (divide number by 100)
u64 value_GetInt(algo::U64Dec2& parent) __attribute__((nothrow));
// Return constant 100
u64 U64Dec2_GetScale() __attribute__((nothrow));
// Set value of field value, using rounding.
// If value is out of range for the target type, return false.
bool value_SetDoubleMaybe(algo::U64Dec2& parent, double val) __attribute__((nothrow));
Functions to print and read the value are also generated.
You can query the generated source for value_Print
with amc algo.U64Dec2.value.Print
bool value_ReadStrptrMaybe(algo::U64Dec2& parent, algo::strptr in) __attribute__((nothrow));
void value_Print(algo::U64Dec2& parent, cstring &outstr) __attribute__((nothrow));
Adding an fbigend
record to a field indicates that the in-memory byte order is big-endian.
dmmeta.field field:atf_amc.TypeBE16.value arg:u16 reftype:Val dflt:"" comment:""
dmmeta.fbigend field:atf_amc.TypeBE16.value comment:""
Amc generates values value_Get
and value_Set
to access the value.
u16 value_Get(const atf_amc::TypeBE16& parent) __attribute__((__warn_unused_result__, nothrow));
void value_Set(atf_amc::TypeBE16& parent, u16 rhs) __attribute__((nothrow));
When generating the C++ struct, amc adds _be
to the field name so that the programmer
becomes aware, should they choose to access it manually, that the field may be encoded.
$ amc atf_amc.TypeBE16%
// --- atf_amc.TypeBE16
struct TypeBE16 { // atf_amc.TypeBE16
u16 value_be; // 0
TypeBE16();
};
...
// --- atf_amc.TypeBE16.value.Get
inline u16 atf_amc::value_Get(const atf_amc::TypeBE16& parent) {
return be16toh(parent.value_be); // read big-endian value from memory
}
// --- atf_amc.TypeBE16.value.Set
inline void atf_amc::value_Set(atf_amc::TypeBE16& parent, u16 rhs) {
parent.value_be = htobe16(rhs); // write big-endian value to memory
}
...
Amc supports bitfields. In the example below, the field bits5
is defined
as bits 5..10 of the value of the field called value
. It is important that
the source field is explicitly specified. This eliminates any ambiguity about where
in memory the bitfield may be. The source field can be any integer type.
Bitfields also work identically between big- and little-endian source fields. The reason is that we define the bitfield in terms of a shift- and mask- operation on the source field. It is purely an arithmetic operation that produces the value of the bitfield.
If you've looked at Linux system headers for network protocols, you've seen ungainly #ifdef
blocks.
They are needed in C because C bitfields are defined against memory, not against the value of
a source field.
dmmeta.field field:atf_amc.BitfldType1.value arg:u64 reftype:Val dflt:"" comment:""
dmmeta.field field:atf_amc.BitfldType1.bits5 arg:u64 reftype:Bitfld dflt:"" comment:""
dmmeta.bitfld field:atf_amc.BitfldType1.bits5 offset:5 width:10 srcfield:atf_amc.BitfldType1.value comment:""
A bitfield can have any type - bool
, u64
, or a custom type that wraps an integer.
$ amc atf_amc.BitfldType1.bits5.% -proto
// Retrieve bitfield from value of field value
// 10 bits starting at bit 5.
u64 bits5_Get(const atf_amc::BitfldType1& parent) __attribute__((__warn_unused_result__, nothrow));
// Set bitfield in value of field 'value'
// 10 bits starting at bit 5.
void bits5_Set(atf_amc::BitfldType1& parent, u64 rhs) __attribute__((nothrow));
The implementation is straightforward:
$ amc atf_amc.BitfldType1.bits5.Set
// --- atf_amc.BitfldType1.bits5.Set
// Set bitfield in value of field 'value'
// 10 bits starting at bit 5.
inline void atf_amc::bits5_Set(atf_amc::BitfldType1& parent, u64 rhs) {
u64 t1 = u64(0x3ff) << 5;
u64 t2 = (u64(rhs) & 0x3ff) << 5;
parent.value = u64((parent.value & ~t1) | t2);
}
bitfields can be read from string tuples just like other fields. When printing a ctype containing bitfields to an ssim tuple, amc does not print the source field; instead, all the bitfields are printed instead.
When printing a ctype containing bitfields using the Raw
format, only the source field is printed,
and bitfields are skipped.
It is an error to specify a combination of offset and width that is out of bounds for the source type.
Amc will flag this. It is also an error to have two bitfields overlap. All of the offset+width
ranges must be disjoint.
Here is an example of the code amc generates when the source field is big-endian. (Endianness tends to be confusing, because languages try hard to erase the boundary between memory and values, so the user gets an extra warning).
$ amc atf_amc.TypeBE64sf.bit63.Get
// --- atf_amc.TypeBE64sf.bit63.Get
// Retrieve bitfield from value of field value
// 1 bits starting at bit 63.
// NOTE: bits correspond to the the integer value of the field.
// The value is obtained by reading bytes from memory and swapping them.
inline u64 atf_amc::bit63_Get(const atf_amc::TypeBE64sf& parent) {
return u64((value_Get(parent) >> 63) & 0x01);
}
UNDER CONSTRUCTION.
Steps are a scheduling construct to associate actions to fields.
A step can be thought of as a cooperative thread -- a function that's assigned to some top-level variable (pointer, bool, list or heap) that is invoked whenever that variable is non-empty, and advances the process towards its goal.
The amc model for a server process is a top-level while loop, which calls a top-level Step function for each namespace linked into the process. The namespace Step function then performs some action for each fstep defined in that namespace.
The main loop is defined in terms of real time: it executes until the value of CPU clock (rdtsc) reaches some limit (typically 2^64-1). The scheduling cycle begins by setting next_loop to the limit, then executing all the steps. the steps adjust next_loop down depending on their scheduling needs. At the end of the scheduling cycle, unused remaining time is given up to the OS via epoll or nanosleep (if there are no file descriptors open). This way, both hot-spinning and blocking processes are covered with minimal overhead.
The following step types are defined: Inline, InlineRecur, TimeHookRecur, Callback To define a step that is performed periodically on a timer, use fdelay.
Inline step is the simplest: on every pass, the empty condition is checked on the underlying field, and a user-provided Step function is called if the field is non-empty.
InlineRecur step requires an fdelay record specifying the initial delay between steps. The logic is the same as Inline, with a time-based delay between steps.
UNDER CONSTRUCTION. Ftrace record can be used to enable counting of alloc/delete calls for each pool.
UNDER CONSTRUCTION.
The Base reftype copies all fields from one arg to field's parent ctype. There are two main use cases.
-
Protocol messages, where Base is used to declare a common message header for all message types. Amc generates functions to go back and forth between header and specific type using Castdown / Castbase functions.
-
In-memory tables based on ssimfiles, which inherit all of the fields defined in the ssimfile. These do not allow casting back and forth since memory layouts differ, there are extra pointers in the in-memory version, some attributes are being used for cross-references (joins), etc.
UNDER CONSTRUCTION.
The binary heap is implemented as an flat array of pointers (e.g. a Ptrary).
UNDER CONSTRUCTION.
Bitsets can be created on top of any integer field (e.g. u8 to u128) or array field (Inlary, Tary). Amc generates functions to provide indexed access to bits of the underlying field.
UNDER CONSTRUCTION.
Count is a xref type that simply keeps track of the number of child elements referring to a given parent. The elements themselves are not accessible via this field.
UNDER CONSTRUCTION. This reftype is not specified explicitly. It is applied when fconst record appears. Each fconst record names a symbol and a C++ expression (value). The symbol is the string representation of the vaule. Amc creates an enum type with values, and creates to-string and from-string functions that translate between values and symbols.
A related record is Gconst, which can be used in place of fconst to create an enum type out of an entire table.
When printing, if there is no symbol associated with the numeric value of the field being printed, the value is printed as a number. When reading, if input symbol doesn't map to any known value, it is parsed as an integer instead.
UNDER CONSTRUCTION.
Inlary uses memory reserved directly inside the parent struct. A dmmeta.inlary record is required, specifying min,max elements. If min=max, it is considered a fixed array. Fixed array has neither Alloc nor Delete functions, and there is no count of elements. If min < max, then the first min elements are created when the parent is constructed, and the rest can be dynamically allocated. The Inlary supports random access because it is an array.
One particularly cool use of Inlary is with gstatic. Whenever gstatic is specified, the contents of an ssim table are ``statically'' included into the generated source file. When gstatic is combined with Inlary, amc creates a C++ compiler symbol with a name derived from the primary key of the source table. The symbol is a reference whose value is a compile-time constant since the offset to the variable is known. The symbol can be used at runtime to access the record, which is guaranteed to exist and be properly cross-referenced.
UNDER CONSTRUCTION.
UNDER CONSTRUCTION.
Amc generates 32 flavors of linked lists, almost as many as Heinz for ketchup. Linked lists can be:
Singly or doubly linked (as indicated by letter s or d in the prefix) Zero-terminated or circular (as indicated by letter z or c in the prefix) Fifo or lifo (as indicated by presence of letter l in the prefix, l for lifo) With or without count With or without tail pointer
Circular linked lists are often used with steps, because it is convenient to call RotateFirst to both grab an element off the head of the list, and move this element to the back.
UNDER CONSTRUCTION.
Opt corresponds to 0 or 1 values at the end of a struct. This type is frequently used by protocols to specify optional payload. Amc provides functions that can validate and access the payload.
UNDER CONSTRUCTION.
Amc provides convenient presence mask support. If any field in a ctype is marked Pmask, then amc allocates 1 bit in the pmask for each field in the struct. The bits are initially zero. Amc then generates a Set, PresentQ, and SetPresent functions for each field, unless one already exists. Whenever the Set function is called on a field, the corresponding pmask bit is set to 1. When reading from a string, the pmask bits are populated for each scanned attribute. When printing, fields with zero pmask bit are not printed.
Regardless of pmask status, all fields of a struct are initialized to default values upon construction, so there is no speed to be gained from using pmask. Access to fields is not protected in any way -- they can be accessed as if the pmask didn't exist. The generated PresentQ function can be used to check if a field was previously Set.
UNDER CONSTRUCTION.
Ptr is a cross-reference type used when there can be 0 or 1 child records for each parent record.
UNDER CONSTRUCTION.
Tary of Ptrs.
UNDER CONSTRUCTION.
When using this field type, amc ignores the field arg and inserts an algo_lib.Regx into the parent structure. The expression intended to match the primary key of the target type. This reftype is very useful in command line arguments.
UNDER CONSTRUCTION.
Smallstr is a fixed-length character field. Memory is reserved inline in the parent struct. Strings can be length-suffixed (Rpascal), left-padded or right-padded. For padded strings, the string value is calculated by discarding the pad characters from the edge. Any smallstr is castable to strptr.
UNDER CONSTRUCTION.
Hash table, implemented as a Tary of pointers. Collisions are implemented as a singly linked list. Hash tables grow automatically. Whenever the number of entries in the hash table is greater than the number of buckets (pointers), the hash table size is doubled. The size of the array of pointers is always a power of 2. This means that a hash function has to be good. Amc can generate hash functions for any ctype, typically CRC32 is used.
Hash tables can be unique of non-unique. A unique hash table rejects insertions of duplicate keys. A non-unique hash table allows them.
For records that have only one hash access path defined for them, amc generates a GetOrCreate function which is a convenient way to force creation of an element when you know its key.
UNDER CONSTRUCTION.
Upptr is a pointer to a pre-existing record, usually non-null. Upptr is a reference. By contrast, Ptr is a cross-reference. I.e. a Ptr becomes non-null when another record starts pointing at the parent of the Ptr. Upptr becomes non-null when a lookup in an index is performed. amc_vis will complain if there are circular dependencies implied by Upptr (i.e. A has Upptr to B, B has Upptr to A). Circular dependencies between Ptr fields are OK.
UNDER CONSTRUCTION.
Varlen corresponds to a (possibly empty) array of fixed-size records appended to the end of a struct. the parent ctype must have a length field defined. varlen fields cane be read from a string or iterated over. This type is frequently used by wire protocols to specify a repeated section of a message.
amc_gc
is a tool for removing unused records from the dmmeta database.
amc_gc
takes a target regex and an acr query as an argument. It finds all records
matching the query, then it tries to remove them one by one, and rebuilds the targets
(using abt) in a sandbox dir. If build succeeds with one of the records deleted
it means it wasn't needed in the first place.
Let's illustrate amc_gc
by creating a new program and inputting a table.
$ acr_ed -create -target sample -write
$ acr_ed -create -finput -target sample -ssimfile dmmeta.ns -write
Since the ns
table is unused, sample
will compile even if we remove it. This is the
case that amc_gc
detects, and can remove the table:
$ amc_gc -target:sample -key:ctype:sample.%
amc_gc.begin tot_rec:2 n_cppline:259802 watch_cmd:"watch head -50 temp/amc_gc.build"
amc_gc.analyze query:dmmeta.ctype:sample.FDb eliminate:N rec_no:1 tot_rec:2 n_del:0 n_cppline:259802 n_cppline_del:0
amc_gc.analyze query:dmmeta.ctype:sample.FNs eliminate:Y rec_no:2 tot_rec:2 n_del:1 n_cppline:259341 n_cppline_del:461
report.amc_gc key:ctype:sample.% n_match:2 n_del:1 n_cppline:259341 n_cppline_del:461
And indeed, amc_gc
successfully garbage collects the table.
Let's finish by deleting the unused target
$ acr_ed -del -target sample -write
Follow the steps below to create a new sample program.
The program will print the names of all of its own structures, and their fields, cross-referenced twice: first, by membership and then by computing all back-references.
This seems like an appropriately self-referential way to say hello using the tools at our disposal. Having a program print its own data structure is also mind-boggling if you think about it for too long.
Use this as a starting point, or to get motivated to read one of the tutorials.
acr_ed -create -target hi -write
cat > cpp/samp/hi.cpp << EOF
#include "include/hi.h"
void hi::Main() {
prlog("Hello Meta World!");
ind_beg(hi::_db_ctype_curs,ctype,hi::_db) {
if (ns_Get(ctype) == dmmeta_Ns_ns_hi) {
prlog("ctype "<<ctype.ctype);
ind_beg(hi::ctype_zd_field_curs,field,ctype) {
prlog(" has field "<<field.field<<" of type "<<field.arg<<" reftype:"<<field.reftype);
}ind_end;
ind_beg(hi::ctype_zd_arg_curs,arg,ctype) {
prlog(" is referred to by field "<<arg.field<<" using "<<arg.reftype);
}ind_end;
}
}ind_end;
}
EOF
acr_ed -create -finput -target hi -ssimfile:dmmeta.ctype -indexed -write
acr_ed -create -finput -target hi -ssimfile:dmmeta.field -write
acr_ed -create -field:hi.FField.p_ctype -arg:hi.FCtype -xref -via:hi.FDb.ind_ctype/dmmeta.Field.ctype -write
acr_ed -create -field:hi.FCtype.zd_field -arg:hi.FField -xref -via:hi.FDb.ind_ctype/dmmeta.Field.ctype -write
acr_ed -create -field:hi.FCtype.zd_arg -arg:hi.FField -xref -via:hi.FDb.ind_ctype/dmmeta.Field.arg -write
abt -install hi
hi
For each ctype
, instances of which can be dynamically allocated
(i.e. not on the stack), amc generates two functions: Alloc
and Delete
.
This section will explain all of the available pool types, how to chain them so that one pool allocates its own memory from another, and provide examples.
In amc
, memory pools are fields with specific reftypes. They are given names and
they can be referred to.
A memory pool can be thought of as any mechanism by which new instances of records can be allocated. Amc provides the following base set of memory pools, listed here in alphabetical order.
- Blkpool - A mostly-fifo allocator based on refcounted blocks.
- Delptr - Indirect Val; A single value, always freed in destructor.
- Inlary - A piece of memory for min..max elements inside the parent struct.
When
min == max
, there is noAlloc
function, and it behaves like NVal
s. Whenmin < max
, new elements can be allocated. Only the last element can be freed. - Lary - Effectively an Inlary of 32 Tarys, each of size 2^k. Has permanent pointers.
- Lpool - 32 Tpools, each for elements of size up to 2^k.
- Malloc - Simply calls malloc() / free().
- Sbrk - Calls sbrk() to get memory.
- Tary - Pointer to a resizable array (typically growing by powers of 2). Similar to std::vector.
- Tpool - A singly linked list of free elements; Gets new memory from base pool, frees elements into the list.
- Val - A single value, automatically allocated in constructor, and freed in destructor.
A pool is declared like a field.
dmmeta.field field:acr.FDb.file arg:acr.FFile reftype:Lary dflt:"" comment:"List of all loaded files"
This provides a hook for amc
to generate the functions initializing
and maintaining the state of the pool.
Amc keeps track of all the pools which claim to be able to provide memory for a given ctype.
In the example above, amc
would generate functions file_Alloc
and file_Delete
.
The names of these functions are not derived from the ctype; They are derived
from the pool name. It is the pool's state that gets altered when one of these functions
is called.
For any pool, you can specify where to get memory from. This is called a basepool,
and is specified with a dmmeta.basepool
record.
In the example below, the Lpool
algo_lib.FDb.lpool
calls sbrk_AllocMem
whenever it needs more memory.
dmmeta.field field:algo_lib.FDb.sbrk arg:u8 reftype:Sbrk dflt:"" comment:"Base allocator for everything"
dmmeta.field field:algo_lib.FDb.lpool arg:u8 reftype:Lpool dflt:"" comment:"private memory pool"
dmmeta.basepool field:algo_lib.FDb.lpool base:algo_lib.FDb.sbrk
The basepool is a really great feature. Imagine a realtime program with strict memory requirements; Everything must be pre-allocated and system calls are not allowed during runtime, since they would cause latency spikes 1000x of what's allowed. The configuration of memory pools can be done entirely outside of the source code base. One can write straightforward bash scripts that try different memory pools and measure their relative performance.
To avoid having to specify a base pool for every single pool in a given
namespace, you can (actually, you must) specify a default namespace pool in the dmmeta.nsx
record.
dmmeta.nsx ns:algo_lib genthrow:N correct_getorcreate:Y pool:algo_lib.FDb.lpool sortxref:N pack:N fldoffset_asserts:N comment:""
There are two ways to run out of memory: voluntarily (because you decide a table got too big)
and involuntarily (the OS refuses to give you more memory). The first method is on the user --
amc
does not support pool limits. When the OS is out of memory, the function either exits
the calling process (it's a fatal error) or returns NULL. A function
with an unambiguous name such as Alloc
will kill the process when an out-of-memory condition occurs.
A function ending in Maybe
, such as AllocMaybe
, will return NULL.
One of the use cases of amc
is to generate deterministic processes. Think of two programs, running
in parallel on two different physical hosts, processing the same sequence of messages. The output
of both of these processes is sent to the same destination and de-duplicated based on sequence numbers.
This is a hot-hot redundancy scenario with great latency characteristics (the faster of the two messages
becomes the output, shaving off some latency spikes). In any case, the output of the two proceses
must be identical and depend only on the input. This means that a process is not allowed to strategize
around low-memory conditions. That's why exiting on out-of-memory is a valid, in fact the only possible
strategy.
The block pool is a free-list of large-ish blocks. Memory is allocated from the current block, with others serving as reserve. Allocated elements contain a back-pointer to the beginning of the block, and increment a refcount on the block. Elements are allocated back-to-back starting at the beginning of the block until the block is exhausted, respecting the necessary alignment.
When freeing memory, the refcount is decremented but memory cannot be reused (yet). Only when the block's refcount goes to zero, the entire block goes back to the free list. This allocator is suitable for messages that are being processed using a fifo strategy. It is not good for random access patterns because one unfreed message can hold an entire block in memory, eventually exhausting memory.
Delptr is an indirect Val. The parent struct gets a pointer which is initially NULL
.
When you call _Access
, and the point is still NULL
, a new value is allocated.
When the the struct is deleted, the value is freed.
Here is an example of a Delptr
field:
dmmeta.ctype ctype:atf_amc.DelType1 comment:"Delptr test 1"
dmmeta.field field:atf_amc.DelType1.u32val arg:u32 reftype:Delptr dflt:34 comment:""
Here is the generated struct:
// --- atf_amc.DelType1
struct DelType1 { // atf_amc.DelType1: Delptr test 1
u32* u32val; // Private pointer to value
DelType1();
~DelType1();
private:
// reftype of atf_amc.DelType1.u32val prohibits copy
DelType1(const DelType1&){ /*disallow copy constructor */}
void operator =(const DelType1&){ /*disallow direct assignment */}
};
And here is the implementation of _Access
.
The underlying memory pool used is the one associated with its namespace
via the nsx
record.
// --- atf_amc.DelType1.u32val.Access
// Get or Create
// Access value, creating it if necessary. Process dies if not successful.
u32& atf_amc::u32val_Access(atf_amc::DelType1& parent) {
u32 *ret=parent.u32val;
if (!ret) {
ret = (u32*)algo_lib::malloc_AllocMem(sizeof(u32));
if (!ret) {
FatalErrorExit("out of memory allocating u32 (in atf_amc::DelType1.u32val)");
}
new(ret) u32(34);
parent.u32val = ret;
}
return *ret;
}
Tbd
Lary is implemented as 32 pointers in the parent struct. Level k holds as pointer to a block of elements of length 2^k. Indexed lookup involves just 1 memory access, because amc uses BitScanReverse to find which level the element lives on. When a level is exhausted, another level, 2x the size, is allocated. Since none of the previous levels need to be reallocated, the pointers returned by Lary are stable and so elements can be freely cross-referenced. Lary is the most common pool.
The permanent pointer promise is the main reason for Lary's existence. Lary is the default pool for most records. Here is an example of one:
dmmeta.field field:ssim2mysql.FDb.ctype arg:ssim2mysql.FCtype reftype:Lary dflt:"" comment:""
And here are the generated functions.
// Allocate memory for new default row.
// If out of memory, process is killed.
ssim2mysql::FCtype& ctype_Alloc() __attribute__((__warn_unused_result__, nothrow));
// Allocate memory for new element. If out of memory, return NULL.
ssim2mysql::FCtype* ctype_AllocMaybe() __attribute__((__warn_unused_result__, nothrow));
// Create new row from struct.
// Return pointer to new element, or NULL if insertion failed (due to out-of-memory, duplicate key, etc)
ssim2mysql::FCtype* ctype_InsertMaybe(const dmmeta::Ctype &value) __attribute__((nothrow));
// Allocate space for one element. If no memory available, return NULL.
void* ctype_AllocMem() __attribute__((__warn_unused_result__, nothrow));
// Return true if index is empty
bool ctype_EmptyQ() __attribute__((nothrow));
// Look up row by row id. Return NULL if out of range
ssim2mysql::FCtype* ctype_Find(u64 t) __attribute__((__warn_unused_result__, nothrow));
// Return pointer to last element of array, or NULL if array is empty
ssim2mysql::FCtype* ctype_Last() __attribute__((nothrow, pure));
// Return number of items in the pool
i32 ctype_N() __attribute__((__warn_unused_result__, nothrow, pure));
// Delete last element of array. Do nothing if array is empty.
void ctype_RemoveLast() __attribute__((nothrow));
// 'quick' Access row by row id. No bounds checking.
ssim2mysql::FCtype& ctype_qFind(u64 t) __attribute__((nothrow));
// Insert row into all appropriate indices. If error occurs, store error
// in algo_lib::_db.errtext and return false. Caller must Delete or Unref such row.
bool ctype_XrefMaybe(ssim2mysql::FCtype &row);
Lpool is 32 Tpools, one for each allocation size. When allocating memory, the request is bumped up to the nearest power of 2 and from there Tpool logic is followed.
Here is an example of an Lpool declaration:
dmmeta.field field:algo_lib.FDb.lpool arg:u8 reftype:Lpool dflt:"" comment:"private memory pool"
dmmeta.basepool field:algo_lib.FDb.lpool base:algo_lib.FDb.sbrk
And here are the generated functions. As always, the actual code
of these functions can be queried with amc algo_lib.FDb.lpool.%
// Free block of memory previously returned by Lpool.
void lpool_FreeMem(void *mem, u64 size) __attribute__((nothrow));
// Allocate new piece of memory at least SIZE bytes long.
// If not successful, return NULL
// The allocated block is 16-byte aligned
void* lpool_AllocMem(u64 size) __attribute__((__warn_unused_result__, nothrow));
// Add N buffers of some size to the free store
bool lpool_ReserveBuffers(int nbuf, u64 bufsize) __attribute__((nothrow));
// Allocate new block, copy old to new, delete old.
// New memory is always allocated (i.e. size reduction is not a no-op)
// If no memory, return NULL: old memory untouched
void* lpool_ReallocMem(void *oldmem, u64 old_size, u64 new_size) __attribute__((nothrow));
Pass-through to libc's malloc / free.
Tbd
Tary is a dynamically allocated resizable array of values. A single block of memory
is used for all elements.
Taking the address of a Tary element is not allowed, although it cannot be prevented
Records allocated with Tary cannot be cross-referenced, this is enforced by amc
.
algo.ByteAry
is defined as Tary of u8. algo.cstring
is defined as Tary of char.
When growing a full Tary (such as from Reserve or Alloc functions),
the size is always at least doubled.
Here is an example of a Tary field:
dmmeta.field field:algo.LineBuf.buf arg:char reftype:Tary dflt:"" comment:""
dmmeta.tary field:algo.LineBuf.buf aliased:Y comment:""
The aliased
attribute of the tary
record specifies whether functions involving
aryptr
will be generated.
When they are, it is possible that the aryptr
being passed to Addary
or Setary
is
a subrange of the array itself. A check is inserted for this condition, and it's a fatal
program error if the check fails. Even though amc
could adjust the incoming pointer
before and after calling Reserve
, the caller still has a bad aryptr
on their hands,
which means there is a program error.
The following functions are generated:
// Reserve space (this may move memory). Insert N element at the end.
// Return aryptr to newly inserted block.
// If the RHS argument aliases the array (refers to the same memory), exit program with fatal error.
algo::aryptr<char> buf_Addary(algo::LineBuf& parent, algo::aryptr<char> rhs) __attribute__((nothrow));
// Reserve space. Insert element at the end
// The new element is initialized to a default value
char& buf_Alloc(algo::LineBuf& parent) __attribute__((__warn_unused_result__, nothrow));
// Reserve space for new element, reallocating the array if necessary
// Insert new element at specified index. Index must be in range or a fatal error occurs.
char& buf_AllocAt(algo::LineBuf& parent, int at) __attribute__((__warn_unused_result__, nothrow));
// Reserve space. Insert N elements at the end of the array, return pointer to array
algo::aryptr<char> buf_AllocN(algo::LineBuf& parent, int n_elems) __attribute__((__warn_unused_result__, nothrow));
// Return true if index is empty
bool buf_EmptyQ(algo::LineBuf& parent) __attribute__((nothrow));
// Look up row by row id. Return NULL if out of range
char* buf_Find(algo::LineBuf& parent, u64 t) __attribute__((__warn_unused_result__, nothrow));
// Return array pointer by value
algo::aryptr<char> buf_Getary(algo::LineBuf& parent) __attribute__((nothrow));
// Return pointer to last element of array, or NULL if array is empty
char* buf_Last(algo::LineBuf& parent) __attribute__((nothrow, pure));
// Return max. number of items in the array
i32 buf_Max(algo::LineBuf& parent) __attribute__((nothrow));
// Return number of items in the array
i32 buf_N(const algo::LineBuf& parent) __attribute__((__warn_unused_result__, nothrow, pure));
// Remove item by index. If index outside of range, do nothing.
void buf_Remove(algo::LineBuf& parent, u32 i) __attribute__((nothrow));
void buf_RemoveAll(algo::LineBuf& parent) __attribute__((nothrow));
// Delete last element of array. Do nothing if array is empty.
void buf_RemoveLast(algo::LineBuf& parent) __attribute__((nothrow));
// Make sure N *more* elements will fit in array. Process dies if out of memory
void buf_Reserve(algo::LineBuf& parent, int n) __attribute__((nothrow));
// Make sure N elements fit in array. Process dies if out of memory
void buf_AbsReserve(algo::LineBuf& parent, int n) __attribute__((nothrow));
// Copy contents of RHS to PARENT.
void buf_Setary(algo::LineBuf& parent, algo::LineBuf &rhs) __attribute__((nothrow));
// Copy specified array into buf, discarding previous contents.
// If the RHS argument aliases the array (refers to the same memory), throw exception.
void buf_Setary(algo::LineBuf& parent, const algo::aryptr<char> &rhs) __attribute__((nothrow));
// 'quick' Access row by row id. No bounds checking.
char& buf_qFind(algo::LineBuf& parent, u64 t) __attribute__((nothrow));
// Return reference to last element of array. No bounds checking
char& buf_qLast(algo::LineBuf& parent) __attribute__((nothrow));
This pool type only supports fixed-size allocation. Free elements area stored in a singly linked list. if the list is empty, tpool uses the base allocator (or the namespace default allocator) to fulfill the request. The free list can be refilled with ReserveMem. The memory obtained by Tpool from the base allocator is never returned. Tpools can only be global (otherwise, memory leaks would occur). This is the fastest allocator, because it only takes a couple of instructions to peel a free element off of the free list.
Here is an example of a Tpool
:
dmmeta.field field:ssim2mysql.FDb.cmd arg:ssim2mysql.FCmd reftype:Tpool dflt:"" comment:""
And here is the generated code:
// Allocate memory for new default row.
// If out of memory, process is killed.
ssim2mysql::FCmd& cmd_Alloc() __attribute__((__warn_unused_result__, nothrow));
// Allocate memory for new element. If out of memory, return NULL.
ssim2mysql::FCmd* cmd_AllocMaybe() __attribute__((__warn_unused_result__, nothrow));
// Remove row from all global and cross indices, then deallocate row
void cmd_Delete(ssim2mysql::FCmd &row) __attribute__((nothrow));
// Allocate space for one element
// If no memory available, return NULL.
void* cmd_AllocMem() __attribute__((__warn_unused_result__, nothrow));
// Remove mem from all global and cross indices, then deallocate mem
void cmd_FreeMem(ssim2mysql::FCmd &row) __attribute__((nothrow));
// Preallocate memory for N more elements
// Return number of elements actually reserved.
u64 cmd_Reserve(u64 n_elems) __attribute__((nothrow));
// Allocate block of given size, break up into small elements and append to free list.
// Return number of elements reserved.
u64 cmd_ReserveMem(u64 size) __attribute__((nothrow));
// Insert row into all appropriate indices. If error occurs, store error
// in algo_lib::_db.errtext and return false. Call Unref or Delete to cleanup partially inserted row.
bool cmd_XrefMaybe(ssim2mysql::FCmd &row);
Val is the simplest reftype. A field of type val becomes a regular struct member
on output.
Here is an example of a Val
field:
dmmeta.ctype ctype:algo.UnTime comment:"unix time * 1e9 + nanoseconds"
dmmeta.field field:algo.UnTime.value arg:i64 reftype:Val dflt:"" comment:""
And here is what amc
produces on output:
// --- algo.UnTime
struct UnTime { // algo.UnTime: unix time * 1e9 + nanoseconds
i64 value; // 0
UnTime();
};
amc
bakes some information about a process into the process itself.
For each each namespace linked into a process, possessing an in-memory database, an algo_lib.FImdb
entry is defined, and added to the algo_lib.FDb.imdb
table. This table is indexed
with algo_lib.FDb.ind_imdb
.
Without loading any ssimfiles, a process can access the imdb
records.
Each record has the following fields:
imdb
: primary key. (e.g.algo_lib
orsample
)InsertStrptrMaybe
: pointer to a function that takes a strptr and inserts it into this namespace's in-memory database.Step
: pointer to a function that executes a single scheduler cycle.MainLoop
: pointer to the main loop function, if definedGetTrace
: pointer to a function that reads all trace variables from the process
In addition, the algo_lib.FDb.imtable
table contains the list of all in-memory tables.
Imtable
is also indexed, with ind_imtable
, and has the followin fields (amc -e algo.Imtable
):
imtable
: primary keyelem_type
: A string holding the ctype pkey of the records in this tablec_RowidFind
: Optional pointer to a function to find an element by rowidXrefX
: Optional function to x-reference the recordNItems
: Optional function that returns the number of records in the tablePrint
: Optional function to print one element as a stringsize
: With of each recordssimfile
: Pkey of ssimfile, if one is associated with the record.
Reflection is generally considered a very powerful mechanism, but in the OpenACR world it's not used that frequently. While the ability to insert records dynamically (i.e. outside of a pre-declared code-path) and knowing the list of namespace linked into a given program is useful, it's not as useful as simply loading the ssimtables that describe all processes in the given universe, and doing something with them. An analogy would be surgery on yourself, especially brain surgery. Powerful? Yes. Best practice? Hardly.
The single parameter to amc_vis
is a ctype regex,
and whatever ctypes are matched by the regex will be shown
via ASCII art.
Here is an example:
$ amc_vis amc.FCtype\|amc.FField
/ amc.FCtype
|
| / amc.FField
|Ptrary c_field---->|
|Ptr c_pkeyfield--->|
|Ptrary c_datafld-->|
|Llist zd_inst----->|
|<------------------|Upptr p_arg
|Ptr c_optfld------>|
|<------------------|Upptr p_ctype
|Llist zd_access--->|
|Ptr c_varlenfld--->|
- |
|
-
amc_vis can also output an dot file, which can then be viewed in a browser:
$ amc_vis -xref:N amc.FCtype\|amc.FField -dot xyz.dot
amc_vis.dot out_dot:xyz.dot out_svg:xyz.svg
$ firefox xyz.svg