Automating Object Data Compaction to Expanded Numeric Types
Introduction
A common memory saving technique in programming is to represent the expanded attributes (or fields) of an object as a single numeric type using bit-masks and bit-shifting. Compacted fields enable faster sorting as well as facilitate saving data to a binary file.
This method is error prone and tedious to code. This article presents the idea of using reflection mechanisms to automate the bit-twiddling code necessary to pack and unpack the attribute into a single numeric type.
Manual Compaction
The manual method of compaction is best illustrated by the DATE class from $ISE_LIBRARY/time/time.ecf
which most Eiffel coders should be familiar with. As you can see in the listing, this method necessitates the introduction of 2 extra constants per attribute to specify the bit-mask and bit-shift. Even using a programming calculator, calculating the correct masks and shifts is error prone. You then have to write a lot of bit-twiddling code to convert to and from the compacted object. On top of that the fields may not necessarily be of the same type as the compacted number, requiring additional type conversion operations.
DATE inherits DATE_VALUE
class DATE_VALUE
feature -- Access
day: INTEGER
-- Day of the current object
do
Result := ordered_compact_date & day_mask
end
month: INTEGER
-- Month of the current object
do
Result := (ordered_compact_date & month_mask) |>> month_shift
end
year: INTEGER
-- Year of the current object
do
Result := ((ordered_compact_date & year_mask) |>> year_shift) & 0x0000FFFF
end
ordered_compact_date: INTEGER
-- Year, month, day coded for fast comparison between dates.
feature -- Element change
set_date (y, m, d: INTEGER)
-- Set `year' with `y', `month' with `m' and `day' with `d'.
local
l_date: like ordered_compact_date
do
-- Same as
l_date := l_date & day_mask.bit_not
l_date := l_date | d
l_date := l_date & month_mask.bit_not
l_date := l_date | (m |<< month_shift)
l_date := l_date & year_mask.bit_not
l_date := l_date | (y |<< year_shift)
ordered_compact_date := l_date
end
feature {NONE} -- Implementation
day_mask: INTEGER = 0x000000FF
month_mask: INTEGER = 0x0000FF00
year_mask: INTEGER = 0xFFFF0000
-- Mask used to extract/set `year', `month' and `day'.
year_shift: INTEGER = 16
month_shift: INTEGER = 8
day_shift: INTEGER = 0
-- Shift needed to extract/set `year', `month' and `day'.
end -- class DATE_VALUE
Automated Compaction
To save development time, what is needed is a way of mapping any type of expanded field by name to bit-masks specified as a 1 based range. The range 1 .. 8 for example, can be mapped to either a NATURAL_8 type or CHARACTER_8. The range 9 .. 9 can be mapped to type BOOLEAN. After the map is specified we should not have to write any bit-twiddling code at all.
The test class illustrates an automated technique using reflection mechanisms and design-by-contract to spot potential errors.
In this example, the Eiffel-Loop implementation of EL_REFLECTED_FIELD_BIT_MASKS is used in conjunction with COMPACTABLE_DATE to compact 3 INTEGER fields into one NATURAL_64. However the types can be any the 14 expanded types found in the ELKS library.
As you can see in the example the code for compact_value and make_by_compact comes for free.
class
COMPACTABLE_DATE
inherit
EL_COMPACTABLE_REFLECTIVE
rename
compact_integer_32 as compact_date,
make_from_integer_32 as make_from_compact_date,
set_from_integer_32 as set_from_compact_date
end
create
make, make_from_compact_date
feature {NONE} -- Initialisation
make (a_year, a_month, a_day: INTEGER)
do
year := a_year; month := a_month; day := a_day
end
feature -- Access
day: INTEGER
-- Day of the current object
month: INTEGER
-- Month of the current object
year: INTEGER
-- Year of the current object
feature {NONE} -- Constants
Field_masks: EL_REFLECTED_FIELD_BIT_MASKS
once
create Result.make (Current, "[
day := 1 .. 8
month := 9 .. 16
year := 17 .. 32
]")
end
end -- class COMPACTABLE_DATE
Supporting Contracts
The classes EL_REFLECTED_FIELD_BIT_MASKS and COMPACTABLE_DATE contain the following design contracts to ensure that the bit-compaction and conversion is error-free.
- Require that bit-formatting text is correctly indented and the field names are valid
- Require that all expanded fields have been included in the formatting text.
- Require that ranges are properly formatted as either a .. b or a, where a and b are natural numbers.
- Require that any compacted NATURAL_64 value being converted to another NUMERIC type is less than or equal to 2^n - 1 where n is the upper-most mask-bit position (upper_bit).
- Require that any compact value used to initialise a compact-able object and conforming to NUMERIC must be less than or equal to 2^n - 1 where n is the upper-most mask-bit position (upper_bit).
- Ensure the generated bit-masks do not not overlap
- Ensure bit-ranges fit into a NATURAL_64 type
- Check that each bit-mask is wide enough to accommodate the field value currently being assigned.
A Real World Example
The class EL_HACKER_INTERCEPT_SERVLET is used to intercept hacking attempts into the websites myching.software and matryoshka.software. The servlet maintains a hash table of IP addresses mapped to a compact NATURAL_64 representation of class EL_FIREWALL_STATUS. This class has 3 attributes of types: INTEGER, NATURAL_16 and BOOLEAN. Any time you want to work with a table item, you must first initialise the shared constant of type EL_FIREWALL_STATUS. If you make any changes to the status you must then update the table with the new compacted value. This saves on garbage collection.
If the request url matches a configured pattern, the firewall is instructed to block packets from that IP address for a number of days.
(For the sake of brevity, many class features have been omitted in this source listing)
class
EL_HACKER_INTERCEPT_SERVLET
inherit
FCGI_HTTP_SERVLET
feature -- Access
firewall_status_table: HASH_TABLE [NATURAL_64, NATURAL]
-- map IP number to compact form of `EL_FIREWALL_STATUS' data
feature {NONE} -- Constants
Firewall_status: EL_FIREWALL_STATUS
once
create Result
end
end -- class EL_HACKER_INTERCEPT_SERVLET
class
EL_FIREWALL_STATUS
inherit
EL_COMPACTABLE_REFLECTIVE
rename
compact_natural_64 as compact_status,
make_from_natural_64 as make_from_compact,
set_from_natural_64 as set_from_compact
end
EL_SHARED_SERVICE_PORT
create
default_create, make_from_compact
feature -- Access
blocked_ports: EL_ARRAYED_LIST [NATURAL_16]
-- list of blocked ports
do
create Result.make (3)
if http_blocked then
Result.extend (Service_port.HTTP)
end
if smtp_blocked then
Result.extend (Service_port.SMTP)
end
if ssh_blocked then
Result.extend (Service_port.SSH)
end
end
feature -- Measurement
compact_date: INTEGER
-- date of intercept
feature -- Status query
http_blocked: BOOLEAN
smtp_blocked: BOOLEAN
ssh_blocked: BOOLEAN
feature {NONE} -- Constants
Field_masks: EL_REFLECTED_FIELD_BIT_MASKS
once
create Result.make (Current, "[
compact_date := 1 .. 32
http_blocked := 33
smtp_blocked := 34
ssh_blocked := 35
]")
end
end -- class EL_FIREWALL_STATUS
Benchmark Comparison
However the convenience of automated compaction does come at the cost of performance, with manual compaction substantially outperforming automated compaction. The benchmark test uses a round-trip of initialising by component, then initialising by compacted value, and finally comparing the attributes to original inputs.
COMPACTABLE_DATE versus DATE_VALUE
Passes over 1000 millisecs (in descending order)
Manual compaction : 6187.0 times (100%)
Automated compaction : 1535.0 times (-75.2%)
Round-trip Test Code
feature {NONE} -- Implementation
automated_date_compaction
local
date: COMPACTABLE_DATE; i, y, m, d: INTEGER
compact_date: NATURAL_64
do
y := 2023; m := 10; d := 25
from i := 1 until i > 1000 loop
create date.make (y, m, d)
compact_date := date.compact_date
create date.make_from_compact_date (compact_date)
if date.day /= d or date.month /= m or date.year /= y then
lio.put_line ("Conversion failed")
i := 10_000
else
i := i + 1
end
end
end