Three Memory Conservation Techniques

by Finnian Reilly (modified: 2022 Jun 23)


Memory is cheap these days and programmers rarely have to worry about conserving it, although supply-chain disruptions in the 2020s have made RAM more expensive. There are still circumstances where conserving memory is important. A good example is an Eiffel object representing the JSON data returned by the query: <IP-address>/json

    {
        "ip": "",
        "version": "IPv4",
        "city": "Kensington",
        "region": "England",
        "region_code": "ENG",
        "country": "GB",
        "country_name": "United Kingdom",
        "country_code": "GB",
        "country_code_iso3": "GBR",
        "country_capital": "London",
        "country_tld": ".uk",
        "continent_code": "EU",
        "in_eu": false,
        "postal": "SW7",
        "latitude": 51.4957,
        "longitude": -0.1772,
        "timezone": "Europe/London",
        "utc_offset": "+0100",
        "country_calling_code": "+44",
        "currency": "GBP",
        "currency_name": "Pound",
        "languages": "en-GB,cy-GB,gd",
        "country_area": 244820.0,
        "country_population": 66488991,
        "asn": "AS8560",
        "org": "IONOS SE"
    }

An Eiffel object representing this data can easily have a memory footprint many times larger than the JSON string. It is easy to imagine an application that needs to store this information for tens if not hundreds of thousands of IP addresses. In that case it would be a good idea to implement some memory conserving techniques.

In this article I will present three techniques with reference to an example from Eiffel-Loop, class EL_IP_ADDRESS_GEOGRAPHIC_INFO which inherits EL_IP_ADDRESS_GEOLOCATION. These classes use reflection classes from Eiffel-Loop to automate initialisation from JSON data.

NATURAL-X Compressed Codes

This technique can be applied when you know in advance that certain fields will always have a fixed length that fits into one of the NATURAL-X types. The representations used in the example below are 16, 32 and 64 bits wide.

Eiffel-Loop has two classes which facilitate this:

  1. EL_CODE_STRING which automatically converts any NATURAL-X type to a string
  2. EL_CODE_REPRESENTATION (and its 3 implementations), which converts a STRING_8 code to a NATURAL-X type.

A 6 character code can be represented in compressed form by a NATURAL_64, with room to spare for 2 more bytes. The equivalent EL_CODE_STRING occupies 96 bytes, and you still need an 8 byte pointer to reference it, so the total is 104 bytes. A NATURAL_64 field, by contrast, is expanded, so it adds only 8 bytes to the size of an object. The string form therefore requires exactly 13 times as much memory (104 / 8) as the compressed form, a saving of an order of magnitude.
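The packing idea can be sketched with nothing but EiffelBase. The class below is a hypothetical illustration, not Eiffel-Loop code: the name CODE_PACKER, the one-byte-per-character layout and the byte order are all assumptions, and the article does not show how EL_CODE_REPRESENTATION actually encodes its codes.

```eiffel
class
	CODE_PACKER
		-- Hypothetical helper for illustration only, not part of Eiffel-Loop

feature -- Conversion

	packed (code: STRING_8): NATURAL_64
			-- Characters of `code' packed one per byte,
			-- with the last character in the lowest byte
		require
			fits_in_8_bytes: code.count <= 8
		local
			i: INTEGER
		do
			from i := 1 until i > code.count loop
					-- Make room for one more byte, then add the character code
				Result := (Result |<< 8) | (code.item (i).natural_32_code.to_natural_64)
				i := i + 1
			end
		end

	unpacked (value: NATURAL_64): STRING_8
			-- Inverse of `packed' for codes that contain no NUL characters
		local
			n: NATURAL_64
		do
			create Result.make (8)
			from n := value until n = 0 loop
					-- The lowest byte holds the last remaining character
				Result.prepend_character ((n & {NATURAL_64} 0xFF).to_character_8)
				n := n |>> 8
			end
		end

end
```

Under these assumptions a code such as "AS8560" occupies the low 6 bytes of the result and leaves the top 2 bytes zero, which is the spare room mentioned above.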

Representation Map

A code excerpt from class EL_IP_ADDRESS_GEOGRAPHIC_INFO showing a representation map and illustrating the use of class EL_CODE_STRING.

    feature -- 8 byte code-strings

        asn_: EL_CODE_STRING
                -- autonomous system number
                -- Characters sequence 'AS' followed by a number that can be up-to 32 bit long (i.e. AS4294967295)
            do
                Result := asn
            end

        postal_: EL_CODE_STRING
                -- postal code / zip code
            do
                Result := postal
            end

        utc_offset_: EL_CODE_STRING
                -- UTC offset as +HHMM or -HHMM (HH is hours, MM is minutes)
            do
                Result := utc_offset
            end

    feature -- API 8 byte compressed codes

        asn: NATURAL_64
            -- autonomous system number
            -- Characters sequence 'AS' followed by a number that can be up-to 32 bit long (i.e. AS4294967295)

        postal: NATURAL_64
            -- postal code / zip code

        utc_offset: NATURAL_64
            -- UTC offset as +HHMM or -HHMM (HH is hours, MM is minutes)

    feature {NONE} -- Implementation

        new_representations: like Default_representations
            do
                Result := Precursor +
                    ["ip", Ip_address_representation] +

                    ["asn", Code_64_representation] +
                    ["utc_offset", Code_64_representation] +
                    ["postal", Code_64_representation] +

                    ["continent_code", Code_16_representation] +
                    ["country", Code_16_representation] +
                    ["country_code", Code_16_representation] +

                    ["country_code_iso3", Code_32_representation] +
                    ["country_tld", Code_32_representation] +
                    ["currency", Code_32_representation] +
                    ["region_code", Code_32_representation] +
                    ["version", Code_32_representation]
            end

Mapping Fields to Hash-sets

Some fields, for example country_name, will obviously be repeated across many instances of EL_IP_ADDRESS_GEOLOCATION. These can be mapped to a hash-set, which eliminates duplicate copies of the same string.

    feature {NONE} -- Implementation

        new_representations: like Default_representations
            do
                create Result.make (<<
                    ["country_name", Country_representation]
                >>)
            end

    feature {NONE} -- Constants

        Country_representation: EL_HASH_SET_REPRESENTATION [ZSTRING]
            once
                create Result.make_default
            end
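The effect of mapping a field to a hash-set can be illustrated with a small string pool built on the EiffelBase HASH_TABLE. This is only a sketch of the general idea: STRING_POOL and its features are hypothetical names, and the internals of EL_HASH_SET_REPRESENTATION are not shown in this article and may well differ.

```eiffel
class
	STRING_POOL
		-- Hypothetical helper for illustration only, not part of Eiffel-Loop

create
	make

feature {NONE} -- Initialization

	make
		do
			create pool.make (50)
		end

feature -- Access

	shared (name: STRING_8): STRING_8
			-- The single pooled instance that is equal to `name'
			-- Clients must treat the result as read-only
		do
			if attached pool.item (name) as existing then
				Result := existing
			else
					-- First sighting: pool a private copy so that later
					-- changes to `name' cannot corrupt the table
				create Result.make_from_string (name)
				pool.put (Result, Result)
			end
		end

	count: INTEGER
			-- Number of distinct strings pooled so far
		do
			Result := pool.count
		end

feature {NONE} -- Implementation

	pool: HASH_TABLE [STRING_8, STRING_8]
			-- Each distinct string mapped to itself

end
```

If every object stores the result of `shared' rather than its own copy, a country name that occurs in thousands of records exists only once in memory, and each record pays only for a reference to it.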

Compressed Object Tables

The third and final technique is to write the object to a memory buffer, which can be stored in a hash table as a MANAGED_POINTER. Every time the client calls has_key, the object is reconstructed into a `found_item' attribute. This way the serialization/deserialization is hidden from the client.

Eiffel-Loop has a class EL_COMPRESSION_TABLE which implements this idea. It is defined as follows:

    class
        EL_COMPRESSION_TABLE [G -> EL_STORABLE create make_default end, K -> HASHABLE]

    inherit
        HASH_TABLE [MANAGED_POINTER, K]
            rename
                at as at_key,
                found_item as found_buffer,
                item as item_buffer,
                force as force_buffer,
                put as put_buffer
            export
                {NONE} all
                {ANY} has, current_keys
            redefine
                has_key, make
            end
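The buffer mechanics behind such a table can be sketched using only EiffelBase. The class below is a hypothetical, much reduced illustration: it stores just two REAL_64 values per key, performs no compression, and does not use EL_STORABLE. The class and feature names are assumptions and are not taken from Eiffel-Loop.

```eiffel
class
	COORDINATE_BUFFER_TABLE
		-- Hypothetical sketch: (latitude, longitude) pairs kept as 16 byte buffers

create
	make

feature {NONE} -- Initialization

	make (n: INTEGER)
		do
			create buffers.make (n)
		end

feature -- Access

	found_latitude, found_longitude: REAL_64
			-- Values rebuilt by the last successful call to `has_key'

feature -- Status query

	has_key (ip: NATURAL_32): BOOLEAN
			-- Is there an entry for `ip'?
			-- If so, rebuild the found values from its buffer
		do
			if attached buffers.item (ip) as buffer then
				found_latitude := buffer.read_real_64 (0)
				found_longitude := buffer.read_real_64 (8)
				Result := True
			end
		end

feature -- Element change

	put (latitude, longitude: REAL_64; ip: NATURAL_32)
			-- Serialize the pair into a buffer stored under `ip'
		local
			buffer: MANAGED_POINTER
		do
			create buffer.make (16)
			buffer.put_real_64 (latitude, 0)
			buffer.put_real_64 (longitude, 8)
			buffers.force (buffer, ip)
		end

feature {NONE} -- Implementation

	buffers: HASH_TABLE [MANAGED_POINTER, NATURAL_32]
			-- Serialized entries by IP address

end
```

The client only ever sees `put', `has_key' and the found values, so the buffer format stays an implementation detail that can be changed later, for example to add real compression.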

Compression Table Example

In this example the compression ratio achieved is 24%, i.e. the compressed item occupies 24% of the deep physical size of the original object.

    test_compressed_table
        local
            geo_info_table: EL_COMPRESSION_TABLE [EL_IP_ADDRESS_GEOGRAPHIC_INFO, NATURAL]
            geo_info: EL_IP_ADDRESS_GEOGRAPHIC_INFO; compression_ratio: DOUBLE
        do
            create geo_info.make_from_json (JSON_eiffel_loop_ip)
            create geo_info_table.make (11)

            geo_info_table.put (geo_info, geo_info.ip)
            assert ("same object", geo_info = geo_info_table.found_item)

            compression_ratio := geo_info_table.size_compressed_item / geo_info.deep_physical_size
            lio.put_integer_field ("Compression ratio", (compression_ratio * 100).rounded); lio.put_string ("%%")
            lio.put_new_line

            geo_info_table.put (geo_info, geo_info.ip)
            assert ("same value", geo_info /= geo_info_table.found_item and geo_info ~ geo_info_table.found_item)
            assert ("same value", geo_info ~ geo_info_table.found_item)

            if geo_info_table.has_key (geo_info.ip) then
                assert ("same value", geo_info ~ geo_info_table.found_item)
            else
                assert ("table has geo_info", False)
            end

            geo_info_table.put (geo_info, geo_info.ip)
            assert ("same value", geo_info ~ geo_info_table.found_item)
        end

Economic Savings

Many hosted cloud services have graduated packages with RAM allocation being a factor in the price. Obviously if you can make your application use less memory, you can select a cheaper package.