## Overview

MicroEJ Linker is a standard linker that complies with the Executable and Linkable Format (ELF).

MicroEJ Linker takes one or several relocatable binary files and generates an image representation using a description file. The process of extracting binary code, positioning blocks and resolving symbols is called linking.

Relocatable object files are generated by SOAR and third-party compilers. An archive file is a container of relocatable object files.

The description file is called a Linker Specific Configuration file (lsc). It describes what shall be embedded and how the embedded items shall be organized in the program image. The linker outputs:

• An ELF executable file that contains the image and potential debug sections. This file can be used directly by debuggers or programming tools. It may also be converted into another format (Intel* hex, Motorola* s19, rawBinary, etc.) using external tools, such as the standard GNU binutils toolchain (objcopy, objdump, etc.).
• A map file, in XML format, which can be viewed as a database of what has been embedded and resolved by the linker. It can be easily processed to extract sorted sizes, call graphs, statistics, etc.

The linker is composed of one or more library loaders, according to the platform’s configuration.

## ELF Overview

An ELF relocatable file is split into several sections:

• allocation sections representing a part of the program
• control sections describing the binary sections (relocation sections, symbol tables, debug sections, etc.)

An allocation section can hold some image binary bytes (assembler instructions and raw data) or can refer to an interval of memory which makes sense only at runtime (statics, main stack, heap, etc.). An allocation section is an atomic block and cannot be split. A section has a name that, by convention, represents the kind of data it holds. For example, .text sections hold binary instructions, .bss sections hold uninitialized read-write static data, .rodata sections hold read-only data, and .data sections hold initialized read-write static data. The name is used in the .lsc file to organize sections.

A symbol is an entity made of a name and a value. A symbol may be absolute (a link-time constant) or relative to a section, in which case its value is unknown until MicroEJ Linker has assigned a definitive position to the target section. A symbol can be local to the relocatable file or global to the system. All global symbol names should be unique in the system (the name is the key that connects an unresolved symbol reference to a symbol definition). A section may need the value of symbols to be fully resolved: the address of a called function, the address of a static variable, etc.

The linking process can be divided into three main steps:

1. Symbols and sections resolution. Starting from root symbols and root sections, the linker embeds all sections targeted by symbols and all symbols referred to by sections. The process repeats transitively as long as new symbols or sections are found. At the end of this step, the linker may stop and output errors (unresolved symbols, duplicate symbols, unknown or bad input libraries, etc.).
2. Memory positioning. Sections are laid out in memory ranges according to memory layout constraints described by the lsc file. Relocations are performed (in other words, symbol values are resolved and section contents are modified). At the end of this step, the linker may stop and output errors (it could not resolve constraints, such as not enough memory, etc.)
3. An output ELF executable file and map file are generated.
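
The transitive nature of step 1 can be illustrated with a small worklist traversal. This is only a sketch of the idea, not the linker's implementation: the `targets` and `references` dictionaries and all symbol and section names are hypothetical, and absolute symbols (which target no section) are not modeled.

```python
from collections import deque


def collect_embedded(root_symbols, targets, references):
    """Return the (symbols, sections) reachable from the root symbols.

    targets:    symbol name  -> name of the section that defines it
    references: section name -> names of the symbols the section refers to
    Both mappings are a hypothetical model of the loaded input units.
    """
    symbols, sections = set(), set()
    pending = deque(root_symbols)
    while pending:
        symbol = pending.popleft()
        if symbol in symbols:
            continue
        if symbol not in targets:
            raise LookupError(f"unresolved symbol: {symbol}")
        symbols.add(symbol)
        section = targets[symbol]
        if section not in sections:
            sections.add(section)
            # Newly embedded sections may bring new symbols to resolve.
            pending.extend(references.get(section, ()))
    return symbols, sections


targets = {
    "main": ".text.main",
    "helper": ".text.helper",
    "counter": ".data.counter",
    "unused": ".text.unused",
}
references = {
    ".text.main": ["helper", "counter"],
    ".text.helper": ["counter"],
}
symbols, sections = collect_embedded(["main"], targets, references)
print(sorted(sections))  # ['.data.counter', '.text.helper', '.text.main']
```

In this model, `.text.unused` is never embedded because no embedded section refers to the `unused` symbol.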

A partial map file may be generated at the end of step 2. It provides useful information to understand why the link phase failed.

Symbol resolution is the process of connecting a global symbol name to its definition, found in one of the linker input units. The order in which the units are passed to the linker may have an impact on symbol resolution. The rules are:

• Relocatable object files are loaded without order. Two global symbols defined with the same name result in an unrecoverable linker error.
• Archive files are loaded on demand. When a global symbol must be resolved, the linker inspects each archive unit in the order it was passed to the linker. When an archive contains a relocatable object file that declares the symbol, the object file is extracted and loaded. Then the first rule is applied. It is recommended that you group object files in archives as much as possible, in order to improve load performance. Moreover, archive files are the only way to link with relocatable object files that share the same symbol definitions.
• A symbol name is resolved to a weak symbol if, and only if, no global symbol is found with the same name.
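
The three rules can be modeled in a few lines. The sketch below is an illustration under simplifying assumptions, not the linker's actual algorithm: units are reduced to sets of global symbol names, and the unit names (`main.o`, `liba.a`, `defaults.o`, etc.) are hypothetical.

```python
class LinkError(Exception):
    """Raised for unrecoverable resolution errors in this sketch."""


def load_unit(unit, global_symbols, defined):
    """Rule 1: loading a unit that redefines a global symbol is fatal."""
    for name in global_symbols:
        if name in defined:
            raise LinkError(f"{name!r} defined in both {defined[name]} and {unit}")
        defined[name] = unit


def resolve(name, defined, archives, weak):
    """Return the unit that provides `name`."""
    if name not in defined:
        # Rule 2: inspect archives in the order they were passed to the linker.
        for archive, members in archives:
            member = next((m for m, syms in members.items() if name in syms), None)
            if member is not None:
                # Extract the member and load it like any other object file.
                load_unit(f"{archive}({member})", members[member], defined)
                break
    if name in defined:
        return defined[name]
    # Rule 3: a weak definition is used only when no global one exists.
    if name in weak:
        return weak[name]
    raise LinkError(f"unresolved symbol {name!r}")


defined = {}
load_unit("main.o", {"main"}, defined)
# Both archives provide "print"; only the first one is ever extracted.
archives = [("liba.a", {"io.o": {"print"}}), ("libb.a", {"io.o": {"print"}})]
weak = {"print": "defaults.o", "hook": "defaults.o"}

print(resolve("print", defined, archives, weak))  # liba.a(io.o)
print(resolve("hook", defined, archives, weak))   # defaults.o
```

The two archives define the same symbol without conflict because the second member is never loaded, which is the situation the second rule describes.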

## Linker Specific Configuration File Specification

### Description

A Linker Specific Configuration (lsc) file contains directives to link input library units. An lsc file is written in an XML dialect, and its contents can be divided into two principal categories:

• Symbols and sections definitions.
• Memory layout definitions.

Example of a Linker Specific Configuration File
<?xml version="1.0" encoding="UTF-8"?>
<!--
  An example of linker specific configuration file
-->
<lsc name="MyAppInFlash">
  <include name="subfile.lscf"/>
  <!--
    Define symbols with arithmetical and logical expressions
  -->
  <defSymbol name="FlashStart" value="0"/>
  <defSymbol name="FlashSize"  value="0x10000"/>
  <defSymbol name="FlashEnd"   value="FlashStart+FlashSize-1"/>
  <!--
    Define FLASH memory interval
  -->
  <defSection name="FLASH" start="FlashStart" size="FlashSize"/>

  <!--
    Some memory layout directives
  -->
  <memoryLayout ranges="FLASH">
    <sectionRef name="*.text"/>
    <sectionRef name="*.data"/>
  </memoryLayout>
</lsc>


### File Fragments

An lsc file can be physically divided into multiple lsc files, which are called lsc fragments. Lsc fragments may be loaded directly from the linker path option, or indirectly using the include tag in an lsc file.

Lsc fragments start with the root tag lscFragment. By convention the lsc fragments file extension is .lscf. From here to the end of the document, the expression “the lsc file” denotes the result of the union of all loaded (directly and indirectly loaded) lsc fragments files.

### Symbols and Sections

A new symbol is defined using the defSymbol tag. A symbol has a name and an expression value. All symbols defined in the lsc file are global symbols.

A new section is defined using the defSection tag. A section may be used to define a memory interval, or define a chunk of the final image with the description of the contents of the section.

### Memory Layout

A memory layout contains an ordered set of statements describing what shall be embedded. Memory positioning can be viewed as moving a cursor through intervals, appending referenced sections in the order they appear. A symbol can be defined as a “floating” item: its value is the value of the cursor when the symbol definition is encountered.

In the example below, the memory layout sets the FLASH section. First, all sections named .text are embedded. The matching sections are appended in an undefined order. To reference a specific section, the section shall have a unique name (for example, a reset vector is commonly called .reset or .vector). Then, the floating symbol dataStart is set to the absolute address of the virtual cursor right after the embedded .text sections. Finally, all sections named .data are embedded.

A memory layout can be relocated to a memory interval. The positioning works in parallel with the layout ranges, as if there were two cursors. The address of the section (used to resolve symbols) is the address in the relocated interval. Floating symbols can refer either to the layout cursor (by default), or to the relocated cursor, using the relocation attribute.

A relocation layout is typically used to embed data in a program image that will be used at runtime in a read-write memory. Assuming the program image is programmed in a read-only memory, one of the first jobs at runtime, before starting the main program, is to copy the data from read-only memory to RAM, because the symbols targeting the data have been resolved with the address of the sections in the relocated space. To perform the copy, the program needs both the start address in FLASH where the data has been put, and the start address in RAM where the data shall be copied.

Example of Relocation of Runtime Data from FLASH to RAM
<memoryLayout ranges="FLASH" relocation="RAM" image="true">
  <defSymbol name="DataFlashStart" value="."/>
  <defSymbol name="DataRamStart" value="." relocation="true"/>
  <sectionRef name=".data"/>
  <defSymbol name="DataFlashLimit" value="."/>
</memoryLayout>


Note

The symbol DataRamStart is defined as the start address in RAM where the .data sections will be inserted.
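
The two-cursor model and the startup copy it enables can be simulated as follows. This is a sketch under stated assumptions: the addresses (0x40 in FLASH, 0x100 in RAM), the section sizes, and the flat `bytearray` standing in for the whole address space are hypothetical, and alignment is ignored.

```python
def layout_with_relocation(flash_cursor, ram_cursor, section_sizes):
    """Advance the layout cursor and the relocation cursor in parallel."""
    symbols = {
        "DataFlashStart": flash_cursor,  # floating symbol, layout cursor
        "DataRamStart": ram_cursor,      # floating symbol, relocation="true"
    }
    section_addresses = []
    for size in section_sizes:
        # Symbols targeting the section resolve to its relocated (RAM) address.
        section_addresses.append(ram_cursor)
        flash_cursor += size
        ram_cursor += size
    symbols["DataFlashLimit"] = flash_cursor
    return symbols, section_addresses


def copy_data_to_ram(memory, symbols):
    """Startup job: copy the image bytes from FLASH to their RAM location."""
    length = symbols["DataFlashLimit"] - symbols["DataFlashStart"]
    source = symbols["DataFlashStart"]
    target = symbols["DataRamStart"]
    memory[target:target + length] = memory[source:source + length]


symbols, addresses = layout_with_relocation(0x40, 0x100, [8, 4])
memory = bytearray(0x200)                # toy address space
memory[0x40:0x4C] = bytes(range(1, 13))  # .data contents stored in FLASH
copy_data_to_ram(memory, symbols)
print([hex(a) for a in addresses])       # ['0x100', '0x108']
```

The copy length is computed from the two FLASH symbols only, which is why the example layout defines DataFlashLimit after the sectionRef.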

### Tags Specification

Here is the complete syntactic and semantic description of all available tags of the .lsc file.

| Tag | Attribute | Description |
| --- | --- | --- |
| defSection | | Defines a new section. A floating section only holds a declared size attribute. A fixed section declares at least one of the start / end attributes. When this tag is empty, the section is a runtime section, and must define at least one of the start, end or size attributes. When this tag is not empty (when it holds a binary description), the section is an image section. |
| | name | Name of the section. The section name may not be unique. However, it is recommended that you define a unique name if the section must be referred to separately for memory positioning. |
| | start | Optional. Expression defining the absolute start address of the section. Must be resolved to a constant after the full load of the lsc file. |
| | end | Optional. Expression defining the absolute end address of the section. Must be resolved to a constant after the full load of the lsc file. |
| | size | Optional. Expression defining the size in bytes of the section. Invariant: (end-start)+1=size. Must be resolved to a constant after the full load of the lsc file. |
| | align | Optional. Expression defining the alignment in bytes of the section. |
| | rootSection | Optional. Boolean value. Sets this section as a root section to be embedded even if it is not targeted by any embedded symbol. See also the rootSection tag. |
| | symbolPrefix | Optional. Used in collaboration with symbolTags. Prefix of symbols embedded in the auto-generated section. See Auto-generated Sections. |
| | symbolTags | Optional. Used in collaboration with symbolPrefix. Comma-separated list of tags of symbols embedded in the auto-generated section. See Auto-generated Sections. |
| defSymbol | | Defines a new global symbol. The symbol name must be unique in the linker context. |
| | name | Name of the symbol. |
| | type | Optional. Type of symbol usage. This may be necessary to set the type of a symbol when using third-party ELF tools. There are three types: none (default, no special type of use), function (the symbol describes a function), and data (the symbol describes some data). |
| | value | The value "." defines a floating symbol that holds the current cursor position in a memory layout. (This is the only form of this tag that can be used as a memoryLayout directive.) Otherwise value is an expression. A symbol expression must be resolved to a constant after memory positioning. |
| | relocation | Optional. The only allowed value is true. Indicates that the value of the symbol takes the address of the current cursor in the memory layout relocation space. Only allowed on floating symbols. |
| | rootSymbol | Optional. Boolean value. Sets this symbol as a root symbol that must be resolved. See also the rootSymbol tag. |
| | weak | Optional. Boolean value. Sets this symbol as a weak symbol. |
| group | | memoryLayout directive. Defines a named group of sections. The group name may be used in the expression macros START, END, SIZE. All memoryLayout directives are allowed within this tag (recursively). |
| | name | The name of the group. |
| include | | Includes an lsc fragment file, semantically the same as if the fragment contents were defined in place of the include tag. |
| | name | Name of the file to include. When the name is relative, the file separator is /, and the file is relative to the directory where the current lsc file or fragment is loaded. When absolute, the name describes a platform-dependent filename. |
| lsc | | Root tag for an .lsc file. |
| | name | Name of the lsc file. The ELF executable output will be {name}.out, and the map file will be {name}.map. |
| lscFragment | | Root tag for an lsc file fragment. Lsc fragments are loaded from the linker path option, or included from a master file using the include tag. |
| memoryLayout | | Describes the organization of a set of memory intervals. The memory layouts are processed in the order in which they are declared in the file. The same interval may be organized in several layouts. Each layout starts at the cursor value where the previous layout ended. The following tags are allowed within a memoryLayout directive: defSymbol (under certain conditions), group, memoryLayoutRef, padding, and sectionRef. |
| | ranges | Exclusive with name. Comma-separated ordered list of fixed sections to which the layout is applied. Sections represent memory segments. |
| | image | Optional. Boolean value. false if not set. If true, the layout describes a part of the binary image: only image sections can be embedded. If false, only runtime sections can be embedded. |
| | relocation | Optional. Name of the section to which this layout is relocated. |
| | name | Exclusive with ranges. Defines a named memoryLayout directive instead of specifying a concrete memory location. May be included in a parent memoryLayout using memoryLayoutRef. |
| memoryLayoutRef | | memoryLayout directive. Provides an extension-point mechanism to include memoryLayout directives defined outside the current one. |
| | name | All directives of memoryLayout defined with the same name are included in an undefined order. |
| padding | | memoryLayout directive. Appends padding bytes at the current cursor. Either the size or the align attribute should be provided. |
| | size | Optional. Expression must be resolved to a constant after the full load of the lsc file. Increments the cursor position by the given size. |
| | align | Optional. Expression must be resolved to a constant after the full load of the lsc file. Moves the current cursor position to the next address that matches the given alignment. Warning: when used with relocation, the relocation cursor is also aligned. Keep in mind that this may advance the two cursors by different numbers of bytes. |
| | address | Optional. Expression must be resolved to a constant after the full load of the lsc file. Moves the current cursor position to the given absolute address. |
| | fill | Optional. Expression must be resolved to a constant after the full load of the lsc file. Fills the padding with the given value (32 bits). |
| rootSection | | References a section name that must be embedded. This tag is not a definition. It forces the linker to embed all loaded sections matching the given name. |
| | name | Name of the section to be embedded. |
| rootSymbol | | References a symbol that must be resolved. This tag is not a definition. It forces the linker to resolve the value of the symbol. |
| | name | Name of the symbol to be resolved. |
| sectionRef | | Memory layout statement. Embeds all sections matching the given name starting at the current cursor address. |
| | file | Selects only sections defined in a linker unit matching the given file name. The file name is the simple name without any file separator, e.g. bsp.o or mylink.lsc. Link units may be object files within archive units. |
| | name | Name of the sections to embed. When the name ends with *, all sections starting with the given name are embedded (name completion), except sections that are embedded in another sectionRef using the exact name (without completion). |
| | symbol | Optional. Only embeds the section targeted by the given symbol. This is the only way at link level to embed a specific section whose name is not unique. |
| | force | Optional. Deprecated. Replaced by the rootSection tag. The only allowed value is true. By default, for compaction, the linker embeds only what is needed. Setting this attribute forces the linker to embed all sections that appear in all loaded relocatable files, even sections that are not targeted by a symbol. |
| | sort | Optional. Specifies that the sections must be sorted in memory. The value can be: order (the sections will be in the same order as the input files), name (the sections are sorted by their file names), or unit (the sections declared in an object file are grouped and sorted in the order they are declared in the object file). |
| u4 | | Binary section statement. Describes the next four raw bytes of the section. Bytes are organized in the endianness of the target ELF executable. |
| | value | Expression must be resolved to a constant after the full load of the lsc file (32-bit value). |
| fill | | Binary section statement. Fills the section with the given expression. Bytes are organized in the endianness of the target ELF executable. |
| | size | Expression defining the number of bytes to be filled. |
| | value | Expression must be resolved to a constant after the full load of the lsc file (32-bit value). |
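
Two pieces of arithmetic from the descriptions above are worth spelling out: the defSection invariant (end-start)+1=size, and the cursor movement performed by padding with align. The helpers below are a hypothetical sketch, not part of the linker. `complete_section` only covers the case where at least two of the three attributes are known, and the cursor values are made up.

```python
def complete_section(start=None, end=None, size=None):
    """Fill in the missing attribute using (end - start) + 1 == size."""
    known = sum(value is not None for value in (start, end, size))
    if known < 2:
        raise ValueError("this sketch needs two of start, end, size")
    if start is None:
        start = end - size + 1
    elif end is None:
        end = start + size - 1
    elif size is None:
        size = end - start + 1
    if (end - start) + 1 != size:
        raise ValueError("the computed size does not match the declared size")
    return start, end, size


def align_up(cursor, alignment):
    """Next address >= cursor that is a multiple of alignment."""
    return (cursor + alignment - 1) // alignment * alignment


# The FLASH interval of the lsc example: start 0, size 0x10000.
start, end, size = complete_section(start=0, size=0x10000)
print(hex(end))  # 0xffff, the same value as FlashStart+FlashSize-1

# With relocation, both cursors are aligned, possibly by different amounts.
layout_cursor, relocation_cursor = 0x1003, 0x2006
print(align_up(layout_cursor, 8) - layout_cursor)          # 5
print(align_up(relocation_cursor, 8) - relocation_cursor)  # 2
```

The last two lines show why the align warning matters: the same padding directive adds 5 bytes on one cursor and 2 on the other, so the two spaces no longer advance in lockstep.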

### Expressions

An attribute expression is a value resulting from the computation of an arithmetical and logical expression. Supported operators are the same operators supported in the Java language, and follow Java semantics:

• Unary operators: + , - , ~ , !
• Binary operators: + , - , * , / , % , << , >>> , >> , < , > , <= , >= , == , != , &, | , ^ , && , ||
• Ternary operator: cond ? ifTrue : ifFalse
• Built-in macros:
    • START(name): Get the start address of a section or a group of sections
    • END(name): Get the end address of a section or a group of sections
    • SIZE(name): Get the size of a section or a group of sections. Equivalent to END(name)-START(name)
    • TSTAMPH(), TSTAMPL(): Get the 32-bit linker time stamp (high/low part of system time in milliseconds)
    • SUM(name,tag): Get the sum of a column of an auto-generated section (see Auto-generated Sections). The column is specified by its tag name.

An operand is either a subexpression, a constant, or a symbol name. Constants may be written in decimal (127) or hexadecimal form (0x7F). There are no boolean constants: the constant value 0 means false, and any other constant value means true.

Examples of use:

value="symbol+3"
value="(symbol1*4)-(symbol2*3)"


Note: Ternary expressions can be used to define selective linking because they are the only expressions that may remain partially unresolved without generating an error. Example:

<defSymbol name="myFunction" value="condition ? symb1 : symb2"/>


No error will be thrown if the condition is true and symb1 is defined, or the condition is false and symb2 is defined, even if the other symbol is undefined.
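
The reason a ternary expression can stay partially unresolved is that only the selected branch is ever looked up. The following sketch models that behavior; the `resolve_ternary` helper and the symbol table are hypothetical, and operands are restricted to plain symbol names for brevity.

```python
def resolve_ternary(condition, if_true, if_false, symbols):
    """Evaluate `condition ? if_true : if_false` over a symbol table.

    Only the selected branch is looked up, so the other symbol may be
    undefined. The value 0 means false; any other value means true.
    """
    if condition not in symbols:
        raise LookupError(f"unresolved symbol: {condition}")
    selected = if_true if symbols[condition] != 0 else if_false
    if selected not in symbols:
        raise LookupError(f"unresolved symbol: {selected}")
    return symbols[selected]


# symb2 is undefined, but it is never needed while condition is true.
symbols = {"condition": 1, "symb1": 0x8000}
print(hex(resolve_ternary("condition", "symb1", "symb2", symbols)))  # 0x8000
```

Setting `condition` to 0 in the same table would select `symb2` and make the lookup fail, which corresponds to the error case.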

## Auto-generated Sections

The MicroEJ Linker allows you to define sections that are automatically generated with symbol values. This is commonly used to generate tables whose contents depend on the linked symbols. Symbols eligible to be embedded in an auto-generated section are of the form: prefix_tag_suffix. An auto-generated section is viewed as a table composed of lines and columns that organize symbols sharing the same prefix. Symbols that share the same tag appear in the same column. Symbols that share the same suffix appear on the same line. Lines are sorted in the lexical order of the symbol name.

The following definition declares a section that embeds symbols starting with zeroinit. The first column refers to symbols starting with zeroinit_start_; the second column refers to symbols starting with zeroinit_end_.

<defSection
  name=".zeroinit"
  symbolPrefix="zeroinit"
  symbolTags="start,end"
/>


Suppose four symbols are defined: zeroinit_start_xxx, zeroinit_end_xxx, zeroinit_start_yyy and zeroinit_end_yyy. The generated section is of the form:

0x00: zeroinit_start_xxx
0x04: zeroinit_end_xxx
0x08: zeroinit_start_yyy
0x0C: zeroinit_end_yyy


If there are missing symbols to fill a line of an auto-generated section, an error is thrown.
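
The table construction can be sketched as follows. This is an illustration under assumptions rather than the linker's algorithm: the 4-byte entry size is inferred from the offsets in the listing above, lines are ordered by suffix as an approximation of the lexical ordering, and `build_auto_section` is a hypothetical helper.

```python
def build_auto_section(symbol_names, prefix, tags, word_size=4):
    """Return [(offset, symbol name)] for an auto-generated section."""
    lines = {}  # suffix -> {tag: symbol name}
    for name in symbol_names:
        if not name.startswith(prefix + "_"):
            continue
        rest = name[len(prefix) + 1:]
        for tag in tags:
            if rest.startswith(tag + "_"):
                suffix = rest[len(tag) + 1:]
                lines.setdefault(suffix, {})[tag] = name
                break
    table, offset = [], 0
    for suffix in sorted(lines):
        for tag in tags:  # one column per tag, in declaration order
            if tag not in lines[suffix]:
                raise LookupError(f"missing symbol {prefix}_{tag}_{suffix}")
            table.append((offset, lines[suffix][tag]))
            offset += word_size
    return table


names = [
    "zeroinit_end_yyy",
    "zeroinit_start_xxx",
    "other_start_xxx",  # different prefix, ignored
    "zeroinit_end_xxx",
    "zeroinit_start_yyy",
]
for offset, name in build_auto_section(names, "zeroinit", ["start", "end"]):
    print(f"0x{offset:02X}: {name}")
```

Removing `zeroinit_end_yyy` from `names` makes the helper raise, mirroring the error reported when a line cannot be completed.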

## Execution

MicroEJ Linker can be invoked through an ANT task. The task is installed by inserting the following code in an ANT script:

<taskdef
/>


[LINKER_CLASSPATH] is a list of path-separated jar files, including the linker and all architecture-specific library loaders.

The following code shows a linker ANT task invocation and available options.

<linker
  endianness="[little|big|none]"
  generateMapFile="[true|false]"
  ignoreWrongPositioningForEmptySection="[true|false]"
  lsc="[filename]"
  mergeSegmentSections="[true|false]"
  noWarning="[true|false]"
  outputArchitecture="[tag]"
  outputName="[name]"
  stripDebug="[true|false]"
  toDir="[outputDir]"
  verboseLevel="[0...9]"
>
  <!-- ELF object & archives files using ANT paths / filesets -->
  <fileset dir="xxx" includes="*.o"/>
  <fileset file="xxx.a"/>
  <fileset file="xxx.a"/>

  <!-- Properties that will be reported into .map file -->
  <property name="myProp" value="myValue"/>
</linker>

| Option | Description |
| --- | --- |
| doNotLoadAlreadyDefinedSymbol | Silently skip the load of a global symbol if it has already been loaded before (false by default). Only the first loaded symbol is taken into account (in the order input files are declared). This option only affects the load semantics for global symbols, and does not modify the semantics for loading weak symbols and local symbols. |
| endianness | Explicitly declare linker endianness [little, big] or [none] for auto-detection. All input files must declare the same endianness or an error is thrown. |
| generateMapFile | Generate the .map file (true by default). |
| ignoreWrongPositioningForEmptySection | Silently ignore wrong section positioning for zero-size sections (false by default). |
| lsc | Provide a master lsc file. This option is mandatory unless the linkPath option is set. |
| linkPath | Provide a set of directories from which to load link file fragments. Directories are separated with a platform-path separator. This option is mandatory unless the lsc option is set. |
| noWarning | Silently skip the output of warning messages. |
| mergeSegmentSections | (Experimental.) Generate a single section per segment. This may speed up the load of the output executable file into debuggers or flasher tools (false by default). |
| outputArchitecture | Set the architecture tag for the output ELF file (ELF machine id). |
| outputName | Specify the output name of the generated files. By default, take the name provided in the lsc tag. The output ELF executable filename will be name.out. The map filename will be name.map. |
| stripDebug | Remove all debug information from the output ELF file. A stripped output ELF executable holds only the binary image (no remaining symbols, debug sections, etc.). |
| toDir | Specify the output directory in which to store generated files. Output filenames are in the form: output directory + separator + value of the lsc name attribute + suffix. By default, without this option, files are generated in the directory from which the linker was launched. |
| verboseLevel | Print additional messages on the standard output about the linking process. |

## Error Messages

This section lists MicroEJ Linker error messages.

| Message ID | Description |
| --- | --- |
| 0 | The linker has encountered an unexpected internal error. Please contact the support hotline. |
| 1 | A library cannot be loaded with this linker. Try verbose to check installed loaders. |
| 2 | No lsc file provided to the linker. |
| 3 | A file could not be loaded. Check the existence of the file and file access rights. |
| 4 | Conflicting input libraries. A global symbol definition with the same name has already been loaded from a previous object file. |
| 5 | Completion (*) could not be used in association with the force attribute. Must be an exact name. |
| 6 | A required section refers to an unknown global symbol. Maybe input libraries are missing. |
| 7 | A library loader has encountered an unexpected internal error. Check input library file integrity. |
| 8 | Floating symbols can only be declared inside memoryLayout tags. |
| 9 | Invalid value format. For example, the attribute relocation in defSymbol must be a boolean value. |
| 10 | Missing one of the following attributes: address, size, align. |
| 11 | Too many attributes that cannot be used in association. |
| 13 | Negative padding. Memory layout cursor cannot decrease. |
| 15 | Not enough space in the memory layout intervals to append all sections that need to be embedded. Check the output map file to get more information about what is required as memory space. |
| 16 | A block is referenced but has already been embedded. Most likely a block has been especially embedded using the force attribute and the symbol attribute. |
| 17 | A block that must be embedded has no matching sectionRef statement. |
| 19 | An IO error occurred when trying to dump one of the output files. Check the output directory option and file access rights. |
| 20 | size attribute expected. |
| 21 | The computed size does not match the declared size. |
| 22 | Sections defined in the lsc file must be unique. |
| 23 | One of the memory layout intervals refers to an unknown lsc section. |
| 24 | Relocation must be done in one and only one contiguous interval. |
| 25 | force and symbol attributes are not allowed together. |
| 26 | XML char data not allowed at this position in the lsc file. |
| 27 | A section which is a part of the program image must be embedded in an image memory layout. |
| 28 | A section which is not a part of the program image must be embedded in a non-image memory layout. |
| 29 | Expression could not be resolved to a link-time constant. Some symbols are unresolved. |
| 30 | Sections used in memory layout ranges must be sections defined in the lsc file. |
| 31 | Invalid character encountered when scanning the lsc expression. |
| 32 | A recursive include cycle was detected. |
| 33 | An alignment inconsistency was detected in a relocation memory layout. Most likely one of the start addresses of the memory layout is not aligned on the current alignment. |
| 34 | An error occurred during relocation resolution. In general, the relocation has a value that is out of range. |
| 35 | symbol and sort attributes are not allowed together. |
| 36 | Invalid sort attribute value: not one of order, name, or no. |
| 37 | Attribute start or end in defSection tag is not allowed when defining a floating section. |
| 38 | Auto-generated sections can build tables according to symbol names (see Auto-generated Sections). A symbol is needed to build this section but has not been loaded. |
| 39 | Deprecated feature warning. Remains for backward compatibility. It is recommended that you use the new indicated feature, because this feature may be removed in future linker releases. |
| 40 | Unknown output architecture. Either the architecture ID is invalid, or the library loader has not been loaded by the linker. Check loaded library loaders using the verbose option. |
| 41…43 | Reserved. |
| 44 | Duplicate group definition. A group name is unique and cannot be defined twice. |
| 45 | Invalid endianness. The endianness mnemonic is not one of the expected mnemonics (little, big, none). |
| 46 | Multiple endiannesses detected within loaded input libraries. |
| 47 | Reserved. |
| 48 | Invalid type mnemonic passed to a defSymbol tag. Must be one of none, function, or data. |
| 49 | Warning. A directory of link path is invalid (skipped). |
| 50 | No linker-specific description file could be loaded from the link path. Check that the link path directories are valid, and that they contain .lsc or .lscf files. |
| 51 | Exclusive options (these options cannot be used simultaneously). For example, -linkFilename and -linkPath are exclusive; either select a master lsc file or a path from which to load .lscf files. |
| 52 | Name given to a memoryLayoutRef or a memoryLayout is invalid. It must not be empty. |
| 53 | A memoryLayoutRef with the same name has already been processed. |
| 54 | A memoryLayout must define ranges or the name attribute. |
| 55 | No memory layout found matching the name of the current memoryLayoutRef. |
| 56 | A named memoryLayout is declared with a relocation directive, but the relocation interval is incompatible with the relocation interval of the memoryLayout that referenced it. |
| 57 | A named memoryLayout has not been referenced. Every declared memoryLayout must be processed. A named memoryLayout must be referenced by a memoryLayoutRef statement. |
| 58 | SUM operator expects an auto-generated section. |
| 59 | SUM operator tag is unknown for the targeted auto-generated section. |
| 60 | SUM operator auto-generated section name is unknown. |
| 61 | An option is set for an unknown extension. Most likely the extension has not been set to the linker classpath. |
| 62 | Reserved. |
| 63 | ELF unit flags are inconsistent with flags set using the -forceFlags option. |
| 64 | Reserved. |
| 65 | Reserved. |
| 66 | Found an executable object file as input (expected a relocatable object file). |
| 67 | Reserved. |
| 68 | Reserved. |
| 69 | Reserved. |
| 70 | Not enough memory to achieve the linking process. Try to increase the heap of the JVM that is running the linker (e.g. by adding option -Xmx1024M to the JRE command line). |

## Map File Interpretor

The map file interpretor is a tool that allows you to read, classify and display memory information dumped by the linker map file. The map file interpretor is a graph-oriented tool. It supports graphs of symbols and allows standard operations on them (union, intersection, subtract, etc.). It can also dump graphs, compute graph total sizes, list graph paths, etc.

The map file interpretor uses the standard Java regular expression syntax.

It is used internally by the graphical Memory Map Analyzer tool.

Commands:

• createGraph graphName symbolRegExp ... section=regexp


Recursively creates a graph of symbols from root symbols and sections described as regular expressions. For example, to extract the complete graph of the application:

createGraph all section=.*

• createGraphNoRec symbolRegExp ... section=regexp


Similar to createGraph, but embeds only the declared symbols and sections (without recursive connections).

• removeGraph graphName


Removes the graph from memory.

• listGraphs


Lists all the created graphs in memory.

• listSymbols graphName


Lists all graph symbols.

• listPadding


Lists the padding of the application.

• listSections graphName


Lists all sections targeted by all symbols of the graph.

• inter graphResult g1 ... gn


Creates a graph which is the intersection of g1/\ ... /\gn.

• union graphResult g1 ... gn


Creates a graph which is the union of g1\/ ...\/ gn.

• substract graphResult g1 ... gn


Creates a graph which is the subtraction (set difference) g1 \ ... \ gn.

• reportConnections graphName


Prints the graph connections.

• totalImageSize graphName


Prints the image size of the graph.

• totalDynamicSize graphName


Prints the dynamic size of the graph.

• accessPath symbolName


Prints one of the paths from a root symbol to this symbol. This is very useful in helping you understand why a symbol is embedded.

• echo arguments


Prints raw text.

• exec commandFile


Executes the given commandFile. The path may be absolute or relative to the current command file.
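
The inter, union and substract commands behave like set operations folded over their arguments. The sketch below models graphs as plain Python sets of symbol names to show those semantics. It is not the interpretor's implementation, and the symbol names and sizes are hypothetical.

```python
from functools import reduce


def inter(*graphs):
    """g1 /\\ ... /\\ gn: symbols present in every graph."""
    return reduce(set.intersection, graphs)


def union(*graphs):
    """g1 \\/ ... \\/ gn: symbols present in at least one graph."""
    return reduce(set.union, graphs)


def subtract(*graphs):
    """g1 \\ ... \\ gn: symbols of g1 absent from all other graphs."""
    return reduce(set.difference, graphs)


def total_size(graph, sizes):
    """Sum of the sizes of the symbols of a graph."""
    return sum(sizes[symbol] for symbol in graph)


app = {"main", "helper", "counter"}
lib = {"helper", "printf"}
sizes = {"main": 120, "helper": 40, "counter": 4, "printf": 300}

print(sorted(inter(app, lib)))               # ['helper']
print(sorted(subtract(app, lib)))            # ['counter', 'main']
print(total_size(subtract(app, lib), sizes))  # 124
```

Combining a subtraction with a size computation, as in the last line, is a typical way to measure what an application adds on top of a library.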