MicroEJ Linker
Overview
MicroEJ Linker is a standard linker that is compliant with the Executable and Linkable Format (ELF).
MicroEJ Linker takes one or several relocatable binary files and generates an image representation using a description file. The process of extracting binary code, positioning blocks and resolving symbols is called linking.
Relocatable object files are generated by SOAR and third-party compilers. An archive file is a container of relocatable object files.
The description file is called a Linker Specific Configuration file (lsc). It describes what shall be embedded, and how those things shall be organized in the program image. The linker outputs:
An ELF executable file that contains the image and potential debug sections. This file can be directly used by debuggers or programming tools. It may also be converted into another format (Intel* hex, Motorola* s19, rawBinary, etc.) using external tools, such as the standard GNU binutils toolchain (objcopy, objdump, etc.).
A map file, in XML format, which can be viewed as a database of what has been embedded and resolved by the linker. It can be easily processed to get a sorted list of all sizes, call graphs, statistics, etc.
The linker is composed of one or more library loaders, according to the platform’s configuration.
ELF Overview
An ELF relocatable file is split into several sections:
allocation sections representing a part of the program
control sections describing the binary sections (relocation sections, symbol tables, debug sections, etc.)
An allocation section can hold some image binary bytes (assembler instructions and raw data) or can refer to an interval of memory which makes sense only at runtime (statics, main stack, heap, etc.). An allocation section is an atomic block and cannot be split. A section has a name that, by convention, represents the kind of data it holds. For example, .text sections hold binary instructions, .bss sections hold uninitialized read-write static data, .rodata sections hold read-only data, and .data sections hold read-write data (initialized static data). The name is used in the .lsc file to organize sections.
A symbol is an entity made of a name and a value. A symbol may be absolute (link-time constant) or relative to a section: its value is unknown until MicroEJ Linker has assigned a definitive position to the target section. A symbol can be local to the relocatable file or global to the system. All global symbol names should be unique in the system (the name is the key that connects an unresolved symbol reference to a symbol definition). A section may need the value of symbols to be fully resolved: the address of a called function, the address of a static variable, etc.
Linking Process
The linking process can be divided into three main steps:
Symbols and sections resolution. Starting from root symbols and root sections, the linker embeds all sections targeted by symbols and all symbols referred to by sections. This process is repeated transitively as long as new symbols and/or sections are found. At the end of this step, the linker may stop and output errors (unresolved symbols, duplicate symbols, unknown or bad input libraries, etc.).
Memory positioning. Sections are laid out in memory ranges according to memory layout constraints described by the lsc file. Relocations are performed (in other words, symbol values are resolved and section contents are modified). At the end of this step, the linker may stop and output errors (it could not resolve constraints, such as not enough memory, etc.).
An output ELF executable file and map file are generated.
A partial map file may be generated at the end of step 2. It provides useful information to understand why the link phase failed. Symbol resolution is the process of connecting a global symbol name to its definition, found in one of the linker input units. The order in which the units are passed to the linker may have an impact on symbol resolution. The rules are:
Relocatable object files are loaded without order. Two global symbols defined with the same name result in an unrecoverable linker error.
Archive files are loaded on demand. When a global symbol must be resolved, the linker inspects each archive unit in the order it was passed to the linker. When an archive contains a relocatable object file that declares the symbol, the object file is extracted and loaded. Then the first rule is applied. It is recommended that you group object files in archives as much as possible, in order to improve load performance. Moreover, archive files are the only way to link with relocatable object files that share the same symbol definitions.
A symbol name is resolved to a weak symbol if - and only if - no global symbol is found with the same name.
Linker Specific Configuration File Specification
Description
A Linker Specific Configuration (lsc) file contains directives to link input library units. An lsc file is written in an XML dialect, and its contents can be divided into two principal categories:
Symbols and sections definitions.
Memory layout definitions.
<?xml version="1.0" encoding="UTF-8"?>
<!--
An example of linker specific configuration file
-->
<lsc name="MyAppInFlash">
<include name="subfile.lscf"/>
<!--
Define symbols with arithmetical and logical expressions
-->
<defSymbol name="FlashStart" value="0"/>
<defSymbol name="FlashSize" value="0x10000"/>
<defSymbol name="FlashEnd" value="FlashStart+FlashSize-1"/>
<!--
Define FLASH memory interval
-->
<defSection name="FLASH" start="FlashStart" size="FlashSize"/>
<!--
Some memory layout directives
-->
<memoryLayout ranges="FLASH">
<sectionRef name="*.text"/>
<sectionRef name="*.data"/>
</memoryLayout>
</lsc>
File Fragments
An lsc file can be physically divided into multiple lsc files, which are called lsc fragments. Lsc fragments may be loaded directly from the linker path option, or indirectly using the include tag in an lsc file.
Lsc fragments start with the root tag lscFragment. By convention, the lsc fragment file extension is .lscf. From here to the end of the document, the expression “the lsc file” denotes the result of the union of all loaded (directly and indirectly loaded) lsc fragment files.
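As an illustration, a fragment such as the subfile.lscf included by the earlier example might look like the sketch below. It only uses the lscFragment root tag and the defSymbol and defSection tags described in this document. The symbol names, the section name and the values are hypothetical, and whether lscFragment accepts any attributes is not covered here, so none are used.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- subfile.lscf: hypothetical fragment loaded through <include name="subfile.lscf"/> -->
<lscFragment>
    <!-- Illustrative symbols; names and values are assumptions -->
    <defSymbol name="RamStart" value="0x20000000"/>
    <defSymbol name="RamSize" value="0x8000"/>
    <!-- A section used to define a memory interval -->
    <defSection name="RAM" start="RamStart" size="RamSize"/>
</lscFragment>
```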
Symbols and Sections
A new symbol is defined using the defSymbol tag. A symbol has a name and an expression value. All symbols defined in the lsc file are global symbols.
A new section is defined using the defSection tag. A section may be used to define a memory interval, or define a chunk of the final image with the description of the contents of the section.
Memory Layout
A memory layout contains an ordered set of statements describing what shall be embedded. Memory positioning can be viewed as moving a cursor into intervals, appending referenced sections in the order they appear. A symbol can be defined as a “floating” item: its value is the value of the cursor when the symbol definition is encountered. For example, consider a memory layout that sets the FLASH section, references .text sections, defines a floating symbol dataStart, and then references .data sections. First, all sections named .text are embedded. The matching sections are appended in an undefined order. To reference a specific section, the section shall have a unique name (for example, a reset vector is commonly called .reset or .vector, etc.). Then, the floating symbol dataStart is set to the absolute address of the virtual cursor right after the embedded .text sections. Finally, all sections named .data are embedded.
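The layout just described can be sketched as follows. This is a minimal reconstruction from the description above, reusing the `value="."` notation that the relocation example uses for the current cursor. It assumes a FLASH section has been defined elsewhere in the lsc file.

```xml
<memoryLayout ranges="FLASH">
    <!-- 1. Embed all sections named .text (in an undefined order) -->
    <sectionRef name=".text"/>
    <!-- 2. Floating symbol: takes the cursor value right after the .text sections -->
    <defSymbol name="dataStart" value="."/>
    <!-- 3. Embed all sections named .data -->
    <sectionRef name=".data"/>
</memoryLayout>
```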
A memory layout can be relocated to a memory interval. The positioning works in parallel with the layout ranges, as if there were two cursors. The address of the section (used to resolve symbols) is the address in the relocated interval. Floating symbols can refer either to the layout cursor (by default), or to the relocated cursor, using the relocation attribute. A relocation layout is typically used to embed data in a program image that will be used at runtime in a read-write memory. Assuming the program image is programmed in a read-only memory, one of the first jobs at runtime, before starting the main program, is to copy the data from read-only memory to RAM, because the symbols targeting the data have been resolved with the address of the sections in the relocated space. To perform the copy, the program needs both the start address in FLASH where the data has been put, and the start address in RAM where the data shall be copied.
<memoryLayout ranges="FLASH" relocation="RAM" image="true">
<defSymbol name="DataFlashStart" value="."/>
<defSymbol name="DataRamStart" value="." relocation="true"/>
<sectionRef name=".data"/>
<defSymbol name="DataFlashLimit" value="."/>
</memoryLayout>
Note
The symbol DataRamStart is set to the start address where .data sections will be inserted in RAM memory.
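For context, the relocation layout above relies on FLASH and RAM sections being defined in the lsc file. The sketch below shows one way the pieces might fit together. The lsc name, the addresses and the sizes are placeholders, and the DataSize symbol is a hypothetical addition that computes how many bytes the startup code has to copy. Whether a symbol may be defined after the layout whose floating symbols it uses is an assumption here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<lsc name="RelocationSketch">
    <!-- Hypothetical memory intervals; addresses and sizes are placeholders -->
    <defSection name="FLASH" start="0x08000000" size="0x10000"/>
    <defSection name="RAM" start="0x20000000" size="0x4000"/>

    <!-- .data is stored in FLASH but its symbols are resolved against RAM -->
    <memoryLayout ranges="FLASH" relocation="RAM" image="true">
        <defSymbol name="DataFlashStart" value="."/>
        <defSymbol name="DataRamStart" value="." relocation="true"/>
        <sectionRef name=".data"/>
        <defSymbol name="DataFlashLimit" value="."/>
    </memoryLayout>

    <!-- Hypothetical helper: number of bytes to copy from FLASH to RAM at startup -->
    <defSymbol name="DataSize" value="DataFlashLimit-DataFlashStart"/>
</lsc>
```

At runtime, the startup code would copy DataSize bytes from DataFlashStart to DataRamStart before entering the main program.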
Expressions
An attribute expression is a value resulting from the computation of an arithmetical and logical expression. Supported operators are the same operators supported in the Java language, and follow Java semantics:
Unary operators:
+ , - , ~ , !
Binary operators:
+ , - , * , / , % , << , >>> , >> , < , > , <= , >= , == , != , &, | , ^ , && , ||
Ternary operator:
cond ? ifTrue : ifFalse
Built-in macros:
START(name): Get the start address of a section or a group of sections
END(name): Get the end address of a section or a group of sections
SIZE(name): Get the size of a section or a group of sections. Equivalent to END(name)-START(name)
TSTAMPH(), TSTAMPL(): Get 32 bits linker time stamp (high/low part of system time in milliseconds)
SUM(name,tag): Get the sum of a column of an auto-generated section (see Auto-generated Sections). The column is specified by its tag name.
An operand is either a sub-expression, a constant, or a symbol name. Constants may be written in decimal (127) or hexadecimal form (0x7F). There are no boolean constants. The constant value 0 means false, and any other constant value means true. Examples of use:
value="symbol+3"
value="((symbol1*4)-(symbol2*3))"
Note: Ternary expressions can be used to define selective linking because they are the only expressions that may remain partially unresolved without generating an error. Example:
<defSymbol name="myFunction" value="condition ? symb1 : symb2"/>
No error will be thrown if the condition is true and symb1 is defined, or the condition is false and symb2 is defined, even if the other symbol is undefined.
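The sketch below combines several of the operators listed above in defSymbol values. The symbol names and values are hypothetical. One point follows from XML itself rather than from the linker: a literal `<` or `&` is not allowed inside an attribute value, so operators such as `<<`, `<`, `&` and `&&` have to be written with the `&lt;` and `&amp;` entities.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<lscFragment>
    <!-- 1 << 10, written with XML entities; evaluates to 1024 -->
    <defSymbol name="PageSize" value="1 &lt;&lt; 10"/>
    <defSymbol name="FlashSize" value="0x10000"/>
    <!-- 0x10000 / 1024 = 64 -->
    <defSymbol name="PageCount" value="FlashSize / PageSize"/>
    <!-- Round FlashSize up to a multiple of PageSize: (x + p - 1) & ~(p - 1) -->
    <defSymbol name="AlignedSize" value="(FlashSize + PageSize - 1) &amp; ~(PageSize - 1)"/>
    <!-- No boolean constants: 1 stands for true, 0 for false -->
    <defSymbol name="HasPartialPage" value="(FlashSize % PageSize) != 0 ? 1 : 0"/>
</lscFragment>
```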
Auto-generated Sections
The MicroEJ Linker allows you to define sections that are automatically generated with symbol values. This is commonly used to generate tables whose contents depend on the linked symbols. Symbols eligible to be embedded in an auto-generated section are of the form prefix_tag_suffix. An auto-generated section is viewed as a table composed of lines and columns that organize symbols sharing the same prefix. On the same column appear symbols that share the same tag. On the same line appear symbols that share the same suffix. Lines are sorted in the lexical order of the symbol name. The next line defines a section which will embed symbols starting with zeroinit. The first column refers to symbols starting with zeroinit_start_; the second column refers to symbols starting with zeroinit_end_.
<defSection
name=".zeroinit"
symbolPrefix="zeroinit"
symbolTags="start,end"
/>
Consider there are four defined symbols named zeroinit_start_xxx, zeroinit_end_xxx, zeroinit_start_yyy and zeroinit_end_yyy.
The generated section is of the form:
0x00: zeroinit_start_xxx
0x04: zeroinit_end_xxx
0x08: zeroinit_start_yyy
0x0C: zeroinit_end_yyy
If there are missing symbols to fill a line of an auto-generated section, an error is thrown.
Execution
MicroEJ Linker can be invoked through an ANT task. The task is installed by inserting the following code in an ANT script:
<taskdef
name="linker"
classname="com.is2t.linker.GenericLinkerTask"
classpath="[LINKER_CLASSPATH]"
/>
[LINKER_CLASSPATH] is a list of path-separated jar files, including the linker and all architecture-specific library loaders.
The following code shows a linker ANT task invocation and available options.
<linker
doNotLoadAlreadyDefinedSymbol="[true|false]"
endianness="[little|big|none]"
generateMapFile="[true|false]"
ignoreWrongPositioningForEmptySection="[true|false]"
lsc="[filename]"
linkPath="[path1:...pathN]"
mergeSegmentSections="[true|false]"
noWarning="[true|false]"
outputArchitecture="[tag]"
outputName="[name]"
stripDebug="[true|false]"
toDir="[outputDir]"
verboseLevel="[0...9]"
>
<!-- ELF object & archives files using ANT paths / filesets -->
<fileset dir="xxx" includes="*.o"/>
<fileset file="xxx.a"/>
<fileset file="xxx.a"/>
<!-- Properties that will be reported into .map file -->
<property name="myProp" value="myValue"/>
</linker>
Option | Description |
---|---|
doNotLoadAlreadyDefinedSymbol | Silently skip the load of a global symbol if it has already been loaded before. |
endianness | Explicitly declare linker endianness. |
generateMapFile | Generate the map file. |
ignoreWrongPositioningForEmptySection | Silently ignore wrong section positioning for zero size sections. |
lsc | Provide a master lsc file. This option is mandatory unless the linkPath option is set. |
linkPath | Provide a set of directories from which to load link file fragments. Directories are separated with a platform-path separator. This option is mandatory unless the lsc option is set. |
noWarning | Silently skip the output of warning messages. |
mergeSegmentSections | (experimental). Generate a single section per segment. This may speed up the load of the output executable file into debuggers or flasher tools. |
outputArchitecture | Set the architecture tag for the output ELF file (ELF machine id). |
outputName | Specify the output name of the generated files. By default, take the name provided in the lsc tag. The output ELF executable filename will be name.out. The map filename will be name.map. |
stripDebug | Remove all debug information from the output ELF file. A stripped output ELF executable holds only the binary image (no remaining symbols, debug sections, etc.). |
toDir | Specify the output directory in which to store generated files. |
verboseLevel | Print additional messages on the standard output about the linking process. |
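Putting the task definition and the options together, a complete ANT script might look like the sketch below. The project and target names, all file and directory paths, the property names and the chosen option values are hypothetical. Only the task class name and the option names come from this document, and [LINKER_CLASSPATH] is left as a placeholder to be replaced by the actual jar list.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project name="link-example" default="link">
    <!-- Placeholder: path-separated list of the linker and library loader jars -->
    <property name="linker.classpath" value="[LINKER_CLASSPATH]"/>

    <!-- Install the linker task -->
    <taskdef
        name="linker"
        classname="com.is2t.linker.GenericLinkerTask"
        classpath="${linker.classpath}"
    />

    <target name="link">
        <!-- Hypothetical paths and values; only a subset of the options is set -->
        <linker
            lsc="MyAppInFlash.lsc"
            endianness="little"
            generateMapFile="true"
            outputName="MyApp"
            toDir="build/link"
            verboseLevel="1"
        >
            <!-- Relocatable object files, then archives in resolution order -->
            <fileset dir="build/objs" includes="*.o"/>
            <fileset file="libs/runtime.a"/>
            <!-- Reported into the map file -->
            <property name="buildType" value="debug"/>
        </linker>
    </target>
</project>
```

With these settings, and following the naming rule given for outputName, the generated files would be named MyApp.out and MyApp.map.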
Error Messages
This section lists MicroEJ Linker error messages.
Message ID | Description |
---|---|
0 | The linker has encountered an unexpected internal error. Please contact the support hotline. |
1 | A library cannot be loaded with this linker. Try verbose to check installed loaders. |
2 | No lsc file provided to the linker. |
3 | A file could not be loaded. Check the existence of the file and file access rights. |
4 | Conflicting input libraries. A global symbol definition with the same name has already been loaded from a previous object file. |
5 | Completion (*) could not be used in association with the force attribute. Must be an exact name. |
6 | A required section refers to an unknown global symbol. Maybe input libraries are missing. |
7 | A library loader has encountered an unexpected internal error. Check input library file integrity. |
8 | Floating symbols can only be declared inside |
9 | Invalid value format. For example, the attribute relocation in |
10 | Missing one of the following attributes: |
11 | Too many attributes that cannot be used in association. |
13 | Negative padding. Memory layout cursor cannot decrease. |
15 | Not enough space in the memory layout intervals to append all sections that need to be embedded. Check the output map file to get more information about what is required as memory space. |
16 | A block is referenced but has already been embedded. Most likely a block has been especially embedded using the force attribute and the symbol attribute. |
17 | A block that must be embedded has no matching |
19 | An IO error occurred when trying to dump one of the output files. Check the output directory option and file access rights. |
20 | |
21 | The computed size does not match the declared size. |
22 | Sections defined in the lsc file must be unique. |
23 | One of the memory layout intervals refers to an unknown lsc section. |
24 | Relocation must be done in one and only one contiguous interval. |
25 | |
26 | XML char data not allowed at this position in the lsc file. |
27 | A section which is a part of the program image must be embedded in an image memory layout. |
28 | A section which is not a part of the program image must be embedded in a non-image memory layout. |
29 | Expression could not be resolved to a link-time constant. Some symbols are unresolved. |
30 | Sections used in memory layout ranges must be sections defined in the lsc file. |
31 | Invalid character encountered when scanning the lsc expression. |
32 | A recursive include cycle was detected. |
33 | An alignment inconsistency was detected in a relocation memory layout. Most likely one of the start addresses of the memory layout is not aligned on the current alignment. |
34 | An error occurred during a relocation resolution. In general, the relocation has a value that is out of range. |
35 | |
36 | Invalid sort attribute value is not one of |
37 | Attribute |
38 | Autogenerated section can build tables according to symbol names (see Auto-generated Sections). A symbol is needed to build this section but has not been loaded. |
39 | Deprecated feature warning. Remains for backward compatibility. It is recommended that you use the new indicated feature, because this feature may be removed in future linker releases. |
40 | Unknown output architecture. Either the architecture ID is invalid, or the library loader has not been loaded by the linker. Check loaded library loaders using verbose option. |
41…43 | Reserved. |
44 | Duplicate group definition. A group name is unique and cannot be defined twice. |
45 | Invalid endianness. The endianness mnemonic is not one of the expected mnemonics. |
46 | Multiple endiannesses detected within loaded input libraries. |
47 | Reserved. |
48 | Invalid type mnemonic passed to a |
49 | Warning. A directory of link path is invalid (skipped). |
50 | No linker-specific description file could be loaded from the link path. Check that the link path directories are valid, and that they contain |
51 | Exclusive options (these options cannot be used simultaneously). For example, |
52 | Name given to a |
53 | A |
54 | A |
55 | No memory layout found matching the name of the current |
56 | A named |
57 | A named |
58 | |
59 | |
60 | |
61 | An option is set for an unknown extension. Most likely the extension has not been set to the linker classpath. |
62 | Reserved. |
63 | ELF unit flags are inconsistent with flags set using the |
64 | Reserved. |
65 | Reserved. |
66 | Found an executable object file as input (expected a relocatable object file). |
67 | Reserved. |
68 | Reserved. |
69 | Reserved. |
70 | Not enough memory to achieve the linking process. Try to increase JVM heap that is running the linker (e.g. by adding option -Xmx1024M to the JRE command line). |
Map File Interpretor
The map file interpretor is a tool that allows you to read, classify and display memory information dumped by the linker map file. The map file interpretor is a graph-oriented tool. It supports graphs of symbols and allows standard operations on them (union, intersection, subtract, etc.). It can also dump graphs, compute graph total sizes, list graph paths, etc.
The map file interpretor uses the standard Java regular expression syntax.
It is used internally by the graphical Memory Map Analyzer tool.
Commands:
createGraph graphName symbolRegExp ... section=regexp
Recursively creates a graph of symbols from root symbols and sections described as regular expressions. For example, to extract the complete graph of the application:
createGraph all section=.*
createGraphNoRec symbolRegExp ... section=regexp
Similar to the previous command, but embeds only declared symbols and sections (without recursive connections).
removeGraph graphName
Removes the graph from memory.
listGraphs
Lists all the created graphs in memory.
listSymbols graphName
Lists all graph symbols.
listPadding
Lists the padding of the application.
listSections graphName
Lists all sections targeted by all symbols of the graph.
inter graphResult g1 ... gn
Creates a graph which is the intersection of g1/\ ... /\gn.
union graphResult g1 ... gn
Creates a graph which is the union of g1\/ ...\/ gn.
substract graphResult g1 ... gn
Creates a graph which is the subtraction of g1\ ... \ gn.
reportConnections graphName
Prints the graph connections.
totalImageSize graphName
Prints the image size of the graph.
totalDynamicSize graphName
Prints the dynamic size of the graph.
accessPath symbolName
Prints one of the paths from a root symbol to this symbol. This is very useful in helping you understand why a symbol is embedded.
echo arguments
Prints raw text.
exec commandFile
Executes the given commandFile. The path may be absolute or relative to the current command file.