When the dynamic executable prog is executed, the function foo() is obtained from the filtee libbar.so.1, not from the filter libfoo.so.1. However, the data item bar is obtained from the filter libfoo.so.1, as this symbol has no alternative definition in the filtee libbar.so.1.
Auxiliary filters provide a mechanism for defining an alternative interface of an existing shared object. This mechanism is used in the Solaris operating environment to provide optimized functionality within platform specific shared objects. See "Instruction Set Specific Shared Objects" and "System Specific Shared Objects" for examples.
The runtime linker's processing of a filter defers the loading of a filtee until a reference to a symbol within the filter has occurred. This implementation is analogous to the filter performing a dlopen(3DL) on each of its filtees as they are required. This implementation accounts for differences in dependency reporting that can be produced by tools such as ldd(1).
The link-editor's -z loadfltr option can be used when creating a filter to cause the immediate processing of its filtees at runtime. In addition, the immediate processing of any filtees within a process can be triggered by setting the LD_LOADFLTR environment variable to any value.
A shared object can be used by multiple applications within the same system. The performance of a shared object affects the applications that use it and the system as a whole.
Although the actual code within a shared object will directly affect the performance of a running process, the performance issues focused upon here target the runtime processing of the shared object itself. The following sections investigate this processing in more detail by looking at aspects such as text size and purity, together with relocation overhead.
$ size -x libfoo.so.1 59c + 10c + 20 = 0x6c8 $ size -xf libfoo.so.1 ..... + 1c(.init) + ac(.text) + c(.fini) + 4(.rodata) + \ ..... + 18(.data) + 20(.bss) .....
The first example indicates the size of the shared objects text, data, and bss, a categorization used in previous releases of the SunOS operating system.
Sections are allocated to units known as segments, some of which describe how portions of a file will be mapped into memory (see the mmap(2) man page). These loadable segments can be displayed by using the dump(1) command and examining the LOAD entries. For example:
$ dump -ov libfoo.so.1 libfoo.so.1: ***** PROGRAM EXECUTION HEADER ***** Type Offset Vaddr Paddr Filesz Memsz Flags Align LOAD 0x94 0x94 0x0 0x59c 0x59c r-x 0x10000 LOAD 0x630 0x10630 0x0 0x10c 0x12c rwx 0x10000
There are two loadable segments in the shared object libfoo.so.1, commonly referred to as the text and data segments. The text segment is mapped to allow reading and execution of its contents (r-x), whereas the data segment is mapped to also allow its contents to be modified (rwx). The memory size (Memsz) of the data segment differs from the file size (Filesz). This difference accounts for the .bss section, which is part of the data segment, and is dynamically created when the segment is loaded.
$ nm -x libfoo.so.1 [Index] Value Size Type Bind Other Shndx Name .........  |0x00000538|0x00000000|FUNC |GLOB |0x0 |7 |_init  |0x00000588|0x00000034|FUNC |GLOB |0x0 |8 |foo  |0x00000600|0x00000000|FUNC |GLOB |0x0 |9 |_fini  |0x00010688|0x00000010|OBJT |GLOB |0x0 |13 |data  |0x0001073c|0x00000020|OBJT |GLOB |0x0 |16 |bss .........
$ dump -hv libfoo.so.1 libfoo.so.1: **** SECTION HEADER TABLE **** [No] Type Flags Addr Offset Size Name .........  PBIT -AI 0x538 0x538 0x1c .init  PBIT -AI 0x554 0x554 0xac .text  PBIT -AI 0x600 0x600 0xc .fini .........  PBIT WA- 0x10688 0x688 0x18 .data  NOBI WA- 0x1073c 0x73c 0x20 .bss .........
The output from both the previous nm(1) and dump(1) examples shows the association of the functions _init, foo, and _fini to the sections .init, .text and .fini. These sections, because of their read-only nature, are part of the text segment.
Similarly, the data arrays data, and bss are associated with the sections .data and .bss respectively. These sections, because of their writable nature, are part of the data segment.
Note - The previous dump(1) display has been simplified for this example.
When an application is built using a shared object, the entire loadable contents of the object are mapped into the virtual address space of that process at runtime. Each process that uses a shared object starts by referencing a single copy of the shared object in memory.
Relocations within the shared object are processed to bind symbolic references to their appropriate definitions. This results in the calculation of true virtual addresses that could not be derived at the time the shared object was generated by the link-editor. These relocations usually result in updates to entries within the process's data segments.
The memory management scheme underlying the dynamic linking of shared objects shares memory among processes at the granularity of a page. Memory pages can be shared as long as they are not modified at runtime. If a process writes to a page of a shared object when writing a data item, or relocating a reference to a shared object, it generates a private copy of that page. This private copy will have no effect on other users of the shared object. However, this page will have lost any benefit of sharing between other processes. Text pages that become modified in this manner are referred to as impure.
The segments of a shared object that are mapped into memory fall into two basic categories; the text segment, which is read-only, and the data segment, which is read-write. See "Analyzing Files" on how to obtain this information from an ELF file. An overriding goal when developing a shared object is to maximize the text segment and minimize the data segment. This optimizes the amount of code sharing while reducing the amount of processing needed to initialize and use a shared object. The following sections present mechanisms that can help achieve this goal.
Lazy Loading of Dynamic Dependencies
You can defer the loading of a shared object dependency until the dependency is first referenced by establishing the object as lazy loadable. See "Lazy Loading of Dynamic Dependencies".
For applications that require a small number of dependencies, running the application may load all the dependencies whether they are defined lazy loadable or not. However, under lazy loading, dependency processing may be deferred from process startup and spread throughout the process's execution.
For applications with many dependencies, employing lazy loading will often result in some dependencies not being loaded at all, as they may not be referenced for a particular thread of execution.
The compiler can generate position-independent code under the -K pic option. Whereas the code within a dynamic executable is usually tied to a fixed address in memory, position-independent code can be loaded anywhere in the address space of a process. Because the code is not tied to a specific address, it will execute correctly without page modification at a different address in each process that uses it. This code creates programs that require the smallest amount of page modification at runtime.
When you use position-independent code, relocatable references are generated as an indirection that will use data in the shared object's data segment. The text segment code will remain read-only, and all relocation updates will be applied to corresponding entries within the data segment. See "Global Offset Table (Processor-Specific)" and "Procedure Linkage Table (Processor-Specific)" for more details on the use of these two sections.
If a shared object is built from code that is not position-independent, the text segment will usually require a large number of relocations to be performed at runtime. Although the runtime linker is equipped to handle this, the system overhead this creates can cause serious performance degradation.
$ cc -o libfoo.so.1 -G -R. foo.c $ dump -Lv libfoo.so.1 | grep TEXTREL  TEXTREL 0
Note - The value of the TEXTREL entry is irrelevant. Its presence in a shared object indicates that text relocations exist.
To prevent the creation of a shared object that contains text relocations use the link-editor's -z text flag. This flag causes the link-editor to generate diagnostics indicating the source of any non-position-independent code used as input. Such code results in a failure to generate the intended shared object. For example:
$ cc -o libfoo.so.1 -z text -G -R. foo.c Text relocation remains referenced against symbol offset in file foo 0x0 foo.o bar 0x8 foo.o ld: fatal: relocations remain against allocatable but \ non-writable sections
Two relocations are generated against the text segment because of the non-position-independent code generated from the file foo.o. Where possible, these diagnostics will indicate any symbolic references that are required to carry out the relocations. In this case, the relocations are against the symbols foo and bar.
Another common cause of creating text relocations when generating a shared object is by including hand-written assembler code that has not been coded with the appropriate position-independent prototypes.
Note - You may want to experiment with some simple source files to determine coding sequences that enable position-independence. Use the compilers ability to generate intermediate assembler output.