微软在去年发布了Bash On Windows, 这项技术允许在Windows上运行Linux程序, 我相信已经有很多文章解释过Bash On Windows的原理,
而今天的这篇文章将会讲解如何自己实现一个简单的原生Linux程序运行器, 这个运行器在用户层实现, 原理和Bash On Windows不完全一样,比较接近Linux上的Wine.
示例程序完整的代码在github上, 地址是 https://github.com/303248153/HelloElfLoader
初步了解ELF格式首先让我们先了解什么是原生Linux程序, 以下说明摘自维基百科
In computing, the Executable and Linkable Format (ELF, formerly named Extensible Linking Format), is a common standard file format for executable files, object code, shared libraries, and core dumps. First published in the specification for the application binary interface (ABI) of the Unix operating system version named System V Release 4 (SVR4),[2] and later in the Tool Interface Standard,[1] it was quickly accepted among different vendors of Unix systems. In 1999, it was chosen as the standard binary file format for Unix and Unix-like systems on x86 processors by the 86open project. By design, ELF is flexible, extensible, and cross-platform, not bound to any given central processing unit (CPU) or instruction set architecture. This has allowed it to be adopted by many different operating systems on many different hardware platforms.Linux的可执行文件格式采用了ELF格式, 而Windows采用了PE格式, 也就是我们经常使用的exe文件的格式.
ELF格式的结构如下
大致上可以分为这些部分
让我们来实际看一下Linux可执行程序的样子
以下的编译环境是Ubuntu 16.04 x64 + gcc 5.4.0, 编译环境不一样可能会得出不同的结果
首先创建hello.c,写入以下的代码
#include <stdio.h> int max(int x, int y) { return x > y ? x : y; } int main() { printf(, max(123, 321)); printf(, 1, 2, 3, "a", "b", "c", "d", "e", "f"); return 100; }
然后使用gcc编译这份代码
gcc hello.c
编译完成后你可以看到hello.c旁边多了一个a.out, 这就是linux的可执行文件了, 现在可以在linux上运行它
./a.out
你可以看到以下输出
max is 321 test many arguments 1 2 3 a b c d e f我们来看看a.out包含了什么,解析ELF文件可以使用readelf命令
readelf -a ./a.out
可以看到输出了以下的信息
ELF 头: Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 类别: ELF64 数据: 2 补码,小端序 (little endian) 版本: 1 (current) OS/ABI: UNIX - System V ABI 版本: 0 类型: EXEC (可执行文件) 系统架构: Advanced Micro Devices X86-64 版本: 0x1 入口点地址: 0x400430 程序头起点: 64 (bytes into file) Start of section headers: 6648 (bytes into file) 标志: 0x0 本头的大小: 64 (字节) 程序头大小: 56 (字节) Number of program headers: 9 节头大小: 64 (字节) 节头数量: 31 字符串表索引节头: 28 节头: [号] 名称 类型 地址 偏移量 大小 全体大小 旗标 链接 信息 对齐 [ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0 [ 1] .interp PROGBITS 0000000000400238 00000238 000000000000001c 0000000000000000 A 0 0 1 [ 2] .note.ABI-tag NOTE 0000000000400254 00000254 0000000000000020 0000000000000000 A 0 0 4 [ 3] .note.gnu.build-i NOTE 0000000000400274 00000274 0000000000000024 0000000000000000 A 0 0 4 [ 4] .gnu.hash GNU_HASH 0000000000400298 00000298 000000000000001c 0000000000000000 A 5 0 8 [ 5] .dynsym DYNSYM 00000000004002b8 000002b8 0000000000000060 0000000000000018 A 6 1 8 [ 6] .dynstr STRTAB 0000000000400318 00000318 000000000000003f 0000000000000000 A 0 0 1 [ 7] .gnu.version VERSYM 0000000000400358 00000358 0000000000000008 0000000000000002 A 5 0 2 [ 8] .gnu.version_r VERNEED 0000000000400360 00000360 0000000000000020 0000000000000000 A 6 1 8 [ 9] .rela.dyn RELA 0000000000400380 00000380 0000000000000018 0000000000000018 A 5 0 8 [10] .rela.plt RELA 0000000000400398 00000398 0000000000000030 0000000000000018 AI 5 24 8 [11] .init PROGBITS 00000000004003c8 000003c8 000000000000001a 0000000000000000 AX 0 0 4 [12] .plt PROGBITS 00000000004003f0 000003f0 0000000000000030 0000000000000010 AX 0 0 16 [13] .plt.got PROGBITS 0000000000400420 00000420 0000000000000008 0000000000000000 AX 0 0 8 [14] .text PROGBITS 0000000000400430 00000430 00000000000001f2 0000000000000000 AX 0 0 16 [15] .fini PROGBITS 0000000000400624 00000624 0000000000000009 0000000000000000 AX 0 0 4 [16] .rodata PROGBITS 0000000000400630 00000630 0000000000000050 0000000000000000 A 0 0 8 [17] .eh_frame_hdr PROGBITS 0000000000400680 00000680 000000000000003c 0000000000000000 A 0 0 4 [18] .eh_frame PROGBITS 00000000004006c0 000006c0 0000000000000114 0000000000000000 A 0 0 8 [19] .init_array INIT_ARRAY 0000000000600e10 00000e10 0000000000000008 0000000000000000 WA 0 0 8 [20] .fini_array FINI_ARRAY 0000000000600e18 00000e18 0000000000000008 0000000000000000 WA 0 0 8 [21] .jcr PROGBITS 0000000000600e20 00000e20 0000000000000008 0000000000000000 WA 0 0 8 [22] .dynamic DYNAMIC 0000000000600e28 00000e28 00000000000001d0 0000000000000010 WA 6 0 8 [23] .got PROGBITS 0000000000600ff8 00000ff8 0000000000000008 0000000000000008 WA 0 0 8 [24] .got.plt PROGBITS 0000000000601000 00001000 0000000000000028 0000000000000008 WA 0 0 8 [25] .data PROGBITS 0000000000601028 00001028 0000000000000010 0000000000000000 WA 0 0 8 [26] .bss NOBITS 0000000000601038 00001038 0000000000000008 0000000000000000 WA 0 0 1 [27] .comment PROGBITS 0000000000000000 00001038 0000000000000034 0000000000000001 MS 0 0 1 [28] .shstrtab STRTAB 0000000000000000 000018ea 000000000000010c 0000000000000000 0 0 1 [29] .symtab SYMTAB 0000000000000000 00001070 0000000000000660 0000000000000018 30 47 8 [30] .strtab STRTAB 0000000000000000 000016d0 000000000000021a 0000000000000000 0 0 1 Key to Flags: W (write), A (alloc), X (execute), M (merge), S (strings), l (large) I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown) O (extra OS processing required) o (OS specific), p (processor specific) There are no section groups in this file. 程序头: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040 0x00000000000001f8 0x00000000000001f8 R E 8 INTERP 0x0000000000000238 0x0000000000400238 0x0000000000400238 0x000000000000001c 0x000000000000001c R 1 [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2] LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000 0x00000000000007d4 0x00000000000007d4 R E 200000 LOAD 0x0000000000000e10 0x0000000000600e10 0x0000000000600e10 0x0000000000000228 0x0000000000000230 RW 200000 DYNAMIC 0x0000000000000e28 0x0000000000600e28 0x0000000000600e28 0x00000000000001d0 0x00000000000001d0 RW 8 NOTE 0x0000000000000254 0x0000000000400254 0x0000000000400254 0x0000000000000044 0x0000000000000044 R 4 GNU_EH_FRAME 0x0000000000000680 0x0000000000400680 0x0000000000400680 0x000000000000003c 0x000000000000003c R 4 GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 RW 10 GNU_RELRO 0x0000000000000e10 0x0000000000600e10 0x0000000000600e10 0x00000000000001f0 0x00000000000001f0 R 1 Section to Segment mapping: 段节... 00 01 .interp 02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame 03 .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss 04 .dynamic 05 .note.ABI-tag .note.gnu.build-id 06 .eh_frame_hdr 07 08 .init_array .fini_array .jcr .dynamic .got Dynamic section at offset 0xe28 contains 24 entries: 标记 类型 名称/值 0x0000000000000001 (NEEDED) 共享库:[libc.so.6] 0x000000000000000c (INIT) 0x4003c8 0x000000000000000d (FINI) 0x400624 0x0000000000000019 (INIT_ARRAY) 0x600e10 0x000000000000001b (INIT_ARRAYSZ) 8 (bytes) 0x000000000000001a (FINI_ARRAY) 0x600e18 0x000000000000001c (FINI_ARRAYSZ) 8 (bytes) 0x000000006ffffef5 (GNU_HASH) 0x400298 0x0000000000000005 (STRTAB) 0x400318 0x0000000000000006 (SYMTAB) 0x4002b8 0x000000000000000a (STRSZ) 63 (bytes) 0x000000000000000b (SYMENT) 24 (bytes) 0x0000000000000015 (DEBUG) 0x0 0x0000000000000003 (PLTGOT) 0x601000 0x0000000000000002 (PLTRELSZ) 48 (bytes) 0x0000000000000014 (PLTREL) RELA 0x0000000000000017 (JMPREL) 0x400398 0x0000000000000007 (RELA) 0x400380 0x0000000000000008 (RELASZ) 24 (bytes) 0x0000000000000009 (RELAENT) 24 (bytes) 0x000000006ffffffe (VERNEED) 0x400360 0x000000006fffffff (VERNEEDNUM) 1 0x000000006ffffff0 (VERSYM) 0x400358 0x0000000000000000 (NULL) 0x0 重定位节 '.rela.dyn' 位于偏移量 0x380 含有 1 个条目: 偏移量 信息 类型 符号值 符号名称 + 加数 000000600ff8 000300000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0 重定位节 '.rela.plt' 位于偏移量 0x398 含有 2 个条目: 偏移量 信息 类型 符号值 符号名称 + 加数 000000601018 000100000007 R_X86_64_JUMP_SLO 0000000000000000 printf@GLIBC_2.2.5 + 0 000000601020 000200000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main@GLIBC_2.2.5 + 0 The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported. Symbol table '.dynsym' contains 4 entries: Num: Value Size Type Bind Vis Ndx Name 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.2.5 (2) 2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.2.5 (2) 3: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__ Symbol table '.symtab' contains 68 entries: Num: Value Size Type Bind Vis Ndx Name 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 0000000000400238 0 SECTION LOCAL DEFAULT 1 2: 0000000000400254 0 SECTION LOCAL DEFAULT 2 3: 0000000000400274 0 SECTION LOCAL DEFAULT 3 4: 0000000000400298 0 SECTION LOCAL DEFAULT 4 5: 00000000004002b8 0 SECTION LOCAL DEFAULT 5 6: 0000000000400318 0 SECTION LOCAL DEFAULT 6 7: 0000000000400358 0 SECTION LOCAL DEFAULT 7 8: 0000000000400360 0 SECTION LOCAL DEFAULT 8 9: 0000000000400380 0 SECTION LOCAL DEFAULT 9 10: 0000000000400398 0 SECTION LOCAL DEFAULT 10 11: 00000000004003c8 0 SECTION LOCAL DEFAULT 11 12: 00000000004003f0 0 SECTION LOCAL DEFAULT 12 13: 0000000000400420 0 SECTION LOCAL DEFAULT 13 14: 0000000000400430 0 SECTION LOCAL DEFAULT 14 15: 0000000000400624 0 SECTION LOCAL DEFAULT 15 16: 0000000000400630 0 SECTION LOCAL DEFAULT 16 17: 0000000000400680 0 SECTION LOCAL DEFAULT 17 18: 00000000004006c0 0 SECTION LOCAL DEFAULT 18 19: 0000000000600e10 0 SECTION LOCAL DEFAULT 19 20: 0000000000600e18 0 SECTION LOCAL DEFAULT 20 21: 0000000000600e20 0 SECTION LOCAL DEFAULT 21 22: 0000000000600e28 0 SECTION LOCAL DEFAULT 22 23: 0000000000600ff8 0 SECTION LOCAL DEFAULT 23 24: 0000000000601000 0 SECTION LOCAL DEFAULT 24 25: 0000000000601028 0 SECTION LOCAL DEFAULT 25 26: 0000000000601038 0 SECTION LOCAL DEFAULT 26 27: 0000000000000000 0 SECTION LOCAL DEFAULT 27 28: 0000000000000000 0 FILE LOCAL DEFAULT ABS crtstuff.c 29: 0000000000600e20 0 OBJECT LOCAL DEFAULT 21 __JCR_LIST__ 30: 0000000000400460 0 FUNC LOCAL DEFAULT 14 deregister_tm_clones 31: 00000000004004a0 0 FUNC LOCAL DEFAULT 14 register_tm_clones 32: 00000000004004e0 0 FUNC LOCAL DEFAULT 14 __do_global_dtors_aux 33: 0000000000601038 1 OBJECT LOCAL DEFAULT 26 completed.7585 34: 0000000000600e18 0 OBJECT LOCAL DEFAULT 20 __do_global_dtors_aux_fin 35: 0000000000400500 0 FUNC LOCAL DEFAULT 14 frame_dummy 36: 0000000000600e10 0 OBJECT LOCAL DEFAULT 19 __frame_dummy_init_array_ 37: 0000000000000000 0 FILE LOCAL DEFAULT ABS hello.c 38: 0000000000000000 0 FILE LOCAL DEFAULT ABS crtstuff.c 39: 00000000004007d0 0 OBJECT LOCAL DEFAULT 18 __FRAME_END__ 40: 0000000000600e20 0 OBJECT LOCAL DEFAULT 21 __JCR_END__ 41: 0000000000000000 0 FILE LOCAL DEFAULT ABS 42: 0000000000600e18 0 NOTYPE LOCAL DEFAULT 19 __init_array_end 43: 0000000000600e28 0 OBJECT LOCAL DEFAULT 22 _DYNAMIC 44: 0000000000600e10 0 NOTYPE LOCAL DEFAULT 19 __init_array_start 45: 0000000000400680 0 NOTYPE LOCAL DEFAULT 17 __GNU_EH_FRAME_HDR 46: 0000000000601000 0 OBJECT LOCAL DEFAULT 24 _GLOBAL_OFFSET_TABLE_ 47: 0000000000400620 2 FUNC GLOBAL DEFAULT 14 __libc_csu_fini 48: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterTMCloneTab 49: 0000000000601028 0 NOTYPE WEAK DEFAULT 25 data_start 50: 0000000000601038 0 NOTYPE GLOBAL DEFAULT 25 _edata 51: 0000000000400624 0 FUNC GLOBAL DEFAULT 15 _fini 52: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@@GLIBC_2.2.5 53: 0000000000400526 22 FUNC GLOBAL DEFAULT 14 max 54: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@@GLIBC_ 55: 0000000000601028 0 NOTYPE GLOBAL DEFAULT 25 __data_start 56: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__ 57: 0000000000601030 0 OBJECT GLOBAL HIDDEN 25 __dso_handle 58: 0000000000400630 4 OBJECT GLOBAL DEFAULT 16 _IO_stdin_used 59: 00000000004005b0 101 FUNC GLOBAL DEFAULT 14 __libc_csu_init 60: 0000000000601040 0 NOTYPE GLOBAL DEFAULT 26 _end 61: 0000000000400430 42 FUNC GLOBAL DEFAULT 14 _start 62: 0000000000601038 0 NOTYPE GLOBAL DEFAULT 26 __bss_start 63: 000000000040053c 109 FUNC GLOBAL DEFAULT 14 main 64: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _Jv_RegisterClasses 65: 0000000000601038 0 OBJECT GLOBAL HIDDEN 25 __TMC_END__ 66: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_registerTMCloneTable 67: 00000000004003c8 0 FUNC GLOBAL DEFAULT 11 _init Version symbols section '.gnu.version' contains 4 entries: 地址: 0000000000400358 Offset: 0x000358 Link: 5 (.dynsym) 000: 0 (*本地*) 2 (GLIBC_2.2.5) 2 (GLIBC_2.2.5) 0 (*本地*) Version needs section '.gnu.version_r' contains 1 entries: 地址:0x0000000000400360 Offset: 0x000360 Link: 6 (.dynstr) 000000: 版本: 1 文件:libc.so.6 计数:1 0x0010:名称:GLIBC_2.2.5 标志:无 版本:2 Displaying notes found at file offset 0x00000254 with length 0x00000020: Owner Data size Description GNU 0x00000010 NT_GNU_ABI_TAG (ABI version tag) OS: Linux, ABI: 2.6.32 Displaying notes found at file offset 0x00000274 with length 0x00000024: Owner Data size Description GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring) Build ID: debd3d7912be860a432b5c685a6cff7fd9418528从上面的信息中我们可以知道这个文件的类型是ELF64, 也就是64位的可执行程序, 并且有9个程序头和31个节头, 各个节的作用大家可以在网上找到资料, 这篇文章中只涉及到以下的节
什么是动态链接上面的程序中调用了printf函数, 然而这个函数的实现并不在./a.out中, 那么printf函数在哪里, 又是怎么被调用的?
printf函数的实现在glibc库中, 也就是/lib/x86_64-linux-gnu/libc.so.6中, 在执行./a.out的时候会在glibc库中找到这个函数并进行调用, 我们来看看这段代码
执行以下命令反编译./a.out
objdump -c -S ./a.out
我们可以看到以下的代码