====== Differences ====== This shows you the differences between two versions of the page.

soc:2010:peper:notes:usermode_explained [2010/06/05 15:10]
peper created
soc:2010:peper:notes:usermode_explained [2010/06/14 17:00] (current)
peper
Line 39: Line 39:
  
 ==== Linking to stdlib (glibc) ====
 +
 +**UPDATE**: That approach has been moved to a separate ''linuxlibc'' ''PLATFORM'' and is available on the [[http://git.etherboot.org/?p=people/peper/gpxe.git;a=shortlog;h=refs/heads/linuxlibc|linuxlibc branch]].
  
 Despite being non-trivial, forcing some compile flags to be disabled (namely ''-mrtd'' and ''-mregparm'' mentioned earlier) and having [[#the_other_problem_with_stdlib|some other problems]], linking to stdlib was still the quickest option for prototyping.
Line 85: Line 87:
   }
 </code>
 +
 +=== Prefix ===
 +
 +stdlib's ''_start'' takes care of everything, so the prefix code is empty.
 +
  
 ==== Being self-contained ====
  
-Work in progress.
 +To overcome the problems with linking to stdlib, we need to implement some of its elementary features ourselves.
 + 
 +=== Linker script === 
 + 
 +A good starting point is [[http://www.redhat.com/docs/manuals/enterprise/RHEL-4-Manual/gnu-linker/index.html|Using ld, the GNU Linker]].
 +With that background, the linker scripts currently in use (''arch/*/scripts/*.lds'') should make more sense.
 + 
 +Since we are not linking against stdlib, the linker script can be really simple.
 +In fact, the linker script already used for EFI (''arch/x86/scripts/efi.lds'') is simple enough to be used more or less out of the box.
 +The only necessary modification is setting the start of the text segment properly, because not every value works (you can try ''0x0'' and see :)).
 +To see what the convention is, we can look at what the default linker script does
 +by passing ''--verbose'' to ''ld'' while compiling a simple program in 32-bit and 64-bit mode.
 + 
 +<code>
 +$ gcc -m32 foo.c -o foo -Wl,--verbose
 +$ gcc -m64 foo.c -o foo -Wl,--verbose
 +</code>
 + 
 +From that we can gather that ''i386'' uses ''0x08048000'' and ''x86_64'' uses ''0x400000'' as the start address.
 +I haven't been able to find a good explanation of why these values in particular are used. Moreover, many other values also seem to work.
 +Another way of figuring out the specific values is reading the [[http://www.sco.com/developers/devspecs/abi386-4.pdf|i386 ABI]] (page 48)
 +and the [[http://www.x86-64.org/documentation/abi.pdf|AMD64 ABI]] (page 26).
 + 
 +=== Prefix (_start) === 
 + 
 +''_start'', being the default ''ENTRY'' point, is the very first thing executed when a new process receives control.
 +What we want to do in ''_start'' is the minimal work necessary to actually call our ''main()'' function.
 + 
 +To accomplish that we need to know three things:
 +  * What the state of things is when ''_start'' is executed
 +  * How to actually call ''main()''
 +  * What to do when ''main()'' returns
 + 
 +The state of the stack and registers at the time ''_start'' is executed is described in the
 +[[http://www.sco.com/developers/devspecs/abi386-4.pdf|i386 ABI]] (page 54)
 +and the [[http://www.x86-64.org/documentation/abi.pdf|AMD64 ABI]] (page 28).
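
As a rough C-level illustration of that entry state (a sketch, not gPXE code): on both architectures the stack pointer at ''_start'' points at ''argc'', followed by the ''argv'' pointers, a NULL, the environment pointers and another NULL. The names ''start_args'' and ''parse_initial_stack'' are made up for this example, and the stack is modelled as a plain array of pointer-sized slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for what _start can recover from the stack. */
struct start_args {
        long argc;
        char **argv;
        char **envp;
};

/*
 * sp models the stack pointer on entry to _start: slot 0 holds argc,
 * then come argc argv pointers, a NULL, the envp pointers and a NULL.
 */
struct start_args parse_initial_stack(char **sp)
{
        struct start_args args;

        assert(sp != NULL);
        args.argc = (long)(intptr_t)sp[0];     /* what popl/popq reads */
        args.argv = sp + 1;                    /* stack pointer after the pop */
        args.envp = args.argv + args.argc + 1; /* skip argv[] and its NULL */
        return args;
}
```

For a process started as ''prog --flag'' this gives ''argc == 2'' and ''argv[2] == NULL''. The assembly versions of ''_start'' need only the first two steps, which they do with a single pop and a register move.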
 + 
 +The function calling convention is also described in the ABI docs: [[http://www.sco.com/developers/devspecs/abi386-4.pdf|i386 ABI]] (pages 36-38)
 +and [[http://www.x86-64.org/documentation/abi.pdf|AMD64 ABI]] (pages 15-23). A nice overview is [[http://www.agner.org/optimize/calling_conventions.pdf|calling conventions]].
 + 
 +What we need to do after ''main()'' returns is to call the ''exit'' syscall. Details on that are in the next section.
 +
 +To actually make use of all that information, we first need to learn the GNU Assembler though.
 +I haven't been able to find any particularly good docs on it, and certainly nothing resembling a tutorial.
 +Look at the [[http://sig9.com/articles/att-syntax|quick syntax]] overview, the [[ftp://ftp.estec.esa.nl/pub/ws/wsd/erc32/doc/as.pdf|manual]] and [[http://tigcc.ticalc.org/doc/gnuasm.html|manual2]].
 +
 +The following simplified ''_start''s should make sense now:
 + 
 +''arch/i386/prefix/linuxprefix.S'':
 +<code asm>
 +_start:
 +        xorl    %ebp, %ebp // ABI wants us to zero the base frame
 +
 +        popl    %esi       // save argc
 +        movl    %esp, %edi // save argv
 +
 +        pushl   %edi // argv -> C arg2
 +        pushl   %esi // argc -> C arg1
 +
 +        call    main
 +
 +        movl    %eax, %ebx // rc -> syscall arg1
 +        movl    $__NR_exit, %eax
 +        int     $0x80
 +</code>
 +''arch/x86_64/prefix/linuxprefix.S'':
 +<code asm>
 +_start:
 +        xorq    %rbp, %rbp // ABI wants us to zero the base frame
 +
 +        popq    %rdi       // argc -> C arg1
 +        movq    %rsp, %rsi // argv -> C arg2
 +
 +        andq    $-16, %rsp // ABI wants a 16-byte aligned stack at the call
 +
 +        call    main
 +
 +        movq    %rax, %rdi // rc -> syscall arg1
 +        movq    $__NR_exit, %rax
 +        syscall
 +</code>
 + 
 +=== Syscalls === 
 + 
 +To provide the necessary kernel API (functions declared in ''include/linux_api.h'') we need a way to perform syscalls.
 +
 +A simple way of doing that is implementing our own version of ''int syscall(int number, ...);''
 +as ''long linux_syscall(int number, ...);'' and using that as the building block.
 + 
 +The syscall calling convention differs a bit from the normal function calling convention on both ''i386'' and ''x86_64''.
 +The [[http://www.x86-64.org/documentation/abi.pdf|AMD64 ABI]] (pages 123-124) has an informative section covering that for ''x86_64''.
 +For ''i386'' we can look at [[http://www.cin.ufpe.br/~if817/arquivos/asmtut/index.html#syscalls|i386 syscalls]].
 +
 +With that information we can implement our own ''syscall()''.
 + 
 +''arch/i386/core/linux/linux_syscall.S'':
 +<code asm>
 +linux_syscall:
 +        /* Save registers */
 +        pushl   %ebx
 +        pushl   %esi
 +        pushl   %edi
 +        pushl   %ebp
 +
 +        movl    20(%esp), %eax  // C arg1 -> syscall number
 +        movl    24(%esp), %ebx  // C arg2 -> syscall arg1
 +        movl    28(%esp), %ecx  // C arg3 -> syscall arg2
 +        movl    32(%esp), %edx  // C arg4 -> syscall arg3
 +        movl    36(%esp), %esi  // C arg5 -> syscall arg4
 +        movl    40(%esp), %edi  // C arg6 -> syscall arg5
 +        movl    44(%esp), %ebp  // C arg7 -> syscall arg6
 +
 +        int     $0x80
 +
 +        /* Restore registers */
 +        popl    %ebp
 +        popl    %edi
 +        popl    %esi
 +        popl    %ebx
 +
 +        cmpl    $-4095, %eax
 +        jae     1f
 +        ret
 +
 +1:
 +        negl    %eax
 +        movl    %eax, linux_errno
 +        movl    $-1, %eax
 +        ret
 +</code>
 + 
 +''arch/x86_64/core/linux/linux_syscall.S'':
 +<code asm>
 +linux_syscall:
 +        movq    %rdi, %rax    // C arg1 -> syscall number
 +        movq    %rsi, %rdi    // C arg2 -> syscall arg1
 +        movq    %rdx, %rsi    // C arg3 -> syscall arg2
 +        movq    %rcx, %rdx    // C arg4 -> syscall arg3
 +        movq    %r8, %r10     // C arg5 -> syscall arg4
 +        movq    %r9, %r8      // C arg6 -> syscall arg5
 +        movq    8(%rsp), %r9  // C arg7 -> syscall arg6
 +
 +        syscall
 +
 +        cmpq    $-4095, %rax
 +        jae     1f
 +        ret
 +
 +1:
 +        negq    %rax
 +        movl    %eax, linux_errno
 +        movq    $-1, %rax
 +        ret
 +</code>
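
The error check at the end of both versions can be restated in C. This is only a sketch of the convention (on failure the kernel returns a value in the range -4095..-1, which is negated and stored as the error code). ''decode_syscall_result'' and ''example_errno'' are names invented here, the latter standing in for ''linux_errno''.

```c
#include <assert.h>

/* Hypothetical stand-in for the linux_errno variable used by the assembly. */
int example_errno;

/*
 * raw is the value the kernel left in %eax / %rax. The assembly uses an
 * unsigned comparison (jae) against -4095, so the error range is -4095..-1.
 */
long decode_syscall_result(long raw)
{
        if ((unsigned long)raw >= (unsigned long)-4095L) {
                example_errno = (int)-raw; /* negl / negq */
                return -1;
        }
        return raw;
}

/* Usage sketch: what a caller would observe for a few raw values. */
void decode_examples(void)
{
        assert(decode_syscall_result(3) == 3);         /* e.g. a file descriptor */
        assert(decode_syscall_result(-2) == -1);       /* failure */
        assert(example_errno == 2);
        assert(decode_syscall_result(-4096) == -4096); /* outside the error range */
}
```

The unsigned comparison is what lets large legitimate results, such as high addresses that look negative when read as signed, pass through untouched.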
 + 
 +With that in place we can implement most of the functions as simple wrappers:  
 +<code c>
 +void * linux_mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
 +{
 +        return (void *)linux_syscall(__SYSCALL_mmap, addr, length, prot, flags, fd, offset);
 +}
 +
 +void * linux_mremap(void *old_address, size_t old_size, size_t new_size, int flags)
 +{
 +        return (void *)linux_syscall(__NR_mremap, old_address, old_size, new_size, flags);
 +}
 +</code>
 +Now you can see why our ''​syscall()''​ returns a ''​long''​ instead of an ''​int''​. Otherwise we wouldn'​t be able to return a pointer on ''​x86_64''​.
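
To make the point about the return type concrete, here is a small sketch. ''fake_address_syscall'' and ''low_32_bits'' are hypothetical helpers, not part of gPXE, and the static assertion encodes an assumption that holds for Linux on ''i386'' and ''x86_64'' but not for every platform.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: long is pointer-sized, true for Linux i386 and x86_64. */
_Static_assert(sizeof(long) == sizeof(void *), "long must hold a pointer");

/* Hypothetical stand-in for a syscall that returns an address, like mmap. */
long fake_address_syscall(void *address)
{
        return (long)(intptr_t)address;
}

/* What would survive if the result were squeezed through 32 bits. */
uint32_t low_32_bits(long value)
{
        return (uint32_t)(unsigned long)value;
}

/* Usage sketch: the pointer survives the round trip through long. */
void round_trip_example(void)
{
        static char buffer[16];
        long raw = fake_address_syscall(buffer);

        assert((void *)raw == (void *)buffer);
}
```

On ''x86_64'' an address above 4 GiB would lose its upper half in ''low_32_bits'', which is exactly what a 32-bit ''int'' return type would do to the result of ''linux_mmap''.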
  
 ===== Subsystems =====
