soc:2010:peper:notes:usermode_explained

====== Differences ======
This shows you the differences between two versions of the page.

--- soc:2010:peper:notes:usermode_explained [2010/06/05 15:10]
peper created
+++ soc:2010:peper:notes:usermode_explained [2010/06/14 17:00] (current)
peper
@@ Line 39: / Line 39: @@
 ==== Linking to stdlib (glibc) ====
+**UPDATE**: That approach has been moved to a separate ''linuxlibc'' ''PLATFORM'' and is available on the [[http://git.etherboot.org/?p=people/peper/gpxe.git;a=shortlog;h=refs/heads/linuxlibc|linuxlibc branch]].
 Despite being non-trivial, forcing some compile flags to be disabled (namely ''-mrtd'' and ''-mregparm'' mentioned earlier) and having [[#the_other_problem_with_stdlib|some other problems]], linking to stdlib was still the quickest approach for prototyping.
@@ Line 85: / Line 87: @@
   }
 </code>
+=== Prefix ===
+stdlib's ''_start'' takes care of everything so the prefix code is empty.
 ==== Being self-contained ====
-Work in progress.
+To overcome the problems with linking to stdlib we need to implement some of its elementary features ourselves.
+=== Linker script ===
+A good read for starters is [[http://www.redhat.com/docs/manuals/enterprise/RHEL-4-Manual/gnu-linker/index.html|Using ld, the Gnu Linker]].
+With that background the currently used linker scripts (''arch/*/scripts/*.lds'') should make more sense.
+As we are not going to be linking against stdlib, the linker script should be really simple.
+In fact it turned out that there is already a simple enough linker script used for efi (''arch/x86/scripts/efi.lds'') that can be used more or less out of the box.
+The only necessary modification is setting the start of the text segment properly, because not every value works (you can try ''0x0'' and see).
+We can see what the convention is by looking at how the default linker script does it:
+pass ''--verbose'' to ''ld'' while compiling a simple program in 32-bit and 64-bit mode.
+<code>
+$ gcc -m32 foo.c -o foo -Wl,--verbose
+$ gcc -m64 foo.c -o foo -Wl,--verbose
+</code>
+From that we can gather that ''i386'' uses ''0x08048000'' and ''x86_64'' uses ''0x400000'' as the start address.
+I haven't been able to find a good explanation of why these particular values are used. Moreover, many other values also seem to work.
+Another way of figuring out the specific values is reading [[http://www.sco.com/developers/devspecs/abi386-4.pdf|i386 ABI]] (page 48)
+and [[http://www.x86-64.org/documentation/abi.pdf|AMD64 ABI]] (page 26).
+=== Prefix (_start) ===
+''_start'', being the default ''ENTRY'' point, is the very first thing that's executed when a new process receives control.
+What we want to do in ''_start'' is the minimal work necessary to actually call our ''main()'' function.
+To accomplish that we need to know three things:
+  * What's the state of things when ''_start'' is executed
+  * How to actually call ''main()''
+  * What to do when ''main()'' returns
+The state of the stack and registers at the time of ''_start'' execution is described in
+[[http://www.sco.com/developers/devspecs/abi386-4.pdf|i386 ABI]] (page 54)
+and [[http://www.x86-64.org/documentation/abi.pdf|AMD64 ABI]] (page 28).
+The function calling convention is also described in the ABI docs: [[http://www.sco.com/developers/devspecs/abi386-4.pdf|i386 ABI]] (pages 36-38)
+and [[http://www.x86-64.org/documentation/abi.pdf|AMD64 ABI]] (pages 15-23). A nice overview is [[http://www.agner.org/optimize/calling_conventions.pdf|calling conventions]].
+What we need to do after ''main()'' returns is to call the ''exit'' syscall. Details on that are in the next section.
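To make the entry state more concrete, here is a minimal C sketch of the initial stack layout that both ABI documents describe: the word at the stack pointer holds ''argc'', the ''argv'' pointers follow it and end with a NULL, and the environment pointers come right after. The names ''entry_state'' and ''parse_entry_stack'' are hypothetical and exist only for this illustration; the real prefix does this work in assembly and does not pass the environment on.

```c
#include <stddef.h>

/* Hypothetical view of the initial process stack, for illustration only. */
struct entry_state {
        long   argc;   /* argument count */
        char **argv;   /* argv[0..argc-1], followed by NULL */
        char **envp;   /* environment pointers, followed by NULL */
};

/* sp stands for the value of the stack pointer when _start gets control. */
struct entry_state parse_entry_stack(void **sp)
{
        struct entry_state st;

        st.argc = (long)sp[0];            /* first word: argc */
        st.argv = (char **)&sp[1];        /* the argv array starts right after */
        st.envp = st.argv + st.argc + 1;  /* skip argv and its NULL terminator */
        return st;
}
```

This mirrors what the assembly does with ''pop'' (fetch ''argc'') followed by taking the stack pointer as ''argv''.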
+To make use of all that information, though, we first need to learn the GNU Assembler.
+I haven't been able to find any particularly good docs on it, and certainly nothing resembling a tutorial.
+Look at [[http://sig9.com/articles/att-syntax|quick syntax]], [[ftp://ftp.estec.esa.nl/pub/ws/wsd/erc32/doc/as.pdf|manual]] and [[http://tigcc.ticalc.org/doc/gnuasm.html|manual2]].
+The following simplified ''_start''s should make sense now:
+''arch/i386/prefix/linuxprefix.S'':
+<code asm>
+_start:
+        xorl    %ebp, %ebp // ABI wants us to zero the base frame
+        popl    %esi       // save argc
+        movl    %esp, %edi // save argv
+        pushl   %edi // argv -> C arg2
+        pushl   %esi // argc -> C arg1
+        call    main
+        movl    %eax, %ebx // rc -> syscall arg1
+        movl    $__NR_exit, %eax
+        int     $0x80
+</code>
+''arch/x86_64/prefix/linuxprefix.S'':
+<code asm>
+_start:
+        xorq    %rbp, %rbp // ABI wants us to zero the base frame
+        popq    %rdi       // argc -> C arg1
+        movq    %rsp, %rsi // argv -> C arg2
+        call    main
+        movq    %rax, %rdi // rc -> syscall arg1
+        movq    $__NR_exit, %rax
+        syscall
+</code>
+=== Syscalls ===
+To provide the necessary kernel API (functions declared in ''include/linux_api.h'') we need a way to perform syscalls.
+A simple way of doing that is implementing our own ''int syscall(int number, ...);''
+as ''long linux_syscall(int number, ...);'' and using that as the building block.
+The syscall calling convention is a bit different from the normal function calling convention on both ''i386'' and ''x86_64''.
+The [[http://www.x86-64.org/documentation/abi.pdf|AMD64 ABI]] (pages 123-124) has an informative section covering that for ''x86_64''.
+For ''i386'' we can look at [[http://www.cin.ufpe.br/~if817/arquivos/asmtut/index.html#syscalls|i386 syscalls]].
+With that information we can implement our own ''syscall()''.
+''arch/i386/core/linux/linux_syscall.S'':
+<code asm>
+linux_syscall:
+        /* Save registers */
+        pushl   %ebx
+        pushl   %esi
+        pushl   %edi
+        pushl   %ebp
+        movl    20(%esp), %eax  // C arg1 -> syscall number
+        movl    24(%esp), %ebx  // C arg2 -> syscall arg1
+        movl    28(%esp), %ecx  // C arg3 -> syscall arg2
+        movl    32(%esp), %edx  // C arg4 -> syscall arg3
+        movl    36(%esp), %esi  // C arg5 -> syscall arg4
+        movl    40(%esp), %edi  // C arg6 -> syscall arg5
+        movl    44(%esp), %ebp  // C arg7 -> syscall arg6
+        int     $0x80
+        /* Restore registers */
+        popl    %ebp
+        popl    %edi
+        popl    %esi
+        popl    %ebx
+        cmpl    $-4095, %eax
+        jae     1f
+        ret
+1:
+        negl    %eax
+        movl    %eax, linux_errno
+        movl    $-1, %eax
+        ret
+</code>
+''arch/x86_64/core/linux/linux_syscall.S'':
+<code asm>
+linux_syscall:
+        movq    %rdi, %rax    // C arg1 -> syscall number
+        movq    %rsi, %rdi    // C arg2 -> syscall arg1
+        movq    %rdx, %rsi    // C arg3 -> syscall arg2
+        movq    %rcx, %rdx    // C arg4 -> syscall arg3
+        movq    %r8, %r10     // C arg5 -> syscall arg4
+        movq    %r9, %r8      // C arg6 -> syscall arg5
+        movq    8(%rsp), %r9  // C arg7 -> syscall arg6
+        syscall
+        cmpq    $-4095, %rax
+        jae     1f
+        ret
+1:
+        negq    %rax
+        movl    %eax, linux_errno
+        movq    $-1, %rax
+        ret
+</code>
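The error handling at the end of both assembly routines can be restated in C. This is only a sketch of the same check: a raw kernel return value in the range -4095..-1 is treated as a negated error code, and anything else is passed through unchanged. ''decode_syscall_result'' and ''sketch_errno'' are hypothetical names; ''sketch_errno'' stands in for ''linux_errno''.

```c
/* Stand-in for linux_errno; hypothetical, for illustration only. */
int sketch_errno;

/* Translate a raw kernel return value the way the assembly above does. */
long decode_syscall_result(long raw)
{
        /* The unsigned compare mirrors `cmp $-4095` + `jae`:
         * it is true exactly for raw values in -4095..-1. */
        if ((unsigned long)raw >= (unsigned long)-4095L) {
                sketch_errno = (int)-raw;  /* store the positive error code */
                return -1;
        }
        return raw;                        /* success: pass the value through */
}
```

The unsigned comparison matters because large valid results, such as addresses returned by ''mmap'', must not be mistaken for errors.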
+With that in place we can implement most of the functions as simple wrappers:
+<code c>
+void * linux_mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
+{
+        return (void*)linux_syscall(__SYSCALL_mmap, addr, length, prot, flags, fd, offset);
+}
+void * linux_mremap(void * old_address, size_t old_size, size_t new_size, int flags)
+{
+        return (void*)linux_syscall(__NR_mremap, old_address, old_size, new_size, flags);
+}
+</code>
+Now you can see why our ''syscall()'' returns a ''long'' instead of an ''int''. Otherwise we wouldn't be able to return a pointer on ''x86_64''.
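The point about the return type can be illustrated with a small C sketch. The two helpers below are hypothetical and only model a value passing through ''long'' or ''int'' before being cast back to a pointer; they assume a Linux target where ''long'' is as wide as a pointer.

```c
#include <stdint.h>

/* Models a syscall result carried in a long: the pointer survives intact
 * (assumes long is pointer-sized, as it is on Linux i386 and x86_64). */
void *pointer_via_long(void *p)
{
        return (void *)(long)p;
}

/* Models a syscall result carried in an int: on x86_64 the upper
 * 32 bits are lost, so the result may no longer match the input. */
void *pointer_via_int(void *p)
{
        int narrowed = (int)(intptr_t)p;
        return (void *)(intptr_t)narrowed;
}
```

On ''i386'' both helpers behave the same because ''int'', ''long'' and pointers are all 32 bits wide; the difference only shows up on ''x86_64''.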
 ===== Subsystems =====
