; Copyright 2009 David Elliott. All rights reserved.
;
; License to be determined.
;
; NT-style *LDR file: drwnldr
;
; The NT "LDR" file (NT/2k/XP: NTLDR, Vista/Win7: BOOTMGR) is very much like our boot2 file
; in that it is a flat binary that is entered in real mode at offset 0.  The first difference
; is that NTLDR expects to be loaded at 2000:0h whereas boot2 expects to be loaded at 2000:200h.
; The second difference is that the first 512 bytes of NTLDR contain some special code which
; completes the loading of NTLDR for FAT12 and FAT16 volumes.  For NTFS and FAT32 volumes the
; entirety of NTLDR is loaded to 2000:0h and it expects to be called at offset 0.  For FAT12
; and FAT16 volumes, only the first sector is loaded and a special 2000:3h entrypoint is used.
;
; This actually makes loading fairly simple: we can concatenate this drwnldr0 with the
; real boot (boot2) file, and when booting from NTFS or FAT32 the entire thing gets loaded in
; the correct place because this drwnldr0 is exactly 512 bytes long.
;
; When booting FAT12/FAT16 we have slightly less than 512 bytes in which to load the rest of
; the booter.
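;
; A minimal build sketch of the concatenation described above.  The file names
; (drwnldr0.s for this source, boot2 for an already-built stage 2 binary) are
; assumptions for illustration, not something this file defines:
;
;   nasm -f bin -o drwnldr0 drwnldr0.s
;   cat drwnldr0 boot2 > drwnldr
;
; The resulting drwnldr would then be copied to the volume under the name the
; bootsector looks for.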

struc BPB
    .BytsPerSec     resw 1      ; Sector Size - 16 bits
    .SecPerClus     resb 1      ; Sector Per Cluster - 8 bits
    .RsvdSecCnt     resw 1      ; Reserved Sectors - 16 bits
    .NumFATs        resb 1      ; Number of FATs - 8 bits
    .RootEntCnt     resw 1      ; Root Entries - 16 bits
    .TotSec16       resw 1      ; Number of Sectors - 16 bits
    .Media          resb 1      ; Media - 8 bits - ignored
    .FATSz16        resw 1      ; Sectors Per FAT - 16 bits
    .SecPerTrk      resw 1      ; Sectors Per Track - 16 bits - ignored
    .NumHeads       resw 1      ; Heads - 16 bits - ignored
    .HiddSec        resd 1      ; Hidden Sectors - 32 bits - ignored
    .TotSec32       resd 1      ; Large Sectors - 32 bits
endstruc

struc tARG
    .ReadCluster:   resw 2      ; Far pointer (offset, segment), invoked with call far
    .ReadSectors:   resw 2      ; Far pointer (offset, segment), invoked with call far
    .SectorToRead:  resd 1      ; 32-bit sector number used by ReadSectors
endstruc

struc bss
    .CurrentFatSectorLBA    resd 1  ; LBA of last FAT sector loaded into memory
    .FatMask                resw 1  ; FAT12 = 0xfff, FAT16 = 0xffff (accessed as a word)
endstruc

kFatBufferSeg   EQU 0x1000

; Setup BSS at DS:8000 which should be 0:8000
%define bss_seg ds
_bss            EQU 0x8000

;--------------------------------------------------------------------------
; Start of text segment.

SEGMENT .text

ORG 0                           ; Assume CS = 2000h

start:
    jmp     start_boot2
    times 3-($-$$) nop

; Entry from FAT12/16 bootsector (it is the same bootsector for both)
; Input:
;   BX = Starting cluster (from dir entry) of this file.
;   DL = Drive number
;   DS:SI -> BPB
;   DS:DI -> Argument structure
; When booting from the FAT12/16 disk bootsector the following is true:
;   SS:BP = 0:7C00h = Top of data area, bottom of bootsector
;       The ReadCluster and ReadSectors routines depend on SS:BP being the base pointer
;       of the bootsector.  This allows them to access the bootsector's data area
;       using negative offsets relative to BP.
;
;   DS = 0 = Data segment of bootsector
;       The data segment upon entry is the data segment of the bootsector.
;
;   ES = CS = Our segment
;       We don't depend on it, but because the bootsector has just issued a read of
;       this sector to ES:BX, ES will wind up being CS which is 2000h.
;
; The ReadCluster routine can read AL sectors from cluster DX.
; Unfortunately, it always starts reading at the first sector within the cluster.
; Because the bootsector loaded only the first sector of the boot file (that's us)
; and because there may be more sectors in the first cluster, we must overwrite ourself.
;
; This means that we cannot keep data inside of CS for it will get overwritten.
; For now we use DS:8000 for runtime data under the assumption that this should
; be high enough to avoid overwriting the boot sector code.
;
; A better method might be to allocate some space on the stack but then you run
; into the problem that SP is a moving target in the face of our push/pop and
; BP cannot be used without saving/restoring it across bootsector calls.
fatstart:
    ; Figure out the type of FAT (12 or 16).  We do this using the Microsoft-recommended
    ; method of calculating the LBA start of the data area and subtracting it from the
    ; total sectors count in the BPB.  That gives the number of data sectors which
    ; when divided by sectors per cluster gives us the number of clusters.  There must
    ; be fewer than (2 ** fatbits) - 11 clusters for a given FAT type to be used.

    ; This code is basically lifted from boot1fat32, hence why it does 32-bit arithmetic
    ; when for FAT12/16, 16-bit arithmetic with carry would suffice.
    pushad                      ; Save all the 32-bit registers while we do a bunch of easy 32-bit arithmetic.
    xor     eax, eax
    mov     dword [bss_seg:_bss + bss.CurrentFatSectorLBA], eax ; Zero the current FAT sector LBA so the cache doesn't think we already read it

    movzx   dword eax, word [ds:si + BPB.FATSz16]   ; Load 16-bit number of FAT sectors into 32-bit EAX
    movzx   dword edx, byte [ds:si + BPB.NumFATs]   ; Stick number of FATs in EDX
    mul     dword edx           ; EDX:EAX = EAX * EDX (ignore output EDX)

    ; Add in the reserved sectors
    movzx   dword edx, word [ds:si + BPB.RsvdSecCnt]
    add     eax, edx

    ; EAX is now the partition-relative offset of the root directory.. which we don't care about
    push    dword eax           ; Save off the start of the root dir.
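
    ; Worked example of the whole FAT-type calculation, assuming the usual 1.44 MB
    ; floppy BPB (values chosen for illustration, not taken from this file):
    ;   FATSz16 = 9, NumFATs = 2, RsvdSecCnt = 1  -> root dir starts at 9 * 2 + 1 = 19
    ;   RootEntCnt = 224, BytsPerSec = 512        -> root dir is (224 * 32 + 511) / 512 = 14 sectors
    ;   data area starts at 19 + 14 = 33
    ;   TotSec16 = 2880, SecPerClus = 1           -> (2880 - 33) / 1 = 2847 clusters
    ;   2847 < 0xff5 (4085)                       -> FAT12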

    movzx   dword eax, word [ds:si + BPB.RootEntCnt]    ; EAX = # of root dir entries
    shl     dword eax, 5        ; EAX = # of bytes of root dir (32 bytes per entry)
    xor     dword edx, edx      ; EDX = 0
    movzx   dword ebx, word [ds:si + BPB.BytsPerSec]    ; EBX = BPS
    add     dword eax, ebx
    dec     eax
    div     dword ebx           ; EAX = root dir bytes / BPS (rounded up) = root dir sectors
    pop     dword ebx           ; Restore start of root dir into EBX
    add     dword eax, ebx      ; Add this to EAX
    ; EAX is now the start of the data area

    ; Grab the 16 or 32-bit sector count
    movzx   dword edx, word [ds:si + BPB.TotSec16]
    test    edx, edx
    jnz     .use_total_sec16
    mov     edx, dword [ds:si + BPB.TotSec32]
.use_total_sec16:

    ; Subtract DataOffset from TotalSectors
    neg     eax
    add     eax, edx            ; EAX = (EDX - EAX)

    ; Calculate ClusterCount as DataSectors / BPB_SecPerClus
    ; Remainder is ignored (i.e. any trailing sectors aren't considered)
    movzx   dword ecx, byte [ds:si + BPB.SecPerClus]    ; Grab the BPB_SecPerClus byte into ECX, zero extended
    xor     edx, edx            ; Zero EDX in preparation for divide since it uses EDX:EAX as implicit input
    div     dword ecx           ; EAX = EDX:EAX / ECX

    ; Determine the FAT type.
    ; The algorithm is actually very straightforward when you think about it but even Microsoft's
    ; description of it complicates things by making the numbers seem as if they were arbitrarily
    ; chosen.  So here's the deal:
    ;
    ; One single entry in a FAT table can be a 12-bit value, a 16-bit value, or a 32-bit value.
    ;
    ; Each entry refers to the next cluster number with certain numbers being special:
    ;   * Entry numbers 0 and 1 are reserved and used for various purposes.
    ;   * The last 8 numbers are used to indicate the end of chain.
    ;     That is, for FAT12 0xff8 through 0xfff indicate end of chain, for FAT16 it's
    ;     0xfff8 through 0xffff, and for FAT32 (which would probably more correctly be
    ;     called FAT28) it's 0x0ffffff8 through 0x0fffffff
    ;   * The ninth-from-last number is the bad cluster mark (0xff7, 0xfff7, 0x0ffffff7)
    ;
    ; That means there are 11 numbers that can't be used which means the limit is 2 ** n - 11
    ; Therefore ClusterCount must be LESS THAN that limit.  Not less than or equal, less than.
    cmp     dword eax, 0xff5
    mov     word [bss_seg:_bss + bss.FatMask], 0xfff
    jb      gotmask
    cmp     dword eax, 0xfff5
    mov     word [bss_seg:_bss + bss.FatMask], 0xffff
    jb      gotmask
    ; Fall through and let jnb take us to the error after we pop registers

gotmask:
    popad                       ; Restore all the registers we clobbered doing 32-bit arithmetic (flags are untouched)
    jnb     error_fat_too_large ; This really shouldn't happen, but just in case.

    ; Now that we've got the FAT mask figured out.. let's proceed to actually load something.
    push    word dx             ; Save the drive
    push    cs
    pop     es

    mov     ax, [ds:si + BPB.BytsPerSec]
    shr     ax, 4               ; AX is now paragraphs per sector
    movzx   dx, byte [ds:si + BPB.SecPerClus]
    mul     dx                  ; DX:AX = AX * DX
                                ; AX = paragraphs per cluster
    mov     dx, bx              ; Move the cluster number to DX
    mov     cx, ax              ; Move the paragraphs per cluster to CX

    movzx   ax, byte [ds:si + BPB.SecPerClus]   ; Set AX (actually AL) to the number of sectors per cluster

loop_clusters:
    ; AL = number of sectors to read
    ; DX = cluster number
    ; ES:BX -> buffer
    xor     bx, bx              ; Set BX = 0 so that ES:BX is our buffer
    pusha
    call    far [ds:di + tARG.ReadCluster]
    popa
cant_read_cluster:
    jc      error_cant_read_cluster

    ; Get the next cluster number
    call    next_cluster
    jnb     done_reading

    ; Increment ES (via BX) with CX which is the paragraphs per cluster
    mov     bx, es
    add     bx, cx
    mov     es, bx
    jmp     loop_clusters

done_reading:
    pop     dx                  ; Restore the drive we saved
    jmp     start_boot2

;--------------------------------------------------------------------------
; Error handling

error_fat_too_large:
    mov     si, errorstr_fat_too_large
    jmp     short print_error

error_cant_read_cluster:
    pop     dx                  ; Restore stack (this is only jumped to when DX is pushed onto the stack)
    mov     si, errorstr_cant_read_cluster
    jmp     short print_error

error_read_fat_sector_failed:
    pop     es
    pop     word cx
    pop     word ax             ; Restore stack (this is jumped to when next_cluster fails to read).
    mov     si, errorstr_cant_read_fat
    ;jmp    short print_error

print_error:
    cs lodsb                    ; Strings live in CS, not DS
    mov     bx, 1
    mov     ah, 0x0e
    int     0x10
    test    al, al
    jnz     print_error

    ; Wait for a key then invoke INT 18.
error:
    mov     ah, 0
    int     0x16
    int     0x18
    jmp     short $

;--------------------------------------------------------------------------
; next_cluster
;
; Arguments:
;   DX = Cluster number that was just read
; Returns:
;   DX = Next cluster number to read
;   CF = 0 Returned cluster number is EOC
;      = 1 Returned cluster number is _not_ EOC
;   Rather than thinking of it in terms of CF, use jb/jnb (or jb/jae)
;   because the flags come from comparing DX against the first EOC mark:
;   below means not EOC.
;   Does not return on read error.
; Clobbers:
;   Flags
next_cluster:
    push    word ax
    push    word cx
    xor     cx, cx              ; Zero CX, this is going to be our odd shift number used with FAT12

    ; See if it's FAT12
    cmp     word [bss_seg:_bss + bss.FatMask], 0xfff
    jne     next_cluster_16

next_cluster_12:
    ; For FAT12 the word we need to read is located at 1.5 * cluster
    ; From there, if it's an even cluster we just mask 12 bits
    ; But if it's an odd cluster we have to shift it
    mov     ax, dx              ; AX = DX (Cluster #)
    shr     ax, 1               ; AX = DX/2
    adc     cx, 0               ; CX = CX + 0 + CF.  Basically, if CF then CX = 1, else 0
    add     ax, dx              ; AX = DX/2 + DX
                                ; So now AX is 1.5 times the cluster number
    shl     cx, 2               ; CX *= 4, meaning it's 4 if odd, 0 if even
    xor     dx, dx              ; Zero DX for FAT12.. it's not possible to carry
    jmp     short do_read_fat

next_cluster_16:
    mov     ax, dx              ; AX = cluster number
    xor     dx, dx              ; Start DX at 0.
    shl     ax, 1               ; AX *= 2
    adc     dx, 0               ; If carry, add it to DX which is already zero.

do_read_fat:
    ; DX:AX = 32-bit byte offset of FAT entry
    ; CX = shift (e.g. 4 for odd FAT12)
    div     word [ds:si + BPB.BytsPerSec]   ; AX = DX:AX / BPS (Sector number)
                                            ; DX = DX:AX % BPS (Byte offset within the sector)
    ; NOTE: this cannot overflow because DX is either 0 or 1 and BytsPerSec must be >= 2

    ; Put the sector number into the argument structure
    add     ax, [ds:si + BPB.RsvdSecCnt]    ; AX += RsvdSecCnt (with carry.. although this should RARELY be needed)
    mov     word [ds:di + tARG.SectorToRead + 2], 0 ; Initialize the MSW to 0 (mov leaves CF intact)
    mov     [ds:di + tARG.SectorToRead], ax         ; Set the LSW to the sector number (which includes the reserved count)
    adc     word [ds:di + tARG.SectorToRead + 2], 0 ; Increment the MSW by the carry flag.

    push    es                  ; Save ES.. it needs to keep pointing to the booter
    push    word kFatBufferSeg
    pop     es                  ; ES = kFatBufferSeg

    ; Compare the sector we need to the sector we have in the buffer
    cmp     ax, word [bss_seg:_bss + bss.CurrentFatSectorLBA]
    mov     ax, word [ds:di + tARG.SectorToRead + 2]    ; Stash the MSW into AX for the code below
    jne     do_call_read
    cmp     ax, word [bss_seg:_bss + bss.CurrentFatSectorLBA + 2]
    je      skip_read

do_call_read:
    pusha
    mov     al, 2               ; Read 2 sectors just in case it's FAT12 at offset 511
    xor     bx, bx              ; Zero BX so we're at the start of the buffer.
    call    far [ds:di + tARG.ReadSectors]
    popa
read_failed:
    jc      error_read_fat_sector_failed

    ; Update the last read LBA so we can avoid a read the next time through.
    mov     word [bss_seg:_bss + bss.CurrentFatSectorLBA + 2], ax   ; AX has the MSW of the sector number, write it to the MSW cache marker.
    mov     ax, word [ds:di + tARG.SectorToRead]    ; Assume it didn't clobber!
    ; We didn't save the LSW of the sector number but it should still be in the argument struct
    mov     word [bss_seg:_bss + bss.CurrentFatSectorLBA], ax       ; Write it to the LSW of the cache marker.

skip_read:
    ; We have the sector in the buffer, now DX is the offset we need to read.
    push    word bx
    mov     bx, dx
    mov     dx, [es:bx]         ; Can't address relative to DX, only BX
    ; DX = the 16-bit word from the FAT.
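    ;
    ; Worked FAT12 example of the offset and shift computed above (illustrative
    ; cluster numbers only, assuming the entries fall within the first FAT sector):
    ;   cluster 3 (odd):  byte offset = 3/2 + 3 = 4, shift CX = 4
    ;                     the word at offset 4 holds the entry in its upper 12 bits,
    ;                     so (word >> 4) & 0xfff is the next cluster
    ;   cluster 2 (even): byte offset = 2/2 + 2 = 3, shift CX = 0
    ;                     the word at offset 3 holds the entry in its lower 12 bits,
    ;                     so word & 0xfff is the next cluster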
    pop     word bx
    shr     dx, cl              ; Shift the result by the FAT12 odd shift (if any)
    mov     ax, [bss_seg:_bss + bss.FatMask]    ; Put the FAT mask into AX
    and     dx, ax              ; Mask it off
                                ; DX is now the true cluster number
    sub     ax, 7               ; Turn the mask into the EOC marker
    pop     es                  ; Restore ES
    cmp     dx, ax              ; Compare the cluster number to the EOC marker
    pop     word cx
    pop     word ax             ; Restore saved registers
    ret

;--------------------------------------------------------------------------
; Strings, etc.
errorstr_fat_too_large:
    db 'Only FAT12/16 is supported.', 13, 10, 0
errorstr_cant_read_cluster:
    db 'Read cluster failed', 13, 10, 0
errorstr_cant_read_fat:
    db 'Read FAT sector failed', 13, 10, 0

;--------------------------------------------------------------------------
; Pad to 512 bytes then provide a label indicating the true location of boot2
; It expects to be running at 2000:0200h so this must be 512 bytes.
    times 512-($-$$) db 0
start_boot2: