Sunday 27 May 2018


With my Sinclair QL now armed with a disk interface and an HxC Floppy Emulator, I wanted a simple way to access a Double Density disk image and to add files to it or remove them.

qltools.c can be compiled with gcc on Linux, but I had a feeling I could also work towards understanding the raw image format a bit more deeply, and then write the HxC Floppy Emulator (HFE) files myself.

This is not really a blog post to be read, these are mostly very long "notes to self" about the topics that were needed to achieve what I wanted.

I'll look at:

-The Floppy Emulator 737280 raw byte disk image format and the QL disk ordering
-The Floppy Emulator HFE bit-level disk image format (MFM)
-CRC calculation

Bear in mind I'm not explaining IMGs and HFEs fully, I'm just doing the minimum to manipulate files on a 720k QL DD disk image.

For understanding the QL disk format, the below sites were valuable, but even then I had some head-scratching to do.

About the sector ordering:

Some useful disk terminology from here:

More definite header information from here:

The QLTools source code:

I am looking at a QL Double Density drive raw byte image as a middle-ground for transferring between PC/Linux files and the SD card-friendly HFE image. The image I'm working with has 80 tracks, 9 sectors per track and two sides; with a sector size of 512 bytes that makes 80 * 9 * 2 * 512 = 737280 bytes.

A Block is a central unit here. A Block consists of three sectors, i.e. 512 * 3 = 1536 bytes, and there are 480 of these Blocks on this DD disk.

Two Blocks are already taken up: one by the disk header and the File Allocation Table (FAT) itself, the other by the Directory File.

Each QL stored file takes a minimum of 1 Block of space on the disk, even if I saved just one byte. Also, each QL file is preceded by 64 bytes of data, so a file saved on QL Basic...

SBYTES flp1_test, 196608, 1500

...will take two Blocks, as the true length of the file including the 64 bytes of information will be 1500 + 64 = 1564 bytes, which is more than 1536 and no longer fits into one Block. Additionally, the file takes a 64-byte slot in the already reserved directory list Block.
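The arithmetic above can be checked with a few lines. This is a minimal sketch in Python (chosen only for brevity, it is not part of my Processing code), and the name blocks_needed is made up for the illustration.

```python
# Geometry of the 720k QL DD image as described above.
SECTOR_SIZE = 512
SECTORS_PER_BLOCK = 3
BLOCK_SIZE = SECTOR_SIZE * SECTORS_PER_BLOCK   # 1536 bytes
IMAGE_SIZE = 80 * 9 * 2 * SECTOR_SIZE          # tracks * sectors * sides * bytes
TOTAL_BLOCKS = IMAGE_SIZE // BLOCK_SIZE
FILE_HEADER_SIZE = 64                          # bytes preceding every stored file


def blocks_needed(data_length):
    """Blocks taken by a file of data_length bytes, counting its 64-byte header."""
    total = data_length + FILE_HEADER_SIZE
    return -(-total // BLOCK_SIZE)             # ceiling division


print(IMAGE_SIZE, TOTAL_BLOCKS)   # 737280 480
print(blocks_needed(1500))        # 2
print(blocks_needed(1))           # 1
```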

The raw byte IMG file

I will first look at a stored raw IMG file, converted from a HFE using the HxC floppy emulator utility software.

The raw IMG is exactly 737280 bytes long, so the IMG contains no headers or other data besides the byte representation of the floppy contents. Contrast this with the floppy emulator HFE image, which represents the bits on the actual disk and also has a header that describes the disk format.

For purposes of analyzing the image, I found it useful to display the IMG contents as hex/ASCII with 512-wide rows. The below shows the beginning of the DD image:

(Note that the block/sector order does not continue as neatly)
The only really certain thing is that the first 96 bytes (0x00-0x5F) are the header.

I guess the look-up table inside this area should describe the order of the sectors, but I use a fixed format/order.

Within this fixed format, the first block will contain both the header and the File Allocation Table, beginning from offset 96. The table is 12-bit, although I'm not sure that it corresponds to any FAT-12 standard.

What I call the "directory file header" in the above image is really one file taking up one slot in the directory list file itself.

The FAT contains 480 three-byte records, corresponding with the 480 Blocks that are on the disk of this size. After the FAT the image will contain by default one directory data Block, at block 1, and after this it's just empty blocks to the end of the image.

IMG File Allocation Table

So, in this disk the first block containing the disk header and the FAT information can be stored to a linear buffer by appending 512 (0x200) bytes from 0x0000, 0x600 and 0xC00 offsets each.

The FAT is made of 480 three-byte allocation records, and a record is made of two 12-bit values:

File Number 0x000 to 0xFFF
Sequential Number 0x000 to 0xFFF

Looking at the record as three bytes, it is constructed this way:

Record #n
byte 0: 8 Most significant bits of file# value
byte 1: 4 Least significant bits of file# value, 4 Most significant bits of seq# value
byte 2: 8 Least significant bits of seq# value

An unused (empty, free) record has the values:

0xFD, 0xFF, 0xFF

e.g. File number 1, sequence 0:
0x00, 0x10, 0x00

e.g. File number 1, sequence 1:
0x00, 0x10, 0x01

A first byte of 0xF8 indicates that the block is reserved for the header/FAT. File 0x000 is commonly the directory file. First bytes of 0xFC and 0xFE indicate files pending deletion and bad blocks respectively, but they don't concern me in this setup.

To simplify, a block can be considered to be in use for a normal, healthy file, if the value is below 0xF8. For my purposes it is enough to check the presence of 0xFD, 0xFF, 0xFF for empty blocks.
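The three-byte record layout above can be sketched as a pair of pack/unpack helpers. This is Python rather than my Processing code, and the function names are invented for the example.

```python
FREE_RECORD = bytes([0xFD, 0xFF, 0xFF])


def pack_record(file_no, seq_no):
    """Pack two 12-bit values into the three-byte FAT record layout."""
    return bytes([
        (file_no >> 4) & 0xFF,                              # 8 MSBs of file#
        ((file_no & 0x0F) << 4) | ((seq_no >> 8) & 0x0F),   # 4 LSBs of file#, 4 MSBs of seq#
        seq_no & 0xFF,                                      # 8 LSBs of seq#
    ])


def unpack_record(rec):
    """Return (file_no, seq_no) from a three-byte FAT record."""
    file_no = (rec[0] << 4) | (rec[1] >> 4)
    seq_no = ((rec[1] & 0x0F) << 8) | rec[2]
    return file_no, seq_no


def is_free(rec):
    """True if the record carries the 'empty block' pattern FD FF FF."""
    return bytes(rec[:3]) == FREE_RECORD


print(pack_record(1, 1).hex())   # 001001
```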

The record's location within the FAT indicates the Block number this entry is pointing to. So, counting from zero, record #3 is "about" Block #3.

Files larger than 1536 bytes (including the 64-byte header) use multiple Blocks, which are indicated by the same file number and an increasing sequence count. If there are only small files on the disk, the sequence numbers will always be zero. For one giant file, there is only one repeated file number whereas the sequence number increases.

An example portion of the FAT might look like this:

...FDFFFF FDFFFF 001000 002000 003000 003001 003002 FDFFFF FDFFFF...

The above describes three files between empty records (FDFFFF). The first two, 1 and 2, are one-block files. The third, 3, has three blocks, indicated by the same file number and an increasing sequence count. The directory file would have 3 named entries.

Although normal file writing places the Blocks after each other, nothing says the sequences have to be in any particular order. So, a routine that reads a file from the image has to go through the whole FAT, picking up all data in order of the sequence indicator.
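Going through the whole FAT and ordering by sequence number might look like the following. It is a Python sketch with a made-up, deliberately shuffled FAT fragment, not data from a real disk.

```python
def unpack_record(rec):
    """Return (file_no, seq_no) from a three-byte FAT record."""
    return (rec[0] << 4) | (rec[1] >> 4), ((rec[1] & 0x0F) << 8) | rec[2]


def blocks_of_file(fat, file_no):
    """Block numbers belonging to file_no, ordered by sequence number."""
    found = []
    for block in range(len(fat) // 3):
        number, seq = unpack_record(fat[block * 3:block * 3 + 3])
        if number == file_no:
            found.append((seq, block))
    return [block for seq, block in sorted(found)]


# Hypothetical FAT start: map, directory, free, then file 3 stored out of order.
fat = bytes.fromhex("F80000" "000000" "FDFFFF" "003001" "001000" "003000" "003002")
print(blocks_of_file(fat, 3))   # [5, 3, 6]
print(blocks_of_file(fat, 0))   # [1]
```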

Just to repeat myself, the FAT looks like this in an empty image:

Block/Entry#, 12-bit file#,  12-bit seq#

0: 0xF8, 0x00, 0x00 (Block #0 contains the FAT)
1: 0x00, 0x00, 0x00 file/seq points to block #1 - Directory
2: 0xFD, 0xFF, 0xFF file/seq points to block #2 - (Earmarked for directory?)
3: 0xFD, 0xFF, 0xFF file/seq points to block #3 - (Earmarked for directory?)
4: 0xFD, 0xFF, 0xFF file/seq points to block #4 - (Earmarked for directory?)
5: 0xFD, 0xFF, 0xFF file/seq points to block #5 - (Earmarked for directory?)
6: 0xFD, 0xFF, 0xFF file/seq points to block #6
7: 0xFD, 0xFF, 0xFF file/seq points to block #7...
479: 0xFD, 0xFF, 0xFF file/seq points to block #479

Interestingly, my real QL seems to save files starting from entry 6 onwards, whereas the QLtools utility seemed to add a file to the first available position whatever, which is 2.

Perhaps it's my fantasy, but records 2-5 might be seen as "earmarked" for directory, although I'm unsure if it would matter that much speed-wise or anything. (It might even slow down the disk, who knows)

Again, a directory reading routine can assume the first directory block to be 1, but theoretically the rest can be anywhere else. It might be better for a reading routine to take its cue from the FAT, looking for file #0 components.

IMG Directory Block

Looking at the first directory block, a row of 0x30 values covers the first directory entry (the entry is the directory itself). After that the directory entries are assigned to files added to the disk.

I'm told the file name and file name length information within the file's own header is useless, and the file names within the directory file are definitive. The directory is also the only place to get the true length of the file in bytes.
Note that it's pointless to have a 16-bit value for filename length, but some documents have it this way.

Given the first entry in the directory list is already in use, 23 files can be added before new directory blocks are needed. For example, if five blocks are reserved for directory, this would mean the directory takes 5 * 1536 bytes = 7680 bytes. The directory contains file information and each file takes 64 bytes. This would give 119 files on the DD disk before yet another block would need adding.
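The capacity figures above follow from 1536 / 64 = 24 entries per directory Block, minus the one entry the directory uses for itself. A tiny Python check, with a function name invented for the purpose:

```python
BLOCK_SIZE = 1536
ENTRY_SIZE = 64


def files_before_growth(directory_blocks):
    """Files that fit before the directory needs yet another Block."""
    entries = directory_blocks * BLOCK_SIZE // ENTRY_SIZE
    return entries - 1          # entry 0 describes the directory itself


print(files_before_growth(1))   # 23
print(files_before_growth(5))   # 119
```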

IMG Block and track order

My first diagram about the disk image beginning is misleading, as it looks like the blocks and sectors are interleaved in a clear way up until the end of the disk.

Although int(block/6) gives the Track# you can expect to find the Block in, the Blocks are not stored in order, and neither are the three 512-byte sectors inside a Block stored sequentially.

Ok, so let's look at this thing. Again, the image file is arranged in 480 * 3 = 1440 sectors, including the FAT.

As my first diagram shows, if a row is 512 bytes, the FAT appears on rows 0, 3 and 6. The directory contents start at row 9 of the image, beginning with sixty-four 0x30 byte-values. The next row containing directory data is row 12, and the third is row 15.

How to make sense of the arrangement? Here the soulsphere site comes to the rescue, but it had to be read in a certain way before it made sense.

Let's find the directory. The first directory block is the second entry in the FAT, so it is Block #1, which falls within Track #0 (which has Blocks 0-5).

Track #0 contents are in the following order:

Offset: Block, Sector
0: B0S0 (each of these are 512 bytes)
1: B2S0
2: B4S0
3: B0S1
4: B2S1
5: B4S1
6: B0S2
7: B2S2
8: B4S2
9: B1S0 first
10: B3S0
11: B5S0
12: B1S1 second
13: B3S1
14: B5S1
15: B1S2 third
16: B3S2
17: B5S2

We're looking for Block #1, so I've highlighted the three sectors it is made of. Looking at the serial ordering of the table, the Block #1 sectors 0, 1 and 2 are at offsets 9, 12 and 15. Glancing at the 512-wide arranged diagram, it can be seen the directory contents are indeed at locations 9 * 512, 12 * 512 and 15 * 512.

Let's look at the first saved file on the disk. In this example, looking through the FAT, it is at entry #6, thus pointing at Block #6. (In my examples the files fit in one block).

The Block 6 can be deciphered by looking at track #1, where the ordering is different from track #0. The sectors 0, 1 and 2 are at offsets 23, 26 and 20 (counting from track #0 beginning), so they are a bit backwards compared to what was seen on track #0.

Offset: Block, Sector
18: B8S1
19: B10S1
20: B6S2 third
21: B8S2
22: B10S2
23: B6S0 first
24: B8S0
25: B10S0
26: B6S1 second
27: B9S1
28: B11S1
29: B7S2
30: B9S2
31: B11S2
32: B7S0
33: B9S0
34: B11S0
35: B7S1

So, the routine that constructs a multi-block file into a linear buffer has to go through the FAT in file sequence number order, collect and append the sector 0,1,2 data for each block in that order.
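The two track tables above can be reproduced from the 18-byte logical-to-physical table in the disk header (0x28-0x39, listed further down) together with the "sector offset / track" value of 5. The Python sketch below is my reading of those fields, not a documented algorithm: it matches every row listed for tracks #0 and #1, but that the rotation keeps growing by 5 per track (modulo 9) on later tracks is an assumption. The name image_row is made up.

```python
# Logical-to-physical table as found at 0x28-0x39 of the header in this post.
# Assumed meaning: bit 7 set = side 1, low bits = physical sector on that side.
LOG_TO_PHYS = [0x00, 0x03, 0x06, 0x80, 0x83, 0x86,
               0x01, 0x04, 0x07, 0x81, 0x84, 0x87,
               0x02, 0x05, 0x08, 0x82, 0x85, 0x88]
SECTORS_PER_SIDE = 9
SECTOR_OFFSET_PER_TRACK = 5     # header bytes 0x26-0x27


def image_row(block, sector):
    """512-byte row in the IMG holding sector (0-2) of the given Block."""
    track, block_in_track = divmod(block, 6)
    entry = LOG_TO_PHYS[block_in_track * 3 + sector]
    side = entry >> 7
    phys = entry & 0x7F
    # Assumption: each track is rotated 5 sectors further than the previous one.
    pos = (phys + SECTOR_OFFSET_PER_TRACK * track) % SECTORS_PER_SIDE
    return track * 2 * SECTORS_PER_SIDE + side * SECTORS_PER_SIDE + pos


print([image_row(1, s) for s in range(3)])   # [9, 12, 15]
print([image_row(6, s) for s in range(3)])   # [23, 26, 20]
```

The byte address in the IMG is then image_row(block, sector) * 512.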

Building IMG Disk header and FAT

The 96-byte header (0x00-0x5F) is at the beginning of the image.

The directory length (0x22-0x25) is the 16-bit value at 0x22 - 0x23 multiplied by 512, and added with the 16-bit value at 0x24 - 0x25 multiplied by 64. (Each file entry takes 64 bytes in the directory, including the directory file). This can be recalculated by scanning the FAT for unique file numbers, including the directory.

Free sectors at (0x14-0x15) can be recalculated by counting which of the 480 blocks are unused, and multiplying that by 3.
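Recounting the free sectors could be sketched like this in Python. The header and FAT are assumed to be held in separate linear buffers, and the HI/LO byte order follows the header listing below; the function names are hypothetical.

```python
FREE_RECORD = b"\xFD\xFF\xFF"


def count_free_sectors(fat):
    """Number of free sectors: unused Blocks in the FAT times three."""
    free_blocks = sum(
        1 for block in range(len(fat) // 3)
        if bytes(fat[block * 3:block * 3 + 3]) == FREE_RECORD
    )
    return free_blocks * 3


def store_free_sectors(header, fat):
    """Write the recounted value to header bytes 0x14 (HI) and 0x15 (LO)."""
    free = count_free_sectors(fat)
    header[0x14] = (free >> 8) & 0xFF
    header[0x15] = free & 0xFF


# Empty disk: Block 0 is the map, Block 1 the directory, 478 Blocks are free.
fat = bytearray(b"\xF8\x00\x00" + b"\x00\x00\x00" + FREE_RECORD * 478)
header = bytearray(96)
store_free_sectors(header, fat)
print(count_free_sectors(fat))   # 1434
```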

The whole header is described below. I highlighted with green the locations that need changing in the header, when adding or removing blocks to the FAT. It may be a good idea to increment the update counter and randomize the values at 0x0E and 0x0F, otherwise the QL might think in some situations that the disk has not been altered and refuse to display an updated directory.

Disk ID and filename (0x00-0x0D)

0x00: 0x51 'Q' Header
0x01: 0x4c 'L' Header
0x02: 0x35 '5' Header
0x03: 0x41 'A' Header
0x04: 0x51 'Q' Label
0x05: 0x4c 'L' Label
0x06: 0x5f '_' Label
0x07: 0x44 'D' Label
0x08: 0x44 'D' Label
0x09: 0x20 ' ' Label
0x0A: 0x20 ' ' Label
0x0B: 0x20 ' ' Label
0x0C: 0x20 ' ' Label
0x0D: 0x20 ' ' Label

Disk information

0x0E: 0x92: Random value
0x0F: 0x53: Random value
0x10: 0x00: HI Update counter
0x11: 0x00: .. Update counter
0x12: 0x00: .. Update counter
0x13: 0x02: LO Update counter
0x14: 0x00: HI Free Sectors
0x15: 0x00: LO Free Sectors
0x16: 0x05: HI Good Sectors
0x17: 0xA0: LO Good Sectors = 1440
0x18: 0x05: HI Total Sectors
0x19: 0xA0: LO Total Sectors = 1440
0x1A: 0x00: HI Sectors per track
0x1B: 0x09: LO Sectors per track = 9
0x1C: 0x00: HI Sectors per cylinder
0x1D: 0x12: LO Sectors per cylinder = 18 (Double-sided)
0x1E: 0x00: HI Number of cylinders
0x1F: 0x50: LO Number of cylinders = 80
0x20: 0x00: HI Allocation block
0x21: 0x03: LO Allocation block = sectors / block = 3
0x22: 0x00: HI Directory Length: sectors (512)
0x23: 0x01: LO Directory Length: sectors (512)
0x24: 0x01: HI Directory Length: units of 64
0x25: 0x00: LO Directory Length:  units of 64
0x26: 0x00: HI Sector offset / track
0x27: 0x05: LO Sector offset / track

Logical-to-physical sector mapping table (18 bytes)

0x28: 0x00
0x29: 0x03
0x2A: 0x06
0x2B: 0x80
0x2C: 0x83
0x2D: 0x86
0x2E: 0x01
0x2F: 0x04
0x30: 0x07
0x31: 0x81
0x32: 0x84
0x33: 0x87
0x34: 0x02
0x35: 0x05
0x36: 0x08
0x37: 0x82
0x38: 0x85
0x39: 0x88

Physical-to-Logical sector mapping table (18 bytes)

0x3A: 0x00
0x3B: 0x06
0x3C: 0x0C
0x3D: 0x01
0x3E: 0x07
0x3F: 0x0D
0x40: 0x02
0x41: 0x08
0x42: 0x0E
0x43: 0x03
0x44: 0x09
0x45: 0x0F
0x46: 0x04
0x47: 0x0A
0x48: 0x10
0x49: 0x05
0x4A: 0x0B
0x4B: 0x11

A bunch of 0xFFs

0x4C: 0xFF
0x4D: 0xFF
0x4E: 0xFF
0x4F: 0xFF
0x50: 0xFF
0x51: 0xFF
0x52: 0xFF
0x53: 0xFF
0x54: 0xFF
0x55: 0xFF
0x56: 0xFF
0x57: 0xFF
0x58: 0xFF
0x59: 0xFF
0x5A: 0xFF
0x5B: 0xFF
0x5C: 0xFF
0x5D: 0xFF
0x5E: 0xFF
0x5F: 0xFF

The above was the header, and what follows is the FAT. I'll just give the start of the FAT, which is spread across the three sectors of block #0 (480 * 3 = 1440 bytes).

0x60: 0xF8 Block #0, Points to FAT itself
0x61: 0x00
0x62: 0x00
0x63: 0x00 Block #1, Directory
0x64: 0x00
0x65: 0x00
0x66: 0xFD Block #2, Free
0x67: 0xFF
0x68: 0xFF
0x69: 0xFD Block #3, Free
0x6A: 0xFF
0x6B: 0xFF

... the FDFFFF repeated to the end of the first block.

After the first block the IMG can be filled with zeroes, and there you have it, an empty, formatted disk image in IMG format.
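Putting Block #0 together can be sketched as below, in Python. The 96-byte header is taken as given (its fields are listed above), and only the FAT and the scattering to rows 0, 3 and 6 are shown. Conveniently, 96 header bytes plus 480 * 3 FAT bytes is exactly one 1536-byte Block. The function names are made up.

```python
def empty_fat(total_blocks=480):
    """FAT of an empty disk: map Block, directory Block, the rest free."""
    fat = bytearray(b"\xFD\xFF\xFF" * total_blocks)
    fat[0:3] = b"\xF8\x00\x00"      # Block #0: header/FAT
    fat[3:6] = b"\x00\x00\x00"      # Block #1: directory, file 0, sequence 0
    return fat


def write_block0(image, header, fat):
    """Scatter header + FAT over the three sectors of Block #0."""
    block0 = bytes(header) + bytes(fat)
    if len(block0) != 1536:
        raise ValueError("header + FAT must fill exactly one Block")
    # Block #0 sectors 0, 1, 2 sit at rows 0, 3 and 6 of the image.
    for sector, offset in enumerate((0x000, 0x600, 0xC00)):
        image[offset:offset + 512] = block0[sector * 512:(sector + 1) * 512]


image = bytearray(737280)           # zero-filled image
header = bytearray(96)              # placeholder, a real header goes here
header[0:4] = b"QL5A"
write_block0(image, header, empty_fat())
print(image[0x60:0x69].hex())       # f80000000000fdffff
```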

Reading from the IMG

It would be handy to find a file from the IMG based on its name, and then store it in pc memory or disk.

First, decipher the directory, because it's the way to look at file names. In this disk image it occupies only FAT record #1, but a directory can use more records than that one, so look for any FAT entries that have the file number 0 and then append the blocks (sector contents in sequential order).

File blocks, including directory blocks, may not always follow each other.

Now that the directory is accessible in a linear way, it can be read:

entrybase = start + 64 * file_no

0-3: Length of file
14-15: Length of file name
16-51: File name
52-55: Modification date (I don't use this)

(Just as in that diagram way up.)

The directory is looped through with file_no * 64. When the filename corresponds with the desired name, the needed file_no has been acquired.
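Looping through the linearized directory might look like this in Python. The field offsets are the ones listed above; reading the multi-byte values as big-endian is my assumption (the header stores HI before LO), and the function names are invented.

```python
import struct


def read_entry(directory, file_no):
    """Return (name, length) from the 64-byte directory slot of file_no."""
    base = 64 * file_no
    entry = bytes(directory[base:base + 64])
    (length,) = struct.unpack(">I", entry[0:4])        # bytes 0-3, assumed big-endian
    (name_len,) = struct.unpack(">H", entry[14:16])    # bytes 14-15
    name_len = min(name_len, 36)                       # name field is bytes 16-51
    name = entry[16:16 + name_len].decode("ascii", errors="replace")
    return name, length


def find_file(directory, wanted):
    """Return (file_no, length) for the named file, or None if absent."""
    for file_no in range(1, len(directory) // 64):     # slot 0 is the directory itself
        name, length = read_entry(directory, file_no)
        if name == wanted:
            return file_no, length
    return None


# Hypothetical directory with one file called "test" in slot 1.
directory = bytearray(128)
struct.pack_into(">I", directory, 64, 1564)
struct.pack_into(">H", directory, 64 + 14, 4)
directory[64 + 16:64 + 20] = b"test"
print(find_file(directory, "test"))   # (1, 1564)
```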

Then the FAT is again looked through for blocks that have that file_no, and all sequential blocks belonging to file are appended in order to create a linear file, just like with the directory. The directory is a file just like any other so one can proceed by loading the file# 0 to a linear file buffer, then use that to load the desired file to another linear buffer.

From that linear file, file_length-64 bytes are then stored to PC disk, beginning from offset 64, in case the 64-byte header is not needed.

When loading any file, directory or otherwise, as the blocks are found, the true sector offset within the image needs to be deciphered.

The row is taken from a table/algorithm that "converts" blocks and sectors into the funny order they are in on the disk and provides a simple table from which to fetch the order. Instead of explaining this I'll just show the Processing routine I use to generate the ordered translation tables:

int []g_transblok=new int[1440];
int []g_transtrak=new int[1440];

void track_offset_generator() {
  int tphase[]={0,0,0,1,1,1,2,2,2,0,0,0,1,1,1,2,2,2,0,0,0,1,1,1,2,2,2};
  int bphase[]={0,2,4,0,2,4,0,2,4,0,2,4,0,2,4,0,2,4,0,2,4,0,2,4,0,2,4};
  int ptr=9;
  int count=0;

  for(int trk=0;trk<80;trk++){
    for(int sec=0;sec<9;sec++){

If I want to know the offset row in the disk image for block x and sector y, I'll loop through the tables until I have the pair. The result is that table position. Multiply by 512 and there's the address from the beginning of the IMG file.

int findoffset(int block,int sector) {
  // Scan the translation tables for the row holding this block/sector pair
  for(int i=0;i<1440;i++){
    if(g_transblok[i]==block&&g_transtrak[i]==sector)return i;
  }
  return 0; // pair not found (note that 0 is also a valid row)
}

If I want to know what the block and sector are at disk image row n, I just grab the two values g_transblok[n] and g_transtrak[n].

Writing to the IMG

The end game of this Tour de Force, adding a file to the image:


Find the highest file number in the FAT. Your new file has file_no of that +1

Are there enough empty blocks in the FAT for the file? If No, go to ERROR

Is a new directory block needed? If No, go to APPEND

Are there enough empty slots in the FAT for the file + the new directory block? If No, go to ERROR

Extend the directory list file by creating a new FAT entry for file #0 in an empty slot

Give it a sequence number+1 compared to the highest existing block with file #0

APPEND:

Append the file data to a 64-byte header, giving the complete file data and the length of the file in bytes

Write the file name and file length to the directory slot file_no*64

For every 1536-byte file block:

  • Write the file number and sequence number (starting from 0) to an empty FAT slot, the block# is the number of the first empty slot.
  • Write the three sectors of data to the image, offset row * 512, the row is based on the assigned block# and sector (0-2) values, through a conversion table.
  • Increase sequence number, for subsequent blocks (if any)

Refresh the value for number of free sectors in the header
Refresh the values for the directory list size in the header
Randomize disk Random # value
Increment the disk update counter [END]

ERROR:

Can't add the file! [END]
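The FAT part of the procedure above could be sketched like this in Python. It only covers choosing the file number and claiming FAT records; growing the directory, writing the sector data and refreshing the header are left out, and the function name is hypothetical.

```python
BLOCK_SIZE = 1536
FILE_HEADER_SIZE = 64
FREE_RECORD = b"\xFD\xFF\xFF"


def add_file_to_fat(fat, data_length):
    """Claim FAT records for a new file; fat is a bytearray of 3-byte records."""
    needed = -(-(data_length + FILE_HEADER_SIZE) // BLOCK_SIZE)
    records = [bytes(fat[i:i + 3]) for i in range(0, len(fat), 3)]
    free_blocks = [n for n, rec in enumerate(records) if rec == FREE_RECORD]
    if len(free_blocks) < needed:
        raise ValueError("Can't add the file!")
    # File numbers of Blocks in normal use (first byte below 0xF8).
    in_use = [(rec[0] << 4) | (rec[1] >> 4) for rec in records if rec[0] < 0xF8]
    file_no = max(in_use, default=0) + 1
    chosen = free_blocks[:needed]
    for seq_no, block in enumerate(chosen):
        fat[block * 3:block * 3 + 3] = bytes([
            (file_no >> 4) & 0xFF,
            ((file_no & 0x0F) << 4) | ((seq_no >> 8) & 0x0F),
            seq_no & 0xFF,
        ])
    return file_no, chosen


fat = bytearray(b"\xF8\x00\x00" + b"\x00\x00\x00" + FREE_RECORD * 478)
print(add_file_to_fat(fat, 1500))   # (1, [2, 3])
```

Like QLtools, this takes the first free records it finds; a real QL seemed to start from entry 6.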

Into the HFE

Although the IMGs can already be converted to HFE with the HxC floppy emulator tools, as I had gone this far I wanted to get the IMG converted to HFE by my own means. This isn't as comprehensive, as I'm working with an already-formatted and HxC-converted file rather than creating a HFE totally from scratch.

The HFE file image for the QLDD is 2008064 bytes, which already clearly shows it is not a byte representation of the IMG.

Also, it has a 1024 byte file header as byte/word values, but the rest is bit-level information encoded in MFM (Modified Frequency Modulation) format suitable for the HxC to communicate with the disk controllers. MFM, the "Double Density", is a way to compactly encode the data and the clock signal into a waveform without wasting too much space/time.

Luckily the waveform generation stuff is not needed here, it suffices to know that information/data bits are interleaved with an encoding signal. I found it useful to have routines that access the HFE track bits directly.

Still, it is not entirely straightforward to transfer the IMG to an HFE, but it is not that difficult after understanding a few things. It also makes sense to approach the HFE only after doing the analysis on raw byte images discussed above. Also, writing data on an existing HFE ("formatted") is a bit simpler than building one from scratch.

Again, visualizing the HFE tracks in Processing was very helpful:

Before the sector bit data, the HFE file has a byte-readable HxC header. The most important information here is the byte offset to each of the 80 tracks' starting positions inside the HFE file:

base address: 1024 + track * 4
0: track offset LO
1: track offset HI (* 512)
2: track length LO (bytes)
3: track length HI (bytes)

(* 512 = multiply the 16-bit track offset value with 512 to get the offset from the beginning of the HFE file)
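Reading one entry of that table might be sketched as follows in Python. The base address is kept as a parameter with the value from my notes above as the default; the function name is made up and the sample bytes are invented, not taken from a real HFE.

```python
def track_entry(hfe, track, table_base=1024):
    """Return (byte offset in the HFE file, track length in bytes) for a track."""
    base = table_base + track * 4
    offset_units = hfe[base] | (hfe[base + 1] << 8)     # LO, HI
    length = hfe[base + 2] | (hfe[base + 3] << 8)       # LO, HI
    return offset_units * 512, length


# Invented example: track 1 starting at unit 0x0033, 0x61D0 bytes long.
hfe = bytearray(2048)
hfe[1024 + 4:1024 + 8] = bytes([0x33, 0x00, 0xD0, 0x61])
print(track_entry(hfe, 1))   # (26112, 25040)
```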

From the start of each track offset, reading every second bit after the first one gives the data bits for the track. Every bit in-between relates to the encoding. When reading the data, these are not really needed. Writing the data, these have to be coded to correspond with the data bits.

One header-related structuring still remains: within the sector data, disk side information is interleaved so that the first 256 bytes relate to side 0, the next 256 bytes to side 1 and so on. To make things simpler I make the whole HFE data linear before working on the image, then de-linearize it before saving it.
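The linearize/de-linearize step could look like this in Python. The sketch assumes the track data length is a multiple of 512, and both function names are invented.

```python
def split_sides(track_data):
    """Separate interleaved track data into (side 0, side 1) byte strings."""
    side0, side1 = bytearray(), bytearray()
    for pos in range(0, len(track_data), 512):
        side0 += track_data[pos:pos + 256]          # first 256 bytes: side 0
        side1 += track_data[pos + 256:pos + 512]    # next 256 bytes: side 1
    return bytes(side0), bytes(side1)


def join_sides(side0, side1):
    """Re-interleave two linear sides in 256-byte chunks before saving."""
    out = bytearray()
    for pos in range(0, len(side0), 256):
        out += side0[pos:pos + 256]
        out += side1[pos:pos + 256]
    return bytes(out)


data = b"\x00" * 256 + b"\x01" * 256 + b"\x02" * 256 + b"\x03" * 256
side0, side1 = split_sides(data)
print(side0[0], side0[256], side1[0], side1[256])   # 0 2 1 3
```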

Now, supposing we've found the start of the disk data (more on that below) that corresponds with the header start in the raw byte image:

Q        L        5        A        Q        L        _        D        D

The bits have been broken into the data bits (above) and the encoding bits below. The data bits are obviously good ol' ASCII here, and the encoding scheme is not too hard to see:

-Start with a zero encoding bit.
-Each time there is a zero in the corresponding data bit, and the previous data bit is zero too, switch the encoding bit to one
-Keep the value 'one' until the data bit is one, at which position change the encoding bit to zero

To read the byte at a given offset from the beginning of the track, the position in the track data is offset*2, as every data bit is paired with an encoding bit. Then take 8 bits, hopping over every second bit, and build the byte that way.

Again, reading the encoding bits is in no way needed for reading the HFE image. Writing the encoding is necessary when converting an IMG to the HFE.
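The encoding rule boils down to: the encoding bit is one only when both the previous and the current data bit are zero. Below is a Python sketch working on plain lists of bits, so it says nothing about how the bits are packed into the HFE file's bytes. The default prev_data_bit=1 reproduces "start with a zero encoding bit", and the anomalous sync marks are not handled. All names are made up.

```python
def byte_to_bits(value):
    """Bits of a byte, most significant first."""
    return [(value >> shift) & 1 for shift in range(7, -1, -1)]


def bits_to_byte(bits):
    """Inverse of byte_to_bits."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def mfm_encode(data_bits, prev_data_bit=1):
    """Interleave encoding bits with data bits: [enc, data, enc, data, ...]."""
    cells = []
    prev = prev_data_bit
    for bit in data_bits:
        encoding = 1 if (prev == 0 and bit == 0) else 0
        cells += [encoding, bit]
        prev = bit
    return cells


def mfm_decode(cells):
    """Data bits are every second bit, starting after the first one."""
    return cells[1::2]


cells = mfm_encode(byte_to_bits(ord("Q")))
print(cells[0::2])                          # [0, 0, 0, 0, 0, 1, 1, 0]
print(chr(bits_to_byte(mfm_decode(cells)))) # Q
```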

Although I'm working on the same image all the time, let's remember the start point of the disk is not in any absolute location inside the track #0 data. It is better to seek the sector identifiers and the data blocks through the ID markers spread across the disk.

The information about disk data is in areas indicated by A1A1A1. The A1A1A1FB appears after A1A1A1FE, which is the track identifier.

SYNC byte, before Index Address Mark or ID Address Mark

SYNC byte, before Data Address Mark

The corresponding sync byte clock signals are not 'normal', and help make detecting the marks less ambiguous. This Atari ST site cleared a lot of things for me.

Inside one track, these can be found:

A1,A1,A1,FE, trk#,00,01,02,CRC,CRC
A1,A1,A1,FE, trk#,01,01,02,CRC,CRC
A1,A1,A1,FE, trk#,00,02,02,CRC,CRC

A1,A1,A1,FE, trk#,01,02,02,CRC,CRC
A1,A1,A1,FE, trk#,00,03,02,CRC,CRC
A1,A1,A1,FE, trk#,01,03,02,CRC,CRC

A1,A1,A1,FE, trk#,00,04,02,CRC,CRC
A1,A1,A1,FE, trk#,01,04,02,CRC,CRC
A1,A1,A1,FE, trk#,00,05,02,CRC,CRC

A1,A1,A1,FE, trk#,01,05,02,CRC,CRC
A1,A1,A1,FE, trk#,00,06,02,CRC,CRC
A1,A1,A1,FE, trk#,01,06,02,CRC,CRC

A1,A1,A1,FE, trk#,00,07,02,CRC,CRC
A1,A1,A1,FE, trk#,01,07,02,CRC,CRC
A1,A1,A1,FE, trk#,00,08,02,CRC,CRC

A1,A1,A1,FE, trk#,01,08,02,CRC,CRC
A1,A1,A1,FE, trk#,00,09,02,CRC,CRC
A1,A1,A1,FE, trk#,01,09,02,CRC,CRC

Look familiar? They are the sector IDs for the six Blocks within a Track.
Each sector ID is soon followed by a Data ID, which is then followed with 512 bytes.

A1,A1,A1,FB,[512 bytes of data],CRC,CRC
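Seeking the sector IDs could be sketched like this in Python. It assumes the track has already been decoded to data bytes that happen to be byte-aligned with the marks, which is a simplification (a real search may have to work at bit level). The function name and the sample data are invented.

```python
ID_MARK = bytes([0xA1, 0xA1, 0xA1, 0xFE])


def find_sector_ids(decoded):
    """List (position, track, side, sector, size code, crc) for each sector ID."""
    ids = []
    pos = decoded.find(ID_MARK)
    while pos != -1 and pos + 10 <= len(decoded):
        track, side, sector, size_code = decoded[pos + 4:pos + 8]
        crc = (decoded[pos + 8] << 8) | decoded[pos + 9]
        ids.append((pos, track, side, sector, size_code, crc))
        pos = decoded.find(ID_MARK, pos + 4)
    return ids


# Invented fragment: filler bytes around two sector IDs.
sample = (b"\x4E" * 5 + bytes.fromhex("A1A1A1FE00000102CA6F")
          + b"\x4E" * 3 + bytes.fromhex("A1A1A1FE000101021234"))
for found in find_sector_ids(sample):
    print(found)
```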

But now we also have something that wasn't present in the raw byte disk image at all: CRC values.


So, the HFE format project led me to the world of calculating Cyclic Redundancy Check values, which I found to be a huge topic in itself, so I won't discuss it too much here.

This site opened up to me the "easy" way to calculate the results and also warned about erroneous calculations. One problem with CRC is that an incomplete implementation may result in correct CRC in some cases, something which led me astray for a while.

This on-line calculator was helpful for comparing my results with different CRC types and initial values.

The QL disk track ID I was looking at has the values A1A1A1FE00000102, which is the message for which I need the CRC, as the 16-bit CRC is stored after it. In this case the CRC was CA6F.

I toiled away with another Processing sketch and visualization to get results and a sufficient understanding on the topic. Although CRC is potentially very math-heavy, the process can actually be made simple with just throwing bits around and visualizing the thing until it gets right.

For the QL Floppy CRC I need to use initial value of 0x84CF to get what would be equal to results of "CRC-CCITT 0xFFFF direct":

The message is preceded with the initial value. After the message, as an augmentation, 16 zero-bits are added.

The above image is off but shows the principle.

The bits are churned by XORing the polynomial 0x1021 (in practice 0x11021) with the bits "above", bringing in an extra bit from the message/augmentation as it proceeds from "left" to "right". After the limit is hit at the "right edge", the value we are at is then the CRC.
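Both ways of getting the same CRC can be sketched in Python: the bit-churning version with the 0x84CF initial value and 16 augmentation bits, and the "direct" version that starts from 0xFFFF and needs no augmentation. This is not my Processing sketch, and the function names are made up.

```python
POLY = 0x1021


def crc16_augmented(message, register=0x84CF):
    """Shift message bits plus 16 zero bits through the register."""
    bits = []
    for byte in bytes(message) + b"\x00\x00":          # 16 augmentation bits
        bits += [(byte >> shift) & 1 for shift in range(7, -1, -1)]
    for bit in bits:
        carry = register & 0x8000
        register = ((register << 1) | bit) & 0xFFFF    # bring in the next bit
        if carry:
            register ^= POLY
    return register


def crc16_direct(message, crc=0xFFFF):
    """'CRC-CCITT 0xFFFF direct': XOR each byte into the top of the register."""
    for byte in message:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


sector_id = bytes.fromhex("A1A1A1FE00000102")
print(hex(crc16_augmented(sector_id)))   # 0xca6f
print(hex(crc16_direct(sector_id)))      # 0xca6f
```

The same routine would be run over the 516 bytes of a data field, starting from the first A1 of A1A1A1FB.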

When writing to an existing formatted HFE disk, the sector ID CRCs need not be overwritten, as their contents do not change. Should this be needed, CRC the 8 bytes starting from the first A1, and then write the 16-bit result after them.

With the 512-byte data sectors, the 516 bytes starting from the first A1 (A1A1A1FB plus the 512 data bytes) are CRC'd, and the 16-bit value is then written after the data. That's a helluva lot to CRC, and a poorly optimized routine (*cough*) can result in a fair amount of churning.

Whenever data changes, the encoding signals have to be regenerated. This is best done for the data areas, starting after the A1A1A1FB, with the initial bit value as 0. The encoding bits are written with the above-mentioned rules, up until and after the data block and the byte after the CRC value. (If you rewrite sector IDs and for some reason they change, the encoding bits have to be rewritten there too)

When writing the encoding bits, the whole track encoding could be calculated from beginning to an end. But when working on an existing formatted HFE image, I felt it was easier not to encode areas that don't need re-encoding, as then the anomalous sync signals within the ID portions do not need to be addressed.

Maybe I'll publish it, one day.