![]() |
| Most recent update (All By Hand(TM)): 16-May-2005 00:13 |
|
A True Love storyThe questionfollowing this post is a copy of a file format I have encounter on a job that I am currently working on - what I need to do is convert this from the format below to BMP (or some other modern format) and quite frankly I am banging my head against the wall trying to figure it out. This question was posted in comp.graphics.algorithms about a file format with the extension "MRS". Searching the web just yielded the same question by the same poster, and a veritable shit-load of non-relevant answers, ranging from "it's a WordPerfect Macro Resource" to "have you tried renaming it to BMP". Any decent programmer would scoff at that; apparently noone bothered even to look at the files. Deft searching based on the few file names disclosed revealed the source of the images: True Love. Wikipedia revealed what kind of game this is: True Love is a classic Japanese bishoujo H game developed by Software House Parsley and released by Otaku Publishing, Ltd. It can be viewed as a hentai dating sim with RPG elements. The game was originally published in 1994 (a Windows compatible version appeared around 1998). It seems noone was able to extract any image other than by taking screenshots. Just the thing for me! The approachThe poster provided some example files and guessed header information. Staring at various hex dumps, it was obvious the compression was some kind of RLE, but further searching the web didn't yield anything useful. On to more drastic measures. I wrote a skeleton program to display primary information for each file. As usual, I just took an old graphics displaying program and deleted everything I could not use (I've lost more than one good program that way 'cause I tend to press 'Save' first, only to remember too late I had to use 'Save As'!). The skeleton program initially did nothing more than accept a file on the command line or from a drag-drop operation, and display the information from the header: some signature id, image width and height, and a color palette. After that the RLE image followed, which I just dumped as raw pixels. The original game was so old that it ran under DOS, so I had a hunch about the number of colors used. The largest images were 640x400 pixels, and, although the palette had 256 entries, I suspected that the game would have run in the high-resolution 16-color VGA mode. Modern games do not use this resolution any more, they usually go up from 640 x 480. What did this tell me? If the compression is RLE, all the original pixels could be found in the file, albeit surrounded by copy/repeat commands. And if the range of the original pixels was 0..15 I could filter out all out-of-range pixels, which resulted in less garbage and more image-like displays; all other bytes had to be command bytes. From there on it was actually straightforward. RLE encoding consist of two major commands: copy the next n pixels literally, and repeat the single next pixel n times. A third variant is used in Command & Conquer: copy n previously unpacked bytes; this is also used in the MRS format. RLE unpacking usually decides which command to use by inspecting the highest bit or bits of the 'next' command (see the PCX or BMP/RLE file formats for examples of these). The only thing which would ruin this approach was the possibility that four-bit pixels (only 16 colors, remember?) could be packed into bytes two-by-two. Lucky for me that wasn't the case. The 'decision' bits were found to be the highest two bits of each command, which allowed four different unpack algorithms and a repeat/copy value range of 0..63. Each command could have an extra byte; this was the case when the repeat/copy value was 0. Though there are four possible commands, only three were actually used: copy next pixel(s), repeat previous pixel (a variant of "repeat next pixel"), and copy multiple previous pixels. The unpacking code was actually clever in getting the values to be copied/repeated; to some of the values a constant value was added, so the actual range was not limited to 0..63 for a single command. See the pseudo code if you're confused... I got some pretty weird results at first, but there is another (self-imposed) limit on this kind of compression. Usually, the images are not compressed as a single run of data, but one row at a time. Everywhere I ran out of bounds on the current row I just stopped unpacking (with a marker telling me so) and skipped to the next image row. If the program did not end up on the last line of the image at the end of the file, or ran over the last line, I also issued a warning. Just ignoring bytes I didn't understand produced skewed graphics at first, but by careful filtering I finally got what I wanted! All in all it took me about 12 hours of staring at hex and garbled pixels, jotting down values that seem crucial for some reason, trying variants and rewriting large chunks of code, whilst consuming one cigarette after another. To be honest, for me that equals to having a good time ... The codeRLE is an uncomplicated algorithm (compared to JPEG and the likes), and its code can usually be jotted down in a few lines. The pseudo code below shows the entire algorithm for MRS images. Every y = 0 repeat { x = 0 repeat { b = fetch_byte if (b and 0xc0 = 0) { ## Option 1 (00xx.xxxx): Copy next pixel(s) if (b and 0x3f = 0) copy_count = fetch_byte + 0x40 else copy_count = b and 0x3f copy (copy_count) } else { if (b and 0x80 = 0) { ## Option 2 (01xx.xxxx): duplicate last pixel if (b and 0x3f = 0) duplicate_count = fetch_byte + 0x41 else duplicate_count = (b and 0x3f)+1 duplicate (last_pixel, duplicate_count) } else { ## Option 3 (1?xx.xxxx): duplicate existing pixels source = destination - ((b & 0x0f)*256 + fetch_byte + 1) duplicate_count = (b/16) and 7 if (duplicate_count = 0) duplicate_count = fetch_byte + 10 else duplicate_count = duplicate_count + 2 duplicate (source, duplicate_count) ## caveat: source and dest may overlap! } } } until x = width } until y = height The caveat on option 3 is that it's not a plain copy operation. Consider this run:
... a b c d e f duplicate(-5, 8)
Copying should start at the ... a b c d e f b c d e f ? ? ? (where The trick is to copy one pixel at the time, resulting in the correct sequence ... a b c d e f b c d e f b c d The resultsSome of the more innocent pictures in the game. If you want to have the full load you'll have to find the original!
Final wordBy the time I found out the explicit nature of some of the images it was already too late; at that point I just had to crack the algorithm. Before you try to decode them yourself, be warned that the game contains explicit nudity and sex. Then again, since you already must own the original, that would hardly come as a surprise...
-- Jongware [May 2005]
| |||||||||||||||||
![]() |
For comments -- not about the pictures! -- you can reach me at jongware. |