Saturday, November 28, 2009

Seekable tarballs

In the Linux world, the most common archive tool is tar, hands down. You take a bunch of files, you throw them in a tar file, and then you compress the tar file with your compressor of choice. GZIP is the most common, leading to a .tar.gz/.tgz extension. BZIP2 is also common (.tar.bz2), but I've played around with LZMA (.tar.lzma) and rzip (.tar.rz) as well.

I'm only going to talk about gzip/DEFLATE because that's the only general compression algorithm I've considered in this approach.

It occurred to me there's a way to make a tarball (and, indeed, any DEFLATE stream) seekable. It stems from the reason DEFLATE isn't really seekable in the first place; The actual encoding of the data depends on data that it's already seen, so you can't just peek at any one place in the stream and start decoding there without knowing what came earlier.

In the case of DEFLATE, there's a compressor state that keeps changing as the compressor processes more data, in order to improve the compressor's efficiency. That state represents how the compressor will encode the symbols it sees in the near future, and the symbols that it sees will cause it to change its state to be more efficient for the symbols following.

In the process of decompressing a DEFLATE stream, that same state gets reconstructed, and it's continually referenced and updated as decoding proceeds. One can't normally jump to the middle of a DEFLATE stream because one needs that state in the form it would have at that point in the stream.

The solution is simple; Save off a copy of the compressor/decompressor state wherever you want a seek point. Keep an index containing the original data stream address, the deflate stream address, and the decompressor state one would have at those two points.

Put the index in a separate file, after the terminator indicator, or multiplex it using some sort of container format; I don't care. Point is, you have the option of resuming a decompression run that you never really started.
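
Here's a rough sketch of the idea in Python, leaning on the fact that zlib's decompressobj can be copied mid-stream (the copy carries the full decoder state, 32KB window and all). The names, chunk sizes and 4MB spacing are all mine, the index lives in memory rather than on disk, and building it takes one full decompression pass -- which fits the third option below, where you only generate the index when you first want to browse the file.

    import zlib

    CHUNK = 64 * 1024         # compressed bytes to feed per iteration
    SPAN = 4 * 1024 * 1024    # target spacing between seek points, in output bytes

    def build_seek_index(path):
        """Return a list of (output_offset, compressed_offset, decompressor_copy)."""
        index = []
        d = zlib.decompressobj(wbits=zlib.MAX_WBITS | 32)  # accept gzip or zlib headers
        out_offset = comp_offset = 0
        last_point = -SPAN
        with open(path, "rb") as f:
            while True:
                chunk = f.read(CHUNK)
                if not chunk:
                    break
                out_offset += len(d.decompress(chunk))
                comp_offset += len(chunk)
                if out_offset - last_point >= SPAN:
                    # Any partially consumed input is buffered inside the copy,
                    # so feeding it data from comp_offset onward resumes cleanly.
                    index.append((out_offset, comp_offset, d.copy()))
                    last_point = out_offset
        return index

    def read_at(path, index, target, length):
        """Decompress `length` bytes starting at uncompressed offset `target`."""
        out_offset, comp_offset, saved = max(
            ((o, c, s) for o, c, s in index if o <= target),
            key=lambda entry: entry[0],
            default=(0, 0, zlib.decompressobj(wbits=zlib.MAX_WBITS | 32)),
        )
        d = saved.copy()                  # never mutate the index entry itself
        skip = target - out_offset
        result = b""
        with open(path, "rb") as f:
            f.seek(comp_offset)
            while len(result) < length:
                chunk = f.read(CHUNK)
                if not chunk:
                    break
                data = d.decompress(chunk)
                if skip:
                    drop = min(skip, len(data))
                    data = data[drop:]
                    skip -= drop
                result += data
        return result[:length]

A real tool would serialize the saved state (the window plus the header bookkeeping) instead of keeping live objects around, but the shape of the index is the same.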

Yes, this harms the compression efficiency, in a way. To seek, you need that decompressor state, and that decompressor state will have a nonzero size. No worries; There are all kinds of ways to tune how large the index will be:

  • You could set an index size as a percentage of the resulting deflate size. After building N bytes of deflate stream data, you save off an index point of M bytes, where (M/(N+M)) is your target percentage.

  • If you know enough about your source data stream, you could be selective about where you put the seek points. For example, if you know that the source stream is a tar file, you can put a seek point at the beginning of each file in the tar archive.

  • You don't have to generate the seek index at all until you'd like to start examining the file. Most of the tools I've used that allow me to browse tarballs decompress the entire archive into a temporary file or directory. In such cases, generating the seek index as an initial step saves the disk cost of a temp file or directory, and the seek index can be kept for later reference.


Another interesting side effect of the seek index is that it allows parallel decoding. If the underlying I/O subsystem is up to the task, multiple chunks of the overall DEFLATE stream may be decompressed simultaneously, with each seek index point as a potential starting point.
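
Sketching that out too, reusing build_seek_index and read_at from the sketch above: each worker starts from a saved decoder state and reconstructs only its own span of the output. total_size is the uncompressed size, which you'd track while building the index or pull from the gzip footer; Python's zlib releases the GIL while it's inflating large buffers, so plain threads can actually overlap here.

    import concurrent.futures

    def parallel_decompress(path, index, total_size, workers=4):
        # Span boundaries: the start of the stream, then every seek point.
        starts = [0] + [out_off for out_off, _, _ in index]
        spans = [(s, e - s) for s, e in zip(starts, starts[1:] + [total_size]) if e > s]
        with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
            chunks = pool.map(lambda span: read_at(path, index, *span), spans)
        return b"".join(chunks)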

Thursday, November 26, 2009

Just watched the Pink Panther movie with Steve Martin in it.

Tradition says it can't hold up to the original (more serious comedy) or its Peter Sellers sequels (which were more slapstick), but I couldn't help but laugh at a few things.

"Don't you feel alone?"
"Not since I found the Internet."
Steve Martin does fine as the senseless, slapstick-mode Clouseau, and I think only my young age might let me get away with saying he's potentially on par with Peter Sellers in the role; I really think it's the script and the poorly-done special effects that bring down the movie. It does give an interesting perspective on the type of person this Clouseau might be, though: Focused on details, but sees a completely different set from those around him, and has misplaced priorities in other areas. Also has an intellectual understanding of form and social protocol, but a major disconnect between his intellect and actions.

Of course, the classic big band Pink Panther theme is a pleasure to listen to.

Monday, November 23, 2009

On RAID types, type classes, and reducing risk of catastrophic failure

RAID solutions like RAID 5 and RAID 6 are intended to improve data protection* while sacrificing less capacity than things like RAID 1.

RAID 5 allows you to put N+1 drives into a system, and get N drives' worth of capacity out. Meanwhile, you can lose a drive and still not necessarily lose your data. RAID 6 allows you to put N+2 drives into a system, and get N drives' worth of capacity out. Meanwhile, you can lose two drives out of the system and not necessarily lose your data.

The problem with RAID is that when you're down to N functioning drives, if you lose one more drive, you're in for a massive world of trouble; Good luck getting your data back. If you can even see your filesystem at all, you're going to be missing 1/N of just about every file. And likely not in a contiguous chunk.

So when you lose a drive out of a RAID system, you put one in, fast, and rebuild the array. Rebuilding the array populates the drive with what was in the old drive that went missing.** Once the rebuild is finished, and you're back up to N+1 drives (for RAID 5) or N+2 drives (for RAID 6), then everything should be back to normal; You just survived a drive failure (or two) without losing data.

The problem is that this rebuild process is hell on the drives; It involves a lot of reading of data from the remaining drives, in addition to their existing live load, to rebuild the data to put on the newly re-added drive. It's not unknown to have an additional drive failure during the rebuild period.

Part of the problem is that most of the drives in a fresh RAID setup will be new, which means that after one or two of the original drives have failed, the rest may not be far behind, which drives up the likelihood of a failure during the rebuild.

So what if one were to induce a drive to fail earlier? I don't mean a scheduled or simulated failure; I mean a physical failure at a time you don't control. Say, for example, that when setting up a new RAID, you put the drives through a burn-in period where you intentionally induce a high level of wear, such as by performing a nasty mix of random writes and random reads, inducing spindowns and spinups, etc.

Burn-in periods are already used in some places; They help weed out drives that are prone to early failure. However, if you give each of the drives in your array a different length of burn-in time, then you've reduced each drive's likely lifetime by a different amount, ideally by enough that their expected failure times are well separated. That, in turn, means that if the drive with the longest burn-in period is the first in the array to fail, then the next drive to fail may be less likely to fail during the rebuild. Given enough of a difference in reduction of expected lifetime, one may even be able to procure something of a safety margin.

The sacrifice, of course, is that you're intentionally reducing the lifetime of your component drives, which means you put out more money in equipment replacement, and you rebuild your array more frequently.

The question is, is that additional equipment replacement cost and rebuild frequency sufficiently offset by the reduction in the likelihood of having a drive failure reduce you to less than N working drives?
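
Here's a back-of-the-envelope way to frame that question (the numbers are made up for illustration, not measurements): treat each surviving drive as failing during the rebuild window independently with some probability, and ask how often more drives fail than the degraded array can still absorb. The staggered case below just models the argument above -- the most-worn drive is the one that already died, so the survivors sit further from end-of-life.

    from itertools import combinations
    from math import prod

    def prob_excess_failures(p, tolerance):
        """P(more than `tolerance` of the drives with failure probabilities `p` fail)."""
        n = len(p)
        total = 0.0
        for k in range(tolerance + 1, n + 1):
            for failed in combinations(range(n), k):
                total += prod(p[i] if i in failed else 1 - p[i] for i in range(n))
        return total

    # Example: a 10-drive RAID 6 with one drive already dead and a rebuild running.
    # Nine survivors remain, and the degraded array can absorb one more loss.
    same_age  = [0.02] * 9                              # all drives equally worn
    staggered = [0.010 + 0.001 * i for i in range(9)]   # survivors fresher, wear spread out
    print(prob_excess_failures(same_age, 1))
    print(prob_excess_failures(staggered, 1))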

Some other thoughts:

RAID 0 is simple striping. You put N drives in, you get N drives' worth of capacity out, you get faster read times, and if you lose a drive, you've essentially lost all your data.

RAID 5 is similar to RAID 0 in that it uses striping, but an entire drive's worth of error-correction data is spread across all your disks so that if you lose a drive, you can retain your data. That means you get N drives worth of capacity for a system with N+1 drives.

RAID 6 is like RAID 5, but it uses a second drive's worth of data for error correction. You get N drives' worth of data for an array with N+2 drives, and you can lose 2 drives and still retain your data.

In all three of these cases, if you drop below N drives, you're pretty much hosed.

A second recap, more terse:
  • RAID 0: N drives, N drives capacity. Any drive loss means failure of the array.

  • RAID 5: N+1 drives, N drives capacity. Losing more than 1 drive means failure of the array.

  • RAID 6: N+2 drives, N drives capacity. Losing more than 2 drives means failure of the array.


Hopefully, you can see the abstraction I'm trying to point out.

Let's call RAID 0, 5 and 6 members of the same class of RAID array types, and note that for any*** array with N+x drives in the array, the array can withstand the loss of x drives before total failure.

In RAID 0, x is 0. In RAID 5, x is 1. In RAID 6, x is 2. It seems obvious that configurations of this class of RAID types are possible, and practicable, where x is greater than 2.
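
To put the class in the plainest possible terms (this is just restating the recap above as a function; the x=3 row is hypothetical, though erasure-coded storage does exactly this sort of thing):

    def describe(n_data, x):
        """An array of n_data + x drives: n_data drives of capacity, survives any x losses."""
        total = n_data + x
        return {
            "drives": total,
            "usable_capacity": n_data,   # in units of one drive
            "tolerated_failures": x,
            "redundancy_overhead": x / total,
        }

    for name, x in [("RAID 0", 0), ("RAID 5", 1), ("RAID 6", 2), ("hypothetical x=3", 3)]:
        print(name, describe(8, x))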

I would assume there's going to be a sacrifice in throughput performance as x increases, due to the calculation (writes) and verification (reads) of the error data. Just the same, the potential to increase x leads to the potential to increase N while reducing the additional risk that each increment of N brings.

That means an increase in the (acceptably un-)safe volume size using component drives below the maximum available capacity, meaning component drives which aren't going to be the most expensive on the market. Likewise, as the data density of component drives reaches an inevitable**** cap due to the laws of physics, one can select drive types with more weight given to component drive reliability.

* Yes! I know! RAID is not a backup solution. Now that that kneejerk reaction is out of the way...
** The information required to generate that data already exists on the other drives, assuming you haven't dropped below N active drives. Likewise, if one of those other drives were to die, the information on this drive can be used, in concert with the other remaining drives, to regenerate that drive's data.
*** OK, there's still a matter of the array being in a consistent state.
**** I'm not saying we're there yet. I'm not saying we'll be there within the next fifty years. I'm just saying it has to happen eventually.

Dice 1000

I don't know where it came from, but a piece of paper in one of my binders has the rules for a game written on it:

Required:
  • Paper
  • Pencil
  • 5 dice (I'm assuming d6s)


To start, have each player roll one of the dice; the highest roll goes first, continuing clockwise. The first player rolls all five dice. The scoring of the dice is:
  • 1 -- 100 points
  • 5 -- 50 points
  • 3 dice the same number -- number * 100


So if you roll three 2s, you score 200 points for that combination. If the player stops at that point, they keep the total for that turn. If they roll again, they must roll dice that add to the score or they lose that turn's score. If a player rolls all five dice and gets a non-scoring roll, they lose all accumulated points for the game. The first player to score 1000 is the winner.
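
Here's the scoring rule as I read it off the sheet, as a quick function (the sheet doesn't say how three 1s score, so this just applies the three-of-a-kind rule literally):

    from collections import Counter

    def score_roll(dice):
        counts = Counter(dice)
        score = 0
        for face, n in counts.items():
            if n >= 3:
                score += face * 100   # three of a kind: number * 100
                n -= 3
            if face == 1:
                score += 100 * n      # each remaining 1 is worth 100
            elif face == 5:
                score += 50 * n       # each remaining 5 is worth 50
        return score

    print(score_roll([2, 2, 2, 4, 6]))   # 200 -- three 2s
    print(score_roll([1, 5, 3, 4, 6]))   # 150 -- a 1 and a 5
    print(score_roll([3, 4, 6, 2, 2]))   # 0   -- a non-scoring roll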

Sunday, November 22, 2009

PulseAudio, sound daemons and screensavers

I had to debug a network issue that was caused by* PulseAudio (multicasting 1.5Mb/s of audio was saturating my wifi, which was preventing anyone from being able to get anywhere), and in the process of learning about the problem to fix it, I learned something about PulseAudio.

It's amazingly over-engineered. Enough that it makes convenient things that wouldn't normally be possible. For example, any app can register itself as a source or sink. Registering as a source is kinda important for a sound daemon that wants to multiplex audio from multiple apps, but as a sink?

However, I keep thinking about media player visualizations and how most of them suck. Likewise, I think about screensavers and how they could be better.

Take a screensaver, have it register as an audio sink, and let the audio have an impact on the screensaver. For example, I'm looking at the "fiberlamp" screensaver. It looks like it uses some sort of physics engine to model the fibers hanging down realistically, and when the screensaver starts, the lamp starts off with a bit of a bounce before the whole thing settles.

You could vibrate the base of the fiberlamp in response to the sound fed out by PulseAudio, causing the fibers to shake and oscillate in response. You could take advantage of the fiber-optic metaphor, and feed a raster image into the base of the fiber bundle that looks like a more traditional visualization, so the fiber tips look like part of a stretched spherical mapping of that base.
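
I haven't written a real PulseAudio sink module, but the cheap way to play with the idea is to read raw PCM from a sink's monitor source with parec and boil it down to an amplitude the screensaver's physics step could use. Something like this sketch -- the monitor device name below is a placeholder; pacmd list-sources will tell you what yours is actually called:

    import audioop
    import subprocess

    MONITOR = "alsa_output.pci-0000_00_1b.0.analog-stereo.monitor"  # placeholder name
    RATE, CHANNELS, WIDTH = 44100, 1, 2   # s16le mono

    proc = subprocess.Popen(
        ["parec", "--device=" + MONITOR, "--format=s16le",
         "--rate=%d" % RATE, "--channels=%d" % CHANNELS],
        stdout=subprocess.PIPE,
    )

    while True:
        block = proc.stdout.read(RATE // 30 * WIDTH)   # roughly one video frame of audio
        if not block:
            break
        level = audioop.rms(block, WIDTH) / 32768.0    # 0.0 (silence) .. ~1.0 (full scale)
        # In the screensaver, `level` would jolt the fiberlamp's base each frame.
        print("#" * int(level * 60))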

There are a lot of possibilities when you can hook into a sound daemon like that.

* "caused by" is rather nebulous...One could just as easily point out that I shouldn't have enabled the multicast. Or I should have had the wifi block it. Yada yada.

A crazy idea

Construction paper is interesting stuff. What if one were to pulp several different colors, and feed those through what amounts to an inkjet?

You'd wind up with a color printout that has a very different tactile feel.

Wednesday, November 18, 2009

Ideas

"If you have an apple and I have an apple and we exchange apples then you and I will still each have one apple. But if you have an idea and I have an idea and we exchange these ideas, then each of us will have two ideas." - George Bernard Shaw



Now there's an idea I can get behind.

The Four Brethren


In the musty air
Above the misty lake
Sat the four brethren
A proud four stakes

Four stakes against water
Four stakes against ice
Four stakes against beasts birds and eyes.

They all knew their purpose
They each knew why
Each one shared a method--
Protection from the sky

So cozy for a mallard
So safe for the unspry
So guarded against malice
And hungry, prying eyes.

Sunday, November 15, 2009

If it ain't broke...

Engineer: ...don't fix it.

Venture capitalist: ...you're not thinking "out of the box" enough.

Salesperson: ...it'll be obsolete within the next six months, and your ROI will be far better with our newest model.

Agile programmer: ...write a test case for the current behavior.

Hacker: ...it's not fast enough.

Tester: ...hit it harder.

Manager: ...what is it?

End-user: ...why do I care?

Saturday, November 14, 2009

Thinking about the file cache

So, under Linux, just about any memory that's not actively used by applications eventually gets used by the file cache. The file cache keeps a copy of data from recently-used files in memory so that you don't need to read them from disk if you need them again.

One great way to visualize this process is to install htop, add the memory counter, and configure that counter as a bar graph. The green portion of the graph represents memory actively used by applications, the blue portion represents buffers and such in use by the kernel (I'm still a tad unclear on this point; I think some shared memory mechanisms may be represented there), and the yellow portion is your file cache. The file cache will hold data chosen by some heuristic I'm unfamiliar with. I might describe it as "popular" or "recently-used", but it's really up to the kernel devs.

My desktop machine has 8GB of RAM. That's an insane amount, by any conventional reasoning; Even I'll admit that none of the applications I've run have used that much memory, 64-bit aware or no. However, again, any memory that's not being used by an application eventually gets used by the file cache, which means I eventually have about 6-7GB of file data cached in RAM. Believe me, it makes a difference when I'm not cycling through disk images tens of gigabytes long.

What if that file cache could be populated in advance? What if a filesystem could retain a snapshot of which files (or pages or sectors or blocks; however the data in the cache is organized) were in the file cache at a particular time? I'm not talking about the file data itself, but pointers to that data. When the filesystem is mounted, assuming it's clean, the snapshot could be used to initially populate the file cache.

At a naive level, the snapshot could be made when unmounting a write-enabled filesystem, though not when remounting to read-only. (That's a common failsafe approach for dealing with hardware blips, and it doesn't make sense to try to commit data to a potentially failing device.) When the filesystem is next mounted, the file cache state could be restored, immediately bringing recently-used files into memory. That will increase the mount time, but in a large number of use cases, it will improve the speed of file access. You could even choose not to restore that file cache state, without any worries for data integrity.

More sophisticated approaches might allow the triggered switching of profiles. Let's say you use your system for web browsing as well as the occasional game, or even as a file server. You might have a different set of cache data for each purpose. Tie it to individual binaries, or even trigger it based on loading particular files, and you could pull a large amount of data into the cache in anticipation of the workload historically associated with that application. Did gdm just start? Load all the GNOME pixmaps and sounds into the file cache. Did Firefox just start? Load the theme data, plugins and that stuff under ~/.mozilla-firefox.
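
You can fake a crude version of those profiles from userspace today, without any filesystem support: keep a list of files per application and ask the kernel to pull them into the page cache before launching it. A sketch, with made-up example paths (posix_fadvise with WILLNEED just hints; it returns without waiting for the data):

    import os

    def prewarm(paths):
        for path in paths:
            try:
                fd = os.open(path, os.O_RDONLY)
            except OSError:
                continue   # missing files just get skipped
            try:
                size = os.fstat(fd).st_size
                os.posix_fadvise(fd, 0, size, os.POSIX_FADV_WILLNEED)
            finally:
                os.close(fd)

    # A hypothetical profile for Firefox; the real list would come from
    # watching what the application actually touches.
    prewarm([
        os.path.expanduser("~/.mozilla-firefox/profiles.ini"),
        "/usr/lib/firefox/libxul.so",
    ])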

So long as the filesystem is aware of these cache profiles, it might even be able to take advantage of some of the free space on the disk to keep copies of the cached files in a contiguous place on the block device, to speed up loading of the cached data. If the data was modified, of course, the filesystem would have to rebuild the cache block at an idle time in accordance with system energy usage policy. (I.e. if you're on battery, you might only tack the modified version onto the end of the block. Or you might not rebuild the block at all until after you're wall-powered again.)

Thursday, November 12, 2009

Cables, frequency loss and equalization

So, apparently, different types and qualities of speaker cable will attenuate different frequencies by different amounts. Why not boost the signal at the frequencies being attenuated?



Imagine a receiver that you connect to a "stub" speaker for calibration. The receiver could pass a signal through the stub, measure the relative power loss at different frequencies and different drive levels, and build a model for a dynamic power-switched equalizer (as a stage between the primary equalizer and the amp).
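
The correction stage itself is almost trivial once you have the measurements. With made-up numbers for the measured loss per band, the equalizer just applies the inverse gain, normalized against the flattest band so it only corrects relative differences (overall attenuation stays the amp's problem):

    measured_loss_db = {      # hypothetical results from the stub calibration
        100:   -0.1,
        1000:  -0.3,
        5000:  -0.9,
        10000: -1.6,
        20000: -2.4,
    }

    reference = max(measured_loss_db.values())          # the flattest band
    correction_db = {f: reference - loss for f, loss in measured_loss_db.items()}

    for freq in sorted(correction_db):
        print("%6d Hz: boost %.1f dB" % (freq, correction_db[freq]))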



This would let you (mostly; Your frequency bucket size will still have an impact) eliminate your wiring as a source of frequency variation, though you'd still have to deal with general attenuation. Your initial eq is still in place, so you can still choose between your "jazz", "rock", and "movies" modes.

Home theater update: New placements, new wiring.

So I got the receiver and PS3 moved out of view, which is awesome; It reduces the number of non-presentation light sources.

I started with a 100ft spool of 16ga, but I was losing 30dB between the receiver and the speakers. I just finished ripping that out and replacing it with 12ga copper.

New spool of speaker cable

Wiring above receiver

Top two (copper) are 12ga rope-braided speaker wire (Dayton SKRL-12-100), pinned as a pair with 3/4" plastic cable staples, also picked up at Home Depot. They replaced some crappy 16ga I'd been feeding the front left and right channels, whose speakers consume up to 100W each. I was losing 30dB across the 16ga; It's a 50ft run.

The bottom two are RG-6, using F-type-to-RCA adapters on either end; They're actually carrying headphone-level audio from the receiver to the subwoofer, whose internal amp has a built-in crossover. Once I get my receiver-side crossover set up, one will carry plain cable, while the other will carry LFE.

I did what I could to reduce cable strain.

The remaining crap isn't my wiring; It's network, power and cable.

Wiring leading to receiver and video source.

Top cable is HDMI, leading to the TV, pinned with 11mm round cable staples I picked up at Home Depot.

The two copper runs and the two RG-6 runs below it are the same speaker and subwoofer cables described above.

Wiring leading to TV and speakers

Same cables as above: HDMI on top, then the two 12ga speaker pairs, then the two RG-6 runs, continuing on to the TV and speakers.

The hanging grayish cable is a plug for the light in the room. I've got it stapled so it's not hanging down as far, but I'd *really* like to be able to replace the light and wire it in directly.

View in the dark

Main menu of the Ghost in the Shell disc.

The only annoying light is the power LED on the TV, and I can take care of that with some black electrical tape.

There's also some reflection off the ceiling. Not sure what to do about that yet. May try using a black diffuse covering if/when I put some R13 fiberglass up there.

Tuesday, November 10, 2009

I know that I know nothing

"I know that I know nothing."

That's great. Now if only there were a way to get an enumerated list of everything I don't already know.

Hey, what's this machine you guys are bringing into the office? And what's with the wires sticking out of the fairy cake?

Sunday, November 8, 2009

I'm hungry. Dinner break.

Depart to partake in repast,
he said,
Depart to partake in repast!

In order to function tonight,
he said,
depart to partake in repast!

It's time to consume!
It's time to devour!
Depart to partake in repast!