Friday, December 25, 2009
My brother is in town, and we've watched a bunch of movies together. I introduced him to The Usual Suspects; he then introduced me to Cypher, to which I responded with Paprika.
Coincidentally, that turns out to be an awesome sequence if you like mysteries and plots that twist your brain into pretzels. It's the hard-liquor cocktail of mystery/thrillers. Watched in that order, the movies' themes flow into one another: the first two have someone unseen pulling the strings, while the latter two make it difficult for the characters to discern what's real and what's not. (Paprika takes this to extremes...)
I need to show him Dark City next, if I still have the BD from Netflix. I expect doing all four of those in a marathon session would be like a Pan Galactic Gargle Blaster. I'd probably throw Dark City somewhere between Cypher and Paprika in the sequence; it shares some of the 'unseen strings' quality with Cypher, though not quite as dense, and some of the uncertainty about reality, though not as much as Paprika's.
Tuesday, December 22, 2009
Techtalk Tuesday: Nexus bot
This is an idea I've been chewing on for a while now, and something I've been thinking about actually doing. Simply put, it's a chat-room bridge, but it's not that simple.
Normally, I sit in rosettacode, haskell, proggit, perl, tcl and any number of other channels, ready to offer insight or assistance, or even just observe, if someone mentions Rosetta Code. What Would Be Nice would be if I could just sit in rosettacode, and let a bot handle it for me.
The general sequence might go as follows:
proggit - * soandso thinks Rosetta Code needs better algorithms
rosettacode - * soandso thinks Rosetta Code needs better algorithms
proggit - <jubilee> soandso: What are you looking for?
rosettacode - <#jubilee> soandso: What are you looking for?
rosettacode - <shortcircuit> nexusbot: soandso: Did you see this category? (some url)
proggit - soandso: Did you see this category? (some url)
Nexusbot has to perform several complicated behaviors there, so let's look at them.
First:
* soandso thinks Rosetta Code needs better algorithms
"Rosetta Code" matches one of nexusbot's highlight rules for forwarding to rosettacode, so nexusbot relays the message to rosettacode, thinks of it as a "connection", and associates soandso as a primary for that connection, with a most recent related activity timestamp attached to his association with the connection.
Next:
soandso: What are you looking for?
soandso is associated with a current connection (and that association hasn't timed out), and jubilee just said something to him. nexusbot associates jubilee with soandso and, through soandso, with the relay to rosettacode. jubilee is attached to the relay with his own RRA timestamp, copied from soandso's.
rosettacode - <shortcircuit> nexusbot: soandso: Did you see this category? (some url)
shortcircuit addresses nexusbot, and indicates he's addressing soandso through nexusbot. Nexusbot sees that soandso is associated with a connection between rosettacode and proggit, associates shortcircuit with that connection (along with a recent activity timestamp), and passes shortcircuit's message along to proggit.
Each time someone triggers a highlight, they're considered a primary for the connection that highlight creates (or would create, if it exists already), and their "recent related activity" timestamp is updated. Each time someone talks to a primary for a connection, they're also associated with the connection, and their "recent related activity" timestamp is set to that of the primary's.
Whenever a primary or secondary talks, their communications are relayed across the connection, but their RRAs are not updated.
When a primary's RRA grows old past a certain point, they're disassociated from the connection. When all of a connection's primaries are gone, the connection is ended.
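To make that bookkeeping concrete, here's a minimal sketch of the association logic in TypeScript. Everything in it is invented for illustration (the names, the 15-minute timeout, the send callback); it's just the data structure the rules above describe, not code from any existing bot.

// Sketch of the connection bookkeeping described above. All names are illustrative.
const PRIMARY_TTL_MS = 15 * 60 * 1000; // how long a primary's RRA stays fresh (arbitrary)

interface Member {
  nick: string;
  primary: boolean;
  rra: number; // "recent related activity" timestamp, ms since epoch
}

class Connection {
  members = new Map<string, Member>();
  constructor(public fromChannel: string, public toChannel: string) {}

  // A highlight keyword fired: the speaker becomes (or stays) a primary, RRA refreshed.
  markPrimary(nick: string, now = Date.now()): void {
    this.members.set(nick, { nick, primary: true, rra: now });
  }

  // Someone addressed a primary: they become a secondary, inheriting the primary's RRA.
  markSecondary(nick: string, primaryNick: string): void {
    const primary = this.members.get(primaryNick);
    if (!primary?.primary) return;
    this.members.set(nick, { nick, primary: false, rra: primary.rra });
  }

  // Relay a member's line across the connection; RRAs are deliberately not updated here.
  relay(nick: string, line: string, send: (channel: string, text: string) => void): void {
    if (this.members.has(nick)) send(this.toChannel, `<${nick}> ${line}`);
  }

  // Expire stale primaries; returns false once no primaries remain and the connection ends.
  expire(now = Date.now()): boolean {
    for (const m of this.members.values()) {
      if (m.primary && now - m.rra > PRIMARY_TTL_MS) m.primary = false;
    }
    return [...this.members.values()].some((m) => m.primary);
  }
}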
There are a couple of scenarios this logic doesn't quite resolve. What if jubilee is a channel champion, someone who talks to everyone and whom everyone talks to? It's probable that his side of a conversation with someone else would leak across the connection. And what if someone talks to a secondary on a related subject, but doesn't trigger a highlight keyword? That line would be lost.
No solution is perfect.
Now to deal with the Big Brother concerns. Ideally, nexusbot would only be in a channel if he were legitimately asked to be there; that means joining only on /invite, and preferably checking that the user who sent the invite is, in fact, in the destination channel. Likewise, he should only stay in a channel until he's asked to leave; that means no autojoin after a /kick.
The bot should also let whoever holds authority in the channel know it's there and what it is, and offer a command set for controlling its behavior in that channel.
Random braindump of possible commands:
HILIGHT LIST/ADD/REMOVE [#channel] -- lists, adds or removes a hilight rule, optionally associated with a channel. Lists include who requested the hilight rule, and when.
RATELIMIT GET/SET -- get or set the maximum number of lines per minute.
LINEBUFFER GET/SET -- get or set the size of the buffer for queuing lines if the ratelimit is hit.
REPLYMODE USER/HIGHLIGHT/CHANNEL/AUTO +/- b/m/v -- treat connections derived from highlights or associated with particular remote channels as channels themselves, and allow some channel modes like +/- m to be applied to them. Likewise, allow user modes like +/- b and v to be associated with remote users. AUTO means having the bot automatically sync its remote user modes (as they apply in that channel) with the channel's mute, voice and bans.
Ideally, only channel members with +O or +o would have access to the setter commands.
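As a rough sketch of how that gate might look (names invented; how you actually learn someone's op status depends on the IRC library in use):

// Illustrative check: only ops may run the setter forms of the commands above.
const SETTER_VERBS = new Set(["ADD", "REMOVE", "SET"]);

function mayRun(requesterIsOp: boolean, line: string): boolean {
  // e.g. "HILIGHT ADD #perl Rosetta Code" -> verb is "ADD"
  const [, verb = ""] = line.trim().split(/\s+/);
  return !SETTER_VERBS.has(verb.toUpperCase()) || requesterIsOp;
}

// mayRun(false, "HILIGHT LIST")      -> true  (read-only, anyone may ask)
// mayRun(false, "RATELIMIT SET 20")  -> false (setter, needs ops)
// mayRun(true,  "HILIGHT ADD #perl Rosetta Code") -> true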
Monday, December 21, 2009
Frameworking realtime communications
While looking into flexible ways of writing an IRC bot (the nature of which will probably be in a "Tech Tuesday" post tomorrow; it's a comm-routing bot, so don't complain about IRC != IM quite yet), I tried using IM libraries. I started with libpurple, because I use Pidgin all the time. That was an utter failure, as I couldn't find enough organized documentation to even write a stub quickly.
Then I tried playing with telepathy. I wasn't able to get it to work in the few minutes I had left that evening, but I did learn a few things.
First, it's awesomely flexible, and there are existing tools that let a GUI user interact with it essentially the same way your program would, which let me dive in immediately and test some capabilities.
Second, it's missing a decent IRC backend. The closest was "haze", which I couldn't get to connect to more than one IRC network at a time. It turns out Haze is just a glue module between Telepathy and libpurple, and the one-IRC-connection-at-a-time limit is a libpurple limitation. (I'm glad I didn't waste my time thoroughly studying the libpurple header files; it wouldn't have done what I needed anyway.) I might be able to use multiple libpurple instances, but I don't know how safe that would be; libpurple uses a common filesystem location, I don't have control over it through Telepathy, and I can't trust it to be multi-instance safe in its access patterns.
Telepathy is interesting because it allows multiple connections, each routed through any of a number of connection modules. One could conceivably create a program that talks to fifteen different networks using one protocol or fifteen.
Darn cool. I wish it had better support for IRC as a client (and maybe as a server...who knows what kinds of options that opens up?). Support for things like identi.ca and Twitter would be pretty cool as well.
Sunday, December 20, 2009
Computer cable rerouting and bundling (and pics)
I rerouted some cabling, replacing a hodgepodge of miscellaneous SATA cables with a bunch of 90-degree-ended cables I'd bought from Digi-Key.
This is the best I can do for now, until I get another SATA controller (so I can use SATA instead of PATA for that bottom drive) and replace the power supply with one with modular cables.
In a pinch, I also wound up using some blue electrical tape to bind a coil of some other cable. I wrapped it sticky-side out, then doubled over and wrapped it sticky-side in. Net effect is that it's just rubbery plastic, no sticky adhesive anywhere.
Friday, December 18, 2009
Bass by phase modulation.
I want to take two 50KHz synced tone sources, invert them, then phase modulate so that the difference between the two represents my actual signal.
I'm really seeking to produce a high power signal in the 0-50Hz range, but the phase modulation approach has three advantages. First, the high carrier frequency can be deadened and insulated much more easily than a pure low frequency signal. Second, the emitting devices don't necessarily need to be as large as a corresponding cone. Third, they don't need to be directly attached, as with bass shakers, making seating and such easier to manage.
The first problem with the phase modulation approach is the carrier signal; you don't want it to be within the audible range of any human, or of any other hearing animal.
The biggest up-front problem, though, is managing the positioning of the interference nodes. Having a high carrier frequency solves part of the problem by reducing the distance between nodes. The smaller the distance, the lower the likelihood of a positive node being at one ear and a negative node at the other. (Wouldn't want to scramble your brains, now, would we?)
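For the simplest picture, drop the phase modulation for a moment and just sum two plain tones a few tens of hertz apart near 50 kHz; the standard sum-to-product identity shows where the low-frequency content hides (this is only the superposition picture; actually perceiving the difference tone needs some nonlinearity to demodulate it, which is part of what makes this scheme tricky):

\sin(2\pi f_1 t) + \sin(2\pi f_2 t) = 2\sin\left(2\pi\,\tfrac{f_1+f_2}{2}\,t\right)\cos\left(2\pi\,\tfrac{f_1-f_2}{2}\,t\right)

With f1 = 50,030 Hz and f2 = 50,000 Hz, the fast term sits near 50 kHz while the amplitude envelope peaks 30 times a second, which is the difference frequency I'm after.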
Monday, December 14, 2009
QOTD
"I went back in after I wrote them and added all sorts of weasel words. It sort of saps the punch from the statements, but it's one of those things I've learned that I have to do to avoid the more egregious forms of willful misunderstanding. " -- Raymond Chen
"Willful misunderstanding" ... That's an issue I've been contemplating for a few months, now.
"Willful misunderstanding" ... That's an issue I've been contemplating for a few months, now.
Reboots and Automatic Updates
I couldn't check last week, as I was out of the office sick, but someone had asked if anyone's system had done an uncommanded reboot in response to automatic updates.
I thought I'd go through my workstation's event log, and I've found two interesting log entries so far:
Installation Ready: The following updates are downloaded and ready for installation. This computer is currently scheduled to install these updates on Wednesday, December 09, 2009 at 3:00 AM: - Windows Malicious Software Removal Tool x64 - December 2009 (KB890830)
and
Installation Ready: The following updates are downloaded and ready for installation. This computer is currently scheduled to install these updates on Wednesday, December 09, 2009 at 3:00 AM: - Cumulative Security Update for Internet Explorer 8 for Windows 7 for x64-based Systems (KB976325) - Windows Malicious Software Removal Tool x64 - December 2009 (KB890830)
Now, I have Windows set up to download only; it's supposed to wait for my confirmation before installing. Notice that those two events include the phrase "This computer is currently scheduled to install these updates on Wednesday, December 09, 2009 at 3:00 AM" ... I didn't tell Windows it could do that. Or, at least, that's not what the normal interface indicated it was set up to do. (I don't make a habit of digging through Administrative Tools and tweaking things, as I don't know the full impact of most of what's in there.)
Short of finding whatever setting schedules downloaded updates for a next-day install even when you've told Windows to wait for confirmation, the only solution to uncommanded reboots, as far as I can see, is to tell it not to even download updates unless instructed. That's not as convenient as having them download in the background, but it saves you the hassle of a machine rebooting on its own and leaving you scratching your head as to why.
Update
Ok, something reset my Windows Update preferences. I just checked again, and it was set to "Install Automatically."
Saturday, December 12, 2009
Friday, December 11, 2009
Remote control
@coderjoe: You don't keep the remote for the receiver over here any more?
Me: I haven't seen it in a week and a half...
Wednesday, December 9, 2009
A very simple idea for a browser extension
DOM-walking shortcut keys: hjkl. If you've used vi or Google Reader, you probably know where I'm going with this.
H: Switch focus and highlight to parent DOM node.
J: Switch focus and highlight to previous sibling DOM node.
K: Switch focus and highlight to next sibling DOM node.
L: Switch focus and highlight to first child DOM node.
Something like that would make browsing most blogs' and forums' comment sections much more convenient.
It's simple enough I'd probably do it if I knew how to make a Firefox extension. :-|
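I don't know the Firefox extension plumbing either, but the core of the idea is small enough to sketch as an ordinary page script in TypeScript; the outline styling and everything else here are placeholders, not any extension API.

// Minimal hjkl DOM-walker sketch, following the key mapping above.
let current: Element = document.body;

function moveTo(next: Element | null): void {
  if (!next) return; // stay put at the edges of the tree
  (current as HTMLElement).style.outline = "";
  current = next;
  (current as HTMLElement).style.outline = "2px solid orange";
  current.scrollIntoView({ block: "nearest" });
}

document.addEventListener("keydown", (e) => {
  if (e.target instanceof HTMLInputElement || e.target instanceof HTMLTextAreaElement) return;
  switch (e.key) {
    case "h": moveTo(current.parentElement); break;          // parent
    case "j": moveTo(current.previousElementSibling); break; // previous sibling
    case "k": moveTo(current.nextElementSibling); break;     // next sibling
    case "l": moveTo(current.firstElementChild); break;      // first child
  }
});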
Hypoallergenic detergent
So we now have one of those front-loading washers that uses a lot less water. Problem is, even with the "extra wash" and "second rinse" cycles enabled, I still have to put each load through a second, soap-free run of the same to get all of the soap out, and we're already using less than the normal amount of soap the directions specify.
More details on the washer: It supports a normal wash, rinse, spin-dry sequence, same as any other. It also supports an "extra wash" and a "2nd rinse". I usually run loads with "extra wash" and "2nd rinse" enabled, to get more water through to dilute out the soap.
Why is it important to get all the soap out? It gives my grandmother hives. We're using "Hypoallergenic Purex UltraConcentrate h.e", and even with five soap-free rinses (one normal run with extra wash and 2nd rinse enabled, one soap-free run with the same), there's still enough soap remaining to cause allergic reactions.
Does anyone know of something more hypoallergenic than the stuff we're already using?
Vinegar, Venom and salt
Vinegar, venom and salt
I'm counting all your faults
When the time is right
The pen could write
And joy would come to a halt
Vinegar, venom and salt
They're quite the painful thoughts
I see the fool
I see the tool
Words can be quite the assault
Vinegar, venom and salt
They must be kept in a vault
Padlocked, lost key
No-exit, you see
Void the words and their poisonous waltz.
Tuesday, December 8, 2009
Looking at a longer Winter in Michigan
We're looking at having a longer winter here in Michigan.
You see, there are only two seasons here: "Winter" and "Under Construction"
They're talking about losing federal dollars for road maintenance, as Michigan can't put up the matching dollars.
So we're looking at the road maintenance budget being cut to less than half.
The "Under Construction" season would thus be shortened.
Ergo, we're looking at having a longer winter here.
GPG and signed email
I've started using GPG to sign my email. It was easy: install FireGPG and generate a key.
In order to send signed emails, FireGPG contacts GMail's SMTP server directly. Fair enough, but that got me thinking... What about an SMTP server that only delivered signed emails whose signatures checked out against some public keyring, and whose signers weren't marked as unauthorized due to abusive behavior? You could have an anonymous relay that operated in that fashion.
Add in a "X-Server-GPG-Signature" header in the email, and an email provider using such a technique could garner a decent reputation, and thus get more or less a pass by any anti-spam filters in the next stage of the email relay.
I'm sure the idea isn't new. I suspect, though, that all that's needed are a few seed SMTP servers that operate in this fashion.
Two puffs every four hours and a vaccine
So all the crap that's in my lungs is dead, but there's still a lot down there. Plain old albuterol inhaler is going to increase coughing and accelerate getting all that crap out. Codeine cough syrup apparently keeps me awake, so some numbing agent from the Novocaine family will help me sleep at night instead.
Meanwhile, I also got the H1N1 vaccine. Whether or not I'm just getting over H1N1 (they just don't test for it any more; tests come back positive in Kent County more often than not), the vaccine is a good move toward my not carrying something that could infect my grandmother. I specifically asked my doctor whether I could take the vaccine, and, as it turns out, high fever is the disqualifier, and I'm already over that.
Techtalk Tuesday -- Video editing and image compositing
So I've been thinking about video editing, Linux, and how much I hate Cinelerra.
Now, I don't know a lot about the internals and features of existing video editing tools, but I at least know some of the basics. First, you produce a series of images at a rate intended to give the illusion of movement. Let's look at a single point in time and ignore the animation side of the equation. Let's also focus on visual (as opposed to auditory) factors.
You have source images, you have filters and other transformations you want to apply to them in a particular order, and you want to output them into a combined buffer representing the full visualization of the frame.
Let's break it down a bit further, along the lines of the "I know enough to be dangerous" areas.
Raster image data has several possible variations, aside from what is being depicted. It may have a specific color space, be represented in binary using different color models (RGB vs YUV vs HSL vs HSV), may have additional per-pixel data (like an alpha channel) thrown in, and the subpixel components can have different orderings (RGB vs BGR), sizes (8bpp to 32bpp), and even formats (integer, or floats of various sizes and exponent/mantissa arrangements). ICC color profiles fit in there somewhere, too, but I'm not sure where. There's even dpi, though not a lot of folks pay attention to that in still imagery, much less video. Oh, and don't forget stride (empty space often left as padding at the end of an image data row, to take advantage of performance improvements related to byte alignment).
Now let's look at how you might arrange image transformations. The simplest way might be to organize the entire operation set as an unbalanced tree, merging from the outermost leaves inward. (Well, that's the simplest way I can visualize it, at least.) Each node would have a number of children equal to the number of its inputs. A simple filter has one input, so it has one child. Any more inputs, and you have a compositing node: an alpha merge or a binary (XOR/OR/AND) or arithmetic (subtract, add, multiply, etc.) merge would be two-arity, while a mask merge might be three-arity.
Fortunately, all of this is pretty simple to describe in code. You only need one prototype for all of your image operations:
void imageFunc(in ConfigParams, in InputCount, in BUFFER[InputCount], out BUFFER)
{
}
An image source would have an InputCount of 0; it gets its data from some other location, specified by ConfigParams.
So, assuming you were willing to cast aside performance in the interests of insane levels of flexibility (hey, I love over-engineering stuff like this; be glad I left out the thoughts on scalar filter inputs, vector-to-scalar filters, multiple outputs (useful for deinterlacing), and that's not even fully considering mapping in vector graphics), you probably want to be able to carry all that frame metadata along. Make it part of your BUFFER data type.
One needs to make sure to match input and output formats as much as possible, and minimize glue-type color model and color space conversions. For varying tradeoffs of performance against accuracy, you could up-convert lower-precision image formats to larger-range, higher-precision ones, assuming downstream filters supported them. Given modern CPU and GPU SIMD capabilities, that might even be a recommended baseline for stock filters.
Additionally, it *might* be possible to use an optimizing compiler for the operation graph, from rearranging mathematically equivalent filters and eliminating discovered redundancy to building filter machine code based on types and op templates. But that's delving into domain-specific language design, and not something I want to think too hard about at 4 AM. In any case, it would likely be unwise to expose any but the most advanced users to the full graph, instead allowing the user interface to map more common behaviors to the underlying code.
There's also clear opportunity for parallelism, in that the tree graph, being processed leaf-to-root, could have a thread pool, with each thread starting from a different leaf.
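Here's a toy version of that operation tree in TypeScript, just to pin down its shape. The Buffer fields, the example ops, and the traversal are all made up to match the description above; a real buffer would carry the full pile of format metadata discussed earlier.

// Toy op-tree: a node's children supply its inputs; evaluation runs leaf-to-root.
interface Buffer { width: number; height: number; pixels: Float32Array; }

interface OpNode {
  // config holds per-node parameters (a file path for a source, a blur radius, ...)
  op: (config: unknown, inputs: Buffer[]) => Buffer;
  config: unknown;
  children: OpNode[]; // arity: 0 = source, 1 = filter, 2+ = compositing node
}

function evaluate(node: OpNode): Buffer {
  const inputs = node.children.map(evaluate); // depth-first, leaves inward
  return node.op(node.config, inputs);
}

// Example: invert(source) -- a one-input filter over a zero-input source.
const black = (w: number, h: number): Buffer =>
  ({ width: w, height: h, pixels: new Float32Array(w * h * 4) });

const tree: OpNode = {
  op: (_cfg, [src]) => ({ ...src, pixels: src.pixels.map((v) => 1 - v) }),
  config: null,
  children: [{ op: () => black(640, 480), config: null, children: [] }],
};

const frame = evaluate(tree); // a 640x480 buffer with every channel inverted to 1.0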
That's an image compositor. Just about any image editing thing you could want to do can be done in there. One exception I can think of is stereovision video, though the workaround for that is to lock-mirror the tree and make the final composite a map-and-join. (If you want to apply different filters to what each eye sees, you're an evil, devious ba**ard. I salute you, and hope to see it.) Another is gain-type filtering, where a result from nearer the tree root could be used deeper in the tree (such as if you wanted to dynamically avoid clipping due to something you were doing, or if you simply wanted to reuse information lost in subsequent filtering or compositing steps). Still another is cross-branch feeding; I can think of a few interesting effects you could pull off with something like that. There's also layering and de-layering of per-pixel components.
As a bonus, it's flexible enough that you could get rid of that crap compositor that's been sitting at the core of the GIMP for the past twenty years.
Monday, December 7, 2009
Flu
My grandmother clipped this out of some county newsletter and left it for me:
How do I know if I have the H1N1 flu?
The symptoms of this influenza virus are similar to just about every 'flu' bug out there. Common effects are a cough, sore throat, headache, fever and chills, severe fatigue with body aches. Some people are over the fever and chills phase in about 2 to 3 days while others suffer for more than a week. Complications can occur days into the illness when lower chest congestion progresses to pneumonia. This is a secondary bacterial infection that comes about because the immune system is so taxed fighting the flu it cannot fend off the bacterial pneumonia. This disease is so prevalent that hospitals and the health department have stopped testing specifically for H1N1 2009. It is always coming back positive, so if you have the above symptoms you are presumed to have H1N1 influenza.
(Yeah, whoever wrote that article needs to work on their language skills.)
Cough, sore throat, fever and chills, severe fatigue with body aches. All there, at one point or another in the last two weeks. Pneumonia last week.
Doctor's appointment tomorrow, because I hate the idea of missing a third week of work.
Sunday, December 6, 2009
Thinking about backups.
I've got a 3TB RAID5 volume (three 1.5TB disks) that reads at 150-200MB/s, but only writes at 25-50MB/s.
I would like to have full backup capacity of all 3TB of data, but the question becomes "how"?
If we assume that the slow write speed on the software RAID 5 array stems from parity calculation, then it stands to reason that a RAID 0 array wouldn't suffer the same limitation. Additionally, a RAID 0 array of two 1.5TB disks would hit the 3TB volume size, as opposed to requiring a third disk as RAID 5 does.
I'm considering having a second, weaker box run software RAID0, and do a nightly rsync from primary box to the backup box. A dedicated 1Gb/s link would facilitate the copy.
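Back-of-the-envelope, assuming roughly 110 MB/s of usable gigabit throughput and that the RAID0 target can keep up:

3 TB / ~110 MB/s ≈ 27,000 s ≈ 7.5 hours

So the initial full copy is an overnight job; after that, a nightly rsync only has to move whatever changed.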
If a drive in the RAID0 array fails, I replace it, rebuild and re-run the backup. If a drive in the RAID5 array fails, I replace it and rebuild. If the rebuild kills a drive and the RAID5 fails, I've got a backup. Meanwhile, I've got an isolated power supply, reducing the number of single points of failure. I'm using fewer drives in the backup machine, reducing cost. I'm reusing older hardware for the backup machine, reducing cost.
Tricky part is figuring out offsite backups from there, but my data isn't that valuable yet.
IR->Bluetooth->IR
You know what would be nice to have? A near-proximity IR->Bluetooth->IR adapter.
By "near-proximity", I mean having it attached to the original IR transmitting device, have an IR sensor, convert that to a 2KHz 1-bit bitstream, send that via BT to a receiver that converts it back to an IR transmission near whatever device needs to receive it.
Two significant problems remain: Bulk and power. For bulk, one could take advantage of paper-ICs or other film integration. (When I was a kid, I saw thick-film ICs dating back to the 70s. They may have been around longer than that.)
For power, I don't know. Probably the best way to go about it is to leech off the existing remote control's battery pack. I can think of a couple ways one might do that without interfering with the remote's internal expectations of its power source.
Of course, you could just build the thing into a lithium battery pack, rechargeable via USB, and tout it as both a range and life extension of the remote. Lithium power density is such that you might be able to pack the lithium, charging circuitry and bt transmitter in the space of a couple AA batteries. Some mechanical finagling and shoe-horning might be necessary to fit different battery compartment configurations.
By "near-proximity", I mean having it attached to the original IR transmitting device, have an IR sensor, convert that to a 2KHz 1-bit bitstream, send that via BT to a receiver that converts it back to an IR transmission near whatever device needs to receive it.
Two significant problems remain: Bulk and power. For bulk, one could take advantage of paper-ICs or other film integration. (When I was a kid, I saw thick-film ICs dating back to the 70s. They may have been around longer than that.)
For power, I don't know. Probably the best way to go about it is to leech off the existing remote control's battery pack. I can think of a couple ways one might do that without interfering with the remote's internal expectations of its power source.
Of course, you could just build the thing into a lithium battery pack, rechargeable via USB, and tout it as both a range and life extension of the remote. Lithium power density is such that you might be able to pack the lithium, charging circuitry and bt transmitter in the space of a couple AA batteries. Some mechanical finagling and shoe-horning might be necessary to fit different battery compartment configurations.
Tis the Season to Barter
So my router was dropping 70-80% of packets, making it nightmarish trying to do anything via SSH. I called up a friend and asked him to pick up a cheap router from Best Buy, along with some new CAT6 ends (These are a *lot* nicer than the crap ones that took me 45 minutes apiece to do badly...).
Of course, an order like that comes to about $70, and I don't have that lying around in cash. PayPal was inconvenient for technical reasons, so we came up with a convenient solution... I just bought an equal amount of stuff from his Amazon wishlist, and am having it shipped directly to his registered address.
Friday, December 4, 2009
Migrating from a solid disk to software RAID.
So I've now got three 1.5TB disks in RAID 5, the array is assembled and running on bootup, and is mounted as my /home. Previously, I was using a 1TB drive for /home.
(I'm going to abbreviate this, including only the steps that worked, and initially omitting some of the excruciatingly long retries)
After installing the three disks into the machine, my first step was using Ubuntu's Palimpsest Disk Utility to build the software RAID device. That took about seven hours, unattended.
The next step was copying my old /home filesystem to the new RAID array, using dd. That took about nine hours, unattended.
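Assuming that was the full 1 TB drive going across, the timing works out to roughly the array's write ceiling:

1 TB / (9 x 3600 s) ≈ 31 MB/s

which is consistent with the write numbers further down (about 25 MB/s when feeding it from USB, 38 MB/s in bonnie).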
The next step was expanding the filesystem and tuning some parameters. ext3 doesn't automatically grow to fill the block device it sits on; the filesystem remembers the size it had when it was created. I had to use resize2fs to expand it from the 1TB it had occupied to fill the 3TB volume.
I looked at tune2fs and enabled a few options, including dir_index (I have a few folders with thousands of files in them), sparse_super (That saved a *lot* of disk space) and uninit_bg (Supposed to speed up fsck). I didn't read the man page clearly, and didn't discover until afterwards that by enabling uninit_bg, I'd given tune2fs the go-ahead to convert my filesystem from ext3 to ext4. Oh well...Seems to be working, and there are a few features of ext4 (such as extents) that I expect will come in handy.
The next step was to reboot and ensure that I could mount the array afterwards; I didn't want some screw-up on my part to waste all that time by failing the RAID volume. After establishing I could mount it, it came time to modify mdadm.conf and confirm that the array would come up on bootup. After that, all that was left was modifying /etc/fstab to mount the RAID volume at /home, rebooting, and restoring compressed tarballs and such from my overflow drive.
Filesystem Size Used Avail Use% Mounted on
/dev/md0 2.8T 1.2T 1.4T 47% /home
I've gone from having 8GB free on /home to having 1.4TB free. Can't complain.
root@dodo:/home/shortcircuit# dd if=/dev/md0 of=/dev/null
10985044+0 records in
10985043+0 records out
5624342016 bytes (5.6 GB) copied, 27.5768 s, 204 MB/s
18288001+0 records in
18288000+0 records out
9363456000 bytes (9.4 GB) copied, 46.1469 s, 203 MB/s
22992060+0 records in
22992059+0 records out
11771934208 bytes (12 GB) copied, 57.7066 s, 204 MB/s
Getting over 200MB/s in raw streaming read. Can't complain about that, either; I only read at about 70MB/s when pulling from a single (mostly) idle disk that's not part of the array.
Of course, it's not as good as I'd get with a hardware RAID card, but it's a heck of a lot better than I'd get otherwise. My comparative write speed sits at about 25MB/s when dd'ing from a USB drive to the md0 device. I probably should have tested the write speed by reading from /dev/zero before putting the filesystem in place, but the bonnie disk benchmark at least gives some non-destructive results:
Version 1.03c ------Sequential Output------ --Sequential Input- --Random-
-Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP
dodo 16G 38025 73 38386 8 25797 5 47096 85 161903 16 353.8 0
------Sequential Create------ --------Random Create--------
-Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
dodo,16G,38025,73,38386,8,25797,5,47096,85,161903,16,353.8,0,16,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++
For ext4 on top of the software RAID5 volume (consisting of three Seagate ST31500541AS), I get 38MB/s sequential output, 161MB/s sequential input, and 353 random seeks per second. Little to no difference between per-character writing and block writing.
Version 1.03c ------Sequential Output------ --Sequential Input- --Random-
-Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP
dodo 16G 50964 96 61461 15 29468 6 49902 87 84502 6 202.1 0
------Sequential Create------ --------Random Create--------
-Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
For ext3 on top of a single disk (a Seagate ST3500630AS), I get 51M/s sequential per-character write, 61M/s sequential block write, 50M/s sequential character read, 84M/s sequential block read, and 202 random seeks per second.
Long and short of it: a single disk kicks my software RAID5 volume's butt for sequential writes, but the software RAID5 blows away the single disk for sequential reads, and gets roughly a 75% improvement over the single disk's random seek rate.
One thing I find particularly interesting about this is that the three disks in the RAID volume are 5900 RPM spindle speed, while that single disk is 7200 RPM spindle speed. I suppose having three heads is better than one. :)
Tuesday, December 1, 2009
Well, I went RAID.
Three 1.5TB 5900 rpm Seagates in software RAID 5. Write speeds as high as 50MB/s, so I'm not unhappy. I'm in the process of dd'ing my old /home partition to it, and then I'll expand that filesystem to consume the whole 3TB volume.
As I only have 1.5TB of raw disk *not* part of the RAID, I'm going to peek at compressed filesystems for the 1TB disk. Trouble is, it takes 11 hours to copy a 1TB volume *to* the RAID via DD; I don't think my backups can be daily...
In other news, it's interesting watching SMART data on the drives. The "head flying hours" counter is already higher than the "power on hours", 2.2 days vs 1.8. Go figure.