Monday, January 19, 2009

R'thoria's Used Plot Elements: Disney Edition (Vol. 2, Ep. 002)

Ladies and Gentlemen, boys and girls, welcome to the special Disney edition of RUPE*! Citing financial difficulties, Disney recently announced a number of layoffs and general cutbacks, and has begun to sell off its back stock of plot elements to independent retailers, such as yours very truly. Those plot elements are now available directly to YOU at discount prices!

[* Definitely NOT G-Rated whenever we can avoid it!]

So, without further ado, let me introduce the best of the stock, everyone's favorite story: Sleeping Baby! Enjoy this touching, four-hour, continuous-shot adventure that critics have called the most nap-tacular event of the century!

Next, we have the cinematic masterpiece based on an ancient French fairy tale that dazzled audiences when it was first released. Integrating computer-generated graphics with Disney's classic hand-drawn style, Beauty and the Beet tells the touching story of a young woman who discovers that external appearances can be deceiving, and that a person's roots are more important.

Lilo and Stick was the last of the hand-animated films to come from Disney Studios. Set in the beautiful Hawaiian Islands and rendered in luscious watercolors, this refreshing tale shows how a young girl, and her best friend, can make a family in the face of difficulty.

Choose from these and many others, including The Cryin' King, The Emperor's New Boob and that fascinating medical documentary, The Sore and The Stone!

So remember: for ACTION, ADVENTURE, luscious watercolors and gratuitous boob jokes, visit us at RUPE, where our prices can't be matched!

Saturday, January 10, 2009

Difference in POV between Centralized VCS and Distributed VCS

'However, the total cost of branching is paid by reduced code velocity to main, merge conflicts and additional testing can be expensive. Throughout this guidance we ask the user to confirm that a branch is really needed and always ask the question "how does this branch support my development project?"'
-- from the TFS branching guide.


I came across this interesting passage yesterday, and I was struck by the profound difference in point of view between centralized version control and distributed version control.

In distributed version control systems (dvcs), branching is fundamental (in most scenarios): every "checkout" is really a branch of the repository, with full history. Therefore the focus in developing dvcs has been on making branching and merging as simple, quick and painless as possible.

Now, arguably, branching is also fundamental to all version control; after all, when you check out a copy of the code and modify it, you make a kind of lightweight, local branch for your changes that is quickly merged back into the trunk. But, with centralized version control, the focus is on control of the codebase, which is concentrated in the trunk. So, not only is branching not optimized (because of the lack of focus), but it is indirectly made more costly by the controls put on the trunk. And, because local checkouts aren't recognized as 'branches', you lack most of the usual tools for managing them separately from the trunk/server, like full history and an easy way to exchange patches with another developer.

Now, this isn't to say that the above advice isn't germane (PSP: take a shot!). If you are using a centralized version control tool, then you should absolutely consider the costs associated with branching, because it is quite expensive. And, arguably, you could go crazy with dvcs as well, but it takes a lot more work, I think. But the elaborate schemes that the aforementioned branching guide lays out for its various scenarios just seem baroque and unnecessary when considered from the perspective of a dvcs.

Wednesday, January 07, 2009

"Poetry"

I once knew a young man of Ire
Whose toves, quite slithy, would gyre.
But of gimble they'd none;
Said he, "It's no fun!
And it's manxome how quickly they tire!"

Tuesday, January 06, 2009

Better diff/merge

Wouldn't it be nice if your VCS/SCM understood the code that was checked in, so that it could show diffs whose boundaries line up with the actual units of code, like methods?

For example, say I modify a method and delete the method right after it. The diff shouldn't show that as a single change, but as two (one change for the modify, one for the delete). Or say I move a method within the source file but leave it otherwise unchanged (or even change it!): that should show as a 'move' (and potentially, then, a change as well!).

Some of this could be done without understanding the code much, e.g., by looking for identical blocks of text that appear in different places in the old and new files, which constitute 'moves'.

And it wouldn't have to happen when sending changes to the server, just when comparing the two versions for human consumption.
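Here is a rough sketch of the "identical blocks" idea, using Python's standard difflib module. The function name find_moved_blocks and the min_lines threshold are my own inventions for illustration, not part of any real VCS: the sketch runs an ordinary line diff, then checks whether any deleted chunk shares a long enough run of identical lines with any inserted chunk, and reports each such run as a move.

```python
from difflib import SequenceMatcher


def find_moved_blocks(old_lines, new_lines, min_lines=2):
    """Return (old_start, new_start, length) for runs of lines that were
    removed from one place and inserted verbatim somewhere else.

    Hypothetical helper: a text-only heuristic, no understanding of the code.
    """
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    deleted = []   # (start index in old, lines) for removed chunks
    inserted = []  # (start index in new, lines) for added chunks
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted.append((i1, old_lines[i1:i2]))
        if tag in ("insert", "replace"):
            inserted.append((j1, new_lines[j1:j2]))

    moves = []
    for old_start, old_block in deleted:
        for new_start, new_block in inserted:
            # Longest run of identical lines shared by the two chunks.
            inner = SequenceMatcher(None, old_block, new_block, autojunk=False)
            m = inner.find_longest_match(0, len(old_block), 0, len(new_block))
            if m.size >= min_lines:
                moves.append((old_start + m.a, new_start + m.b, m.size))
    return moves


old = ["def a():", "    return 1",
       "def b():", "    return 2",
       "def c():", "    return 3"]
new = ["def b():", "    return 2",
       "def c():", "    return 3",
       "def a():", "    return 1"]

# Method a() moved from the top of the file to the bottom.
moves = find_moved_blocks(old, new)
```

This only catches one shared run per deleted/inserted pair and knows nothing about method boundaries, so it is a starting point for a display-time comparison rather than a real code-aware diff.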

Saturday, January 03, 2009

More googol fun

Earlier today I posted on twitter that you would need only 333 bits to store the value 10^100 (one googol), which is really just saying that there are about a googol states that 333 bits can be in (actually there are somewhat more, since 2^333 is about 1.75 googol, but to a rough order of magnitude it's accurate).

That, of course, got me wondering how many bits it would take to represent a googolplex, or 10^googol = 10^(10^100). Actually, in my enthusiasm I asked my computer to display a googolplex, since it had so readily displayed a googol (which, written out, is really just 101 characters). But I realized my folly and quickly aborted that request.

But I was still wondering how many bits it would take. It turns out there's actually a straightforward way to solve this problem:

Let's start with the derivation for the number of bits needed to store the value googol.

x = 10^100 = 2^a , where a will represent the needed number of bits

log_10(x) = log_10(10^100) = 100 = log_10(2^a)

How to get the 'a' out? By remembering that log_w(s^t) = t * log_w(s); that log_w(w) = 1, and that we can convert bases as follows:

log_g(p) = log_h(p)/log_h(g)

Therefore,

100 = a * log_10(2)

100/log_10(2) = a * log_10(2)/log_10(2) = a * 1 = a

and so 'a' is ~332.1928 which we'll round UP (since we have to work in whole bits) to 333.
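A quick sanity check of that result in Python, which handles arbitrarily large integers natively. The variable names are just mine; the sketch compares the logarithm-based answer with the bit count Python itself reports for the integer 10^100.

```python
import math

googol = 10 ** 100

# a = 100 / log_10(2), as derived above.
bits_from_log = 100 / math.log10(2)

# Round up, since we can only have whole bits.
bits_rounded = math.ceil(bits_from_log)

# Python's own count of the bits needed to hold the integer.
bits_exact = googol.bit_length()

print(bits_from_log, bits_rounded, bits_exact)
```

Both routes should agree on 333 bits.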

FINE AND DANDY SAYS I :-)

But what about our friend the googolplex?

We can use the same derivation as above, but substitute 10^100 for 100 (since log_10 of a googolplex is 10^100), so that we find

10^100 / log_10(2) = a

or, about 3.32 * 10^100 bits. That's right, we'd need 3 googol bits to store a googolplex. And there aren't that many bits. In the whole universe.

Now, of course, if we used something with more states, then we could represent a googolplex in fewer of whatever we're using; for example, with a three-state object we would only need about 2.1 googol of them; with a 60-state object, only 5.6 * 10^99, or about half a googol. Unfortunately, things don't improve very quickly, because the count only shrinks as 1/log_10 of the number of states. With a million-state object (10^6), we're only at 1/6 googol, which is still WAY WAY too many.
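The same derivation works for any number of states k: the count of k-state objects is 10^100 / log_10(k). Since the googolplex itself is far too big to compute with, the little sketch below works in units of one googol, where the answer is simply 1 / log_10(k). The function name googols_of_objects is made up for this example.

```python
import math


def googols_of_objects(states):
    """How many `states`-state objects a googolplex needs, in units of
    one googol: (10^100 / log_10(states)) / 10^100 = 1 / log_10(states).
    """
    return 1 / math.log10(states)


for k in (2, 3, 60, 10 ** 6):
    print(k, round(googols_of_objects(k), 3))
```

Going from 2 states to a million only cuts the count by a factor of about 20, which is the slow improvement described above.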

Anyway, there's not really a point to this rambling, I just think that really large numbers are cool.