alex's blog

External Hard Drive for Sale

http://phoenix.craigslist.org/evl/sys/1104192621.html

Free delivery for family and friends!

Sold! To the man in the funny hat. Yes, you sir, you in the back.

Easy way to list public methods for a Ruby object

I often have new code I'm trying to figure out, and constantly jumping back and forth between a console and a web browser full of documentation really slows me down. Oftentimes it's easy enough to remember which method you want just by seeing a list, even without detailed descriptions.

Any object has a public_methods method, which returns an array of strings listing the names of its public methods. Trouble is, all the methods it inherits are of course listed in there, and Object provides a LOT of stuff. It's easy for the interesting stuff about the instance at hand to get lost in that noise.

So, I'm trying a little experiment. I added this snippet to my ~/.irbrc file. Works on Linux and OSX, and in either normal irb or in the Rails console.

class Object
  # list public methods which are not inherited from Object
  # (public_instance_methods, so class-level names like "new" or "name" are not hidden)
  def pm
    (public_methods - Object.public_instance_methods).sort
  end
end
Now any object has a pm method. So what?

Loading development environment (Rails 2.2.2)
>> class Blah
>>   def foo; end
>>   def bar; end
>>   def blatz; end
>> end
=> nil
>> b = Blah.new
=> #<Blah:0x12b6b04>
>> b.pm
=> ["bar", "blatz", "foo"]
>> b.public_methods
=> ["returning", "to_yaml", "pretty_print_cycle", "inspect", "method_exists?", "stubs", "to_param", "extend_with_included_modules_from", "subclasses_of", "require_or_load", "clone", "method", "to_enum", "to_yaml_properties", "public_methods", "__metaclass__", "to_json", "suppress", "instance_variable_defined?", "instance_variable_names", "dclone", "equal?", "blatz", "freeze", "expects", "with_options", "foo", "methods", "respond_to?", "geometry_data_types", "instance_exec", "enable_warnings", "to_matcher", "to_query", "silence_warnings", "reset_mocha", "to_c", "dup", "enum_for", "instance_variables", "__id__", "copy_instance_variables_from", "duplicable?", "eql?", "object_id", "pretty_inspect", "require", "id", "send", "singleton_methods", "silence_stderr", "encode64", "class_eval", "taint", "taguri", "require_association", "stubba_method", "instance_variable_get", "frozen?", "instance_of?", "__send__", "require_library_or_gem", "b64encode", "to_a", "pretty_print", "taguri=", "daemonize", "remove_subclasses_of", "`", "type", "debugger", "blank?", "instance_eval", "protected_methods", "display", "==", "silence_stream", "unloadable", "decode64", "===", "acts_like?", "pm", "c", "to_yaml_style", "instance_variable_set", "extend", "kind_of?", "to_s", "extended_by", "class", "hash", "breakpoint", "present?", "mocha", "private_methods", "=~", "tainted?", "decode_b", "mocha_inspect", "instance_values", "untaint", "nil?", "load_with_new_constant_marking", "__is_a__", "bar", "stubba_object", "pretty_print_inspect", "require_dependency", "is_a?", "pretty_print_instance_variables", "metaclass"]

See the difference? Call it micro-documentation.

Installing Apple's calendarserver on Ubuntu

Update : 12 Dec 2009

I've added a new post describing how to install Darwin Calendar Server (aka DCS, aka Apple's Calendar Server) from source on Ubuntu 9.10.


I've wanted to have a shared family calendar for a while. We've tried paper calendars. They never got updated, and moving things around was a pain. Sara may disagree with me; she still prefers a regular paper datebook. But the ability to remind each other of when various things are scheduled is really better in the electronic world. My work uses calendarserver (part of OSX Server) to schedule meetings, and combined with a client program like iCal, it's a great system. Just because it's Apple, though, doesn't mean it's only available for Mac users. Programs like Outlook for Windows or Evolution for Linux can also subscribe to calendars published by calendarserver. You can find a partial list of compatible clients at http://trac.calendarserver.org/wiki/CalendarClients, but any program which can speak CalDAV should work fine.

CalendarServer is, itself, freely available. You don't need a Mac server, and you don't have to buy OSX. It's written in Python, so it runs on pretty much any platform, and is distributed under the Apache License 2.0. It's even available via apt in recent Ubuntu releases, which makes it really easy to set up if you have an Ubuntu machine handy. These are some quick notes about how I configured this for our local network.

Monitor resource overruns in OpenVZ

OpenVZ is a great virtualization tool for Linux servers. I've repeatedly run up against various resource limits, which can sometimes lead to really weird errors like 'cannot allocate memory' when you do something awful like ls -l. I cooked up the following script to keep a log file of times when a server overruns its bounds. I can then either raise the limits, or try to correlate the overrun time to something going on at that time. Read more for the full details.
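This is not the script from the full post, just a minimal sketch of the idea in Ruby. It assumes the usual /proc/user_beancounters layout, where the last column of each resource line (failcnt) counts how many times that limit has been hit. The helper names parse_beancounters and overruns, and the SAMPLE data, are made up for illustration.

```ruby
# Parse the text of /proc/user_beancounters into an array of hashes.
# Assumes the usual layout: a "Version:" line, a header line, then one
# line per resource ending in: held, maxheld, barrier, limit, failcnt.
def parse_beancounters(text)
  uid = nil
  rows = []
  text.each_line do |line|
    fields = line.split
    next if fields.empty? || fields.first == 'Version:' || fields.first == 'uid'
    # The first resource line of each container starts with "<uid>:".
    uid = fields.shift.chomp(':') if fields.first =~ /\A\d+:\z/
    next unless fields.size == 6
    name = fields.shift
    held, maxheld, barrier, limit, failcnt = fields.map { |n| n.to_i }
    rows << { :uid => uid, :resource => name, :held => held,
              :maxheld => maxheld, :barrier => barrier,
              :limit => limit, :failcnt => failcnt }
  end
  rows
end

# Resources whose fail counter is non-zero, i.e. limits that have been hit.
def overruns(rows)
  rows.select { |row| row[:failcnt] > 0 }
end

# Invented sample data in the assumed format.
SAMPLE = <<EOS
Version: 2.5
       uid  resource           held    maxheld    barrier      limit    failcnt
      101:  kmemsize        2234123    2500000   11055923   11377049          0
            privvmpages       65000      65536      65536      69632         12
EOS

overruns(parse_beancounters(SAMPLE)).each do |row|
  puts "#{Time.now} uid=#{row[:uid]} #{row[:resource]} failcnt=#{row[:failcnt]}"
end
```

On a real server you would pass in File.read('/proc/user_beancounters') (normally root only) and append the output to a log file from cron. Since failcnt is cumulative, a real script would also need to compare against the previous run to spot new overruns.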

Syntax highlighting problem with .html.erb files in Kate text editor

Recently, I've been trying out the Kate (http://kate-editor.org/) text editor when programming. It's free, lightweight, and seems quite nice. It has most of the features of TextMate (my current favorite) and runs on Linux.

Last weekend I noticed the syntax highlighting for Ruby on Rails view templates (files ending in .html.erb) was all messed up. Matching tags were not marked, functions weren't highlighted, and everything was basically really difficult to read.

OSX clipboard from a Terminal

Sometimes I want to get something into the system clipboard without using a mouse. For example, I might want the entire contents of a file in the clipboard, without the odd linebreaks that text editors are prone to inserting on their own.

I discovered the 'pbcopy' and 'pbpaste' tools, which are just perfect.

$ echo "blah" | pbcopy

Now "blah" is in my clipboard. I can Command-V in any window, and there it is.

I can also manipulate the contents of the clipboard in any script with 'pbpaste'.

$ pbpaste | wc -l

I just counted the number of lines of text in the clipboard. Yay!
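The same two tools can be driven from a Ruby script. This is a rough sketch, assuming pbcopy and pbpaste are on the PATH (OSX only). The helper names are made up, and the command arguments exist only so the helpers can be tried with a stand-in command on other systems.

```ruby
# Send text to the clipboard by piping it to pbcopy.
def clipboard_copy(text, command = 'pbcopy')
  IO.popen(command, 'w') { |io| io.write(text) }
  text
end

# Return the clipboard contents as a string by reading from pbpaste.
def clipboard_paste(command = 'pbpaste')
  IO.popen(command, 'r') { |io| io.read }
end

# Count newlines in the clipboard, like `pbpaste | wc -l`.
def clipboard_line_count(command = 'pbpaste')
  clipboard_paste(command).count("\n")
end

# Example usage on OSX (notes.txt is a hypothetical file):
#   clipboard_copy(File.read('notes.txt'))
#   puts clipboard_line_count
```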

http://www.commandlinefu.com/commands/view/751/mac-os-x-copy-and-paste-t...

Oh God not this again...

I really enjoy being part of the Phoenix Linux Users Group (PLUG). It's a group of smart, friendly, helpful folks who love free software and solving technical problems.

Some, unfortunately, also love to bitch about Microsoft and spew gossip and conspiracy theories with some regularity. Seems everything Microsoft does is pure evil, and the heroes of freedom need to oppose them at every turn!

The 'us vs. them' mentality seems quite childish to me. I don't have any love for Microsoft, but my interest in free software is about what it IS, not what it ISN'T. If Microsoft wants to lock things down, invade privacy, and charge a lot of money for it, I'll continue to not use their products and advise others to do the same. That doesn't automatically mean everything they come up with is existentially wrong.

There's a lot more I could say on this topic, but instead I'll finish with a blatantly unrelated cute picture to show how great it is to share.


Test::Unit gotchas

I've run across a few things worth remembering when writing unit tests for Ruby using Test::Unit. Might be obvious to some, but they tripped me up once or twice.

class SomeTests < Test::Unit::TestCase
  def test_something_happened
    puts 'A'
  end

  def test_something_happened
    puts 'B'
  end
end

Because Ruby lets you redefine methods at runtime, the second definition silently replaces the first. You get 'B' in your test output, and the first test_something_happened is never run. Shouldn't usually be a problem, but this is one of those times I wish you could make Ruby return an error, or at least a warning, when methods are redefined. If 2 tests get the same name, you could have a broken test with no warnings.
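One way to get that kind of error is Ruby's method_added hook. The sketch below is not part of Test::Unit. NoDuplicateMethods is a made-up module name, and it only catches a name defined twice in the same class, so it would also complain about deliberate reopening.

```ruby
# Hypothetical helper: raise as soon as a method name is defined a
# second time in any class that extends this module.
module NoDuplicateMethods
  def method_added(name)
    @__seen_methods ||= {}
    if @__seen_methods[name]
      raise "#{self}##{name} is defined more than once"
    end
    @__seen_methods[name] = true
    super
  end
end

class SomeTests
  extend NoDuplicateMethods

  def test_something_happened
    'A'
  end
end

begin
  # Reopen the class and reuse the name, as in the example above.
  class SomeTests
    def test_something_happened
      'B'
    end
  end
rescue RuntimeError => e
  puts "Caught at load time: #{e.message}"
end
```

In a real test case you would put the extend line at the top of the Test::Unit::TestCase subclass, so the file fails to load instead of quietly dropping a test.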

def test_load!_throws_exception
  assert_raises SomeException do
    Record.load!( invalid_record )
  end
end

Apparently, you can name a method 'load!', but you can't include '!' in the name of a test method. test_load!_throws_exception is never run, and no errors are generated. When I rename the method to test_load_bang_throws_exception, then the test actually gets run.
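A likely explanation, which is an assumption about Ruby's parser rather than something verified against Test::Unit here: since '!' can only end a method name, the definition seems to be read as a method called test_load! that takes one argument named _throws_exception. A test runner that only picks up test methods callable without arguments would then skip it silently. The plain-Ruby sketch below (BangNameDemo is a made-up class) prints the method names and arities so you can check this on your own Ruby.

```ruby
class BangNameDemo
  # Intended as one long name, but the '!' appears to end the name early.
  def test_load!_throws_exception
    :probably_skipped_by_the_runner
  end

  # The renamed version from the post.
  def test_load_bang_throws_exception
    :runs_normally
  end
end

names = BangNameDemo.instance_methods(false).map { |m| m.to_s }.sort
puts names.inspect

# An arity greater than zero means the method requires arguments.
names.each do |name|
  puts "#{name}: arity #{BangNameDemo.instance_method(name).arity}"
end
```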

Running your test case in this way :

  $ ruby -w2 test/unit/some_test.rb

will at least get you a warning, but it's not the default. It would be nice if the Rails rake tasks enabled warnings like that when running 'rake test'.

Lots of little methods?

UPDATE 12/16/2010

http://reprog.wordpress.com/2010/03/28/what-is-simplicity-in-programming/

This is a great article, and gets to the heart of the issue with much more clarity than my little rough-draft below.

I completely agree there are different kinds of simplicity. Do you want fewer methods, fewer files, fewer variables, or otherwise? I'm skeptical that any such simplistic metrics really serve much purpose in a real sense, but they do point out the kinds of tradeoffs we face every day when writing code.

I think the real issue is making the right tradeoffs for the task at hand. As Stanley Fish said somewhere (in a book I read once, but now can't remember the title of) we should favor rules of thumb over absolute laws.


The consensus I hear in OO design is to prefer lots of little methods over big monolithic do-everything methods. This makes plenty of sense to me. Little methods are easier to re-use and easier to test. The additional claim that they are easier to understand has been a sticking point for me. I don't necessarily agree that's true.

In terms of just 'what does this code do', it's certainly easier to understand 5 lines of code than 50 lines. But in terms of 'why would you want to do that', I don't think there's an obvious correlation between length and comprehensibility. An API with hundreds of tiny methods isn't simple, and isn't necessarily any easier to understand than the same lines of code grouped into 10 functions. If anything, it can be MORE complex to understand, because you probably have to call several of these little methods (and probably call them in a specific order) to do anything useful.

That being said, I still think lots of little methods are the way to go. I would also like to see some higher level methods, which don't do much more than call sets of more fundamental methods. This allows a new user of the API to see not just WHAT is done, but also WHY. Answer the 'who cares' question. I think to some extent there's an inverse relationship between comprehensibility and the number of options you are presented with. The more choices you have, the more you have to examine them all to figure out what's the right thing to do. Give me lots of little methods, but also give me those higher-level do-something-useful methods.

And, please God, name them well and name them consistently!

Not happy with Active Record today.

UPDATE 1/13/2009
I ran across an article on InfoQ about Performance Anti-Patterns in Database-Driven Applications. Check out the section titled 'Load More Data Then Needed'. It's very relevant to my beef with ActiveRecord, where 'SELECT *' is the default.


Using :include always uses LEFT OUTER joins, even when it really should use INNER joins. No way to configure or change that. It also ignores your :select clause, so if you only want a few fields from the dependent models, too bad! This stinks when you want a text field from a dependent model, and you're forced to bring a giant GeoRuby polygon along for the ride.

Using :joins does INNER joins, which is great. But if you later reference a dependent model, AR goes back to the db for 'SELECT * FROM dependents', even if you've selected data from that table in your original :select.

So there just doesn't seem to be a way to
* Load dependent models with INNER joins
* and only load a few fields for those models.

My 'rant in a comment' from the source code I was fighting looks like this...

@locations = AlertLocation.user_id( current_user.id ).find(
  :all,
  # :select=>'alert_locations.*, us_counties.name, us_zones.name, us_zones.state',
  # :joins=>[ :us_county, :us_zone ],
       
  # :include rather than :join to prevent later 'select * from us_zones', etc... (which is stupid because the data's already in the resultset)
  # :include does an outer join when it should do an inner join, but can't find a way to configure that.
  # 1 outer join query is better than N*2 inner joins. (1 us_zones and 1 us_counties per location)
  # equally stupid... :include ignores my :select line and gets everything from all tables, including stuff I don't want like the giant us_zones.the_geom.
  # so which is less bad, way too many little queries or one way-too-bloated query?  I choose way-too-bloated for now.
  :include=>[ :us_county, :us_zone ],
  :order=>:user_label
)

http://www.slideshare.net/RowanHick/how-to-avoid-hanging-yourself-with-r...

A trouble ticket describing this issue is marked as 'wontfix'. Drag.
http://dev.rubyonrails.org/ticket/7147

Major Security Flaw in Internet Explorer

I don't know all the details, but it sounds like this is a pretty bad one. I don't have an axe to grind, and this isn't just some oddball tech website reporting the story. This is a really major flaw that's being reported in a lot of major news outlets. This is a big deal.

http://news.bbc.co.uk/2/hi/technology/7784908.stm
http://tech.yahoo.com/blogs/null/111811;_ylt=AqJLQ7r2VhquAHfYFHFXhYAazJV4
http://voices.washingtonpost.com/securityfix/2008/12/microsoft_big_secur...
http://www.abcnews.go.com/Technology/wireStory?id=6478928

If you use IE to do anything at all online (read email, shop, whatever), you really should stop as soon as possible. You are opening yourself to having your computer taken over by crackers, and no anti-virus software will save you. Simply visiting a specially-coded website is all it takes. No other browser is affected by this problem. Only IE. All versions of IE from IE 5 (ancient) to IE 8 (not yet released) are affected.

There are lots of good alternatives. http://www.getfirefox.com is one.

What's linked to an OSX binary?

On Linux, I'm used to doing something like this to list which shared objects are linked to a binary.

 ldd /usr/bin/whatever

OSX does things a little differently.

 otool -L /usr/bin/whatever
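If you bounce between both systems, a script can pick the right tool for you. This is only a sketch: linked_libraries_command is a made-up helper, and it assumes anything that isn't OSX (Darwin) has ldd available.

```ruby
require 'rbconfig'

# Return the command (as an argument list) that lists the shared
# libraries linked to a binary on the given platform.
def linked_libraries_command(path, host_os = RbConfig::CONFIG['host_os'])
  if host_os =~ /darwin/
    ['otool', '-L', path]   # OSX
  else
    ['ldd', path]           # assumed: Linux or similar
  end
end

puts linked_libraries_command('/usr/bin/whatever').join(' ')
```

Pass the resulting array to system(*command) to actually run it.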

http://www.omnigroup.com/mailman/archive/macosx-admin/2002-April/029578....

Multiple SMTP Servers in Thunderbird

I've used Mozilla Thunderbird as my email client for years. Overall, I love it. Spam filtering is top-notch, the threaded view is great for reading mailing-list traffic, it's an RSS reader, and the huge range of free extensions allow me to add all sorts of functionality that isn't in the base application.

New Ganglia released

I helped fix some cross-site scripting vulnerabilities in the Ganglia web frontend application. My code was released as 3.0.6.

http://ganglia.info/?p=60

I chose the name 'Foss' since Ganglia has a convention of naming releases after aviators. Joe Foss was a South Dakotan who flew fighters for the Marine Corps in WWII. http://en.wikipedia.org/wiki/Joe_Foss

UPDATE

Some bugs introduced in 3.0.6 prompted the release of 3.0.7. Of course I hate to say my code has bugs, but I don't feel too bad about this one. The time from problem report to the release of 3.0.6 was about 48 hours. Fixing the XSS problems was deemed far more important than a full test period prior to release.

PHP4 and PHP5 on 1 Apache server using FastCGI

I did a presentation at last month's AzPHP meeting on how to run both PHP4 and PHP5 at the same time on Apache. PHP4 runs as an Apache module, and PHP5 runs as a FastCGI program.

You get a lot of flexibility out of this setup, and avoid most of the performance problems usually associated with CGI. I put the talk notes up on this website for anyone who's interested.

Synergy Multi-System Setup

I have 2 computers at work. One is a PowerMac G5, and one is a Dell running Ubuntu Linux. I switch frequently from one computer to the other, and having 2 keyboards and mice got to be a real nuisance. I've experimented with a few setups which allow just 1 keyboard and mouse to control both computers, and this is a quick report/howto on what I'm using.

Option 1 : KVM Switch

A KVM (keyboard, video, mouse) switch lets you hook up 1 keyboard, monitor, and mouse into both computers. You use either a button on the switch or a keyboard command to tell it which computer you want to control. This is kinda what I wanted, but not really. I have 2 monitors, and I want to be able to zip back and forth between the 2 without having to hit a button.

I improved the setup a little bit by setting the 2 monitors side by side, and only hooking the keyboard and mouse into the KVM switch. The monitors were hooked directly to their respective computers. This way, I could see both monitors at the same time, but switching between the 2 still felt clunky and slow. I wanted the 2 machines to act like one big desktop.

Why all the SQL hate?

I've been trying out a few different PHP frameworks, and generally getting more familiar with full-blown OO design and development. When they talk about how much time the framework will save you, just about every one makes some kind of claim like 'you never have to write SQL again'! What they usually mean is that the framework includes some object persistence layer that lets you insert/update/delete data via object methods rather than SQL.

I can see the benefit of this in a number of contexts, but I still have concerns.

The typical persistence layer will turn $obj->load() into something like "SELECT * FROM table", and that's a problem in my mind. Every time you load the object, you get EVERYTHING from the row, even if you only actually need the data from 1 column. It gets worse if the object you're loading is spread over multiple tables - you get all data from all tables. The object makes your PHP code nice and neat, but it translates into unnecessarily large queries, and lots of unnecessary data being sent from the database to the application.

Given that database access is often the #1 bottleneck in a web application, this seems like a formula for making performance worse.

Thoughts on PHP Certification

As I previously wrote, I took (and passed) the Zend PHP certification exam last fall. I even got an ultra-special logo for being one of the first 1000 people to pass. Neato.

I've been asked a few times since then if I thought it was a good test, would I do it again, should (other person) take the test, etc. I haven't really known how to answer. I mainly took it as a personal challenge. At the time I was self-employed, and I didn't often get to interact with other developers. I wanted to know if my PHP skills measured up in any meaningful way. I was happy with the outcome of my test, but I'm still not sure about its long-term value.

The test I took establishes a base level of familiarity with PHP. It won't identify superstars. Anyone who has programmed for a year or two, and spends a few weeks preparing, can pass this test. It is more of a 'weed out the boneheads' filter than a badge of genius. (It removes a negative possibility, rather than adding a positive.) This might seem obvious, but when the test is promoted as a way of 'making you stand out among job applicants', I think the implication is a bit off the mark.

Lastly, I'm really underwhelmed by the self-aggrandizing promotion for this test by the people who have created it, and who are making money selling test prep courses. Again, probably no big surprise that this is going on, but it still is a turnoff to me.


Problems installing Seagull? Try disabling APC.

Yesterday, I installed the Seagull PHP framework, and ran into an odd problem. I grabbed the latest source, untarred it on my web server, and browsed to the installation script. (seagull/www/setup.php). All went just fine until I hit 'Submit' on Step 4. The 'Step 5' page presented an ugly PHP error message.

Fatal error: Call to undefined method HTML_QuickForm_hidden::HTML_QuickForm_element() in /home/alex/public_html/seagull-0.6.0RC1/lib/pear/HTML/QuickForm/input.php on line 50

This struck me as very odd, because it's an error reported not by the Seagull code, but by the PEAR HTML_QuickForm class, which is used by Seagull. QuickForm is a stable product, has been around for a while, and is widely used. Even stranger, the very method which is reporting an error was used in all previous stages of the Seagull installer. How can it exist on the previous pages, and then 'disappear' at stage 5?

I don't know the precise answer to this question, but I'm pretty sure I know where the problem lies: the APC caching system. I had installed it earlier in the day, as it's recommended by Seagull. I'd heard good things about it previously, so I figured it was worth a try. The APC install went smoothly. After installing, I decided to browse the deanspot.org site to see if I could notice any speedup due to the caching.
