alex's blog

How to not get stuck.

The software equivalent of writer's block is what I call 'getting stuck'. You have a problem, and for whatever reason you're blocking on how to move forward. Maybe it's especially hard, maybe it's boring, but for whatever reason you end up doing other things instead of what you need to be doing.

This all is a little bit embarrassing to write, since it's a post about how I sometimes end up wasting time and not doing what I know I need to be doing. Lately I've been actively trying to understand and fix this problem, and I've concluded that becoming more aware of the problem is key to solving it. I write now because I expect I'm far from unique in this respect. I hope that others who have similar experiences may find some value in how I'm trying to describe things here.

Recognizing your own tendencies when you get stuck can be really helpful.

It's good to be able to step back and recognize habits, and think intentionally about how you can modify them to help get you back on track as soon as possible.

  • I'm in danger of getting stuck when the problem is big. When it's too much to think about, it's easy to drift onto other things and put off the big decisions.
  • I get stuck when I have a big task and I get interrupted a lot. It's hard to get in the groove of a big project when something else is coming up every few minutes.
  • I get stuck when issues outside of work pull my attention away. This is similar to 'getting interrupted', but is more about other stuff which is on my mind, rather than other people who need my attention.
  • The real kicker is that I get stuck when I've already been stuck for a while. Stuck-ness has momentum, and the longer it goes on, the more intractable the problem seems. Believing in the intractability of a problem is a huge motivation-killer.

Recognizing that you have a problem is the sine qua non. You won't get very far until you at least recognize that there's an issue to be dealt with. Just understanding you have a problem is no guarantee you'll be able to easily overcome it, but it's rare to overcome a problem you're not aware you have.

What does it look like?

  • I check email and Twitter too often.
  • I get a snack I'm not really hungry for.
  • I do dumb stuff like brush my teeth in the middle of the day.
  • I read Wikipedia pages about completely random events.

I expect this list is familiar to most people, and could easily contain hundreds of other items. Really, the point of this list isn't so much what's on it, but rather what's not on it... the task you should actually be working on.

How do you get unstuck?

Here's the real question. Generally, drifting into stuck-ness happens unconsciously and unintentionally. I never say to myself "hey, this is a tough project, I think I'll blow some time doing nothing." But the truth is, that's exactly what happens when you're stuck, and productivity suffers. This leads to self-inflicted frustration that you're still faced with this big problem which hasn't been dealt with.

  • Just do it. Dive in and force yourself to stay focused. Sometimes this works, but often it doesn't.
  • Carve it up. The best way to solve a big problem is to make it a series of little problems.
    • Details help, but beware that planning itself can become procrastination. (aka "paralysis by analysis". See 'Just do it' above.) The point is to get moving again, not to plan it to death.
    • I like to ask: "What do I need to do and in what order?" and "What do I need to get done in the next 2 hours to feel like I'm getting somewhere?"
    • The easiest first answer is "I don't know.", and the correct response is "What do I need to do to start figuring that out?"

    The point is to arrange things in such a way that you can solve a series of smaller problems instead of solving some enormous problem all at once.

  • Explain the problem to someone. This is really powerful. Just having to explain the issue and its background to someone not already familiar with it forces you to think about the problem differently. It gets you to back up, and often helps you re-examine assumptions you didn't realize you were making. I have often started long emails to co-workers or mailing lists, only to eventually delete them because the act of explaining the problem in detail got me rolling again.
  • Ask for help. If the problem really is too big, or you really don't know what to do and you've been in that situation for hours, it's better to admit this than to continue to be unproductive.
  • Take a break. This is really different than checking email or any of the other avoidance strategies I mentioned. Taking a break (by taking a walk around the block, maybe) to clear your mind is not at all the same as simple mindless avoidance. The difference is that you're intentionally trying to change direction, rather than giving in to some mental Brownian motion. Nobody but you will be able to tell which one you're doing, but you should be able to. I strongly think this should be done outside of your normal work environment, otherwise it's too easy to drift back into the unproductive time-wasting behaviors which get you nowhere.
  • Change your environment in some small but noticeable way. Is the light on? Turn it off. Is the window closed? Open it. Are you sitting down? Walk around or stand on your chair. Is nobody else there? Talk about the problem out loud anyway. Don't mutter, speak confidently. Explain it like you're giving a keynote address. The key is that if you're in a rut, don't stay there. Little tweaks to your surroundings can help jolt you into motion.

I find that since I spend so much of my day doing things purely in my head and on my keyboard, shifting to doing something which is more physical and less mental is a really good way to loosen the blockages. I've had lots of "AH-HA!" moments walking my dog, taking a shower, or doing other things which have no bearing on the actual problem at hand. Don't underestimate the power of these moments, but also keep in mind that "just do it" is often the right decision.

You have to work on knowing yourself, and figuring out which strategy is appropriate to the situation. I tend to take some quick break in my office, then try to "just do it", and then try something more disruptive (like asking for help or taking a walk) if I'm still not getting anywhere.

Over time, it should happen less and less.

If it doesn't, you need to find the patterns in what gets you stuck. Is it always hard to come up with the ideal database schema, or to feel like your test coverage is good enough? Are you uncomfortable with filesystems? Does Oracle seem like a total mystery you'd rather avoid?

If you're constantly facing things you hate, figure out how to get onto different projects. Getting stuck is, I think, both a strong indicator of and a cause of unhappiness. "Just do it" is a great short-term strategy, and a terrible long-term strategy. You should strive for a work environment (and a mental environment) where stuck-ness is the exception not the rule.

Which one is that, again?

I'm really enjoying reading "High Performance JavaScript" by Nicholas C. Zakas and others. There's great advice on how to write better JavaScript, backed up by clear explanations about how the language works and why their recommendations are what they are.

I'm much less impressed by many of the charts they include. Try this example:

[Image: Scan10002, a chart scanned from the book]

Answer these questions for me:

  • Which one is Firefox 3, and which one is IE 8?
  • Which one is Firefox 3.5, which one is IE 7, and which one is Opera 10 Beta?
  • Which browser is represented by a dashed line with circles?

My best guess is that this was originally a color chart which wasn't re-thought when it went to the black & white press. In my mind, that's poor quality control, and really detracts from the overall quality of the book. I'm still in the early chapters of "High Performance JavaScript", and overall I still have to give it a thumbs-up, but little annoyances like this definitely detract from the experience.


Creating a bonded network interface in Red Hat linux

Introduction

Network interfaces (NICs) are crucial to any server machine. If you can't talk to the network, you can't do much of anything. In any high-availability system, you want to remove as many single points of failure as possible, and NICs are no exception. The Linux kernel provides several ways to aggregate (bond) many physical NICs into a single virtual interface. This interface looks like a single thing to any applications which use it.

There are 2 primary use cases for using bonding:

  • Load balancing for higher throughput: If you have more network traffic than can flow through a single interface, you can build a single bonded interface which uses all the bandwidth of its physical interfaces. Keep in mind that this doesn't necessarily provide high-availability. (If you NEED 2 Gb/sec of throughput, losing one of your two 1 Gb/sec interfaces will kill you.)
  • Failover for high availability: Sets of physical interfaces can be configured in primary/secondary arrangements. Traffic flows through only 1 interface at a time, but if that interface fails for some reason, the bonding driver starts routing traffic through the secondary as soon as it detects the failure.

There are many bonding modes supported by the Linux kernel. Some provide load balancing, some provide high-availability, some may provide both. Some require cooperation/configuration on the switch, and some don't. Modes are described in detail at http://www.linuxfoundation.org/collaborate/workgroups/networking/bonding

Configure Network Interfaces

First we create a configuration script for the virtual interface. This is the thing which can have IP addresses configured on it.

/etc/sysconfig/network-scripts/ifcfg-bond0

DEVICE=bond0
IPADDR=192.168.0.1
NETMASK=255.255.255.0
USERCTL=no
BOOTPROTO=none
ONBOOT=yes
BONDING_OPTS="mode=0 miimon=100 primary=eth2"

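Selecting a different bonding mode is simply a matter of changing the mode=0 option listed above. Note that the primary option only applies to the active-backup, balance-tlb, and balance-alb modes (1, 5, and 6), so it has no effect under mode=0 (round-robin).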

Now we need to create configuration scripts for the physical interfaces which will be part of bond0. Since they are configured as components of bond0 instead of independent interfaces in their own right, they cannot have IP addresses attached to them.

/etc/sysconfig/network-scripts/ifcfg-eth2

DEVICE=eth2
USERCTL=no
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes

/etc/sysconfig/network-scripts/ifcfg-eth3

DEVICE=eth3
USERCTL=no
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes

Configure modprobe

Add the following to /etc/modprobe.conf

alias bond0 bonding
options bond0 mode=balance-rr miimon=100

The options specified here get overridden by those in ifcfg-bond0, but it may be helpful in some circumstances to know that you can set options here as well.

Then load the module

modprobe bonding

Start It Up

Now restart the network to bring up the bonded device.

service network restart

You can get information on the bonding configuration and status:

# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.4.0 (October 7, 2008)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth2
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:26:55:32:26:2c

Slave Interface: eth3
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:26:55:32:26:2e
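
If you want to check this status from a script rather than by eye, the output is easy to parse. Here is a rough Ruby sketch, assuming the "Key: value" layout shown above (other driver versions may print additional fields). The method name parse_bonding_status is just something I made up for illustration.

```ruby
# Parse the text of /proc/net/bonding/<bond> into a small summary hash.
# Assumes the "Key: value" layout shown above; unknown keys are ignored.
def parse_bonding_status(text)
  summary = { mode: nil, slaves: [] }
  current = nil # the slave section we are currently inside, if any

  text.each_line do |line|
    key, value = line.strip.split(": ", 2)
    next if value.nil? # blank line, or a line without "Key: value"

    case key
    when "Bonding Mode"
      summary[:mode] = value
    when "Slave Interface"
      current = { name: value, mii_status: nil, link_failures: nil }
      summary[:slaves] << current
    when "MII Status"
      # The bond-level "MII Status" appears before any slave; skip it.
      current[:mii_status] = value if current
    when "Link Failure Count"
      current[:link_failures] = value.to_i if current
    end
  end

  summary
end

# Usage on a real host (path as used in this post):
#   status = parse_bonding_status(File.read("/proc/net/bonding/bond0"))
#   down = status[:slaves].reject { |s| s[:mii_status] == "up" }
```

A monitoring check could then alert whenever any slave's mii_status is not "up".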

Tests

  • Try pinging over the bonded interface, and make sure you aren't dropping packets.
  • Use tcpdump on both interfaces to see which one is being used. In round-robin mode, both should be generating traffic. In failover mode, only 1 should be. More info in the last paragraph of this post: http://lists.linbit.com/pipermail/drbd-user/2007-September/007440.html

Simple handling of 1 or many method arguments

It's pretty common to have a method which accepts either a single value, or an array of values. In the method body, it's not hard to detect whether you've been passed an array or not, but you can actually do this in Ruby without any conditionals.

def test( args )
  Array( args )
end

>> test( 1 )
=> [1]
>> test( [ 1, 2 ] )
=> [1, 2]

Taking this one step further, it's pretty simple to handle multiple arguments in the same manner.

def test( *args )
  Array( args ).flatten
end

>> test( 1 )
=> [1]
>> test( [ 1, 2 ] )
=> [1, 2]
>> test( 1, 2 )
=> [1, 2]

I think you can make a good case that this isn't necessarily great API design, but it's nice to have something as simple as Array() when it's appropriate.
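
A few edge cases are worth knowing before you lean on Array() in a public method. The sketch below uses a helper named listify (a made-up name, with the same body as the first test method above) to show what happens with nil, ranges, and hashes.

```ruby
# listify is a hypothetical name; it simply wraps Kernel#Array.
def listify(value)
  Array(value)
end

listify(nil)        # => []
listify("a")        # => ["a"]
listify(1..3)       # => [1, 2, 3]
listify({ a: 1 })   # => [[:a, 1]]  (a Hash becomes key/value pairs)
```

The Hash case is the one most likely to surprise a caller, so it's worth deciding up front whether your method should accept hashes at all.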

Inferring Motives

I saw this at http://www.infoq.com/presentations/Hacking-Your-Organization, around time 38:00.

When we observe the action of another...

  • we impute a motivation for that action
  • and react emotionally to that imputed motivation

This imputation process is the core of most conflict.

This whole presentation is a really useful examination of what 'company culture' is, and how interpersonal communication matters in organizations. Really good stuff.

Secure file deletion in OSX

Want to remove a file from your Mac, and have some confidence that it'll be very hard or impossible to recover that file later on? OSX includes the srm utility, which is quite useful for this purpose.

srm somefile.txt

The default is to overwrite your file with 35 passes, as prescribed by the Gutmann method. This can take quite a while, so the slightly less paranoid among you can use the 7-pass --medium option.

srm --medium somefile.txt

Keep in mind that this isn't a fast process. I sold an external hard drive a while ago, and used srm to wipe the contents. It took about 2 days to completely wipe the 120GB of contents using --medium.

This StackOverflow question on secure disk erasure methods has a few other interesting comments and references.

Fun with binary searches

I ran across a blog post challenging people to write a binary search without actually running the code. The idea is to see what you can come up with without iteratively finding & fixing problems.

Given that I really like iteratively finding & fixing problems, this seemed a little bizarre at first. But the more I ponder it, the more I like the idea (at least as an exercise/toy/diversion). It reminds me of a programming class where we got a Java program on a printed sheet of paper, and had to write out what the outputs would be. It was really annoying, and I hated it, but I did definitely learn to think more clearly about code by doing these exercises. I hardly ever do anything like that anymore, and this little binary-search challenge was a nice refresher.

The post is the first in a series, and the whole set is really great reading. The "testing is not a substitute for thinking" part was especially good. I remember lots of times arguing that "the absence of errors is not evidence of correct behavior", which is one of his main points. Though he stated it more clearly, I think, as "Tests can only show the presence of bugs, not their absence".

So, here's the binary search I came up with. I've never run this code. It might have syntax errors. It might not work (though I think it probably does). I'll try some tests and see how it goes, but posting before actually trying it seemed most in line with the spirit of the challenge. I usually do Ruby these days, so I tried it in PHP just for fun.

/**
 * Returns either the array index of $needle in $haystack, or null.
 * Assume: $haystack is a non-sparse, sorted, numerically indexed array.
 */

function binary_search( $needle, $haystack ) {
  $left_idx = 0;
  $right_idx = count( $haystack ) - 1;
  if( $needle < $haystack[ $left_idx ] || $needle > $haystack[ $right_idx ] ) {
    return null;
  }
  return recursive_binary_search( $needle, $haystack, $left_idx, $right_idx );
}

function recursive_binary_search( $needle, $haystack, $left_idx, $right_idx ) {
  $midpoint_idx = floor( ($right_idx - $left_idx) / 2 ) + $left_idx;
  $midpoint_val = $haystack[ $midpoint_idx ];
 
  if( $midpoint_val == $needle ) {
    return $midpoint_idx;
  } else if ( $needle > $midpoint_val ) {
    $next_left_idx = $midpoint_idx+1;
    $next_right_idx = $right_idx;
  } else {
    $next_left_idx = $left_idx;
    $next_right_idx = $midpoint_idx-1;
  }
  if( $next_left_idx > $next_right_idx ) {
    return null;
  }
  return recursive_binary_search( $needle, $haystack, $next_left_idx, $next_right_idx );
}

I'm pretty sure I could make it shorter, more dense, etc. I think it's pretty readable this way, though.
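
For comparison, here is an iterative sketch of the same halving logic in Ruby. Unlike the PHP above, I did run this one, so it's outside the spirit of the challenge. It also returns nil for an empty array, a case the PHP version doesn't guard against, since it reads $haystack[0] without checking that the array is non-empty.

```ruby
# Returns the index of needle in haystack, or nil if it is not present.
# Assumes haystack is a sorted Array of mutually comparable values.
def binary_search(needle, haystack)
  left = 0
  right = haystack.length - 1

  while left <= right
    mid = left + (right - left) / 2 # integer division
    value = haystack[mid]

    return mid if value == needle

    if needle > value
      left = mid + 1  # discard the left half, including mid
    else
      right = mid - 1 # discard the right half, including mid
    end
  end

  nil # the search window is empty, so needle is not in haystack
end

binary_search(5, [1, 3, 5, 7]) # => 2
binary_search(4, [1, 3, 5, 7]) # => nil
```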

Errors compiling ganglia gmond gmetad on Mac OSX?

If you run into errors like the following, it's because kvm.h is no longer part of OSX as of release 10.5 (Leopard). Have a look at my instructions on building ganglia for OSX. There's an easy workaround.

Ganglia has this problem with OSX releases >= 10.5 (Leopard). A bug report for the kvm.h problem has been created in Ganglia's Bugzilla, and will hopefully be addressed soon. I've put a patch for the problem on my GitHub account in the meantime.

mkdir .libs
 gcc -std=gnu99 -DHAVE_CONFIG_H -I. -I.. -I/opt/local/include -I/usr/X11R6/include -I.. -I../../lib -I../../include -g -O2 -Wall -MT metrics.lo -MD -MP -MF .deps/metrics.Tpo -c metrics.c  -fno-common -DPIC -o .libs/metrics.o
metrics.c:14:17: error: kvm.h: No such file or directory
metrics.c: In function ‘proc_run_func’:
metrics.c:658: warning: implicit declaration of function ‘host_processor_sets’
metrics.c:667: warning: implicit declaration of function ‘host_processor_set_priv’
metrics.c:693: warning: implicit declaration of function ‘thread_info’
metrics.c:718: warning: implicit declaration of function ‘vm_deallocate’
metrics.c:631: warning: unused variable ‘ti’
metrics.c:630: warning: unused variable ‘a_task’
metrics.c:626: warning: unused variable ‘port’
metrics.c: In function ‘mem_free_func’:
metrics.c:789: warning: pointer targets in passing argument 4 of ‘host_statistics’ differ in signedness
metrics.c:783: warning: unused variable ‘host_port’
metrics.c: In function ‘mtu_func’:
metrics.c:847: warning: unused variable ‘min’
metrics.c: In function ‘get_netbw’:
metrics.c:966: warning: implicit declaration of function ‘errx’
metrics.c: In function ‘makenetvfslist’:
metrics.c:1263: warning: implicit declaration of function ‘warnx’
metrics.c:1302: warning: assignment makes integer from pointer without a cast
make[4]: *** [metrics.lo] Error 1
make[3]: *** [all-recursive] Error 1
make[2]: *** [all] Error 2
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

ArgumentError: Anonymous modules have no name to be referenced by

ArgumentError: Anonymous modules have no name to be referenced by

Huh?

This less-than-clear error message means you're trying to use a class or module which doesn't exist. The 'anonymous' module is the one you're trying to use.

Maybe you've got a typo in the class name, or maybe you're missing a require somewhere.

Selector support in ActiveMQ ajax

At my day job, I've been doing a lot of work with ActiveMQ, especially the ajax interface which allows web browsers to participate in the world of JMS publish & subscribe messaging.

You can write code like this snippet of JavaScript

  var amq = org.activemq.Amq;
  amq.init( { uri: 'http://your.amq.server:8161/amq', logging: true, timeout: 20 } );
  amq.addListener( 'test', 'topic://your-topic-name', function( message ) {
    console.log( message );
  } );
  amq.sendMessage( 'topic://your-topic-name', 'Test Message!' );

Your browser will start making ajax calls to the message broker (ActiveMQ) in the background. When a new message arrives on topic://your-topic-name, it will be delivered to the browser and your callback function will fire. This is a great way to develop chat rooms, multi-player games, and other highly-interactive applications without Flash.

amq.js, the client-side library which supplies this functionality, lacked the ability to use JMS selectors. Selectors are strings (written according to the SQL92 spec) which describe which messages you want to receive. So, if you used amq.js to subscribe to a topic or queue, you had to receive all messages, and do your filtering client-side. For a busy topic, this can mean lots of wasted message deliveries, and slower performance for your application.

I wrote up a patch (including unit tests!) to add this functionality in issue AMQ-2874, which was accepted & even made the ActiveMQ 5.4.1 'New Features' list.

Now, amq.addListener supports a fourth parameter which can contain additional options for the subscription being created.

  amq.addListener( 'test', 'topic://your-topic-name', function( message ) {
    console.log( message );
  },
  { selector: "subject='blah'" } );

This snippet will only receive messages which have a subject property/header matching blah. This is very handy, and I'm happy it was accepted into ActiveMQ so quickly.

Environment-based configuration in Ruby

A while ago I created a simple class to allow management of configuration via YAML files, similar to Rails' config/database.yml. Seems like this might be useful in other contexts besides the one I created it for, so I just pulled the code into its own public GitHub repository.

http://github.com/alexdean/env_based_config

You create a YAML file like this, with the top-level keys matching your environment names.

development:
  server: dev-server
  port: 8080
test:
  server: test-server
  port: 8080
production:
  server: production.com
  port: 80

Then define a configuration class like

class AppConfig < EnvBasedConfig
  set_configuration "config/app.yml"
end

The default behavior is to figure out which YAML block to use based on Rails.configuration.environment, but that's easy to change by providing your own implementation of env.

class AppConfig < EnvBasedConfig
  def env
    # your logic for selecting the correct environment here
  end
  set_configuration "config/app.yml"
end

Code which needs access to this configuration information should make calls like AppConfig.server. In the test environment, this returns test-server.
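
If you're curious about the general idea, it can be sketched in a few lines of plain Ruby. This is not the EnvBasedConfig source, just a minimal illustration of picking one top-level YAML block by environment name. The method name load_settings is made up for this example.

```ruby
require "yaml"

# Hypothetical helper: parse the YAML text and return the settings
# for one environment. Raises KeyError for an unknown environment.
def load_settings(yaml_text, env)
  all_environments = YAML.safe_load(yaml_text)
  all_environments.fetch(env)
end

yaml_text = <<~YAML
  development:
    server: dev-server
    port: 8080
  test:
    server: test-server
    port: 8080
  production:
    server: production.com
    port: 80
YAML

settings = load_settings(yaml_text, "test")
settings["server"] # => "test-server"
settings["port"]   # => 8080
```

Using fetch rather than [] means a typo in the environment name fails loudly instead of quietly returning nil.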

Programming Aphorisms

I've collected a few little phrases and slogans here and there which I have hung on to since they seemed to capture some useful bit of wisdom. Here are a few.

Know What You're Doing

"If you don't know what to do, don't do anything."
-- Pete Conrad

If you don't know, don't guess. Go find out. Figure out the problem, and figure out a solution. It's phenomenally rare that a guess right now is better than an informed decision a little later. Of course time pressure exists, and you can't leave everything to tomorrow. I don't interpret this as "everything can wait". I interpret it as "the risks associated with ignorance are huge". Production systems don't often have an 'undo' button.

This applies equally well to both software engineering and systems administration.


Make Informed Choices and Accept the Consequences

"Everything is more important than everything else."
-- Chris Herbert

This comes to mind especially when sitting in meetings intended to set priorities. It's really hard to make decisions, and it's easy to just punt, and allow circumstances to make them for you. But this is bad for a project and bad for a business. When priorities get set through 'personal favors', constant interruptions for 'emergencies', and things like this, quality and happiness suffer.

Thanks to my good friend Chris Herbert for that little gem. I'm not sure if he made it up or passed it along. Either way, it's brilliant.


Look For Drawbacks in All Ideas, Especially Your Own

"I want all my groceries in one bag, but I don't want that bag to be heavy."
-- Agnes Skinner

When projects suffer from feature-creep, and complaints arise about how "it used to be fast", this is what comes to mind. I think it's really a special case of the 'everything is more important...' quote above. They both come down to a refusal to accept that you can't always have everything you want right now. I'm not a pessimist, by any means. I think things can and do improve. But we must not ignore the fact that a new feature which may benefit one aspect of the system can negatively impact other aspects. There are often creative ways to maximize the good and minimize the bad, but sometimes it's just as simple as being clear about what's really important. That kind of clarity is difficult both to achieve and to maintain. Losing it leads to the kind of mentality this quote brings to mind.

Clean up a subversion working copy

You could just delete your working copy and check out a fresh one. In my case, that takes hours. So I wanted to find a way to remove all local modifications.

svn st | awk '{ if ( $1 == "?" ) print $2 }' | xargs rm -Rf
svn st | awk '{ if ( $1 == "M" ) print $2 }' | xargs svn revert

Remove all files whose svn status is '?', meaning not part of version control. Then revert changes in any file which has an "M" (modified) status.

svn st

You should now have no local changes or unknown files.

WARNING
xargs may not do the right thing if your file names have spaces or other special characters in them. Check man xargs for more details before running these commands.
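
If file names with spaces are a concern, one option is to do the parsing in a script instead of awk and xargs. Here's a rough Ruby sketch for the unversioned ('?') case only. It assumes that on a '?' line everything after the status character and the whitespace that follows it is the path. The method name unversioned_paths is made up for this example.

```ruby
# Extract unversioned paths from `svn status` output, keeping any
# spaces inside the file names intact.
def unversioned_paths(status_text)
  status_text.each_line
             .map(&:chomp)
             .select { |line| line.start_with?("?") }
             .map { |line| line.sub(/\A\?\s+/, "") }
end

sample = "?       notes for later.txt\nM       lib/frob.rb\n?       tmp/cache\n"
unversioned_paths(sample) # => ["notes for later.txt", "tmp/cache"]

# Destructive usage, left commented out on purpose:
#   require "fileutils"
#   unversioned_paths(`svn status`).each { |path| FileUtils.rm_rf(path) }
```

Print the list and look it over before you uncomment the deletion.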

Demo of automatic merge tracking in Subversion 1.5

Create a Demo Project

I'll create a basic project to demonstrate how the merge-tracking features in Subversion 1.5 work. This isn't exactly new, but it's new to me since our company just upgraded our svn server.

$ svnadmin create frobulator-repo
$ mkdir skeleton
$ mkdir skeleton/trunk
$ touch skeleton/trunk/frob.rb
$ mkdir skeleton/branches
$ mkdir skeleton/branches/release-1.0.0
$ cp skeleton/trunk/frob.rb skeleton/branches/release-1.0.0/
$ svn import skeleton file:///Users/alex/tmp/frobulator-repo -m 'initial import'
Adding         skeleton/trunk
Adding         skeleton/trunk/frob.rb
Adding         skeleton/branches
Adding         skeleton/branches/release-1.0.0
Adding         skeleton/branches/release-1.0.0/frob.rb

Committed revision 1.
$ svn co file:///Users/alex/tmp/frobulator-repo frobulator-wc
A    frobulator-wc/trunk
A    frobulator-wc/trunk/frob.rb
A    frobulator-wc/branches
A    frobulator-wc/branches/release-1.0.0
A    frobulator-wc/branches/release-1.0.0/frob.rb
Checked out revision 1.

Make a Change

So far, I've created a new repository with a trunk and 1 branch. The trunk and branch have identical content right now, a single empty frob.rb file.

I'll fix a 'bug' in the trunk now.

$ cd frobulator-wc/trunk
frobulator-wc/trunk$ echo 'puts "bug fixed!"' > frob.rb
frobulator-wc/trunk$ svn commit -m 'fixed bug in frob.rb'
Sending        trunk/frob.rb
Transmitting file data .
Committed revision 2.

Merge Fix

Now I want to merge my bug fix into my release branch.

frobulator-wc/trunk$ cd ../branches/release-1.0.0/
frobulator-wc/branches/release-1.0.0$ svn merge -c 2 file:///Users/alex/tmp/frobulator-repo/trunk .
--- Merging r2 into '.':
U    frob.rb
frobulator-wc/branches/release-1.0.0$ svn commit -m 'merge bug fix from trunk'
Sending        release-1.0.0
Sending        release-1.0.0/frob.rb
Transmitting file data .
Committed revision 3.

View Basic Log

Up to this point, there's nothing different from how I would have done things in subversion 1.4, except that my commit in the branch didn't mention where the merge came from. Formerly, I would have added a commit log like "merged trunk r2 into release-1.0.0". Otherwise we wouldn't have known what got merged and what didn't.

Notice the log only shows changes which were made in this branch. This behavior is the same as in previous versions of Subversion.

frobulator-wc/branches/release-1.0.0$ svn log -v
------------------------------------------------------------------------
r3 | alex | 2010-07-27 09:08:43 -0500 (Tue, 27 Jul 2010) | 1 line
Changed paths:
   M /branches/release-1.0.0
   M /branches/release-1.0.0/frob.rb

merge bug fix from trunk
------------------------------------------------------------------------
r1 | alex | 2010-07-27 09:04:51 -0500 (Tue, 27 Jul 2010) | 1 line
Changed paths:
   A /branches
   A /branches/release-1.0.0
   A /branches/release-1.0.0/frob.rb
   A /trunk
   A /trunk/frob.rb

initial import
------------------------------------------------------------------------

View Merge Information in Log

This is where the new merge-tracking features in Subversion 1.5 show themselves. Adding the -g flag displays revision 2 as being part of release-1.0.0, and adds a note describing when this change was merged into the branch. Pretty nice.

frobulator-wc/branches/release-1.0.0$ svn log -vg
------------------------------------------------------------------------
r3 | alex | 2010-07-27 09:08:43 -0500 (Tue, 27 Jul 2010) | 1 line
Changed paths:
   M /branches/release-1.0.0
   M /branches/release-1.0.0/frob.rb

merge bug fix from trunk
------------------------------------------------------------------------
r2 | alex | 2010-07-27 09:07:15 -0500 (Tue, 27 Jul 2010) | 1 line
Changed paths:
   M /trunk/frob.rb
Merged via: r3

fixed bug in frob.rb
------------------------------------------------------------------------
r1 | alex | 2010-07-27 09:04:51 -0500 (Tue, 27 Jul 2010) | 1 line
Changed paths:
   A /branches
   A /branches/release-1.0.0
   A /branches/release-1.0.0/frob.rb
   A /trunk
   A /trunk/frob.rb

initial import
------------------------------------------------------------------------

ActiveRecord annoyance

class AnExampleWhichAnnoysMe < ActiveRecord::Migration
  def self.up
    create_table :test do |t|
      t.column :column_a, :integer
    end
    add_column :test, :column_b, :integer
    add_index :test, :column_b, :name=>'idx_on_test_col_b'
  end

  def self.down
    remove_index :test, :name=>'idx_on_test_col_b'
    remove_column :test, :column_b
    drop_table :test
  end
end

  • Columns: add & remove.
  • Indexes: add & remove.
  • Tables: create & drop.

I wish these were more consistent. The names seem to be based on common SQL commands, but for that to be a useful rule-of-thumb, columns and indexes should be manipulated through some 'alter' method, since ALTER is usually the command you use to change these kinds of objects in SQL. In the context of ActiveRecord, I can see that add & remove really make more sense, but I wish that logic would have been applied to tables as well.

As it is, I often make the mistake of using create_table and drop_index.

Semitic Markup

Everybody talks about the importance of semantic markup on the web.

I think Semitic markup isn't getting nearly enough attention.

I love xmlformat

There have been many times that I've captured some machine-generated XML document for debugging purposes. Usually these documents omit all line breaks, indentation, and other niceties which make the text more readable. Of course no real XML parser cares about such things, but the very-incomplete XML parser in my head really appreciates them. I've spent plenty of time manually re-formatting various documents to get them into some state where I could understand the structure and make some sense of it.

But, wow, that is tedious work. Click, enter, space, space, click, enter, space, space, etc, etc, etc... I went looking for a better solution, and found xmlformat. You can also use tidy, which is included on OSX by default. I prefer xmlformat because of the ease of configuration.

alex@rutabaga:~$ xmlformat --show-config
*DEFAULT
  format = block
  entry-break = 1
  element-break = 1
  exit-break = 1
  subindent = 2
  normalize = no
  wrap-length = 0

*DOCUMENT
  format = block
  entry-break = 0
  element-break = 1
  exit-break = 1
  subindent = 0
  normalize = no
  wrap-length = 0

This is pretty much the default configuration. I keep my configuration in ~/.xmlformat.conf, and set export XMLFORMAT_CONF=~/.xmlformat.conf in my ~/.profile script.

Why is this good?

With a single command:

alex@rutabaga:~$ xmlformat wfs_getfeature.xml

This mess:

<wfs:GetFeature xmlns:wfs="http://www.opengis.net/wfs" service="WFS" version="1.1.0" xsi:schemaLocation="http://www.opengis.net/wfs http://schemas.opengis.net/wfs/1.1.0/wfs.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><wfs:Query typeName="kodos:" srsName="EPSG:4326" xmlns:kodos="http://www.regionsproject.org/kodos"><ogc:Filter xmlns:ogc="http://www.opengis.net/ogc"><ogc:And><ogc:PropertyIsLessThan><ogc:PropertyName>map_id</ogc:PropertyName><ogc:Literal>0</ogc:Literal></ogc:PropertyIsLessThan><ogc:BBOX><ogc:PropertyName>area</ogc:PropertyName><gml:Envelope xmlns:gml="http://www.opengis.net/gml" srsName="EPSG:4326"><gml:lowerCorner>-227.08007812014 0.922841119201</gml:lowerCorner><gml:upperCorner>17.08007812012 68.341004876129</gml:upperCorner></gml:Envelope></ogc:BBOX></ogc:And></ogc:Filter></wfs:Query></wfs:GetFeature>

is transformed into this:

<wfs:GetFeature xmlns:wfs="http://www.opengis.net/wfs" service="WFS" version="1.1.0" xsi:schemaLocation="http://www.opengis.net/wfs http://schemas.opengis.net/wfs/1.1.0/wfs.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <wfs:Query typeName="kodos:" srsName="EPSG:4326" xmlns:kodos="http://www.regionsproject.org/kodos">
    <ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
      <ogc:And>
        <ogc:PropertyIsLessThan>
          <ogc:PropertyName>map_id</ogc:PropertyName>
          <ogc:Literal>0</ogc:Literal>
        </ogc:PropertyIsLessThan>
        <ogc:BBOX>
          <ogc:PropertyName>area</ogc:PropertyName>
          <gml:Envelope xmlns:gml="http://www.opengis.net/gml" srsName="EPSG:4326">
            <gml:lowerCorner>-227.08007812014 0.922841119201</gml:lowerCorner>
            <gml:upperCorner>17.08007812012 68.341004876129</gml:upperCorner>
          </gml:Envelope>
        </ogc:BBOX>
      </ogc:And>
    </ogc:Filter>
  </wfs:Query>
</wfs:GetFeature>

It's now easy to see which elements are nested inside which. This makes all kinds of debugging and troubleshooting tasks immensely easier.
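
If you'd rather do the same kind of re-indenting from inside a script, here is a minimal sketch using Python's standard-library xml.dom.minidom. To be clear, this is not xmlformat and it knows nothing about xmlformat's configuration file; it's just a different tool that produces similar block-style output. The function name pretty_xml is made up for this example. It assumes the input has no whitespace between elements (as machine-generated XML usually doesn't); input that is already indented will come out with extra blank lines.

```python
from xml.dom import minidom


def pretty_xml(raw: str, indent: str = "  ") -> str:
    """Return raw XML re-indented, with nested elements on their own lines.

    Hypothetical helper for illustration; not part of xmlformat.
    """
    doc = minidom.parseString(raw)
    # Serialize from the root element so no XML declaration is prepended.
    return doc.documentElement.toprettyxml(indent=indent)


if __name__ == "__main__":
    sample = "<a><b>1</b><c><d>x</d></c></a>"
    print(pretty_xml(sample))
```

Elements that contain only text stay on one line, and everything else gets one level of indentation per nesting level. For anything beyond a quick look I'd still reach for xmlformat, since its per-element configuration is the whole reason I prefer it.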

Uploading GeoTIFFs to GeoServer via REST

A while ago I wanted to be able to upload a GeoTIFF to GeoServer via its REST interface. The default way to do this is to send the GeoTIFF image in an HTTP PUT. In my case, the image was already on the same filesystem as GeoServer's data directory, so sending it again in another HTTP transaction seemed like kind of a waste. I was hoping to find a way to just tell GeoServer to use the GeoTIFF in its current location, without the need to make a copy of the file.

I poked around in the source code for GeoServer's REST extension and discovered that this behavior was indeed supported. It just wasn't documented. A while ago I submitted some documentation additions which should make this clearer. This is a common use case, judging by the number of people who ask about this kind of thing on the GeoServer users mailing list. So, until my patch gets accepted and makes its way into the official GeoServer REST API docs, I thought I'd point it out for anyone searching for a way to get GeoServer to use GeoTIFFs which are outside the data directory.

http://jira.codehaus.org/browse/GEOS-3966
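
For a rough idea of what such a request can look like, here is a sketch in Python that only builds the HTTP PUT and does not send it. Several things here are assumptions rather than anything confirmed in this post: the external.geotiff endpoint path and the text/plain content type are my understanding of how the GeoServer REST extension handles this, so check them against the docs for your version. The server URL, workspace, store name, file path, and credentials are all made-up placeholders, and external_geotiff_request is a hypothetical helper name.

```python
import base64
import urllib.request


def external_geotiff_request(base_url, workspace, store, tiff_path, user, password):
    """Build (but do not send) a PUT asking GeoServer to use a GeoTIFF in place.

    Assumption: the REST extension accepts a file: URL as the request body
    at .../coveragestores/<store>/external.geotiff. Verify for your version.
    """
    url = (
        f"{base_url}/rest/workspaces/{workspace}"
        f"/coveragestores/{store}/external.geotiff"
    )
    # The body is just the location of the file on the server's filesystem,
    # not the image data itself.
    body = f"file://{tiff_path}".encode("utf-8")
    req = urllib.request.Request(url, data=body, method="PUT")
    req.add_header("Content-Type", "text/plain")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {token}")
    return req


if __name__ == "__main__":
    # All values below are placeholders.
    req = external_geotiff_request(
        "http://localhost:8080/geoserver", "myws", "mystore",
        "/data/rasters/example.tif", "admin", "changeme",
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

The point is that the request carries a path instead of the image, so nothing gets copied over HTTP.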

Enjoy.

If you have polygon data, don't try to render it as points!

11 May 09:27:51 ERROR [geotools.rendering] - side location conflict [ (462.7099002767874, 220.80214145084886, NaN) ]
com.vividsolutions.jts.geom.TopologyException: side location conflict [ (462.7099002767874, 220.80214145084886, NaN) ]

This error sometimes occurs when you use a point-style SLD for a polygon layer. When adding new layers via the GeoServer web interface, the 'point' style is the default, so if you just want a quick preview of your data you might forget to select an appropriate style and then encounter this error.

I'm not sure why some polygon layers can actually be rendered as points, while others raise this exception. In any case, I didn't find much information when searching for this specific error message, so I thought others might find this helpful.

Presentation on Heartbeat and DRBD

I did a presentation on Heartbeat and DRBD for the Siouxland Linux Users Group (SLUG) a few days ago. Sioux Falls is about 100 miles from Wessington Springs, so it makes for a long evening to drive there and back, but it was fun to meet some other South Dakota Linux users. I plan to attend more meetings in the future, and hopefully present some more. It's a nice group, and the Q&A session after my presentation was great.

I don't have any slides to post because, well, I feel like PowerPoint & friends often detract from a real presentation. I did my talk while sharing my screen and demonstrating various parts of Heartbeat & DRBD configuration and testing. Now that I'm posting to the blog, I suppose some slides would be nice to give you all an idea of what I said. But for the presentation itself, I think slides would have just gotten in the way.

The Siouxland Linux Users Group
Meeting Minutes
Thursday, April 15, 2010
6:30 p.m.
Siouxland Libraries - Oak View Branch Room 3

Alex Dean gave an overview of DRBD and Heartbeat, two high availability tools for Linux.

DRBD is used to manage a single data set mirrored on two computers, one computer acting as the primary server, and the other as secondary. The computers can be in separate server farms, if desired. Heartbeat works with DRBD to switch the secondary server to primary quickly and automatically, if the primary server should fail.

Alex went through the configuration files for both programs, and outlined situations in which both servers may be running, but they each think the other host has disappeared from the network. This situation usually occurs more frequently than the loss of a server, he said.

In the Q-and-A session that followed his talk, he outlined a number of ways that DRBD and Heartbeat could be used with MySQL database servers.

His recommended resource for DRBD is www.drbd.org/users-guide. The recommendations for heartbeat are www.linux-ha.org/HeartbeatTutorials, http://people.linbit.com/~florian/heartbeat-users-guide/, and www.clusterlabs.org.

Alex works for IPS MeteoStar in Denver, Colo., a company that provides high availability solutions for weather data worldwide to government and industry. See http://weather.meteostar.com/ or www.meteostar.com/company/news.html for more information about IPS MeteoStar.

http://groups.google.com/group/sluglinux/browse_thread/thread/3e23fd02f3...
