Thursday, November 15, 2007

Too many people

Overcrowding is a common fear these days, just like global warming and other similar crap. I just thought I'd save here a quick calculation to put things in perspective.

  • Living area for a single person: 100 m^2 (this would make it 400 m^2 for a family of four)

  • Number of people you can fit in a km^2: 10,000 (because 1 km^2 = 1 million m^2)

  • Land area of the Earth: 148 million km^2 (see Wikipedia)

  • Say only a fifth of that area is available for people: that would be about 30 million km^2. Multiplied by 10,000 people per km^2, this gives us 300 billion people.

300 billion people. With no multi-story buildings, no overcrowding à la Asimov's fiction (how on Earth he could estimate only 40 billion people for Trantor I cannot imagine) and no food problems ('cause we have four fifths of the land area for that - in fact, even if we remove 30% of the total area as desert it still leaves us with 70 million km^2 for animals and crops).
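For anyone who wants to check the arithmetic, here it is written out in one place. It uses only the figures from the list above; the last line uses the unrounded 29.6 million km^2, which is why it comes out a bit above the 70 million quoted.

```latex
\begin{align*}
\text{people per km}^2 &= \frac{10^{6}\ \text{m}^2}{100\ \text{m}^2} = 10^{4} \\
\text{area for people} &= \tfrac{1}{5} \times 148 \times 10^{6}\ \text{km}^2
    = 29.6 \times 10^{6}\ \text{km}^2 \approx 30 \times 10^{6}\ \text{km}^2 \\
\text{people} &= 30 \times 10^{6} \times 10^{4} = 3 \times 10^{11} \quad (\text{300 billion}) \\
\text{area for food} &= (148 - 0.3 \times 148 - 29.6) \times 10^{6}\ \text{km}^2
    = 74 \times 10^{6}\ \text{km}^2
\end{align*}
```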

One other thing - what the overcrowding gang never mentions is this: on average, people will necessarily produce more than they consume (because they need to produce for their own consumption, for the new population and to improve the living standard). More people means more resources, not fewer - well, unless governments interfere with the process, of course.

Finally - these calculations are useless for persuading anyone. If it's not population, it's food. (Malthus said that people grow exponentially while food grows arithmetically, so we're going to run out of food sooner or later. That he was an idiot is not necessarily a surprise; that there are millions of other idiots who repeat that without realizing that food is also biological, and therefore should also increase exponentially, is the really weird part.) If it's not those, it's water (I'd try calculating the amount of water in a single glacier but it would also be pointless) or energy or who knows what else.

So. I guess the conclusion is: we're all gonna die. Yay :P

Thursday, November 08, 2007

Non-nullable reference types in C#

C# 1.0 has non-nullable value types (an int variable must always have a valid int value, it cannot be null) and nullable reference types (a string variable can be null). C# 2.0 added nullable value types (an int? variable - note the "?" - can be null). There are unfortunately no non-nullable reference types (a string variable which can never be null). Patrick Smacchia has a summary of a possible notation and also of the issues that could appear.

However, until Microsoft decides to add non-nullable reference types, there's a way of doing the same thing with a somewhat more complicated notation.

  using System;

  namespace Extensions
  {
    public struct NotNull<T> where T : class
    {
      // note: default(NotNull<T>) bypasses the constructor, so value is null in that case
      public readonly T value;

      public NotNull(T arg)
      {
        if (arg == null)
          throw new ArgumentNullException("arg");

        value = arg;
      }

      // convert non-nullable to regular
      public static implicit operator T(NotNull<T> arg)
      {
        return arg.value;
      }

      // convert regular to non-nullable
      public static implicit operator NotNull<T>(T arg)
      {
        return new NotNull<T>(arg);
      }

      // explicit cast to non-nullable -- needed when T == object
      public static NotNull<object> Cast(object arg)
      {
        return new NotNull<object>(arg);
      }

      public override string ToString()
      {
        return value.ToString();
      }

      public override bool Equals(object obj)
      {
        // unwrap another NotNull<T> so that two wrappers compare by their wrapped values
        // (without this, == on two wrappers would compare a T against a boxed wrapper)
        if (obj is NotNull<T>)
          obj = ((NotNull<T>)obj).value;

        return value.Equals(obj);
      }

      public override int GetHashCode()
      {
        return value.GetHashCode();
      }

      public static bool operator ==(NotNull<T> left, NotNull<T> right)
      {
        return Equals(left, right);
      }

      public static bool operator !=(NotNull<T> left, NotNull<T> right)
      {
        return !Equals(left, right);
      }
    }
  }

A method enforcing a not-null argument would look like this:

  public string Test(NotNull<string> s)
  {
    string ss = s;
    return ss.Substring(0); // no need to test for null
  }

and would be called like this:

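  public void TestNotNull()
  {
    string s;

    s = "something";
    Test(s); // fine: s is implicitly converted to NotNull<string>
    s = null;
    Test(s); // throws ArgumentNullException during the implicit conversion, before Test runs
  }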
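To see where the check actually fires, here is a compact, self-contained sketch. It repeats a trimmed-down NotNull<T> (just the constructor and the two conversions) so that it compiles on its own; the Demo class and its method names are made up for illustration and are not part of the code above.

```csharp
using System;

// Trimmed-down copy of the wrapper, so this sketch compiles on its own.
public struct NotNull<T> where T : class
{
    public readonly T value;

    public NotNull(T arg)
    {
        if (arg == null)
            throw new ArgumentNullException("arg");
        value = arg;
    }

    public static implicit operator T(NotNull<T> arg) { return arg.value; }
    public static implicit operator NotNull<T>(T arg) { return new NotNull<T>(arg); }
}

// Hypothetical demo class; the names are for illustration only.
public static class Demo
{
    public static int Length(NotNull<string> s)
    {
        string ss = s;      // implicit conversion back to a plain string
        return ss.Length;   // no null check needed here
    }

    // Returns true if 'candidate' was rejected before Length's body ran.
    public static bool IsRejected(string candidate)
    {
        try
        {
            Length(candidate);   // the string -> NotNull<string> conversion happens here
            return false;
        }
        catch (ArgumentNullException)
        {
            return true;
        }
    }
}
```

Demo.IsRejected(null) comes back true and Demo.IsRejected("something") comes back false: the null is stopped at the call boundary, so the body of Length never has to worry about it.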

Incidentally, this notation also serves as a form of specification - this parameter should not be null. I was thinking of using attributes but I think it would lead to a more verbose notation in case of several non-nullable parameters. I hope this helps someone :)

Saturday, November 03, 2007

Structure of Object-Oriented systems

Objects are state machines

Real-life entities have state. Your checking account has an amount of money; your car has a mileage, a gas level, a speed and so on. Objects in an application should also have a state and a set of operations that can manipulate that state. (You can change the color of your car or add money to your checking account; you cannot change the color of your checking account or add money to your car.)

That being said, we normally want to hide the state and expose operations because it reduces coupling, which in turn reduces the "brittleness" of the whole system.

Tell, don't ask

Objects shouldn't have getters and setters; this is sometimes called the "tell, don't ask" principle. There are two cases where getters appear to be necessary:

  • Asking an object for some information and then deciding how to change the state of the same object. This code could be a fragment of a ping-pong game:

  if (ball.Y < 0 || ball.Y > MAX_Y)
    ball.SpeedY = ball.SpeedY * -1;

A much better way to do this is:

  ball.CheckYBounds();

thus encapsulating logic having to do with the ball inside the ball object.

  • What about when you ask an object for its status and then modify *another* object depending on that status?

  if (ball.Y < 0 || ball.Y > MAX_Y)
  {
    ball.SpeedY = ball.SpeedY * -1;
    sound.Bell();
  }

Hmm. Having the ball object invoke sound.Bell(); itself doesn't seem that good now. It creates a dependency between the ball object and the sound object. Even worse, what about drawing the ball on the screen? Allen Holub suggests having a method ball.DrawOnScreen(); - but this seems way too weird. The ball object is a "model world" object. It should only "know" about model world stuff - moving around, having coordinates, a size and a speed, interacting with other objects. Having it know about drawing stuff seems dangerous. What if I want to save its state to a database? Should it know about that too? This is definitely a violation of the Single Responsibility principle (aka SRP) - an object should only have one reason to change.

Communication between objects

One thing I read in the Turbo Pascal 6 manuals was this: every time you need object A to communicate with object B, think about adding a new object X in between them. At first, it sounds weird. However, if you need to add objects C, D and E to the mix, and all of them need to communicate with each other, then it pays for all of them to go through X: this way, you only need to have "send to X" and "receive from X" algorithms, instead of all the combinations. The clipboard is a good example of this: Word doesn't need to know how to communicate with Excel, Notepad, and a zillion other programs; it only needs to know how to put stuff into the clipboard and get stuff out of it.

So - how can this solve the above problem? How can the ball object inform the sound object that it needs to sound the bell, without creating a dependency between the two?

My solution would be a messaging system. We can have the ball object invoke the messaging object, which in turn invokes the sound object.

  // in class Ball
  void CheckYBounds()
  {
    //...
    if (Y < 0 || Y > MAX_Y)
    {
      SpeedY = SpeedY * -1;
      Messaging.SendMessage(SOUND_BELL);
    }
    //...
  }

  // in class Sound
  void OnBell() // answers to SOUND_BELL
  {
    //...
  }

Ok... the ball object is now free of the dependency on the sound object... it doesn't even have to know about the Sound class. However, it still has to know about the SOUND_BELL message. Why would a Ball know about making sounds? What if I want to reuse the Ball class in another application, where it's not supposed to make sounds when it bounces off the top and bottom edges? What if it's also supposed to flash when this happens - should I change the Ball class when I need to make a change to the way the ball is displayed? We're back to violating the single responsibility principle.

Inform, don't direct

Here is a great way of removing all such dependencies: don't have the object decide what other objects should do; just inform them of the state change and let the interested ones react to it (or not):

  // in class Ball
  void CheckYBounds()
  {
    //...
    if (Y < 0 || Y > MAX_Y)
    {
      SpeedY = SpeedY * -1;
      Messaging.SendMessage(BALL_HIT_HORIZONTAL_EDGE);
    }
    //...
  }

This has a few beneficial effects:

  • The message sent by the ball is defined in the Ball class itself, where it makes sense. It made sense for the SOUND_BELL message to be defined in the Sound class, but in that case we would still have had a dependency from Ball to Sound, and the Ball class does not care about sounds. Any object that wants to react to the BALL_HIT_HORIZONTAL_EDGE message should have no problem with a dependency on the Ball class - after all, it obviously cares about the Ball class. If we don't want the Sound class itself to have this dependency on Ball, that's fine: we can have a BallSounds object with dependencies on both Ball and Sound:

  // in class Ball
  void CheckYBounds()
  {
    //...
      Messaging.SendMessage(Ball.BALL_HIT_HORIZONTAL_EDGE);
    //...
  }

  // in class Sound
  void OnBell() // answers to Sound.BELL
  {
    //... ring a bell
  }

  // in class BallSounds
  void OnBallHitHorizontalEdge() // answers to Ball.BALL_HIT_HORIZONTAL_EDGE
  {
    Messaging.SendMessage(Sound.BELL);
  }

Note that both the compile-time and the run-time dependencies are in the directions we want: the Ball and Sound objects do not know about each other, and they also do not know about the BallSounds object. Apart from BallSounds, which exists precisely to connect the two, the objects know only about the messaging system.

  • A second advantage is that we've created an actual plug-and-play architecture, which has been a long-held dream of programmers, as far as I know. Let's say we decided to log one or more (or all) messages flowing through the system for debugging purposes. Simply create a new object, subscribe it to all the relevant messages (maybe add an "all" flag to the messaging system for this purpose) and have it log them when invoked. None of the other objects need to be modified in any way for this.
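The two points above can be sketched in one small program. The post never shows the Messaging object itself, so everything about it here is an assumption: the MessageHandler delegate, the Subscribe and SubscribeAll methods (the latter standing in for the "all" flag), the string message names, and the MessageLog class. The Ball fields, its Move method and the MAX_Y value are likewise made up to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical handler signature: receives the name of the message that was sent.
public delegate void MessageHandler(string message);

// Minimal sketch of the messaging object; every member here is an assumption.
public static class Messaging
{
    private static readonly Dictionary<string, List<MessageHandler>> handlers =
        new Dictionary<string, List<MessageHandler>>();
    private static readonly List<MessageHandler> catchAll = new List<MessageHandler>();

    public static void Subscribe(string message, MessageHandler handler)
    {
        List<MessageHandler> list;
        if (!handlers.TryGetValue(message, out list))
        {
            list = new List<MessageHandler>();
            handlers[message] = list;
        }
        list.Add(handler);
    }

    // Stands in for the "all" flag: this handler sees every message.
    public static void SubscribeAll(MessageHandler handler)
    {
        catchAll.Add(handler);
    }

    public static void SendMessage(string message)
    {
        // Iterate over copies, because a handler may send or subscribe in turn.
        foreach (MessageHandler h in catchAll.ToArray())
            h(message);

        List<MessageHandler> list;
        if (handlers.TryGetValue(message, out list))
            foreach (MessageHandler h in list.ToArray())
                h(message);
    }
}

public class Ball
{
    public const string BALL_HIT_HORIZONTAL_EDGE = "Ball.BALL_HIT_HORIZONTAL_EDGE";
    private const int MAX_Y = 100;   // assumed playfield height
    private int y, speedY;

    public Ball(int y, int speedY) { this.y = y; this.speedY = speedY; }

    public void Move() { y += speedY; CheckYBounds(); }

    private void CheckYBounds()
    {
        if (y < 0 || y > MAX_Y)
        {
            speedY = -speedY;
            Messaging.SendMessage(BALL_HIT_HORIZONTAL_EDGE);   // inform, don't direct
        }
    }
}

public class Sound
{
    public const string BELL = "Sound.BELL";
    public int BellCount;   // stands in for actually playing a sound

    public Sound() { Messaging.Subscribe(BELL, OnBell); }
    private void OnBell(string message) { BellCount++; }
}

// The only class that knows about both Ball and Sound.
public class BallSounds
{
    public BallSounds()
    {
        Messaging.Subscribe(Ball.BALL_HIT_HORIZONTAL_EDGE, OnBallHitHorizontalEdge);
    }

    private void OnBallHitHorizontalEdge(string message)
    {
        Messaging.SendMessage(Sound.BELL);
    }
}

// Plugged in for debugging without touching any of the classes above.
public class MessageLog
{
    public readonly List<string> Entries = new List<string>();
    public MessageLog() { Messaging.SubscribeAll(Entries.Add); }
}
```

Create a Sound, a BallSounds and a MessageLog, then call Move() on a new Ball(99, 5): the ball overshoots the edge, the bell count goes up by one, and the log records the ball's message followed by the bell message. Ball never mentions Sound, and removing the MessageLog line changes nothing else. A static hub keeps the sketch short; a real system might prefer an instance passed to each object.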

This is the architecture I'm envisioning:

The important point to keep in mind is this: an object should not send messages telling other objects what to do; it should inform them of what has happened and let them react to that.

to be continued...