PHP and inconsistent return values

The other day a co-worker asked me why the following snippet of code was acting weird:

<?php
	$box = 'Tall box';
	if(stripos($box, 'wide ') == 0) {
		echo "It's a wide box!";
	} else {
		echo "It's a tall box!";
	}
?>


C:\> php test.php
It's a wide box!

I’m sure any seasoned PHP devs will spot the error straight away, but it took us a couple of minutes.

From the PHP docs for stripos:

int stripos ( string $haystack , string $needle [, int $offset = 0 ] )
Returns the numeric position of the first occurrence of needle in the haystack string.

Unlike strpos(), stripos() is case-insensitive.

Returns the position as an integer. If needle is not found, stripos() will return boolean FALSE.

This is all well and good, except that in PHP, (0 == false) == true. With the loose == operator, the interpreter juggles the types of both operands before comparing them, and concludes that false is equivalent to 0 (which is in turn loosely equal to NULL and, before PHP 8, to the empty string). In the snippet above 'wide ' is not found, so stripos() returns false, false == 0 evaluates to true, and the first branch runs. This behaviour is convenient in a lot of circumstances, but not great if you are specifically looking for the return value 0.

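The solution? PHP 4 introduced the === operator (see the manual’s page on comparison operators), which compares without type juggling, so writing the condition as stripos($box, 'wide ') === 0 behaves as intended. The manual defines it as: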

===: TRUE if $a is equal to $b, and they are of the same type.
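
The same trap can be reproduced outside PHP. The sketch below is a Python analog, not PHP itself: stripos_like is a hypothetical helper written for illustration that mimics stripos() by returning False when the needle is missing. Because Python also treats False == 0 as true, the loose check misfires in the same way, and an identity check against False plays roughly the role of ===.

```python
def stripos_like(haystack, needle):
    """Hypothetical stand-in for PHP's stripos(): position, or False if absent."""
    position = haystack.lower().find(needle.lower())
    return False if position == -1 else position


def starts_with_loose(haystack, needle):
    # Buggy: False == 0 is True in Python too, so "not found" looks like "found at 0".
    return stripos_like(haystack, needle) == 0


def starts_with_strict(haystack, needle):
    # Rough equivalent of PHP's ===: rule out False before comparing with 0.
    result = stripos_like(haystack, needle)
    return result is not False and result == 0


print(starts_with_loose("Tall box", "wide "))   # True (the bug)
print(starts_with_strict("Tall box", "wide "))  # False
print(starts_with_strict("Wide box", "wide "))  # True
```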

Contrast this simple task with how you might approach it in C#¹.

string box = "Tall box";
if (box.IndexOf("wide ") == 0)
{
    Console.WriteLine("It's a wide box!");
}
else
{
    Console.WriteLine("It's a tall box!");
}

The difference here is that String.IndexOf(...) will return -1 if the substring isn’t found.

Having worked with a statically typed language for a number of years, I find the whole idea of dealing with a return value that may be one type or another a bit wrong. It is worth noting, though, that not all dynamically typed languages behave this way. Python, for instance, is a lot more sensible in this scenario: None is not equal to False, and a failed substring search raises an exception instead of returning a falsy value.

>>> None == False
False
>>> 0 == False
True
>>> "test".index("hello")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found
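
Applied to the box example, this design means the check cannot silently succeed. A minimal sketch, assuming the same inputs as the PHP snippet (the function names are illustrative): str.find() returns -1 rather than a falsy value when the substring is missing, and str.startswith() expresses the intent directly.

```python
def is_wide_box(box):
    # str.find() returns -1 when the substring is missing, so comparing with 0 is safe.
    return box.lower().find("wide ") == 0


def is_wide_box_prefix(box):
    # startswith() states the intent directly; lower() keeps it case-insensitive.
    return box.lower().startswith("wide ")


for box in ("Tall box", "Wide box"):
    kind = "wide" if is_wide_box(box) else "tall"
    print(f"{box!r}: it's a {kind} box")
```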

This issue pops up in other PHP library functions (strtok(), current() and readdir(), to name a few), perhaps because of the standard library’s endemic inconsistencies. The official documentation carries warnings and the behaviour seems to be widely accepted, but I still can’t help feeling that it is counter-intuitive and must be yet another sticking point for new PHP programmers. Python is a great example of a middle ground between PHP and C#: strongly typed but also dynamic. Maybe that’s part of the reason why PHP’s position in the TIOBE index is gradually falling while Python’s is rising.


¹ String.StartsWith() would be better for this specific case.

London Riots – A couple of useful tools

Well, I’ve been glued to the news all evening watching the third night of riots unfold. Found these tools which have provided a constant stream of information on my second monitor:

  • DeskPins
    Allows you to set windows to float above all other windows on your desktop.
  • Chrome Auto-Refresh
    Reloads web pages at a constant interval.
I’m not in London but have family and friends who are. It sounds like a real mess down there, and so much is being reported on Twitter and other sources that just isn’t getting on BBC News or *shudder* Sky. Stay safe.

SQL Server Management Studio Formatting Plug-in

I stumbled across this pretty nice SQL Server Management Studio Plug-in today that formats your SQL for you, apparently using the same formatter that is used for poorsql.com. I’m using 2008 R2 and it works great. Source is available on Github.


Tip: If you are like me and don’t like leading commas in field lists, install it, open SSMS and go to Tools » T-SQL Formatting Options, and check ‘Trailing Commas’.

Convert WinSCP/PuTTY SSH Host Keys to known_hosts format

The other day I posted about SharpSSH and mentioned that if you want to use strict host checking (i.e. checking the remote server’s public key against a stored version to protect against man-in-the-middle attacks), you need a ready-made known_hosts file. This is all well and good if you use the OpenSSH utilities, but not so awesome if you are on Windows and use PuTTY or WinSCP, which store the known hosts in the registry.

There is already a converter that takes your known_hosts and turns it into registry entries, but I needed something to work in the other direction, so I wrote reg2kh.

It works for both PuTTY and WinSCP stored keys. I’ve only tested it with RSA keys, but DSS keys should work with minor changes at most.

> python reg2kh.py --winscp
127.0.0.1 ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQDMskjPhmqsB *snip* AA==
> python reg2kh.py --putty
127.0.0.1 ssh-rsa AAAAB3NzaC1yc2EAAAAD23ABAAAAf8qHnPq90Adnd+ *snip* bQ==
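
For readers curious about what such a conversion involves, here is a minimal sketch. It is not the code from reg2kh; it assumes PuTTY’s registry layout for RSA keys, where each value under SshHostKeys is named like rsa2@22:127.0.0.1 and holds the public exponent and modulus as comma-separated hexadecimal numbers. The function name and the sample modulus are made up for illustration, and reading the values out of the registry is left out.

```python
import base64
import struct


def _ssh_string(data):
    # SSH wire format: 4-byte big-endian length followed by the bytes.
    return struct.pack(">I", len(data)) + data


def _ssh_mpint(value):
    # Positive multi-precision integer; the extra bit of room keeps the sign bit clear.
    length = value.bit_length() // 8 + 1
    return _ssh_string(value.to_bytes(length, "big"))


def putty_rsa_to_known_hosts(value_name, value_data):
    """Convert one (assumed) PuTTY registry entry to a known_hosts line."""
    key_type, location = value_name.split("@", 1)
    if key_type != "rsa2":
        raise ValueError("this sketch only handles rsa2 entries")
    port, host = location.split(":", 1)
    exponent_hex, modulus_hex = value_data.split(",")
    exponent = int(exponent_hex, 16)
    modulus = int(modulus_hex, 16)

    blob = _ssh_string(b"ssh-rsa") + _ssh_mpint(exponent) + _ssh_mpint(modulus)
    encoded = base64.b64encode(blob).decode("ascii")
    # OpenSSH writes non-default ports as [host]:port.
    host_field = host if port == "22" else f"[{host}]:{port}"
    return f"{host_field} ssh-rsa {encoded}"


# Made-up, far-too-short modulus purely to exercise the function.
print(putty_rsa_to_known_hosts("rsa2@22:127.0.0.1", "0x10001,0xc5a3f1"))
```

With the usual public exponent of 0x10001, the encoded key begins with AAAAB3NzaC1yc2EAAAADAQAB, the same prefix visible in the --winscp output above.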

I’m no Pythonista, so there will surely be omissions or quirky bugs. I await your pull requests!

SharpSSH UnknownHostKey error

Well today I was mincing around with SharpSSH (based on Java Secure Channel) – we were writing a small app that’s part of a larger scheduled process, but all we needed was something that would go off and download some files from an SFTP server. I actually found SharpSSH incredibly easy to use after reading a couple of examples, but one annoying error came up that took a while to solve:

JSchException was unhandled:
    UnknownHostKey: 127.0.0.1. RSA key fingerprint is 6c:cb:9d:c6:ce:d7:b1:fb:3e:e0:b0:c8:59:8c:b1:ba

The first fix I found was a bit of a hack – just tell the SSH client not to worry about a host’s identity.

// Connect to the server
JSch jsch = new JSch();
Session session = jsch.getSession(User, Host, Port);
// other session init stuff
 
Hashtable config = new Hashtable();
config.Add("StrictHostKeyChecking", "no");
session.setConfig(config);
 
Log(String.Format("Connecting to {0}@{1}...", User, Host));
session.connect();

Now, this does work, but it causes a security issue. Do we want to blindly connect to hosts whose identity we can’t verify? What if we suddenly become the target of some clever man-in-the-middle attack? You can’t even turn off the equivalent check in PuTTY, and Simon Tatham goes as far as to say that this is the point of using SSH instead of Telnet.

The solution is to store your hosts’ fingerprints in a known_hosts file. For the OpenSSH implementations, you can find this file in the .ssh directory under your home directory. If you are using PuTTY or WinSCP, you’ll need to connect from a machine that has one of the OpenSSH-derived implementations, as there doesn’t seem to be a way of converting PuTTY’s known host fingerprints to a known_hosts file*. Once you’ve got the file, remove the keys you don’t need and you can reference it in your application like this:

// Connect to the server
JSch jsch = new JSch();
Session session = jsch.getSession(User, Host, Port);
// other session init stuff
 
StreamReader kr = new StreamReader(File.Open(@".\known_hosts", FileMode.OpenOrCreate));
jsch.setKnownHosts(kr);

This approach, combined with other tactics such as use of public keys for authentication and tight firewall rules, should make it as hard as possible for attackers to intercept the data that is being transferred.
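
When trimming a known_hosts file, it helps to confirm that a kept entry matches the fingerprint reported in the UnknownHostKey error. The colon-separated value in that message has the form of the classic MD5 fingerprint, which is the MD5 digest of the base64-decoded key blob. A minimal sketch, assuming a plain (unhashed) known_hosts line with host, key type and key fields; the function name and the sample line are illustrative:

```python
import base64
import hashlib


def md5_fingerprint(known_hosts_line):
    """Return the colon-separated MD5 fingerprint for one known_hosts entry."""
    fields = known_hosts_line.split()
    key_blob = base64.b64decode(fields[2])
    digest = hashlib.md5(key_blob).hexdigest()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


# Truncated, made-up key material, only to show the output shape.
sample = "127.0.0.1 ssh-rsa AAAAB3NzaC1yc2EAAAADAQAB"
print(md5_fingerprint(sample))
```

If the printed value equals the fingerprint in the error message, that line is the one to keep for the host.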


*PuTTY and WinSCP use their own formats, whereas JSch and SharpSSH both use the OpenSSH format. It is possible to convert public and private keys via PuTTYgen, but I can’t work out how to do the same with the known hosts. There is, however, a Python script that converts a known_hosts file to a PuTTY-compatible registry key, so the conversion has to be possible in reverse.