Category: Adam's Software Blabbing

03/12/07

Permalink 08:12:18 pm, by destructionator Email , 625 words, 80 views   English (US)

PHP annoyances

I posted this on a web programming forum where I have been ranting about PHP, but I figure it makes sense to post it here too, so here it is:


From my current project:

/*
	$orderby= ($sort == 'date')	?	'o.date_placed'
		: ($sort == 'brand')	?	'b.brand_name'
		: ($sort == 'customer')	?	'c.customer_lastname'
		: ($sort == 'status')	?	'o.order_status'
		: ($sort == 'order')	?	'o.order_id'
		: 'b.brand_id';
*/
// Have to rewrite the above thusly because PHP hates me
	$orderby = 'b.brand_id';
	if ($sort == 'date')
		$orderby = 'o.date_placed';
	if ($sort == 'brand')
		$orderby = 'b.brand_name';
	if ($sort == 'customer')
		$orderby = 'c.customer_lastname';
	if ($sort == 'status')
		$orderby = 'o.order_status';
	if ($sort == 'order')
		$orderby = 'o.order_id';
// all done

The commented out block that comes first is a way you can easily do in-line translation tables in C, C++, Java, etc.

It uses nested ternary operators, which basically mean a bunch of if() else blocks, but all together in one assignment statement.

You can see here that, with some indenting, it is very easy to read and to add, remove, or modify conditions, even if you aren't familiar with the ternary operator. Even if you have never seen it before, I think it is pretty easy to read at first glance.

The second part (the uncommented part) does exactly the same thing with a series of if statements. I think it is a little harder to read, and it was harder to write, but modifying, adding, and removing from it is pretty easy too.

I wanted to use the commented part, but when I ran it, it always sorted by the order id, regardless of the sort variable's contents, which certainly wasn't what I had in mind.

So I went to the PHP manual, and sure enough, it talked about what I was trying to do and explained it:

PHP.net
Note: It is recommended that you avoid "stacking" ternary expressions. PHP's behaviour when using more than one ternary operator within a single statement is non-obvious:

<?php
 // on first glance, the following appears to output 'true'
 echo (true?'true':false?'t':'f');
 
 // however, the actual output of the above is 't'
 // this is because ternary expressions are evaluated from left to right
 
 // the following is a more obvious version of the same code as above
 echo ((true ? 'true' : false) ? 't' : 'f');
 
 // here, you can see that the first expression is evaluated to 'true', which
 // in turn evaluates to (bool)true, thus returning the true branch of the
 // second ternary expression.
 ?>

Non-obvious, indeed. Even JavaScript does it the C way, and JS is also weakly typed, so it could in principle fall into the same trap described in the final comment of the example.
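For comparison, here is a minimal sketch of the same translation table in C++, where the conditional operator groups right to left, so the chain behaves like a series of if / else if branches. The function name order_by_column is made up for illustration; the column names are the ones from my PHP snippet above.

```cpp
#include <string>

// Hypothetical helper (not from the original project): maps the sort key
// to a column name with one chained conditional expression.
std::string order_by_column(const std::string& sort) {
    // In C++ the conditional operator groups right to left, so each ':'
    // acts as the "else" for the test directly before it, and the last
    // line is the default.
    return (sort == "date")     ? "o.date_placed"
         : (sort == "brand")    ? "b.brand_name"
         : (sort == "customer") ? "c.customer_lastname"
         : (sort == "status")   ? "o.order_status"
         : (sort == "order")    ? "o.order_id"
         :                        "b.brand_id";
}
```

With this, order_by_column("customer") gives c.customer_lastname, and anything unrecognized falls through to b.brand_id, which is what I expected the PHP version to do.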

PHP's behaviour is understandable in a way: its ternary operator groups left to right instead of nesting to the right like the other languages, a decision perhaps made to improve execution time. Nested ternaries aren't that common outside of things like what I am trying to do here, unless you are purposefully obfuscating your code. Still, it breaks from the other big languages, which is another small annoyance, and it removes a very useful and easy to use tool for no very good reason, as far as I can tell.

(In the time it took them to write that warning in the manual, they probably could have added a recursive call to the parser to allow the feature to be used as well. It was surely a conscious design decision to do it their way, one that I, needless to say, disagree with.)

Oh well, the if() works, but by that logic, I could say if() and goto work just as well as for and while loops, and throw those out too!

Anyway, venting complete, time to finish the project.

12/24/06

Permalink 05:21:09 pm, by destructionator Email , 1106 words, 132 views   English (US)

Digital signing of executables in Linux

In this post, I will talk about the digital signing of binaries in Linux, something Windows does and I like. I know I am once again saying Linux should basically copy Microsoft, but that is another post for another time. Anyway....

I was reading some Microsoft blogs earlier, and I was reminded of a nice feature Windows has that Linux does not: the ability of a publisher to digitally sign a binary file before distributing it.

When a program is run, Windows checks the signature against a list of trusted publishers and the expected checksum. If this works out, it runs the program. If something is wrong, it warns the user and will not run until he gives the OK.

Now, a dialog box warning a user is often ignored, since the average user just tends to click to make it go away, but the idea is certainly valid and the box might warn some users, especially the more technically skilled users (like many Linux users).

The signature does two things:

1) Ensure that the file is from someone you trust, and actually came from that publisher (meaning it wasn't spoofed, the site wasn't hacked, etc).

2) Ensure the file has not been altered or tampered with. It ensures the download was successful and the site was not hijacked. It ensures the file being loaded was not subtly corrupted (say, by a failing hard drive), and, what I think is most important, it ensures the file hasn't been altered or infected by a virus.

Now, the digital signature isn't a perfect method. There are ways to fool it, but they are extremely hard to pull off, often requiring brute force against the cryptography. Even then, an attacker is limited in what he can do unless a private key is compromised, at which point the key can be invalidated by some other secure mechanism.


There are a few downsides:

1) It makes loading and running a program slightly slower, because it needs to check the checksum and signature. This is probably not a big deal, and could be optimized with a cache of already checked programs, with an entry invalidated whenever its file is written to (as notified by the kernel). That would also have the downside of slightly slowing down filesystem writes, but I feel this could be negligible, especially since most of the relevant binaries would only be writable by root anyway.

2) The trusted keys would have to be gathered. Some could come with the required kernel patch (a similar list is how HTTPS works: the browser knows which root certs to trust), you can trust yourself, and getting more can be done in a similar way to PGP keys for digitally signed emails: you get the key from the author's website and add it to your trusted list. To be really secure, you would get some keys in person to be sure you aren't being faked, but for this, you can probably get by with an easier, but less secure, method.

3) It would need to either load the entire binary at once to do the checksum, which is slightly memory inefficient and slightly slow (related to problem 1), or load and check sections of the binary one at a time on demand, which would probably require a change in binary format to accommodate it, and of course the needed code as well.

4) The developers would have to sign their binaries and distribute their signatures to avoid the warnings, otherwise the system loses its effectiveness (if users get used to the warning, they will start to ignore it). This isn't too bad, and would probably be automated in makefiles and by a developer trust script.

I might propose adding this to a package format to automate it even more.

5) It wouldn't apply to source downloads, not directly at least (the source tarballs could still be signed like they are now). When you compile the source, you would probably want to self-sign it for your own use. This would either have to be built into the linker (dangerous, IMO), the makefile (also somewhat dangerous, but then again, makefiles can be horribly dangerous if used maliciously anyway, so this probably shouldn't be discounted), or done manually by the user. Of course, it could also not be signed, but that would again break the effectiveness of the check.

6) Devs or users who self sign would have to remember one more passphrase for their key. Not a big deal, especially for devs, and simple self-signing could be done in a transient manner or without a passphrase (I don't recommend that), or tied to their other password, which I also am against, but might be useful.

7) The binary format would need to support a place for the signature block. I'm not sure if ELF can do this or not, but if not, it shouldn't be too hard to modify it slightly to allow it, and add a simple kernel patch so the kernel can understand the modification.

With a small kernel patch, a relatively simple cache and execute daemon and gpg, I think this can be done. The signing tool would be built around gpg, which has all the necessary cryptographic capabilities and can be automated (like I do for my emails; it really is pretty simple)
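To make the cache idea from downside 1 a bit more concrete, here is a rough sketch of what the daemon might keep in memory. Everything in it is an assumption for illustration: the names VerifiedCache and FileStamp, and the choice to also compare modification time and size as a cheap extra check. The real protection in my proposal would come from the kernel's write notification calling invalidate, not from the timestamp.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical record of a file's state when its signature was last checked.
struct FileStamp {
    std::int64_t mtime;   // last-modified time, e.g. seconds since the epoch
    std::uint64_t size;   // file size in bytes

    bool operator==(const FileStamp& other) const {
        return mtime == other.mtime && size == other.size;
    }
};

// Hypothetical cache of binaries whose signatures were already verified.
class VerifiedCache {
public:
    // Record that `path` passed signature verification in state `stamp`.
    void mark_verified(const std::string& path, FileStamp stamp) {
        verified_[path] = stamp;
    }

    // Meant to be called when the kernel reports a write to `path`.
    void invalidate(const std::string& path) {
        verified_.erase(path);
    }

    // True only if the file was verified earlier and still looks unchanged;
    // otherwise the full checksum and signature check has to run again.
    bool is_verified(const std::string& path, FileStamp current) const {
        auto it = verified_.find(path);
        return it != verified_.end() && it->second == current;
    }

private:
    std::unordered_map<std::string, FileStamp> verified_;
};
```

On a cache miss, the daemon would run the full gpg check and call mark_verified only if it passes.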

The trusted keys would probably be stored in a file that only root can get at, so malware wouldn't be able to simply add their signatures to the safe list ahead of time.

Inertia would probably be the biggest problem. Many Linux devs, distros, and users might not want to change, or some might see it as DRM (which it isn't: it is just a warning, but sometimes that can be hard to explain)

A question would be how the checking daemon warns the user if things are wrong. Would it be to the terminal? If so, which one? I would say it should be the console attached to the running program. But what if one isn't present or visible? Would it use a dialog box? What if X isn't running? What GUI toolkit would it use for the box?

The fragmentation of Linux GUIs once again rears its ugly head, but this is something that should be able to be worked around on a per user basis in the daemon config file, I think. (Needs more thought to make that right.)

I just wanted to get this down. I will put more work into it another time, and when I have thought about it more, I will be sure to post again. This is something I think could be a good idea to implement.

11/29/06

Permalink 11:57:14 pm, by destructionator Email , 750 words, 97 views   English (US)

The advantages of knowing C

I'm going to share a quick anecdote from my rotate 12 project (an Asteroids-like game) that I did a few weeks ago.

So I was working on a game object engine that could dynamically support an arbitrary number of objects that can shatter on collision for the game, written in C++. I decided to use a dynamic array of pointers to objects allocated on the heap. I considered simply writing this myself with C, but instead decided to use an STL vector. I figured it would handle the dynamic memory for me and by using it, I would have a chance to get a wee bit more experience with the STL, which is always good.

So I go about creating a std::vector<GameObject*> that held the objects. Each program loop, I would create an iterator and use it to go through each item in the list, see if it is colliding, and if so, delete the object, remove it from the list, then add objects to represent the fragments.

Once the adding and removing is done, I would continue through the iterator. And the program would randomly segfault. Those of you familiar with C++'s STL probably see the problem already.

I try different combinations and speeds, and eventually, it would always segfault. Sometimes quickly, sometimes not for a few thousand loops.

I pull up the debugger and see it is crashing in an STL function. Being like many programmers, I want to blame the library, but my brain knew better. There was something wrong with my code. I thought I might have messed up synchronization between some of my threads, but taking a look showed this could not be the case. My problem was limited to the main thread.

Next, I decided to try to preallocate more memory for the vector. This made the crashes happen later on, but they still happened.

After dicking around with it for over two hours, it finally hits me in a DUH moment. Think about what is going on when adding objects to the vector. Internally, if the new size exceeds what is preallocated, it has to reallocate its storage, much as a C program would call realloc for the new size.

Of course, the pointer returned by realloc is not guaranteed to be the same one you pass into it. If the block of memory where it was does not have a contiguous section big enough for the new size, realloc will get a new block somewhere else in memory, copy the memory from the old location to the new one, and free the old one, returning a pointer to the new memory.

An STL vector iterator is, in effect, a pointer to the memory location of the current object in the list. Since I tried to grow the vector while iterating, it would sometimes reallocate, which would move the memory, rendering the pointer held in the iterator invalid until it is reloaded! (Removing an element from the middle invalidates iterators at or after that position too.) Thus, I next try to read the next location of the iterator and it accesses a freed block, and segfaults.

I ended up fixing the problem simply by not doing the changes while iterating; I would queue the removal and insertion events and execute them all after the iteration is complete. I lost some speed (from n squared to n squared plus n for this algorithm), but it no longer crashes, and the main performance bottleneck isn't in that algorithm anyway.
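The queued approach can be sketched roughly like this. It is not the actual game code: the GameObject fields and the update function are hypothetical stand-ins, the rule that a larger object breaks into two smaller fragments is made up, and it uses std::unique_ptr instead of the raw pointers I described so that erasing an entry also frees the object.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the game's object type.
struct GameObject {
    int size;        // objects with size > 1 break into two smaller fragments
    bool colliding;  // assumed to be set by collision detection elsewhere
};

// One pass over the objects: only record what to remove and add while
// looping, then apply the changes after the loop has finished.
void update(std::vector<std::unique_ptr<GameObject>>& objects) {
    std::vector<std::size_t> to_remove;               // indices of shattered objects
    std::vector<std::unique_ptr<GameObject>> to_add;  // fragments waiting to be inserted

    for (std::size_t i = 0; i < objects.size(); ++i) {
        const GameObject& obj = *objects[i];
        if (!obj.colliding) continue;
        to_remove.push_back(i);
        if (obj.size > 1) {
            to_add.push_back(std::make_unique<GameObject>(GameObject{obj.size - 1, false}));
            to_add.push_back(std::make_unique<GameObject>(GameObject{obj.size - 1, false}));
        }
    }

    // Erase from the highest index down so the lower indices stay valid.
    for (auto it = to_remove.rbegin(); it != to_remove.rend(); ++it) {
        objects.erase(objects.begin() + static_cast<std::ptrdiff_t>(*it));
    }

    // Nothing is iterating over `objects` any more, so growing it is safe.
    for (auto& fragment : to_add) {
        objects.push_back(std::move(fragment));
    }
}
```

The loop never touches the vector's size while it runs, so a reallocation can only happen in the last step, when no iterator or index into the vector is still in use.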

Now, if I just decided to code it myself in C, this wouldn't have happened. Why? Because I would have been keenly aware of what was going on with the pointers under the abstraction and would have accounted for it. Indeed, since I knew how to do it myself, the reason for the crashes became clear once I thought to look in the right place. Imagine if I didn't know C. I probably would have gotten a fix from documentation and googling, but probably wouldn't have understood it.

Of course, now it won't happen again, since now I know and am aware of that. By using the STL, I did get some of the experience in using it that I wanted in the first place, so I call this a great success (and the code does exactly what I wanted beautifully).

But, I point to this as another reason why all programmers should be familiar with lower level assembly and C, even if you never write it. By knowing what is going on, you can prevent yourself from making silly errors and catch and understand bugs better.

11/24/06

Permalink 01:29:04 am, by destructionator Email , 1000 words, 51 views   English (US)

Software usability I - this blog program

As I get started with this blog, I notice a number of things about this program that make me cry WTF. I have decided to make this the first in a series where I will talk about software usability, with a special emphasis on Open Source software.

Here on the page for writing a new blog post, I see a number of things that jump right out at me as being terrible design. First, writing the post seems to default to an HTML input system, with some scripted buttons for inputting tags. However, these buttons don't offer any indication as to what the tags do. There is a series of buttons: ins, del, str, em, code, p, ul, ol, li, etc. Now, as a developer, I am familiar with the meaning of even the more esoteric tags (and somewhat like it this way), but imagine being a regular user looking at that.

The average user isn't going to know, or care, what the HTML tags mean. They just want results, so they can accomplish their task, which is probably not dicking around with your software. They aren't going to want to know that ul means 'unordered list'. They aren't going to care that you can't nest block level elements inside inline tags. They want it to just work.

The program gives little indication of what the tags do, either. If you click a button, the tag simply appears in the text area. Hovering the mouse over some of them does expand the name (ul does say 'unordered list' if you hover, which, to be frank, I didn't even think of doing at first; it is not very obvious), but they don't go into the actual use of it. A user might think that once you open a list you can simply type inside it, but you need list items inside a list, which necessitates another tag, lest it throw an error like this:

Invalid post, please correct these errors: Tag ol may not contain raw character data

That is somewhat useful if you are already familiar with how XHTML works, but gives little assistance to a newbie. I also checked the manual so I could talk about it here, but it too offered little help. And even if it did, let's be realistic: very few users will even touch the manual, if it even exists (poor or nonexistent documentation will be a post for another day).

I'm a big fan of context-sensitive help in applications, which is something the Web interface does not make easy (for example, there is no real support for the 'What's This' feature of real desktop applications; the weakness of the browser as a platform will also be a future post). However, it can be done to at least a small extent, especially through the use of scripting. phpBB tries to do this, and while I think it could be better, it isn't terrible. Its buttons are labeled far more meaningfully than simply with HTML tags, and under them is a small text box whose text changes to explain what a button does when you move the mouse over it.

Adding some form of an automatic help box gives an at-a-glance blurb to help the users ensure they have the right thing, something more than an expansion of the feature's name, and is something I think more web apps should have.

Aside from the terrible tags situation, this writing section also suffers from two other obvious faults. One is a checkbox simply labeled "Auto-BR This option is deprecated, you should avoid using it." What the fuck is Auto-BR? OK, so I should avoid using it... so why even put it there? My guess is it automatically converts new lines in the text area to line breaks in the outputted text, since that is what a br tag would do, but how would a normal user know that? It is not like it offers any explanation at all; not even a name expansion in the title attribute for mouse hovering. I could look it up in the manual... but again, the average user probably wouldn't bother with that, instead just ignoring it as another bizarre mystery box.

Since converting new lines to line breaks is what the user probably expects, not realizing that whitespace is collapsed in SGML and, by extension, HTML, having such an option makes sense. Indeed, the program does have one: it sits in the subject of my last complaint, a section called 'renderers' with a list of checkboxes, and it is checked by default.

So if the renderers section does the same job, why the hell even keep the mystery Auto-BR option there anyway? Just an example of clueless usability design.

Lastly comes my complaint with that renderers section. It is a list of checkboxes with bizarre names, such as "Textile (beta)" (and the beta thing will be another post eventually), "Wacko formatting" and "Texturize". What the hell do those mean? Even as an experienced developer, I am at a loss, and hovering over them gives little help: "Wacko style formatting". Gee, great, that really sheds light on everything. Not. I would have to look it up in the manual, and you know what I said about users doing that.

So, while I rather understand, as a developer, why the authors would list HTML tags, it is a terrible decision for general usability. The renderers list seems to me to be a 'feature oriented' interface, rather than the superior 'task oriented' interface we see in more successful products. These come from developers saying 'ooo look what we support!' rather than thinking 'here is how you can do what you want to do.' Even Eric Raymond, an open source hacker, has attacked these interfaces in the past: they are just not generally a good idea.

Joel Spolsky has written a free online book on user interface design. I think every programmer should read it before turning out any more half baked interfaces on the general public.

Permalink 12:48:31 am, by destructionator Email , 340 words, 89 views   English (US)

Warcraft II on Windows XP

I wanted to play Warcraft II today on a LAN between my Linux box and my brother's Windows box. Setting up the network on Linux was easy (get Greg Page's IPX tools, easy to find on Google, compile and install, then run ipx_interface add -p eth0 etherii 0x12345678 to set up the interface), and similarly easy on Windows 95 and 98, but it was quite a hassle on Windows XP.

First, the obvious had to be done: install the IPX protocol (go to network connections, properties, add, protocol, IPX/SPX compatible protocol). After that, set the network type to match what was set on the Linux box (IPX protocol -> properties, frame type set to Ethernet II and network number under frame type to 12345678). IPX does its networking differently than TCP/IP. Note I know far more about TCP/IP than I do IPX, but basically IPX is about networks whereas IP is about hosts. An IP address describes a single host on the network, and therefore must not conflict with any other. An IPX network number describes the network, and therefore must match on all machines on the network.

Well, that much was simple enough, but Warcraft II still wasn't able to connect to the Linux box! I knew this time the problem was XP; as I said, it worked fine between Linux and other Linux machines or a Win 9x box. After double checking everything, I ended up turning to Google. The reason it didn't work was a NetBIOS conflict! See, on WinXP, there is by default a NetBIOS implementation for IP installed. When IPX is installed, it also adds NetBIOS over IPX, which screws things up.

Fixing the problem, after knowing what it was, became trivial. Simply ensure the IPX box is checked in network properties on Windows and the other NetBIOS box above it is not checked. Once it is unchecked, hit OK and you will be ready to return to playing Warcraft II, or any one of the many other games of that era that required IPX.
