More Hacking Shelob

I fussed around more with logging today, which led me to the parseHeader() function. Parsing is one of the weakest areas right now. For simplicity, I had implemented it by tokenizing on “space”, shoving the tokens into a string vector, and then iterating over that vector for the tokens I needed. So far, I’ve not peeked at anyone else’s source code; Shelob is a clean room implementation of a basic HTTP server.
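To make the weakness concrete, here is a minimal sketch of that kind of space tokenizer. It is not Shelob's actual code; the name tokenizeOnSpace is made up for illustration, and I am assuming the input is a single header line.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not Shelob's real parseHeader()): split a line on
// single space characters and collect the pieces into a string vector.
std::vector<std::string> tokenizeOnSpace(const std::string& line) {
    std::vector<std::string> tokens;
    std::istringstream in(line);
    std::string token;
    // std::getline with a ' ' delimiter yields an empty token for every
    // doubled space, and leaves any trailing '\r' glued to the last token.
    while (std::getline(in, token, ' ')) {
        tokens.push_back(token);
    }
    return tokens;
}
```

A well-formed line such as "GET /index.html HTTP/1.1" comes back as three tokens, but a doubled space silently shifts every later index, which is exactly the sort of thing that makes index-based lookups fragile.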

However, I really need to clean up the parser. I thought about going with a full lexer using flex or something, but that is probably overkill. Plus, I’d rather not add another dependency. More thought is needed on this, and maybe some research into how other people are doing it. Parsing is very much an area where security can go wrong, so it needs to be done right.
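One dependency-free middle ground between the naive tokenizer and a full lexer would be a small hand-written validator for the request line. The sketch below is only an idea, not what Shelob does: RequestLine, parseRequestLine, and the 8192-byte cap are all assumptions of mine, the method check (uppercase letters only) is stricter than HTTP actually requires, and it expects the caller to have stripped the trailing CRLF already. It needs C++17 for std::optional.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical result type for a parsed request line.
struct RequestLine {
    std::string method;
    std::string target;
    std::string version;
};

// Hypothetical strict parser: accepts only "METHOD SP target SP HTTP/x.y"
// and returns std::nullopt for anything else.
std::optional<RequestLine> parseRequestLine(const std::string& line) {
    constexpr std::size_t kMaxLength = 8192;  // arbitrary cap, an assumption
    if (line.empty() || line.size() > kMaxLength) return std::nullopt;

    // Reject control characters anywhere in the line (including a stray '\r').
    for (unsigned char c : line) {
        if (c < 0x20 || c == 0x7f) return std::nullopt;
    }

    // Require exactly two spaces, so there are exactly three fields.
    const std::size_t first = line.find(' ');
    if (first == std::string::npos) return std::nullopt;
    const std::size_t second = line.find(' ', first + 1);
    if (second == std::string::npos) return std::nullopt;
    if (line.find(' ', second + 1) != std::string::npos) return std::nullopt;

    RequestLine result{line.substr(0, first),
                       line.substr(first + 1, second - first - 1),
                       line.substr(second + 1)};
    if (result.method.empty() || result.target.empty()) return std::nullopt;

    // Simplification: only uppercase ASCII letters are allowed in the method.
    for (unsigned char c : result.method) {
        if (!std::isupper(c)) return std::nullopt;
    }

    // The version must look like "HTTP/" digit "." digit.
    const std::string& v = result.version;
    if (v.size() != 8 || v.compare(0, 5, "HTTP/") != 0 ||
        !std::isdigit(static_cast<unsigned char>(v[5])) || v[6] != '.' ||
        !std::isdigit(static_cast<unsigned char>(v[7]))) {
        return std::nullopt;
    }
    return result;
}
```

The point of the design is to fail closed: anything that does not match the expected shape is rejected up front instead of being passed along half-parsed.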

The other thought I had while poking around is that I could make each component into its own server, sort of a mini-microkernel approach. I could imagine a swarm of different servers, all able to communicate. You could have the log server running on one host, separate CGI servers for each user, as well as different backends.

The only thing I’m not sure about is how much overhead this would be. A lot of the interprocess communication could happen over local UNIX sockets, FIFOs, or even shared memory, but it would be awesome if it all worked fast over a regular socket. Yet more thought needed here.
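As a rough feel for the local UNIX socket option, here is a tiny sketch that pushes a message through a connected pair of AF_UNIX sockets using the POSIX socketpair() call. Everything about it is an assumption for illustration: the function name roundTripOverUnixSocket is made up, both ends live in one process for brevity (in the swarm idea they would be separate servers), and it only suits short messages that fit in the socket buffer.

```cpp
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical demo: write a short message into one end of a UNIX socket
// pair and read it back from the other end. Returns "" on failure.
std::string roundTripOverUnixSocket(const std::string& message) {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return "";

    std::string received;
    const ssize_t sent = write(fds[0], message.data(), message.size());
    if (sent == static_cast<ssize_t>(message.size())) {
        // Close the writing direction so the reader sees end-of-stream.
        shutdown(fds[0], SHUT_WR);
        char buffer[256];
        ssize_t n;
        while ((n = read(fds[1], buffer, sizeof buffer)) > 0) {
            received.append(buffer, static_cast<std::size_t>(n));
        }
    }
    close(fds[0]);
    close(fds[1]);
    return received;
}
```

Timing a loop of calls like this against the same exchange over a TCP socket on localhost would be one way to start measuring the overhead question.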

So far I’m having a blast playing with this program. It is nice to write something for yourself and make only the trade-offs you decide on. I don’t have any customer or management trying to shoehorn this thing into something I don’t want. Even if I never release it, it is a good brain exercise.
