Should be free.
For one thing, it greatly reduces costs because you don't need all the ticket printing/selling/checking equipment and salaries for all those people who collect the money. You also don't need cops checking people's tickets. Because of this the cost for tax-supported free public transit would be much less than the total cost paid by ticket purchasers currently.
It also greatly improves efficiency because buses can just pick people up without the big stall for ticket buying and checking at the entry way. You can change your buses to have more doors and wider doors so that people can just flow on and off more easily and eliminate the huge pinch point that causes big slowdowns.
This will only appeal to liberals, but it's a nice way of giving back to the working poor. Most people who take the bus to work are working poor and can use every break they can get. Riding the bus will always be unpleasant so it's not like people will abuse the privilege.
Now some people would argue that free public transit would overburden the system, there wouldn't be enough buses, they would get too crowded. That's completely retarded, that's exactly the outcome you *want*. You buy more buses!
I believe it may actually be cheaper in the long run. If you have more people on public transit you don't have to spend as much making new roads everywhere, people buy fewer cars saving them tons of money, we pollute less and have to spend less on reducing CO2, etc. There are tons of long term savings that are hard to quantify.
Now I'm sure the libertarian/republican/dickhead/ayn-randers out there are saying "but I don't ride the bus, why should I pay for it?" First of all, fuck you, you're a dick. Public policy should be optimized to maximize the total good (or perhaps to maximize the median good), the principle of "pay per use" does not enter into rational consideration of public policy at all. Second of all, *you* are the person who would get the most benefit from this. Personally I would probably keep driving because I hate the bus, but I would love to see free public transit, because it gets cars off the road and reduces traffic for me. It's win-win for everyone.
Volunteer Park is an oasis of calm and beauty in this shitty pot-hole-ridden non-stopping-car-filled city.
I met Barefoot Ted in Volunteer Park the other day. Apparently he's running 45 miles barefoot around Green Lake for his 45th birthday today. He's like a barefoot Jack LaLanne.
The next day I discovered my PT Wolf is friends with Ted. Wolf sometimes wears Five Fingers shoes which are for barefoot runners who want protection. They look pretty sweet, I might pick up a pair just so I can enhance my goofball factor a bit.
The reservoir in the middle of Volunteer Park is calling to me. It's all fenced up because America is a bunch of cocks that locks everything and puts fences around everything fun. I've been fantasizing about getting a pair of bolt cutters and breaking in to swim around in it; it's a beautiful huge fresh water pool, it would be so sweet. Obviously I would get arrested, but it would be worth it.
Another act of civil disobedience I've been contemplating lately is painting crosswalks at all the intersections around here. It rankles me deeply the way the cars don't stop for pedestrians here, even on the quiet residential streets. I think it would be a simple pointed statement to go out in the night and paint crosswalks all over.
I finally went and drove a 370Z this morning. I skipped it in the earlier car testing because I think it's really ugly and I thought the G37 was just a better version of the same thing. I was wrong, the 370Z is fucking fantastic.
It feels small and tight and powerful, the steering feeling and response is excellent, the throttle is responsive. Obviously I didn't push it too hard on a test drive, but it just felt *fun* unlike any of the other cars I've driven. It's the lightest car I've driven at around 3200 pounds (still no featherweight by a longshot, but 300 less than the 135 and 400 less than the G37 and 800 less than the Audi RS4 or S5), and you can feel the difference.
There are some ridiculous things about it though. The seats suck, they're not very adjustable and they're just uncomfortable and cheap. If you lean your head back on the headrest, you can feel the lower inside bars of the headrest get levered forward and poke you in the back. The seats have sporty side flanges to pinch you, but they're not adjustable and are apparently made for very narrow people (the BMW's for example have power adjustable seat pinchers that lock you in very nicely).
The visibility is just absurd. Like dangerous. In fact I seriously think it should be illegal to make cars with such bad visibility. There's not really so much of a "blind spot" as you just can't look behind you *at all*. You have to use your side mirrors. The rear window is tiny, and the extreme angle of it means that you get bigtime Fresnel effect so the glass is more reflective, and you actually can see the contents of your own trunk better than the cars behind you. Parallel parking in it would have to be done by closing your eyes and using the force. Even ignoring the rearward visibility problems, the front and side visibility isn't awesome either. Those windows are small too, and the hood and doors feel very high.
This is a common trend on lots of modern cars. The proper proportion for a car (says me) is about 50/50 body panel height to cabin glass height (more like 55/45 is ideal). Lots of modern cars are more like 65:35 or even 75:25. The hoods and doors are too high, you can't see the ground around you; it's claustrophobic and shitty. I think part of the reason is safety vs. the fucking big trucks and SUVs with their high bumpers, but I also think some people like the styling of it.
The 370Z also has lots of shitty plastic fit & finish. Whatever, I don't really care about that. The ergonomics of the console is actually better than BMW or Audi. The buttons are pretty simple and right at your hand and easy to use and well labelled. One thing that pisses me off in many new cars is fucking (+) and (-) buttons for things like air vent speed or stereo volume. That's fucking retarded. They need to be dials or wheels or knobs. I've ranted about this before, but look people - you want a few things from a control like that - you want the full range to be displayed physically, you want it to be easy to go instantly to any position, you want it to be friendly to muscle memory so that you can do it with your eyes away from it, and you need to be able to read the setting from the control. All of that is satisfied by the well known device of the knob (and I mean an absolute knob with the range and current value marked, not a digital knob that just spins forever in both directions). Stop fucking trying to reinvent the knob, you fail. Anyway, as cheap as the 370Z interior is, it actually has knobs FTW.
(while I'm ranting about knobs - fucking push button toggles are awful too, especially when the only indicator of what state it's in is from a dim LED that you can't tell if it's lit or not on a sunny day; switches should be actual physical switches, for all the same very obvious reasons - you can tell the state it's in by touching it with your eyes closed, and you get physical feedback so you know when you've switched it, and aside from all that it just feels cool. It's fucking awful that shit like the "M button" or the traction control buttons are little digital push button toggles; they should be big giant lever switches like something Nikola Tesla would switch so that when you are ready to tear it up you say "engage!" and flip a big fucking switch). Anyhoo, back on track...
It has a little more headroom than the G37, but less than the 135. I can just barely sit all the way up in the 370Z but it's certainly not comfortable to just sit in. The interior and trunk space is in general just ridiculously small, like you would have trouble getting a big load of groceries in it, and if you ever buy even the smallest piece of furniture you have to have it delivered.
So, I'm a little torn. It's the first car I've driven that got me hot and bothered when I floored it around a corner, but the bad seats and shit visibility are pretty big problems. It is $10k cheaper than a G37 or 135 so I guess that goes in the equation somewhere. Maybe I could spend that $10k getting the body panels replaced with a Jaguar E-Type body.
In general I'm disappointed with all these new cars. The engines in all of them are phenomenal, but all the other bits are stupidly messed up. In many ways they are all inferior to my old Prelude, which has delightful big windows and very simple functional ergonomically friendly dash controls, and fantastic raw steering feel, and everything is delightfully manual and non-computer-involved and simple. After every one of these fancy car test drives, I've been pleasantly happy to get back in the Lude. Granted, after mashing the hungry throttle of a modern engine the Lude feels incredibly slow and impotent, but at least I have a nice view and no fucking beeping while I take 10 seconds to accelerate to 60.
I suppose I shouldn't be surprised. In my dreams people would just take their product and only change it in ways that are definite improvements, and leave alone the things that work just fine, but of course they don't do that. It all makes me think of MS Office or Visual Studio or something. Yes, the newer versions are fundamentally much better at their core, they have better engines and some features that I really want, but they also completely fucking change the user interface and move the menus around and rename things for no god damn reason, and add all kinds of bells and whistles that I don't want and just kludge it all up. Modern cars remind me of that.
A 370Z :
Psyche! That's what it should look like.
This is what it does look like
Still no internet at home and Comcast has remotely rebooted my modem three more times. They've used up all the minutes on my pay-go phone so I also have no phone. Now I have to wait around at home for a technician to come again.
I'm going from being on hold with Comcast and navigating the infuriating automated menu (and my usual trick of just slamming on buttons to get to an operator doesn't work in their system), to sitting tense at home as upstairs neighbor stomps around on my head, to disputing health care bills or going to another fucking doctor or PT appointment, to commuting in the goddamn fucking traffic. I want to scream and/or punch someone. I am a ball of sadness and rage.
I've been a total dick to the people I talk to at Comcast; I don't mean to be, but the fucking automated menu system is so fucking awful, I'm in a complete rage by the time I get through to someone. Some of the amusing things it's done :
Navigate through the menus to the choice for "problems with internet". I finally get there and it just plays a recording "for customer support please dial XXXX" and hangs up. Of course that number was the one I had dialed. Yum.
When you actually get through to the right place for problems it plays a recording "most problems can be solved by rebooting your modem, we will do so now, then play music for 30 seconds while it powers up" WHAT !? NO !? STOP!! And you can't hit any buttons to escape out of that - and in fact if you mash on buttons trying to get past it, one of the buttons apparently is "please reboot me again" and it starts over that cycle.
I remember back in SLO when I had a similar kind of problem with my cable modem, it took like a week of talking to frustrating morons before I finally got escalated to a "stage 2 technician" (or was it a third stage guild navigator?) and then it was fixed in one day. In SLO it turns out the problem was some backbone IP conflict that was spoofing my cable modem's public address (or something, I don't really know shit about the net). The point is that it had nothing to do with the wires and wasn't anything anybody needed to come to my house for. All it took was an actual computer guy to look into the cable-side networking problem.
(ADDENDUM : someone from that "comcast cares" email address mailed me back within about 6 hours. I have a technician coming out today theoretically so we'll see what they say; of course the connection has been fine so far today. Everyone I've actually gotten to talk to at Comcast has been pretty nice and reasonable, it's just so aggravating how hard it is to get through the system to talk to someone).
On the phone with HealthNet you get 60 seconds of "you can also view your claims and benefits online" FUCK YOU I KNOW THAT. Fortunately with HealthNet the system of frantically mashing buttons does get you through to a human.
HealthNet has been really horrible, I recommend against them as I recommend against Comcast. They have a policy of approving providers for PPO on an individual basis, which means when you go into an office which is "in network" you have to check every single individual person. Somehow over and over I keep getting treated by the one person in an office who's not "preferred". I'm not sure if the fucking providers are doing this on purpose (MTI and Olympic PT, you fuckers), or if it's just bad luck. It is a financial boon for the providers to fuck you in this way because it removes the contractual limit on charges. Early on I made the mistake of trusting that when I went to an office and they got my health insurance info in advance and told me it was fine and I would be covered that it meant they would give me to a provider that was preferred. LOL.
I'm also frustrated and annoyed with my lack of progress on Oodle. At times I feel like I'm writing some really good code, and lots of it, but when I step back and look at what I've got done in the last 8 months, I'm not happy with where I am. I'm literally doing nothing but working, I basically have no friends (that I see), no hobbies, I never go out or do anything but work and take care of fucking errands and todos and desperately try to sleep. And yet I feel like I'm not working enough. Part of the problem is definitely all the fucking PT which is such a huge time sink and distraction. I'm tempted to just say fuck it and give up on my body because it's a frustrating annoyance, and the stress of it all is half undoing any progress I make.
In the voice of Mark from Peep Show : "that's what I need, to sink my teeth into a double helping of work".
My internet has been out at home for four days now. Arg Comcast. I've spent probably an hour on the phone with them, partly because they refuse to give me a direct number to tech support. They're so evasive about it. I ask for the number for tech support and they say "okay" and then give me the number for general customer support. I say "umm no, that's the main number for customer service, I want the number for tech support" and they say "that is the number for tech support" ; one time I said "are you refusing to give me the number to tech support" and the guy was like "no, I have given you the number for tech support". Yeesh.
This Cable Modem Troubleshooting Tips is pretty good. With a Surfboard you can go to http://192.168.100.1/ and see your own status and event logs. Of course good luck getting to talk to a tech support person who understands any of that. The new thing at Comcast is apparently rebooting your modem. They just love rebooting your modem (they can do it by remote control). Any time you call up, it's "I'm going to reboot your modem, please wait" which takes two minutes and does absolutely nothing. I try to explain that I've done that myself many times to no avail.
Last Friday I was stuck in the elevator at RAD for about an hour. I got in with Mike and pressed the button and it took us up as usual, and then just stopped. None of the buttons did anything, it refused to budge. Turns out it stopped just a few inches away from the floor, and because it wasn't lined up it refused to open the doors. We waited about an hour for the technicians to show up and pry the doors open. One of the weirdest things about the whole experience to me was that when the technicians showed up they didn't say "hello" or anything at all, there was no "hello, is everyone okay in there?" or "we're going to pry the door open now" - nothing. They just started banging on the door, then one of them climbed through the roof and jumped onto the top of the elevator with a huge BANG. I was like WTF, warn us that you're about to jump onto the fucking top of the elevator. It was weird.
Game Angst has a pretty good article about the problems of deferred shading that nobody talks about. I'm a fan of deferred shading in theory, but he has a good point. There are other problems - for example it makes the pipeline for alpha vs. non-alpha completely different (presumably you would render alpha stuff onto your scene using forward shading after the deferred shading is done). Also, it basically means you have to use the same shader on everything, or that you can't use many different types of shaders that take different parameters. Like if you wanted your main characters to be rendered with very nice spherical harmonic lighting and the rest of your stuff to be lit with N*L , you can't really do that with deferred shading. (obviously the uniformity and unification is one of the appealing things about deferred shading). Another one that popped into my head while reading that is the hacky semi-Lambertian falloff. Instead of just doing Clamp[ N*L ] , it's nice to do Clamp[ (N*L + C)/(1 + C) ] , which serves to let the light go "around the edge" a little. By giving the artists control of C per object you get a very cheap way to improve lighting, but with deferred shading you'd have to add an extra attribute channel which is absurd.
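For reference, here is a minimal sketch of the two falloffs mentioned above, written in plain Python rather than shader code. It assumes N and L are already unit-length vectors and that N*L means the dot product; the function names (`lambert`, `wrapped_lambert`) are made up for illustration.

```python
def clamp01(x):
    """Clamp a scalar to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def lambert(n, l):
    """Standard falloff: Clamp[ N*L ]. Assumes n and l are unit length."""
    return clamp01(dot(n, l))


def wrapped_lambert(n, l, c):
    """Semi-Lambertian falloff: Clamp[ (N*L + C) / (1 + C) ].

    c is the per-object artist control (hypothetical name). c = 0 gives
    plain Lambert; larger c lets light reach past the 90 degree edge.
    """
    return clamp01((dot(n, l) + c) / (1.0 + c))


if __name__ == "__main__":
    normal = (0.0, 0.0, 1.0)
    grazing_light = (1.0, 0.0, 0.0)  # light exactly at the edge, N*L = 0
    print(lambert(normal, grazing_light))
    print(wrapped_lambert(normal, grazing_light, 0.5))
```

With plain Lambert the grazing light contributes nothing; with C = 0.5 it still contributes about a third, which is the "around the edge" effect. In a forward renderer C is just a shader constant per object, which is why needing a whole attribute channel for it in a deferred setup feels so wasteful.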
I keep looking at apartments. I had to pay July rent, so now I have a lot of time, so I thought I'd go ahead and try the "Secretary Solution". See the Bruss article "The art of a right decision" . The basic idea is to look at places for a while and not take any of them. Then when you hit a certain preset time, you switch modes and then take the first place which is the best seen so far. As I noted before, this maximizes the chance of getting the best place, but it doesn't maximize average real EV and when it fails it can be arbitrarily bad. Anyway, I plan to look without taking for another week or so, and then switch modes.
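A rough sketch of that look-then-leap rule, for the curious. The assumptions here are mine, not from the Bruss article: each place gets a single numeric score, places show up in random order, the look phase is the classic 1/e fraction of the total, and if nothing ever beats the look phase you are stuck with the last place you see. The names `secretary_pick` and `simulate` are made up.

```python
import math
import random


def secretary_pick(scores, look_fraction=1 / math.e):
    """Look at the first chunk without taking any, then take the first
    place that beats everything seen during the look phase."""
    k = int(len(scores) * look_fraction)
    best_seen = max(scores[:k]) if k > 0 else float("-inf")
    for s in scores[k:]:
        if s > best_seen:
            return s
    # Failure case: the best place was in the look phase, so we are
    # forced to take whatever comes last, however bad it is.
    return scores[-1]


def simulate(n=20, trials=5000, seed=0):
    """Return (fraction of trials that got the best place, average score)."""
    rng = random.Random(seed)
    hits = 0
    total = 0.0
    for _ in range(trials):
        scores = [rng.random() for _ in range(n)]
        pick = secretary_pick(scores)
        hits += pick == max(scores)
        total += pick
    return hits / trials, total / trials


if __name__ == "__main__":
    p_best, avg = simulate()
    print(f"picked the best place {p_best:.0%} of the time, average score {avg:.2f}")
```

The fallback branch is the "arbitrarily bad" failure: the rule is tuned to hit the single best place as often as possible, not to keep the average outcome high.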
I looked at this Grand Apartment in 1903 Mansion . It's fucking amazing. The kitchen and the floor plan are perfect - it's almost exactly what I would design myself, which is saying a lot. The kitchen is huge and open, but not too open - it's got counters and islands separating it from the living room (I hate places where the kitchen directly spills into the living room with no barrier). The big problem with this place is the location. It's on a really shitty street on the west side of Broadway (the east side of Broadway is the good part). The street has a bunch of halfway houses for men on parole; those are actually the good neighbors - the halfway houses have curfews and the guys are very quiet because they have rules and they don't want to get in trouble. The bad neighbors are all the other shitty apartments full of broke punk kids. Anyway, it's another place like the Duplex that kind of presents a quandary for me - it's fucking gorgeous inside this apartment, but not so hot outside, I dunno how to weight that.
There are also tons of houses coming for rent. So far none of them is quite ideal, but it's encouraging to see a bunch of them on the market. Most of the best places are still trying to sell, the neighborhood is just blanketed in "for sale" signs. Upstairs neighbor stimpy dickhead has been waking me up at 7 AM recently. I desperately need to get the fuck out of here. I wouldn't mind waking up early except that it puts my commute right in the middle of rush hour, which I usually try to avoid.
In other apartment news : padmapper is a much better version of the "craigslist on google maps" than the original "housingmaps.com". On padmapper you can hit the "+" to expand the filters and get much nicer control. PadMapper desperately needs to be able to save searches though, it's semi-useless without that.
Mercer Island around Mercer Way is a lovely place to ride. Getting there is miserable. It would be kinda sweet to live there, so I could hop right out my door and ride the loop. Of course I only like riding up here maybe 4 months out of the year since I don't do cold or wet. And if I really want nice riding out of my door I should move back to SLO where the riding is sweet and the weather is fair every day. Anyway, while I was riding I saw a cyclist getting carried to an ambulance on a stretcher. It was a scary reminder of the extreme danger of this hobby. On the plus side, I'm over my fear of speed that I've had since my last bad crash. I can now open up in fast descents and my body doesn't tighten up. I am still consciously choosing to be more careful and go slower, but it's a logical choice, not a panic reaction, so I'm happy about that. Two of my car crashes and one of my bike crashes were caused just by bad road conditions; I'm much more aware now of how likely it is to go around a corner and find a huge patch of oil or sand or something you can't control or avoid that's going to make you crash. People who speed around corners that they've never been around before are just retarded, it would be fine if it was only dangerous to yourself, but you never know when someone else is going to be there around that corner too. I consciously try to only speed around corners if I've been there before recently so I know the conditions. It would be really sweet to be able to ride on closed roads that are all smoothly paved and free of debris, but I guess you have to be a pro to get that. If I was a billionaire I'd spend my money on shit like that. You could just rent the entire route you want to ride, country lanes in Vermont, or mountain roads in the Rockies, send a street sweeper ahead of you to make sure it's all clean, have any nasty bits repaved, and then have a beautiful ride. What would it cost, maybe $100k ? 
If you spend $100k every day for the rest of your life, it only reaches around $1 billion. No problem for a Gates or Buffett.
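A quick sanity check on that arithmetic, assuming a flat $100k per day with no inflation and no investment returns (both simplifications are mine):

```python
# Back-of-envelope check: how long does $1 billion last at $100k per day?
daily_cost = 100_000          # dollars per ride day (the guess above)
budget = 1_000_000_000        # dollars

days = budget / daily_cost    # number of days the budget covers
years = days / 365.25         # converted to years

if __name__ == "__main__":
    print(f"{days:.0f} days, or about {years:.1f} years of riding every single day")
```

So a billion covers a bit over 27 years of daily rides, which is in the ballpark of "the rest of your life" for someone starting in middle age.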
Seattle's summer days are quite wonderful. It's a shame to waste even a single one, since you only get maybe 20 total. You need to run around in the sun, have nothing in particular to do except enjoy the moments, sit on a patio and have drinks, play, breathe.
I need to have a child so that I have someone to play with. I just want to go to the park and play tag and catch. I guess I could get a dog, but dogs are gross. I've always wanted a rent-a-dog service; I'd love to have a dog at the park, I just don't want one getting hair and slobber all over my house. It's kind of horrible to have a kid just to get someone to play with, but I suppose it's not as bad as the reasons why most people have kids - to "carry on the family name" (WTF is that) , to show the world how awesome you are by creating a child that's very successful (look at my son the doctor, aren't I such a great parent), or to make a little you that can do all the things you wish you'd done and have the breaks you didn't get so you can live vicariously through them.
The Vintage Seattle blog is super awesome. If you like old photos of cities (as I do) it's a treasure trove; go back through the old posts. It always blows my mind just how empty this country was 100 years ago ( for example ). The picture of the great white fleet is stunning.
Sean has put up his game Succor . It's an extremely interesting & clever concept; I won't spoil it, go download and play.
It occurred to me the other day that since I use an HTPC to watch TV now, I could plug in a gamepad controller and play some little games on my TV (like Mutant Storm). LOL yeah right. I don't want to crash or reboot my fucking TV machine, which playing games would certainly cause.
I test-drove a G37 again a few days ago. It's a good car, I don't blame anyone who gets one, but I won't. It's a tiny bit too small, I hit my head on the roof. Supposedly by the official "front headroom" measurement it's bigger than the 135, but the reality is not so; I have plenty of room in the 135 but not in G37. The other problem is it just feels impotent at low revs, and it's too big and heavy. They brag about the "smoothness" of the acceleration; I don't want to ride a wave of acceleration, I want to be punched in the gut. The car should be gentle when I'm soft on the pedal, but it should jerk me around mercilessly when I thrash it.
Anyway, it was a funny test drive. The salesman guy was a total stereotypical douchebag car salesman; he had a baggy suit on and told me the trunk fit his golf clubs just fine. He told me he has a G37 himself and got it lowered 1.5" so that he can't get over any bumps. I was like "ah, sweet", playing along trying to encourage the douchiness. He told me it looks great with bigger rims, "put some 20's on it". Yeah. So he showed me the bluetooth functionality, which granted is very good - much better than BMW - and he mentioned you can easily turn it off if you're in the car with someone and you don't want your calls to show up on the screen. I was like "oh yeah, if I'm riding with my girlfriend and my mistress calls I don't want that showing up" and he immediately said "yeah, I always turn it off, I told my girlfriend the bluetooth doesn't work in my car". OMG super lol. Nice job super douchebag car salesman, you win life.
The advantage of the G37 over the 135 is that it is more direct and mechanical; it has a real LSD, the steering feels much more connected and responsive to me, the pedal to acceleration response feels more analog and mechanical. That's all good. Also, the computer in the Infinitis is totally superior. I've seen BMWs and Audis and Mercs and in all of them the computer is the fucking suck, I would rather not have it, it's a total mess to use. The Infiniti is a touch screen for the mother fucking win. Plus the navigation is very simple and intuitive, it's got arrow keys and "enter" on the steering wheel so that you can do many things without looking, and it has voice activation and it actually works (as in, I tried it on a car I've never used before and it recognized my selections correctly on the first try with no problems). I was extremely impressed, it's by far the best car computer I've ever seen. I still would probably rather not have it, but if you really like car computers, it's the one.
God damn I keep getting screwed by providers and the insurance company. Here are some tips :
1. Do your own research, find the specific doctor you want. This can be hard to do, but you should be able to find the local specialist in your problem. I recommend someone young and up & coming because they will actually care and be up on modern techniques. The crucial thing here is the difference between a good doctor and any old doctor is like night and day. Don't be afraid to just leave a doctor after one visit if you don't think you're getting the attention you need.
Specifically - if you have an injury where you get an MRI, you should expect a doctor to actually LOOK at the MRI. The first two doctors I went to literally didn't ever look at it. They order MRI's and make a ton of money off you and then just read the examiner's report. The MRI examiner is not an MD and not qualified to be making judgements on your condition. Furthermore, if you have something like a sports injury, if the doctor does not actually look at your body, eg have you take your clothes off and move around and show him the dysfunction - you should walk right out of there.
2. Do your own research on how you will be billed and who will treat you. This is tricky and you cannot trust ANYONE on this matter. You will be told that the person who will treat you is "covered" by your insurance. That doesn't mean anything. They may be outside your "network" which will cause them to bill you at a very high rate. Assuming you are on an HMO/PPO plan of some kind you must find specific providers who are "preferred" or "in network". You must do your own research on this because no one else will.
Pursuant to that - it's not enough to know that a certain office is "in network" for you. You must call ahead and get the name of the exact person who will be treating you. Twice I've gone to PT offices that I was told were "in network" for me, and then the exact person I was assigned to was out of network. This can be rather tricky - when you go in to a doctor who's in network for you, he might send you into a room to get xrays, if the xray tech is out of network, boom you're fucked. The classic case where people get bit by this is the anaesthesiologist. Often when people get surgery they find a surprise bill for $10,000 from the anaesthesiologist who's out of network for them even though the doctor/hospital/everything else is preferred.
You really have to be a huge asshole about this, you need to call ahead and get the names of everyone who is going to treat you and check on them. When you go in to the office, you need to be firm, any time someone walks in a room to treat you, you have to say "who is this" and if they're not a name you've checked you have to say no. I know this is ridiculous and often not possible in practice.
3. When you get bills, go over everything with a fine toothed comb. Providers will bill you directly for the balance not covered by insurance. Quite often they do this wrong/illegally either on purpose or by accident. Of course the health insurance often fails to reimburse correctly too, so check on that as well. If the doctor takes your health insurance, then they are not allowed to bill more than the negotiated rates; this amount is normally marked "not allowed" on your explanation of benefits. Often the doctors will go ahead and bill this to you one way or another; this is called "balance billing" and it's specifically illegal. It can be tricky to spot sometimes because they mix it in with allowed billing, so you have to do the math and see that all the billing amounts add up right.
When you get bills that don't look right to you, you have to call your health insurance and your provider. If it's a question of "balance billing" you just confirm it with your health insurance, and then tell your provider to fuck off. The other common problem is something is getting billed at a higher rate or not covered when you think it should be. I've had more luck calling the provider about this kind of thing, the health insurance will just tell you tough luck. The provider can be a little funny about how exactly they bill things, so if they change the billing code and resubmit they may get more coverage; they are usually happy to do this, you may need to talk to the head of the billing department or whatever.
Anyhoo, if you have some kind of Orthopedic problem, I highly recommend Dr. Chris Wahl. I haven't actually had surgery from him, so I can't comment on his surgical skills (anecdotal reports on surgical outcomes are pretty worthless anyway, we need fucking public statistical analysis of doctors' outcomes, but of course the corrupt greedy bastards at the AMA would never allow that). He's the first doctor I've ever had that actually looked at my MRIs. He's also the first doctor who even gave me a proper physical visual exam, as in I took my shirt off and he immediately said "hmm your right shoulder is dropped, looks like you had a type 2 separation". Yes! yes I did, and both my previous doctors failed to diagnose it.
This apartment searching is really annoying me. I can't handle having "many balls in the air" ; when I put something on my todo list, I like to work at it until it's gone. God I fucking hate shit on my todo list (the fucking health care keeps reinserting itself on my todo list and it's pissing me off; they got me again today with some billing fuckup, but I digress...).
Anyway, it's reminding me of a concept I often think about. I'll call it "the redrawer's dilemma" but there must be a better/standard name for this.
The hypothetical game goes something like this :
You are given a bag with 100 numbers in it. You know the numbers are in [0,1000] but don't know how many of each number there are in the bag. You start by drawing a random number from the bag.
At each turn of play, you can either keep your current number (in which case that is your final score), or you can put your current number back in the bag and draw again, but each redraw costs you 1 point, which is subtracted from your final score.
How do you play this game optimally?
There are two things that are interesting to me about this game in real life. One is that humans almost always play it incredibly badly, and the second is that when you finally decide to stop redrawing you're almost always unhappy about it (unless you got super lucky and drew a 900+ number).
The two classic human player errors in this game are the "I just started drawing, I shouldn't stand yet" and the "I can't stop now, I already passed on something better than this". The "I just started drawing, I shouldn't stand yet" guy draws something like an 800 on one of his early draws. He thinks dang that's really good, but maybe this bag just has lots of high numbers in it, I just started drawing, I should put some time into it. Now of course that reasoning is based in correct logic - if you have reason to believe that your chance of drawing higher is good enough to merit the cost of continued looking, then yes, do so, but just drawing more because "it's early" makes no sense - the game is totally non temporal, the cost of continuing drawing doesn't go up over time. This often leads into the "I can't stop now, I already passed on something better than this" guy, who's mainly motivated by pride and shame - he doesn't want to admit to himself that he made a big mistake passing early when he got a high number, so he has to keep drawing until he gets something better. He might draw an 800, then a whole mess of single digit numbers and he's thinking "oh fuck I blew it" and then he draws a 400. At that point he should stand and quit redrawing, but he can't, so he draws again.
The thing is, even if he played correctly and just took the 400 after passing on the 800, he would be really unhappy about it. And if the early termination guy played correctly and just got an early 800 and didn't draw, he would be unhappy too, because he'd always be wondering if he could've done better.
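For what it's worth, if you did know what was in the bag (the game as stated says you don't, so this is purely hypothetical), the textbook answer is a threshold rule: redraw only while the expected gain from one more draw beats the cost of that draw. A minimal sketch; `expected_gain` and `should_redraw` are names I made up for illustration :

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Expected improvement from one more draw while holding `current` :
// the mean over the bag of max(v - current, 0).
// ASSUMPTION : the bag contents are known, which the real game does not give you.
double expected_gain(const std::vector<int>& bag, int current)
{
    assert(!bag.empty());
    double sum = 0.0;
    for (int v : bag)
        sum += std::max(v - current, 0);
    return sum / bag.size();
}

// Redraw only while one more draw is expected to gain more than it costs.
bool should_redraw(const std::vector<int>& bag, int current, double cost = 1.0)
{
    return expected_gain(bag, current) > cost;
}
```

Note the rule only looks at the number in your hand and the bag. It never looks at how long you've been drawing or at what you already passed on, which is exactly what the two guys above get wrong.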
The other game theory / logical fallacy that plagues me in these kinds of things is "I'm already spending X, I may as well spend a bit more than X". First I was looking for places around $1500, then I bumped it to $1700, then $1900. Now I'm looking at places for $2500 cuz fuck it they're nicer and I was looking at places for $2000 so it's only $500 more.
In other news, hotpads is actually a pretty cool apartment search site. It seems they are just scraping craigslist and maybe some other classifieds sites, so it's not like they have anything new, but the map interface and search features and such are solid. One thing is really annoying me about it though - the wheel zooming in the map is totally broken, I keep trying to wheel zoom and it sends the map off to never never land. Urg!
In more random news, I've really enjoyed the "Wallander" series on PBS ; the stories are pretty retarded/ridiculous, but I like the muddled contemplative pace of it, and the washed out monochrome color palette.
So in an earlier post I wrote about approximation of log2 and Ryg commented with links to Robin Green's great GDC 2003 talk : part1 (pdf) and part2 (pdf) ( main page here ).
It's mostly solid, but in part 2 around page 40 he talks about "fastexp" and "bitlog" and my spidey senses got tingling. Either I don't understand, or he was just smoking crack through that section.
Let's look at "bitlog" first. Robin writes it very strangely. He writes :
A Mathematical Oddity: Bitlog
A real mathematical oddity
The integer log2 of a 16 bit integer
Given an N-bit value, locate the leftmost nonzero bit.
b = the bitwise position of this bit, where 0 = LSB.
n = the NEXT three bits (ignoring the highest 1)
bitlog(x) = 8x(b-1) + n
Bitlog is exactly 8 times larger than log2(x)-1
Bitlog Example
For example take the number 88
88 = 1011000
b = 6th bit
n = 011 = 3
bitlog(88) = 8*(6-1)+3
= 43
(43/8)+1 = 6.375
Log2(88) = 6.4594
This relationship holds down to bitlog(8)
Okay, I just don't follow. He says it's "exact" but then shows an example where it's not exact.
He also subtracts off 1 and then just adds it back on again. Why would you do this :
bitlog(x) = 8x(b-1) + n
Bitlog is exactly 8 times larger than log2(x)-1
When you could just say :
bitlog(x) = 8xb + n
Bitlog is exactly 8 times larger than log2(x)
??? Weird.
Furthermore this seems neither "exact" nor an "oddity". Obviously the position of the MSB is the integer part of the log2 of a number. As for the fractional part of the log2, this is not a particularly good way to get it. Basically what's happening here is he takes the next 3 bits and uses them for linear interpolation to the next integer.
Written out verbosely :
x = int to get log2 of
b = the bitwise position of top bit, where 0 = LSB.
x >= (1 << b) && x < (2 << b)
fractional part :
f = (x - (1 << b)) / (1 << b)
f >= 0 && f < 1
x = 2^b * (1 + f)
correct log2(x) = b + log2(1+f)
approximate with b + f
note that "f" and "log2(1+f)" both go from 0 to 1, so it's exact at the endpoints but wrong in the middle
So far as I can tell, Robin's method is actually like this :
uint32 bitlog_x8(uint32 val)
{
if ( val <= 8 )
{
static const uint32 c_table[9] = { (uint32)-1 , 0, 8, 13, 16, 19, 21, 22, 24 };
return c_table[val];
}
else
{
unsigned long index;
_BitScanReverse(&index,(unsigned long)val);
ASSERT( index >= 3 );
uint32 bottom = (val >> (index - 3)) & 0x7;
uint32 blog = (index << 3) | bottom;
return blog;
}
}
where I've removed the weird offsets of 1 and this just returns log2 times 8. You need the check for val <= 8 because shifting by negative amounts
is fucked.
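If you want to sanity check this against Robin's example without the MSVC intrinsic, here's a portable sketch of the val >= 8 path. `highest_bit` and `bitlog_x8_portable` are hypothetical names, and the loop is just a slow stand-in for `_BitScanReverse` :

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-in for _BitScanReverse : index of the highest set bit (0 = LSB).
static unsigned highest_bit(uint32_t val)
{
    assert(val != 0);
    unsigned index = 0;
    while (val >>= 1)
        ++index;
    return index;
}

// log2(val) * 8 from the MSB position plus the next three bits ; only valid for val >= 8
static uint32_t bitlog_x8_portable(uint32_t val)
{
    assert(val >= 8);
    unsigned index = highest_bit(val);
    uint32_t bottom = (val >> (index - 3)) & 0x7;
    return (index << 3) | bottom;
}
```

For 88 this gives 51, and 51/8 = 6.375, the same number as Robin's (43/8)+1 but without the detour through the offset of 1.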
But you might well ask - why only use 3 bits ? And in fact you're right, I see no reason to use only 3 bits. In fact we can do a fixed point up to 27 bits : (we need to save 5 bits at the top to store the max possible integer part of the log2)
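float bitlogf(uint32 val)
{
// val must be nonzero ; _BitScanReverse leaves index undefined for 0
unsigned long index;
_BitScanReverse(&index,(unsigned long)val);
// put the MSB at bit 27 ; when index > 27 we have to shift right instead,
// because shifting by a negative amount is undefined
uint32 shifted = ( index <= 27 ) ? ( val << (27 - index) ) : ( val >> (index - 27) );
uint32 vv = shifted + ((index-1) << 27);
return vv * (1.f/134217728); // 134217728 = 2^27
}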
what we've done here is find the pos of the MSB, shift val up so the MSB is at bit 27, then we add the index of the MSB (we subtract one
because the MSB itself starts the counting at one in the 27th bit pos). This makes a fixed point value with 27 bits of fractional part,
the bits below the MSB act as the fractional bits. We scale to return a float, but you could of course do this with any # of fixed
point bits and return a fixed point int.
But of course this is exactly the same kind of thing done in an int-to-float so we could use that too :
float bitlogf2(float fval)
{
FloatAnd32 fi;
fi.f = fval;
float vv = (float) (fi.i - (127 << 23));
return vv * (1.f/8388608); // 8388608 = 2^23
}
which is a lot like what I wrote about before. The int-to-float does the exact same thing we did manually above, finding the MSB and making
the log2 and fractional part.
One note - all of these versions are exact for the true powers of 2, and they err consistently low for all other values. If you want to minimize the maximum error, you should bias them.
The maximum error of ( log2( 1 + f) - f ) occurs at f = ( 1/ln(2) - 1 ) = 0.442695 ; that error is 0.08607132 , so the correct bias is half that error : 0.04303566
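To see the bias do its job, here's a small sketch that sweeps [1,2) and measures the worst absolute error with and without it. It's the same idea as bitlogf2 above, except I use memcpy in place of the FloatAnd32 union, and `bitlog_float` / `bitlog_max_abs_error` are names invented for this check :

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Approximate log2 by reading the float bits as fixed point, plus an optional bias.
static float bitlog_float(float fval, float bias)
{
    assert(fval > 0.f);
    uint32_t bits;
    std::memcpy(&bits, &fval, sizeof bits);
    float vv = (float)((int32_t)bits - (127 << 23));
    return vv * (1.f / 8388608) + bias; // 8388608 = 2^23
}

// Largest |approx - log2| over a sweep of the mantissa range [1,2).
static float bitlog_max_abs_error(float bias)
{
    float worst = 0.f;
    for (int i = 0; i < 4096; ++i)
    {
        float x = 1.f + i / 4096.f;
        float err = std::fabs(bitlog_float(x, bias) - std::log2(x));
        if (err > worst)
            worst = err;
    }
    return worst;
}
```

With bias 0 the worst error comes out around 0.086; with bias 0.04303566 it drops to around 0.043, since the error is now spread evenly above and below instead of always being low.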
Backing up in Robin's talk we can now talk about "fastexp". "fastexp" is doing "e^x" by using the floating point format again, basically he's just sticking x into the exponent part to get the int-to-float to do the 2^x. To make it e^x instead of 2^x you just scale x by 1/ln(2) , and again we use the same trick as with bitlog : we can do exact integer powers of two, to get the values in between we use the fractional bits for linear interpolation. Robin's method seems sound, it is :
float fastexp(float x)
{
int i = ftoi( x * 8.f );
FloatAnd32 f;
f.i = i * 1512775 + (127 << 23) - 524288;
// 1512775 = (2^20)/ln(2)
// 524288 = 0.5*(2^20)
return f.f;
}
for 3 bits of fractional precision. (note that Robin says to bias with 0.7*(2^20) ; I don't know
where he got that; I get minimum relative error with 0.5).
Anyway, that's all fine, but once again we can ask - why just 3 bits? Why not use all the bits of x as fractional bits? And if we put the multiply by 1/ln(2) in the float math before we convert to ints, it would be more accurate.
What we get is :
float fastexp2(float x)
{
// 12102203.16156f = (2^23)/ln(2)
int i = ftoi( x * 12102203.16156f );
FloatAnd32 f;
f.i = i + (127 << 23) - 361007;
// 361007 = (0.08607133/2)*(2^23)
return f.f;
}
and indeed this is much much more accurate. (max_rel_err = 0.030280 instead of 0.153897 - about 5X better).
I guess Robin's fastexp is preferable if you already have your "x" in a fixed point format with very few fractional bits (3 bits in that particular case, but it's good for <= 8 bits). The new method is preferred if you have "x" in floating point or if "x" is in fixed point with a lot of fractional bits (>= 16).
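Here's a portable sketch of fastexp2 plus a sweep to measure the relative error against exp(). Assumptions : I use a truncating cast where the listing calls ftoi (I'm not claiming that's what ftoi does), memcpy where it uses FloatAnd32, and the range and step count of the sweep are arbitrary. `fastexp2_portable` and `fastexp2_max_rel_error` are made-up names :

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static float fastexp2_portable(float x)
{
    // 12102203.16156f = (2^23)/ln(2) ; the cast is a stand-in for ftoi
    int32_t i = (int32_t)(x * 12102203.16156f);
    int32_t bits = i + (127 << 23) - 361007;
    float result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
}

// Largest |approx - exp| / exp over `steps` evenly spaced samples in [lo,hi].
static float fastexp2_max_rel_error(float lo, float hi, int steps)
{
    assert(steps > 0 && lo < hi);
    float worst = 0.f;
    for (int s = 0; s <= steps; ++s)
    {
        float x = lo + (hi - lo) * s / steps;
        float exact = std::exp(x);
        float err = std::fabs(fastexp2_portable(x) - exact) / exact;
        if (err > worst)
            worst = err;
    }
    return worst;
}
```

Sweeping something like [-5,5] should land near the 0.030 max relative error quoted above.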
ADDENDUM :
I found the Google Book where bitlog apparently comes from; it's Math toolkit for real-time programming By Jack W. Crenshaw ; so far as I can tell this book is absolute garbage and that section is full of nonsense and crack smoking.
ADDENDUM 2 :
it's obvious that log2 is something like :
x = 2^I * (1+f) (I is an int, f is the mantissa)
log2(x) = I + log2(1+f)
log2(1+f) = f + f * (1-f) * C
We've been using log2(1+f) ~= f , but we know that's exact at the ends and wrong in the middle, so obviously we should add a term that humps in the middle. If we solve for C we get :
C = ( log2(1+x) - x ) / ( x*(1-x) )
Integrating on [0,1] gives C = 0.346573583
hence we can obviously do a better bitlog something like :
float bitlogf3(float fval)
{
FloatAnd32 fi;
fi.f = fval;
float vv = (float) (fi.i - (127<<23));
vv *= (1.f/8388608);
//float frac = vv - ftoi(vv);
fi.i = (fi.i & 0x7FFFFF) | (127<<23);
float frac = fi.f - 1.f;
const float C = 0.346573583f;
return vv + C * frac * (1.f - frac);
}
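A quick way to see how much the hump term buys you is to sweep [1,2) and compare against the real log2. This is a sketch of bitlogf3 with memcpy standing in for the FloatAnd32 union; `bitlogf3_portable` and `bitlogf3_max_abs_error` are names I invented for the check :

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static float bitlogf3_portable(float fval)
{
    assert(fval > 0.f);
    uint32_t bits;
    std::memcpy(&bits, &fval, sizeof bits);
    float vv = (float)((int32_t)bits - (127 << 23)) * (1.f / 8388608);
    // rebuild 1.mantissa as a float in [1,2) to get the fractional part
    uint32_t mant = (bits & 0x7FFFFF) | (127u << 23);
    float one_plus_frac;
    std::memcpy(&one_plus_frac, &mant, sizeof one_plus_frac);
    float frac = one_plus_frac - 1.f;
    const float C = 0.346573583f;
    return vv + C * frac * (1.f - frac);
}

// Largest |approx - log2| over a sweep of the mantissa range [1,2).
static float bitlogf3_max_abs_error()
{
    float worst = 0.f;
    for (int i = 0; i < 4096; ++i)
    {
        float x = 1.f + i / 4096.f;
        float err = std::fabs(bitlogf3_portable(x) - std::log2(x));
        if (err > worst)
            worst = err;
    }
    return worst;
}
```

The worst error should come out under 0.009, versus about 0.086 for the plain log2(1+f) ~= f version, and it's still exact at the powers of two.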
So I'm thinking about renting this place ; it's kind of ridiculous, it's a 2 BR which I don't need, it's $1950 which is an outrageous amount to pay for rent (and they're overcharging for the current market conditions). Sticking with the negative, it's a duplex, and the bottom half is inhabited by the owner. I'm somewhat worried about that, I don't like the idea of seeing the owner all the time, it makes me feel like I'm being watched and judged which is a very unpleasant feeling. Oh, and the driveway is shared and only wide enough for one car so you have to ask each other to move to get your cars in and out, that seems awful.
On the plus side, the kitchen is huge and full of light and has a real gas stove and fume hood. The other huge plus is that it has a big private deck on the roof. Those are my dreams, just to be able to cook and sit outside in the morning while I have my coffee. And it's a good location. The back yard private deck is so much better than even the balconies on big apartment buildings, because the balconies are all right next to each other.
Oh, the other drag is that the landlord is a handyman and does the repairs himself. That's such a huge fucking disadvantage. I've had that at three apartments now and it's been a huge disaster each time. They're always super slow, if you ask them to fix something it seems to be code for "please tear up my apartment and then leave your tools in it for a month and maybe stop by for an hour every other week". Then they act so smug and pleased with themselves when they actually fix something for you. You know, I'm sure a professional would've done a better job much faster, and it's just your duty to have things fixed, so stop acting like I should praise your amazing skills and thank you over and over. In reality *you* should be thanking *me* for letting you play your handyman dressup game on my time in my home.
I also looked at this place in the Trace Lofts. It's ridiculous, it makes me angry. For one thing, like so many of the new apartments, it's basically a shoebox with a hole cut in one end. It's long and skinny and only one skinny side has windows. It's also just a big open space with no rooms, no closets. Okay, fine, it's a "loft", it's all urban and trendy and cool. But the whole fucking point of those big industrial loft conversions is that they SUCK so they're really cheap, so broke artists can afford them, which is what makes them cool. This place is expensive as balls and it's targeted at yuppies who want to play like they're urban cool artists and pay out the nose for nice fixtures but still get all the suckitude of not actually having rooms or doors or closets. So dumb. (the penthouses on top of the Trace Lofts seem amazing, the one available is in the old building part). (hell the penthouse is only $2250 , well well worth the $250 for the upgrade) (if you want to go to outrageous rents like that this looks nice too).
If I don't get the ridiculous duplex I might just rent a house. You can get whole houses now for around $1600/mo that are just slightly out of the main area here. There are several in the 19th & Prospect area for around that, which is a bit of a long walk to the happening area but still walkable, and there's even one at 16th and Harrison which is not far at all. If you rent a house, no one lives above you.
Okay, in Part 1.5 I asked about the downsample that was the best inverse of bilinear upsampling. I have a solution that pleases me.
Sean reminded me that he tackled this before; I dunno if he has any notes about it on the public net, he can link them. His basic idea was to do a full solve for the entire down-sampled image. It's quite simple if you think about it. Consider the case of 2X up & down sampling. The bilinear filter upsample will make a high res image where each pixel is a simple linear combo of 4 low res. You take the L2 error :
E = Sum[all high res pixel] ( Original - Upsampled ) ^2
For Sean's full solution approach, you set Upsampled = Bilinear_Upsample( X) , and just solve this for X without any assumption of how X is made from Original. For an N-pixel low res image you have 4N error terms, so it's plenty dense (you could also artificially regularize it more by starting with a low res image that's equal to the box down-sample, and then solve for the deltas from that, and add an extra "Tikhonov" regularization term that presumes small deltas - this would fix any degenerate cases).
I didn't do that. Instead I assumed that I want a discrete local linear filter and solved for what it should be.
A discrete local linear filter is just a bunch of coefficients. It must be symmetric, and it must sum to 1.0 to be mean-preserving (flat source should reproduce flat exactly). Hence it has the form {C2,C1,C0,C0,C1,C2} with C0+C1+C2 = 1/2. (this example has two free coefficients). Obviously the 1-wide case must be {0.5,0.5} , then you have {C1,0.5-C1,0.5-C1,C1} etc. as many taps as you want. You apply it horizontally and then vertically. (in general you could consider asymmetric filters, but I assume H & V use the same coefficients).
A 1d application of the down-filter is like :
L_n = Sum[k] { C_k * [ H_(2*n-k) + H_(2*n+1+k) ] }
That is : Low pixel n = filter coefficients times High res samples centered at (2*n + 0.5) going out both directions.
Then the bilinear upsample is :
U_(2n) = (3/4) * L_n + (1/4) * L_(n-1)
U_(2n+1) = (3/4) * L_n + (1/4) * L_(n+1)
Again we just make a squared error term like the above :
E = Sum[n] ( H_n - U_n ) ^2
Substitute the form of L_n into U_n and expand so you just have a matrix equation in terms of H_n and C_k. Then do a solve for the C_k. You can do a least-squares solve here, or you can just directly solve it because there are generally few C's.
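To make the pieces concrete, here's a 1d sketch of the down-filter, the bilinear upsample, and the squared error term, written straight from the formulas above. Assumptions : edges are handled by clamping (I don't say above how edges are treated), the coefficients passed in are the half-filter {C0,C1,...} summing to 1/2, and all the function names are made up for illustration. It only evaluates E for given coefficients; it does not do the solve.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Clamp-to-edge fetch. ASSUMPTION : the edge handling is my choice for this sketch.
static double fetch(const std::vector<double>& v, int i)
{
    i = std::max(0, std::min(i, (int)v.size() - 1));
    return v[i];
}

// L_n = Sum[k] { C_k * [ H_(2*n-k) + H_(2*n+1+k) ] } , with Sum[k] C_k = 1/2
std::vector<double> down_filter(const std::vector<double>& high, const std::vector<double>& coef)
{
    assert(!high.empty() && high.size() % 2 == 0);
    std::vector<double> low(high.size() / 2);
    for (int n = 0; n < (int)low.size(); ++n)
    {
        double sum = 0.0;
        for (int k = 0; k < (int)coef.size(); ++k)
            sum += coef[k] * (fetch(high, 2*n - k) + fetch(high, 2*n + 1 + k));
        low[n] = sum;
    }
    return low;
}

// U_(2n) = (3/4) * L_n + (1/4) * L_(n-1) ; U_(2n+1) = (3/4) * L_n + (1/4) * L_(n+1)
std::vector<double> bilinear_up(const std::vector<double>& low)
{
    std::vector<double> up(low.size() * 2);
    for (int n = 0; n < (int)low.size(); ++n)
    {
        up[2*n]     = 0.75 * fetch(low, n) + 0.25 * fetch(low, n - 1);
        up[2*n + 1] = 0.75 * fetch(low, n) + 0.25 * fetch(low, n + 1);
    }
    return up;
}

// E = Sum[n] ( H_n - U_n ) ^2 for a given set of down-filter coefficients
double squared_error(const std::vector<double>& high, const std::vector<double>& coef)
{
    std::vector<double> up = bilinear_up(down_filter(high, coef));
    double e = 0.0;
    for (std::size_t n = 0; n < high.size(); ++n)
        e += (high[n] - up[n]) * (high[n] - up[n]);
    return e;
}
```

Passing {0.5} gives the pure box downsample. Any coefficient set that sums to 1/2 reproduces a flat signal exactly, which is the mean-preserving constraint in action.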
Here's how the error varies with number of free coefficients (zero free coefficients means a pure box downsample) :
r:\>bmputil mse lenag.256.bmp bilinear_down_up_0.bmp
rmse : 15.5437 psnr : 24.3339
r:\>bmputil mse lenag.256.bmp bilinear_down_up_1.bmp
rmse : 13.5138 psnr : 25.5494
r:\>bmputil mse lenag.256.bmp bilinear_down_up_2.bmp
rmse : 13.2124 psnr : 25.7454
r:\>bmputil mse lenag.256.bmp bilinear_down_up_3.bmp
rmse : 13.0839 psnr : 25.8302
you can see there's a big jump from 0 to 1 but then only gradually increasing quality after that (though it does keep getting better as it should).
Two or three free terms (which means a 6 or 8 tap filter) seems like the ideal width to me - wider than that and you're getting very nonlocal which means ringing and overfitting. Optimized on all my test images the best coefficients I get are :
// 8 taps :
static double c_downCoef[4] = { 1.31076, 0.02601875, -0.4001217, 0.06334295 };
// 6 taps :
static double c_downCoef[3] = { 1.25 , 0.125, - 0.375 };
(the 6-tap one was obviously so close to those perfect fractions that I just manually rounded it; I assume
that if I solved this analytically that's what I would get. The 8-tap one is not so obvious to me what it
would be).
Now, how do these static ones compare to doing the lsqr fit to make coefficients per image ? They're 99% of the benefit. For example :
// solve :
lena.512.bmp : doing solve exact on 3 x 524288
{ 1.342242526 , -0.028240414 , -0.456030369 , 0.142028257 } // rmse : 10.042138
// static fit :
lena.512.bmp : // rmse : 10.116388
------------
// static fit :
clegg.bmp : // rgb rmse : 50.168 , gray rmse : 40.506
// solve :
fitting : clegg.bmp : doing lsqr on 3 x 1432640 , c_lsqr_damping = 0.010000
{ 1.321164423 , 0.002458499 , -0.381711250 , 0.058088329 } // rgb rmse : 50.128 , gray rmse : 40.472
So it seems to me this is in fact a very simple and high quality way to down-sample to make the best
reproduction after bilinear upsampling.
I'm not even gonna touch the issue of the [0,255] range clamping or the fact that your low res image should actually be considered discrete, not continuous.
ADDENDUM : it just occurred to me that you might do the bilinear 2X upsampling using offset-taps instead of centered taps. That is, centered taps reconstruct like :
+---+     +-+-+
|   |     | | |
|   |  -> +-+-+
|   |     | | |
+---+     +-+-+
That is, the area of four high res pixels lies directly on one low res pixel. Offset taps do :
+---+      | |
|   |     -+-+-
|   |  ->  | |
|   |     -+-+-
+---+      | |
that is, the center of a low res pixel corresponds directly to a high res pixel.
With centered taps, the bilinear upsample weights in 1d are always (3/4,1/4) then (1/4,3/4) , (so in 2d they are 9/16, etc.)
With offset taps, the weights in 1d are (1) (1/2,1/2) (1) etc... that is, one pixel is just copied and the tweeners are averages.
Offset taps have the advantage that they aren't so severely variance decreasing. Offset taps should use a single-center down-filter of the form :
{C2,C1,C0,C1,C2}
(instead of {C2,C1,C0,C0,C1,C2} ).
My tests show single-center/offset up/down is usually slightly worse than symmetric/centered , and occasionally much better. On natural/smooth images (such as the entire Kodak set) it's slightly worse. Picking one at random :
symmetric :
kodim05.bmp : { 1.259980122 , 0.100375561 , -0.378468204 , 0.018112521 } // rmse : 25.526521
offset :
kodim05.bmp : { 0.693510045 , 0.605009745 , -0.214854612 , -0.083665178 } // rgb rmse : 26.034
that pattern holds for all. However, on weird images it can be better, for example :
symmetric :
c:\src\testproj>Release\TestProj.exe t:\test_images\color\bragzone\clegg.bmp f
{ 1.321164423 , 0.002458499 , -0.381711250 , 0.058088329 } // rgb rmse : 50.128 , gray rmse : 40.472
offset :
c:\src\testproj>Release\TestProj.exe t:\test_images\color\bragzone\clegg.bmp f
{ 0.705825115 , 0.561705835 , -0.267530949 } // rgb rmse : 45.185 , gray rmse : 36.300
so ideally you would choose the best of the two. If you're decompressing in a pixel shader you need another parameter for whether to offset
your sampling UV's by 0.5 of a pixel or not.
ADDENDUM : I got Humus working with a KLT color transform. You just do the matrix transform in the shader after fetching "YUV" (not really YUV any more). It helps on the bad cases, but still doesn't make it competitive. It's better just to go with DXT1 or DXT5-YCoCg in those cases. For example :
On a pure red & blue texture :
Humus YCoCg :
rmse : 11.4551 , psnr : 26.9848
ssim : 0.9529 , perc : 80.3841%
Humus KLT with forced Y = grey :
KLT : Singular values : 56.405628,92.022781,33.752548
KLT : 0.577350,0.577350,0.577350
KLT : -0.707352,0.000491,0.706861
KLT : 0.407823,-0.816496,0.408673
rmse : 11.4021 , psnr : 27.0251
ssim : 0.9508 , perc : 79.9545%
Humus KLT :
KLT : Singular values : 93.250313,63.979282,0.230347
KLT : -0.550579,0.078413,0.831092
KLT : -0.834783,-0.051675,-0.548149
KLT : -0.000035,-0.995581,0.093909
rmse : 5.6564 , psnr : 33.1140
ssim : 0.9796 , perc : 87.1232%
(note the near perfect zero in the last singular value, as it should be)
DXT1 :
rmse : 3.0974 , psnr : 38.3450
ssim : 0.9866 , perc : 89.5777%
DXT5-YCoCg :
rmse : 2.8367 , psnr : 39.1084
ssim : 0.9828 , perc : 88.1917%
So, obviously a big help, but not enough to be competitive. Humus also craps out pretty bad on some images that have single pixel checkerboard patterns. (again, any downsampling format, such as JPEG, will fail on the same cases). Not really worth it to mess with the KLT, better just to support one of the other formats as a fallback.
One thing I'm not sure about is just how bad the two texture fetches are these days.
"Modern" = shitty faux-modernist from the 70's with aluminum windows and beige carpet and particle board kitchen cabinets.
"Amenities" = a closet off the lobby has a stairmaster in it.
"Turn of the century Classic" / "vintage" / etc = run down, peeling paint, cracked windows, original wood-burning stove, etc.
"Heat maintained for all resident's comfort" = heat not run at all ever so the landlord can make more money.
"Vibrant community" = Noisy white trash neighbors hang out right outside your window.
"Act fast / rare opportunity / don't wait!" = this has been listed for 6 months and nobody wants it, please please be the sucker who takes it.
"Proximity to downtown / near businesses" = way the fuck out in some neighborhood you've never heard of.
"Great location" = more often than not I'm finding this means it's located directly on I-5 ; "West facing" means the same thing.
"View view view" = everything about this place is so shitty we want you to focus on what's outside.
1. Craigslist is fucking awful, but it's all I've got. I can't filter in any meaningful way, hell I can't even select for just 1 bedrooms or by neighborhood in Seattle. So I have to manually poke through the listings myself (of course the search is horribly broken), and I can't mark ads I've already seen before, and people keep relisting the same property over and over.
2. Walking around my neighborhood, I see *tons* of stuff for rent that's not on the internet. It seems like every building around here has a vacancy now as the market is crashing. What the fuck am I supposed to do with that? Oh, yay, you put a "for rent" sign outside the building. How many bedrooms? what floor? what square feet? am I supposed to phone every single fucking building in the whole city !? my god.
3. Ridiculously amateurish and unprofessional people renting these things. Some people I call & email over and over and they never get back to me at all. Good job. Others put up apartment listings that are just woefully lacking in information. My god, fucking list the square footage at least.
4. Intentional lies & left out information in the ads. Of course lots of the people who leave out the square feet do it on purpose, like the places that say "spacious 1 bedroom" and then you email them and they tell you it's 550 square feet. The big thing people do here is neighborhood lies. Everything around claims to be "Capitol Hill" - no, fuckers, First Hill down in the hospitals is not Cap Hill, fucking Central District half way down Rainier is not cap hill, fucking Eastlake is not cap hill, hell I've seen places across the fucking bridge on Beacon Hill claiming to be "Cap Hill". Liars.
5. The ridiculous pretension that we're not in a huge real estate crash. All the ads are like "rare unit available - act now ! prestigious building!" uhh, hello, half the fucking city is vacant or on sale right now. You can quit with pretending that I should feel fortunate that you are being kind enough to try to rent to me.
In general I'm a little torn about whether to hurry up and get out of here (god knows I need to get out of here), vs. take my time and find a really great place or wait for the market to crash a bit more. My "boots on the ground" view of the market here is that it has completely crashed, people are moving out left and right, but the sellers/landlords have not yet come face to face with the reality, so sale prices are still high and rents are still high. Right now there's a glut of supply that's just not moving, tons of condos are sitting empty unsold. In the next 6 months or so there's going to be a price crash.
I'm finding I'm still a sucker for these charming old 1910-1920 buildings. They're just so beautiful and full of charm and character; every time I look at a brand new building it just feels like a boring box, like a hotel room, and it makes me feel claustrophobic and ill. I dunno, maybe I'm attracted to the high I get from lead and mold poisoning.
For example I noticed these new condos Lumen are going on auction. Looking at the pictures just makes me angry and sick. Some developer just slapped together a bunch of drywall and sheets of metal and calls it "modern" "sleek" "urban" and wants to sell each condo for $500k. I bet you could build those kinds of units in a few days each. One can easily understand why there are so many new condos like this popping up all over the city - if people are dumb enough to buy them, it's a HUGE windfall for the developer.
Some of the more interesting projects around here :
First Church on 15th near Group Health was going to be converted to condos. I'm sure the project is going to die now, but it's at least sort of interesting. Of course they just put super cheapo generic "modern" hotel box shit inside it, but at least you have the old church around you.
The new unfinished condos on Cal Anderson Park are going to auction. I've been looking at them for a while now, wondering why they've been sitting there for months 90% finished. Well, apparently the developer ran out of money. It's kind of an amazing location, looking right out on the park, though that could also be a bit of a negative because you have zero privacy and there are a lot of hobos and kids in that park.
Harvard and Highland is the big new project going up in the historic mansion district. The condos are huge (2000 sqft) and crazy expensive, it doesn't make any sense IMO, but the web page is cool because it has an interactive map of the neighborhood with info on all the great homes around there. It's one of the best pages I've ever seen on the local robber baron's mansions.
Unrelated but Edith Macefield’s army of tattoos is cool.
There's still a ton of new construction around here that isn't done yet. All the massive amounts of shitty old buildings are going to be in trouble. Among other things, the population density in urban seattle is *massively* expanding these days, with tons of big condo projects in cap hill & especially South Lake Union. The infrastructure does not exist to handle all these people and no thought or money is being put into designing the growth of the city in a manageable constructive way.
I finally came back to DXTC and implemented some of the new slightly different techniques. ( summary of my old posts )
See the : NVidia Article or NVTextureTools Wiki for details.
Briefly :
DXT1 = my DXT1 encoder with annealing. (version reported here is newer and has some more small improvements; the RMSE's are slightly better than last time). DXT1 is 4 bits per pixel (bpp)
Humus BC4BC5 = Convert to YCoCg, Put Y in a single-channel BC4 texture (BC4 = the alpha part of DXT5, it's 4 bpp). Put the CoCg in a two-channel BC5 texture - downsampled by 2X. BC5 is two BC4's stuck together; BC5 is 8 bpp, but since it's downsampled 2x, this is 2bpp per original pixel. The net is a 6 bpp format
DXT5 YCoCg = the method described by JMP and Ignacio. This is 8 bpp. I use arbitrary CoCg scale factors, not the limited ones as in the previously published work.
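For reference, here's the usual floating point YCoCg transform and the bpp arithmetic for the Humus layout as a sketch. I'm not spelling out above which exact YCoCg variant, scaling or rounding the encoders use, so treat the formulas as the common textbook form and the names (`Color3`, `rgb_to_ycocg`, `ycocg_to_rgb`, `humus_bpp`) as hypothetical :

```cpp
#include <cassert>

struct Color3 { double a, b, c; };

// RGB -> YCoCg ( a,b,c = R,G,B in ; Y,Co,Cg out )
Color3 rgb_to_ycocg(Color3 rgb)
{
    double r = rgb.a, g = rgb.b, b = rgb.c;
    return {  0.25*r + 0.5*g + 0.25*b,
              0.5*r          - 0.5*b,
             -0.25*r + 0.5*g - 0.25*b };
}

// YCoCg -> RGB ( exact inverse of the above in real arithmetic )
Color3 ycocg_to_rgb(Color3 ycc)
{
    double y = ycc.a, co = ycc.b, cg = ycc.c;
    return { y + co - cg, y + cg, y - co - cg };
}

// Bits per original pixel : BC4 luma (4 bpp) at full res,
// plus BC5 chroma (8 bpp) downsampled by `downsample` in both x and y.
double humus_bpp(int downsample)
{
    assert(downsample >= 1);
    return 4.0 + 8.0 / (downsample * downsample);
}
```

With downsample = 2 that's 4 + 8/4 = 6 bpp, matching the number above.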
Here are the results in RMSE : (modified 6-19 with new better results for Humus from improved down filter)
| name | DXT1 | Humus | DXT5 YCoCg |
| kodim01.bmp | 8.2669 | 3.9576 | 3.8355 |
| kodim02.bmp | 5.2826 | 2.7356 | 2.643 |
| kodim03.bmp | 4.644 | 2.3953 | 2.2021 |
| kodim04.bmp | 5.3889 | 2.5619 | 2.4477 |
| kodim05.bmp | 9.5739 | 4.6823 | 4.5595 |
| kodim06.bmp | 7.1053 | 3.4543 | 3.2344 |
| kodim07.bmp | 5.6257 | 2.6839 | 2.6484 |
| kodim08.bmp | 10.2165 | 5.0581 | 4.8709 |
| kodim09.bmp | 5.2142 | 2.519 | 2.4175 |
| kodim10.bmp | 5.1547 | 2.5453 | 2.3435 |
| kodim11.bmp | 6.615 | 3.1246 | 2.9944 |
| kodim12.bmp | 4.7184 | 2.2811 | 2.1411 |
| kodim13.bmp | 10.8009 | 5.2525 | 5.0037 |
| kodim14.bmp | 8.2739 | 3.9859 | 3.7621 |
| kodim15.bmp | 5.5388 | 2.8415 | 2.5636 |
| kodim16.bmp | 5.0153 | 2.3028 | 2.2064 |
| kodim17.bmp | 5.4883 | 2.7981 | 2.5511 |
| kodim18.bmp | 7.9809 | 4.0273 | 3.8166 |
| kodim19.bmp | 6.5602 | 3.2919 | 3.204 |
| kodim20.bmp | 5.3534 | 3.0838 | 2.6225 |
| kodim21.bmp | 7.0691 | 3.5069 | 3.2856 |
| kodim22.bmp | 6.3877 | 3.5222 | 3.0243 |
| kodim23.bmp | 4.8559 | 3.045 | 2.4027 |
| kodim24.bmp | 8.4261 | 5.046 | 3.8599 |
| clegg.bmp | 14.6539 | 23.5412 | 10.4535 |
| FRYMIRE.bmp | 6.0933 | 20.0976 | 5.806 |
| LENA.bmp | 7.0177 | 5.5442 | 4.5596 |
| MONARCH.bmp | 6.5516 | 3.2012 | 3.4715 |
| PEPPERS.bmp | 5.8596 | 4.4064 | 3.4824 |
| SAIL.bmp | 8.3467 | 3.7514 | 3.731 |
| SERRANO.bmp | 5.944 | 17.4141 | 3.9181 |
| TULIPS.bmp | 7.602 | 3.6793 | 4.119 |
| lena512ggg.bmp | 4.8137 | 2.0857 | 2.0857 |
| lena512pink.bmp | 4.5607 | 2.6387 | 2.3724 |
| lena512pink0g.bmp | 3.7297 | 3.8534 | 3.1756 |
| linear_ramp1.BMP | 1.3488 | 0.8626 | 1.1199 |
| linear_ramp2.BMP | 1.2843 | 0.7767 | 1.0679 |
| orange_purple.BMP | 2.8841 | 3.7019 | 1.9428 |
| pink_green.BMP | 3.1817 | 1.504 | 2.7461 |
And here are the results in SSIM :
Note this is an "RGB SSIM" computed by doing :
SSIM_RGB = ( SSIM_R * SSIM_G ^2 * SSIM_B ) ^ (1/4)
That is, G gets 2X the weight of R & B. The SSIM is computed at a scale of 6x6 blocks which I just randomly picked out of my ass.
I also convert the SSIM to a "percent similar". The number you see below is a percent - 100% means perfect, 0% means completely unrelated to the original (eg. random noise gets 0%). This percent is :
SSIM_Percent_Similar = 100.0 * ( 1 - acos( ssim ) * 2 / PI )
I do this because the normal "ssim" is like a dot product, and showing dot products is not a good linear way to show how different things are (this is the same reason I show RMSE instead of PSNR like other silly people). In particular, when two signals are very similar, the "ssim" gets very close to 0.9999 very quickly even though the differences are still pretty big. Almost any time you want to see how close two vectors are using a dot product, you should do an acos() and compare the angle.
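The two formulas above are simple enough to write out directly. A minimal sketch; the function names are mine, and I assume the per-channel SSIM values are non-negative so the fourth root is well defined :

```cpp
#include <cassert>
#include <cmath>

// SSIM_RGB = ( SSIM_R * SSIM_G ^2 * SSIM_B ) ^ (1/4)
// ASSUMPTION : all three channel SSIMs are >= 0
double ssim_rgb(double ssim_r, double ssim_g, double ssim_b)
{
    return std::pow(ssim_r * ssim_g * ssim_g * ssim_b, 0.25);
}

// SSIM_Percent_Similar = 100.0 * ( 1 - acos( ssim ) * 2 / PI )
double ssim_percent_similar(double ssim)
{
    assert(ssim >= -1.0 && ssim <= 1.0);
    const double PI = 3.14159265358979323846;
    return 100.0 * (1.0 - std::acos(ssim) * 2.0 / PI);
}
```

For example an ssim of 0.9999 maps to only about 99.1%, which is the point of doing the acos : the last few nines of the raw ssim hide a lot of difference.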
| name | DXT1 | Humus | DXT5 YCoCg |
| kodim01.bmp | 84.0851 | 92.6253 | 92.7779 |
| kodim02.bmp | 82.2029 | 91.7239 | 90.5396 |
| kodim03.bmp | 85.2678 | 92.9042 | 93.2512 |
| kodim04.bmp | 83.4914 | 92.5714 | 92.784 |
| kodim05.bmp | 83.6075 | 92.2779 | 92.4083 |
| kodim06.bmp | 85.0608 | 92.6674 | 93.2357 |
| kodim07.bmp | 85.3704 | 93.2551 | 93.5276 |
| kodim08.bmp | 84.5827 | 92.4303 | 92.7742 |
| kodim09.bmp | 84.7279 | 92.9912 | 93.5035 |
| kodim10.bmp | 84.6513 | 92.81 | 93.3999 |
| kodim11.bmp | 84.0329 | 92.5248 | 92.9252 |
| kodim12.bmp | 84.8558 | 92.8272 | 93.4733 |
| kodim13.bmp | 83.6149 | 92.2689 | 92.505 |
| kodim14.bmp | 82.6441 | 92.1501 | 92.1635 |
| kodim15.bmp | 83.693 | 92.0028 | 92.8509 |
| kodim16.bmp | 85.1286 | 93.162 | 93.6118 |
| kodim17.bmp | 85.1786 | 93.1788 | 93.623 |
| kodim18.bmp | 82.9817 | 92.1141 | 92.1309 |
| kodim19.bmp | 84.4756 | 92.7702 | 93.0441 |
| kodim20.bmp | 87.0549 | 90.5253 | 93.2088 |
| kodim21.bmp | 84.2549 | 92.2236 | 92.8971 |
| kodim22.bmp | 82.6497 | 91.0302 | 91.9512 |
| kodim23.bmp | 84.2834 | 92.4417 | 92.4611 |
| kodim24.bmp | 84.6571 | 92.3704 | 93.2055 |
| clegg.bmp | 77.4964 | 70.1533 | 83.8049 |
| FRYMIRE.bmp | 91.3294 | 72.2527 | 87.6232 |
| LENA.bmp | 77.1556 | 80.7912 | 85.2508 |
| MONARCH.bmp | 83.9282 | 92.5106 | 91.6676 |
| PEPPERS.bmp | 81.6011 | 88.7887 | 89.0931 |
| SAIL.bmp | 83.2359 | 92.4974 | 92.4144 |
| SERRANO.bmp | 89.095 | 75.7559 | 90.7327 |
| TULIPS.bmp | 81.5535 | 90.8302 | 89.6292 |
| lena512ggg.bmp | 86.6836 | 95.0063 | 95.0063 |
| lena512pink.bmp | 86.3701 | 92.1843 | 92.9524 |
| lena512pink0g.bmp | 89.9995 | 79.9461 | 84.3601 |
| linear_ramp1.BMP | 92.1629 | 94.9231 | 93.5861 |
| linear_ramp2.BMP | 92.8338 | 96.1397 | 94.335 |
| orange_purple.BMP | 89.0707 | 91.6372 | 92.1934 |
| pink_green.BMP | 87.4589 | 93.5702 | 88.4219 |
Conclusion :
DXT5 YCoCg and "Humus" are both significantly better than DXT1.
Note that DXT5-YCoCg and "Humus" encode the luma in exactly the same way. For gray images like "lena512ggg.bmp" you can see they produce identical results. The only difference is how the chroma is encoded - either a DXT1 block (+scale) at 4 bpp, or a downsampled 2X BC4 block at 2 bpp.
In RGB RMSE, DXT5-YCoCg is measurably better than Humus-BC4BC5, but in SSIM they are nearly identical. This is because almost all of the RMSE loss in Humus comes from the YCoCg lossy color conversion and the CoCg downsampling. The actual BC4BC5 compression is very near lossless. (as much as I hate DXT1, I really like BC4 - it's very easy to produce near optimal output, unlike DXT1 where you have to run a really fancy compressor to get good output). The CoCg loss hurts RMSE a lot, but doesn't hurt actual visual quality or SSIM much in most cases.
In fact on an important class of images, Humus actually does a lot better than DXT5-YCoCg. That class is simple smooth ramp images, which we use very often in the form of lightmaps. The test images at the bottom of the table (linear_ramp and pink_green) show this.
On a few images where the CoCg downsample kills you, Humus does very badly. It's bad on orange_purple because that image is specifically designed to be primarily in Chroma not Luma ; same for lena512pink0g.bmp ; note that normal chroma downsampling compressors like JPEG have this same problem. You could in theory choose a different color space for these images and use a different reconstruction shader.
Since Humus is only 6 bpp, size is certainly not a reason to prefer DXT1 over it. However, it does require two texture fetches in the shader, which is a pretty big hit. (BTW the other nice thing about Humus is that it's already down-sampled in CoCg, so if you are using something like a custom JPEG in YCoCg space with downsampled CoCg - you can just directly transcode that into Humus BC4BC5, and there's no scaling up or down or color space changes in the realtime recompress). I think this is probably what will be in Oodle because I really can't get behind any other realtime recompress.
I also tried something else, which is DXT1 optimized for SSIM. The idea is to use a little bit of neighbor information. The thing is, in my crazy DXT1 encoder, I'm just trying various end points and measuring the quality of each choice. The normal thing to do is to just take the MSE vs the original, but of course you could use other error metrics.
One such error metric is to decompress the block you're working on into its context - decompress into a chunk of neighbors that have already been DXT1 compressed & decompressed as well. Then compare that block and its neighbors to the original image in that neighborhood. In my case I used 2 pixels around the block I was working on, making a total region of 8x8 pixels (with the 4x4 DXT1 block in the middle).
You then compare the 8x8 block to the original image and try to optimize that. If you just used MSE in this comparison, it would be the same as before, but you can use other things. For example, you could add a term that penalizes not changes in values, but changes in *slope*.
Another approach would be to take the DCT of the 8x8 block and the DCT of the 8x8 original. If you then just take the L2 difference in DCT domain, that's no different than the original method, because the DCT is unitary. But you can apply non-uniform quantizers at this step using the JPEG visual quantization weights.
The approach I used was to use SSIM (using a 4x4 SSIM block) on the 8x8 windows. This means you are checking the error not just on your block, but on how your block fits into the neighborhood.
For example if the original image is all flat color - you want the output to be all flat color. Just using MSE won't give you that, eg. MSE considers 4444 -> 3535 to be just as good as 4444 -> 5555 , but we know the latter is better.
This does in fact produce slightly better looking images - it hurts RMSE of course because you're no longer optimizing for RMSE.
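To make the 4444 example concrete, here is a small sketch comparing plain MSE against a slope-difference term like the one mentioned above. This is just an illustration of why a neighborhood-aware metric helps; it is not the SSIM metric the encoder actually used, and the function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Mean squared error between two equal-length, non-empty signals.
double MeanSquaredError(const std::vector<int> & a, const std::vector<int> & b)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); i++)
    {
        double d = double(a[i]) - double(b[i]);
        sum += d * d;
    }
    return sum / double(a.size());
}

// Sum of squared differences between neighbor-to-neighbor slopes.
// This penalizes changes in slope rather than changes in value.
double SlopeError(const std::vector<int> & a, const std::vector<int> & b)
{
    double sum = 0.0;
    for (std::size_t i = 1; i < a.size(); i++)
    {
        double slopeA = double(a[i]) - double(a[i - 1]);
        double slopeB = double(b[i]) - double(b[i - 1]);
        double d = slopeA - slopeB;
        sum += d * d;
    }
    return sum;
}
```

For the original 4444, both 3535 and 5555 have an MSE of 1, but the slope error is 12 for 3535 and 0 for 5555, so the slope term picks the flat output.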
In the previous post we attacked the problem :
If you are given a low res signal L and a known down-sampler D() (in particular, box down sampling), find an up sampler U() such that :
L = D ( U( L ) )
and U( L ) is as close as possible to the actual high res signal that L was made from (unknown).
I'm also interested in the opposite problem :
If you are given a high res signal H, and a known up-sampler U() (in particular, bilinear filtering), find a down sampler D() such that :
E = ( H - U( D( H ) ) )^2 is minimized
This is a much more concrete and tractable problem. In particular in games/3d we know we are forced to use bilinear filtering as our up-sampler. If you use box down-sampling for D() as many people do, that's horrible, because bilinear filtering and box-downsampling are both interpolating and variance reducing. That is, they both take noisy signals and force them towards gray. If you know that U() is going to be bilinear filtering, then you should use a D() that compensates for that. It's intuitively obvious that D should be something a bit like a sinc to bring in some neighbors with negative lobes to compensate for the blurring aspect of bilinear upsample, but what exactly I don't know yet.
(note that this is a different problem than making mips - in making mips you are actually going to be viewing the mip at a 1:1 resolution, it will not be upsampled back to the original resolution; you would use this if you were trying to substitute a lower res texture for a higher one).
I haven't tried my hand at solving this yet, maybe it's been done? Much like the previous problem, I'm surprised this isn't something well known and standard, but I haven't found anything on it.
A while ago I posed this problem to the world :
Say you are given the box-downsampled version of a signal (I may use "image" and "signal" interchangeably cuz I'm sloppy). Box-downsampled means groups of N values in the original have been replaced by the average in that group and then downsampled N:1. You wish to find an image which is the same resolution as the source and if box-downsampled by N, exactly reproduces the low resolution signal you were given. This high resolution image you produce should be "smooth" and close to the expected original signal.
Examples of this are say if you're given a low mip and you wish to create a higher mip such that downsampling again would exactly reproduce the low mip you were given. The particular case I mainly care about is if you are given the DC coefficients of a JPEG, which are the averages on 8x8 blocks, you wish to produce a high res image which has the exact same average on 8x8 blocks.
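In discrete form, the box-downsample and the "exactly reproduces" constraint can be sketched like this (1D, hypothetical names, assuming the signal length is a multiple of N) :

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Replace each group of N values with its average.
// Assumes N > 0; a partial group at the tail is dropped.
std::vector<double> BoxDownsample(const std::vector<double> & hi, std::size_t N)
{
    std::vector<double> lo(hi.size() / N);
    for (std::size_t i = 0; i < lo.size(); i++)
    {
        double sum = 0.0;
        for (std::size_t k = 0; k < N; k++)
            sum += hi[i * N + k];
        lo[i] = sum / double(N);
    }
    return lo;
}

// The constraint from the problem statement :
// does box-downsampling "candidate" give back "lo" ?
bool ReproducesLowRes(const std::vector<double> & candidate,
                      const std::vector<double> & lo,
                      std::size_t N, double tolerance = 1e-9)
{
    std::vector<double> down = BoxDownsample(candidate, N);
    if (down.size() != lo.size()) return false;
    for (std::size_t i = 0; i < lo.size(); i++)
    {
        if (std::fabs(down[i] - lo[i]) > tolerance) return false;
    }
    return true;
}
```

For the low res signal { 2, 6 } with N = 2, both the flat candidate { 2, 2, 6, 6 } and the ramp { 1, 3, 5, 7 } pass the check, so the constraint alone does not pick a unique answer.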
Obviously this is an under-constrained problem (for N > 1) because I haven't clearly spelled out "smooth" etc. There are an infinity of signals that when downsampled produce the same low resolution version. Ideally I'd like to have a way to upsample with a parameter for smoothness vs. ringing that I could play with. (if you're nitty, I can constrain the problem precisely : The correlation of the output image and the original source image should be maximized over the space of all real world source images (eg. for example over the space of all images that exist on the internet)).
Anyway, after trying a whole bunch of heuristic approaches which all failed (though Sean's iterative approach is actually pretty good), I found the mathemagical solution, and I thought it was interesting, so here we go.
First of all, let's get clear on what "box downsample" means in a form we can use in math.
You have an original signal f(t) . We're going to pretend it's continuous because it's easier.
To make the "box downsample" what you do is apply a convolution with a rectangle that's N wide. Since I'm treating t as continuous I'll just choose coordinates where N = 1. That is, "high res" pixels are 1/N apart in t, and "low res" pixels are 1 apart.
Convolution { f , g } (t) = Integral{ ds * f(s) * g(t - s) }
The convolution with rect gives you a smoothed signal, but it's still continuous. To get the samples of the low res image, you multiply this by "comb". comb is a sum of dirac delta functions at all the integer coordinates.
F(t) = Convolve{ rect , f(t) }
low res = comb * F(t)
low res = Sum[n] L_n * delta_n
Okay ? We now have a series of low res coefficients L_n just at the integers.
This is what is given to us in our problem. We wish to try to guess what "f" was - the original high res signal. Well, now that we've written it this way, it's obvious ! We just have to undo the comb filtering and undo the convolution with rect !
First to undo the comb filter - we know the answer to that. We are given discrete samples L_n and we wish to reproduce the smooth signal F that they came from. That's just Shannon sampling theorem reconstruction. The smooth reconstruction is made by just multiplying each sample by a sinc :
F(t) = Sum[n] L_n * sinc( t - n )
This is using the "normalized sinc" definition : sinc(x) = sin(pi x) / (pi x).
sinc(x) is 1.0 at x = 0 and 0.0 at all other integer x's and it oscillates around a lot.
So this F(t) is our reconstruction of the rect-filtered original - not the original. We need to undo the rect filter. To do that we rely on the Convolution Theorem : the Fourier transform of a convolution is just the product of the Fourier transforms. That is :
Fou{ Convolution { f , g } } = Fou{ f } * Fou{ g }
So in our case :
Fou{ F } = Fou{ Convolution { f , rect } } = Fou{ f } * Fou{ rect }
Fou{ f } = Fou{ F } / Fou{ rect }
Recall F(t) = Sum[n] L_n * sinc( t - n ) , so :
Fou{ f } = Sum[n] L_n * Fou{ sinc( t - n ) } / Fou{ rect }
Now we need some Fourier transform knowledge. The easiest way for me to find this stuff is just to do the integrals myself. Integrals are really fun and easy. I won't copy them here because it sucks in ASCII so I'll leave it as an exercise to the reader. You can easily figure out the Fourier translation principle :
Fou{ sinc( t - n ) } = e^(-2 pi i n v) * Fou{ sinc( t ) }
As well as the Fourier sinc / rect symmetry :
Fou{ rect(t) } = sinc( v )
Fou{ sinc(t) } = rect( v )
All that means for us :
Fou{ f } = Sum[n] L_n * e^(-2 pi i n v) * rect(v) / sinc(v)
So we have the Fourier transform of our signal and all that's left is to do the inverse transform !
f(t) = Sum[n] L_n * Fou^-1{ e^(-2 pi i n v) * rect(v) / sinc(v) }
because of course constants pull out of the integral. Again you can easily prove a Fourier translation principle : the e^(-2 pi i n v) term just acts to translate t by n, so we have :
f(t) = Sum[n] L_n * h(t - n)
h(t) = Fou^-1{ rect(v) / sinc(v) }
First of all, let's stop and see what we have here. h(t) is a function centered on zero and symmetric around zero - it's a reconstruction shape. Our final output signal, f(t), is just the original low res coefficients multiplied by this h(t) shape translated to each integer point n. That should make a lot of sense.
What is h exactly? Well, again we just go ahead and do the Fourier integral. The thing is, "rect" just acts to truncate the infinite range of the integral down to [-1/2, 1/2] , so :
h(t) = Integral[-1/2,1/2] { dv e^(2 pi i t v) / sinc(v) }
Since sinc is symmetric around zero, let's take the two halves of the range around zero and add them together :
h(t) = Integral[0,1/2] { dv ( e^(2 pi i t v) + e^(- 2 pi i t v) ) / sinc(v) }
h(t) = Integral[0,1/2] { dv 2 * cos ( 2 pi t v ) * pi * v / sin( pi v) }
(note we lost the c - sinc is now sin). Let's change variables to w = pi v :
h(t) = (2 / pi ) * Integral[ 0 , pi/2 ] { dw * w * cos( 2 t w ) / sin( w ) }
And.. we're stuck. This is an integral function; it's a pretty neat form, it sure smells like some kind of Bessel function or something like that, but I can't find this exact form in my math books. (if anyone knows what this is, help me out). (actually I think it's a type of elliptic integral).
One thing we can do with h(t) is prove that it is in fact exactly what we want. It has the box-unit property :
Integral[ N - 1/2 , N + 1/2 ] { h(t) dt } = 1.0 if N = 0 and 0.0 for all other integer N
That is, the 1.0 wide window box filter of h(t) centered on integers is exactly 1.0 on its own unit interval, and 0 on others. In other words, h(t) reconstructs its own DC perfectly and doesn't affect any others. (prove this by just going ahead and doing the integral; you should get sin( N * pi ) / (N * pi ) ).
While I can't find a way to simplify h(t) , I can just numerically integrate it. It looks like this :
You can see it sort of looks like sinc, but it isn't. The value at 0 is > 1. The height of the central peak vs. the side peaks is more extreme than sinc, the first negative lobes are deeper than sinc. It actually reminds me of the appearance of a wavelet.
Actually the value h(0) is exactly 4 G / pi = 1.166243... , where "G" is Catalan's constant.
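Here is a minimal sketch of numerically integrating h(t) from the last integral form above, plus the box-unit check. It uses a plain midpoint rule (my choice, it conveniently never samples the removable 0/0 at w = 0); the names and step counts are arbitrary assumptions, not the code used to make the plot.

```cpp
#include <cmath>

// h(t) = (2 / pi) * Integral[ 0 , pi/2 ] { dw * w * cos( 2 t w ) / sin( w ) }
// Midpoint rule : samples sit at the centers of the steps, so w is never 0.
double ReconstructionKernel(double t, int steps = 4096)
{
    const double kPi = 3.14159265358979323846;
    const double dw = (kPi / 2.0) / steps;
    double sum = 0.0;
    for (int i = 0; i < steps; i++)
    {
        double w = (i + 0.5) * dw;
        sum += w * std::cos(2.0 * t * w) / std::sin(w);
    }
    return (2.0 / kPi) * sum * dw;
}

// Integral[ n - 1/2 , n + 1/2 ] { h(t) dt } , also by midpoint rule.
// Should come out near 1 for n = 0 and near 0 for other integers.
double BoxIntegralOfKernel(int n, int outerSteps = 256)
{
    const double dt = 1.0 / outerSteps;
    double sum = 0.0;
    for (int i = 0; i < outerSteps; i++)
    {
        double t = (n - 0.5) + (i + 0.5) * dt;
        sum += ReconstructionKernel(t);
    }
    return sum * dt;
}
```

ReconstructionKernel(0.0) should land on the 4 G / pi value quoted above to several digits, and BoxIntegralOfKernel gives a quick sanity check of the box-unit property without doing the integral by hand.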
Anyway, this is all very amusing and it actually "works" in the sense that if you blow up a low-res image using this h(t) basis shape, it does in fact make a high res image that is smooth and upon box-down sampling exactly reproduces the low-res original.
It is, however, not actually useful. For one thing, it's computationally ridiculous. Of course you would precompute the h(t) and store it in a table, but even then, the reach of h(t) is infinite, and it doesn't get small until very large t (beyond the edges of any real image), so in practice every output pixel must be a weighted sum from every single DC value in the low res image. Even without that problem, it's useless because it's just too ringy on real data. Looking at the shape above it should be obvious it will ring like crazy.
I believe these problems basically go back to the issue of using the ideal Shannon reconstruction when I did the step of "undoing the comb". By using the sinc to reproduce I doomed myself to non-local effect and ringing. The next obvious question is - can you do something other than sinc there? Why yes you can, though you have to be careful.
Say we go back to the very beginning and make this reconstruction :
F(t) = Sum[n] L_n * B( t - n )
We're making F(t) which is our reconstruction of the smooth box-filter of the original. Now B(t) is some reconstruction basis function (before we used sinc). In order to be a reconstruction, B(t) must be 1.0 at t = 0, and 0.0 at all other integer t. Okay.
If we run through the math again with a general B, we get :
f(t) = Sum[n] L_n * h(t - n)
but with :
h(t) = Fou^-1{ Fou{ B } / sinc(v) }
For example :
If B(t) = "triangle" , then F(t) is just the linear interpolation of the L_n
Fou{ triangle } = sinc^2 ( v)
h(t) = Fou^-1{ sinc^2 ( v) / sinc(v) } = Fou^-1{ sinc } = rect(t)
Our basis functions are rects ! In fact this is the reconstruction where the L_n is just made a constant over each DC domain. In fact if you think about it that should be obvious. If you take the L_n and make them constant on each domain, then you run a rectangle convolution over that - as you slide the rectangle window along, you get linear interpolation, which is our F(t).
That's not useful, but maybe some other B(t) is. In particular I think the best line of approach is for B(t) to be some kind of windowed sinc. Perhaps a Gaussian-windowed sinc. Any real world window I can think of leads to a Fourier transform of B(t) that's too complex to do analytically, which means our only approach to finding h is to do a double-numerical-integration which is rather a disastrous thing to do, even for precomputing a table.
So I guess that's the next step, though I think this whole approach is a practical dead end and is now just a scientific curiosity. I must say it was a lot of fun to actually bust out pencil and paper and do some math and real thinking. I really miss it.
It's time now for me to give a shout out to all the b-boys in the werld.
mischief.mayhem.soap.
Adventures of a hungry girl
Atom
Aurora
Beautiful Pixels
Birth of a Game
bouliiii's blog
Braid
C0DE517E
Capitol Hill Triangle
cbloom rants
Cessu's blog
Culinary Fool
David Lebovitz
Diary of a Graphics Programmer
Diary Of An x264 Developer
Eat All About It
EntBlog
fixored?
Game Rendering
GameArchitect
garfield minus garfield
Graphic Rants
Graphics Runner
Graphics Size Coding
Gustavo Duarte
His Notes
Humus
I Get Your Fail
Ignacio Castaño
Industrial Arithmetic
John Ratcliff's Code Suppository
Lair Of The Multimedia Guru
Larry Osterman's WebLog
level of detail
Lightning Engine
limegarden.net
Lost in the Triangles
Mark's Blog
Married To The Sea
meshula.net
More Seattle Stuff
My Green Paste, Inc.
Nerdblog.com
not a beautiful or unique snowflake
NVIDIA Developer News
Nynaeve
onepartcode.com
Pete Shirley's Graphics Blog
Pixels, Too Many..
Real-Time Rendering
realtimecollisiondetection.net - the blog
RenderWonk
Ryan Ellis
Seattle Daily Photo
snarfed.org
Some Assembly Required
stinkin' thinkin'
Stumbling Toward 'Awesomeness'
surly gourmand
Sutter's Mill
Syntopia
Thatcher's rants and musings
The Atom Project
The Big Picture
The Data Compression News Blog
The Ladybug Letter
The software rendering world
TomF's Tech Blog
View
VirtualBlog
Visual C++ Team Blog
Void Star: Ares Fall
What your mother never told you about graphics development
Wright Eats
xkcd.com
Bartosz Milewski's Programming Cafe
autogen from the Google Reader xml output. I would post the code right here but HTML EATS MY FUCKING LESS THAN SIGNS and it's pissing me off. God damn you.
SAVED : Thanks Wouter for linking to htmlescape.net ; I might write a program to automatically do that to anything inside a PRE chunk when I upload the block.
int main(int argc,char *argv[])
{
    char * in = ReadWholeFile(argv[1]);
    while( in && *in )
    {
        in = skipwhitespace(in);
        if ( stripresame(in,"<outline") )
        {
            char * title = strextracttok(in,'"',&in);
            in = strstr(in,"htmlUrl");
            char * url = strextracttok(in,'"',&in);
            lprintf("<A HREF=\"%s\"> %s </A> <BR>\n",url,title);
        }
        in = strnextline(in);
    }
    return 0;
}
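The escaping step mentioned above (so the less-than signs survive the upload) could be sketched like this. It is a hypothetical helper, not the program I might write, and it only handles the three characters that break HTML text; finding the PRE chunks is left out.

```cpp
#include <string>

// Escape the characters that HTML would otherwise eat.
// '&' has to be handled too, or already-escaped text gets ambiguous.
std::string HtmlEscape(const std::string & in)
{
    std::string out;
    out.reserve(in.size());
    for (char c : in)
    {
        switch (c)
        {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;";  break;
        case '>': out += "&gt;";  break;
        default:  out += c;       break;
        }
    }
    return out;
}
```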
God dammit, there's no good biking around here. The bike lanes are meager, and half of the so-called "bike routes" run you right down busy streets with hardly any space (I'm looking at you, Howell-Stewart junction). The roads are in a shameful state, full of pot holes and ruts that are literally rattling the bolts loose on my bike (and banging them is a huge hazard and hurts my joints like a motherfuck). The Lake Washington loop is tolerable, but there are tons of bits with ridiculously bad pavement or narrow/ no shoulder, as well as plenty of major hazards, lots of commuting traffic, and bad routing.
My original right shoulder injury in 2006 was a separation that became frozen and is now still bugging me in the form of a winging scapula and arthritic AC joint. That crash was caused by a pot hole in San Francisco. Fucking pot holes.
Also, the fucking mini traffic circles they've tossed around cap hill and 28th are fucking idiotic. They don't function as real traffic circles because they're too small; a real traffic circle works because being "in the circle" is a separate state. The big problem with them is that cars have to swing really wide to get around them, and the road isn't wide, so cars swing right into the path of pedestrians, and cut right into bicycles. It's fucking awful. Actually I hate all the "Yield" streets around here too since half of them are at blind intersections and lots of dumb fucks come barrelling through them at high speed. All of these residential intersections should have full 4-way stops and painted crosswalks. Hell, more cities just need streets that are ped/bike only. For example Pike between Broadway and 12th should just be closed to cars. It would be fantastic for local businesses.
Kirkland's got this lovely pool right by my work, so I go to check when the lap swim hours are ... none. I mean, they do have lap swim from 5:30-7 am, but that may as well not exist. Even if I wanted to get up that early, it's fucking cold and gross that early, I want to swim in the afternoon sun you fucks. WTF you have this fucking great pool and you just can't open it? Presumably this is the same problem as the fucking roads, that there's no damn taxes and the governments are fucking dumb. It's such stupid cost saving though; you've spent all the money to make this pool, you clean it and pump it, and then you only have it open 6 hours a day.
Anyway, Colman park just south of I90 is really cool. Not the part down on the lake, that's okay, but it's obvious, what with its views of Rainier and whatnot. The cool part is up Lake Washington Blvd S toward 31st Ave. You get the best effect if you park down at the bottom and walk up the hill - it has cool winding paths and stairs and bridges, and then at the top there's this huge public vegetable+flower garden that's like a hidden garden surprise for the hardy souls that made it up the hill.
Biking is so fucking great. I went and did the Mercer Island loop this weekend; it's pretty nice once you're out there, though the bike path is damn annoying and riding over I90 is scary and not fun. I have vertigo and the high view down to water with just a railing next to me is nausea inducing. (only did the ride over the Golden Gate Bridge once; I nearly had a heart attack; after that I always drove my bike across the bridge and parked and then rode north).
I passed two separate Mercer Island residents who seemed to intentionally stand right in my way. They were just standing in the road in the bike lane, cars were coming so it's not like I had a ton of room, and they made no movement out of the way at all. Rich people are fucking cocks.
Some douchebag cyclist dropped a passive-aggressive bummer-bomb on me. He was riding ahead of me on the bike lane, I move to the left and pass him. As I'm passing he says "I'm on your right". Huh? Yeah you are. I came up behind you and saw you. Oh, I get it. That's a fucking dickweed way of saying you expected me to say "on your left" when I passed you. Fuck you. It took me a little while to figure out what an acrid little asshole he was and by that time I was well past (because in addition to being a passive aggressive holier than thou dick he was also fat and slow); if I'd realized it sooner I would've yelled something back at him. How dare you fucking bring that negative shit into my world when I'm out on my ride having my one fucking moment of pure joy and pleasure? Fuck you, I know the fucking rules of courtesy, I say "on your left" if I think there's any danger or if it's a tight spot, but I don't say it every damn time I pass every person, and that's an unreasonable expectation, and even if you do think I should you can fucking keep it to yourself.
Some random dude also drafted me for a mile or so. That's not cool. You don't just jump on the ass of someone you don't know without saying anything. To draft correctly you have to be mere inches from the person in front of you. It's great for efficiency, but it's also very dangerous if you aren't communicating, because if the lead person brakes, you have an instant crash. Sometimes I'll latch onto someone's wheel when they pass me, but I hang back far enough to be safe, or I say hey can we draft a while? This dude just put his nose in my butt and stuck there. Mild scowl.
I got this book : Bicycling the Backroads around Puget Sound at the library. It sucks & basically proves that the biking here blows. There's one or two good rides in it (one of them being the Mercer Island Loop). Then it's full of rides that are just bullshit. It's got a bunch of rides that go down highways that are totally not suitable for biking (like the 203 and the 169) ; it also literally has a ride that goes on I90. WTF. Oh and then it's got rides that go off road. Umm, hello, this is the road biking book, you can put the unpaved road rides in another book, thank you. There are a few rides that look interesting, but they're well the hell far away, like Enumclaw or Granite Falls kind of far away. What the hell I guess I'll go try one soon cuz I don't have anything else to do.
The NYT Travel this week featured biking around Provence . That's like my dream; that article is pretty worthless and the writer is not a real biker, but doing some real country touring around Europe, in the sun of Provence, Tuscany, Catalonia, seeing all the countryside at the pace of a bike (the bike is the perfect speed for seeing country; walking is too slow and driving is too fast), eating and drinking. I don't want to ride the fucking Tour de France routes, that's way too hard and not fun.
I'm literally surrounded by crazy annoying loud people on all sides. To the west is the young couple who just had a baby that cries constantly; I've been around many babies and never heard one cry and cry like this; plus their Russian mom has now moved in with them and is always barking out orders to kill dissenting journalists in high pitched Russian.
To the north (fortunately across the street) is a party house full of frat boys who are constantly screaming "wooo" about something or other. Whoah bro sportscenter is on! Open the windows and scream "wooo" !! They literally set their house on fire the other day, fire trucks came and sprayed it, some fire dudes hacked open a wall to get at some stray embers inside. The next night after the fire they had another big party.
To the south is some amateur indie/punk band (emphasis strongly on the "amateur"). Fortunately they are a few houses away, but the pounding of the drums travels far. Sadly for me, they seem to be very diligent about practicing the same song over and over and over. Sadly for them it doesn't seem to be helping.
To the east are the fucking white trash who are sitting outside drinking bud light and talking five feet away from my window. Blurg. Houses around here are way the fuck too close together. It might be worse than the traditional row house like you have in SF or back east; with the row house you have a solid wall between you and the neighbor. Here we have open space and windows, but only like five feet of distance.
And of course there's "stompy" the crack head upstairs neighbor who seems to pass the time by moving his furniture from one end of the building to the other.
I guess it's kind of a noisy neighborhood, but I couldn't tell that when I moved in, because there are plenty of very nice single family homes around, with yuppie parents and kids and high property values.
There should really be segregation. There should be "ghettos" for the people who want to be noisy and have parties and whatnot - fine, that's cool, just go live in the noisy ghetto with other people of your kind. Alternatively, people should have to sign up in advance for certain weekends when they're allowed to be noisy so I can just go out of town those weekends.
I gather that many places in Europe have the tradition of a local pub on each block, and the people who live there just go congregate in the pub and make their noise there, so it's not in anyone's house. If you're making a ruckus, you go down to the pub. That seems like a good system.
Anyway, I write this now because by some divine conspiracy, all my neighbors went out of town this weekend. Hippie smoker jabberers next door - gone! Upstairs stompy - gone! No band practices, and no huge block parties. It was sublime, I was free, at peace, I could sit and read or work (I did lots of work), cook, listen to music, and I felt alone and happy. My god. Sometimes I get into these funks in life where I just think that everything fucking sucks and everyone is a huge dick, and then it's like the clouds clear - you see a moment where in fact, things do not suck, and it's like a revelation - whoah this misery is not how it has to be.
.. can be turned into multiplies and shifts as I'm sure you know. Often on x86 this is done most efficiently through use of the "mul high" capability (the fact that you get 64 bit multiply output for free and can just grab the top dword).
Sadly, C doesn't give you a way to do mul high, AND stupidly MS/Intel don't seem to provide any intrinsic to do it in a clean way. After much fiddling I've found that this usually works on MSVC :
uint32 Mul32High(uint32 a,uint32 b)
{
    return (uint32)( ((uint64) a * b) >>32);
}
but make sure to check your assembly. This should assemble to just a "mul" and "ret edx".
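As an example of what mul high buys you, here is a sketch of full-range division by 3 and by 5. I've used the standard fixed-width types instead of my uint32/uint64 typedefs so it stands alone; the constants are the rounded-up reciprocals ceil( 2^33 / 3 ) and ceil( 2^34 / 5 ), and the function names are just for illustration.

```cpp
#include <cstdint>

// Top 32 bits of the 64-bit product a * b.
inline uint32_t MulHigh32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

// x / 3 for any 32-bit x : multiply by ceil( 2^33 / 3 ), total shift of 33.
inline uint32_t DivideBy3(uint32_t x)
{
    return MulHigh32(x, 0xAAAAAAABu) >> 1;
}

// x / 5 for any 32-bit x : multiply by ceil( 2^34 / 5 ), total shift of 34.
inline uint32_t DivideBy5(uint32_t x)
{
    return MulHigh32(x, 0xCCCCCCCDu) >> 2;
}
```

These work over the whole 32-bit range because 3 * 0xAAAAAAAB and 5 * 0xCCCCCCCD each overshoot the power of two by only 1, so the accumulated error never reaches the next quotient.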
Now, for the actual divider lookup, there are lots of references out there on the net, but most of them are not terribly useful because 1) lots of them assume expensive multiplies, and 2) most of them are designed to work for the full range of arguments. Often you know that you only need to work on a limited range of one parameter, and you can find much simpler versions for limited ranges.
One excellent page is : Jones on Reciprocal Multiplication (he actually talks about the limited range simplifications, unlike the canonical references).
The best reference I've found is the AMD Optimization Guide. They have a big section about the theory, and also two programs "sdiv.exe" and "udiv.exe" that spit out code for you! Unfortunately it's really fucking hard to find on their web site. sdiv and udiv were shipped on AMD developer CD's and appear to have disappeared from the web. I've found one FTP Archive here . You can find the AMD PDF's on their site, as these links may break : PDF 1 , PDF 2 .
And actually this little CodeProject FindMulShift is not bad; it just does a brute force search for simple mul shift approximations for limited ranges of numerators.
But it's written in a not-useful way. You should just find the shift that gives you maximum range. So I did that, it took two seconds, and here's the result for you :
__forceinline uint32 IntegerDivideConstant(uint32 x,uint32 divisor)
{
    ASSERT( divisor > 0 && divisor <= 16 );
    if ( divisor == 1 )
    {
        return x;
    }
    else if ( divisor == 2 )
    {
        return x >> 1;
    }
    else if ( divisor == 3 )
    {
        ASSERT( x < 98304 );
        return ( x * 0x0000AAAB ) >> 17;
    }
    else if ( divisor == 4 )
    {
        return x >> 2;
    }
    else if ( divisor == 5 )
    {
        ASSERT( x < 81920 );
        return ( x * 0x0000CCCD ) >> 18;
    }
    else if ( divisor == 6 )
    {
        ASSERT( x < 98304 );
        return ( x * 0x0000AAAB ) >> 18;
    }
    else if ( divisor == 7 )
    {
        ASSERT( x < 57344 );
        return ( x * 0x00012493 ) >> 19;
    }
    else if ( divisor == 8 )
    {
        return x >> 3;
    }
    else if ( divisor == 9 )
    {
        ASSERT( x < 73728 );
        return ( x * 0x0000E38F ) >> 19;
    }
    else if ( divisor == 10 )
    {
        ASSERT( x < 81920 );
        return ( x * 0x0000CCCD ) >> 19;
    }
    else if ( divisor == 11 )
    {
        ASSERT( x < 90112 );
        return ( x * 0x0000BA2F ) >> 19;
    }
    else if ( divisor == 12 )
    {
        ASSERT( x < 98304 );
        return ( x * 0x0000AAAB ) >> 19;
    }
    else if ( divisor == 13 )
    {
        ASSERT( x < 212992 );
        return ( x * 0x00004EC5 ) >> 18;
    }
    else if ( divisor == 14 )
    {
        ASSERT( x < 57344 );
        return ( x * 0x00012493 ) >> 20;
    }
    else if ( divisor == 15 )
    {
        ASSERT( x < 74909 );
        return ( x * 0x00008889 ) >> 19;
    }
    else if ( divisor == 16 )
    {
        return x >> 4;
    }
    else
    {
        CANT_GET_HERE();
        return x / divisor;
    }
}
Note : an if/else tree is better than a switch() because we're branching on constants. This should all get removed by the compiler. Some compilers get confused by large switches, even on constants, and fail to remove them.
7 seems to be the worst number for all of these methods. It only works here up to 57344 (not quite 16 bits).
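If you want to re-check or extend the table, here is a minimal sketch of the brute-force range check. The names CeilMultiplier and MulShiftSafeLimit are made up for this example (they are not from FindMulShift or the AMD tools). It assumes the multiply is done in 32-bit unsigned math like the function above, so the limit is whichever comes first: the product overflowing 32 bits, or the quotient going wrong.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper : multiplier = ceil( 2^shift / divisor )
uint32_t CeilMultiplier(uint32_t divisor, uint32_t shift)
{
    return (uint32_t)( ( (1ull << shift) + divisor - 1 ) / divisor );
}

// Hypothetical helper : returns the first x for which ( x * mul ) >> shift
// is NOT a safe 32-bit replacement for x / divisor.
// Every x below the returned value is safe.
uint32_t MulShiftSafeLimit(uint32_t divisor, uint32_t mul, uint32_t shift)
{
    for (uint64_t x = 0; x <= 0xFFFFFFFFull; x++)
    {
        // multiply in 64 bits so that 32-bit overflow is visible
        uint64_t product = x * mul;
        if ( product > 0xFFFFFFFFull )
            return (uint32_t)x; // would overflow a uint32
        if ( (product >> shift) != x / divisor )
            return (uint32_t)x; // quotient is wrong
    }
    return 0xFFFFFFFFu; // never failed in the 32-bit range
}
```

Calling MulShiftSafeLimit(3, 0x0000AAAB, 17) gives 98304 and MulShiftSafeLimit(15, 0x00008889, 19) gives 74909, which are the limits in the ASSERTs above. To hunt for the best shift for a divisor, loop shift over some range, take CeilMultiplier(divisor, shift), and keep the shift with the largest limit.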
I'm doing some fun math that I'll post in the next entry, but first some little random shit I've done along the way.
First of all this is just a handy tiny C++ functor dealy for doing numerical integration. I tried to be a little bit careful about floating point issues (note for example the alternative version of the "t" interpolation) but I'm sure someone with more skills can fix this. Obviously the accumulation into "sum" is not awesome for floats if you have a severely oscillating cancelling function (like if you try to integrate a high frequency cosine for example). I suppose the best way would be to use cascaded accumulators (an accumulator per exponent). Anyhoo here it is :
template < typename functor >
double Integrate( double lo, double hi, int steps, functor f )
{
double sum = 0.0;
double last_val = f(lo);
double step_size = (hi - lo)/steps;
for(int i=1;i <= steps;i++)
{
//double t = lo + i * (hi - lo) / steps;
double t = ( (steps - i) * lo + i * hi ) / steps;
double cur_val = f(t);
// trapezoid :
double cur_int = (1.0/2.0) * (cur_val + last_val);
// Simpson :
// double mid_val = f(t - step_size * 0.5); // t is the right end of the interval, so the midpoint is half a step back
//double cur_int = (1.0/6.0) * (cur_val + 4.0 * mid_val + last_val);
sum += cur_int;
last_val = cur_val;
}
sum *= step_size;
return sum;
}
BTW I think it might help to use steps = an exact power of two (like 2^16), since that is exactly representable in floats.
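For the accumulation problem, a cheaper thing to try than cascaded accumulators is Kahan compensated summation. This is a swapped-in technique, not what the code above does; IntegrateKahan and Square are names made up for this sketch, and it assumes the compiler is not run with fast-math style options that reorder floating point.

```cpp
#include <cassert>
#include <cmath>

// Trapezoid integration as above, but "sum" is accumulated with
// Kahan compensated summation to reduce rounding error.
template < typename functor >
double IntegrateKahan( double lo, double hi, int steps, functor f )
{
    double sum = 0.0;
    double comp = 0.0; // running estimate of the low-order bits lost so far
    double last_val = f(lo);
    for (int i = 1; i <= steps; i++)
    {
        double t = ( (steps - i) * lo + i * hi ) / steps;
        double cur_val = f(t);
        double term = 0.5 * (cur_val + last_val); // trapezoid
        double y = term - comp;    // re-inject what was lost last time
        double s = sum + y;        // low bits of y may be lost here
        comp = (s - sum) - y;      // recover what was lost
        sum = s;
        last_val = cur_val;
    }
    return sum * ( (hi - lo) / steps );
}

// Example functor : f(x) = x^2
struct Square
{
    double operator()(double x) const { return x * x; }
};
```

Usage is the same as Integrate, e.g. IntegrateKahan( 0.0, 1.0, 1 << 16, Square() ) should come out very close to 1/3.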
I also did something that's quite useless but kind of educational. Say you want a log2() and for some reason you don't want to call log(). (note that this is dumb because most platforms have fast native log or good libraries, but whatever, let's continue).
Obviously the floating point exponent is close, so if we factor our number :
X = 2^E * (1 + M)
log2(X) = log2( 2^E * (1 + M) )
log2(X) = log2( 2^E ) + log2(1 + M)
log2(X) = E + log2(1 + M)
log2(X) = E + ln(1 + M) / ln(2)
M in [0,1]
Now, obviously ln(1 + M) is the perfect kind of thing to do a series expansion on.
We know M is "small" so the obvious thing that a junior mathematician would do is the Taylor expansion :
ln(1+x) ~= x - x^2/2 + x^3/3 - x^4/4 + ...
That would be very wrong. There are a few reasons why. One is that our "x" (M) is not actually "small". M is equally likely to be anywhere in [0,1] , and for M -> 1 , the error of this expansion is huge.
More generally, Taylor series are just *NEVER* the right way to do functional approximation. They are very useful mathematical constructs for doing calculus and limits, but they should only be used for solving equations, not in computer science. Way too many people use them for function approximation which is NOT what they do.
If for some reason we want to use a Taylor-like expansion for ln() we can fix it. First of all, we can bring our x into range better.
if ( 1+x > 4/3 )
{
E ++
x = (1+x)/2 - 1;
}
If (1+x) is large, we divide it by two and compensate in the exponent. Now instead of having x in [0,1] we have x in [-1/3,1/3] which is better.
The next thing you can do is find the optimal last coefficient. That is :
ln(1+x) ~= x - x^2/2 + x^3/3 - x^4/4 + k5 * x^5
for a 5-term polynomial. For x in [-epsilon,epsilon] the optimal value for k5 is 1/5 , the Taylor expansion. But that's not the range of x we're using. We're using either [-1/3,0] or [0,1/3]. The easiest way to find a better k5 is to take the difference from a higher order Taylor :
delta = k5 * x^5 - ( x^5/5 - x^6/6 + x^7/7 )
Integrate delta^2 over [0,1/3] to find the L2 norm error, then take the derivative with respect to k5 and set it to zero to minimize the error. What you get is :
x in [0,1/3] : k5 = (1/5 - 11/(18*12))
x in [-1/3,0] : k5 = (1/5 + 11/(18*12))
It's intuitive what's happening here; if you just truncate a Taylor series at some order, you're doing the wrong thing. The N-term Taylor series is not the best N-term approximation. What we've done here is found the average of what all the future terms add up to and put them in as a compensation. In particular in the ln case the terms swing back and forth positive,negative, and each one is smaller than the last, so when you truncate you are overshooting the last value, so you need to compensate down in the x > 0 case and up in the x < 0 case.
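For reference, here is the least-squares step written out. This is a reconstruction of the working, so treat it as a sketch: write k5 = 1/5 + d and minimize over d on [0,a].

```latex
\begin{aligned}
\delta(x) &= d\,x^5 + \frac{x^6}{6} - \frac{x^7}{7}, \qquad k_5 = \frac{1}{5} + d \\
\frac{\partial}{\partial d}\int_0^a \delta(x)^2\,dx
  &= 2\int_0^a x^5\,\delta(x)\,dx
   = 2\left( \frac{d\,a^{11}}{11} + \frac{a^{12}}{72} - \frac{a^{13}}{91} \right) = 0 \\
d &= -\frac{11\,a}{72} + \frac{11\,a^2}{91}
\end{aligned}
```

At a = 1/3 the leading term is -11/216 = -11/(18*12), which is the correction quoted above; the x^7 term adds a smaller +11/819 on top of that. On [-a,0] the leading term flips sign, which gives the +11/(18*12) case.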
using k5 instead of 1/5 we improve the error by over 10X :
basic : 1.61848e-008
improved : 1.31599e-009
The full code is :
float log2_improved( float X )
{
ASSERT( X > 0.0 );
///-------------
// approximate log2 by getting the exponent from the float
// and then using the mantissa to do a taylor series
// get the exponent :
uint32 X_as_int = FLOAT_AS_INT(X);
int iLogFloor = (X_as_int >> 23) - 127;
// get the mantissa :
FloatAnd32 fi;
fi.i = (X_as_int & ( (1<<23)-1 ) ) | (127<<23);
double frac = fi.f;
ASSERT( frac >= 1.0 && frac < 2.0 );
double k5;
if ( frac > 4.0/3.0 )
{
// (frac/2) is closer to 1.0 than frac is
// push the iLog up and our correction will now be negative
// the branch here sucks but this is necessary for accuracy
// when frac is near 2.0, t is near 1.0 and the Taylor is totally invalid
frac *= 0.5;
iLogFloor++;
k5 = (1/5.0 + 11.0/(18.0*12.0));
}
else
{
k5 = (1/5.0 - 11.0/(18.0*12.0));
}
// X = 2^iLogFloor * frac
double t = frac - 1.0;
ASSERT( t >= -(1.0/3.0) && t <= (1.0/3.0) );
double lnFrac = t - t*t*0.5 + (t*t*t)*( (1.0/3.0) - t*(1.0/4.0) + t*t*k5 );
float approx = (float)iLogFloor + float(lnFrac) * float(1.0 / LN2);
return approx;
}
In any case, this is still not right. What we actually want is the best N-term approximation on a certain interval. There's no need to mess about, because that's a well defined thing to find.
You could go at it brute force, start with an arbitrary N-term polynomial and optimize the coefficients to minimize L2 error. But that would be silly because this has all been worked out by mathemagicians in the past. The answer is just the "Shifted Legendre Polynomials" . Legendre Polynomials are defined on [-1,1] ; you can shift them to any range [a,b] that you're working on. They are an orthonormal transform basis for functions on that interval.
The good thing about Legendre Polynomials is that the best coefficients for an N-term expansion in Legendre polynomials are just the Hilbert dot products (integrals) with the Legendre basis functions. Also, if you do the infinite-term expansion in the Legendre basis, then the best expansion in the first N polynomials is just the first N terms of the infinite term expansion (note that this was NOT true with Taylor series). (the same is true of Fourier or any orthonormal series; obviously it's not true for Taylor because Taylor series are not orthonormal - that's why I could correct for higher terms by adjusting the lower terms, because < x^5 * x^7 > is not zero). BTW another way to find the Legendre coefficients is to start with the Taylor series, and then do a least-squares best fit to compensate each lower coefficient for the ones that you dropped off.
(note the standard definition of Legendre polynomials makes them orthogonal but not orthonormal ; you have to correct for their norm as we show below :)
To approximate a function f(t) we just have to find the coefficients : C_n = < P_n * f > / < P_n * P_n >. For our function f = log2( 1 + x) , they are :
0.557304959, 0.492127684, -0.056146683, 0.007695622, -0.001130694, 0.000172362
which you could find by exact integration but I got lazy and just did numerical integration. Now our errors are :
basic : 1.62154e-008
improved : 1.31753e-009
legendre : 1.47386e-009
Note that the legendre error reported here is slightly higher than the "improved" error - that's because the Legendre version just uses the mantissa M directly on [0,1] - there's no branch test with 4/3 and exponent shift. If I did that for the Legendre version then I should do new fits for the ranges [-1/3,0] and [0,1/3] and it would be much much more accurate. Instead the Legendre version just does an unconditional 6-term fit and gets almost the same results.
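Here is a minimal sketch of how coefficients like these can be computed numerically. It is not the code used for the numbers above; the function names are made up for the example. It assumes the shift u = 2x - 1 to map [0,1] onto [-1,1], a plain trapezoid rule, and the fact that the norm < P_n * P_n > on [0,1] is 1/(2n+1).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Standard Legendre polynomial P_n(u) on [-1,1] via the recurrence
// (k+1) P_{k+1}(u) = (2k+1) u P_k(u) - k P_{k-1}(u)
double LegendreP(int n, double u)
{
    if (n == 0) return 1.0;
    double p_prev = 1.0, p = u;
    for (int k = 1; k < n; k++)
    {
        double p_next = ((2 * k + 1) * u * p - k * p_prev) / (k + 1);
        p_prev = p;
        p = p_next;
    }
    return p;
}

// The function being approximated : f(x) = log2(1 + x)
double Log2OnePlus(double x) { return std::log(1.0 + x) / std::log(2.0); }

// C_n = < P_n * f > / < P_n * P_n > on [0,1]
std::vector<double> ShiftedLegendreCoefficients(int count, int steps)
{
    std::vector<double> c(count);
    for (int n = 0; n < count; n++)
    {
        // trapezoid rule : half weight on the two end points
        double sum = 0.5 * ( LegendreP(n, -1.0) * Log2OnePlus(0.0)
                           + LegendreP(n,  1.0) * Log2OnePlus(1.0) );
        for (int i = 1; i < steps; i++)
        {
            double x = (double)i / steps;
            sum += LegendreP(n, 2.0 * x - 1.0) * Log2OnePlus(x);
        }
        c[n] = (2 * n + 1) * sum / steps; // divide by the norm 1/(2n+1)
    }
    return c;
}

// Evaluate the truncated series at x in [0,1]
double EvalShiftedLegendre(const std::vector<double>& c, double x)
{
    double result = 0.0;
    for (size_t n = 0; n < c.size(); n++)
        result += c[n] * LegendreP((int)n, 2.0 * x - 1.0);
    return result;
}
```

ShiftedLegendreCoefficients( 6, 1 << 16 ) should land on the first few values listed above; the first one can also be checked by hand, since the integral of log2(1+x) over [0,1] is 2 - 1/ln(2).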
Anyway, like I said don't actually use this for log2 , but these are good things to remember any time you do functional approximation.
Urg. I just spent the entire fucking day trying to track down a replacement stem for my moderately old (mid 90's) racing bike. My bike uses a quill stem, which is pretty rare now (everyone has gone to threadless). The few quill stems you can find are targeted mainly at cruisers / mountain bikes. To make matters worse, even if you find one it has to be a match on several different sizing criteria.
First I had to figure out the sizes of my old stem so I could match. Some of the Stem Measurement guides just made me go WTF. Fortunately I found some nice clear ones like : this and this . So, yeah you just measure the stem along the bar that goes from the steerer axis to the handlebar axis. Road bike stems have a 73 degree angle, which makes them horizontal, a 90 degree stem will point up from the headset. This also had me confused for a second. If you changed from a 73 to a 90, the handlebars will actually do different things when you turn. Road bike handlebars actually dip down when you turn, they sort of lead you into leaning in the right direction. Neat!
Then you have the issue of all the tube diameters. Fortunately Sheldon has that sorted . The steerer tube part is almost always the ISO 1" size (which is 22.2 mm ; I know it should be 25.4 mm, but the 1" refers to the outside of the head tube, while the 22.2 is the diameter of the stem that fits into that). The handlebar diameter is a bit of a bigger problem. My bars don't have any label on them so I had to measure. The common sizes are 25.4, 25.8 or 26.0 mm. Of course I don't have calipers, so the easiest way to measure a diameter is by measuring the circumference with a piece of paper. Hah! Good luck telling the difference between 25.8 or 26.0 with a ruler. Anyway I know I don't have 25.4 , and the 25.8 and 26.0 are considered interchangeable.
Now I finally know what I want. A 22.2 mm - 26.0 mm stem with 100 mm reach. Sadly, the vast majority of cool old quill stems look like this . They have one bolt and the handlebar clamping bit wraps all the way around - you can't just take the handlebars in and out the way you can with modern threadless stems that have face plates like this . What that means is I would have to take off my brakes/shifters and hoods, take off my bar tape, slide the bar ends through the hole and twist it around, then put everything back on. Umm, no thanks, I'll have a face plate please.
Okay, so I need a 22.2 mm - 26.0 mm quill stem with 100 mm reach and a removable face plate. Okay, let's track one down. Well, they do exist. One popular model is the Salsa SUL which is highly recommended in various places. It's been recalled due to catastrophic dangerous metal failure. Umm yikes. Hmmm, well lucky me with more searching I tracked down another. The Deda Murex. Umm.. reputed to fail to hold handlebars , very loose and flexes easily. Umm.. okay, I found another, the Cinelli Frog ! Urg, same problem as the Deda.
I thought I found the mother lode at two cool eBay Stores : New Old Stock (NOS) and Mario's that have lots of classic bike gear. I started looking at a 3TTT Motus. (BTW 3TTT is the most fucking retarded name ever, it's like Mount Fuji or something, and people either write it as "TTT" or "3T" or "3TTT" which annoys the fuck out of my Google searching). So the 3TTT Motus ... many cases of catastrophic clamp failure, strongly advised against. Yay. I think the Mutant might be fine, but I can't find a 100mm mutant.
Finally I found one - the Profile H2O ; it's a very common cheapo stem, I ruled it out at first because I only saw 90 and 105 degree models for the mountain/casual market. Then by chance I discovered they do make a 74 degree, so here I am. A full day later and $1000 of my time down the drain, I now get a shitty down-market aluminum stem.
BTW this might be the worst web site design ever. Oh yes, I know what this text needs - rotating and wiggling to make the viewer sick! And it should take forever to settle down so you have to try to read it while it wiggles or just sit there for minutes. To whoever made this : be ashamed.
I have a much worse problem upcoming with my around-town bike. It's an old bike in the 28.5" size which doesn't exist any more (it's all 700 now - BTW don't take those measurements too literally, they often don't actually correspond to anything that's actually on the bike). There's a lot of stuff I should do to fix it, but the parts are rare and expensive. I suppose I should give it away and get a new one, but it is an old CroMoly lugged frame that hardly exists any more except from bespoke custom builders.
I quite enjoy working on bikes. It's so much simpler and cheaper than working on cars, you can get all the tools you need for $100, and it's very satisfying to fix something with your own hands, and it brings you closer to your machine, you get to know how all the bits really work. I've written this before.
Sadly Seattle is an awful place to bike, but that's another rant. I'm still going, cuz hey, I still love biking even though the pavement quality and traffic and bike routes and countryside accessibility all suck balls here. It's like sex with a condom; yeah it's fucking awful and if you can have real sex you should, but if all you can get is sex with a condom, then you still do it. (people who tell you "well then don't do it" when you complain about something are fucking morons).
Computers destroy bodies in ways you probably aren't even aware of. I'm finally getting weekly massage and weekly PT that's nominally because of my shoulder problems, but really we spend just as much time working on computer-related problems. It costs a fortune. I'll probably spend $10k this year on massage and PT and pilates and chiropractors, and it takes a lot of time, but it's completely worth it.
A lot of people think they're okay because they aren't currently in pain or having numbness or carpal tunnel problems or whatever (BTW carpal tunnel is severely over-diagnosed and is also very easy to avoid; basically it's not a significant problem unless you are a retard and just mouse in a horrible position and ignore all the warning signs your body is sending you). In reality, all of these therapies can't really do much for you once you are in pain and having problems - the best time to work is *before* you have problems to prevent them.
Let me describe what happens to you when you sit at a computer day after day :
1. Your muscles just all generally atrophy because you're not doing a damn thing. Note that even if you work out this can be a big problem because you tend to only work your big movers, so you can get yourself into a dangerous situation where you have over-developed big movers and under-developed structural/stabilizer muscles.
2. Your hips lock up and your hamstrings shorten from sitting with knees bent all the time. Most of you actually sit on your low back, not your sit bones, which puts pressure on the vertebrae and nerves in the low back which can lead to sciatic pain and other nerve impingement disorders of the low body.
3. Your back rounds forward; obviously this happens badly if you slouch, but it also affects most people who try to be good and sit on a physioball or something, because they get tired and start resting on their arms and leaning into the monitor and keyboard. The back rounding forward does a lot of things - it shortens the muscles on the front of the body (mainly the pec minor) and it over-lengthens the muscles on the back (mainly the trapezius and teres major). Permanently stretched or compressed muscles are crippled - they can't execute their movement in their power zone near neutral. Back rounding also puts lots of bad pressure on the vertebrae, it pinches discs on the anterior side. Nerve bundles run out of your vertebrae through little holes and they get squeezed which leads to pain and weakness.
4. Your shoulders roll forward and get weak; partly because of #3. This is mainly because your arms are forward all the time, never above your head or even just relaxed at your side. The weight of your arms pulls the shoulder forward off where it should be resting. The shoulder is a very elaborate and delicate contraption - it doesn't have a ball and socket, the humerus just sort of sits up against the side of the body and is held in place by the rotator cuff tendon-muscular complex. By rolling the shoulder forward, parts of it are stretched and others compressed, which leads to weakness, constriction of nerves and blood flow, and pure mechanical dysfunction (because it's in the wrong place, you can't get the right leverage with the right muscles and your body winds up compensating in bad ways).
Most people who have computer-related numbness or weakness or arm pain actually have nerve pinching due to shoulder problems, not carpal tunnel. The nerve bundles from C5-C7 run through the shoulder and down the arm; they run through very small spaces which is fine if your body geometry is correct, but when you sit at a computer and your body gets all deformed with your head way forward and your upper back kyphotic and your shoulders rounded forward, it screws up the passages that the nerves should run through.
A lot of computer users think they are okay because they are working out or whatever. Certainly that is a good thing and a huge help, however you have to be very careful about how you do it and what you do.
There's a general societal problem that our image of the ideal male body right now is focused on abs and pecs. That leads people to over-develop the anterior muscles. This is like poison for computer users' bodies, because you are already rounded forward and the anterior muscles are over-shortened. Doing a bunch of crunches and bench presses will just make this worse and do nothing to develop the stabilizers that you need. In fact this kind of training can make you even more primed for injury because you're moving heavy weights around and doing extreme athletic things without good stabilizers and basic body geometry.
BTW I love that Wikipedia has an entry for Tramp Stamp .
I've had this idea forever but didn't want to write about it because I wanted to try it, and I hate writing about things before I try them. Anyway, I'm probably never going to do it, so here it is :
It's obvious that we are now at a point where we could use the actual optimal KLT for 4x4 transforms. That is, for a given image, take all the 4x4 blocks and turn them into a 16-element vector. Do the PCA to find the 16 basis vectors that span that 16-element space optimally. Tack those together to make a 16x16 matrix. This is now your KLT transform for 4x4 blocks.
(in my terminology, the PCA is the solve to find the spanning axes of maximum variance of a set of vectors; the KLT is the orthonormal transform made by using the PCA basis vectors as the rows of a matrix).
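To make the first step concrete, here is a minimal sketch that builds the 16x16 matrix from flattened 4x4 blocks and pulls out only the first (highest variance) basis vector with power iteration. This is an illustration, not a full KLT : a real version needs a full symmetric eigensolver to get all 16 rows. The names are made up, and I'm assuming the second-moment matrix is used without subtracting the mean.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

typedef std::array<double, 16> Block; // a 4x4 block flattened to 16 elements

// 16x16 second-moment matrix : sum over blocks of b * b^T
// (assumption : no mean removal)
std::vector<double> BlockCovariance(const std::vector<Block>& blocks)
{
    std::vector<double> cov(16 * 16, 0.0);
    for (const Block& b : blocks)
        for (int i = 0; i < 16; i++)
            for (int j = 0; j < 16; j++)
                cov[i * 16 + j] += b[i] * b[j];
    return cov;
}

// Dominant eigenvector by power iteration : v <- normalize( cov * v )
// Note : the constant start vector fails if it happens to be exactly
// orthogonal to the dominant eigenvector.
Block FirstBasisVector(const std::vector<double>& cov, int iterations)
{
    Block v;
    v.fill(0.25); // unit length : 16 * 0.25^2 = 1
    for (int it = 0; it < iterations; it++)
    {
        Block w;
        w.fill(0.0);
        for (int i = 0; i < 16; i++)
            for (int j = 0; j < 16; j++)
                w[i] += cov[i * 16 + j] * v[j];
        double norm_sqr = 0.0;
        for (int i = 0; i < 16; i++) norm_sqr += w[i] * w[i];
        if (norm_sqr == 0.0) break; // degenerate : keep the previous v
        double inv_norm = 1.0 / std::sqrt(norm_sqr);
        for (int i = 0; i < 16; i++) v[i] = w[i] * inv_norm;
    }
    return v;
}
```

The first KLT coefficient of a block is then just the dot product of the block with this vector; the other 15 rows come from the remaining eigenvectors of the same matrix.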
BTW there's another option which is to do it separably - a 4-tap optimal horizontal transform and a 4-tap optimal vertical transform. That would give you two 4x4 KLT matrices instead of one 16x16 , so it's a whole lot less work to do, but it doesn't capture any of the shape information in the 2d regions, so I conjecture you would lose almost all of the benefit. If you think about it, there's not much else you can do in a 4-tap transform other than what the DCT or the Hadamard already does, which is basically {++++,++--,-++-,+-+-}.
Now, to transform your image you just take each 4x4 block, multiply it by the 16x16 KLT matrix, and away you go. You have to transmit the KLT matrix, which is a bit tricky. There would seem to be 256 coefficients, but in fact there are only 15+14+13.. = 16*15/2 = 120. This is because you know the matrix is a rotation matrix, each row is normal - that's one constraint, and each row is perpendicular to all previous, so the first only has 15 free parameters, the second has 14, etc.
If you want to go one step crazier, you could do local adaptive fitting like glicbawls. For each 4x4 block that you want to send, take the blocks in the local neighborhood. Do the PCA to find the KLT, weighting each block by its proximity to the current. Use that optimal local KLT for the current block. The encoder and decoder perform the same optimization, so the basis vectors don't need to be transmitted. This solve will surely be dangerously under-conditioned, so you would need to use a regularizer that gives you the DCT in degenerate cases.
I conjecture that this would rock ass on weird images like "barb" that have very specific weird patterns that are repeated a lot, because a basis vector will be optimized that exactly matches those patterns. But there are still some problems with this method. In particular, 4x4 transforms are too small.
We'd like to go to 8x8, but we really can't. The first problem is that the computation complexity is like (size)^8 , so 8x8 is 256 X slower than 4x4 (a basis vector has (size^2) elements, there are (size^2) basis vectors, and a typical PCA is O(N^2)). Even if speed wasn't a problem though, it would still suck. If we had to transmit the KLT matrix, it would be 64*63/2 = 2016 coefficients to transmit - way too much overhead except on very large images. If we tried to local fit, the problem is there are too many coefficients to fit so we would be severely undertrained.
So our only hope is to use the 4x4 and hope we can fix it using something like the 2nd-pass Hadamard ala H264/JPEG-XR. That might work but it's an ugly heuristic addition to our "optimal" bases.
The interesting thing about this to me is that it's sort of the right way to do an image LZ thing, and it unifies transform coding and context/predict coding. The problem with image LZ is that the things you can match from are an overcomplete set - there are lots of different match locations that give you the exact same current pixel. What you want to do is consider all the possible match locations - merge up the ones that are very similar, but give those higher probability - hmmm.. that's just the PCA!
You can think of the optimal local bases as predictions from the context. The 1st basis is the one we predicted would have most of the energy - so first we send our dot product with that basis and hopefully it's mostly correct. Now we have some residual if that basis wasn't perfect, well the 2nd basis is what we predicted the residual would be, so now we dot with that and send that. You see, it's just like context modeling making a prediction. Furthermore when you do the PCA to build the optimal local KLT, the eigenvalues of the PCA step tell you how much confidence to have in the quality of your prediction - it tells you what probability model to use on each of the coefficients. In a highly predictable area, the 1st eigenvalue will be close to 100% of the energy, so you should code predicting the higher residuals to be zero strongly; in a highly random area, the eigenvalues of the PCA will be almost all the same, so you should expect to code multiple residuals.
First of all : The number of non-zero values in the lower-diagonal area of the 8x8 block after quantization to a reasonable/typical value.
(36 of the 64 values are considered to be "lower diagonal" in the bottom right area) The actual number :
avg : 3.18
0 : 37.01
1 : 16.35
2 : 9.37
3 : 6.48
4 : 5.04
5 : 3.88
6 : 3.22
7 : 3.02
8 : 2.49
9 : 2.34
10 : 2.09
...
The interesting thing about this is that it has a very flat tail, much flatter than you might expect. For example, if the probability of a given coefficient being zero or nonzero was an independent random event, the distribution would be binomial; it peaks flatter and is then much faster to zero :
avg : 3.18
0 : 13.867587
1 : 28.162718
2 : 27.802496
3 : 17.775123
4 : 8.272518
5 : 2.986681
6 : 0.870503
7 : 0.210458
8 : 0.043037
9 : 0.007553
10 : 0.001150
...
What this tells us is the probability of a given coefficient being zero is highly *not* independent. They are strongly correlated - the more values that are on, the more likely it is that the next will be on. In fact we see that if there are 6 values on, it's almost equally likely there are 7, etc. , that is : P(n)/P(n-1) goes toward 1.0 as n gets larger.
Also, amusingly the first two ratios P(1)/P(0) and P(2)/P(1) are both very close to 0.5 in every image I've tried (in 0.4 to 0.6 generally). What this means is it wouldn't be too awful just to code the # of values on with unary, at least for the first bit (you could use something like an Elias Gamma code which uses unary at first then adds more raw bits).
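As a concrete picture of the Elias Gamma idea, here is a minimal sketch. It returns the code as a string of '0'/'1' characters just for readability (a real coder would write bits), and CodeNumNonZero is a made-up wrapper that sends count+1 since Elias Gamma can't code zero.

```cpp
#include <cassert>
#include <string>

// Elias Gamma code for n >= 1 :
//  floor(log2(n)) zeros (the unary length prefix),
//  then n itself in binary, top bit first.
std::string EliasGamma(unsigned int n)
{
    assert( n >= 1 );
    int num_bits = 0; // floor(log2(n))
    for (unsigned int t = n; t > 1; t >>= 1) num_bits++;
    std::string code(num_bits, '0');
    for (int b = num_bits; b >= 0; b--)
        code += ((n >> b) & 1) ? '1' : '0';
    return code;
}

// Hypothetical wrapper : the count of nonzero values can be 0, so code count+1
std::string CodeNumNonZero(unsigned int count)
{
    return EliasGamma(count + 1);
}
```

With this mapping a count of 0 costs one bit and counts of 1 or 2 cost three bits, so the most common case stays cheap while the long flat tail only grows logarithmically.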
Now for pretty pictures. Everyone has seen graphics like this, showing the L2 energy of each coefficient in the DCT : (none of these pictures include the DC because it's weird and different)
This shows the percentage of the time the value is exactly zero :
Now for some more interesting stuff. This shows the percent of correlation to the block above the current one : (to the north) :
Note in particular the strong correlation of the first row.
The next one is the correlation to the block to the left (west) :
Finally the fun one. This one shows the correlation of each coefficient to the other 63 coefficients in the same block :
The self-correlation is 100% which makes it a white pixel obviously. Black means 0% correlation. This is absolute-value correlation in all cases (no signs). There are a lot of patterns that should be pretty obvious to the eye. Beware a bit in over-modeling on these patterns because they do change a bit from image to image, but the general trend stays the same.
And another one from a different image :
This one's from Lena. A few things I think are particularly interesting - in the upper left area, which is where most of the important energy is, the correlation is most strong diagonally. That is, you see these "X" shape patterns where the center pixel is correlated mainly to its diagonal neighbors, not the ones directly adjacent to it.
I rambled a bit before about how to store floating point images . I think I have the awesome solution.
First of all there are two issues :
Goal 1. Putting the floating point data into a format that I can easily run through filters, deltas from previous, etc. Even if you're doing lossless storage you want to be able to delta from previous and have the deltas make sense and preserve the original values.
Goal 2. Quantizing in such a way that the bits are used where the precision is wanted. Obviously you want to be sending bits only for values that are actually possible in the floating point number.
Also to be clear before I start, the issue here is with data that's "true floating point". That is, it's not just data that happens to be in floating point but could be equally well converted to ints inside some [min,max] range. For example, the coordinates of most geometry in video games really aren't meant to be floating point, the extra precision near zero is not actually wanted. The classic example where you actually want floating point is for HDR images where you want a lot of range, and you actually want less absolute precision for higher values. That is, the difference between 999991 and 999992 is not as important as the difference between 1 and 2.
Now, we are going to be doing some kind of lossy storage. The loss might be very small, but it's kind of silly to talk about storing floating points without talking about lossy storage, because you don't really intend to have 23 bits of mantissa or whatever. To be lossy, we want to just do a simple linear quantization, which means we want to transform into a space where the values have equal significance.
Using some kind of log-scale is the obvious approach. Taking the log transforms the value such that even size steps in log-space are even multipliers in original space. That's good, it's the kind of scaling we want. It means the step from 1 to 2 is the same as the step from 100 to 200. The only problem is that it doesn't match the original floating point representation really, it's more continuous than we need.
What we want is a transform that gives us even size steps within one exponent, and then when the exponent goes up one, the step size doubles. That makes each step of equal importance. So, the quantizer for each exponent should just be Q ~= 2^E.
But that's easy ! The mantissa of the floating point that we already have is already quantized like that. We can get exactly what we want by just pretending our floating point is fixed point !
value = (1 + M)* 2^E
(mantissa with implicit one bit at head, shifted by exponent)
fixed point = {E.M}
That is, just take the exponent's int value and stick it in the significant bits above the mantissa. The mantissa is already quantized for you with the right variable step size. Now you can further quantize to create more loss by right shifting (aka only keeping N bits of the mantissa) or by dividing by any number.
This representation also meets Goal 1 - it's now in a form that we can play with. Note that it's not the same as just taking the bytes of a floating point in memory - we actually have an integer now that we can do math with and it makes sense.
when you cross an exponent :
1.99 = {0.1111}
2.01 = {1.0001}
So you can just subtract those and you get a sensible answer. The steps in the exponent are the correct place value to compensate for the mantissa jumping down. It means we can do things like wavelets and linear predictors in this space.
Now there is a bit of a funny issue with negative numbers vs. negative exponents. First the general solution and what's wrong with it :
Negatives Solution 1 : allow both negative numbers and negative exponents. This creates a "mirrored across zero" precision spectrum. What you do is add E_max to E to make it always positive (just like the IEEE floating point), so the actual zero exponent is biased up. The spectrum of values now looks like :
(positives)   real step size
   3.M              /
   2.M             /
   1.M            /
   0.M           /
  -1.M          /
  -2.M         /
  -3.M        /
 (zero)      +
  -3.M        \
  -2.M         \
  -1.M          \
   0.M           \
   1.M            \
   2.M             \
   3.M              \
(negatives)
What we have is a huge band of values with very small exponents on each side of the zero. Now, if this is actually what you want, then fine, but I contend that it pretty much never is. The issue is, if you actually have positives and negatives, eg. you have values that cross zero, you didn't really intend to put half of your range between -1 and 1. In particular, the difference between 1 and 2^10 is the same as the difference between 1 and 2^-10. That just intuitively is obviously wrong. If you had a sequence of float values like :
{ 2.0 , 1.8 , 1.4, 1.0, 0.6, 0.2, -0.1, -0.3, -0.6, -1.0, -1.4 }
That looks nice and smooth and highly compressible right? NO ! Hidden in the middle there is a *MASSIVE* step from 0.2 to -0.1 ; that seems benign but it's actually a step past almost your entire floating point range. (BTW you might be thinking - just add a big value to get your original floating point data away from zero. Well, if I did that it would shift where the precision was and throw away lots of bits; if it's okay to add a big value to get away from zero, then our data wasn't "true" floating point data to begin with and you should've just done a [min,max] scale instead).
So I contend that is almost always wrong.
Negatives Solution 2 : allow negative numbers and NOT negative exponents. I contend that you almost never actually want negative exponents. If you do want precision below 1.0, you almost always just want some fixed amount of it - not more and more as you get smaller. That can be represented better just with the bits of the mantissa, or by scaling up your values by some fixed amount before transforming.
To forbid negative exponents we make everything in [0,1.0] into a kind of "denormal". We just give it a linear scale. That is, we make a slightly modified representation :
Stored val = {E.M} (fixed point)
Float val =
if E >= 1 : (1 + M) *2^(E-1)
makes numbers >= 1.0
if E == 0 : M
makes numbers in [0.0,1.0)
(M is always < 1, that is we pretend it has a decimal point all the way to the left)
Now the low values are like :
0 : 0.0000
0.5 : 0.1000
0.99 : 0.1111
1.01 : 1.0001
Of course we can do negative values with this by just putting the sign of the floating point value onto our fixed point value, and crossing
zero is perfectly fine.
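Here's a rough sketch of that modified representation in C++, just to make the mapping concrete. The 4-bit mantissa, the truncating rounding and the names FloatToFixed / FixedToFloat are all my assumptions for illustration, not anything fixed by the scheme above.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical mantissa width; the scheme above does not fix one.
const int kMantissaBits = 4;

// Encode a float as a signed {E.M} fixed point value (mantissa is truncated).
int32_t FloatToFixed(double v)
{
    double a = std::fabs(v);
    assert(a < 65536.0); // keeps E small enough for this sketch
    uint32_t E = 0;
    double M = a; // E == 0 : linear "denormal" range [0.0,1.0)
    if (a >= 1.0)
    {
        int e;
        double f = std::frexp(a, &e); // a = f * 2^e with f in [0.5,1)
        E = (uint32_t)e;              // so a = (2f) * 2^(E-1) with 2f in [1,2)
        M = 2.0 * f - 1.0;
    }
    uint32_t mag = (E << kMantissaBits) | (uint32_t)(M * (1 << kMantissaBits));
    return (v < 0) ? -(int32_t)mag : (int32_t)mag;
}

// Decode: E >= 1 gives (1 + M) * 2^(E-1), E == 0 gives M.
double FixedToFloat(int32_t stored)
{
    uint32_t mag = (uint32_t)(stored < 0 ? -stored : stored);
    uint32_t E = mag >> kMantissaBits;
    double M = (double)(mag & ((1u << kMantissaBits) - 1)) / (1 << kMantissaBits);
    double a = (E == 0) ? M : std::ldexp(1.0 + M, (int)E - 1);
    return (stored < 0) ? -a : a;
}
```

With these assumptions 0.2 stores as 3 and -0.1 stores as -1, so the step across zero is a small delta instead of a jump across the whole exponent range.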
Bartosz made an interesting post about extending D for automatic multithreading with some type system additions.
It made me think about how you could do guaranteed-safe multithreading in C++. I think it's actually pretty simple.
First of all, every variable must be owned by a specific "lock". It can only be accessed if the current thread owns that lock. Many of the ownership relationships could be implicit. For example there is an implicit lock for every thread for stuff that is exclusively owned by that thread. That thread always has that lock, so you never actually generate a lock/unlock, but conceptually those variables still have a single lock that owns them.
So, stack variables for example are implicitly automatically owned by the thread they are made on. Global variables are implicitly owned by the "main thread" lock if not otherwise specified. If some other thread tries to touch a global, it tries to take a lock that it doesn't own and you fail.
Lock gate1;
int shared : gate1; // specifies "shared" is accessed by gate1
int global; // implicitly owned by the main thread
void thread1()
{
    int x; // owned by thread1
    x = shared; // fail! you must lock gate1
    {
        lock(gate1);
        x = shared; // ok
    }
    x = global; // fail! you don't own global
}
Mkay that's nice and all. Single threaded programs still just work without any touches, everything is owned by the main thread. I think another goal of threading syntax additions should be that going from a large single threaded code base to adding a few threading bits is easy. It is here.
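You can't get the compile-time check in today's C++, but you can approximate the "variable owned by a lock" idea with a wrapper that only hands out the value while its mutex is held. This is a minimal sketch; LockOwned and WithLock are names I made up for illustration, and nothing stops you from leaking a pointer out of the callback, which is exactly what real language support would forbid.

```cpp
#include <mutex>
#include <utility>

// A value that can only be touched while its own gate is locked.
template <typename T>
class LockOwned
{
public:
    explicit LockOwned(T initial) : value_(std::move(initial)) {}

    // Runs f(value) with the gate held and returns whatever f returns.
    template <typename F>
    auto WithLock(F && f) -> decltype(f(std::declval<T &>()))
    {
        std::lock_guard<std::mutex> hold(gate_);
        return f(value_);
    }

private:
    std::mutex gate_; // plays the role of "gate1" above
    T value_;         // plays the role of "shared" above
};
```

Usage is like `LockOwned<int> shared(5); shared.WithLock([](int & v) { v += 1; });` so there is simply no way to write "x = shared" without going through the gate.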
There are a few things you would need to make this really work. One is a clean transfer of ownership method as Bartosz talks about. Something like auto_ptr or unique_ptr, but actually working in the language, so that you can pass objects from one owner to another and ensure that no refs leak out during the passage.
You can also of course extend this if you don't want the constraint that everything is protected by a lock. For example you could easily add "atomic" as a qualifier instead of a lock owner. If something is marked atomic, then it can be accessed without taking a lock, but it's only allowed to be accessed by atomic operations like cmpx/CAS.
This is a nice start, but it doesn't prevent deadlocks and still requires a lot of manual markup.
I also finally read a bit about Sun's Rock. It's very interesting, I encourage you to read more about it at the Sun Scalable Synchronization page.
Rock is actually a lot like LRB in many ways. It's 16 lightweight cores, each of which has 2 hardware threads (they call them strands). It's basically a simple in-order core, but it can do a sort of state-save push/pop and speculative execution. They have cleverly multi-purposed that functionality for both the Transactional Memory, and also just for improving performance of single threaded code. The state-save push-pop is a ghetto form of out-of-order-execution that we all know and love so much. It means that the chip can execute past a branch or something and then if you go the other way on the branch, it pops the state back to the branch. This is just like checkpointing a transaction and then failing the commit !
The key thing for the transactional memory is that Rock is NOT a full hardware transactional memory chip. It provides optimistic hardware transactions with some well designed support to help software transactional memory implementations. The optimistic hardware transactions basically work by failing to commit if any other core touches a cache line you're writing. Thus if you do work in cache lines that you own, you can read data, write it out, it gets flushed out of the cache to the global consistent view and there are no problems. If someone touches that cache line it invalidates the transaction - even though it might not necessarily need to. That's what makes it optimistic and not fully correct (there are other things too). If it allows a transaction through, then it definitely was okay to do, but it can fail a transaction that didn't need to fail.
There's a lot of problems with that, it can fail in cases that are perfectly valid transactions, so obviously you cannot rely on that as a TM implementation. However, it does let a lot of simple transactions successfully complete quickly. In particular for simple transactions with no contention, the optimistic hardware transaction completes no problemo. If it fails, you need to have a fallback mechanism - in which case you should fall back to your real STM implementation, which should have forward progress guarantees. So one way of looking at the HTM in Rock is just a common case optimization for your STM software. The commit in Rock has a "jump on fail" so that you can provide the failure handling code block; you could jump back to your checkpoint and try again, but eventually you have to do something else.
Perhaps more interestingly, the HTM in Rock is useful in other ways too. It gives you a way to do a more general ll-sc (load linked store conditional) kind of thing, so even if you're not using STM, you can build up larger "atomic" operations for your traditional lock/atomic C++ style multithreading. It can also be used for "lock elision" - avoiding actually doing locks in traditional lock-based code. For example if you have code like :
lock(CS)
x = y;
unlock(CS)
that can be transparently converted to something like :
checkpoint; // begin hardware transaction attempt
x = y;
commit // try to end hardware transaction, if fails :
failure {
lock( CS)
x = y;
unlock(CS)
}
So you avoid stalling if it's not needed. There are of course tons of scary subtleties with all this stuff. I'll let the Sun guys
work it out.
It's also actually a more efficient way of doing the simple "Interlocked" type atomic ops. On x86 or in a strongly ordered language (such as Java's volatiles) the Interlocked ops are fully sequentially consistent, which means they go in order against each other. That actually is a pretty hard operation. We think of the CMPX or CAS as being very light weight, but it has to sync the current core against all other cores to ensure ops are ordered. (I wrote about some of this stuff before in more detail in the threading articles). The "transaction" hardware with a checkpoint/commit lets your core flow through its ops without doing a big sync on interlocked ops. Now, obviously the checkpoint/commit itself needs to be synchronized against all other cores, but it's done on a per cache-line basis, and it uses the cache line dirtying communication hardware that we need anyway. In particular in the case of no contention or the case of doing multiple interlocked ops within one cache line, it's a big win.
I'm sure that Sun will completely cock things up somehow as they always do, but it is very interesting, and I imagine that most future chips will have models somewhat like this, as we all go forward and struggle with the issue of writing software for these massively parallel architectures.
There's a lot of random little shit in C that's technically "undefined" or "implementation-defined" (meaning the compiler/hardware are allowed to do anything they want, or at least get to pick the behavior for you). Basic stuff like right shifting negative signed values, doing signed addition that overflows, or casting between different sizes/types of ints.
It's fucking retarded. It's a dangerous and pointless cop out. C is supposed to be the low-level systems language, it should have ways of clearly exposing the actual function of the system.
First of all, there should just be a "Normal C" machine spec that clearly specifies the behavior that 99% of current hardware has (like right shifting signed values shifts in sign bits). Then we could just say "this platform is a Normal C compliant platform".
But even aside from that, just saying "it's undefined" is a horrible way to support varying platforms. What you should do is have requirements and contracts as meta-code in the programs. A program might start out as only min-spec C. In that case, any use of undefined behavior should be a compile error! You tried to do something the platform does not offer, you fail compile!
Then, if the program needs certain things, it can add them to its requirements list, like "I need signed right shift to behave like this". Now that program will only compile on systems that provide that. The needs of the code are clearly listed and the contract is enforced. It should never be possible to access undefined behavior. The bad behavior should either be forbidden, or it should be enforced to do something specific.
All good robust software should be written this way. It's unforgiveable that our basic language tool isn't.
Instead we have a situation where someone can write code like :
int64 a = ...;
uint32 b = (uint32) ( a >> 16 );
WTF does that do? Does it work on this new platform I'm trying to compile on ?
Of course you/me as a user programmer can help the situation by putting in lots of in-line unit tests, like :
int64 a = ...;
uint32 b = (uint32) ( a >> 16 );
UNIT_TEST( (uint32) ( ((int64)-1) >> 16 ) == 0xFFFFFFFF );
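In C++11 you can get part of the way to a "requirements list" by turning those checks into static_asserts, so the build fails on a platform that doesn't behave the way the code needs. A small sketch; the function name HighBits is just a placeholder of mine, and I'm using the <cstdint> type names rather than the int64/uint32 typedefs above.

```cpp
#include <cstdint>

// "I need signed right shift to shift in sign bits" as a compile-time contract.
static_assert( (int64_t(-1) >> 16) == int64_t(-1),
               "this code requires arithmetic right shift of signed values" );

// Mirrors the UNIT_TEST above. The 64 -> 32 bit unsigned truncation is
// already guaranteed by the language, so this one documents more than it checks.
static_assert( uint32_t( int64_t(-1) >> 16 ) == 0xFFFFFFFFu,
               "this code requires truncating int64 -> uint32 conversion" );

// The code that depends on the contract above.
uint32_t HighBits(int64_t a)
{
    return (uint32_t)( a >> 16 );
}
```

It's still opt-in markup that you have to remember to write, which is the complaint, but at least the failure is at compile time instead of at some customer's site.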
I fucking despise Michael Pollan and all the modern "foodies". However, I am occasionally quite grateful that I can partake of the purveyance that they have made possible. For example : Trader Joe's now carries Burrata (cream filled fresh mozzarella balls). It's enough to make a lactard like myself endure the searing stomach pains. Also, I get to go to this crazy Mangalitsa Pig Feast.
I feel like I should explain and justify my hate a bit more.
Why my hate for Pollan - it's mainly his delivery. Everything he says is basically true, though he does overblow things rather a bit, but it's all really fucking obvious, it's like anybody with half a fucking brain knows that already, and people without a brain aren't actually learning, they're just blindly following the new preacher instead of the old one. The thing that tilts me is his condescending fucking holier than thou smirk, you can see the look on his face like he thinks he's saying something so fucking clever. No you're not.
Why my hate for "foodies" - they're just another niche of yuppie bourgeoisie scum. They're snobby, they focus on the brands and labels. Oh you made porchetta, was it with Mangalitsa or Kurobuta pork? Oh, you made it with that factory stuff, oh well, no thanks then. Fuck you fucking status-conscious bitches. They're frequently condescending while often being incredibly wrong because they learn from fucktards like Food Network; they'll say things like "oh you bought fat asparagus, I only buy the thin stuff" well fuck you fucking condescending dicktard the fat stuff is the good stuff when it's the first spring growth. They're gossips and trend followers; they read the blogs and are always cooking things in the new "right way" to do things. They don't have real love for the ingredients and the history and the flavors and the technique. They spend a fortune on fancy cookware and knives but don't know how to julienne or rock the blade.
In the end the reason why I hate foodies so much is that they're so close to something that I love so dearly, and yet they spoil everything good about it. And they sometimes make food that's better than mine and that really does it.
I wrote a while ago about DOT, the graphviz svg thing.
I thought I'd write a quick advice on using it from what I've learned. The manual dotguide.pdf and the online help are okay, so start there, but keep this in mind :
DOT was designed to make graphs for printing on paper. That leads to some weird quirks. For one thing, all the sizes are in *inches*. It also means that all the fitting support and such is not really what you want, so just disable all of that.
I have the best success when I use the Google method (what I stole from pprof in perftools). Basically they just set the font size and then let DOT make all the decisions about layout and sizing. There's one caveat in that : the way that DOT writes its font size is not supported right by FF3. See : 1 or 2 . There's a pretty easy way to fix this, basically for every font size tag, you need to add a "px" on it. So change "font-size:14.00;" to "font-size:14.00px;" . This change needs to be done on the SVG after dot. I do it by running a grep to replace ",fontcolor=blue" with "px". So in the DOT code I make all my text "blue", it's not actually blue, the grep just changes that to the px I need for my font sizes. So you'll see me output text attributes like "fontsize=24,fontcolor=blue".
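If you'd rather not abuse fontcolor as a marker, the same fix can be done as a direct post-process on the SVG text. This is a different route than the grep trick above, sketched under the assumption that dot writes the sizes exactly in the "font-size:14.00;" form quoted above; AddPxToFontSizes is a made-up helper name.

```cpp
#include <regex>
#include <string>

// Append "px" to every "font-size:<number>;" in the SVG text.
// Sizes that already end in "px" don't match the pattern and are left alone.
std::string AddPxToFontSizes(const std::string & svg)
{
    static const std::regex fontSize("font-size:([0-9.]+);");
    return std::regex_replace(svg, fontSize, "font-size:$1px;");
}
```

Read the whole .svg into a string, run it through this, and write it back out.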
The other big thing is that DOT seems to have zero edge label layout code. And in fact the edge layout code is a bit weak in general. It's pretty good at sizing the nodes and positioning the nodes - it has lots of fancy code for that, but then the edge labels are just slapped on, and the text will often be in horrible places. The solution to this is just to use edge labels as little as possible and make them short.
Another trick to getting better edge positioning is to explicitly supply the edge node ports and put them on a record. I do this in my huffman graphs. The other good edge trick is to use the edge weight (the "weight" attribute on an edge). It doesn't change the appearance of the edge, it changes how the edge is weighted for importance in the layout algorithm, so it becomes straighter and shorter.
For educational purposes I just made a little program to parse MSVC /showIncludes output to DOT :
my dotincludes zip .
dotincludes sample output from running on itself.
Here are some Huffman trees (SVG - you need a FireFox 3+ to view nicely) optimized with various Lagrange multipliers :
Remember it's control-wheel to zoom SVG's in FF.
.. okay since I got started on this, a few other notes. This is ancient stuff, in fact it's been in the "Huffman2" in crblib for 10+ years, and it used to be common knowledge, but it's some of that art that's sort of drifted away.
For one thing, Huffman codes are specified only by the code lens. You don't need to store the bit patterns. Of course then you need a convention for how the bit patterns are made.
The "canonical" (aka standard way) to make the codes is to put numerically increasing codes in order with numerically increasing symbols of the same code length. It helps to think about the bit codes as if the top bit is transmitted first, even if you don't actually do that. That is, if the Huffman code is 011 that means first we send a 0 bit, then two ones, and that has value "3" in the integer register.
You can create the canonical code from the lengths in O(N) for N symbols using a simple loop like this :
nextCode = 0;
for( len between min and max )
{
    codePrefix[len] = nextCode; // first code of this length
    nextCode = (nextCode + numCodesOfLen[len]) << 1;
}
for( all symbols, in increasing symbol order )
{
    code[sym] = codePrefix[ codeLen[sym] ];
    codePrefix[ codeLen[sym] ] ++;
}
this looks a bit complex, but it's easy to see what's going on. If you imagine that you had your symbols sorted
first by code length, and then by symbol Id, then you are simply handing out codes in numerical order, and shifting
left one each time the code length bumps up. That is :
curCode = 0;
for( all symbols sorted by codelen, then by symbol Id )
{
    code[sym] = curCode;
    curCode ++;
    curCode <<= codeLen[sym+1] - codeLen[sym]; // sym+1 = the next symbol in sorted order
}
An example :
code lens :
c : 1
b : 2
a : 3
d : 3
[code=0]
c : [0]
code ++ [1]
len diff = (2-1) so code <<= 1 [10]
b : [10]
code ++ [11]
len diff = (3-2) so code <<= 1 [110]
a : [110]
code ++ [111]
no len diff
d : [111]
very neat.
The other thing is to decode you don't actually need to build any tree structure. Huffman trees are specified entirely by the # of codes of each length, so you only need that information, not all the branches. It's like a balanced heap kind of a thing; you don't actually want to build a tree structure, you just store it in an array and you know the tree structure.
You can do an O(H) (H is the entropy, which is the average # of bits needed to decode a symbol) decoder using only 2 bytes per symbol (for 12 bit symbols and max code lengths of 16). (note that H <= log(N) - a perfectly balanced alphabet is the slowest case and it takes log(N) bits).
a node is :
Node
{
CodeLen : 4 bits
Symbol : 12 bits
}
You just store them sorted by codelen and then by code within that len (aka "canonical Huffman order"). So they are sitting in memory in same order you would draw the tree :
0 : c
10 : b
11 : a
would be
[ 1 : c ] [ 2 : b ] [ 2 : a ]
then the tree descent is completely determined, you don't need to store any child pointers, the bits that you read are the index into the tree. You can find the node you should be at any time by doing :
index = StartIndex[ codeLen ] + code;
StartIndex[codeLen] contains the index of the first code of that length *minus* the value of the first code bits at that len. You could compute StartIndex in O(1) , but in practice you should put them in a table; you only need 16 of them, one for each code len.
So the full decoder looks like :
code = 0;
codeLen = 0;
for(;;)
{
code <<= 1;
code |= readbit();
codeLen ++;
if ( Table[ StartIndex[ codeLen ] + code ].codeLen == codeLen )
return Table[ StartIndex[ codeLen ] + code ].Symbol
}
obviously you can improve this in various ways, but it's interesting to understand the Huffman tree structure this way.
The thing is if you just list out the huffman codes in order, they numerically increase :
0 10 110 111 = 0, 2, 6, 7
If you have a prefix that doesn't yet decode to anything, eg if you have "11" read so far - that will always be a code value that numerically doesn't exist in the table.
Finally, if your symbols actually naturally occur in sorted order, eg. 0 is most likely, then 1, etc.. (as they do with AC coefficients in the DCT which have geometric distribution) - then you can do a Huffman decoder that's O(H) with just the # of codes of each length !
I know of three fast ways to track the mean of a variable. At time t you observe a value x_t. You want to track the mean of x over time quickly (eg. without a divide!). I generally want some kind of local mean, not the full mean since time 0.
1. "weighted decay". You want to do something like M_t = 0.9 * M_t-1 + 0.1 * x_t;
int accum = 0;
accum = ((15 * accum)>>4) + x_t; // at each step t
mean = (accum>>4);
This is the same as doing the true (total/count) arithmetic mean, except that once count gets to some value (here 16) then instead of doing total += x_t, instead you do total = (15/16) * total + x_t; so that count sticks at 16.
2. "sliding window". Basically just track a window of the last N values, then you can add them to get the local mean. If N is power of 2 you don't divide. Of course the fast way is to only track inputs & exits, don't do the sum all the time.
int window[32] = { 0 };
int window_i = 0;
int accum = 0;
accum += x_t - window[window_i]; // at each step t
window[window_i] = x_t;
window_i = (window_i + 1) & 0x1F;
mean = (accum>>5);
3. "deferred summation". This is the technique I invented for arithmetic coding long ago (I'm sure it's an old technique in other fields). Basically you just do a normal arithmetic mean, but you wait until count is a power of two before doing the divide.
int accum = 0;
int next_shift = 3;
int next_count = 1 << next_shift; // must start at the chunk size, not 0, or the first update never fires
accum += x_t;
if ( --next_count == 0 )
{
    mean = accum >> next_shift;
    accum = 0;
    next_shift = MIN( 10 , next_shift+1 );
    next_count = 1 << next_shift;
}
mean is not changing during each chunk, basically you use the mean from the last chunk and wait for another power-of-2 group to be done before
you update mean. (also this codelet is resetting accum to zero each time but in practice you probably want to carry over some of the last accum).
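The three trackers side by side as small C++ structs, so you can drop them in and compare. The struct names are mine, and this assumes the inputs are non-negative (the shifts are standing in for divides).

```cpp
// 1. weighted decay : mean settles to roughly x for a constant input
struct DecayMean
{
    int accum = 0;
    int Update(int x)
    {
        accum = ((15 * accum) >> 4) + x;
        return accum >> 4;
    }
};

// 2. sliding window of the last 32 values
struct WindowMean
{
    int window[32] = { 0 };
    int window_i = 0;
    int accum = 0;
    int Update(int x)
    {
        accum += x - window[window_i]; // add the input, remove the exit
        window[window_i] = x;
        window_i = (window_i + 1) & 0x1F;
        return accum >> 5;
    }
};

// 3. deferred summation : mean only changes at the end of each power-of-2 chunk
struct DeferredMean
{
    int accum = 0;
    int next_shift = 3;
    int next_count = 1 << 3;
    int mean = 0;
    int Update(int x)
    {
        accum += x;
        if (--next_count == 0)
        {
            mean = accum >> next_shift;
            accum = 0;
            next_shift = (next_shift < 10) ? next_shift + 1 : 10;
            next_count = 1 << next_shift;
        }
        return mean;
    }
};
```

Note the integer decay version can stall slightly below the true mean because of the truncation in the shift (a constant input of 100 can settle at 99), and the window version reads low until the window has filled.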
For most purposes "deferred summation" is actually really bad. For the thing I invented it for - order 0 coding of stable sources - it's good, but that application is rare and not useful. For real context coding you need fast local adaptation and good handling of sparse contexts with very few statistics, so delaying until the next pow-2 of symbols seen is not okay.
The "weighted decay" method has a similar sort of flaw. The problem is you only have one tune parameter - the decay multiplier (which here became the shift down amount). If you tune it one way, it adapts way too fast to short term fluctuations, but if you tune it the other way it clings on to values from way way in the past much too long.
The best almost always is the sliding window, but the sliding window is not exactly cheap. It takes a lot of space, and it's not awesomely fast to update.
There have been a number of interesting works in the last few years about different ways to do coding. Some references :
L. Öktem : Hierarchical Enumerative Coding and Its Applications in Image Compressing, Ph.D. thesis
"Group Testing for Wavelet-Based Image Compression"
"Entropy Coding using Equiprobable Partitioning"
"Binary Combinatorial Coding"
The common pattern in all these is the idea of coding probabilistic events using counting. Basically, rather than taking some skew-probability event and trying to create a code whose length matches the probabilities like L = - log2(P) , instead you try to combine events such that you create coding decision points that have 50/50 or at least power of 2 probabilities.
Let's look at a simple example. Say you want to code a bunch of binary independent events, some 001100 sequence with a certain P(1) and P(0) that's constant. For now let's say p = P(1) is very low. (this is common case in image coding, or really any predictive coding where the prediction is very good - a 1 means the prediction was wrong, 0 means it was right; eg. high bands of wavelets are like this (though they aren't independent - more on that later)).
If you just try to code each bit one by one, the average length should be much less than one and you'd have to arithmetic code or something. To create an event that's just 50/50 we code a group of symbols. We want to make the probability that the whole group is off be close to 50%, so :
0.50 = P(0) ^ N
N = log(0.50) / log(P(0)) = -1 / log2( P(0) )
for example if P(1) = 5% , then N = 13.5
so you get 13 or 14 symbols, then code "are they all off, or is something on?" with just 1 raw bit. That's an efficient coding operation because by design you've made that a 50/50 event. Now we have a bit of a puzzle for the next step. If you sent "it's all off" then you're done, but if you sent "something is on" you now have to send more data to specify what is on - but you have already sent some information by sending that something is on so you don't want to just waste that.
There's some crazy difficult combinatorics math to do if you wanted to figure out what the next event to code would be that was a perfect 50/50, but in the end our hand is forced because there's not really anything else we can do - we have to send a bit to say "is the first half on?". I'll present the group testing method without justification then try to justify it after the fact :
You know something in the group of N is on. First send a bit for whether the first half ([N/2]) range is on. If that's on, recurse onto that range and repeat. If the first half was on, you still have to move to your neighboring (N/2) which you don't know if it's on or off. If the first half was off, then move to the neighbor, he must be on, so recurse into him without sending a bit.
Now the justification. First see if there was only 1 value on in the range N, then it would have equal probability of being in the first or second half of the group, so it would be correct to send a bit to specify if it was in the first or second half - that would be equivalent to signalling if the first half is active. Now note that by design of our group size it's by far most likely that there is only 1 value on. For our example case above where P(1) = 5%, the chance of there being exactly 1 value on is around 70%. In that case our code is perfect; for higher values on it's not ideal, but those are less likely, so the net loss is small.
(BTW the binary code to locate a single item in a group is just the binary coding of the index; eg. first you send a bit to indicate if it's in the top half or bottom half, etc.)
The enumerative coding approaches take a similar but slightly different tack. Basically they code the # of values on in the sequence. Once you have the # of values on, then all arrangements are equally likely, so you just need to code the permutation id with a fixed length equiprobable code.
Say for example you want to code 16 binary events
You first send the # on.
If there's 1 on, there are 16 ways for that to happen - just send 4 bits
If there're 2 on, there are 16*15/2 = 120 ways
you should send a 6.906 bit code, but you can just send a 7 bit code
If there're 3 on, you want to send one of 16*15*14/6 = 560 ways
a 10 bit code would suck here, so you want a simple prefix code
that usually writes 9 bits and occasionally 10
This is pretty efficient, particularly when the probability of a bit being on is very low so that you are usually in the low-count cases.
In the high count cases it starts to suck because counting the permutations gets pretty complex. The other problem with this method is
how to encode the # that are on. The problem is the probability of some # being on is a binomial distribution. There's no simple
fixed-bit code for a binomial distribution (I don't think).
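The "permutation id" part can be sketched with the standard combinatorial number system : given which values are on, you get a unique index in [0, C(N,k)) that you can then send with a fixed length code. This is one common way to do the ranking, not necessarily what the papers above use, and the function names are mine.

```cpp
#include <cstdint>
#include <vector>

// C(n,k), computed so that every intermediate division is exact.
uint64_t Binomial(unsigned n, unsigned k)
{
    if (k > n) return 0;
    uint64_t result = 1;
    for (unsigned i = 1; i <= k; i++)
        result = result * (n - k + i) / i;
    return result;
}

// Index of this arrangement among all arrangements of the same length
// with the same number of values on.
uint64_t ArrangementIndex(const std::vector<int> & bits)
{
    uint64_t index = 0;
    unsigned numOn = 0;
    for (unsigned pos = 0; pos < bits.size(); pos++)
    {
        if (bits[pos])
        {
            numOn++;
            index += Binomial(pos, numOn);
        }
    }
    return index;
}

// Size of a plain fixed length code for the index : ceil(log2(C(n,numOn))).
unsigned BitsForIndex(unsigned n, unsigned numOn)
{
    uint64_t ways = Binomial(n, numOn);
    unsigned bits = 0;
    while ((uint64_t(1) << bits) < ways) bits++;
    return bits;
}
```

For N = 16 this gives the numbers above : 16 ways and 4 bits for 1 on, 120 ways and 7 bits for 2 on, 560 ways and 10 bits for 3 on (which is where you'd switch to the 9-or-10 bit prefix code instead).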
Now, for N large, the binomial distribution approaches a normal or geometric distribution. In that case we can use a Golomb code for the # of values on. Furthermore for N large, the coding inefficiency of the # on becomes less important. In the real world I don't think N large is very practical.
Anyhoo I'm not sure any of this is super useful because in practice bits are never independent, so you want to do context coding to capture that, and it's a mess to do with these methods. The GTW (Group Testing for Wavelets) guys capture context information by classifying each sample into classes of similar expected statistics, and also by interleaved scanning so that coefficients with relations are not put into the same class together. The result is impressive, they get compression comparable to ECECOW, but all the work of shuffling values around and multiple scans means they're surely slower than just arithmetic coding.
BTW not related but while I was looking at this I was confused by the following stupidity :
Exponential Distribution == Laplacian == Bernoulli Process == Geometric
All of them are P(n) ~= r ^ n
(or P(n) ~= e ^ ( - lambda * n ) if you prefer)
~ means proportional, ignoring normalization.
(Poisson is similar but different - P(n) ~= r ^ n / n! )
BTW the maximum likelihood estimate of the laplacian parameter is the L1 norm
P(n) ~= e ^ ( - lambda * n ) ; lambda = 1 / (L1 norm of samples)
yeah yeah some of them are continuous and some discrete, some are one sided, some two sided, but people in compression play fast & loose with all of those things, so it's very confusing when you switch from one paper to another and they switch terms. Most people in compression describe the AC coefficients as having a "Laplacian" distribution.
BTW Golomb codes are optimal for coding Laplacian/Geometric distributions.
Mean of geometric distribution with ratio r is = r / (1.0 - r)
(r given the mean is r = m / (1+m) )
The optimal Golomb parameter k satisfies the relation :
r^k + r^(k+1) <= 1 < r^k + r^(k-1)
or
r^k <= 1/(1+r) < r^(k-1)
You can find this k a few ways :
k = froundint( 0.5 - log(1+r)/log(r) );
k = ceil( log(1+r)/log(1/r) );
or
int GetK(double r)
{
    double rk = r; // rk = r^k
    for(int k =1;;k++)
    {
        if ( (1.0 + r)*rk <= 1.0 )
            return k;
        rk *= r;
    }
}
and furthermore we often use the nice approximation
k = froundint( mean )
however, that is not really correct for large mean. It works fine for mean up to around 5.
At mean = 6 , the true optimum k is 5
true entropy : 4.144196
k=5 , Golomb H : 4.174255
k=6 , Golomb H : 4.219482
you can see at k=5 Golomb does very well matching entropy, but using the incorrect k= mean it gets off.
At larger mean it's much worse -
at mean = 19, the correct k is = 14
true entropy : 5.727806
golomb_k : 14 , H : 5.761475
golomb_k : 19 , H : 5.824371
At low mean of course Golomb gets bad because it can't code the 0 in less than 1 bit. You would obviously
use run-lengths or something to fix that in practice. Golomb works very well down to a mean of about 1.0
(that's a geometric r of 0.5).
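A compilable version of the k search, taking either the ratio r or the mean, plus the closed form for comparison. The function names are mine; the math is just the relation r^k <= 1/(1+r) < r^(k-1) from above.

```cpp
#include <cassert>
#include <cmath>

// Smallest k with (1 + r) * r^k <= 1, for a geometric ratio r in (0,1).
int GolombKFromRatio(double r)
{
    assert(r > 0.0 && r < 1.0);
    double rk = r; // rk = r^k
    for (int k = 1; ; k++)
    {
        if ((1.0 + r) * rk <= 1.0)
            return k;
        rk *= r;
    }
}

// Same thing starting from the mean, using r = m / (1+m).
int GolombKFromMean(double mean)
{
    assert(mean > 0.0);
    return GolombKFromRatio(mean / (1.0 + mean));
}

// Closed form : k = ceil( log(1+r) / log(1/r) ).
int GolombKClosedForm(double r)
{
    assert(r > 0.0 && r < 1.0);
    return (int)std::ceil(std::log(1.0 + r) / std::log(1.0 / r));
}
```

This gives k = 5 at mean 6 and k = 14 at mean 19, matching the numbers above, and k = 1 at mean 1.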
One neat thing about geometric distributions is if you code the first symbol a different way, the remainder is still geometric. So you can code the 0 symbol with RLE, then if it's not 0, you code the symbols >= 1 with Golomb, and it's still geometric with the same r and k. (Golomb is still bad there, but it's being done less often).
BTW I tried expressing k in terms of the mean :
k = ceil( log(1+r)/log(1/r) );
k = ceil( log( (1+2m) / (1+m))/log((1+m)/m) )
k = ceil( ( log( 1+2m ) - log(1+m) ) / ( log (1+m) - log(m) ) )
which is sort of handy if you have 'm' as an integer you can use table lookups for the logs and just do one divide, but it's not really super simple. What I'd like is just a Taylor expansion of this that works for m < 30 or so. In particular it should be something like
k = m - 0.2 * m^2 (but not this)
k is like m for m small, but then it needs to correct down. I couldn't work it out, maybe you can.
For the record, you can do a very fast Huffman decoder by always reading 8 bits and storing the bit-reading state in the table pointer. The way you do this is by creating a separate 8-bit lookup table for each possible current bit-reading state.
You have some # of bits [0,8] that have not yet been used for output. For each # of bits there are various values those bits can have, for 0 bits it's {} , for 1 bit it's {0,1}, etc. So your total number of states is 256+128+64... = 512.
So at each step you have your current table pointer which is your decode table plus holds your current bit buffer state. You read 8 more bits and look up in that table. The table tells you what symbols can be made from the combination of {past bit buffer + new 8 bits}. Those symbols may not use up all the bits that you gave it. Whatever bits are left specify a new state.
I'll show an example for clarity :
Say you read 3 bits at a time instead of 8 (obviously we picked 8 because it's byte aligned)
If your Huffman code is :
0 - a
10 - b
11 - c
You make these tables :
{no bits} :
000 : aaa + {no bits}
001 : aa + {1}
010 : ab + {no bits}
011 : ac + {no bits}
100 : ba + {no bits}
101 : b + {1}
110 : ca + {no bits}
111 : c + {1}
{1} : (means there's 1 bit pending of value 1)
000 : baa + {no bits}
001 : ba + {1}
010 : bb + {no bits}
011 : bc + {no bits}
100 : caa + {no bits}
101 : ca + {1}
110 : cb + {no bits}
111 : cc + {no bits}
The decoder code looks like this :
struct DecodeTableItem
{
U8 numOut;
U8 output[MAX_OUT];
DecodeTableItem * nextTable;
};
DecodeTableItem * pTable = table_no_bits_pending;
for(;;)
{
U8 byte = *inptr++;
int num = pTable[byte].numOut;
for(int i=0;i < num;i++)
*outptr++ = pTable[byte].output[i];
pTable = pTable[byte].nextTable;
}
It's actually a tiny bit more complicated than this because you have to handle codes longer than 8 bits. That just means "numOut" is 0 and you have to fall out to a brute force loop. You could handle that case in this same system if you make tables for 9,10,11+ bits in the queue, but you don't want to make all those tables. (you also don't actually need to make the 8 bits in the queue table either). The "numOut = 0" loop will read more bytes until it has enough bits to decode a symbol, output that symbol, and then there will be 0-7 remainder bits that will select the next table.
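Here's a sketch of building those tables for the small 3-bit example. Instead of enumerating every possible pending bit pattern, it uses the internal nodes of the code tree as the states (a pending bit string is always a proper prefix of some code), which is my simplification. It assumes a complete prefix code whose codes all fit in the pending state, so the "numOut = 0" long-code fallback is left out. All names are made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct TrieNode { int child[2]; int symbol; }; // symbol >= 0 marks a leaf

struct TableItem { std::string output; int nextState; };

struct ChunkDecoder
{
    std::vector<TrieNode> trie;                // node 0 is the root = "no bits pending"
    std::vector<std::vector<TableItem>> table; // table[state][chunk], state = trie node
};

// codes[i] is a string of '0'/'1' characters for symbols[i].
ChunkDecoder BuildChunkDecoder(const std::vector<std::string> & codes,
                               const std::string & symbols, int bitsPerStep)
{
    assert(codes.size() == symbols.size());
    ChunkDecoder d;
    d.trie.push_back(TrieNode{ { -1, -1 }, -1 });
    for (size_t i = 0; i < codes.size(); i++)
    {
        int node = 0;
        for (char c : codes[i])
        {
            int bit = c - '0';
            if (d.trie[node].child[bit] < 0)
            {
                d.trie[node].child[bit] = (int)d.trie.size();
                d.trie.push_back(TrieNode{ { -1, -1 }, -1 });
            }
            node = d.trie[node].child[bit];
        }
        d.trie[node].symbol = symbols[i];
    }

    d.table.resize(d.trie.size());
    for (size_t state = 0; state < d.trie.size(); state++)
    {
        if (d.trie[state].symbol >= 0) continue; // leaves are never a pending state
        for (int chunk = 0; chunk < (1 << bitsPerStep); chunk++)
        {
            TableItem item;
            int node = (int)state;
            for (int b = bitsPerStep - 1; b >= 0; b--) // top bit first
            {
                node = d.trie[node].child[(chunk >> b) & 1];
                assert(node >= 0); // sketch assumes a complete prefix code
                if (d.trie[node].symbol >= 0)
                {
                    item.output += (char)d.trie[node].symbol;
                    node = 0;
                }
            }
            item.nextState = node; // leftover bits select the next table
            d.table[state].push_back(item);
        }
    }
    return d;
}

std::string DecodeChunks(const ChunkDecoder & d, const std::vector<int> & chunks)
{
    std::string out;
    int state = 0;
    for (int chunk : chunks)
    {
        const TableItem & item = d.table[state][chunk];
        out += item.output;
        state = item.nextState;
    }
    return out;
}
```

With codes {"0","10","11"} for "abc" and 3 bits per step, the root table and the table for the pending-1 state come out the same as the two tables listed above.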
BTW in practice this is pretty useless because we almost never want to do one giant huffman array decode, we have our huffman codes interleaved with other kinds of coding, or other huffman trees, dependent lookups, etc. (this is also useless because it's too many tables and they are too slow to make on the fly unless you are decoding a ton of data).
BTW old-school compression people might recognize that this is basically a Howard-Vitter Quasi-Arithmetic coder. In the end, every decompressor is just a state machine that's fed by input bits or bytes from the coded stream. What we are doing here is basically explicitly turning our algorithm into that state machine. The Howard-Vitter QA did the same thing for a simplified version of arithmetic coding.
There are a lot of wins for explicitly modeling your decoder as a state machine triggered on input bits. By design of the compressor, the bits in the coded stream are random, which means any branch on them in the decoder is totally unpredictable. You have to eat that unpredictable branch - but you only want one of them, not a whole bunch. In normal function-abstracted decompressors you read the bits in some coder and generate a result and then act on that result outside - you're essentially eating the branch twice or more.
A typical arithmetic decoder does something like :
Read bits from stream into "Code"
Test "Code" against Range and probabilities to find Symbol
-> this is the main branch on the bit stream
if ( Range too small )
do Renormalization
branch on Symbol to update probabilities
do conditional output using Symbol
eg. Symbol might be a runlength or something that makes us branch again
The state-based binary arithmetic coders like the QM or MQ coder combine the bit-reading, renormalization, and model updates into a single state machine.
We just did an interesting thing at RAD that's kind of related to what I've been writing about.
A while ago Dmitry Shkarin (ppmii inventor) posted this code sketch for his Huffman decoder :
Dmitry Shkarin's Huffman reader :
Suppose We have calculated lengths of Huffman codes and minimal
codelength is N then We can read N bits and to stop for most probable
symbols and to repeat reading for other symbols. Decoding procedure will
be something similar to:
extern UINT GetBits(UINT NBits);
struct HUFF_UNPACK_ITEM {
INT_TYPE NextTable, BitsToRead;
} Table[2*ALPHABET_SIZE];
inline UINT DecodeSymbol()
{
const HUFF_UNPACK_ITEM* ptr=Table;
do {
ptr += ptr->NextTable+GetBits(ptr->BitsToRead);
} while (ptr->BitsToRead != 0);
return ptr->NextTable;
}
This will seem rather opaque to you if you don't know about Huffman codes; you can ignore it and read on and still get the point.
Dmitry's decoder reads the minimum # of bits to get to the next resolved branch at each step, then increments into the table by that branch. Obviously the value of the bits you read is equal to the branch number. So like if you read two bits, 00 = 0, 01 = 1, 10 = 2, 11 = 3 - you just add the bits you read to the base index and that's the branch you take.
Okay, that's pretty simple and nice, but it's not super fast. It's well known that a good way to accelerate Huffman decoding is just to have a big table of how to decode a large fixed-size bit read. Instead of reading variable amounts, you always just read 12 bits to start (for example), and use that 12 bit value to look up in a 4096 member table. That table tells you how many bits were actually needed and what symbol you decoded. If more than 12 bits are needed, it gives you a pointer to a followup table to resolve the symbol exactly. The crucial thing about that is that long symbols are very unlikely (the probability of each symbol is like 2^-L for a bit length of L) so you rarely need the long decode path.
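To make the fixed-size table idea concrete, here's a small sketch. It assumes a toy code (0 = a, 10 = b, 110 = c, 111 = d) and a 3-bit table instead of 12 bits, and it reads "bits" from a string of '0'/'1' characters as a stand-in for a real bit reader. `FastEntry`, `BuildFastTable` and `DecodeFast` are hypothetical names, and there is no followup table because no code here is longer than the table width.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct FastEntry { char symbol; uint8_t length; };

// Build a (1<<tableBits) entry table; every slot whose top bits match a code maps to that code.
std::vector<FastEntry> BuildFastTable(int tableBits)
{
    struct Code { uint32_t bits; uint8_t length; char symbol; };
    const Code codes[] = { {0x0,1,'a'}, {0x2,2,'b'}, {0x6,3,'c'}, {0x7,3,'d'} };
    std::vector<FastEntry> table(1u << tableBits);
    for (const Code & c : codes)
    {
        assert(c.length <= tableBits); // this sketch has no long-code fallback path
        uint32_t first = c.bits << (tableBits - c.length);
        uint32_t count = 1u << (tableBits - c.length);
        for (uint32_t i = 0; i < count; i++)
            table[first + i] = FastEntry{ c.symbol, c.length };
    }
    return table;
}

// Decode numSymbols symbols : always peek tableBits bits, then consume only what was needed.
std::string DecodeFast(const std::string & bits, int numSymbols, int tableBits)
{
    const std::vector<FastEntry> table = BuildFastTable(tableBits);
    std::string out;
    size_t pos = 0;
    for (int s = 0; s < numSymbols; s++)
    {
        // peek tableBits bits, padding with zeros past the end of the stream
        uint32_t peek = 0;
        for (int i = 0; i < tableBits; i++)
        {
            peek <<= 1;
            if (pos + i < bits.size() && bits[pos + i] == '1') peek |= 1;
        }
        out += table[peek].symbol;
        pos += table[peek].length;
    }
    return out;
}
```

The short code "a" fills half the table and "b" a quarter of it, which is the memory you are spending to avoid variable-length reads.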
It's pretty obvious that you could extend Dmitry's method to encompass read-aheads like this acceleration table. Instead of just doing GetBits on "BitsToRead", you scan ahead BitsToRead, and then when you take a path you add an extra field like "BitsConsumed" which tells you how many of those bits were actually needed. This lets you make initial jump tables that read a whole bunch of bits in one go.
More generally, in the tree building, at any point you could decide to make a big fast node that wastes memory, or a small binary treeish kind of node. This is kind of like a Judy tree design, or a Patricia Trie where the nodes can switch between linked-lists of children or an array of child pointers. One nice thing here is our decoder doesn't need to switch on the node type, it always uses the same decode code, but the tree is just bigger or smaller.
To be concrete here's a simple Huffman code and possible trees for it :
Huffman codes :
0 : a
10 : b
110 : c
111 : d
Possible trees :
(* means read one bit and branch)
Tree 1)
*----- a
 \
  *--- b
   \
    *- c
     \
      d
4 leaves
3 internal nodes
= 7 nodes total
Tree 2)
[2bit]
[ 00 ]--- a
[ 01 ]--- a
[ 10 ]--- b
[ 11 ]- *- c
         \ d
5 leaves
2 internal nodes
= 7 nodes total
Tree 3)
[3bit] -
[ 000 ] -- a
...
[ 110 ] -- c
[ 111 ] -- d
8 leaves
1 internal node
= 9 nodes total
Tree 4)
*------- a
 \
  [2bit]
  [ 00 ] - b
  [ 01 ] - b
  [ 10 ] - c
  [ 11 ] - d
5 leaves
2 internal nodes
= 7 nodes total
We have these four trees. They have different memory sizes. We can also make an estimate of what the decode time for each tree would be. In particular for the case of Huffman decoding, the expected time is something like the number of branches weighted by 2^-depth of each branch. Reading more bits in a given branch isn't significantly slower than reading 1 bit, we just want as few branches as possible to decode a symbol.
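Here's one way to put numbers on that estimate. This is a sketch under an assumption of mine: each symbol occurs with probability 2^-L for code length L, and the cost of a symbol is the number of branch nodes you pass through to reach it in a given tree. `ExpectedBranches` is a made-up helper name.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Expected branches per decoded symbol : sum over symbols of P(symbol) * branches taken,
// with P(symbol) assumed to be 2^-codeLength.
double ExpectedBranches(const std::vector<int> & codeLengths, const std::vector<int> & branches)
{
    assert(codeLengths.size() == branches.size());
    double expected = 0.0;
    for (size_t i = 0; i < codeLengths.size(); i++)
        expected += std::ldexp(1.0, -codeLengths[i]) * branches[i];
    return expected;
}
```

For the code lengths {1,2,3,3} of a,b,c,d, counting branches per symbol in the four trees gives {1,2,3,3}, {1,1,2,2}, {1,1,1,1} and {1,2,2,2}, so the estimates come out to 1.75, 1.25, 1.0 and 1.5 branches per symbol for Trees 1 to 4. Tree 2 and Tree 4 use the same 7 nodes as Tree 1 but are faster, and Tree 2 beats Tree 4 because it spends its wide node on the likely symbols.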
I'm going to get away from the Huffman details now. In general when we are trying to make fast data structures, what we want is as much speedup as possible for a given memory use. Obviously we could throw 64k of table slots at it and read 16 bits all at once and be very fast. Or we could use the minimum-memory table. Usually we want to be somewhere in between, we want a sweet spot where we give it some amount of memory and get a good speedup. It's a trade off problem.
If you tried all possible trees, you could just measure the Mem use & Time for each tree and pick the one you like best. You would see there is some graph Time(Mem) - Time as a function of Mem. For minimum Mem, Time is high; as you give it more Mem, Time should go down. Obviously trying every tree would be very slow and we don't want to do that.
One way to think about it is like this : start with the tree that consumes minimum memory. Now we want to let it have a little bit more memory. We want the best bang-for-the-buck payoff for that added memory, so we want the tree change that gives us the best speedup per extra byte consumed. That's the optimum d(Time)/d(Mem). Keep doing improvements until d(Time)/d(Mem) doesn't give you a big enough win to be worth it.
Some of you may already be recognizing this - this is just a Rate-Distortion problem, and we can solve it efficiently with a Lagrange multiplier.
Create a Lagrange cost like :
J = Time + lambda * Mem
Now try to build the tree that minimizes J for a given lambda. (ignore how lambda was picked for now, just give it some value).
If you find the tree that minimizes J, then that is the tree that is on the optimal Mem-Time curve at a certain spot of the slope d(Time)/d(Mem).
You should be able to see this is true. First of all - when J is minimum, you must be on the optimal Time(Mem) curve. If you weren't, then you could hold Mem constant and improve Time by moving towards the optimal curve and thus get a lower J cost.
Now, where are you on the curve? You must be at the spot where lambda = - d(Time)/d(Mem). One way to see this is algebraically :
J is at a minimum, therefore :
d(J)/d(Mem) = 0
d(J)/d(Mem) = d(Time)/d(Mem) + lambda * 1 = 0
therefore lambda = - d(Time)/d(Mem)
You can also see this intuitively. If I start out anywhere on the optimal Time(Mem) curve, I can improve J by trading Mem for Time as long as the gain I get in Time exceeds lambda * what I lose in Mem. That is, if - d(Time) > lambda * d(Mem) (d(Time) is negative when the step makes you faster), then I should do the step. Obviously you keep doing that until they are equal. QED.
Since I'm nice I drew a purty picture :
The blue curve is the set of optimal solutions for various Mem parameters. The hard to see yellow tangent is a lambda parameter which is selecting a specific spot on the trade-off curve. The green region above the curve is the space of all possible solutions - they are inefficient solutions because you can improve them by getting to the blue curve. The red region is not possible.
This kind of lagrange space-time optimization has all sorts of applications in games. One example would be your spatial acceleration structures, or something like kD trees or BVH hierarchies for ray tracing. Too often we use hacky heuristics to build these. What you should really do is create a lagrange cost that weighs the cost of more memory use vs. the expected speedup.
One of the nice wins of this approach is that you can often get away with doing a greedy forward optimization for J (the lagrange cost), and it's just a single decision as you build your tree. You just evaluate your current choices for J and pick the best one. You do then have to retry and dial lambda to search for a given Mem target. If you didn't use the lagrange multiplier approach, you would have to try all the approaches and record the Time/Mem of every single possibility, then you would pick the one that has the best Time for the desired Mem.
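The "evaluate J for your current choices and pick the best one" step is tiny in code. A minimal sketch, with made-up names (`Candidate`, `PickMinJ`, `ExampleTrees`) and illustrative numbers: Mem is the node count of each of the four Huffman trees above, and Time is expected branches per symbol assuming symbol probabilities 1/2, 1/4, 1/8, 1/8.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Candidate { double mem; double time; };

// Illustrative candidates (Trees 1..4) : mem = node count, time = expected branches per symbol.
std::vector<Candidate> ExampleTrees()
{
    return { {7, 1.75}, {7, 1.25}, {9, 1.0}, {7, 1.5} };
}

// Return the index of the candidate that minimizes J = Time + lambda * Mem.
size_t PickMinJ(const std::vector<Candidate> & candidates, double lambda)
{
    assert(!candidates.empty());
    size_t best = 0;
    double bestJ = candidates[0].time + lambda * candidates[0].mem;
    for (size_t i = 1; i < candidates.size(); i++)
    {
        double J = candidates[i].time + lambda * candidates[i].mem;
        if (J < bestJ) { bestJ = J; best = i; }
    }
    return best;
}
```

With lambda = 0.1 memory is cheap and the 9-node full table (Tree 3) wins; with lambda = 0.2 the extra two nodes no longer pay for themselves and Tree 2 wins. Dialing lambda to hit a Mem target is just calling this repeatedly with different values.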
In the past I've talked about algorithms being dumb because they are "off the curve". That means they are in the green region of the picture - there's just no reason to use that tradeoff. In general in algorithms you can't fault someone for selecting something on the blue curve. eg. a linked list hash table vs. a reprobing hash table might be at different spots on the blue curve - but they're both optimal. The only time you're being really retarded is when you're way out in green space of wanton inefficiency.
Back to the specifics of the Huffman decoder - what we've found is kind of interesting and maybe I'll write about it in more detail later when we understand it better. (or it might be one of our magic secrets).
ADDENDUM : I should be clear that with the J-cost optimization you still have to consider greedy tree building vs. optimal tree building. What you're gaining is a way to drive yourself towards the blue curve, but it's not like the tree building is suddenly super easy. You also have to deal with searching around in the lagrange multiplier to hit the target you want. For many practical problems you can create experimental heuristics that will give you a formula for the lambda that gives you a certain mem use or rate or whatever.
When we make advances in the art, there are two main ways we do it. One is to push farther along the blue curve than anyone has before. For example in data compression we usually have a run time vs. compression curve. You can run slower and slower algorithms and get more and more compression. You might find a new algorithm that runs even slower and gets even more compression than anyone has before. (PAQ is playing this game now; I used to play this game with PPMZ back in the day). That extends the curve out into areas unexplored. The other advance is to find something that's below the previously known blue curve. You find a way to get a better trade-off and you set a new optimal curve point. You might find a new compressor that gets the same compression ratio as old stuff, but runs way faster.
There's a big gap in all technical literature. You can pretty easily find hand-wavey overviews of things for non-specialists in any field that make you feel all happy, but don't get into any details so they're useless if you actually need to do something in that field. Then there's also an abundance of technical papers that are steeped in the terminology and notation of that particular sub-field and are completely opaque. It's almost impossible to find transition material.
One case of this for me was so-called "Trellis Quantization". Part of the problem was that there's a whole body of old literature on "trellis coded quantization" which is an entirely different thing. TCQ is about designing non-linear or VQ quantizers for sources. Almost all of that research from the old days is totally worthless, because it assumed *NO ENTROPY CODING* on the post-quantization coefficients. Thus it was trying to make quantization buckets which had equal probability. Even at the time that research was happening it was already retarded, but it's especially retarded now. We know that in the presence of an entropy coder, independent variables are R-D optimal with a fixed-step-size deadzone quantizer.
But in the real world our entropy coder is not independent. That is, we use things like run length codes, or adaptive arithmetic coders, or other schemes, such that the way we code one variable affects the code length for the next variable. This is where so-called "trellis quantization" in the modern sense comes in.
I really hate the modern use of the term "trellis quantization" because it's really not about the trellis or the quantization. A better term would be "dynamic programming code stream output optimization". If somebody had just told me that's what it was at the beginning it would have saved me weeks of confusion. It's basically the same thing you do in an LZ optimal parser (though not exactly).
This technique is used mainly in lossy image coders to optimize the coding of a certain bunch of pixels. It can be used for bitplane truncation in zerotree type coders, but it's mainly used for block-transforms, and mainly for video coders (it was made prominent in H263 and is used in H264).
Basically it goes like this : for a given block, you do normal quantization and you get a bunch of coefficients like {17,3,9,4,0,1,2,0}. You could just output that, and you would have a certain rate and distortion. But you could also change some of those coefficients. That would decrease your rate and increase your distortion. (note that the increase in distortion may be very small in some cases - if the original values were very close to a quantization bucket edge, then you can shift them without much pain). You might be able to output a better Rate-Distortion optimal block by choosing some other output, such as {17,3,9,4,0,1,1,0} or {17,0,8,4,0,0,0,0} depending on where you are on the R-D curve.
The "trellis" comes in because you don't need to consider all possible outputs. It's easiest to think of like this :
Say you have 4 binary coding decisions. That is, you could make choices 0000 or 0001, etc. there are 16 possible choices. Naively, you would have to consider all 16 sequences and rate each one and pick the best. If you draw this as a graph, it looks like a tree - each node has two kids, it branches out and you have 16 leaves. But in many coding scenarios, your current coding cost does not depend on the entire history of your sequence - it only depends on the current state. For example, say you are doing order-1 context modeling and binary arithmetic encoding. Then there is a cost to encode a 0 after a 0, a 0 after a 1, a 1 after 0 and a 1 after a 1 (c00,c01,c10,c11). Each path in the graph is a coding action. The graph you need to consider is like this :
     [ 1 ]----[ 1 ]----[ 1 ]----[ 1 ]
    /     \  /     \  /     \  /     \
   /       \/       \/       \/       \
   \       /\       /\       /\       /
    \     /  \     /  \     /  \     /
     [ 0 ]----[ 0 ]----[ 0 ]----[ 0 ]
You start at the far left, as you go along each edge that's a coding output. Any given state, it doesn't matter how you got there, the transitions out of it have the same cost regardless of history. To fill out the graph you start on the far left with a cost of zero, you walk each link, and you fill in the node if the cost you are carrying is lower than what's already in there. Each node only needs to remember the cheapest way to get to that node.
To find the optimal coding you start at the far right and you walk backwards along the links that led you to the cheapest cost.
This graph looks like a "trellis" so these kinds of problems are called "trellis quantization" or "joint trellis subband optimization" or whatever. The key thing about the trellis shape is that for N coding decisions the size of the graph is O(K*N) (if there are K options at each decision point), whereas the full tree branches out to K^N leaves if you had to consider all possible sequences.
This is apparently the "Viterbi algorithm" or some such thing but all the descriptions I've found of that are weird and confusing.
For game developers, this is very similar to A* path finding. A* is actually a form of Lazy Dynamic Programming. Path finding has the character we need because the cost to go between two nodes only depends on those two nodes, not the whole history, thus at each node you only need to remember the shortest path to get to that node.
In H264 this is a nice way to really optimize the output for a given block. It's sort of weirdly called "quantization" because it often consists of jamming values to zero which is kind of like using a larger adaptive deadzone. I really don't like that terminology, because it is in fact NOT a variable quantizer, since it doesn't affect the reconstruction levels in the decoder at all.
Note that in video coding this is a very small bit of a tiny optimization, and the rest of the optimization is a big heuristic mess. The total optimization looks something like this :
Assign bit rate to various frames
(try to optimize R-D ; meet channel overflow and buffer underflow constraints )
Within a frame, assign bits to motion vectors vs. residuals & control codes
Iterate :
choose best motion vectors for current motion vector bit rate allocation
optimize block mode decisions
code residuals to given bit allocation
"trellis" optimize the coding of each block to a lagrangian R-D (R + lambda * D)
oh and also iteratively search around on the lagrange multiplier lambda
to hit the rate you were supposed to
then hold residuals constant and optimize block & motion vector decisions for those residuals
then shift bit allocation between modes, residuals & vectors & repeat
yuck. Oh, and of course per-block coding can't really be independently optimized since context model state carries between blocks. And in intra frames blocks are used to predict other blocks, so if you change one it affects all future ones. Oh, and this optimization is super important.
Okay we talked a bit about block transforms, now let's talk about some of the somewhat weird variants of block transforms that are used in modern standard coders.
With an 8x8 block we're at a big disadvantage. An 8x8 block is like a 3 level wavelet. That's not much, wavelet coders rely on a 5 or 6 level transform normally, which would correspond to a 32x32 block or better. Large block transforms like that are bad because they're computationally complex, but also because they are visually bad. Large blocks create worse blocking artifacts, and also increase ringing, because it makes the high frequency shapes very non-local.
Basically by only doing 8x8 we are leaving a lot of redundancy between neighboring blocks. There's moderate correlation within a block, but also strong correlation across blocks for coefficients of the same type.
H264 Intra frame coding is actually really excellent; it outperforms JPEG-2000 for example. There're a few papers on this idea of using H264 intra coding just for still images, and a project called AIC. (AIC performs worse than real H264 for a few reasons I'll get into later).
"AIC" basically just does 8x8 block DCT's - but it does this interesting thing of pre-predicting the block before the transform. It works on blocks in scan order, and for each block before it does the DCT it creates a prediction from the already transmitted neighbors and subtracts that off. This is a nice page with the details. What this accomplishes is to greatly reduce correlation between blocks. It subtracts off predicted DC so the DC is usually small, and also often subtracts off predicted shapes, so for example if you're in a smooth gradient region it subtracts off that gradient.
Real H264 intra beats "AIC" pretty well. I'm not sure exactly why that is, but I have a few guesses. H264 uses integer transforms, AIC uses floating point (mainly a big deal at very high bit rates). H264 uses macroblocks and various sub-block sizes; in particular it can choose 8x8 or 4x4 sub-blocks, AIC always uses 8x8. Choosing smaller blocks in high detail areas can be a win. I think the biggest difference is probably that the H264 implementations tested do some RDO while AIC does not. I'm not sure exactly how they do RDO on the Intra blocks because each block affects the next one, but I guess they could at least sequentially optimize each block as they go with a "trellis quantizer" (see next post on this).
Okie doke. JPEG XR has similar issues but solves them in different ways. JPEG XR fundamentally uses a 4x4 transform similar to a DCT. 4x4 is too small to remove a lot of correlation, so neighboring blocks are very highly correlated. To address this, JPEG XR groups 4x4 groups of blocks together, so it has a 16x16 macroblock. The DC's of each of the 4x4 blocks get another pass of the 4x4 transform. This is a lot like doing a wavelet transform but getting 4:1 reduction instead of 2:1. Within the 16x16 macroblock, each coefficient is predicted from its neighbor using gradient predictors similar to H264's.
In H264 the gradient predictor is chosen in the encoder and transmitted. In JPEG XR the gradient predictor is adaptively chosen by some decision that's made in the encoder & decoder. (I haven't found the exact details on this). Also in JPEG XR the delta-from-prediction is done *post* transform, while in H264 it was done pre-transform.
If you think about it, there's a whole world of possibilities here. You could do 4x4 transforms again on all the coefficients. That would be very similar to doing a 16x16 DCT (though not exactly the same - you would have to apply some twiddle factors and butterflies to make it really the same). You could do various types of deltas in pre-transform space and post-transform space. Basically you can use the previous transmitted data in any way you want to reduce what you need to send.
One way to think about all this is that we're trying to make the reconstruction look better when we send all zeros. That is, at low bit rates, we will very often have the case that the entire block of AC coefficients goes to zero. What does our output look like in that case? With plain old JPEG we will make a big solid 8x8 block. With H264 we will make some kind of gradient as chosen by the neighbor predictor mode. With JPEG XR we will get some predicted AC's values untransformed, and it will also be smoothed into the neighbors by "lapping".
So, let's get into lapping. Lapping basically gives us a nicer output signal when all the AC's are zero. I wrote a bit about lapping before. That post described lapping in terms of being a double-size invertible transform. That is, it's a transform that takes 2N taps -> N coefficients and back -> 2N , such that if you overlap with neighbors you get exact reconstruction. The nice thing is you can make it a smooth window that goes to zero at the edges, so that you have no hard block edge boundaries.
Amusingly there are a lot of different ways to construct lapped transforms. There is a huge family of them (see papers on VLGBT or some fucking acronym or other). There are lots of approaches that all give you the same thing :
2N -> N windowed basis functions as above
  (nobody actually uses this approach but it's nice theoretically)
Pre & post filtering on the image values (time domain or spatial domain)
  basically the post-filter is a blur and the pre-filter is a sharpen that inverts the blur
  (this can be formulated with integer lifting)
Post-DCT filtering (aka FLT - Fast Lapped Transform)
  basically do the NxN DCT as usual then swizzle the DCT coefficients into the neighboring DCT's
Post-DCT filtering can either be done on all the coefficients, or just on the first few (DC and primary AC coefficients).
Lapping is good and bad. It's not entirely an awesome win. For one thing, the pre-filter that the lap does is basically a sharpen, so it actually makes your data harder to compress. That's sort of balanced by having a better reconstruction shape for any given bit rate, but not always. The fundamental reason for this is that lapping relies on larger local smoothness. eg. for 8x8 blocks you're doing 16-tap lapped transforms. If your signal is actually smooth over 16 taps then it's all good, but when it's not, the lapped transform needs *larger* AC coefficients to compensate than a plain blocked 8-tap DCT would.
The interesting thing to me is to open this up and consider our options. Think just about the decoder. When I get the DC coefficient at a given spot in the image - I don't need to plop down the shape of any certain transform coefficient. What I should do is use that coefficient to plop down my best guess of what the pixels here were in the original. When I get the next AC coefficient, I should use that to refine.
One way to think about this is that we could in fact create an optimized local basis. The encoder and decoder should make the same local basis based on past transmitted data only. For example, you could take all the previously sent nearby blocks in the image, run PCA on them to create the local KLT ! This is obviously computationally prohibitive, but it gives us an idea of what's possible and how far off. Basically what this is doing is making the DC coefficient multiply a shape which is our best guess for what the block will be. Then the 1st AC coefficient multiplies our best guess for how the block might vary from that first guess, etc.
Next part : HDR (High Dynamic Range) coding. One of the things I'd like to do well for the future is store HDR data well, since it's more and more important to games.
The basic issue is that the range of values you need to store is massive. This means 8-bit is no good, but really it means any linear encoding is no good. The reason is that the information content of values is relative to their local average (or something like that).
For example, imagine you have an image of the lighting in a room during the day with the windows open. The areas directly hit by the sun are blazing bright, 10^6 nits or whatever. The areas that are in shadow are very dark. However, if you rotate and look at the image with your back to the sun, your eyes adjust and you can see a lot of detail in the shadow area still. If you encoded linearly you would have thrown that all away.
There are a few possibilities for storing this. JPEG-XR (HD Photo) seems to mainly just advocate using floating point pixels, such as F16-RGB (48 bits per pixel). One nice thing about this "half" 48 bit format is that graphics cards now support it directly, so it can be used in textures with no conversion. They have another 32 bit mode RGBE which uses an 8-bit shared exponent and 3 8-bit mantissas. RGBE is not very awesome; it gives you more dynamic range than you need, and not enough precision.
I haven't been able to figure out exactly how they wind up dealing with the floating point in the coder though. You don't want to encode the exponent and mantissa separately, because the mantissa jumps around weirdly if you aren't aware of the exponent. At some point you need to quantize and make integer output levels.
One option would be to do a block DCT on floating point values, send the DC as a real floating point, and then send the AC as fixed-size steps, where the step size is set from the magnitude of the DC. I think this is actually an okay approach, except for the fact that the block artifacts around something like the sun would be horrific in absolute value (though not worse after tone mapping).
Some kind of log-magnitude seems natural because relative magnitude is what really matters. That is, in areas of the image near 0, you want very small steps, but right near the sun where the intensity is 10^8 or whatever you only need steps of 10^5.
This leads to an interesting old idea : relative quantization. This is most obvious on a DPCM type coder. For each pixel you are encoding, you predict the pixel from the previous, and subtract from the prediction and send the error. To get a lossy coder, you quantize the error before encoding it. Rather than quantize with a fixed size step, you quantize relative to the magnitude of the local neighborhood (or the prediction). You create a local scale S and quantize in steps of {S,2S} - but then you also always include some very large steps. So for example in a neighborhood that's near black you might use steps like {1,2,3,4,5...16,20,...256,512,1024,...10^5,10^6,10^7}. You allow a huge dynamic range step away from black, but you don't give many values to those huge steps. Once you get up into a very high magnitude, the low steps would be relative.
This takes advantage of two things : 1. we see variation relative to the local average, and 2. in areas of very large very high frequency variation, we have extremely little sensitivity to exactly what the magnitude of that step is. Basically if you're looking at black and you suddenly look at the sun you can only tell "that's fucking bright" , not how bright. In fact you could be off by an order of magnitude or more.
Note that this is all very similar to gamma-correction and is sort of redundant with it. I don't want to get myself tangled up in doing gamma and de-gamma and color space corrections and so on. I just want to be a pixel bucket that people can jam whatever they want in. Variable step sizing like this has been around as an idea in image compression for a very long time. It's a good basic idea - even for 8 bit low dynamic range standard images it is interesting, for example if you have a bunch of pixels like {1,3,2,1,200,1,3,2} - the exact value of that very big one is not very important at all. The bunch of small ones you want to send precisely, but the big one you could tolerate 10 steps of error.
While that's true and seems very valuable, it's very hard to use in practice, which is why no one is doing it. The problem is that you would need to account for non-local effects to actually get away with it. For example - changing a 200 to a 190 in that example above might be okay. But what if that was a row of pixels, and the 200 is actually a vertical edge with lots of pixel values at 200. In that case, you would be able to tell if a 200 suddenly jumped to a 190. Similarly, if you have a bunch of values like - {1,3,2,1,200,200,200,1,3,2} - again the eye is very bad at seeing that the 200 is actually a 200 - you could replace all three of those with a 190 and it would be fine. However if your row was like this : {1,3,2,1,200,200,200,1,3,2,1,200,200,2} - and you changed some of the 200's but not the others - then that would be very visible. The eye can't see absolute intensity very well at all, but it can see relative intensity and repetitions of a value, and it can see edges and shapes. So you have to do some big nonlocal analysis to know how much error you can really use in any given spot.
Greg Ward has an interesting approach for backward compatible JPEG HDR. Basically he makes a tone-mapped version of the image, stores that in normal RGB, divides that by the full range original image, and stores the ratio in a separate gray scale channel. This is a good compromise, but it's not how you would actually want to store HDR if you had a custom pixel format. (it's bad because it presumes some tone mapping, and the ratio image is very redundant with the luma of the RGB image).
I'm leaning towards Log-Y-UV of some kind. Probably with only Log(Y) and UV linear or something like that. Greg Ward has a summary of a bunch of different HDR formats. Apparently game devs are actually using his LogLuv : see Christer or Deano or Matt Pettineo & Marco.
One nasty thing about LogLuv is that there's some problems near zero, and everybody is doing slightly different hacks to deal with that. Yay. It also doesn't bilerp nicely.
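To make the "Log(Y)" part concrete, here's a minimal sketch of an integer log encoding of luminance only. The 16-bit range, the 256 steps per factor of two, the +64 bias and the clamp-to-zero hack are all assumptions picked for illustration; this is not meant to match any particular LogLuv variant bit for bit.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical constants for the sketch : 256 code steps per stop
// (factor of two), biased so that Y = 2^-64 lands on code 0.
static const double kStepsPerStop = 256.0;
static const double kBiasStops = 64.0;

// Encode linear luminance as a 16-bit log code.
// Zero and negative values have no logarithm, so they clamp to code 0;
// that's one arbitrary choice among the "slightly different hacks".
inline uint16_t EncodeLogY(double Y)
{
    if (!(Y > 0.0)) return 0;
    double code = std::floor(kStepsPerStop * (std::log2(Y) + kBiasStops));
    if (code < 0.0) code = 0.0;
    if (code > 65535.0) code = 65535.0;
    return static_cast<uint16_t>(code);
}

// Decode to the middle of the quantization bucket (middle in log space).
inline double DecodeLogY(uint16_t code)
{
    return std::exp2((code + 0.5) / kStepsPerStop - kBiasStops);
}
```

With these numbers DecodeLogY(EncodeLogY(Y)) stays within a fraction of a percent of Y across the whole range, which is the "constant relative error" property you want from a log encoding. Note that code 0 decodes to a tiny positive number, not to zero, which is exactly the near-zero ugliness mentioned above.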
ADDENDUM : I should clarify cuz this got really rambly. LogLuv is an *integer* encoding. As a pixel bucket I can of course handle 3 integers of various scales and signs, no problemo. Also, "floating points" that are in linear importance scale are not really a difficult issue either.
The tricky issue comes from "real" floating points. That is, real HDR image data that you don't want to integerize in some kind of LogLuv type function for whatever reason. In that case, the actual floating point representation is pretty good. Storing N bits of mantissa gives you N bits of information relative to the overall scale of the thing.
The problems I have with literally storing floating point exponent and mantissa are :
1. Redundancy between M & E = worse compression. Could be fixed by using one as the context for the other.
2. Extra coding ops. Sending M & E instead of just a value is twice as slow, twice as many arithmetic code ops, whatever.
3. The mantissa does these weird step things. If you just compress M on its own as a plane, it does these huge steps when the exponent changes. Like for the values 1.8,1.9,1.99,2.01,2.10 - M goes from 0.99 to 0.01. Then you do a wavelet or whatever and lossy transform it and it smooths out that zigzag all wrong. Very bad.
So clearly just sending M & E independently is terrible.
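The mantissa step problem in point 3 is easy to see in code. This is just a sketch using std::frexp from the standard library; note that frexp normalizes the mantissa to [0.5, 1) rather than the [1, 2) convention used in the example above, so the numbers differ but the zigzag is the same. SplitFloat and MantissaStep are names made up for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Mantissa and exponent of a positive value, as returned by std::frexp :
// m is in [0.5, 1) and value == m * 2^e.
struct MantExp
{
    double m;
    int e;
};

inline MantExp SplitFloat(double value)
{
    assert(value > 0.0);
    MantExp r;
    r.m = std::frexp(value, &r.e);
    return r;
}

// Size of the jump in the mantissa plane between two neighboring samples.
inline double MantissaStep(double a, double b)
{
    return std::fabs(SplitFloat(a).m - SplitFloat(b).m);
}
```

For the sequence above, MantissaStep(1.8, 1.9) is about 0.05, but MantissaStep(1.99, 2.01) is nearly 0.5 even though the values only differ by 0.02. That big step is what a wavelet or other lossy transform on the M plane would smooth out all wrong.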
What exactly is a block transform like a DCT or whatever, and why do we do it? This is sort of rambling socratic questioning for myself.
We have some 1d signal of length N (that's just N floats). We want to transform it and preserve L2 norm. (Why exactly do we want to preserve L2 norm? It's not strictly necessary, but it means that an error made by quantizing in transformed space has the same L2 size as the error it causes in pre-transform space, which is handy.)
Well, preserving L2 norm is just the same as pretending our signal is a vector in N-d space and preserving its length. That means that our transform is just a rotation (our transform is real and invertible). In particular an N-point transform is a member of SO(N) (or of O(N), if you also allow mirror flips like the Z below).
That's most obvious in 2d so let's start there. I wrote before about the 2d Haar/Hadamard/S-transform and whatnot.
What are all the possible 2d matrices that can transform a signal and preserve length ?
I : (identity)
[1 0]
[0 1]

J : (exchange)
[0 1]
[1 0]

Z : (mirror)
[1  0]
[0 -1]

ZZ = I , JJ = I

K = -JZ = ZJ =
[ 0 1]
[-1 0]

KK = -1
K^T = -K
K^T K = 1

H = Hadamard = J + Z :
[1  1]
[1 -1]

That is all you can make (you can stick arbitrary factors of J or K on there, or -1's). Let's see what kind of linear combination we can make :
R(c,s) = c * I + s * K

R^T * R = ( c * I + s * K ) ^T * ( c * I + s * K )
R^T * R = ( c * I ^T + s * K ^T ) * ( c * I + s * K )
R^T * R = ( c * I - s * K ) * ( c * I + s * K )
R^T * R = ( c^2 * I - s^2 * K*K )
R^T * R = ( c^2 * I + s^2 * I )
R^T * R = ( c^2 + s^2 ) * I

therefore ( c^2 + s^2 ) = 1
therefore c & s are a cosine & sine and R is a rotation
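If you'd rather check that numerically than trust the algebra, here's a small sketch. Mat2, MakeR and IsOrthonormal are names invented for this example; the only thing taken from the derivation is R(c,s) = c * I + s * K.

```cpp
#include <cassert>
#include <cmath>

// Row-major 2x2 matrix : [m00 m01] [m10 m11]
struct Mat2
{
    double m00, m01, m10, m11;
};

inline Mat2 Mul(const Mat2 & a, const Mat2 & b)
{
    return { a.m00 * b.m00 + a.m01 * b.m10, a.m00 * b.m01 + a.m01 * b.m11,
             a.m10 * b.m00 + a.m11 * b.m10, a.m10 * b.m01 + a.m11 * b.m11 };
}

inline Mat2 Transpose(const Mat2 & a)
{
    return { a.m00, a.m10, a.m01, a.m11 };
}

// R(c,s) = c * I + s * K , with K = [0 1] [-1 0]
inline Mat2 MakeR(double c, double s)
{
    return { c, s, -s, c };
}

// True if R^T * R is the identity to within eps, ie. R preserves length.
inline bool IsOrthonormal(const Mat2 & r, double eps = 1e-12)
{
    const Mat2 p = Mul(Transpose(r), r);
    return std::fabs(p.m00 - 1.0) < eps && std::fabs(p.m11 - 1.0) < eps &&
           std::fabs(p.m01) < eps && std::fabs(p.m10) < eps;
}
```

IsOrthonormal(MakeR(cos(a), sin(a))) holds for any angle a, and so does MakeR(0.6, 0.8) since 0.36 + 0.64 = 1, while something like MakeR(1, 1) fails because c^2 + s^2 = 2.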
You see a lot of signal processing papers drawing these fucking signal flow diagrams that I hate. One of the things they like to draw is a "butterfly". There's some ambiguity about what people mean by a "butterfly". The Wikipedia page is using the convention that it's just a Hadamard. People will usually put a little "-" on the line that gets the negative to disambiguate.
Sometimes you'll see a variable written next to a line of a butterfly. That means multiply by that number. We saw in my earlier post how rotations can be made from shears. If you ignore normalization for a moment, we can see that 2d rotations can be made just by applying one of our matrices (such as H), then multiplying one variable by a scalar. In terms of circuit diagrams, I and J are just lines moving around, Z is a negative applied to one line, H is a butterfly. That's all you need to do any 2d rotation.
Some of the lapped transform people confusingly use the "mirrored rotation" matrix :
M(a) = Z * R(a)

M =
[ c  s ]
[ s -c ]

M^T = M
M * M = I

M is its own transpose and its own inverse

btw you can see this geometrically because Z * R(a) = R(-a) * Z

Haar = M( 45 degrees ) = sqrt(1/2) * Hadamard
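Here's the mirrored rotation as code, as a quick sanity check of those identities. Vec2, ApplyM and Haar are made-up names for the sketch; the matrix is the M written out above.

```cpp
#include <cassert>
#include <cmath>

struct Vec2
{
    double x, y;
};

// Apply M(a) = Z * R(a) = [ c s ] [ s -c ] to a 2-vector.
inline Vec2 ApplyM(double angle, Vec2 v)
{
    const double c = std::cos(angle);
    const double s = std::sin(angle);
    return { c * v.x + s * v.y, s * v.x - c * v.y };
}

// Haar = M( 45 degrees ) : scaled sum in x, scaled difference in y.
inline Vec2 Haar(Vec2 v)
{
    const double kPi = 3.14159265358979323846;
    return ApplyM(kPi / 4.0, v);
}
```

Applying ApplyM twice with the same angle gets you back where you started (M is its own inverse), and Haar({x,y}) comes out as sqrt(1/2) * {x+y, x-y} with the length of the vector unchanged.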
Now any N-d rotation can be made from a series of 2d planar rotations. That is in fact how you build the FFT.
Since any such transform is a rotation, the 8-tap DCT must be an 8-d rotation. We could figure out all the angles by thinking about what it does to various vectors. All constant vectors [c,c,c,c,c,c,c,c] are rotated onto the first axis, [k,0,0,0,0,0,0,0] (with k = sqrt(8)*c, since length is preserved); that's one angle, etc. Each of those rotations is just a 2d rotation in some plane or other.
We can bring this back to the PCA/KLT I wrote about before. The PCA finds the axes of principal variation. The KLT is then the rotation matrix which rotates the principal axes to the coordinate axes (that is, it's the spatial frame where the covariance is diagonal).
Now we can ask why one set of axes or the other. Apparently the DCT is the KLT for a simple model of neighbor-correlated data. (specifically, if the correlation matrix is symmetric and tri-diagonal). I'd like to find a simple proof of this, I haven't been able to find it yet. (help). Recently there have been some papers on using the Tchebichef transform (DTT) instead of the DCT. Personally I think this is rather moot because nobody just does transforms and sends the coefficients directly any more. We always take deltas from neighbors or use correlations in other ways, so having transforms that decorrelate more is somewhat irrelevant.
(BTW there is one big win with the DTT - it's based on polynomials so the first AC coefficient is in fact a straight line ramp; with the DCT the first AC is a cosine shape. For synthetic images this could be a big win because the DTT can capture linear gradients exactly in only 2 coefficients).
But let's back up a second. Why are we doing a block transform at all ? It's not obvious. We want to exploit correlation. But there are plenty of other ways to do that. We could use a DPCM method (delta from neighbors), or just a probability-model method (predict similarity to neighbors). That could capture the correlation perfectly well. So it's not that.
Sometimes it's said that it's good for quantization. Hmm, maybe. You can also quantize the error that you would code in a DPCM method (like lossy CALIC). In practice quantizing transforms does work better at high loss, though it's a bit mysterious why that is exactly. There are two main factors I think :
1. Separating the DC and the AC. People make a lot of claims about the DCT basis being a natural fit for the human visual system. I think that's largely hogwash. Certainly once you cut it into 8x8 blocks it's wrong. But one thing that is valuable is the separation of DC (low-frequency intensity - basically just a mipped-down version of the signal), and AC (higher frequency detail). The transform lets you code the DC carefully and throw away the AC.
Now, you could say making a lower res mip version and sending that first is not really inherent to transform coding. You'd be wrong. The problem is you want to make a lower res "DC" version without introducing redundancy. Say you have a 2x2 block of pixels -
| a b |
| c d |

You want to send a lower res version first, such as (a+b+c+d)/4 , and then send the higher res version. Say you want to do this with some kind of DPCM/predictor scheme. Even if you use the lower res to predict the higher res, you are creating redundancy, you're now coding 5 values instead of 4. Of course the way to fix this is to do a 2d Haar transform to create the average & deltas from average in a reversible way that only makes 4 values instead of 5 !!
(note you could just send {a} as the low res version then send {b,c,d} later - that avoids redundancy at the cost of not having as good of a low res version).
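For reference, here's a sketch of the 2d Haar on that 2x2 block. I'm using the orthonormal floating point form (a 4-point Hadamard scaled by 1/2), which is an assumption on my part; a real lossless coder would use an integer-reversible variant instead. Block2x2 and Haar2x2 are names invented for the example.

```cpp
#include <cassert>

struct Block2x2
{
    double a, b, c, d;
};

// Orthonormal 2x2 Haar on pixels | a b | | c d |.
// Output : a = 2 * average, b = horizontal delta, c = vertical delta,
//          d = diagonal delta. Four values in, four values out.
// The matrix is symmetric and orthonormal, so applying the function twice
// returns the input : it is its own inverse.
inline Block2x2 Haar2x2(const Block2x2 & p)
{
    Block2x2 t;
    t.a = 0.5 * (p.a + p.b + p.c + p.d);
    t.b = 0.5 * (p.a - p.b + p.c - p.d);
    t.c = 0.5 * (p.a + p.b - p.c - p.d);
    t.d = 0.5 * (p.a - p.b - p.c + p.d);
    return t;
}
```

So for pixels {10,12,9,13} the first output is 22 (twice the average of 11), you can send that as the low res version, and the three deltas that follow are enough to get the original four pixels back by running Haar2x2 again.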
2. Efficiency (lots of zeros). Block transforms are handy in practice just for a purely practical reason. When you quantize them you get lots of zeros. That lets you do run-length or end-of-block codes that cut off a ton of zeros and save lots of coding ops. Note that I'm distinguishing this from the "energy compaction" which they often talk about in the literature as being a huge win for compression. No, that's not really why we like block transforms - you can get the exact same entropy gain by using the correlation in other ways. The big win is the practical issue.
Question : how do you find/make functions that are exactly invertible in integers?
In particular, what I'd like is a family of cumulative probability distribution functions for arithmetic coding that can be inverted in integers.
Even more specifically, the main cases are semi-laplacian or semi-gaussian functions. Precisely the goal is something like this :
Probability of a symbol is modeled like P(x) ~= e^ ( - lambda * x ) = k ^ -x
(or something like that; P(0) is large, P(255) is small; x in [0,255])
Cumulative probability is the sum of probability of all lower symbols :
C(x) = Sum { y <= x } P(y)
To encode x we send something in [ C(x-1) , C(x) )
We want C(x) to be scaled such that C(255) = 16384 or some other power of 2 constant
We want C(x) to be integers, and for decodability we must have C(x) >= C(x-1) + 1
That is, the integer math truncation must never make a C(x) = C(x-1)
Now, to decode we get back some target number T that's in [ C(x-1) , C(x) ) for some x
We'd like to have an analytic function that gives us x directly from T :
x = D( T )
Now before you say "that's impossible" ; it's obviously not. You can certainly trivially find solutions such as :
C(x) = (C(255)/256) * (x+1)
D(T) = 256 * T / C(255);
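Here's that trivial solution written out with C(255) = 16384, plus a brute force check that D really inverts C for every possible target. The function names are made up for the sketch.

```cpp
#include <cassert>
#include <cstdint>

static const uint32_t kCumTotal = 16384;   // C(255), a power of two
static const uint32_t kNumSymbols = 256;

// C(x) = (C(255)/256) * (x+1)
inline uint32_t CumHigh(uint32_t x)
{
    assert(x < kNumSymbols);
    return (kCumTotal / kNumSymbols) * (x + 1);
}

// Low end of the interval for x : C(x-1), taking C(-1) = 0
inline uint32_t CumLow(uint32_t x)
{
    return (x == 0) ? 0 : CumHigh(x - 1);
}

// D(T) = 256 * T / C(255) , in truncating integer math
inline uint32_t Decode(uint32_t T)
{
    assert(T < kCumTotal);
    return kNumSymbols * T / kCumTotal;
}

// Try every target T in every symbol's interval [ C(x-1) , C(x) ).
inline bool RoundTripsExactly()
{
    for (uint32_t x = 0; x < kNumSymbols; ++x)
        for (uint32_t T = CumLow(x); T < CumHigh(x); ++T)
            if (Decode(T) != x)
                return false;
    return true;
}
```

RoundTripsExactly() comes back true; every symbol just gets a flat 64 slots out of 16384, so it's exactly invertible but useless as a model.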
The question is, are there more useful solutions, and in particular can you construct a whole family of solutions that are parameterized to give you different shapes.
Obviously you can precompute table lookups, but I'd rather not have a whole mess of tables.
I don't really know anything about integer->integer function theory; it seems to me there are two possible approaches to this. One is "constructive" ; start with simple functions that you know you can invert, then there are many operations you can do on them to compose or distort them and still have them invertible.
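As one small data point for the constructive approach, here's a sketch of a geometric shape that does invert exactly in integers : the k = 2 case, built from shifts. Big caveats : this is my own toy construction for illustration, it only covers 15 symbols (x in [0,14]) for a total of 16384, and it is a single fixed shape, not the parameterized family the question is really after.

```cpp
#include <cassert>
#include <cstdint>

static const int kTotalBits = 14;
static const uint32_t kTotal = 1u << kTotalBits;   // 16384
static const int kNumSyms = kTotalBits + 1;        // x in [0,14]

// C(x) = total - (total >> (x+1)) : each symbol takes half of what's left,
// so P(x) = total >> (x+1) ~ 2^-x, and the last symbol gets the final slot.
inline uint32_t GeoCumHigh(int x)
{
    assert(x >= 0 && x < kNumSyms);
    return kTotal - (kTotal >> (x + 1));
}

// Low end of the interval for x : C(x-1), taking C(-1) = 0
inline uint32_t GeoCumLow(int x)
{
    return (x == 0) ? 0 : GeoCumHigh(x - 1);
}

// D(T) : r = total - T lies in ( total >> (x+1) , total >> x ],
// therefore x = kTotalBits - ceil(log2(r)).
inline int GeoDecode(uint32_t T)
{
    assert(T < kTotal);
    const uint32_t r = kTotal - T;
    int ceilLog2 = 0;
    while ((1u << ceilLog2) < r)
        ++ceilLog2;
    return kTotalBits - ceilLog2;
}

// Try every target T in every symbol's interval [ C(x-1) , C(x) ).
inline bool GeoRoundTripsExactly()
{
    for (int x = 0; x < kNumSyms; ++x)
        for (uint32_t T = GeoCumLow(x); T < GeoCumHigh(x); ++T)
            if (GeoDecode(T) != x)
                return false;
    return true;
}
```

The decode is just a count of leading bits, so no tables. Whether you can distort something like this into other values of k while keeping the exact inverse is precisely the open question.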
So there's this game Infinity which I guess has been in development since 1950 or something. Recently the tech post on Deferred lighting was linked around the blogosphere. It led me to this page, and there's some interesting stuff on it.
Back in 1999 or so I started working on my "Galaxy" codebase. My original goal was to develop my own VIPM codebase so that I could use it to make a game where you fly around a galaxy. I wanted to explore continuous scale zooming, to be able to fly from one star system to another, and then right down to planet surfaces. I conjectured that with the current hardware (TNT2) and VIPM technology I could do it and it would be amazing.
It turns out "Infinity" is roughly the same vision. I guess it's not that surprising, it's a common dream. I wanted to make a game kind of like "Trade Wars". I guess "Privateer" also was in dev around that time and was the kind of thing I wanted too, though it wound up sucking. Of course eventually Eve Online came out and did a lot of the things I wanted, but not in the way I wanted - I wanted more of a solo action game, not a team-oriented politics game. I wanted to actually fly my spaceship like in Wing Commander or whatever.
Part of why I wanted to make a space game in 1999 was that I thought the rendering and realism would be a lot easier. You don't have to draw humans or other soft bodies or hair or any of that kind of stuff that we don't do well. You just have lots of shiny metal and such. Everything is rigid bodies so you don't need a fancy animation system. Most of the art would be procedural so I could generate it from code and not have to hire artists except for the ships.
There's a lot of fun rendering and visual stuff you could play with in a Galaxy game; I always wanted to do atmospheric scattering - not just when you're on the surface, but from any altitude, so it would be really volumetric, with the varying density of atmosphere and different particulate composition as you get higher into space (and stuff you would see as the sun(s) go behind other planets and the light passes through the atmosphere around them). You could do fun nebulas and planet's rings, gases that speed up and radiate as they fly around a black hole, etc.
Anyway, I wound up never actually making the game or getting into the real space game technology (you have issues with coordinate precision for example; you need to keep absolute coordinates of everything in doubles and then transform your objects to be camera-relative for render).
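The coordinate precision thing is simple to show. This is a minimal sketch with made-up types (Vec3d, Vec3f) and made-up function names, not code from any of my engines : keep world positions in doubles, subtract the camera position while still in doubles, and only then convert to float for the renderer.

```cpp
#include <cassert>

struct Vec3d
{
    double x, y, z;
};

struct Vec3f
{
    float x, y, z;
};

// Subtract in doubles first, then drop to float for rendering.
inline Vec3f ToCameraRelative(const Vec3d & worldPos, const Vec3d & cameraPos)
{
    return { static_cast<float>(worldPos.x - cameraPos.x),
             static_cast<float>(worldPos.y - cameraPos.y),
             static_cast<float>(worldPos.z - cameraPos.z) };
}

// The naive alternative : convert absolute coordinates straight to float.
inline Vec3f ToFloatNaive(const Vec3d & worldPos)
{
    return { static_cast<float>(worldPos.x),
             static_cast<float>(worldPos.y),
             static_cast<float>(worldPos.z) };
}
```

Put the camera at x = 1e12 and a ship a quarter unit away at 1e12 + 0.25. ToCameraRelative gives exactly 0.25, while ToFloatNaive maps the ship and the camera to the same float, because floats that far from the origin are spaced tens of thousands of units apart.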
The Infinity dev diary describes some good stuff. The terrain heightmap is procedural from noise functions that are evaluated on the GPU to make tiles. Texturing is with a kind of "splat" or "triblend" system; procedural blending by height & slope to select various patterns.
I wonder if you could now make a game that was "always on". Like the Galaxy/Infinity space game, and if you're not logged in to the MMO, time keeps ticking and your ship is still there. If it was an iPhone game or something, you always have your phone on you so you can keep logging back in all the time. You could do things like build your own starbases for trading, and if someone attacks one, it would ring your phone to let you know you should log in. Obviously this would be a horrible thing to do to people but also very compelling.
I wrote before about autogenerating prefs for C++. Well, I went and did it.
It's almost exactly what was discussed before. There's a code generator "AutoReflect" that makes a little include file. You mark variables with //$ to get them processed.
Here's an actual example from Galaxy4 :
class DriverTestPref : public Prefs
{
public :
    AUTO_REFLECT_FULL(DriverTestPref);

    float m_sharedConvergeTime; //$ = 2.2f;
    float m_cubicMaxAccelScale; //$ = 1.75f;
    float m_pdTimeScale; //$ = 2.f;
    float m_pdDamping; //$ = 1.f;
    float m_pdMinVel; //$ = 0.005f;
    float m_interceptConfidence; //$ = 0.5f;
    //float m_test; //$ = 1.f;
};

#include "gApp_DriverTest.aup"
To integrate this with VC 2003, I made AutoReflect just run as a global pre build step. It recurses directories and looks for all the ".aup" files. It checks their modtime against the corresponding .cpp and only runs if the cpp is newer. Even then, it's quite common that the cpp was changed but not in a way that affects the AutoReflect, so I generate the new aup file to a temp name, diff it against the current aup file, and don't touch it if it's the same.
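The "generate to a temp name, diff, don't touch if it's the same" step looks roughly like this. To be clear this is a sketch and not the actual AutoReflect source : it assumes C++17 std::filesystem (which obviously isn't what you get in VC 2003), and ReadWholeFile, ReplaceIfChanged and the ".tmp" suffix are names I made up for the example.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>

namespace fs = std::filesystem;

// Read a whole file into a string (empty string if it can't be opened).
inline std::string ReadWholeFile(const fs::path & path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Write newContents to a temp file next to target, then replace target
// only if the contents actually differ. Returns true if target was touched.
inline bool ReplaceIfChanged(const fs::path & target, const std::string & newContents)
{
    fs::path temp = target;
    temp += ".tmp";   // hypothetical temp naming
    {
        std::ofstream out(temp, std::ios::binary | std::ios::trunc);
        out << newContents;
    }
    if (fs::exists(target) && ReadWholeFile(target) == ReadWholeFile(temp))
    {
        fs::remove(temp);   // identical : leave target's modtime alone
        return false;
    }
    fs::rename(temp, target);   // replaces an existing target file
    return true;
}
```

The point of leaving the file alone when nothing changed is that its modtime stays old, so the build doesn't think anything that includes it needs recompiling.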
That way I don't have to worry about adding custom build steps to lots of files or anything, it's one global pre-build you put in your project once. There are a few minor disadvantages with that :
1. You have to make an "aup" file manually once to get it started. You can do this just by creating the file by hand, or you can run "AutoReflect -f" for "full process" in which case it changes the enumeration to look for all "cpp" files instead of looking for all "aup" files.
2. Fucking MSDev Pre/Post Build Events don't use the machine %PATH% to look for executables !?!?! URG WTF. It means I can't just put "AutoReflect" in there and have it work on various systems, I have to hard code the full path, or put it in one of the MSDev "Executable Directories" paths.
I gather that in VC 2008 the custom build capabilities are much enhanced so maybe there's a better way there, but this is mostly working very nicely. One good thing about doing it as a pre-build is that it doesn't interfere with the normal Make incremental build at all. That is, when the cpp is modified, first I make a new aup, then the cpp is compiled (which includes the new aup). There's no funny business where the cpp gets compiled twice, or it gets compiled before the aup is made or any of those kinds of problems.
ADDENDUM : actually I just realized there is a problem with this method. Because the "pre build" is only run for an F7 "build" and not for a ctrl-F7 "compile" you can compile your file and it doesn't get the new AUP. That's not a disaster, but it mildly sucks, I'd like it to AutoReflect before the compile when I hit ctrl-F7.
For the example above, the actual "aup" generated is :
template <typename T>
void DriverTestPref::Auto_Reflection(T & functor)
{
    REFLECT(m_sharedConvergeTime);
    REFLECT(m_cubicMaxAccelScale);
    REFLECT(m_pdTimeScale);
    REFLECT(m_pdDamping);
    REFLECT(m_pdMinVel);
    REFLECT(m_interceptConfidence);
}

void DriverTestPref::Auto_SetDefaults()
{
    m_sharedConvergeTime = 2.2f;
    m_cubicMaxAccelScale = 1.75f;
    m_pdTimeScale = 2.f;
    m_pdDamping = 1.f;
    m_pdMinVel = 0.005f;
    m_interceptConfidence = 0.5f;
}
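To show what the functor side of that might look like, here's a self-contained toy. The REFLECT macro below is a guess for illustration (the real macro isn't shown here), and TinyPref and DumpFunctor are made-up names; the only thing borrowed from the generated code is the shape of Auto_Reflection and Auto_SetDefaults.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical REFLECT : hands the functor the member's name and a reference.
// This is an assumption for the sketch, not the real macro.
#define REFLECT(member) functor(#member, member)

struct TinyPref
{
    float m_pdTimeScale;
    float m_pdDamping;

    // Same shape as the generated code : visit every marked member.
    template <typename T>
    void Auto_Reflection(T & functor)
    {
        REFLECT(m_pdTimeScale);
        REFLECT(m_pdDamping);
    }

    void Auto_SetDefaults()
    {
        m_pdTimeScale = 2.f;
        m_pdDamping = 1.f;
    }
};

// Example functor : dump every reflected float as "name=value".
struct DumpFunctor
{
    std::vector<std::string> lines;

    void operator()(const char * name, float & value)
    {
        lines.push_back(std::string(name) + "=" + std::to_string(value));
    }
};
```

Call Auto_SetDefaults, then pass a DumpFunctor to Auto_Reflection, and you get one "name=value" line per member. Swap in a functor that parses text or streams binary and the same Auto_Reflection gives you pref loading or IO without writing any per-class code.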
While I was at it I also put my Prefs & TweakVars into a "DirChangeWatcher" so that I get automatic hot reloads and made that all happen by default in Galaxy4. Pleasing.
I plan to not check in the aups to source control. Since they are generated each time you build, I'll treat them like obj's. Again the only problem with this is when someone syncs and doesn't have the aups yet - I can't do my incremental build method until they exist. What I would really like is for the MSDev "Full Rebuild" or "Clean" to run my "AutoReflect -f" for me that would generate the aups.
There's one stupid thing that's still not done in this, which is handling .h vs .cpp ; since you can have autoreflected classes in xxx.h and xxx.cpp , both would generate xxx.aup and I'd have to merge them or something. I could make it generate two separate aups, "xxx.h.aup" and "xxx.cpp.aup" , not sure if that's the right thing to do. (ADDENDUM : yeah, I just did that, I think it's the way to go, it also makes it work with .c or whatever, because for any .aup file I can find the source file by just cutting off the .aup ; it removes all assumptions about the source file extension).
Of course I talk about AutoReflect mainly in terms of "prefs", but it's useful for other things. It basically gives you reflection in C++. One thing I'd like to use it for is to bring back the "IOZ" automatic IO system we did at Oddworld (basically a templated visitor IO that lets you stream things in and out trivially).
Unofficial early releases :
AutoReflect.zip (zip 62k)
cblib.zip (zip 500k)
galaxy4.zip (zip 1.5M)
Also in galaxy4 :
Now on Dx9.
Now shares math/core code with cblib so it's not duped.
New OBB & Hull code as written about earlier (in gApp_HullTest).
New SmoothDriver and test app (gApp_DriverTest) (cubic & pd controller stuff written about long ago).
Some other random shit, like new gFont,
God damn all you drivers on your cell phones with your giant fucking Suburbans and whatnot that I can't see around, and when you come right into my lane I know you wouldn't even feel me in an accident so I just have to make evasive maneuvers.
God damn all you rubberneckers. Just because there's a fucking cop stopped on the opposite side of the freeway doesn't mean you need to slam on your brakes and look over there. JUST FUCKING GO. Your job when you are driving is to rapidly vacate the space you are in so someone else can use it. Get the fuck on. God damn all you slow movers leaving giant gaps. When you leave a huge gap in front of you, people just keep tucking in to it, and that slows down every single person behind you. It's not courteous, it's fucking rude to the entire line of cars in your lane behind you. God damn all you people who don't accelerate after getting past a jam up. It's your duty to get the flow moving again. When you get through a constriction, like a 3->2 lane reduction, it's your duty to step on it to create a pressure drop to pull the people behind you in more quickly.
God damn you people who don't park up against the edge of the possible parking area, so that we only get two cars on a curb that should fit three. God damn you people who pull up to a red light on a street with two lanes and block the right lane when the left was open, preventing me from turning right on red.
God damn KEXP for playing fucking WoPop and Shake the Shack and all that weirdo awful music right through the drive time rush hour. God damn Seattle for having all the good game companies on the east side and all the good hipsters on the west side. God damn all you motherfuckers cheating in the carpool lane; yes I see you, and yes it's a 3 person carpool lane, 2 does not cut it. God damn all you onramp users who drive as far as possible forward in the carpool lane and then jam yourself into the line unsmoothly, resulting in blocking the carpool lane and causing a big fracas in the right lane because you didn't zipper where you had a good opening.
God damn ING Direct for asking me so many fucking questions every time I log in. I can't remember them all so I have to write them down on a piece of paper next to my monitor. Great fucking security system, thanks a lot for making my transaction more secure. In fact it is great for them because they have pushed the fault to me and can claim that they did everything they should.
There's a ton of RPG's and Fantasy RPG/RTS games I never heard of :
King's Bounty: The Legend
Drakensang: The Dark Eye
Sacred 2: Fallen Angel
Mount & Blade
Elven Legacy
Kohan II: Kings of War
I hate the feeling of being a slave to the traffic schedule. I hate waking up in the morning around 8 and knowing I can't leave for work until after 10. I'm ready to go! Fuck. Then around noon each day at work it hits me that I'm gonna be stuck until 7 or 7:30. Often that's fine, I'm gonna work until then anyway, but just knowing that I'm fucking stuck and don't have my liberty gives me a flash of panic and frustration. I just want to work until my brain is exhausted and then go home.
I have the same kind of feeling at home with the noisy neighbors. They haven't really even been that bad recently, but every time I hear a bit of noise I worry if it's just a precursor to a storm. Even if nothing happens it makes me tense and worried. Every day if I think about going to sleep early I wonder if it will be okay or if I should stay up and wait for them to go to sleep first.
Here's a puzzle for you : WTF is a Cafe Creme ? The French style espresso with cream; I mean, I know what it is, but what *exactly* ? How would I get one in the US ? It's not exactly just an espresso with cream added; maybe the cream is foamed or something; I think the shots are long pulls too but can't say for sure.
The Julie and Julia movie literally makes me sick to my stomach. I adore Julia Child much like I adore Jacques Pepin. She (Julia) was a nut, someone completely full of life, an independent spirit, a trailblazer. This fucking Julie whore is some stupid whiney blogger. Get the fuck out of my movie about Julia Child. How dare you even be mentioned in the same breath. I'm revolted by the glorification of bloggers. Look I like reading blogs as much as the next guy, but I know perfectly well it's akin to reading tabloids. It's trash, it's insignificant.
The revolting trend of food bloggers who think way too much of themselves makes me want to just check out of life so that I don't have to be connected to them in any way. I'm sick of reading fucking Michael Ruhlman constantly blogging about the media ventures with other bloggers or the success of his fucking book. I'm sick of the GastroGnome name dropping about what fucking special invite-only food event she got invited to. It's fucking self-indulgent self-aggrandizing gossip. Write about fucking food and that's all. If you want to write a cookbook that's about the recipes, fine fine. But don't write a fucking book about being a food blogger! Be more like Dave Lebovitz.
The NYT today had a pretty severe misrepresentation of the truth. In the chart of infectious diseases you see the typical big killers, like the 1918 Spanish Flu, but then you see Smallpox listed in 1947 with three cases.
!?
I guess that's sort of true, but the actual number for 1947 in NYC was 12 cases. In 1947 Smallpox was well gone from the US. It was still killing in the rest of the world. Suddenly 3 cases appeared; the government quickly enacted a vaccination program and quarantined those people; the total infected reached 12 but the spread of the virus was fully controlled.
The point is it's bizarre to list that particular 1947 small outbreak in NYC as "smallpox" on the chart (everything else on the chart lists worldwide effects). It leads you to think smallpox wasn't a big deal. Au contraire. In the first half of the 20th century, smallpox was still a virulent killer. In fact, it was so common that you would hardly even say it was an "epidemic", it was just constant; hundreds of thousands of people died from it every single year (sort of like Malaria still is) (I guess that's called "endemic").
In fact you can quote a better NYT article on smallpox :
Smallpox killed more people over the ages than any other infectious disease. In the 20th century alone, experts estimate, it took up to a half billion lives, more than all the wars and epidemics put together.
(I'm guessing most of that was in the 3rd world (?) ; the Smallpox vaccine was invented around 1800, but it wasn't eradicated until 1977 ; I haven't seen good numbers on when widespread vaccination was adopted in the 1st world)
The Wikipedia on Smallpox also has this nasty number :
The disease killed an estimated 400,000 Europeans each year during the 18th century
That's 40M in the century, which is a hell of a lot when you consider the low population of Europe in the 18th century. (population of Europe was 100M for almost all of the 18th century, then shot up to 200M near the end with the Industrial Revolution and the growth of cities).
Smallpox is also extremely important historically. The first vaccination was smallpox. (in fact the term vaccination comes from vache or vaca for cow - it was a cowpox , a variant of smallpox, injected as a proxy virus that was less deadly and built the right antibodies).
Smallpox was also the secret weapon of colonists. It's the primary way that Cortez was able to defeat the Aztecs.
I finally got my HTPC working with S3 sleep. It's pretty easy actually, you just set it in the BIOS, and then there's a Windows setting you have to do. I kept it simple and just disabled all USB-wake options, which can apparently cause some problems.
For those out of the loop S3 Sleep for desktops is just like the good way that laptops have been sleeping forever - it turns off everything except your RAM. The normal S1 Sleep for desktops is pretty retarded because fans and disks and such keep running; what's the point of that?
Anyway, everything was working fine and I was all happy, until I noticed my volume control stops working after a sleep. So I went to the M-Audio web page and found :
Sleep (Hibernation) Mode
Q: Can I use sleep or hibernation mode with my M-Audio device?
A: M-Audio does not support sleep mode with any of its devices. The ability to re-establish a connection with a 3rd party device driver after waking from sleep or hibernation mode is system dependent. Because of this, M-Audio recommends setting your computer to never go to sleep. Instead, it is recommended that you turn off your monitor, or completely power off the computer. If your M-Audio device is not working after waking from sleep mode, restart your computer to re-establish a connection with the device.
!? WTF !? Friggle frack fucking douchebag morons. It just blows my mind how so many people think it's okay to be like "oh yeah our stuff is totally broken, that's a known feature and we're not going to fix it". WTF. How can you sleep at night? (See also 1 , 2 , 3 )
In better news on the HTPC front, I finally got the AMD Cool & Quiet working, and it's actually pretty awesome. You have to install the latest AMD Processor Driver, and then set your power scheme to "Minimal" and it automatically kicks in. You can also run the AMD Processor Monitor to watch your MHz go up and down.
That on its own doesn't do a whole lot for you, but I also got the variable fan speeds working. I'm running two 120 mm Scythe quiet fans that are supposed to run super slow. I had one of them plugged in the "NB Fan" power plug, and I had another plugged directly to the power supply, dunno why I did that. Turns out on my Gigabyte mobo, it does fan speed stepping if you plug them in to "CPU Fan" and "SYS Fan" power plugs on the mobo. Now both of them are running super slow all the time and the whole thing is super quiet.
I still have a slight whir from the machine, which I think is the PSU fan, so some day I may have to replace that. I also noticed while I was in there that the whole machine is absolutely jammed full of dust even though it's only 6 months old. A computer sitting in a living room with fans running all the time is basically an air filter, it's sucking all the dust in the room right through its central cavity. Then it gets stuck in the fans and they get loud.
... oh and I also finally took care of the damn "alt key getting stuck" problem. I just wrote a program to look for alt being down and to send an alt-up-down itself. It just uses GetAsyncKeyState to check for keys being down then uses SendInput to toggle them. BTW while I was in there I discovered VkKeyScan, MapVirtualKey, and ToAscii which I somehow didn't know about and can be useful for eg. converting between ascii and VK codes.
In other news I was thinking about playing another action-RPG video game. I can't run anything newer than like 2005 or so, nor can I run any console games. I tried Morrowind a while ago, but the graphics just look like ass to me. I really find 3d unbearably ugly 99% of the time, I'd much rather have painted backgrounds. I've played all the old Icewind Dale and Baldur's Gate games; ideally I'd find something like that.
I might try Dungeon Siege again; I played it when it came out and thought it was ass. Also there's this new one "Titan Quest" but it might push the graphics too hard for my old lappy.
Kim linked this article about some fruity pants board games. I think it's a great article, because I think it's a fantastic example of totally retarded wrong-headed board game design.
I've been going to board game night recently here, and it's been rather hit or miss. There are some good interesting games, but a lot of stinkers. (BTW I classify board games & card games as pretty much the same thing).
Good board games create a dynamic system where you can make real strategic choices. That is, you should have decision points where there is not simply one provably best move. There should be multiple choices that allow you to play in different styles.
The most interesting thing in real board games is when there's interaction between the players; playing against the rule system is not interesting, it's always the same, but playing against other people adds variety and lots of metagame; the metagame of the human interaction outside the rule system is perhaps the most interesting part of board games. Really retarded board games like Monopoly or Risk can only be saved through the metagame. A lot of bad board games basically have you playing against yourself, to try to get as high a score as possible, other people are playing too but you don't really interact much with each other.
Conceptual games like "Train" are some of the worst. There are a lot of them, if you go into any board game shop probably half of them are what I would call "conceptual". That is, the whole point of these games is basically just the idea of them - the setting, oh aren't the characters cute, oh this one is about killing kittens to harvest their souls, what a cute idea. The actual play of these games is trivial and uninteresting. If you want a freaking story, read a book. This is not an appropriate use of the board game medium.
Rather than having a moment of "revelation" in conceptual games, I usually have a moment of abject depression and futility when I realize that the game I'm playing is totally pointless and retarded and I'm just rolling dice and flipping cards for no good reason. Someone could've just told me the concept in one sentence and then we wouldn't have had to waste our time playing this damn game.
Which brings me to the next type of game that sucks - games where all the difficulty and "skill" just come from overly complex rules. For one thing, it sucks playing these games for the first time because you spend hours trying to learn them. Then once you learn the basic rules it takes a little while to really learn what the strategy dynamic is. Then once you understand the strategy, you realize the game is totally trivial. If you have opponents who actually know how to play, then the game is random. Once you get to the level of understanding the game system, then these games have no more complexity at all. There are actually some games that are widely considered to be good games that fall into this category, such as Settlers or Domaine. Those games can be saved by playing in a group and relying on the human metagame to make them more interesting, but basically they are just very complex strategic systems that don't have real depth. Perhaps the most extreme example is the old game Axis & Allies where the static initial layout made it so that once you were an expert your entire move was predetermined and the game just became a giant dice toss. Random board setups prevent the complete destruction of a game in that way, but don't actually make the strategic system any more interesting.
The test for whether a board game is actually really good goes something like this : 1. convert all the pieces to abstract solid colored blocks, convert the board to just squares with no graphics. 2. teach all the players the strategy so they know exactly what the right way to play is. We've reduced the appeal of the game to only its rule system, and we've removed all the difficulty due only to unfamiliarity or over-complex rules. Is it still interesting to play ? If not, then it's basically a garbage game dressed up in a pretty outfit. The truly great games are still interesting in this format (chess, go, poker, diplomacy) but most are shit.
A really good game (chess, go, poker) is a beautiful thing. It can teach you about yourself and the universe. As you get better and better, you keep reaching plateaus where you see something new in the strategy system that you didn't even know was there before. You can play opponents who use some weird style that you're not familiar with that can teach you new things about the game.
I certainly don't romanticize the historical oppression of women (neither the near-slavery of long ago nor the social repression of 1950's America). However, there is something romantic and appealing about complementarity. Complementarity is the idea that you and your lover should be very different, have different skills, and together make a whole. It makes you value the other person, it makes you need them, and it lets you be better as a whole than you could be as individuals, because you can specialize and spend more time in each of your abilities. Plus there are many personality traits that are very useful but hard to have at the same time. For example, one of you could be very intellectual and rational and serious, the other could be very emotional and carefree. Those are both wonderful useful traits, but if you try to combine them you just sort of water them down and make them worse. One of you could be tough and demanding and aggressive, the other could be sweet and friendly.
In the past lots of people were forced into marriage by their sheer inability to survive alone. Women would have a hell of a hard time supporting themselves, and the men were widely incapable of doing basic things like dressing themselves or making a sandwich. That's a very strong bond; you see it still with old couples who despise each other, but have to stick together because the man doesn't know how to run a washing machine and the woman doesn't know how to drive.
Of course I think of people now as being pretty independent, but a lot of them are still totally incompetent. I'm amazed how many men still can't do basic cleaning or cooking (I'm sure they quite willfully keep themselves ignorant because they don't want to have to do that stuff). Lots of girls still couldn't get a decent job if they had to, because they were Communications majors or something and have zero skills. And of course all those suburban princess girls are just completely worthless all around, they can neither cook nor clean nor do any decent work. They only thing they know how to do is put on make up, wait tables, and give blow jobs.
Relationships without artificial glue are very difficult to sustain. As I've already mentioned, one form of strong glue is if you're incompetent in some way and simply can't live apart (or societal repression as still exists in much of the world). Another form of strong glue is children. Even if you aren't fully in the mode of "we hate each other but we'll stay together for the children", children are still strong glue in that it simply gives you something that you both care about and have to take care of all the time; it gives you a common mission, and it also gives you a focus of grief that's outside of each other.
Another form of strong glue is being a loser. By being a loser I mean the fear of being single or the belief that you can't do better, or inexperience dating. If you marry your highschool sweetheart, you're terrified of having to get back out there and date. You believe maybe this is the only person you can ever find, you're afraid of what single life would be like. That's a strong glue.
Modern society celebrates independence and consciously choosing a mate - that is, being in a relationship without all that glue. But with none of that glue your only reason to be together is because you think the life with this person is better than any other life you could have, and that is a very hard bar to meet all the time.
So, my advice to someone who wants to have a strong relationship : 1. make yourself incompetent, if you're a man never learn how to cook or clean, have your mom always pick your outfits for you. 2. marry the first girl you date or have sex. 3. have kids right away. 4. don't let your wife go to college or learn any skills.
Someone made me think briefly about QCD (Quantum Chromodynamics).
I never really got my head around the standard model in grad school. I think I understood QED pretty well, and Weak isn't really that bad either, but then you get into QCD and the maths gets really tough and there's this sea of particles and I had no idea what's going on. Part of the problem is that a lot of the texts go through a historical perspective and teach you the stages of understanding and the experiments that led to modern QCD. I think that's a big mistake and I discourage anyone from reading that. I was always really confused by all the talk of the various mesons and baryons. Often the classes would start with talking about K+ transitions or pion decay or scattering coefficients for "Omegas" and I'd be like "WTF are these particles and who cares what they do?".
I think it's way better just to say "we have quarks and gluons". And yes, the quarks can combine together into these various things, but we don't even really need to talk about them because nobody fucking cares about what exactly the meson made from (strange-antistrange) is called.
I much prefer a purely modern approach to QFT based on symmetry. In particular I really like Weinberg's approach in his textbook which is basically - we expect to observe every phenomenon in the universe which is *possible* to exist. If something is possible but doesn't ever happen, that is quite strange and we should wonder why. In particular with QFT - every possible Lagrangian which leads to a consistent theory should correspond to something in nature. When you start to write these down it turns out that very few are actually possible (given a few constraints, such as the postulate that relativity is required, etc.).
Anyway, I was never really happy with my intuition for QFT. Part of the problem is the math is just so hard, you can't do a lot of problems and get really comfortable with it. (David Politzer at Caltech once gave me a standard model homework problem to actually compute some real scattering coefficients that had been experimentally tested. It took me about 50 pages and I got it horribly wrong).
The whole gauge-field symmetry-group idea seems like it should be very elegant and lead to some intuition, but I just don't see it. You can say hand wavey things, like : electromagnetism is the presence of an extra U(1) symmetry; you can think of this as an extra circular dimension that's rolled up tiny so it has no spatial size, or if you like you can do the Feynman way and say that everything flying around is a clock that is pointing in some direction (that's the U(1) angle). In this picture, the coupling of a "charge" to the field is the fact that the charge distorts the U(1) dimension. If you're familiar with the idea of general relativity where masses distort spacetime and thus create the gravity force, it's the same sort of thing, but instead of distorting spacetime, charge distorts the U(1) fiber. As charges move around in this higher-D space, if they are pushed by variation of the U(1) fiber clock angle, that pushes them in real space, which is how they get force. Charges are a pole in the curvature of the fiber angle; in a spacetime sense it's a pinched spot that can't be worked out by any stretching of the space fabric. Okay this is sort of giving us a picture, but it's super hand wavey and sort of wrong, and it's hard to reconcile with the real maths.
Anyway, the thing I wanted to write about QCD is the real problem of non-perturbative analysis.
When you're taught QED, the things people latch onto are the simple Feynman diagrams where two electrons fly along and exchange a photon. This is appealingly classical and easy to understand. The problem is, it's sort of a lie. For one thing, the idea that the photon is "thrown" between the electrons and thus exchanges momentum and forces them apart is a very appealing picture, but kind of wrong, since the photon can actually have negative momentum (eg. for an electron and positron, the photon exchanged between them pulls them together, so the sort of spacemen playing catch kind of picture just doesn't work).
First of all, let's back up a bit. QFT is formulated using the sum-over-paths mechanism, where every possible path contributes a complex exponential of its action. Classically this would reduce to "least action" paths, which is equivalent to Lagrangian classical mechanics. There's a great book which teaches ordinary Quantum Mechanics using this formulation : Quantum Mechanics and Path Integrals by Feynman & Hibbs (this is a serious textbook for physics undergrads who already know standard QM ; it's a great bridge from standard QM to QFT, because it introduces the sum-on-action formalism in the more familiar old QM). Anyway, the math winds up as a sum of all possible ways for a given interaction to happen. The Feynman diagram is a nice way to write down these various ways and then you still integrate over all possible ways each diagram can happen.
Now let's go back to the simple QED diagram that I mentioned. This is often shown as your first diagram, and you can do the integral easily, and you get a nice answer that's simple and cute. But what happened? We're supposed to sum on *all* ways that the interaction can happen, and we only did one. In fact, there are tons of other possibilities that produce the same outcome, and we really need to either sum them all, or show that they are small.
One thing we need to add is all the ways that you can add vacuum -> vacuum graphs. You can make side graphs that start from nothing, particles pop out of the vacuum, interact, then go back to the vacuum. These are conveniently not mentioned because if you add them all up they have an infinite contribution, which would freak out early students. Fortunately we have the renormalization mechanism that sweeps this under the rug just fine, but it's quite complex.
The other issue is that you can add more and more complex graphs; instead of just one photon exchange, what about two? The more complex graphs have higher powers of the coupling constant (e in this case). If the coupling constant is small, this is like a Taylor expansion, each term is higher powers of e, and e is small, so we can just go up to 3rd order accuracy or whatever we want. The problem with this is that even when e is small, as the graphs get more complex there are *more* of them. As you allow more couplings, there are more and more ways to make a graph of N couplings. In order for this kind of Taylor expansion to be right, the number of graphs at order N must go up more slowly than (1/e)^N, so that each order still contributes less than the one before. Again it's quite complex to prove that.
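To make the counting worry concrete, here is a toy sketch in C++. It is not a QED calculation: it simply assumes, for illustration only, that order n has n! graphs and that each one is weighted by g^n. The name ToySeriesTerms and the factorial growth are assumptions made up for this example.

```cpp
#include <cassert>
#include <vector>

// Toy model only (an assumption for illustration, not a real field theory):
// pretend order n contributes n! graphs, each weighted by g^n.
// Returns the terms a_n = n! * g^n for n = 0 .. maxOrder.
std::vector<double> ToySeriesTerms(double g, int maxOrder)
{
    assert(maxOrder >= 0);
    std::vector<double> terms;
    double term = 1.0; // a_0 = 0! * g^0
    terms.push_back(term);
    for (int n = 1; n <= maxOrder; ++n)
    {
        term *= n * g; // a_n = a_(n-1) * n * g
        terms.push_back(term);
    }
    return terms;
}
```

With g = 0.1 the ratio of successive terms is n * g, so the terms shrink until about n = 1/g = 10 and then grow without bound. Truncating at low order only makes sense if the real graph count grows much more slowly than this toy count does.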
Starting with a simple problem that we can solve exactly, and then adding terms that make us progressively more accurate is the standard modus operandi in physics. Usually the full system is too hard to solve analytically, and too hard to get intuition for, so we rely on what's called a perturbation expansion. Take your complex system that you can't solve, and expand it into Simple + C * Complex1 + C^2 * Complex2 + ... - higher and higher powers of C, which should be small.
And with QCD we get a real problem. Again you can start with a simple graph of quarks flying along passing gluons. First of all, unlike photons, there are gluon-gluon couplings which means we need to add a bunch more graphs where gluons interact with other gluons. Now when we start adding these higher order terms, we have a problem. In QCD, the coupling constant is not small enough, and the number of graphs that are possible for each order of the coupling constant is too high - the more complex terms are not less important. In fact in some cases, they're *more* important than the simpler terms.
This makes QCD unlike any other field theory. Our sort of classical intuition of particles flying around exchanging bosons completely breaks down. Instead the quarks live in a foaming soup of gluons. I don't really even want to describe it in hand wavey terms like that because any kind of picture you might have like that is going to be wrong and misleading. Even the most basic of QCD problems is too hard to do analytically; in practice people do "lattice QCD" numerical computations (in some simple cases you can do the summations analytically and then take the limit of the lattice size going to zero).
The result is that even when I was doing QFT I never really understood QCD.
I'm doing a little refinement of my old cubic interpolator ("Smooth Driver") thing. (see also : here and here and here ).
One thing I'm trying to do is fix up all the nasty epsilon robustness issues. A small part of that is solving a quadratic. "Easy!" I hear you say. Everyone knows how to solve a quadratic, right? Not so.
I found this page which has a nice summary of the issues, written by a sour old curmudgeon who just whines about how retarded we all are but doesn't actually provide us with a solution.
You can also find the Wikipedia page or the Numerical Recipes (5.6) snippet about the more robust numerical way to find the roots that avoids subtracting two nearly identical numbers. Okay, that's all well and good but there's a lot more code to write to deal with all the degenerate cases.
This is what I have so far : (I'm providing the case where the coefficients are real but the solutions may be complex; you can obviously modify to complex coefficients or only real solutions)
// A t^2 + B t + C = 0;
// returns number of solutions
int SolveQuadratic(const double A,const double B,const double C,
                    ComplexDouble * pT0,ComplexDouble * pT1)
{
    // first invalidate :
    *pT0 = FLT_MAX;
    *pT1 = FLT_MAX;
    if ( A == 0.0 )
    {
        if ( B == 0.0 )
        {
            if ( C == 0.0 )
            {
                // degenerate - any value of t is a solution
                *pT0 = 0.0;
                *pT1 = 0.0;
                return -1;
            }
            else
            {
                // no solution
                return 0;
            }
        }
        double t = - C / B;
        *pT0 = t;
        *pT1 = t;
        return 1;
    }
    else if ( B == 0.0 )
    {
        if ( C == 0.0 )
        {
            // A t^2 = 0;
            *pT0 = 0.0;
            *pT1 = 0.0;
            return 1;
        }
        // B is 0 but A isn't
        double discriminant = -C / A;
        ComplexDouble t = ComplexSqrt(discriminant);
        *pT0 = t;
        *pT1 = - t;
        return 2;
    }
    else if ( C == 0.0 )
    {
        // A and B are not zero
        // t = 0 is one solution
        *pT0 = 0.0;
        // A t + B = 0;
        *pT1 = -B / A;
        return 2;
    }
    // Numerical Recipes 5.6 :
    double discriminant = ( B*B - 4.0 * A * C );
    if ( discriminant == 0.0 )
    {
        double t = - 0.5 * B / A;
        *pT0 = t;
        *pT1 = t;
        return 1;
    }
    ComplexDouble sqrtpart = ComplexSqrt( discriminant );
    sqrtpart *= - 0.5 * fsign(B);
    ComplexDouble Q = sqrtpart + (- 0.5 * B);
    // Q cannot be zero
    *pT0 = Q / A;
    *pT1 = C / Q;
    return 2;
}
One thing that is missing is refinement of roots by Newton-Raphson. The roots computed this way can still have large error, but a few Newton-Raphson iterations on each root can improve that.
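A minimal sketch of that polish step, assuming std::complex<double> stands in for the ComplexDouble type used above. PolishRoot and its maxSteps parameter are hypothetical names, not part of the original code. It refines one root at a time with t <- t - f(t)/f'(t) and gives up if the derivative vanishes, which is what happens at a double root.

```cpp
#include <cassert>
#include <complex>

typedef std::complex<double> Cplx;

// Hypothetical helper: refine an approximate root t of A t^2 + B t + C = 0
// with a few Newton-Raphson steps, t <- t - f(t) / f'(t).
Cplx PolishRoot(double A, double B, double C, Cplx t, int maxSteps = 4)
{
    assert(maxSteps >= 0);
    for (int i = 0; i < maxSteps; ++i)
    {
        const Cplx f  = (A * t + B) * t + C; // Horner evaluation of f(t)
        const Cplx df = 2.0 * A * t + B;     // f'(t)
        if (df == Cplx(0.0, 0.0))
            break; // double root or flat spot : Newton step is undefined
        const Cplx next = t - f / df;
        if (next == t)
            break; // no further change at machine precision
        t = next;
    }
    return t;
}
```

The intended use would be something like `*pT0 = PolishRoot(A,B,C,*pT0);` after the solve. How much it helps depends on how ill-conditioned the root is, since f(t) itself is evaluated in the same limited precision.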
Last weekend we went to the arboretum, and I was being retarded as usual and jumping around. I jumped right in a mud puddle and slipped and landed with my knee on a rock. It hurt a bit at the time, not a huge deal, but my knee has been bruised and swollen for the whole last week. Even minor injuries seem to last so long and be so crippling now.
I can't stand the music the kids listen to these days. There have always been youthy genres I've hated (like rap-rock and all the mainstream pop), but one that really blows my mind these days is the "pop-punk" that the kiddies seem to love. Stuff like : crap or ass or dear god .
I think all the piercings and tattoos and such are ugly, unsanitary, and completely non-unique and not rebellious. Kids seem to love it.
I think twitter and facebook and myspace and cell phones and texting and all that is just an awful waste of time that never conveys any real information or interesting conversation. It's just a masturbatory attempt to reassure each other and feel connected, but it's not a real connection to anything natural.
I still get a paper newspaper, think that blogs are an awful place to get news, and that reading on monitors is extremely unpleasant.
Girls my age just look really old to me, all wrinkly and saggy and dried up, like a bunch of bones flopping around inside an uninflated balloon. But girls that are young seem like retarded aliens.
I think text-message abbreviations should be used only when texting, and even then used as little as possible. I think eloquent speech and good grammar are to be striven for at all times; though I often fail, I would never write "srsly" or "imo" unless I was being ironic.
I'm getting old and it fucking sucks.
I wrote last month a bit about OBB fitting. I mentioned at the time that it would be nice to have an implementation of the exact optimal OBB code, and also the bounded-best OBB in reasonable time. I found the Barequet & Har-Peled work on this topic but didn't read it at the time.
Well, I finally got through it. Their paper is pretty ugly. Let me briefly explain their method :
They, like my old OBB stuff, take heavy advantage of the fast rotating-calipers method to find the optimal rectangle of a convex hull in 2d (it's O(n)). Also finding the convex hull is O(nlogn). What that means is, given one axis of an OBB, you can find the optimal other two axes in O(nlogn). So the problem just comes down to finding one of the optimal axes.
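For reference, a rough sketch of the 2d rectangle fit. It is not the O(n) rotating-calipers version described above: it relies on the same fact (the minimum-area rectangle has a side collinear with a hull edge) but tests each hull edge by brute force, so it is O(h^2) in the hull size h. Vec2, ConvexHull2D and MinAreaRect are hypothetical names for this example. In the 3d case you would first project the points onto the plane perpendicular to the fixed axis.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

struct Vec2 { double x, y; };

static double Cross(const Vec2 & o, const Vec2 & a, const Vec2 & b)
{
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Andrew's monotone chain convex hull, O(n log n), counter-clockwise output.
std::vector<Vec2> ConvexHull2D(std::vector<Vec2> pts)
{
    std::sort(pts.begin(), pts.end(), [](const Vec2 & a, const Vec2 & b) {
        return a.x < b.x || (a.x == b.x && a.y < b.y); });
    const size_t n = pts.size();
    if (n < 3) return pts;
    std::vector<Vec2> hull(2 * n);
    size_t k = 0;
    for (size_t i = 0; i < n; ++i) // lower hull
    {
        while (k >= 2 && Cross(hull[k-2], hull[k-1], pts[i]) <= 0.0) --k;
        hull[k++] = pts[i];
    }
    for (size_t i = n - 1, t = k + 1; i > 0; --i) // upper hull
    {
        while (k >= t && Cross(hull[k-2], hull[k-1], pts[i-1]) <= 0.0) --k;
        hull[k++] = pts[i-1];
    }
    hull.resize(k - 1); // the last point repeats the first
    return hull;
}

// Area of the smallest enclosing rectangle, trying every hull edge direction.
double MinAreaRect(const std::vector<Vec2> & points)
{
    assert(!points.empty());
    const std::vector<Vec2> hull = ConvexHull2D(points);
    const size_t h = hull.size();
    if (h < 3) return 0.0; // degenerate : a point or a segment
    const double inf = std::numeric_limits<double>::infinity();
    double best = inf;
    for (size_t i = 0; i < h; ++i)
    {
        const Vec2 & a = hull[i];
        const Vec2 & b = hull[(i + 1) % h];
        const double len = std::hypot(b.x - a.x, b.y - a.y);
        const double ux = (b.x - a.x) / len, uy = (b.y - a.y) / len;
        double minU = inf, maxU = -inf, minV = inf, maxV = -inf;
        for (const Vec2 & p : hull)
        {
            const double u =  p.x * ux + p.y * uy; // along the edge
            const double v = -p.x * uy + p.y * ux; // perpendicular to it
            minU = std::min(minU, u); maxU = std::max(maxU, u);
            minV = std::min(minV, v); maxV = std::max(maxV, v);
        }
        best = std::min(best, (maxU - minU) * (maxV - minV));
    }
    return best;
}
```

For a few hundred hull verts the brute-force loop is probably fine; the real calipers only matter when the hull is large or the fit is called many times, as it is in the iterative refinement.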
Now, as I mentioned before, the number of actual axes you must consider to be truly optimal is O(n^2) , making your total run time O(n^3). These axes are the face normals of the convex hull, plus axes where the OBB is supported by two edges and a vert (I don't know an easy way to even enumerate these).
It's now pretty well known around the industry that you can get a very good OBB by just trying a bunch of scattered initial axes instead of doing O(n^2). If you try some fixed number of axes, like say 256, it doesn't count against your big-O at all, so your whole OBB fit is still O(nlogn).
Well this is exactly what the Barequet & Har-Peled method is. They try a fixed number of directions for the seed axis, then do the rectangle fit for the other two axes. The main contribution of the paper is the proof that if you try enough fixed directions, you can get the error of the bbox within whatever tolerance you want. That's sort of intuitively obvious - if you try more and more fixed directions you must get closer and closer to the optimal box. Their construction also provides a specific method for enumerating enough directions.
Their enumeration goes like this :
Start with some seed box S. They use the box that is made from taking one axis to be the "diameter" of the point set (the vector between the two most separated points). Using that box is important to their proof, but I don't think which seed box you use is actually terribly important in practice.
The seed box S has normalized edge vectors S.x , S.y, S.z (the three axes that define the box).
Enumerate all sets of 3 (non-negative) integers whose sum is <= K , that is {i,j,k} such that (i+j+k) <= K
Construct the normal N = i * S.x + j * S.y + k * S.z ; normalize it, and use this as the direction to fit a new OBB. (note that there are a lot of points {ijk} that generate the same normal - any points that are integer multiples of another; those can be skipped).
Barequet & Har-Peled prove that the OBB made this way is within (1/K) to some power or other of optimal, so as you increase K you get ever closer.
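The enumeration steps above can be sketched roughly as follows. Vec3 and EnumerateSeedDirections are hypothetical names, and the skip rule used here (drop any triple whose gcd is greater than 1) is one reading of "integer multiples of another". The paper and the numBuilds counts reported in the results may use a different set, so don't expect the counts to match.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };

static int Gcd(int a, int b) { return b == 0 ? a : Gcd(b, a % b); }

// Enumerate normalized directions i*sx + j*sy + k*sz over non-negative
// integer triples with 1 <= i+j+k <= K. Triples with gcd > 1 are integer
// multiples of a smaller triple and give the same direction, so skip them.
std::vector<Vec3> EnumerateSeedDirections(const Vec3 & sx, const Vec3 & sy,
                                          const Vec3 & sz, int K)
{
    assert(K >= 1);
    std::vector<Vec3> dirs;
    for (int i = 0; i <= K; ++i)
    for (int j = 0; i + j <= K; ++j)
    for (int k = 0; i + j + k <= K; ++k)
    {
        if (i + j + k == 0) continue;          // the zero vector has no direction
        if (Gcd(Gcd(i, j), k) != 1) continue;  // duplicate direction
        const double nx = i * sx.x + j * sy.x + k * sz.x;
        const double ny = i * sx.y + j * sy.y + k * sz.y;
        const double nz = i * sx.z + j * sy.z + k * sz.z;
        const double len = std::sqrt(nx * nx + ny * ny + nz * nz);
        assert(len > 0.0); // holds when the seed axes are linearly independent
        const Vec3 d = { nx / len, ny / len, nz / len };
        dirs.push_back(d);
    }
    return dirs;
}
```

Each returned direction would then be used as the fixed axis for a rectangle fit of the other two axes.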
Now, this is almost identical to my old "OptimalOBBFixedDirections" which tried various static directions and then optimized the box from there. My OptimalOBBFixedDirections always tests 42 directions in the {+,+,+} octant which I made by subdividing an octahedron. I have found that the Barequet & Har-Peled method does in fact find better boxes with fewer tests, but the difference is very very small (thousandths of a percent). I'll show numbers in a second.
First I want to mention two other things.
1. There's a "common wisdom" around the net that while it is bad to make an OBB from the covariance matrix of the *points* , it is good to make an OBB from the covariance matrix of the *volume*. That is, they claim if you have a closed mesh, you can use the Mirtich (or Eberly or Blow/Binstock) method to compute the covariance matrix of the solid body, and use that for your OBB axes.
I have seen no evidence that this is true. Yes, the covariance matrix of the points is highly dependent on the tessellation, while the covariance matrix of the solid is more a property of the actual shape of the object, so that is intuitively pleasing. In practice it appears to be completely random which one is actually better. And you just shouldn't use the covariance matrix method anyway.
2. Barequet & Har-Peled mention the iterative refinement of OBB's using the caliper-fit. This is something I've known a while, but I've never seen it published before; I think it's one of those gems of wisdom that lots of people know but don't consider worth a paper. They mention it almost in passing, but it's actually perhaps the most valuable thing in their whole paper.
Recall if you have one axis of the OBB fixed, you can easily find the optimal directions of the other two axes using rotating calipers to fit a rectangle. The thing is, once you do that, you can then hold one of those new axes fixed, and fit the other two. So like, fix X, then caliper to get YZ, then fix Y, and caliper to get XZ. Each step of the iteration either improves your OBB or does nothing. That means you descend to a local minimum in a finite number of steps. (in practice I find you usually get there in only 3 steps, in fact that might be provable (?)).
Assuming your original seed box is pretty close to optimal, this iteration is kind of like taking your OBB and trying to spin it along one axis and pinch it to see if you get a tighter fit; it's sort of like wiggling your key as you put it into a lock. If your seed OBB is close to being right, but isn't supported by one of the necessary support conditions, this will wiggle it tighter until it is supported by a face or an edge pair.
The methods shown in the test below are :
True convex hull (within epsilon) :
    note that area optimization is usually what you want
    but volume shows a bigger difference between methods
Hull simplification :
    I simplify the hull by just doing PM on the triangles
    Then convert the triangles to planes
    Push the planes out until all points are behind them
    Then clip the planes against each other to generate new faces
    This is a simpler hull that strictly contains the original hull
k-dop :
    Fits 258 planes in fixed directions on the sphere
    Just pushes each plane to the edge of the point set
    Clips them all against each other
    strictly this is O(n) but in practice it's slower than the true convex hull
    and much worse quality
    seems pointless to me
The rating we show on all the OBB's is surface area
AxialOBB :
    axis-aligned box
OBBByCovariance :
    vertex covariance matrix sets axes
OBBByCovariance+it :
    OBBByCovariance followed by iterative greedy optimization
OBBByCovarianceOptimized :
    like OBBByCovariance+it, but tries all 3 initial fixed axes
OptimalOBBFixedDirections :
    tries 42 fixed directions
OptimalOBBFixedDirections+it :
    tries 42 fixed directions, takes the best, then optimizes
OptimalOBBFixedDirections opt :
    tries 42 fixed directions, optimizes each one, then takes the best
OBBGoodHeuristic :
    takes the best of OBBByCovarianceOptimized and "OptimalOBBFixedDirections opt"
OBBGivenCOV :
    this is OBBByCovarianceOptimized but using the solid body covariance instead of points
OptimalOBB :
    tries all face normals of the convex hull (slow)
    I'd like to also try all the edge-support directions here, but haven't figured it out
OptimalOBBBarequetHarPeled 5
    kLimit : 5 , numBuilds : 19
    BarequetHarPeled method with (i+j+k) <= 5
    causes it to try 19 boxes
    optimizes each one, then picks the best
    very similar to "OptimalOBBFixedDirections opt"
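The "push the planes out" step that both the hull simplification and the k-dop rely on is just a support-distance computation. A small sketch, with hypothetical names (Vec3, Plane, PushPlaneOut) and the assumed convention that a point p is behind the plane when dot(n,p) <= d :

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Vec3 { double x, y, z; };

// Assumed convention : points p with Dot(n,p) <= d are behind the plane.
struct Plane { Vec3 n; double d; };

static double Dot(const Vec3 & a, const Vec3 & b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Keep the normal fixed and push the plane out until every point is behind
// it, ie. set d to the largest projection of any point onto the normal.
Plane PushPlaneOut(const Vec3 & normal, const std::vector<Vec3> & points)
{
    assert(!points.empty());
    double d = Dot(normal, points[0]);
    for (const Vec3 & p : points)
        d = std::max(d, Dot(normal, p));
    const Plane result = { normal, d };
    return result;
}
```

This is O(n) per plane, so the k-dop build is O(n) times the fixed number of directions; the clipping of the planes against each other is a separate step not shown here.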
And the results :
-----------------------------------------
dolphin.x :
Made Hull with 206 faces
hull1 volume : 1488557 , area : 95330
Making hull from k-dop planes...
Made Hull with 142 faces
hull2 volume : 2081732 , area : 104951
Making OBB...
AxialOBB : 193363.109
OBBByCovariance : 190429.594
OBBByCovariance+it : 179504.734
OBBByCovarianceOptimized : 179504.719
OptimalOBBFixedDirections : 181693.297
OptimalOBBFixedDirections+it : 181693.297
OptimalOBBFixedDirections opt : 176911.750
OBBGoodHeuristic : 179504.719
OBBGivenCOV : 178061.406
OptimalOBB : 176253.359
kLimit : 3 , numBuilds : 3
OptimalOBBBarequetHarPeled 3 : 179504.703
kLimit : 5 , numBuilds : 19
OptimalOBBBarequetHarPeled 5 : 178266.047
kLimit : 10 , numBuilds : 160
OptimalOBBBarequetHarPeled 10 : 176508.109
kLimit : 20 , numBuilds : 1222
OptimalOBBBarequetHarPeled 20 : 176218.344
kLimit : 50 , numBuilds : 18037
OptimalOBBBarequetHarPeled 50 : 176116.156
-----------------------------------------
teapot.x :
hull1 faces : 612
hull1 volume : 3284935 , area : 117470
simplified hull2 faces : 366
hull2 volume : 3384222 , area : 120357
Making hull from k-dop planes...
Made Hull with 234 faces
hull2 volume : 3761104 , area : 129271
Making OBB...
AxialOBB : 253079.797
OBBByCovariance : 264091.344
OBBByCovariance+it : 222514.219
OBBByCovarianceOptimized : 220723.844
OptimalOBBFixedDirections : 219071.703
OptimalOBBFixedDirections+it : 218968.844
OBBGoodHeuristic : 218968.844
OptimalOBB : 218968.844
OBBGivenCOV : 220762.766
kLimit : 3 , numBuilds : 3
OptimalOBBBarequetHarPeled 3 : 220464.766
kLimit : 5 , numBuilds : 19
OptimalOBBBarequetHarPeled 5 : 219540.203
kLimit : 10 , numBuilds : 160
OptimalOBBBarequetHarPeled 10 : 218968.000
kLimit : 20 , numBuilds : 1222
OptimalOBBBarequetHarPeled 20 : 218965.406
kLimit : 50 , numBuilds : 18037
OptimalOBBBarequetHarPeled 50 : 218963.109
-----------------------------------------
Some highlights :
OBBByCovariance is quite bad. OptimalOBBFixedDirections is the only other one that doesn't do the iterative optimization, and it can be bad too, though not nearly as bad.
Any of the methods that do the iterative optimization is perfectly fine. The differences are very small.
"OptimalOBBBarequetHarPeled 7" does about the same number of tests as "OptimalOBBFixedDirections opt" , and it's very microscopically better because of the way the directions are distributed.
OBBGivenCOV (the solid mass covariance) is worse than OBBByCovarianceOptimized (point covariance) on teapot.
Also - the Convex Hull simplification thing I did was just pulled out of my ass. I did a quick Google to see if I could find any reference, and I couldn't find any. I'm surprised that's not a solved problem, it seems like something right up the Geometers' alley.
Problem : Find the convex bounding volume made of N faces (or N planes) that strictly encloses the original mesh, and has minimum surface area (or volume).
In general, the optimal N-hull can not be reached by greedy simplification from the full-detail convex hull. In practice I found my hacky PM solution to work fine for moderate simplification levels. To make it more correct, the "push out to enclose" step should be done in each PM collapse to keep the hull valid as you go (instead of at the end). Also the PM collapse metric should be the metric you are trying to optimize - surface area or volume (I just used my old geometric error collapser).
The main thing I was interested in with convex hull simplification was eating away highly tessellated bits. The mesh I mainly tested on was "fandisk" because it's got these big flat surfaces, and then some rounded bits. If you imagine a mesh like a big cube Minkowski summed with a small sphere, you get a cube with rounded edges and corners. If the sphere is highly tessellated, you can get a hull with tons and tons of faces, but they are very unimportant faces. You want to sort of polygonize those corners, replace the rounded sphere with a less tessellated one that's pushed out.
Is god fucking awful. Stop touting it as the greatest example of product design in this century. Yes, yes, the screen is nice, and the basic shape and weight of it is appealing. If it was just a paperweight I would be pretty pleased with it. But when you actually try to *use* it, it's rubbish. (it's the Angelina Jolie of product design if you will - it's the canonical example that everyone uses of something that's great, but it's actually awful).
Try to actually browse through the menus with that fucking wheel. Scan through a big list of artists, back up, change to album view, scan down, it's awful.
The wheel is a disaster for volume control. The right thing for volume is a knob, or a dial. Something physical that you rotate. And it shouldn't be a fucking digital dial that just spins like all the shit that they're giving us on computers now. It should be an *absolute* dial with an actual zero point, so that I can turn the volume down to zero when the thing is off. Hitting play on an iPod is fucking ear drum roulette, you never know when it's going to explode your head.
You're playing a song, you pause it, you go browse to some other song. You want to just resume the original song. How do you even do that !? I suppose it must be possible, but I don't know. It should just be a button.
Design for music playing devices has been perfect for a long time. You have play, pause, skip, volume. You have those on different buttons that are ALWAYS those buttons. They're physical buttons you can touch, so you can use the device while it's in your pocket or your eyes are closed. They should be rubber (or rubberized) so they're tactile, and each button should have a different shape so you know you're on the right one.
It's just fucking rubbish. It's like so many cars these days, changing user interfaces for no good reason and making them worse. Don't give me fucking digital buttons to increment and decrement the air conditioning, that's awful! Give me a damn dial or a slider that has an absolute scale. I don't want to be hitting plus-plus-plus.
The damn "Start" button that's on so many cars now really pisses me off. You used to stick in a key, then turn it. What's wrong with that? It works perfectly fucking fine. Now I have to stick in a key, make sure I'm pressing the brake or the clutch or whatever, then press a start button. Why !? It's more steps, it's just worse.
The worst of course is the menu shit like iDrive that's basically an iPod style interface with a fucking wheel and menus and context-dependent actions. Context-dependent actions are fucking horrible user interface design, quit it. With consumer electronic devices there should just be a few buttons, and those buttons always execute the same action and do it immediately. I know some jack-hole is going to run into me because he was trying to mate his bluetooth and was browsing around the menus.
Our cities are full of "massage parlors" that offer prostitution with a blatant store front and ads in the paper.
I wrote before about how every girl in LA does porn .
There are some widely spread numbers about $10 billion in porn movies per year, or 800 million rentals per year, or the number of porn movies that hotel guests rent. Unfortunately those numbers are just made up ; (it's funny that even in his mea culpa about making up numbers he goes on to just make up other numbers out of thin air. good reporting, bub).
I was thinking about this because a few Sundays ago there was an article in the NYT Magazine about "SeekingArrangement.com". I didn't think much of it at first. SeekingArrangement is an online dating site like match or whatever, but it's specifically designed for hooking up rich older men (often married) with young girls. It does toe the line of prostitution because it explicitly talks about dollar rates and what the girls would be willing to do. Still, these kinds of arrangements have been around forever, I do find it rather disgusting, but it's not surprising.
But the sheer numbers are a bit surprising. They claim 300,000 girls have posted profiles. I assume they are exaggerating somewhat and some of the accounts are fake. Let's do some fake numbers like last time to get a surprising conclusion.
I assume most of the girls registered are in the age range 20-30. There are around 20M people in the US in the age range 20-30 (this number is correct BTW). About half of those are girls, so around 10 M. If the 300k girls on SeekingArrangement are from the US, that's 3% of the female population (!).
A more accurate number : there are around 200,000 girls in WA state in the target age group. There are around 2000 girls on SeekingArrangement from WA state. That's 1%. Still a rather high number.
Not really related, but also :
1. STOP TATTOOING YOURSELF !! My god, a whole generation of girls is ruining themselves. Fortunately I found a hot girl with no disgusting dark-green graffiti that breaks up the natural lovely smooth lines of the female body, but I still have to look at it on models and such. WTF. Oh yes, I know what would go great with this peach skin, this graceful natural arc of flesh and muscle, this magical body that is firm and yet soft, this curve that speaks right to my gut - a fucking dark green dolphin drawn by some high school dropout. Quit it.
2. Angelina Jolie is fucking revolting. She looks so bizarre, she's had so much plastic surgery, she looks like an alien. I'm so sick of her being used as the modern standard of beauty. She's the Garbo or Marilyn Monroe or Sophia Loren or whoever of today, and that just says volumes about the fucking awful taste of modern man. I hate that so many girls are trying to look like her and get that fucking punched-in-the-mouth lip injection I-can't-even-talk-right-cuz-my-lips-are-so-swollen.
.. telling time is a huge disaster on windows.
To start see Jon Watte's old summary that's still good .
Basically you have timeGetTime() , QPC, or TSC.
TSC is fast (~ 100 clocks) and high precision. The problems I know of with TSC :
TSC either tracks CPU clocks, or time passing. On older CPUs it actually increments with each cpu cycle, but on newer CPUs it just tracks time (!). The newer "constant rate" TSC on Intel chips runs at some frequency which so far as I can tell you can't query.
If TSC tracks CPU cycles, it will slow down when the CPU speedsteps. If the CPU goes into a full sleep state, the TSC may stop running entirely. These issues are bad on single core, but they're even worse on multi-proc systems where the cores can independently sleep or speedstep. See for example these linux notes or tsc.txt .
Unfortunately, if TSC is constant rate and tracking real time, then it no longer tracks cpu cycles, which is actually what you want for measuring performance (you should always report speeds of micro things in # of clocks, not in time).
Furthermore on some multicore systems, the TSC gets out of sync between cores (even without speedsteps or power downs). If you're trying to use it as a global time, that will hose you. On some systems, it is kept in sync by the hardware, and on some you can get a software patch that makes rdtsc do a kernel interrupt kind of thing which forces the TSC's of the cores to sync.
See this email I wrote about this issue :
Apparently AMD is trying to keep it hush hush that they fucked up and had to release a hotfix. I can't find any admission of it on their web site any more ;
this is the direct download of their old utility that forces the cores to TSC sync : TscSync
they now secretly put this in the "Dual Core Optimizer" : Dual Core Optimizer . Oh, really AMD? It's not a bug fix, it's an "optimizer". Okay.
There's also a separate issue with AMD C&Q (Cool & Quiet) if you have multiple cores/processors that decide to clock up & down. I believe the main fix for that now is just that they are forbidden from selecting different clocks. There's an MS hot fix related to that : MS hotfix 896256 .
I also believe that the newest version of the "AMD Processor Driver" has the same fixes related to C&Q on multi-core systems : AMD Driver . I'm not sure if you need both the AMD "optimizer" and processor driver, or if one is a subset of the other.
Okay, okay, so you decide TSC is too much trouble, you're just going to use QPC, which is what MS tells you to do anyway. You're fine, right?
Nope. First of all, on many systems QPC actually is TSC. Apparently Windows evaluates your system at boot and decides how to implement QPC, and sometimes it picks TSC. If it does that, then QPC is fucked in all the ways that TSC is fucked.
So to fix that you can apply this : MS hotfix 895980 . Basically this just puts /USEPMTIMER in boot.ini which forces QPC to use the ACPI power management (PM) timer instead of TSC.
But that's not all. Some old systems had a chipset bug that would cause that timer to jump by a big amount once in a while.
Because of that, it's best to advance the clock by taking the delta from previous and clamping that delta to be in valid range. Something like this :
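U64 GetAbsoluteQPC()
{
    // GetQPC() is assumed to return the raw QPC value ; statics are set on the first call
    static U64 s_lastQPC = GetQPC();
    static U64 s_lastAbsolute = 0;
    U64 curQPC = GetQPC();
    U64 delta = curQPC - s_lastQPC; // unsigned, so a backwards jump becomes a huge value
    s_lastQPC = curQPC;
    // only advance on a plausible delta ; bogus jumps are dropped
    if ( delta < HUGE_NUMBER )
        s_lastAbsolute += delta;
    return s_lastAbsolute;
}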
(note that "delta" is unsigned, so when QPC jumps backwards, it will show up as a very large positive delta, which is why we compare vs
HUGE_NUMBER ; if you're using QPC just to get frame times in a game, then a reasonable thing is to just get the raw delta from the last
frame, and if it's way out of reasonable bounds, just force it to be 1/60 or something).
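For the frame-time case, here's a minimal sketch of that clamp. SafeFrameSeconds, the 1/60 fallback and the 0.25 second cap are all made-up names and numbers for illustration, not anything from a real API; freq is assumed to be the ticks-per-second value you'd get from QueryPerformanceFrequency.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: turn two raw counter readings into a frame time in seconds.
// 'freq' is counter ticks per second (assumed to come from QueryPerformanceFrequency).
double SafeFrameSeconds(uint64_t prevTicks, uint64_t curTicks, uint64_t freq)
{
    assert(freq > 0);
    const double kFallback = 1.0 / 60.0;  // assumed nominal frame time
    const double kMaxFrame = 0.25;        // assumed sanity cap, tune per game

    // Unsigned subtract: a backwards jump shows up as a huge positive delta.
    uint64_t delta = curTicks - prevTicks;
    double seconds = (double)delta / (double)freq;

    // Anything beyond the cap (forward jump or backward jump) is treated as bogus.
    if (seconds > kMaxFrame)
        return kFallback;
    return seconds;
}
```

Because the subtract is unsigned, the one "too big" test catches both the forward jumps and the backward jumps, same as the HUGE_NUMBER compare.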
Urg.
BTW while I'm at it I think I'll evangelize a "best practice" I have recently adopted. Both QPC and TSC have problems with wrapping. They're in unsigned integers and as your game runs you can hit the end and wrap around. Now, 64 bits is a lot. Even if your TSC frequency is 1000 GigaHz (1 THz), you won't overflow 64 bits for about 213 days. The problem is they don't start at 0.
Unsigned int wrapping works perfectly when you do subtracts and keep them in unsigned ints. That is :
in 8 bits : U8 start = 250; U8 end = 3; U8 delta = end - start; // delta == 9
That's cool, but lots of other things don't work with wrapping :
U64 tsc1 = rdtsc();
... some stuff ...
U64 tsc2 = rdtsc();
U64 avg = ( tsc1 + tsc2 ) /2;
This is broken because tsc may have wrapped between the two reads (and even without a wrap, tsc1 + tsc2 can overflow 64 bits).
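One way around it, as a sketch : do the average on the delta instead of on the raw values, since the delta is the thing that survives wrapping. WrapSafeMidpoint is a hypothetical name, and it assumes you know which reading was taken first.

```cpp
#include <cassert>
#include <cstdint>

// Midpoint of two counter readings, where 'later' was read after 'earlier'
// and the counter may have wrapped in between.
uint64_t WrapSafeMidpoint(uint64_t earlier, uint64_t later)
{
    uint64_t delta = later - earlier;   // correct even across a wrap
    return earlier + delta / 2;         // itself wraps if the midpoint is past the end
}

// The same thing in 8 bits, to make the wrap easy to see.
uint8_t WrapSafeMidpoint8(uint8_t earlier, uint8_t later)
{
    uint8_t delta = (uint8_t)(later - earlier);
    return (uint8_t)(earlier + delta / 2);
}
```

In 8 bits with earlier = 250 and later = 3 this gives 254, which is between the two readings on the wrapped number line; the naive ( 250 + 3 ) / 2 gives 126, which is nowhere near either.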
The one that usually gets me is simple compares :
if ( time1 < time2 )
{
    // ... event1 was earlier
}
are broken when time can wrap. In fact with unsigned times that wrap there is no way to tell which one came first (though you could if you
put a limit on the maximum time delta that you consider valid - eg. any place that you compare times, you assume they are within 100 days of
each other).
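If you do accept a limit on the delta, the usual trick is to look at the wrapped difference as a signed number. A minimal sketch, assuming the two times are always within 2^63 ticks of each other; TimeIsBefore is a hypothetical name. The cast relies on two's complement conversion, which is guaranteed from C++20 and is what mainstream compilers do anyway.

```cpp
#include <cassert>
#include <cstdint>

// True if time 'a' came before time 'b'.
// Assumption: a and b are within 2^63 ticks of each other, so the wrapped
// difference reinterpreted as signed has the right sign.
bool TimeIsBefore(uint64_t a, uint64_t b)
{
    int64_t diff = (int64_t)(a - b);  // unsigned subtract wraps, then view as signed
    return diff < 0;
}
```

This is only as good as the assumption; if two times really are more than half the counter range apart, it silently gives the wrong answer.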
But this is easily fixed. Instead of letting people call rdtsc raw, you bias it :
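uint64 Timer::GetAbsoluteTSC()
{
    // s_first is captured once, on the first call
    static uint64 s_first = rdtsc();
    uint64 cur = rdtsc();
    // unsigned subtract, so this is right even if the raw TSC wrapped since s_first
    return (cur - s_first);
}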
this gives you a TSC that starts at 0 and won't wrap for a few years. This lets you just do normal compares everywhere to know what came
before what. (I used the TSC as an example here, but you mainly want QPC to be the time you're passing around).
WTF gmail, why do you keep letting obvious nonsense through?
I literally get around 10 mails like this a day :
From: Susan Brown
Subject: Authentic discounted prescriptions from Canada. Fast delivery and qualitative support- come and see yourself. www.sentencefallcold.in
How can you not tell that's spam !?
On the plus side, today it delivered one of the more amusing spams I've ever gotten :
I want you to read this message very carefully, and keep the secret with you till further notice, You have no need of knowing who I am, where am from, till I make out a space for us to see, I have being paid me $15000 in advance to terminate you with some reasons listed to me by my employer, it's one I believe you call a friend, I have followed you closely for one week and three days now and have seen that you are innocent of the accusation.
Do not contact the police or security agent or try to send a copy of this to them, because if you do I will know, and might be pushed to do what I have being paid to do,,beside, this is the first time I turned out to be a betrayer in my job. Now, listen, I will arrange for us to see face to face but before that you have to pay me some $20,000. I will be coming to see you in your office or home determine where you wish we meet, do not set any camera to cover us or set up any tape to record our conversation, my employer is in my control now, I will give you the full information of my employer and video and audio tape conversation with him that contains his request for me to terminate you, which will be enough evidence for you to take him to court (if you wish to).
You don’t need my phone contact for now till am assured you are ready to comply positively. Once more I shall remain anonymous, however, you have no alternatives other than to co-operate positively and do as I said or face the brewing storm. You must also trust that as long as you keep your part of the deal, I shall too keep mine and live up to my words. This time I am holding the trump card and shall put it to effective use if you fail to co-operate positively as requested. Any move by you other than co-operation will have catastrophic consequences to you. I wait for your response urgently to settle this and counter my employer directives against you immediately.
Thanks you for your attention Cross
I love the writing style. It's got that high-school-kid-trying-to-sound-smart ring to it. "do as I said or face the brewing storm" - how precious! It reminds me a bit of VCB. "They sat in the cafe and discussed art and romance." Oh yes. I am such a passionate latin lover, I can't be bothered to button my shirt properly, and I love to listen to Spanish guitar and discuss romance.
We've seen some really awful movies recently.
"The Queen" : wow, WTF is the point of this movie? It's hard to imagine anything less interesting than the royal family, and despite what all the critics say, the portrait of them was completely stereotypical and superficial. In no way was it interesting or realistic or surprising. Plus like 50% of the time is spent on stock footage of the whole Diana death bullshit.
"Vicky Cristina Barcelona" : another wowie zowie this was shockingly bad. People talk about this movie and "Match Point" as the reemergence of Woody Allen after a period of making predictable snoozers. Yes his 90's movies were pretty weak, but these are even worse. VCB is literally laugh-out-loud awful. I'm actually a little perplexed by it. I'm not quite sure if the whole movie is supposed to be a "level". The voice over for example is intentionally bad, right? I mean that voice over can't be serious, it's super cheesy bad like a romance novel or a Penthouse Forum letter, I assume/hope that's on purpose. I guess that all the critics that liked this movie are just horny sexually repressed nerds that were aroused by all the hints of sexuality.
I guess I'll go back to watching endless old Top Gear episodes.
The new High Stakes Poker cast is disappointing too. Hachem, Laak and Esfandiari are some of the biggest cocks in poker. They're all cry-baby attention whores with no real game skill or interesting personality. Laak might be the worst; he's a huge nit and just generally sucks; he normally plays $10/20 NL live and just bum-hunts. Last episode he made one of the most retarded plays in poker - raising to "defend" a weak hand. It's a play that tilts me so bad, because dumb "pros" like Laak or Gabe Kaplan will recommend it, and it usually works so they think they did the right thing. Basically if you flop a hand that is decent and probably best, but is not strong, you should almost never be raising with that. Like say you flop a weak middle pair or something like that. You just call, you don't raise. (obviously there are exceptions, everything in poker is situational, there are no formulas or set "moves").
I'm trying to figure something out, maybe someone out there has a good idea.
This is about the Oodle File Page Cache that I mentioned previously. I'm not doing the fully general page cache thing yet (I might not ever because it's one of those things where you have to buy into my philosophy, which is what I'm trying to avoid).
Anyway, the issue is about how to prioritize pages for reclamation. Assume you're in a strictly limited memory use scenario, you have a fixed size pool of say 50 pages or so.
Now obviously, pages that are actually currently locked get memory. And the next highest priority is probably sequentially prefetched pages (the pages that immediately follow the currently locked pages) (assuming the file is flagged for sequential prefetching, which it would be by default).
But after that you have to decide how to use the remaining pages. The main spot where you run into a question is : I need to grab a new page to put data into, but all the pages are taken - which one do I drop and recycle? (Or, equivalently : I'm thinking about prefetching page X, the least important page currently in the pool is page Y - should I reclaim page Y to prefetch page X, or should I just wait and not do the prefetch right now).
The main ambiguity comes from "past" pages vs. "prefetched" pages (and there's also the issue of old prefetched pages that were never used).
A "past" page is one that the client has unlocked. It was paged in, the client locked it, did whatever, then unlocked it. There's one simple case, if the client tells me this file is strictly forward-scan streaming, then the page can be dropped immediately. If not, then the past page is kept around for some amount of time. (there's another sequence point when the file containing the page is closed - again optionally you can say "just drop everything when I close the file" or the pages can be kept around for a while after the close to make sure you were serious about closing it).
A "prefetched" page obviously can be prefetched by sequential scan in an open file. It could also be from a file in the prefetch-ahead file list that was generated by watching previous runs.
Prefetched pages create two issues : one is how far ahead do I prefetch. Basically you prefetch ahead until you run out of free pages, but when you have no free pages, the question is do I reclaim past pages to do new prefetches?
The other issue with prefetches is what do you do with prefetched pages that were never actually used. Like I prefetched some pages but then the client never locked them, so they are still sitting around - at what point do I reclaim those to do new prefetches?
To make it more clear here's a sort of example :
Client gives me a prefetch file list - {file A, file B, file C, file D}
Client opens file A and touches a bunch of pages. So I pull in {A:0,A:1,A:2} (those are the page numbers in the file).
I also start prefetching file B and file C , then I run out of free pages, so I get {B:0,B:1,C:0}.
Client unlocks the pages in file A but doesn't close file A or tell me I can drop the pages.
Client now starts touching pages in file C. I give him C:0 that I already prefetched.
Now I want to prefetch C:1 for sequential scan, I need to reclaim a page.
Do I reclaim a page from B (prefetched but not yet used) or a page from A (past pages) ? Or not prefetch at all?
When client actually asks for C:1 to lock then I must reclaim something.
Should I now start prefetching {D:0} ? I could drop a page from {A} to get it.
Anyway, this issue just seems like a big mess so I'm hoping someone has a clever idea about how to make it not so awful.
There are also very different paradigms for low-page-count vs high-page-count caches. On something like the PS2 or XBox 1 where you are super memory limited, you might in fact run with only 4 pages or something tiny like that. In that case, I really want to make sure that I am using each page for the best purpose at all times. In that scenario, each time I need to reclaim a page, I should reevaluate all the priorities so they are fresh and make the best decision I can.
On something like Windows you might run with 1024 pages. (64k page * 1024 pages = 64 MB page cache). In that case I really don't want to be walking every single page to try to pick the best one all the time. I can't just use a heap or something, because page priorities are not static - they can change just based on time passing (if I put any time-based prioritization in the cache). Currently I'm using a sort of cascaded priority queue where I have different pools of priority groups, and I only reevaluate the current lowest priority group. But that's rather complicated.
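To make the cascaded idea a bit more concrete, here's a rough sketch. This is not the actual Oodle code; GroupedPagePool, Add and Reclaim are hypothetical names, pages are just int ids, and the fine-grained priority is whatever callback you hand in (so it can depend on time passing). The only point is that a reclaim walks one group, not every page.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical sketch: pages are bucketed into coarse priority groups
// (group 0 = least important). Only the lowest non-empty group gets its
// fine-grained priorities re-evaluated when we need a victim.
class GroupedPagePool
{
public:
    explicit GroupedPagePool(size_t numGroups) : m_groups(numGroups)
    {
        assert(numGroups > 0);
    }

    void Add(int page, size_t group)
    {
        assert(group < m_groups.size());
        m_groups[group].push_back(page);
    }

    // 'score' computes the current fine priority of a page (lower = drop first).
    // Returns the reclaimed page id, or -1 if the pool holds no pages.
    int Reclaim(const std::function<double(int)> & score)
    {
        for (std::vector<int> & group : m_groups)
        {
            if (group.empty()) continue;
            // Re-evaluate only this group and take its lowest scoring page.
            size_t best = 0;
            for (size_t i = 1; i < group.size(); i++)
                if (score(group[i]) < score(group[best])) best = i;
            int victim = group[best];
            group[best] = group.back();   // swap-remove, order in a group doesn't matter
            group.pop_back();
            return victim;
        }
        return -1;
    }

private:
    std::vector< std::vector<int> > m_groups;
};
```

What this leaves out is the hard part: deciding which group a page belongs in (past vs. prefetched-unused), and moving pages between groups as they age.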
I'm waffling about what car to get. I really *need* a new car because I've realized that the Qualude is just too small for me. I've been sitting hunched over all my life, and I'm trying to correct that and decompress my spine, and I just can't do it.
I'm waffling about whether to have surgery. Some days my shoulders feel sort of okay and I think "I could live with this" but then other days I'm in a lot of pain and it's just all crunchy and grinding and searing pain and awful. And of course any time I try to do a pushup I think "I need surgery".
I always see both sides of every decision, and it's quite paralyzing. In my opinion, most people, even smart people are very cavalier about ignoring the pros and cons. But they are probably right to do so. It's better to just pick something and go with it and do your best with that decision. To put it in programming terms - I would say 90% of the smart programmers that I know are far too opinionated about various style issues, they don't rationally give fair credit to the other side's arguments and see that there are good arguments on both sides. But that's fine. Somebody who just says "low level C-style is the one true way" and runs with it might be wrong, but they can be productive because they have just made a decision and are getting work done. Somebody who sits around all the time thinking "hmm would it be better to do this C-style, or C++ OOP? or maybe need a GC language, or I could use OCaml for this bit.." is wasting way too much time equivocating.
Most successful business men, and just dynamic and fun people that I admire, are quick decision makers that just pick something and stick with it.
A related topic I've been thinking about recently : I've always been very dubious about the idea of learning from people who have been successful. There's this whole cult of worshipping rich people, reading interviews with them, getting their opinions on things, trying to learn what made them successful. I think it's mostly nonsense. The thing is, if you just look at who the biggest earners are, it's almost entirely luck.
Think about it this way - you have a bunch of gamblers. Some of them are very good and will make a steady profit of +10 a year with not much variance. The others suck, but play very wild, so their expected profit is +0, but they have high variance, so it could go between -100 and +100 in any given year.
If you look at the list of who the top winners are in any given year, it will be the people who suck. It will just be the ones that happened to get lucky. If you also look at the biggest losers, they will be using the exact same strategy as the biggest winners.
To make a more concrete example - let's say you have a biased coin you're allowed to flip. It has a 55% chance of being heads and you can bet as much of your bankroll as you want on it. A smart gambler might use a "Kelly" bet size ( see my old blog post ) and make a nice expected profit. A crazy gambler would just bet their whole bankroll on every flip. If you have 1000 people betting smart, and 1000 people betting crazy, after 10 flips all the biggest winners will be people playing the crazy strategy.
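Here's the arithmetic behind that, as a small sketch. It assumes the bet pays even money (the example doesn't say, so that's my assumption), in which case the Kelly fraction is 2p - 1. BestCase and KellyFractionEvenMoney are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>

struct CoinBetOutcome
{
    double bestCaseMultiple;   // bankroll multiple after winning every flip
    double probBestCase;       // chance of winning every flip
};

// Even-money bet (assumed): wager 'fraction' of bankroll on each flip, win prob 'p'.
CoinBetOutcome BestCase(double fraction, double p, int flips)
{
    assert(fraction >= 0.0 && fraction <= 1.0 && p >= 0.0 && p <= 1.0 && flips >= 0);
    CoinBetOutcome out;
    out.bestCaseMultiple = std::pow(1.0 + fraction, flips);
    out.probBestCase = std::pow(p, flips);
    return out;
}

// Kelly fraction for an even-money bet with win probability p.
double KellyFractionEvenMoney(double p)
{
    return 2.0 * p - 1.0;
}
```

With p = 0.55 the Kelly bettor wagers 10% per flip, so even the luckiest one (10 wins in a row) ends at about 2.6x their bankroll. A crazy bettor who wins all 10 ends at 1024x; each one has roughly a 0.25% chance of that, so out of 1000 you'd expect two or three of them, and every other crazy bettor is bust.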
Anyway, the point is if you just look at successful business people, they will probably be confident, decisive, risk takers, aggressive at seizing opportunities, aggressive about growing the business quickly, etc. That doesn't mean that those are the right things to do. It just means that those are variance-increasing traits that give them a *chance* to be a big success.
Now, just like in poker, there are times when variance-increasing is the right move. For example, if you believe that you basically suck. If you think you don't have an edge on your opponent, then the best way to beat him is to just play really wild and hope you get lucky. For example, the best way for a novice poker player to have a chance in a tournament is to just do lots of all-in preflop shoving and hope you win the draws. The same thing could certainly be true in capitalism; if you believe you don't really have any great money-making skills, your best way to get rich is to variance it up.
Seattle is getting fucked. They're destroying Capitol Hill. It's still sort of charming now, with old brick buildings, lots of greenery and plants, cheap rents and hipsters and artists. But they are tearing down the cool old buildings as fast as they can and putting up awful generic big blocks of condos plus retail space. They are literally turning it into Bellevue as fast as they can.
I despise dealing with the fucking receptionists at all these doctor places. I'm pretty sure the receptionist at MTI Physical Therapy is an alcoholic. I feel like I can identify it with just a minute of contact. There's a certain lazy attitude, a sloppy way of sitting, and a sort of floppy facial muscle nature, the way they talk is like their lips are too big and loose and they have to chew out the words. She's also got the sort of big jaw and brow and large nose with the bad pitted skin that I believe is caused by liver damage (Petechiae).
Anyway, it makes me think it might be nice to have a personal assistant. In theory you don't actually need to be making all that much money for a PA to be worth it. If they could actually remove all these distractions for me and help me focus on work and have a clear mind, it would be a huge boon. I don't want to have to think about bills or car registration or talking to receptionists or shopping or anything. I want only work and pleasure. It would give me time to exercise more, and to do more hobby programming. My concern would be that I couldn't actually trust the PA. Not "trust" as in worry they would steal from me, but trust as in be confident that they are taking care of things correctly. If I can't trust them, then it doesn't free my mind from worrying about all those things; I have to have full confidence that they are on top of everything, and it seems like someone solid enough to take care of things is not going to be a PA.
The tulips have started opening up around the city. I love the way the spring bloom here has such discrete stages. In a lot of places (Texas) it just suddenly happens one day and everything is blooming, but here we've had these steps as different species hit their time one by one. It's also pretty amazing the way the bulbs around here naturalize and just pop up in random parking strips and between bricks and such; we've had crocus, daffodil, and now the tulips.
I found a nice way to get a bike up and down cap hill - Interlaken Drive in Interlaken Park. It's like a tiny bit of woods, and there's nary a car on it. It's also got the really cool Hebrew Academy on it, which looks like some old gothic mansion; it makes me think of Wuthering Heights or Dickens or something, I imagine a rich old man muddling around in its dusty halls with a terribly unhappy niece that he forbids to go outside. I put some pictures on my flickr .
So I'm redoing the low level file IO part of Oodle. Actually I may be retargeting a lot of Oodle. One thing I took from GDC and something I've been thinking about a long time is how to make Oodle simpler. I don't want to be making a big structure that you have to buy into and build your game on. Rather I want to make something "leafy" that's easy for people to plug in at the last minute to solve specific problems.
Pursuant to that, the new idea is to make Oodle a handful of related pieces. You can use one or more of the pieces, and each is easy to plug in at the last minute.
1. Async File IO ; the new idea with this is that it's cross platform, all nice and async, does the LZ decompression on a thread, can do DVD packaging and DVD emulation, can handle the PS3/Xenon console data transfers - but it just looks like regular files to you. This is less ambitious than the old system ; it no longer directly provides things like paging data in & out, or hot-loading artist changes; you could of course still do those things but it leaves it more up to the client to do that.
The idea is that if you just write your game on the PC using lazy loose file loads, boom you pop in Oodle and hardly touch the code at all, and you automatically get your files packed up nice and tight into like an XBLA downloadable pack, or a DVD for PS3, or whatever, and it's all fast and good. Oh, and it also integrates nicely with Bink and Miles and Granny so that you can do things like play a Bink video while loading a level, and the data streamers share the bandwidth and schedule seeks correctly.
2. Texture goodies. We'll provide the most awesome threaded JPEG decoders, and also probably better custom lossy and lossless texture compressors that are specifically designed for modern games (with features like good alpha-channel support, support for various bit depths and strange formats like X16Y16 , etc.). Maybe some nice DXTC realtime encoding stuff and quality offline encoding stuff. Maybe also a whole custom texture cache thing, so you can say Oodle use 32 MB for textures and do all the paging and decompression and such.
3. Threading / Async utilities. You get the threaded work manager, the thread profiler, we'll probably do "the most awesome" multithreaded allocator. We'll try to give these specific functions that address something specific that people will need to ship a game. eg. a customer is near ship and their allocator is too slow and taking too much memory so they don't fit in the 256 MB of the console. Boom plug in Oodle and you can ship your game.
Anyway, that's just the idea, it remains to be worked out a bit. One thing I'm definitely doing is the low level IO is now going through a page cache.
As I'm writing it I've been realizing that the page cache is a super awesome paradigm for games in general these days. Basically the page cache is just like an OS virtual memory manager. There's a certain limited amount of contiguous physical memory. You divide it into pages, and dynamically assign the pages to the content that's wanted at the time.
Now, the page cache can be used just for file IO, and then it's a lot like memory mapped files. The client can use the stdio look-alike interface, and if they do that, then the page cache just automatically does the "right thing", prefetching ahead pages as they sequentially read through a file, etc.
But since we're doing this all custom and we're in a video game environment where people are willing to get more manual and lower to the bone, we can do a lot more. For example, you can tell me whether a file should sequential prefetch or not. You can manually prefetch at specific spots in the file that you expect to jump to. You can prefetch whole other files that you haven't opened yet. And perhaps most importantly, you can assign priorities to the various pages, so that when you are in a low memory situation (as you always are in games), the pages will be used for the most important thing. For example you can prefetch the whole next file that you expect to need, but you would do that at very low priority so it only uses pages if they aren't needed for anything more urgent.
The next awesome thing I realized about the page cache is that - hey, that can just be the base for the whole game memory allocator. So maybe you give 32 MB to page cache. That can be used for file IO, or video playback - or maybe you want to use it to pop up your in game "pause menu" GUI. Or say you want to stream in a compressed file - you map pages to read in the packed bits, and then you map pages as you decompress; as you decompress you toss the pages with the packed bits and leave the uncompressed pages in the cache.
Say you have some threaded JPEG decompress or something - it grabs a page to decompress to. The other cool thing is because all this is manual - it can grab the page with a certain priority level. If no page is available, it can do various things depending on game knowledge to decide how to respond to that. For example if it's just decompressing a high res version of a texture you already have in low res, it can just abort the decompress. If it's a low priority prefetch, it can just go to sleep and wait for a page to become available. If it's high priority, like you need this texture right now, that can cause other pages that are less important to get dropped.
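A minimal sketch of what "grab a page with a priority" could look like. This is just an illustration under my own assumptions, not the real Oodle interface : PriorityPagePool, TryGrab and Release are hypothetical names, priorities are plain non-negative ints with higher meaning more urgent, and a request only evicts a page that is strictly less important than itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size page pool. Each page records the priority of
// whatever currently owns it; kFree marks an unused page.
class PriorityPagePool
{
public:
    enum { kFree = -1 };

    explicit PriorityPagePool(size_t numPages) : m_owners(numPages, kFree) { }

    // Try to get a page for a job of the given priority (higher = more urgent).
    // Takes a free page if there is one; otherwise evicts the least important
    // page, but only if it is strictly less important than the request.
    // Returns the page index, or -1 if nothing could be taken.
    int TryGrab(int priority)
    {
        assert(priority >= 0);
        int victim = -1;
        for (size_t i = 0; i < m_owners.size(); i++)
        {
            if (m_owners[i] == kFree) { victim = (int)i; break; }
            if (victim < 0 || m_owners[i] < m_owners[victim]) victim = (int)i;
        }
        if (victim < 0) return -1;   // pool has no pages at all
        if (m_owners[victim] != kFree && m_owners[victim] >= priority) return -1;
        m_owners[victim] = priority;
        return victim;
    }

    void Release(int page)
    {
        assert(page >= 0 && (size_t)page < m_owners.size());
        m_owners[page] = kFree;
    }

private:
    std::vector<int> m_owners;
};
```

When TryGrab fails, what happens next is the caller's business, which is the whole point : the high res texture decompress just aborts, the low priority prefetch goes to sleep and retries later, and the "need it right now" job comes in with a priority high enough that it doesn't fail in the first place.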
Pages could also cache procedurally generated textures, data from a network, optional sounds, etc. etc.
I think about it this way - in the future of multicore and async and threading and whatnot, you might have 100 jobs to run per frame. Some of those jobs need large temp work memory. You can't just statically assign memory to various purposes, you want it to be used by the jobs that need it in different ways.
There's a black Porsche Cayenne in Seattle with the custom plate "PEPPR". That on its own is reason enough to punch this guy in the crotch, but he's also a complete douchebag driver. He was speeding up and slowing down, tailgating, cutting across many lanes of traffic. He cut way across the merge triangle from 520 to 405, which kicks up a ton of debris, and he also did it straight into heavy traffic, causing lots of people to have to brake for him. What a cock. If you ever see this guy, please run into him intentionally, or at least key his car.
It is your moral duty as someone alive to punish dicks. We are all investors in the "social economy" that we all play in. If you don't punish dicks, you're just as bad as the pension managers who were asleep at the wheel of corporate oversight.
BTW you may be thinking it's hypocritical of me to criticize crazy drivers. Yes, I am somewhat crazy myself, but I try to be crazy without inconveniencing others. I try to be considerate. I hate people who drive too slow, but I also hate people who drive too crazy. That's not a contradiction - the point is that like most things in life there is a good middle ground, and you are a dick if you go too far in either direction.
I'm trying desperately to avoid sitting at the computer when I'm not working, but so far I haven't been too successful. I just don't know what else to do with myself. Last weekend I tried to make myself do all kinds of non-computer stuff. I put up drape hold backs, I vacuumed the radiators, I tacked the cable TV line to the floorboards, I went shopping for clothes around the city, I went for walks. I made a pork shoulder ragu with pappardelle (that was quite fantastic, maybe the best ever ; similar to this or this ; some mods I made : use a mix of stock and red wine for braising liquid; when the braise is done, remove the meat and boat-motor the veg to make a sauce, use a bit of fennel seed, and one dried red chile).
And that all only took a few hours. Now what? I can read, watch TV, or get on the computer.
I kinda want to make a little iPhone game, but I'm not letting myself because that would mean a lot more computer sitting. (it would also surely distract me from the work I need to be doing for RAD).
If you are a young man thinking about a career, I strongly encourage you to stay away from computers. They will ruin your life. You will spend your years trapped in cubicles under fluorescent lights, hanging out with a bunch of other nerdy men, not doing anything really dynamic or exciting in your life, and slowly ruining your body through keyboarding and too much sitting. You won't have contact with girls, or fun different people, you won't do things that you can tell stories about, you won't have freedom to develop your other interests, it will take all your time and you'll wind up getting pigeon-holed into some specialty that becomes increasingly tedious to you.
ADDENDUM : I don't think anyone sees my addendums now, because the first draft gets sent out with RSS, and then nobody goes back to the original page to see the updates. Mmm I kinda hate RSS. Anyhoo -
I got a medical massage from this guy named John Bagley at Belltown Healing Arts. It was fantastic, I felt a ton better afterward, and actually went in for a doctor exam later that same day, and my function tests were way improved. He's literally the first person I've seen that I felt like really examined my body and identified the problems. If you have computer/injury body problems, I highly recommend him, though it does mean you will have to be touched by a man, which may be confrousing to some.
If you go in to an orthopedist or a physical therapist and they don't tell you to take your shirt off, and actually look at and touch your muscles and bones to see what's going on - just walk out. They're a fucking hack. How can they diagnose you and treat you without seeing what's going on? Furthermore, a physical therapist should actually have a finger on your muscles as you do the exercises. If they just sit there and look bored and you do your reps, walk the fuck out.
I think fashion is kind of like a game of chicken. When I see someone that makes me stop and go "wow that's a cool look" it's almost always because they're wearing something really ridiculous, something that actually just looks goofy and retarded. But by wearing something ridiculous and pretending that it's okay, they are showing that they have enough status and coolness to get away with it. They're daring the world to laugh at them.
In PUA lingo, this is DHV (Demonstration of Higher Value). When someone wears something really outrageous it's like making a poker bluff saying "I am cool enough to get away with this outfit". They put that bet out there, and then the world can either call their bluff (by laughing at them), or if the world lets them get away with it, they have succeeded with the DHV.
Of course most people, like me, chicken out and don't even try to be outrageous. Wearing bland ordinary stuff is a way of being meek, invisible, of not trying, because you're afraid your bluff will fail and the world will laugh at you.
My fucking physical therapist lied to me about being a preferred provider for my PPO, so now I get to go through health care grievance arbitration. Yay ! I'm not too optimistic about that working out well, so I probably get to eat the cost. Fuckers.
In other news our fucking cock landlord has the heat in the building turned off and it's been super cold again recently, so we had literally no heat over the weekend and it was miserable, so I got the fun of calling him.
Oh, and the white trash neighbors had a backyard bonfire at midnight and filled our house with smoke. That's the last one I'm tolerating, next time I just tell them no, and if it persists they get reported for fire code violations. I went and talked to them about it once before and they were just like "oh yeah okay". I have no problem confronting them, I'm just concerned about the possible revenge response. (* see later).
And the other side neighbors just had a baby. It's not actually that bad, we hear it cry a bit once in a while, but the side walls in the building are almost half decent (unlike the floors). The baby neighbors are actually incredibly considerate about not making noise, actually too considerate, I find it annoying. It makes me feel uptight just being around them. I sort of imagine that they put a pillow over the crying baby because they're worried about it being too loud.
The whole situation with health care billing pisses me off so much. You don't get to see your bill until weeks after you go in, because the health plan has to approve or not approve various charges. Many providers illegally try to "balance bill" you. Watch out for that. Even if you are careful and specifically try to go to a preferred provider, they will often take you in for xrays or something and the xray tech is not a preferred provider, so blammo suddenly you get a nasty bill. The thing that happened with my physical therapist is that the company is a preferred provider, but the particular therapist that they assigned me to is not.
It's ridiculous how bad the health insurance system is if you think about it. The health insurance companies should want you to go to cheaper, better providers. It would save them money, which should be the whole point of an HMO type company : to set you up with better care for cheaper. But they don't, they do nothing. The right way to do it would be to provide information and choice to the customer. For example, make you pay a 10% coinsurance, and then show you the average cost of various providers. Also, they should give you quality ratings for various providers. It is in their best interest financially to send you to better quality providers who make you healthier in the minimum number of visits, because that reduces their cost of followups and complications. They are better off if you see surgeons who have a higher improvement rate. But they provide you with none of that information. Instead they literally just give you a provider directory with 1000 names in it. Gee thanks, now what do I do? Eenie meanie miney moe.
Some random less bitter junk :
This is pretty old, but the NYT Buy vs. Rent Calculator is super awesome. It's just an amazingly well designed piece of interactive statistics graphing. You can click on all the spots to get more info, you can drag the sliders on the left and it adjusts the graph. It's just so money.
The crashing real estate market is sort of tempting me, but this actually reminded me that it's not so hot. Yes, things are closer to a decent bargain than they have been, but even undervalued stuff is not likely to go up much at all in the next 5 years, what with finances being generally tight and people being scared of buying. I'm expecting home prices to be almost level for 5 years, so even getting into something that seems like a bargain right now isn't really a great move unless you hold it for 10 years or so.
(*) : I think the fear of revenge response is actually usually wrong. For example in customer service situations, when someone is being totally incompetent, like the other day I went to the UPS store and was trying to send something to Lebanon, PA, and the guy starts asking about declared value and contents and whatnot, I started thinking in my head "wait, are you doing a customs form to send this to the country of Lebanon?" but I just tried to be nice. The logic is that if you just call them on their stupidness then they will be unhelpful, while if you are polite and nice to them, they will try to help you more and give you better service.
I think that's mostly wrong. Most people just try to incompetently muddle through life and are just constantly fucking up and cutting corners and being lazy and not doing their job. If you let them, they will keep doing that. If you give an inch by being nice, they will just keep fucking up. If you call them on it and say "hey, I'm sending this to Pennsylvania buddy", they get shocked out of it, they realize they can't get away with sneaking their half assed shit past you, because you're going to call them on it. Then they want to get their work done and take care of you because they don't want to be around someone that calls them on shit.
Also : Majesco's stock ticker is COOL !? WTF that is so awesome. And it's perfect if you ever meet the Majesco people. They know absolutely nothing about video game development, and literally talk like The Sopranos; I swear that company is some kind of mob money laundering operation.
Right now in Seattle, the Lawrimore Project features "Stability" - two people are living inside a giant teeter totter, so that they have to balance each other all the time.
Bakery Nouveau in West Seattle is probably the best bakery I've found yet here. It seems they only do one early morning baking, so late-day stuff is a bit stale. It's worth going in the morning on the weekend, fantastic pastries, bread, quiche and etc.
Skagit Valley is home to lots of Tulip and Daffodil farms where the fields are full of flowers for commercial production. It's quite a pretty area. The main bloom is coming in the next few weeks . I might try to get up there for a little road trip.
D & M seems to be one of the few places that will ship liquor to WA. Nice scotch selection.
Blackbird in Ballard seems to be the only actual cool men's clothing store in Seattle. Rather pricey.
You may know I've been looking at getting a 135 (damn Ryan beat me to it). One interesting option is Performance Center Delivery . You get the car cheaper, and you get a free track day out of it, which would be awesome. The down side is it's in fucking South Carolina (what kind of moron would ever want to live in SC!) which is an awfully far way to drive back here.
CoreInfo is useful for seeing how your cores map to logical processors and all that nonsense which will become ever more critical. I find all the low level lock free multithreading stuff super amusing to program, but it's just really not practical most of the time, because it does make code much harder to modify and debug and everything.
Kubota Garden is a large Japanese Garden in Seattle that was built almost entirely by a single family over many years. The city now runs it. Will have to check that out.
Good links from other blogs : Meme timeline and radical cartography .
Gus Hansen TV is pretty fucking awesome. He's quite a nutter. I think the jury is actually still out on whether he is any good or not. Most of the top players seem to think he is a huge fish - the high stakes games actually run *around* Gus right now, but Gus is not actually losing money in them yet. Time will tell. It requires Silverlight and a decently fast connection.
Some Puget Sound area touring links. I haven't done any of this yet so I don't know what of this information is actually good :
Trails and trips for Washington hiking, free info and resources...
SummitPost - Mountain Loop Highway -- Climbing, Hiking & Mountaineering
State Scenic Byway Scenic Driving Tours Washington State
National Scenic Byways Washington
Mountain Loop Highway, Glacier Peak Region, Washington
GORP - Washington Scenic Drives
Beat back the forest to exquisite Bedal Basin
Alpinism in the Pacific Northwest
When you're putting condiments on your sandwich, you have to go in the right order. If you want mustard, mayo, and butter (btw butter on french bread is what makes a sandwich yum) you should go butter first, then mayo, then mustard. Getting a little butter in the mayo is no big deal, same with getting a little mayo in the mustard, but getting mustard on the butter would be a major faux-pas.
It's a trickier question with peanut butter and jelly. Do I get the PB in the jelly, or the jelly in the PB? And BTW please don't suggest using two knives, that is right out.
I've been going to a lot of physical therapy and doctor appointments and whatnot recently. It's such a huge distraction. For one thing it just takes a ton of time. The appointment might be only an hour, but you have to get there & park, then wait because they're always off schedule, then drive back. It winds up taking about three hours. And then there's the fact that it breaks up your day. I can't concentrate for hours before an appointment because I know an appointment is coming up. (it's kind of like how if I know I have to wake up early for something, I can't sleep that whole night because I keep waking up every half hour and looking at the clock wondering if I'll sleep through the alarm somehow). The whole hour before the appointment I can't really do any work because I don't want to start digging into something and get on a roll if I'm gonna have to cut it off.
The result is that on these days with doctor shit, I hardly get a lick of work done, and I just can't get my mind back onto focus. I only really do awesome work when I can wake up and just start working and not have any appointments or anything scheduled. I need to just be able to work as long as I want, when I want.
It also gets really pricey. At close to $100/visit at 3 visits to different people a week, you get into $1000/month very fast. I guess poor people don't get to be healthy. One of the weird things is that even though the physical therapist visit charges $200, (and I pay $50-$100 out of pocket that the insurance doesn't cover), the actual therapist is not making much money at all, they get maybe $30/hour ($60k/year). $170 is going to overhead, to the owner of the PT business, to the health insurance company and its executives and stock holders.
This is basically the model of all commerce in America these days - the product is too expensive for median-income consumers, and yet the people who actually make the product don't get that money, and they aren't paid enough to buy the very thing they make. The difference is getting skimmed to the super-rich. This is how income inequality grows. As a consumer, you should demand higher quality goods for cheaper prices, which means more of your money is going directly to the people who actually made the goods or provided the services. You should refuse to buy expensive garbage where 50%+ of the cost goes to management and shareholders.
Wright Angle is another nice quality Seattle food blog. Not awesome humor content like Surly Gourmand, but lots of good info and photos. It's helpful to me just to have blog subscriptions related to the things I like to do in life, because it gives me little reminders to get out and do those things. I'd like to find one about cycling around Seattle, maybe one about day trips and driving tours in the Puget Sound area, maybe one about raves, one about S&M and swinging, you know, all my interests.
Drew sent me this : NVidia Ion mini PC . I love these little mini cheap quiet PC's. NVidia might save their company with the cheap integrated market segment, because their integrated controller/graphics part is by far the best on the market right now (my god AMD/ATI should have such a clear advantage in this segment but they just can't get their shit together).
Anyway, the Ion PC made me think about something Butcher's been saying that just finally really rang true to me : the real loser in these PC price wars is MS. I mean obviously that's not true, clearly the hardware guys are the first to take the big hit, like Dell is in trouble because the Micro PC's are driving down prices and profit. But once we get into $300 and cheaper PC's, the problem for MS is that a $50-$100 copy of Windows doesn't make much sense any more.
When a PC is $1000 you can easily hide the $100 price of Windows, it doesn't seem so significant. But if you look at a $200 mini PC running Linux vs. a $300 mini PC with Windows, people will go for the Linux. (MS is aware of this and has recently started selling Windows cheaper to OEM's for use in the mini-PC market).
Fucking doctors are such fucking cock ass mother fucking scum sucking sons of bitches.
I've requested my medical records from every doctor that's been involved with my shoulders. I've gotten them from zero. The offices always treat me like I'm being such a huge pain in the ass. Fuck you it's my fucking records. When I tell them that I just referred myself they give me all this attitude. Fuck you, I have a PPO and I have fucking torn shoulders, I know I should be seeing a shoulder specialist, I don't need to go to some fucking primary care retard just to get referred back to you.
The doctors spend less than 5 minutes actually looking at me. They don't ask about detailed history at all. They immediately send you for MRI's. But then they literally don't even look at the MRI's. They just read the one page summary by the MRI tech. I've read those summaries and they're literally like one or two sentences. The doc doesn't read it before you visit, they just skim the summary once they get handed your chart in the office. Then they refer you to Physical Therapy.
The PT and the doc literally never talk. In fact, the doc doesn't even write any notes or instructions for the PT. When you show up at the PT office, they don't have your medical records or know anything about you really. WTF am I paying all you fucking tards so much for !?
Even at the PT places, the actual trained PT only sees you for maybe half an hour or so, and then you get handed off to the exercise babysitter who doesn't know shit about shit.
The Washington State Liquor Laws are literally a direct funnel of money to the big distributors and producers. I knew that intuitively, but I found a good article that lays out the details : here . The law literally prohibits retailers from negotiating prices with distributors, prevents retailers from buying directly from producers, and mandates a 10% markup. This kind of public-private monopoly is always bad for the public, but is quite popular with both Dems and Reps (see, eg. health care, power utilities, telecom, military contractors, etc).
is gorgeous. But it is wreaking havoc with my sinuses. It's gotten so bad that I'm getting bloody noses now, which is quite rare for me. It's especially fun when you have a bloody nose combined with hard sneezes, so that you shoot red snot all over the place. Anyway, walking around in the sunshine is delightful :
The azaleas have just started blooming in the warmer/sunnier spots around the city. I guess they'll kick in over the next few weeks. After that I'm looking forward to the Rhododendrons; I see them all over the city with their big leaves and their closed up flower buds, so I suspect there will be lots of them when they open. I think the arboretum has a nice collection.
While the cherries are nice, I'm particularly attracted to the trees with big white flowers; I'm not sure what they are, maybe dogwood? It's not something I've ever seen in California.
I thought I'd write a bit about my multithreaded "Worklet" dispatcher before I forget about it. I call little units of work "worklets" just because I like to be nonstandard and confusing.
The basic idea is that the main thread can at any time fire up a bunch of worklets. The worklets then go to a bunch of threads and get done. The main thread can then wait on the worklets.
There are a few things I do differently in the design from most people. The most common "thread pool" and "job swarm" things are very simple - jobs are just an isolated piece of independent work, and often once a bunch is fired you can only wait on them all being done. I think these are too limiting to be really generally useful, so I added a few things.
1. Worker threads that are not in use should go completely to sleep and take 0 cpu time. There should be a worker thread per processor. There might also be other threads on these processors, and we should play nice with arbitrary other programs running and taking some of the cpu time. Once a worker thread wakes up to do work, it should stay awake as long as possible and do all the work it can, that is, it shouldn't have to go to sleep and wait for more work to get fired so it can wake up again.
2. Worklets can have dependencies on other worklets. That way you can set up a dependency tree and fire it, and it will run in the right order. eg. if I want to run A, then B, then run C only after A & B are both done, you fire worklets {A}, {B} and {C: dependent on AB}. Dependencies can be evaluated by the worker threads. That's crucial because it means they don't need to stall and wait for the main thread to fire new work to them.
3. The main thread can block or check status on any Worklet. The main thread (or other game threads) might keep running along, and the worker threads may be doing the work, and we want to be able to see if they are done or not. In particular we don't want to just support the OpenMP style "parallel for" where we fork a ton of work and then immediately block the main thread on it - we want real asynchronous function calls. Often in games we'll want to make the main thread(s) higher priority than the worker threads, so that the worker threads only run in idle time.
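Here is a rough sketch of points 1-3, in Python for brevity. It is not my actual dispatcher: the `Worklet` and `Dispatcher` classes are illustrative names, and it uses one shared locked queue (`queue.Queue`) instead of lock-free per-worker queues with stealing. What it does show is sleeping workers, dependencies being resolved by the worker threads themselves, and the main thread blocking on or polling a single worklet.

```python
import queue
import threading


class Worklet:
    def __init__(self, fn, deps=()):
        self.fn = fn
        self.deps = list(deps)
        self.dependents = []            # worklets waiting on this one
        self.pending = 0                # unfinished deps, guarded by the lock
        self.done = threading.Event()   # main thread can wait() or is_set()


class Dispatcher:
    def __init__(self, num_workers):
        self.ready = queue.Queue()
        self.lock = threading.Lock()    # protects the dependency bookkeeping
        for _ in range(num_workers):
            threading.Thread(target=self._worker, daemon=True).start()

    def fire(self, worklet):
        with self.lock:
            for dep in worklet.deps:
                if not dep.done.is_set():
                    dep.dependents.append(worklet)
                    worklet.pending += 1
            runnable = worklet.pending == 0
        if runnable:
            self.ready.put(worklet)

    def _worker(self):
        while True:
            worklet = self.ready.get()  # sleeps while there is no work
            worklet.fn()
            with self.lock:
                worklet.done.set()
                released = []
                for dependent in worklet.dependents:
                    dependent.pending -= 1
                    if dependent.pending == 0:
                        released.append(dependent)
            # The worker itself queues whatever just became runnable,
            # so nothing stalls waiting for the main thread.
            for dependent in released:
                self.ready.put(dependent)


dispatcher = Dispatcher(num_workers=2)
order = []
a = Worklet(lambda: order.append("A"))
b = Worklet(lambda: order.append("B"))
c = Worklet(lambda: order.append("C"), deps=[a, b])
for w in (c, a, b):                     # firing order doesn't matter
    dispatcher.fire(w)
finished = c.done.wait(timeout=5)       # block on one specific worklet
```

A and B can finish in either order, but C always runs last. For a non-blocking status check the main thread would call `c.done.is_set()` instead of `wait`.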
The actual worker threads I implemented with a work-stealing scheme. It's not true work-stealing at all, because there is a concept of a "main thread" that runs the game and pushes work, and there's also the dependencies that need to be evaluated. All of the thread-thread communication is lock free. When the main thread adds new work items it just jams them onto queues. When the worker threads pop off work items they just pop them off queues. I do currently use a lock for dependency evaluation.
Traditional work stealing (and the main papers) are designed for operating system threads where the threads themselves are the ones making work. In that environment, the threads push work onto queues and then pop it off for themselves or steal from peers. There are custom special lock-free data structures designed for this kind of operation - they are fast to push & pop at one end, but also support popping at the other end (stealing) but more slowly. What I'm doing is not traditional work stealing. I have external threads (the "game threads") that do not participate in the work doing, but they can push work to the workers. In my world the workers currently can never make new work (that could be added if it's useful).
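For reference, the shape of that traditional structure looks something like the sketch below, in Python for brevity. The real ones are lock-free; this stand-in just puts a lock around a `collections.deque`, and the `StealableQueue` name is made up. The owner pushes and pops at one end, thieves take from the other.

```python
import threading
from collections import deque


class StealableQueue:
    """Lock-based stand-in for a lock-free work-stealing deque."""

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()

    def push(self, item):
        # Owner thread adds new work at the "hot" end.
        with self._lock:
            self._items.append(item)

    def pop(self):
        # Owner thread takes its own most recent work; None if empty.
        with self._lock:
            return self._items.pop() if self._items else None

    def steal(self):
        # Other threads take the oldest work from the far end.
        with self._lock:
            return self._items.popleft() if self._items else None


q = StealableQueue()
for job in ("job1", "job2", "job3"):
    q.push(job)
mine = q.pop()       # owner gets the newest item
stolen = q.steal()   # a thief gets the oldest item
```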
There are a lot of nice things about work stealing. One is you don't need a separate dispatcher thread running all the time (which would hurt you with more context switches). Another is that workers who have cpu time can just keep jamming along by stealing work. They sort of do their own dispatching to themselves, so the threads that have the cpu do the work of the dispatching. It also offloads the dispatching work from the main thread. In my system, the workers do all work they can that's known to be dependency-okay. Once that work is exhausted they reevaluate dependencies to see if more work can be done, so they do the dependency checking work for themselves off the main thread.
Another nice thing about work stealing is that it's self-balancing in the face of external activity. Anything running on a PC has to face lots of random CPU time being stolen. Even in a console environment you have to deal with the other threads taking variable amounts of time. For example if you have 6 cores, you want 6 worker threads. But you might also have 3 other threads, like a main thread, a gpu-feeding thread, and a sound thread. The main thread might usually take 90% of the cpu, so the worker on that core rarely gets any time, the gpu-feeder might usually take 50% of the cpu time on its core, but in phases, like it takes 100% of the cpu for half the frame then goes idle for the other half. With work stealing your worker thread will automatically kick in and use that other time.
In order for the self balancing to work as well as possible you need small worklets. In fact, the possible wasted time is equal to the duration of the longest task. The time waste case happens like this :
Fire N work items, each taking time T, to N cores
For some reason one of the cores is busy with something else so only (N-1) of them do work
Now you block on the work being done and have to wait for that busy core, so total time is 2T
Total time should have been N*T/(N-1)
For N large the waste approaches T
Basically the smaller T is (the duration of longest work) the more granular the stealing self-allocation is. Another easy way to see it is :
You are running N workers and a main thread
The main thread takes most of the time on core 0
You fire (N-1) very tiny work items and the (N-1) non-main cores pick them up
You fire a very large work item and core 0 worker picks it up
That's a disaster. The way to avoid it is to never fire single large work items - if you would have split that into lots of little work items it would have self-balanced, because the stealing nature means that only worker threads that have time take tasks.
For example, with something like a DXTC encoder, rather than split the image into something like N rectangles for N cores and fire off 1 work item to each core, you should go ahead and split it into lots of tiny blocks. Of course this requires that the per-work-item overhead is extremely low, which of course it is because we are all lock-free and goodness.
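To put numbers on the granularity argument, here is a toy model in Python. It assumes the busy core contributes nothing, the other workers greedily grab the next item the moment they go idle, and per-item overhead is zero; `makespan` is just a name I made up for the helper.

```python
import heapq


def makespan(num_items, item_time, num_free_workers):
    """Time until all items finish when idle workers grab the next item."""
    free_at = [0.0] * num_free_workers   # when each worker next goes idle
    heapq.heapify(free_at)
    for _ in range(num_items):
        start = heapq.heappop(free_at)   # earliest idle worker takes the item
        heapq.heappush(free_at, start + item_time)
    return max(free_at)


N, T = 8, 1.0
coarse = makespan(N, T, N - 1)         # N items of time T       -> 2.0
fine = makespan(N * 4, T / 4, N - 1)   # same work in 4x as many -> 1.25
ideal = N * T / (N - 1)                # perfect split, about 1.14
```

Splitting every item into 4 pieces takes the total from 2T down to 1.25T, and it keeps approaching N*T/(N-1) as the pieces get smaller.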
There are some things I haven't done yet that I might. One is to be a bit smarter about the initial work dispatching, try to assign to the CPU that will have the most idle time. If you actually did make nice tiny worklets all the time, that wouldn't be an issue, but that's not always possible. In the case that you do make large work items, you want those to go to the cores that aren't being used by other threads in your game.
Another issue is the balance of throughput vs latency. That is, how fast does the system retire work, vs. how long does it take any individual work item to get through. Currently everything is optimized for throughput. Work is done in a roughly FIFO order, but with dependencies it's not guaranteed to be FIFO, and with the work stealing and variations in CPU Time assignment you can have individual work items that take a lot longer to get through the system than is strictly necessary. Usually this isn't a big deal, but sometimes you fire a Worklet and you need it to get done as quickly as possible. Or you might need it to get done inside a certain deadline, such as before the end of the frame. For example you might fire a bunch of audio decompression, but set a deadline to ensure it's done before the audio buffers run out of decompressed data. Handling stuff like that in a forward-dispatched system is pretty easy, but in work-stealing it's not so obvious.
Another similar issue is when the main thread decides to block on a given work item. You want that item to get done as soon as possible by the thread that has the highest probability of getting a lot of CPU time. Again not easy with work-stealing since some worker thread may have that item in its queue but not be getting much CPU time for some reason.
I've been reading a bit about Multi-threaded Allocators. A quick survey :
The canonical base allocator is Hoard. Hoard is a "slab allocator"; it takes big slabs from the OS and dedicates each slab to a single fixed block size. In fact it's extremely similar to my "Fixed Restoring" allocator that's been in Galaxy3 forever and we shipped in Stranger. The basic ideas seem okay, though it appears to use locks when it could easily be lock-free for most ops.
One thing I really don't understand about Hoard is the heuristic for returning slabs to the shared pool. The obvious way to do a multi-threaded allocator is just to have a slab pool for each thread. The problem with that is that if each thread does 1 alloc of size 8, every thread gets a whole slab for itself and the overhead is proportional to the number of threads, which is a bad thing going into the massively multicore future. Hoard's heuristic is that it checks the amount of free space in a slab when you do a free. If the slab is "mostly" free by some measure, it gets returned to the main pool.
My problem with this is it seems to have some very bad cases that IMO are not entirely uncommon. For example a usage like this :
for( many )
{
allocate 2 blocks
free 1 block
}
will totally break Hoard. From what I can tell what Hoard will do is allocate you a slab for your thread the first time you go in, then when you free() it will be very empty, so it will return the slab to the shared pool. Then after that when you allocate you will either get from the shared pool, or pull the slab back. Very messy.
There are two questions : 1. how greedy is new slab creation? That is, when a thread allocates an object of a given size for the first time, does it immediately get a brand new slab, or do you first give it objects from a shared slab and wait to see if it does more allocs before you give it its own slab. 2. how greedy is slab recycling? eg. when a thread frees some objects, when do you give the slab back to the shared pool. Do you wait for the slab to be totally empty, or do you do it right away.
The MS "Low Fragmentation Heap" is sort of oddly misnamed. The interesting bit about it is not really the "low fragmentation", it's that it has better multi-threaded performance. So far as I can tell, the LFH is 99.999% identical to Hoard. See MS PPT on the Low Fragmentation Heap
We did a little test and it appears that the MS LFH is faster than Hoard. My guess is there are two things going on : 1. MS actually uses lock-free linked lists for the free lists in each slab (Hoard uses locks), and 2. MS makes a slab list *per processor*. When a thread does an alloc it gets the heap for the processor it's on. They note that the current processor is only available to the kernel (KeGetCurrentProcessorNumber ). Hoard also makes P heaps for P processors (actually 2P), but they can't get current processor so they use the thread Id and hash it down to P which leads to occasional contention and collisions.
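A back-of-envelope for the hashing collisions, in Python. This assumes the hash spreads thread ids uniformly and independently over the heaps, which real thread ids may not do, and `collision_probability` is just an illustrative helper. It's the birthday problem: the chance that at least two threads land on the same heap.

```python
from fractions import Fraction


def collision_probability(num_threads, num_heaps):
    """Chance that at least two threads hash to the same heap."""
    no_collision = Fraction(1)
    for already_placed in range(num_threads):
        # The next thread has to miss every heap that's already taken.
        no_collision *= Fraction(num_heaps - already_placed, num_heaps)
    return 1 - no_collision


# 4 processors -> 2P = 8 heaps, with 4 threads doing allocations
p = collision_probability(num_threads=4, num_heaps=8)  # Fraction(151, 256), about 0.59
```

Sharing a heap only hurts when the two threads actually allocate at the same moment, so the contention you observe is rarer than that number, but it is a cost that per-processor heaps don't pay.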
The other canonical allocator is tcmalloc . I still haven't been able to test tcmalloc because it doesn't build on VS2003. tcmalloc is a bit different than Hoard or LFH. Instead of slab lists, it uses traditional SGI style free lists. There are free lists for each thread, and they get "garbage collected" back to the shared pool.
One issue I would be concerned about with tcmalloc is false sharing. Because the free lists get shared back to a global pool and then redistributed back to threads, there is no real provision to prevent little items from going to different threads, which is bad. Hoard and LFH don't have this problem because they assign whole slabs to threads.
The Hoard papers make some rather ridiculous claims about avoiding false sharing, however. The fact is, if you are passing objects between threads, then no general purpose allocator can avoid false sharing. The huge question for the allocator is - if I free an object on thread 1 that was allocated on thread 2 , should I put it on the freelist for thread 1, or on the freelist for thread 2 ? One or the other will make false sharing bad, but you can't answer it unless you know the usage pattern. (BTW I think tcmalloc puts it on thread 1 - it uses the freelist of the thread that did the freeing - and Hoard puts it on thread 2 - the freelist of the thread that did the allocation; neither one is strictly better).
Both tcmalloc and the LFH have high memory overhead. See here for example . They do have better scalability of overhead to high thread counts, but the fact remains that they may hold a lot of memory that's not actually in use. That can be bad for consoles.
In fact for video games, what you want in an allocator is a lot of tweakability. You want to be able to tweak the amount of overhead it's allowed to have, you want to be able to tweak how fast it recycles pages for different block types or shares them across threads. If you're trying to ship and you can't fit in memory because your allocator has too much overhead, that's a disaster.
BTW false sharing and threaded allocators are clearly a place that a generational copying garbage collector would be nice. (this does not conflict with my contention that you can always be faster by doing very low level manual allocation - the problem is that you may have to give tons of hints to the allocator for it to do the right thing). With GC it can watch the usage and be smart. For example if a thread does a few allocs, they can come from a global shared pool. It does some more allocs of that type, then it gets its own slab and the previous allocs are moved to that slab. If those objects are passed by a FIFO to another thread, then they can be copied to a slab that's local to that thread. Nice.
It seems clear to me that the way to go is a slab allocator. Nice because it's good for cache coherence and for avoiding false sharing. The ops inside a slab can be totally thread-local and so there's no need to worry about multithread issues. Slabs are large enough that slab management is rare and most allocations only take the time to do inside-slab work.
One big question for me is always how to get from an object to its owning slab (eg. how is free implemented). Obviously it's trivial if you're okay with adding 4 bytes to every allocation. The other obvious way is to have some kind of page table. Each slab is page-aligned, so you take the object pointer and truncate it down to slab alignment and look that up. If for example you have 32 bit pointers (actually 31), and your pages are 4k, then you need 2 to the 19 page table entries (512k). If a PTE is 4 bytes that's 2 MB of overhead. It's more reasonable if you use 64k pages, but that means more waste inside the page. It's also a problem for 64-bit pointers.
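To make the page table arithmetic concrete, here's a minimal sketch. The function names are made up for illustration and this is just the sizing math plus the index computation, not how any particular allocator does it.

```cpp
#include <cassert>
#include <cstdint>

// Number of entries in a flat page table that covers 2^address_bits bytes
// of address space using pages of 2^page_bits bytes.
inline std::uint64_t page_table_entries(unsigned address_bits, unsigned page_bits)
{
    assert(page_bits <= address_bits && address_bits - page_bits < 64);
    return std::uint64_t(1) << (address_bits - page_bits);
}

// Total table size in bytes for a given per-entry size.
inline std::uint64_t page_table_bytes(unsigned address_bits, unsigned page_bits,
                                      std::uint64_t bytes_per_entry)
{
    return page_table_entries(address_bits, page_bits) * bytes_per_entry;
}

// Table index for an address: drop the low page_bits bits.
// free() would use this index to find the owning slab.
inline std::uint64_t page_index(std::uint64_t address, unsigned page_bits)
{
    assert(page_bits < 64);
    return address >> page_bits;
}
```

With 31 usable bits, 4k pages and 4-byte entries, page_table_bytes(31, 12, 4) is the 2 MB above; with 64k pages, page_table_bytes(31, 16, 4) drops to 128 KB.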
There are some other options that are a bit more hard core. One is to reserve a chunk of virtual address for slabs. Then whenever you see a free() you check if the pointer is in that range, and if so you know you can just round the pointer down to get the slab head. The problem is the reserving of virtual address.
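A rough sketch of the reserved-range check, assuming the range starts slab-aligned and slabs are a power-of-two size. `SlabRegion` and the function names are hypothetical, and actually reserving the address range (the hard part) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of the one reserved virtual address range
// that all slabs are carved out of.
struct SlabRegion
{
    std::uintptr_t base;      // start of the reserved range (assumed slab-aligned)
    std::size_t    size;      // total bytes reserved
    std::size_t    slab_size; // bytes per slab (assumed power of two)
};

// True if the pointer falls inside the reserved range.
inline bool in_slab_region(const SlabRegion& region, const void* ptr)
{
    const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(ptr);
    return address >= region.base && (address - region.base) < region.size;
}

// Round a pointer down to the head of its slab.
// Returns 0 when the pointer is not in the slab range at all, in which
// case free() would fall back to some other path.
inline std::uintptr_t slab_head(const SlabRegion& region, const void* ptr)
{
    assert(region.slab_size != 0 && (region.slab_size & (region.slab_size - 1)) == 0);
    if (!in_slab_region(region, ptr))
        return 0;
    const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(ptr);
    return address & ~(static_cast<std::uintptr_t>(region.slab_size) - 1);
}
```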
Another option is to try to use pointer alignment. You could do something like make all large allocs be 4k aligned. Then all small allocs are *not* 4k aligned (this happens automatically because they come from inside a 4k page), and to get the base of the page you round down to the next lowest 4k.
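The alignment trick might look something like this. It's a sketch that assumes every small-alloc page starts with some kind of header, so no small object ever sits exactly on a 4k boundary; the names are made up.

```cpp
#include <cassert>
#include <cstdint>

const std::uintptr_t kPageSize = 4096; // 4k pages, as in the text

// Large allocs are always handed out 4k aligned, so alignment alone
// tells free() which kind of allocation it is looking at.
inline bool is_large_alloc(std::uintptr_t address)
{
    return (address & (kPageSize - 1)) == 0;
}

// Small allocs live inside a page, so rounding down to 4k gives the
// base of the page that owns them.
inline std::uintptr_t small_alloc_page_base(std::uintptr_t address)
{
    assert(!is_large_alloc(address));
    return address & ~(kPageSize - 1);
}
```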
The Vista lappy keeps having one little quirk after another. The latest one is that it would go to 100% CPU if left idle for a while. This was a little hard to track down, because as soon as you touch it to try to see what's using the CPU - it shuts off that task, so you can't catch it in the task manager.
So I knew it obviously had to be one of those "background optimization" services that Vista so kindly runs for you. So I looked in the Scheduled Tasks thing for anything that was set to run "when machine is idle".
It looks like the culprit was CrawlStartPages.
Eat "disable" motherfucker.
Test drove a 135 again. It's a fucking great car, the steering feel and suspension are marvelous, and it also goes wonderfully tame and quiet if you lay off the throttle. But when the mood hits you, power is available at any moment because of the amazingly flat torque curve.
But the seats are leather. In about 5 minutes my ass was a sweltering swamp. My back was a waterfall draining into said swamp. Your only other seat option is plastic (they call it "leatherette" but it's bleeding plastic).
What the hell is wrong with cloth !? It's light, it's cool and breathable, it's cheap, it's durable, it's the perfect car seat material. I've read that BMW puts cloth seats in its cars in Europe, it's only America where they think they have to play into this "leather is luxurious" retardation.
Bleck. I've also decided I don't really love the fat spongey M sports steering wheel. I'd rather have a thin hard one with molded grip. Like, say, the one in my beloved Prelude. It makes you feel more intimately connected with the road, and I like being able to get my whole hand around it.
Anyhoo, the only big decision left for me is manual or automatic (steptronic). Normally I would definitely go manual, but the steptronic is pretty damn good and the manual-mode control on it still makes you feel somewhat connected.
ZOMG there are way too many papers on image and video these days. I just found this huge dump of ICIP papers. I just downloaded them all. There are 5000 papers taking 2 GB. That's just ICIP 2002-2008.
See for example : ICIP 2008
Oh, and of course in typical fuckup style, all the file names are like "001037.pdf" ; Awesome. I may have to write a program to parse PDF and try to find the title and stick it in the file name.
The bad thing about Wikipedia is not that it's inaccurate (as retards often joke). It's that democracy and the rule of facts makes it very boring. Any little bits of humor or commentary in the articles are stripped out because they aren't directly factually backed. I'd far rather read something that's full of exaggeration and distortion written by someone amusing who's trying to make a point.
And really, presenting just the facts is its own kind of distortion. So much of history is subtlety and innuendo that some winking allusions can give you a better picture.
Wouter mentioned a while ago that used Porsches were getting really cheap with this economy. I glanced at Cayman S listings and thought "meh yeah he's right but not hugely". Then I looked at 911 listings. ZOMG. You can get a 2006 911 S for under $40k. That's 50% depreciation in two years. High end Mercs have fallen similarly. The more exclusive high end cars like Ferraris and such haven't fallen so much, but they are way more available than they would be in a tougher market (I mentioned before that the Nissan GT-R is available for below MSRP, which I'm sure is pissing off dealers that hoped to sell it for way over MSRP).
So anyway, if you've always wanted a 911, it's a great time to get one. I've always heard they're really unpleasant as a rush hour daily driver, so I have to wait until I have a house and a driveway so it can be my second car.
I'm done with old apartments. As much as I think our place is lovely and charming, I'm sick of the quirks, the floors that creak and are thin as cardboard, the dust everywhere, vents that don't work, windows that don't open or close or keep out the cold or wind, shitty kitchens and baths, etc. I hate mid-century American shit too, so the only option is to go brand new. I'm sick of getting ill from mold and dust and lead and whatnot in these dirty old places.
The rent for homes vs. apartments in Seattle is totally out of whack. You can rent a whole house on Cap Hill for $2000/mo, one of the big beautiful homes on a really nice street. Decent apartments are like $1500/mo. WTF, that's completely out of whack. I was thinking I'd like to get a top floor apartment (or rent a condo) in a new building. That would give me views and a deck and high levels of swank, but they're like $2500+ which is pretty nuts given our economy and the low price of whole fucking houses.
I've bought a bunch of things recently and I'm super unhappy with all of them.
Levi's 501's : the fuckers have completely redesigned the 501. Of course they still call it the "501" because 501 is a *brand* not a *model*. In fact ironically they now label them "The Original 501" when they are far from that. The actual original 501 (well, not the original 501, but the one that was pretty constant from the early 80's to late 90's) was marvelous and fit me perfectly. The only problem I have with the older 501's is that they take *forever* to break in. I've got a pair of the real shrink-to-fit 501's that I've had for over a year and they still haven't broken in. The new "original" 501 has the more modern thin denim that you don't have to break in, but they also totally changed the cut and it fits weird on me. Among other things they now have false waist sizing to mollify people who claim to be thinner than they really are (this is standard practice amongst most fuck-tard fashion brands, they label the waist a 34 but it's really more like a 36).
I bought a down comforter from Pacific Coast Feather Co . They are offering 20% off the allergen-free stuff right now so I thought WTF, this company is supposed to be the best. I have visions of a white bedroom, with morning light pouring through the windows, waking up all nestled in the fluffy down comforter. The comforter is shit. It's made of that fucking awful modern synthetic material that's all shiny and crinkly, it feels gross and makes weird noises when you rub on it. God I despise modern synthetic materials with a passion. I want cotton and tweed and silk and leather and wood and dirt and stone and flesh.
Which reminds me of my many recent failures trying to buy new jackets. In SF I went shopping one day because I need a decent jacket and don't have a single one; I went to a bunch of fancy stores, Nordstrom, Macy's, Kenneth Cole, some other little ones. Every single fucking jacket is synthetic crinkly shiny awful shit. I don't mean blazers, I mean jackets that zip up. Everything from fucking Kenneth Cole, Banana Republic, J Crew etc etc it's all fucking polyester nylon shiny crinkly shit.
I long ago made a rule for myself to never buy clothes online, because they always are horrible in some way you can't anticipate, either the material is shit or they don't fit or they have some weird logo you couldn't see in the previews. But I do sometimes try to buy something that's just another one of something I already have. Oh no! That doesn't work either because the fuckers keep changing things without changing the name.
I'm thinking about getting my cash back into stocks (long). I'll probably dollar-cost in slowly over the next 12 months or so. I'm kind of tempted to go big on Citi and GE. They're trading so low that there's a high probability of huge returns (100%+ easily). I really don't believe there's a huge risk either, since the government has shown no signs of bailout restraint yet, and they would get further prop-ups and tax favors and everything else the government can do to keep them from failing. It really seems like a government-backed gamble which is a wonderful place to be (and the same way that the mortgage industry made so much money all those years).
In contrast, something like LVS (Sands) is not such a desirable gamble IMO. It does have a nice huge possible upside (casinos + crazy chinese gamblers = megabucks), but it's not going to be bailed out by the government the same way something like Citi would, so it's not a free ride gamble - it really might go to zero and go bankrupt.
I have some slightly different views than many pundits on the economy. For one thing I believe the sickness extends far beyond just the finance and real estate sectors. I believe 99% of our entire country is broken. Our health care costs are too high, our workforce is too unskilled, we haven't been spending on R&D or infrastructure or education, our immigration policy is too draconian, too much of our economy is in services and consumer spending.
On the other hand I disagree completely with the people who claim that the US Financial Behemoth will never be the same, or that it's gone for good or whatever all the silly pundits are saying. In fact I would predict the exact opposite - the US Financial sector will be back to making huge risky gambles just as fast as they possibly can, and eventually our government will help them do so again. It might take a little while for people to forget, but there's one thing you can be sure of : once a problem is fixed, everyone forgets about it and starts removing the fixes and things that would prevent it from happening again. So the reckless financiers before the Great Depression led to lots of laws and regulations preventing it from happening again, with limits of bank sizes, minimums of liquid capital, regulators placed in the banks to watch their actions, limits of what different types of financial services could be mixed. But in the good times those rules were eroded little by little, until Clinton and Bush nearly completely eliminated them. And the exact same thing will happen again. Granted, it took almost 50 years for it to happen the first time, but we are far more flighty and apathetic now than people were in the last century. All we need is a new crisis to forget about this one.
American crisis response always progresses something like this :
1. Severe harmful overreaction that doesn't really fix much
2. Scapegoating of some minor players
3. Forming of committees to investigate and lots of announcements that lead to nothing
4. Things mostly get better and a new crisis comes along
5. Giving in to lobbyists that want rules relaxed so they can make more money
6. Letting it happen again

So shall it be with banking, terrorism, etc.
Still sick but I think it's almost over. You can tell I'm home sick and bored out of my mind because I'm writing a ton of nonsense.
I want to go shopping here, but you have to fucking pay to park in the downtown shopping area. I'm not fucking paying to park just so that I can have the privilege of giving people my money. It's like if you had to pay $5 just to browse Amazon. (privilege and cartilage really mess me up). I don't understand how people think they can put up so many barriers to me spending money and still get it.
It's not surprising that Freeman Dyson has gone mad. Almost every great scientist does go mad in their old age (in fact the big surprise is the very rare case that they *don't*). Obviously all old people lose their grip on reality a bit, but geniuses and scientists seem to lose it more spectacularly. (a few just off the top of my head : Prigogine, Penrose, Pauling, I'm sure there are better examples).
Also Dyson's argument about global warming is just foolish. It's a sort of common left-brain physicist reaction to want to have all the information and completely understand all the causes & effects before you take action, but that's not always possible. With something complex like climate or politics, you may *never* get all the information, because the system keeps changing faster than you can measure it. The correct thing to do is consider the EV of the various possibilities :
P = probability that CO2 is causing warming and we need to do something about it
(when I say "warming is not happening" I mean it's not significantly caused by human actions or is not a problem; obviously it is happening, the only debate can be what's causing it and whether it's bad)

If we act :
EV(act) = P * EV( warming is happening and we act ) + (1-P) * EV( warming is not happening and we act )

If we don't act :
EV(don't act) = P * EV( warming is happening and we don't act ) + (1-P) * EV( warming is not happening and we don't act )

Let's put some sample numbers in to make this clearer :

EV( warming is not happening and we don't act ) is a no-op, so it's 0
EV( warming is not happening and we act ) is just the cost of acting for no reason, let's say it's -1.0
EV( warming is happening and we act ) is we do our best to stop warming, let's say it's -2.0
EV( warming is happening and we don't act ) is the disaster scenario, let's say it's -100.0

If we act :
EV(act) = P * ( -2 ) + (1-P) * ( -1 )

If we don't act :
EV(don't act) = P * ( -100 )

Equality at :
2P + (1-P) = 100*P
P = 1/99

So even if there's only a little over a 1% chance that warming is real, we're better off acting. The fact that we aren't sure of the cause & effects is somewhat irrelevant.
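If you want to play with the numbers yourself, here's a small sketch of the same EV calculation. The `Payoffs` struct and function names are just for illustration, and the payoff values are the made-up sample numbers from above, not real estimates.

```cpp
#include <cassert>

// Payoffs for the four outcomes. Plug in whatever numbers you believe.
struct Payoffs
{
    double warming_act;       // warming is happening and we act
    double warming_no_act;    // warming is happening and we don't act
    double no_warming_act;    // warming is not happening and we act
    double no_warming_no_act; // warming is not happening and we don't act
};

// Expected value of acting, given probability p that warming is real.
inline double ev_act(const Payoffs& v, double p)
{
    return p * v.warming_act + (1.0 - p) * v.no_warming_act;
}

// Expected value of not acting.
inline double ev_dont_act(const Payoffs& v, double p)
{
    return p * v.warming_no_act + (1.0 - p) * v.no_warming_no_act;
}

// Probability at which the two EVs are equal, from solving
// p*a + (1-p)*b = p*c + (1-p)*d for p.
inline double break_even_probability(const Payoffs& v)
{
    const double numerator = v.no_warming_no_act - v.no_warming_act;
    const double denominator = (v.warming_act - v.no_warming_act)
                             - (v.warming_no_act - v.no_warming_no_act);
    assert(denominator != 0.0);
    return numerator / denominator;
}
```

With Payoffs{-2.0, -100.0, -1.0, 0.0} the break-even comes out to 1/99, and acting wins for any P above that.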
Surly Gourmand is a totally fucking awesome Seattle food blog. Holy crap I love this guy. I found it when searching for info about "Burning Beast" which I guess I missed but seems moderately rad. There's lots of funny little tidbits on Surly Gourmand, for example don't miss the comment at the end of this page. Ha! I like this post too; hell it's all good, did I mention I love this guy? Here's a small taste of his well-salted genius :
Writing restaurant reviews is hard fucking work. “Cry me a river, asshole,” you might be thinking, and in a way you might be right, but in another, more accurate, way you'd be a dumbass.
I did my taxes this year with TaxACT. Mainly because I put it off too long to get a decent accountant, but also because they were so simple this year it seemed worth it to just do myself.
I liked TaxACT, I recommend it. I used TurboTax years ago and it was a bloody nightmare. By the time I was half way through I wished I had just used pen and paper. TurboTax was a weird annoying interface that didn't match the real form exactly and I couldn't figure out how its questions matched up to the data I had to enter.
TaxACT starts out with the same kind of annoying retarded interface where it just asks you questions; it tries to be like a "Wizard" for tax info, like "what kind of taxes do you want to do today?". Fortunately you can just close that down and go directly to the forms. It has all the forms with number entry boxes, and it does all the bits of copying numbers from one box to another and adding and subtracting and whatnot.
Sometimes when I do taxes by hand I feel like the IRS has created one of those math puzzles where you do a bunch of steps and always come up with the same answer, like :
1. enter your pre-tax W2 income in box 41
2. take the last digit and subtract it from nine and enter that in box 42
3. divide box 32 by box 33 and put the remainder in box 43
4. evaluate the gamma function of box 44
5. take the last three digits of box 44 and add them together, enter in box 45
6. copy box 41 to box 46
7. that's your income

anyway, TaxACT does all that bit for you.
Some caveats about it :
1. It claims to be free but it's not. I didn't need the premium version ($10) but if your taxes are complex at all (eg. itemized deductions) then you do. I did have to pay for direct deposit ($8) (it says free efile but doesn't mention you have to pay if you want the DD) and the California add-on ($14). Still cheaper than TurboTax.
2. Every time you try to enter a derived value it says "this is a computed value, do you wish to do the worksheet" , say yes and go do the worksheet. If you don't, it messes up its spreadsheet computation tables.
3. It can be a little hard to go "back" to forms you've previously done and want to edit. The best way I found is just to browse the index of all forms, and select them manually. Another trick is if you want to go to a worksheet that filled in a certain derived quantity, you can just try to edit the result and it will pop up the dialog mentioned in #2 and then you can say "take me to the worksheet".
Anyway, still not a huge advantage over just using pen and paper, and it does have the disadvantage that you have to copy in all the data off your 1099's and W2's and whatnot manually if you want to efile.
I just started watching "Life on Mars" (BBC) with Alissa. It's decent so far. It is rather ham-handed with the whole is-it-real existential crisis. It also beats you over the head with the setting over and over just like the horrible "Mad Men" ; like oh look they're smoking how quaint, oh look there's an 8-track in the car, how cute. It would be so much better if they just told the story straightforwardly and the setting was just *there* without all the over-emphasis. Life on Mars wins over Mad Men just because it actually has stories. Not sure if recommend yet. I think I would rather just see a good cop show rather than this self-conscious highly affected play on a cop show.
(BTW of course if you have the option between an American remake and the BBC original - always watch the original. The American formula is to basically steal BBC shows, put beautiful retarded actors in them that don't match the character, rip out everything that's real or honest or edgy or controversial at all, make them just like every other bland TV show, use tacky overly slick fancy production and lighting and makeup and everything).
I've been watching a bit of "Shameless". It's basically a Northern England soap opera, and I feel a bit dirty watching it, but it's very well made. It's got a great pace and rhythm to it and the characters are very lively. Recommended for losers like me who wish they were watching soaps but would never admit it.
Watched all of the first season of "That Mitchell and Webb Look". I must say watching it all in one go was not friendly to it. The running gags are very repetitive. The first episode I really liked and it just went downhill. Still a few bits really made me lose it. I love the realtor bit, and the natives in the gardening supply was laugh-out-loud gold.
I watched an episode and a half of "15 Storeys High" then gave up on it. I just don't get it. Is it supposed to be a comedy? I don't think I've seen a funny bit yet. It's just sort of weird and grim. I guess there are some bits that are obviously supposed to be funny, like when the guy doing the relaxation tapes gets mad and yells, but that was just so obvious and set up and not at all in context that it completely misses. Do not recommend.
Recent Movies (just the good ones) :
"Man on Wire" was beautiful, inspiring. I've always wanted to live more like that - living every day artistically, beautifully, doing something crazy and pointless and dramatic with a band of friends that are united by the sheer drive to be doing *something* with these sad boring lives of ours.
"Milk" was quite good. A bit heavy handed & cheesy at times (the flash back at the very end to the foreshadowing at the beginning was just inexcusable), but overall very moving, well acted, fun. It made me really miss San Francisco (we saw it before GDC). All the crazy fun people, the liberal politics, all the protests and parties in the streets.
I've become a big fan of braising greens as I get older. The Trader Joe's braising green mix is reasonably priced and convenient, and unlike most bag salad products, it's not all rotten because braising greens hold up well. At Restaurant Zoe we had some collard greens that were almost delicious but in fact disgusting. They were smothered in so much butter, and they were near room temperature so the butter was semi-solidified, so that it was almost like a quiche with just butter instead of milk and egg. If you're going to add bacon to your braising greens, don't be tempted to do it first. Yes, cooking things together is nice, but it's better to cook the bacon last and just sprinkle it on at plating so it stays crispy.
The best thing I ate in San Francisco during GDC was the bone marrow at Two. Bone marrow is a lot like butter-collards, it can be disgusting if it's too plain, because it is basically pure fat, and it does have a slightly weird vitamin-like bone taste. It's been trendy to cut it with a bit of persillade or gremolata or something like that, but at Two they serve it in a kind of reduced-broth onion gravy, which was perfect. It enhances the meatiness and tames the vomit factor, and provides some extra nice tasty bread dip.
(the Raviolo at Two was also excellent - a soft cooked egg yolk in a pasta wrapper with brown butter sauce, very simple but delicious. It would've been better with some crunchy pancetta bits sprinkled on top. That also continues a trend I've seen a lot recently of really interesting and well made delicious appetizers, leading up to rather boring and lazily made main courses.)
There's a game engine for every day of the week now. Personally I don't think big game engines are the way to go. It just puts you as a developer too much at the mercy of the engine developer, and if it has major flaws that you don't like you're screwed. I'd much rather license lots of little pieces I can tack together as I see fit.
SpeedTree's tools get better and better but their runtime is still not awesome. That seems to be a pretty common thread. It's kind of weird because for me the tool is the hard part that I never want to do. Hell I could write the runtime for *all* these middlewares if they would write the tools and do the sales and support. Some other kind of similar ones :
Fork Particle Middleware has a pretty nice tool. It's funny I was talking to Sean at the show about what other middlewares RAD might do someday and I mentioned particles. We figured you'd want to have a nice tool for artists where they could compose arbitrary particle systems, then the ideal thing would be to output a sort of "particle HLSL" for each system, then you could compile that to CPU code or GPU code or SPU code. The idea there is that all the toggles hierarchy and such that the artists set up, you don't have in the code. That is, you want to avoid winding up with code that's like :
    if ( m_particleSystemDef->m_doSpin )
    {
        SpinParticle();
    }
    ... more ifs for each possible effect ...
instead you just compile down optimized code for each system type. You also would want to automatically provide particle LOD and system scalability so it can target different machine capabilities.
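As a rough illustration of the difference, here's a sketch that uses C++ templates as a stand-in for generated per-system code. The `Particle` struct, `update_particles`, and the `DoSpin` / `DoGravity` toggles are all made up for the example; a real pipeline would presumably emit something like this from the artist tool (or emit shader code) rather than have anyone hand-write templates.

```cpp
#include <cassert>
#include <vector>

struct Particle
{
    float x, y;   // position
    float vx, vy; // velocity
    float angle;  // current rotation
    float spin;   // rotation speed
};

// Each effect toggle is a compile-time constant instead of a runtime flag
// read from the system definition. The conditions below are constants for
// any given instantiation, so the compiler can drop the dead branches and
// each system type gets its own tight loop.
template <bool DoSpin, bool DoGravity>
void update_particles(std::vector<Particle>& particles, float dt)
{
    assert(dt >= 0.0f);
    for (Particle& p : particles)
    {
        if (DoGravity)
            p.vy -= 9.8f * dt;
        if (DoSpin)
            p.angle += p.spin * dt;
        p.x += p.vx * dt;
        p.y += p.vy * dt;
    }
}
```

A system that the artist set up with spin but no gravity would then just call update_particles<true, false>(particles, dt), with no per-particle checks of the definition.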
Anyway, I doubt it's a good RAD product because it's too tool and artist heavy, it's not a big enough piece of pure technology. The Fork demos look pretty super-duper asstastic, like shockingly bad, but I don't know if that's just because they have bad artists setting it up or because their tech actually sucks. I do think there's a lot of value in actually making an impressive good looking demo - it proves that it's at least possible to do so with your system.
Allegorithmic Substance was at the Intel booth showing their procedural texture middleware. This is another one that I had the idea of doing some day maybe, so I wanted to see their action. Procedural texturing is obviously compelling in the future era of 8+ CPU cores and SVT infinite textures and so on. And you clearly need a nice artist-driven tool to define them.
Allegorithmic has a really super nice polished tool for writing shaders. I have no idea how fast their runtime is; I tried to ask some questions about how they run the shaders and the guy demoing didn't seem to be an actual programmer who knew WTF was going on. Again just like particles you would want to be running code generation so that you don't have like 100 per-texel branches for all the options. In fact, this is really just like rasterization in a software rasterizer like Pixo. The procedural texture code is just like a pixel shader and you want to output some kind of HLSL and have it compiled to a CPU/GPU/SPU shader.
I also don't think their way of defining shaders is right. You want something that is intuitive to artists and easy for them to tweak visually. You don't want to have to hire someone who is a procedural texture specialist. The Allegorithmic shaders are just like Renderman shaders (or as Sean pointed out - like the stuff that demo coders do for 64k demos). You get a bunch of functions that you can compose and chain together and you tweak them to make it look like something. For example you can take Perlin Noise and apply curves and powers to it and threshold it and all that kind of stuff. Or you take a "mother wavelet" shape kernel and tile it, randomly rotate and offset it, etc.
That stuff is powerful and all, but it's just not intuitive, it's hard to tweak, and it's not good for artists. If I was doing procedural texturing it would be example-driven with little bits of functional stuff. Some stuff you could do is all the shape-by-example synthesis stuff. You can do "detail textures" the correct way by using different tiling textures at different frequencies for the different wavelet levels of the output. You can use multiple tiling source textures and randomly compose them using Perlin Noise or an artist supplied blend texture. You can do things like the old "splatting" technique from Surreal with blend-alpha channels in tiling textures. You could even do things like Penrose Tiles where you have the artists paint the tiles and then you can create infinite non-repeating tilings. Or you can use sample textures to create tiles automatically like the Hoppe lapped texturing stuff. etc.
It just seems like the Allegorithmic tool is aimed at an audience that doesn't exist. It's too technical for real game artists to use. But if you're going to have a technical artist / programmer write the procedural textures, they would rather just have a text HLSL type language like Renderman. And having text shaders like that is much better because it's easier to share them on the web. In fact, the only way a tool like this could ever really take off is if a community develops and they post shaders on the web and share them, because writing all the shaders yourself is too much work. (in raytracing there are tons of prewritten shaders that are free or sold in commercial procedural texture kits).
Actually what you want from the tool is just a general attribute plug editor. You want to write text shaders like Renderman but have them take various parameters as scalars, colors, or images. Then you want to expose those to the artists with nice GUI tools and show them how they affect the shader with realtime preview. Something like :
    input parameter brick_color : type color;
    input parameter mortar_color : type color;

    color diffuse_shader(vec2 uv, vec3 worldpos)
    {
        ... do shader maths using uv, brick_color, mortar_color ...
    }
In fact having a generic parameter editor set is a handy thing that all studios should have and everyone reinvents, but again it's too small of a piece to sell.
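Here's a minimal sketch of what the parameter side of that might look like in C++, handling only color parameters. `ShaderParams`, `Color`, and the little brick pattern in `diffuse_shader` are all invented for the example; the point is just that the shader declares named inputs and an editor can poke at them by name without knowing anything about the shader maths.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>

struct Color { float r, g, b; };

// Hypothetical parameter block. The shader declares its inputs up front;
// a GUI editor would set them by name and re-run the preview.
class ShaderParams
{
public:
    void declare_color(const std::string& name, Color default_value)
    {
        assert(!name.empty());
        colors_[name] = default_value;
    }

    // Returns false if the shader never declared this parameter.
    bool set_color(const std::string& name, Color value)
    {
        std::map<std::string, Color>::iterator it = colors_.find(name);
        if (it == colors_.end())
            return false;
        it->second = value;
        return true;
    }

    // Returns black for undeclared parameters.
    Color get_color(const std::string& name) const
    {
        std::map<std::string, Color>::const_iterator it = colors_.find(name);
        if (it == colors_.end())
            return Color{0.0f, 0.0f, 0.0f};
        return it->second;
    }

private:
    std::map<std::string, Color> colors_;
};

// Stand-in for the shader maths: one brick per unit square in uv, with a
// mortar strip along the low-u and low-v edges of each brick.
inline Color diffuse_shader(const ShaderParams& params, float u, float v)
{
    const float mortar_width = 0.1f; // arbitrary value for the sketch
    const float fu = u - std::floor(u);
    const float fv = v - std::floor(v);
    const bool in_mortar = (fu < mortar_width) || (fv < mortar_width);
    return params.get_color(in_mortar ? "mortar_color" : "brick_color");
}
```

The artist-facing tool then only ever calls set_color("brick_color", ...) and looks at the preview.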
Living up here reconnects me with the idea of "Spring Cleaning". Living in the warm south it never made any sense - why pick Spring to do a big cleaning once a year? Just do little cleaning all the time. (actually I'd like to have a schedule of one major cleaning item to do each month, like Jan - launder drapes, Feb - remove grates and dust inside air vents, Mar - move appliances and clean behind them, etc).
Anyway, up here it's cold enough that you close everything up for the winter, and I guess in the really cold places that's even more true. Heck in the old days when Spring Cleaning became a custom, they were still putting boards over windows and bringing livestock inside for the winter to stay warm. You would close up for the winter and swaddle in blankets and live in the stale musty air and dead skin and dust.
Here it's getting stale and dusty inside. I prefer to live with my windows open all the time. Sadly I haven't been able to do that for quite a while, since here it's cold and wet, and even in San Francisco if I opened my windows a pound of filth would come in the window every day. Anyway, I'm looking forward to when this cold and gray and wet finally ends so I can open up all the windows and shake everything out.
As I've gotten older something has changed in my nose geometry. As a child I always had a leaky nose, causing me to frequently wipe on my sleeves and t-shirt when I couldn't find a kleenex fast enough. Finally I gave in and started carrying a handkerchief like my mom (which I always found disgusting). Gradually something in the cartilage has changed, and now the snot drips down my throat instead of out the front (maybe this is just because it's constantly plugged with boogers). Anyway, it now makes me spit up loogies instead of dripping snot, which is also gross, but is better I guess because I can at least control it a bit and decide when to let it out.
Anyway, while my constant spitting of loogies is gross, and I sympathize with spitters, there are some spitting habits that people have which I find completely repellent and mind boggling :
1. The constant hocking / sniffling. This guy has got a mucus problem but for some reason is not going ahead and expelling the offending matter. He just keeps hocking or snorting over and over without ever spitting. This is sometimes done by people who were told by their mother not to spit, so they're trying to have good manners - well bud, hocking over and over is much worse than just doing it once and getting the spit out.
2. The gratuitously thorough hock. This guy reaches deep and grunts and rumbles to stir up mucus from deep inside then does a massive hock and spits out a liter of saliva. If you really needed to hock that hard to get the stuff up then you really didn't need to spit that bad in the first place. It's like they do it as a show intentionally to hock as loud as possible, as if it's manly or tough or something. This is usually done by working class males; it's sometimes used as punctuation in conversation.
3. The dripper. This guy hocks and then doesn't spit a tight wad with force the way you're supposed to, he just sort of leans over and opens his mouth and lets it drip out. Most of it comes out quickly, but there's usually a long trail of sticky slime that hangs off his mouth for many seconds, and he just sits there and lets it slowly stretch and finally detach. Come on man, first of all spit it harder so that doesn't happen, but if you get a hanging trail, use your hand to detach it or something. This is often done by smokers or tobacco chewers.
Online price comparison is very harmful.
Meh I'm bored of this rant but you can fill in the blanks cuz it's obvious. Here are some key words for you :
Expedia - airlines competing only on price, Nextag, eHealthInsurance / Progressive / Geico - insurance companies competing only on price.
Obviously quality goes to shit.
When I was young I was really nervous around people or in new situations; I wanted to show that I knew how to handle myself, I never wanted to make a mistake, I wanted to prove that I deserved to be with the A-list. It made me a total loser, really self conscious, condescending, stiff, and just awkward and antsy. I was also a huge flaker and would frequently bail on social situations or just go very briefly.
I was always jealous of the older men who just seemed comfortable in any situation. They were usually somewhat wealthy, but not rich, relaxed, maybe a bit fat, graying, balding, but just confident and at ease. While I'm still certainly more like the youthful me, I'm starting to get a bit of the Old Man Comfort.
When I was a kid I thought that I would become comfortable over time once I learned how to act in all those situations, how to interact with the bell hops, how to act when you order wine, etc. etc. Now I know that actually learning how to act has nothing to do with it. It's more just about not caring. As you get older you care less and less what people think and you know it's just not a big deal if you fuck up. For me there have been two big factors :
1. I've given up on the hope of the Ivory Tower. I used to imagine that there were amazing people somewhere, smart, beautiful, fun, having great conversation, great parties, starting businesses together, making art, cooking, traveling. I wanted to be one of them, and I wanted to prove that I deserved it. I always thought most people were shit, but I idolized certain heroes in my fields that I thought were just total rock stars and I wanted to act cool around them and get their approval.
Now I no longer believe in that. Yes, there absolutely are different classes of people, and yes there is a AAA group that is way better than the rest in most ways. But they're shit too in their own way and not worth impressing. Basically everyone is shit and there's no magic circle you can get admittance to so you don't need to try.
2. I'm no longer as interesting to myself alone as I used to be. When I was young, hanging out with other people was borderline excruciating because you'd just be having some horrible boring conversation and never getting into the real issues; even if you did find a smart person to talk to, you'd just spend the whole time in misunderstandings and talking around the same point. On the other hand, if I bailed out on that social situation, I could wander the streets and look at the beautiful plants and compose poems in my head while isolating various of my senses one by one and focusing on them to turn up their intensity. I might go home and work on a programming project for a while, or crack a notebook and work on some physics theorems, or read some textbooks by brilliant people. Every day I spent alone was full of activity and thought and projects and learning and excitement. Pretty much any minute of that spent with someone else was a loss, and it made me feel really antsy and impatient with people and want to rush through conversations and cut them off as quickly as possible.
While that's all still true to some extent, the big thing that's changed is not that I've discovered the value of other humans' conversation (no, it's still as useless as ever). The difference is the value of my own alone time has plummeted, bringing them closer to parity. I'm now lazy and tired and bored of myself, I have lots of project ideas but I no longer have the enthusiasm to do them justice.
... was really rough. I didn't get to any talks so I don't have anything good to tell you. I got to walk around the floor a bit, but didn't see anything interesting really. There is a ton of middleware out there - a procedural texture product (Allegorithm or something), FMOD, SpeedTree, a particle system middleware - and all of them have really really polished tools, super nice UIs. There are also just a ton of game engines now, some of them have very flashy demos.
I miss San Francisco. Even just flying in, I looked out the window and could make out the weird shape of Point Reyes and thought of all the great biking out there, the fun people who go swim Bass Lake, the sunshine, the aromatic smell of all the sage and bay laurels. The city has such a happy vibe to me, so many different types of people all jammed together and getting along, hobos, hippies and hipsters. People biking around, sitting in the park, everyone seems to be energetic and full of life.
It was interesting seeing the break down and set up. It's amazing how fast they do it. Literally the second the 3:00 closing bell is chimed on the last day, the union labor guys start pouring in and picking up carpet. The RAD Veterans immediately started tearing apart the booth and boxing it up so it would be ready and they could get out of there. In a few hours the whole place was torn down.
Also, a lot of people have mentioned "OnLive" in their GDC blogs. It's 100% retarded and I don't want to hear about it any more. I would bet any amount of money against them ever making any money.
... and I got the GDC Disease from shaking hands with weirdos from around the World. As soon as I got home I was hit with a brutal cold, headaches and mucus.
There's an article in last Sunday's New York Times called "Try, Try Again, Or Maybe Not" that I think is a great example. It's sort of cute, it comes across as smug, it interviews experts. And it's just completely wrong.
The core point goes like this : 22% of venture-backed entrepreneurs succeed on their first try. Those who have previously succeeded have a 34% success rate. Those who have previously failed have a 23% success rate. The conclusion is "you don't learn anything from failure, the success rate is basically the same".
Totally retarded. They completely miss the basic point that there are varying populations and the previous outcomes weight for the different populations. Furthermore the overall success rate of the group with experience is (0.22*34% + 0.78*23%) = 25.4%, which is clearly better than the 22% of the first-timers.
Let's run some example numbers. Let's say the population is made up of good entrepreneurs and bad ones.
Initial population :
X good
(1-X) bad
success if good = P(G)
success if bad = P(B)
Initial success = X * P(G) + (1-X) * P(B) = 0.22
Writing G for P(G) and B for P(B) from here on :
X * G + (1-X) * B = 0.22
In the successful group,
fraction good Y = ( X * G ) / 0.22
success rate = E + Y * G + (1-Y) * B = 0.34
E = additive benefit from experience
multiply through by 0.22 :
0.22 * E + ( X * G ) * G + (0.22 - ( X * G )) * B = 0.34 * 0.22
In the unsuccessful group :
fraction good Z = ( X * (1 - G) ) / 0.78
success rate = E + Z * G + (1-Z) * B = 0.23
multiply through by 0.78 :
0.78 * E + ( X * (1 - G) ) * G + (0.78 - ( X * (1 - G) )) * B = 0.23 * 0.78
This is three equations in 4 unknowns, so there is a variety of solutions.
X can be in [0,1]
For each X there is only one valid {G, B, E}
... meh I started to solve this but I got bored. The solution is something like G = 0.4, B = 0.1, E = 0.05.
Anyway it's absolutely clear that E is significantly greater than zero, and in fact the whole thesis of the article is wrong. Actually something the numbers are telling me that I think is perhaps more interesting is that the "Bad" population is *very* bad. The reason why the group that failed once is not too bad the second time is because there are still quite a few "Good" people in it that just got unlucky on their first roll, and everyone learned a bit from experience.
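For what it's worth, the three equations above do have a closed form once you pick X. Averaging the two second-try groups with weights 0.22 and 0.78 gives E + 0.22 on the left, so E = 0.22*0.34 + 0.78*0.23 - 0.22 = 0.0342 no matter what X is. Subtracting the two second-try equations and substituting B from the first one gives X * (G - 0.22)^2 = 0.11 * 0.22 * 0.78 * (1-X). Here is a minimal sketch of that; `ModelFit` and `solveForX` are names made up for this example, and it takes the root with G > B since "good" is supposed to be better than "bad" :

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper for this example only: solves the three-equation
// model for a chosen X (the fraction of "good" entrepreneurs).
struct ModelFit {
    double G;  // success probability of a good entrepreneur, before experience
    double B;  // success probability of a bad entrepreneur, before experience
    double E;  // additive benefit from experience
};

ModelFit solveForX(double X) {
    assert(X > 0.0 && X < 1.0);
    const double first = 0.22;      // first-try success rate
    const double afterWin = 0.34;   // second-try rate after a success
    const double afterLoss = 0.23;  // second-try rate after a failure

    // Weighted average of the two second-try groups is E + first.
    const double E = first * afterWin + (1.0 - first) * afterLoss - first;

    // X * (G - first)^2 = (afterWin - afterLoss) * first * (1 - first) * (1 - X)
    const double k = (afterWin - afterLoss) * first * (1.0 - first);
    const double G = first + std::sqrt(k * (1.0 - X) / X);

    // From the first-try equation: X * G + (1 - X) * B = first.
    const double B = (first - X * G) / (1.0 - X);
    return ModelFit{G, B, E};
}
```

With X = 0.4 this gives roughly G = 0.39, B = 0.11, E = 0.034, in line with the ballpark numbers above. The result only makes sense as probabilities when G <= 1 and B >= 0, which restricts the usable range of X.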
I just put a LogWindow in cblib and I thought I'd make some notes about it. The reason it exists is that stdio on Windows is so ungodly slow. The console is actually a separate process, and when you printf to stdout, it creates an interprocess communication packet and fires it over. I'm sure every programmer knows that doing lots of stdio printing can be the bottleneck in your app if you're not careful. We all take this for granted and take a lot of steps to avoid printing too often. For example my file copy percent display routines look like :
    int lastPercentShowed = -1;
    // .. copy N more bytes, then :
    // rounded to nearest percent (use 64-bit math if 100 * bytesDone can overflow an int)
    int percentDone = (100 * bytesDone + (bytesTotal/2) ) / bytesTotal;
    if ( percentDone != lastPercentShowed )
    {
        // only print when the visible number actually changes
        fprintf(stderr,"%d%%\r",percentDone);
        fflush(stderr);
        lastPercentShowed = percentDone;
    }
where the whole purpose of lastPercentShowed is to avoid doing too many prints. (You could also use timers and check that enough time has passed, etc).
But sometimes it's a pain to limit this kind of stuff at the high level. For example Oodle logs all the files it opens. Normally that's fine because you don't open files too often. But sometimes you do something that opens a ton of tiny files, and suddenly the printf to stdio becomes the bottleneck. Well you know what, that's retarded, it's just putting text on the screen, it shouldn't be such a huge perf hit.
So I just did my own Log Window.
It's just a simple Window on a thread. It runs its message pump for that Window on that thread (BTW to do this you must also Create the window on that thread - a window's messages always go to the queue of the thread that created it; AttachThreadInput only shares input state between threads, and you don't want to do that here).
The LogWindow::Puts from the main thread just uses my lock-free SPSC FIFO to send the string as a packet to the LogWindow thread. That makes it nice and nonblocking and fast for the main thread. (I just duplicate the string to make it safe to send to the thread).
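To make the idea concrete, here is a minimal sketch of a single-producer / single-consumer ring buffer of strings using std::atomic. This is not the cblib FIFO; `LogFifo`, `push` and `pop` are hypothetical names, the capacity is fixed, and a full queue just rejects the string rather than blocking :

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Illustrative SPSC queue: exactly one thread may call push() and exactly
// one (other) thread may call pop().
class LogFifo {
public:
    explicit LogFifo(std::size_t capacity) : slots_(capacity + 1) {
        assert(capacity > 0);  // one slot is always left empty to tell full from empty
    }

    // Producer (main thread). Copies the text so the caller's buffer can be
    // reused right away. Returns false instead of blocking when full.
    bool push(const char* text) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % slots_.size();
        if (next == tail_.load(std::memory_order_acquire)) return false;
        slots_[head] = text;
        head_.store(next, std::memory_order_release);  // publish the slot
        return true;
    }

    // Consumer (log window thread). Returns false when the queue is empty.
    bool pop(std::string& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return false;
        out = std::move(slots_[tail]);
        tail_.store((tail + 1) % slots_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<std::string> slots_;
    std::atomic<std::size_t> head_{0};  // next slot to write; only the producer stores it
    std::atomic<std::size_t> tail_{0};  // next slot to read; only the consumer stores it
};
```

Note that only the queue indices are lock-free here; copying into a std::string can still hit the heap allocator, which may take a lock of its own.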
So that's all nice and simple and fast and the main thread can just fire tons of logs and not worry about it. The other big speed improvement is in the LogWindow update.
The LogWindow thread sleeps when it has no windows messages. It needs to wake up occasionally if it got a mouse click or a paint that it needs to respond to. It also needs to pull new strings off the FIFO and add them to its list. When it does that, it can just keep pulling off the FIFO while it's not empty - it doesn't cause a redraw for every new string it gets, it only redraws once for a whole bunch of strings. This is much neater, if the main thread is firing tons of strings at it really quick, it will wind up just grabbing them all and doing one big step instead of tons of little ones.
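The "drain everything, redraw once" part can be sketched like this. `PendingQueue` is just a stand-in for the real FIFO so the example is self-contained, and `LogView` and `redrawRequests` are made-up names; in the real window the counter bump would be something like a single InvalidateRect call :

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the cross-thread FIFO (hypothetical, single-threaded here).
struct PendingQueue {
    std::deque<std::string> items;
    bool pop(std::string& out) {
        if (items.empty()) return false;
        out = std::move(items.front());
        items.pop_front();
        return true;
    }
};

struct LogView {
    std::vector<std::string> lines;
    int redrawRequests = 0;

    // Pull every string that is queued right now, then request at most one
    // redraw for the whole batch. Returns how many strings were consumed.
    std::size_t drain(PendingQueue& queue) {
        std::size_t count = 0;
        std::string text;
        while (queue.pop(text)) {
            lines.push_back(std::move(text));
            ++count;
        }
        if (count > 0) ++redrawRequests;  // one invalidate per batch, not per string
        return count;
    }
};
```

So a burst of a thousand log lines costs a thousand cheap appends and a single repaint.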
The LogWindow thread also needs to wake up when there are new messages in the FIFO even if it hasn't gotten a paint message. I guess I should do that with an Event signal, but right now I'm just using a timer. The timer just wakes it up and makes it pull the fifo. Yeah in fact I'll switch that to an Event right now and use "MsgWaitForMultipleObjectsEx" that lets you wait on either an Event or a Windows HWND Message.
Though actually they optimize slightly different cases. The Event is better if you log very rarely or with big gaps, because it lets the LogWindow completely sleep when it's not being updated (with the Timer it still wakes up on the timer interval all the time and costs you some thread switch penalty even though it does nothing). The Timer is better if you are logging very heavily, because just setting the Event for each printf is not at all free. Maybe there's a way to get the best of both worlds, but really either one is miles ahead of using a standard console.
Some random Oodle thoughts/questions :
My ThreadViewer is currently using GDI. It's pretty good, I like it. I did that originally because I consider it a favor to the user. There are some nice things about using GDI instead of a 3D API - it plays nicer like a normal windows app, you know it gets Paint messages with rects and only repaints what it needs to, it can be dragged around and respects the user's setting for whether to paint while dragging, and of course it doesn't gobble CPU when it's just sitting there. And of course it doesn't interfere with other 3D apps using the same device driver and memory and whatnot.
There are some disadvantages though. GDI is butt slow. If I draw much at all it really dogs. It's possible I could speed that up. For example I'm drawing individual lines and it might be faster to draw line-lists. I also wonder if it would be a win to batch up states. Right now I do SetPen/SetBrush and then draw, over and over, I could batch up draws with the same settings to avoid state changes. Another disadvantage is fancier blend modes and such are tricky or slow with GDI.
One option is to do my own 2d drawing by locking a DIB and using my own rasterizer. Probably crazy talk. The other option is just to be rude and use 3d.
The issue of strings in Oodle is quite interesting. The way you refer to resources is by name (string). Obviously the file names that I load have to be strings because that's what the OS uses. The main data structures that map resources to files are thus big string->string maps. (at some point I'm going to rewrite this to make it a bidirectional map, a database that can go resource->file or file->resource, which is just string<->string with hashes both ways).
Currently what I have is a ref-counted string that can optionally also just be a plain C-string. It's a 4-byte pointer and I stuff a flag in the bottom bit to indicate if it's got a ref-count or not. If not, it's assumed to point at const memory. If it does have a ref count, the count is right before the string data. Thus getting to the characters is always fast. I also almost always pass around a 4-byte hash with the string, which lets me avoid doing string compares in almost all cases. I'm usually passing around 8-byte (64 bit) "HashedString" objects. I only actually do a string compare if the hashes match but the pointers don't, which in my world is basically never. This is very good for the cache because it means you never actually follow the pointer to the string data.
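Roughly, the compare path and the bottom-bit flag look like the sketch below. This is an illustration and not the Oodle code : the FNV-1a hash is an assumption (the hash function isn't specified here), and `HashedString`, `makeHashed`, `tagRefCounted`, `isRefCounted` and `untag` are hypothetical names. On a 64-bit build the struct is of course bigger than 8 bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumed hash for the example: 32-bit FNV-1a.
inline std::uint32_t hashString(const char* s) {
    assert(s != nullptr);
    std::uint32_t h = 2166136261u;
    for (; *s; ++s) {
        h ^= static_cast<unsigned char>(*s);
        h *= 16777619u;
    }
    return h;
}

struct HashedString {
    std::uint32_t hash;
    const char* chars;
};

inline HashedString makeHashed(const char* s) {
    return HashedString{hashString(s), s};
}

inline bool operator==(const HashedString& a, const HashedString& b) {
    if (a.hash != b.hash) return false;         // usual "different" case: chars never touched
    if (a.chars == b.chars) return true;        // usual "same" case when strings are shared
    return std::strcmp(a.chars, b.chars) == 0;  // rare: hashes match but storage differs
}

// Bottom-bit flag on an aligned pointer: 1 = ref-counted, 0 = plain const C-string.
inline std::uintptr_t tagRefCounted(const char* p) {
    const std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    assert((bits & 1u) == 0);  // needs at least 2-byte alignment
    return bits | 1u;
}

inline bool isRefCounted(std::uintptr_t tagged) {
    return (tagged & 1u) != 0;
}

inline const char* untag(std::uintptr_t tagged) {
    return reinterpret_cast<const char*>(tagged & ~static_cast<std::uintptr_t>(1));
}
```

The order of the checks in operator== is the whole point : the strcmp, which is the only step that follows the pointer, sits behind two cheap integer compares.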
One problem with this is that the strings can still wind up taking a lot of memory. The most obvious solution to that we've discussed before here. It's to explicitly use {Path/Name} instead of {Full Path}. That is, break off the directory from the file name and share the directory string for all files in the same dir. That would definitely be a big win and is one option.
Another option is to have a "Final mode" where the strings go away completely. The 4-byte hash would be kept, and instead of a 4-byte pointer to string data there would be a uniquifier to break hash ties. In order to make the right uniquifier you would have to have a full list of all the strings used in the game, which you of course do have with Oodle if your game is done. Then the uniquifier has to get baked into all the files, so there would have to be a "destring" pass over all the data that may or may not be reversible. This could be made relatively transparent to the client. One annoyance we had with this kind of thing at Oddworld is that once you bake out the strings, if you have any errors you get messages like "Resource 0x57EA10BF failed to load!". Still this would make the super-low-memory console wonks happy.
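A sketch of what building the uniquifiers might look like, given the full list of names. `StringlessKey` and `buildKeys` are invented for this example, and the hash is passed in as a parameter so nothing is assumed about the real one :

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// What would be stored instead of {hash, pointer to chars}.
struct StringlessKey {
    std::uint32_t hash;
    std::uint32_t uniquifier;  // index among the names that share this hash
};

// Walks the complete name list once and hands out tie-breakers per hash value.
template <typename HashFn>
std::map<std::string, StringlessKey> buildKeys(const std::vector<std::string>& allNames,
                                               HashFn hashFn) {
    std::map<std::string, StringlessKey> keys;
    std::map<std::uint32_t, std::uint32_t> usedPerHash;
    for (const std::string& name : allNames) {
        if (keys.count(name) != 0) continue;  // repeated names share one key
        const std::uint32_t h = hashFn(name);
        const std::uint32_t u = usedPerHash[h]++;
        keys[name] = StringlessKey{h, u};
    }
    return keys;
}
```

The uniquifier depends on the order of the input list, so the list has to be in a stable order (sorted, say) for every baked file to agree. Keeping the name -> key map around on the dev side is also what would let you turn "Resource 0x57EA10BF" back into a readable name.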
Another option is to combine the strings very aggressively. I already am running all the strings through a single global string table (the Game Dictionary that has the resource<->file map also makes all strings unique - this is a big win with the refcounted string - whenever you read a string from a file, rather than allocating a new buffer for that string you see if you have it already and just use the one you already have and bump the refcount). I'm not actually enforcing that to be required, but a lot of possibilities open up if you do require that.
For one thing, a "string" can just become an index or handle to the global string table. I almost never care about the contents of the string, I just use them as a key to find things, so if the char data is converted to a unique index, I can just compare index equality. Then, that index in the string table need not point at plain old char data. For example it could be an index to a leaf of a Trie. Each string gets added to the Trie and only unshared parts of the string make new chars. Each unique string is a leaf, those get labels and that's the handle that is given back to index the global string table.
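Here is a toy version of the Trie idea, where the handle is just the index of the node where the string ends. `StringTrie`, `add`, `get` and `nodeCount` are hypothetical names, and a std::map per node is used only to keep the sketch short; it wastes far more memory than it saves, so a real one would need a compact node layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

class StringTrie {
public:
    using Handle = std::uint32_t;

    StringTrie() { nodes_.push_back(Node{0, '\0', {}}); }  // node 0 is the root

    // Adds the string if needed; equal strings always return the same handle.
    Handle add(const std::string& s) {
        Handle cur = 0;
        for (char c : s) {
            auto it = nodes_[cur].children.find(c);
            if (it != nodes_[cur].children.end()) {
                cur = it->second;  // shared prefix: no new chars stored
                continue;
            }
            const Handle next = static_cast<Handle>(nodes_.size());
            nodes_.push_back(Node{cur, c, {}});
            nodes_[cur].children.emplace(c, next);
            cur = next;
        }
        return cur;
    }

    // Rebuilds the chars by walking parent links (only needed for logging etc).
    std::string get(Handle h) const {
        assert(h < nodes_.size());
        std::string out;
        while (h != 0) {
            out.push_back(nodes_[h].ch);
            h = nodes_[h].parent;
        }
        std::reverse(out.begin(), out.end());
        return out;
    }

    std::size_t nodeCount() const { return nodes_.size(); }

private:
    struct Node {
        Handle parent;
        char ch;
        std::map<char, Handle> children;
    };
    std::vector<Node> nodes_;
};
```

Equality of two strings is then just equality of two handles, and "data/rock.dds" and "data/rock.tga" share all the nodes for "data/rock.".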
Another option would be just some sort of LZ type compression. Some strings get full data, but other strings are just prefix matches, like "use the first N bytes from string M and then add these chars".
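The simplest form of that is front coding against the previous string in the list. A small sketch, with `PrefixCoded`, `encode` and `decode` as made-up names; here M is always "the entry right before this one", which is a simplification of the general "string M" scheme :

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// "Use the first prefixLen chars of entry `base`, then add `suffix`."
struct PrefixCoded {
    std::size_t base;       // index of an earlier entry (unused when prefixLen == 0)
    std::size_t prefixLen;
    std::string suffix;
};

// Encodes each name against the one before it.
std::vector<PrefixCoded> encode(const std::vector<std::string>& names) {
    std::vector<PrefixCoded> out;
    for (std::size_t i = 0; i < names.size(); ++i) {
        std::size_t n = 0;
        if (i > 0) {
            const std::string& prev = names[i - 1];
            while (n < prev.size() && n < names[i].size() && prev[n] == names[i][n]) ++n;
        }
        out.push_back(PrefixCoded{i > 0 ? i - 1 : 0, n, names[i].substr(n)});
    }
    return out;
}

// Rebuilds one string by following the chain of bases back to a full entry.
std::string decode(const std::vector<PrefixCoded>& coded, std::size_t index) {
    assert(index < coded.size());
    const PrefixCoded& e = coded[index];
    if (e.prefixLen == 0) return e.suffix;
    return decode(coded, e.base).substr(0, e.prefixLen) + e.suffix;
}
```

It works best on a sorted list, since neighbors then share long prefixes. The cost is that decoding one string may walk a long chain, so you'd probably want to force a full entry every so often.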
I think in the end I'll probably have to do the zero-strings option to please the masses. It also has the advantage of making your memory use more controlled and calculable. Like, you use 32 bytes per resource, not 32 bytes + enough to hold the strings whatever that is.
I broke down and bought a Consumer Reports membership to check up on cars. It's reasonably cheap, you can get one month for $5.95, but of course you have to remember to cancel and there are many reports of that being a pain and then them continuing to charge you.
More importantly it's just crap. It's worthless. There's not really much on there that you can't find for free around the web. The only thing of value at all is the reliability survey, the actual articles are pure piss. I was actually shocked at the lack of information, I kept clicking around the web site trying to find the information that I assumed I was somehow not seeing. And with the growth of free sites like TrueDelta and even just the forums of places like Edmunds, you can pretty much get the reliability information elsewhere.
I mentioned to Ryan this concept : I think you can trust car owners complaining about their cars. A lot of stuff you see on web forums is just whiners and cranks, but people are so biased to defend their purchase, after spending $20k or more on a car they really don't want to admit that they fucked up, so when you see someone say that they hate their car and regret buying it, there must be real problems.
This time we see some error delta images. These are created by taking an image, compressing, decompressing, then subtracting from the original, and finally multiplying up by some constant so that you can actually see the errors (normally there are values like 0,1,2,-1,-2, so they would just appear black and not visible to the human eye).
Wavelet :
Lapped DCT :
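For reference, the error delta computation described above amounts to something like this sketch for an 8-bit grayscale image. `errorDelta` is a hypothetical helper, and biasing zero error to mid-gray (128) so that negative and positive errors are both visible is an assumption of this example, not necessarily how the images here were made :

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// original and decoded are same-sized 8-bit images; scale is the constant
// the small differences get multiplied by so they become visible.
std::vector<std::uint8_t> errorDelta(const std::vector<std::uint8_t>& original,
                                     const std::vector<std::uint8_t>& decoded,
                                     int scale) {
    assert(original.size() == decoded.size());
    std::vector<std::uint8_t> out(original.size());
    for (std::size_t i = 0; i < original.size(); ++i) {
        const int diff = static_cast<int>(original[i]) - static_cast<int>(decoded[i]);
        const int shown = 128 + diff * scale;  // 128 = no error (assumed bias)
        out[i] = static_cast<std::uint8_t>(std::min(255, std::max(0, shown)));
    }
    return out;
}
```

With a scale of 20, an error of +1 shows as 148 and -2 shows as 88, instead of both being indistinguishable from black.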
The lovely Big Picture series about spring reminded me that I've really enjoyed all the daffodils and crocuses that have been blooming here of late. It certainly doesn't feel like spring here - it's still cold and gray - but the plants seem to think it is. I guess daffodils and crocuses are classic northern European flowers, and Seattle is a lot like northern Europe; it's different than the desert wild flowers of spring I'm used to in Texas and Southern California, like the poppies, bluebonnets, Indian paintbrush, all that kind of stuff.
Anyway, I took some pictures walking around the neighborhood the other day. I think there's a very modest beauty to the little patches of flowers.
Also, this is right near us and always makes me laugh :
Often in life you find that really stupid people actually basically do the right thing, then semi-intelligent people reason about things but get it all wrong and wind up doing totally the wrong thing, and then the truly intelligent do the same thing as the dummies. You see this in poker of course a ton, people who just "bet when they have a hand" are actually sort of doing the right thing even though they have no concept of what's going on. Then semi-intelligent people start thinking all these cock-eyed things like "wait that makes no sense, the goal is to maximize EV weighted by the probability of each outcome, so let me estimate and do some mental math..." and they totally cock it all up. Another example I believe is the pursuit of base physical pleasures.
The most animal of pleasures are the greatest. Animal physical and mental pleasures are our reward for getting through this life of ours. The physical pleasures are the taste of food, the glow of exercise, the tingle of drugs, the haze of booze. The base mental pleasures are things like ego boosts, the feeling of acceptance by friends or peers, the feeling of being loved or wanted.
Now, the dumb obviously have no problem accepting the base pleasures. But they get it all wrong. They do it vulgarly, without discretion or sense. For example the dumb boys who chase every bit of tail in the bars. Yes, I applaud them for seeking out the wonderful physical pleasure of sex, but they make it vile, they make it unpleasant for the good girls to participate, and they cheapen the whole experience. It should be subtle, full of anticipation and hints. Similarly the dumb fatties all over the mid west. Yes, I agree, food is delicious, but they do it all wrong, just stuffing their gobs full of cheap filth all the time. Better to eat less and seek out new exciting pleasures all the time instead of just trying to get more and more of the same thing.
Intelligent Hedonism is about pursuing the physical pleasures in a smart way. Not over-indulging, not over-using one particular pleasure so that you tire of it, not pursuing one when it's not wanted, and not just doing it for a fix. The Intelligent Hedonist is always striving to find new pleasures, and to enhance an experience. You might not get it very often, but when you do it's really good.
It can be simple things - like the smell of the air after a rain, or seeing a sunrise from the top of a hill - those are the moments you stop and savor, and you might spend days or weeks working to set them up.
The Intelligent Hedonist doesn't just throw back wine to get drunk. They enjoy every moment. First the joy of the hunt - reading, searching, shopping, learning. Then just touching the bottle and seeing the label - the moment of anticipation, wondering what it will be like. The ritual of the opening and pouring, the sound of the pop as the cork comes out and the glug glug of the first pour (at a restaurant the whole ritual of the taste test). Then the nose, the taste, the feeling. But not just wine - obviously a lot of people pretend to do this with wine now because TV and Movies have told them to. You do this with everything because you love it. With a chocolate bar, you admire the wrapper, the sound of it tearing open - hopefully it has a nice gold foil under the paper and you get to lovingly fold that back; then the color of it, the shape it's pressed, the feel of it just slightly melting in your fingers, the scent of the cocoa notes, then the feel in your mouth, on your tongue, the melting velvet, and then the taste.
The Intelligent Hedonist is not just a complainer or a crank. He/she doesn't just moan about the shitty restaurants or the bad movies. He seeks out the good stuff. His life is an endless quest for things that are made with care and skill.
A lot of semi-smart people see things like typical drug users and rightly think "that's gross, that's a cheap way to get a high, I don't need that, I'm above that". Well, sort of. You're right that the way the dumb hedonist is using drugs is wrong, but you're wrong in thinking that the highest goal is to avoid them. The same thing goes for casual physical sex, many semi-intelligent people see it and think of it as manipulative, degrading, cheap, unfulfilling, and thus swear off it completely and spurn all those who partake of it. Well, yes, again you are right that the way dumb people pursue that pleasure is in fact gross, but that's because they're doing it wrong.
... is gonna suck. Tons of people that I usually look forward to seeing aren't even going.
I'll be at the RAD booth much of the time, so stop by and say hi. I'm not really pushing my product (Oodle) yet, but there will be an early preview of it, and of course you can see all the other great RAD libraries (and a preview of Sean's Flash/GUI product too).
Obviously, the Larrabee talks are going to be awesome, but I haven't heard of anything else too exciting yet. I'll add to this post if I find anything worth going to.
URG FUCKING TAKE YOUR SHOES OFF YOU COCK
I wonder what the outcome would be if I yelled that out at the top of my lungs. Improvement or no?
On all these car sites you constantly read about the "fine German engineering". What? I mean the stuff is really nice, I am very sympathetic to their aesthetic, and I love that they are driver-centric. But that's just the design choices. Their build quality is just awful. You get quotes like this :
"It is assembled with typical BMW care and craftsmanship with high quality materials used throughout"
the real marvel is the incredible German Marketing. Holy crap they have done a number on people.
You also get lots of auto writing that just doesn't make any sense at all :
"The acceleration, especially from a standing stop, is super responsive and yet restrained enough to delight."
What? In what world does restrained equal delight ? Oh yes, please restrain the acceleration more, oh how delightful. So a Hyundai Elantra makes you jizz in your pants?
It's very important to be aware of how your product name will work in search these days. If you name your band something like "Soap" you're never going to show up.
For example apparently there's a company called "Rad Technologies" in Hyderabad, and they posted a job listing on Oodle.com (a big classified site), so if you search "Oodle Rad" you find them. There's also a Developer dot Oodle dot com (for the Oodle classified site) which has an Oodle API and so on.
Anyway, the thing that's annoying me today is I'm trying to search for differences between DirectX 8 and 9, and it's just impossible to search for anything related to DirectX any more.
What they should have done is made a different name for the runtime and the developer API. Call the runtime DirectX but call the developer interface apiX or whatever. That way when developers are talking about the internals, we can find each other.
Car buying is one of those things where you can easily fall into the trap of worrying about what other people will think of it. I don't just mean buying a car to impress others, that's the obvious trivial way and not a big deal. The more subtle bad thought process is mentally justifying your purchase to others. People who buy a Honda Civic are preparing their rationale in their heads as they buy it, imagining conversations where they tell people "it's just so practical, it's very reliable, it's fast enough for me for now" or whatever.
I've always been biased against Porsches because they're underpowered for the money, and they are so often the car of middle aged mid-life crisis men. So what? Getting as much horsepower per dollar is not the goal. It's to get the car that you enjoy. In my head I can hear the taunts of the stupid public saying "oh but some dumb American car has way more power at half the price, how could you buy that?" and I prepare a response in my head. Fuck you, I don't even have to respond to you. Stop thinking of rationales.
In my head I keep hearing people say "why would you get a 1 series when you can get a 3 series for just a few k more?" Well, voices in my head, I'll tell you why - because I like it better. It's smaller, lighter, faster, stiffer, more nimble, more fun. But really the point is that I don't need to make reasons for you, voices in my head.
In the end there are just way too many conflicting factors to weigh them sensibly against each other in any objective way. You just have to go with your instincts.
You can learn a lot about how to invest by doing the opposite of me. Some of my guiding principles that have been hugely detrimental to me are :
"It's just a bubble - I'll stay out" . Sounds reasonable, but is quite silly. Bubbles are great opportunities to make a lot of money. Staying out because "it doesn't make any sense" or "it's rationally wrong" or "its fundamentally overvalued" is just shooting yourself in the foot. I've intentionally stayed out of tech stocks, real estate, gold, etc. because they are illogical bubbles.
"It's too late to play that now" . Over and over I've seen very obvious trends but figured I missed the good early chance to get in, so now I should pass it up. One example is bubbles, it's so obvious when they're happening, and I would think "oh, this is super obvious to everyone now, it's too late". No, it's not. These things take a long time and there's plenty of chance to get in late. There have been tons of obvious plays recently - like investing in China and India a few years ago - where I thought oh it's already obvious to everyone it's too late to get the good value.
"That's too obvious the market must have compensated for that already" . I always assume that the market is doing a good job of compensating for news and obvious trends. Like if some company announces an awesome new product and everyone loves it - the stock should shoot up based on that news and already have all the good news baked into the value, right? Well, no, not really. Things like "Halliburton will do well under Bush/Cheney" is just such an obvious play that I thought it would never work because the market had already compensated. Similarly stuff like shorting Home Depot in the recession.
"I should do what's been proven to work best over time" . In scientific studies of investment strategy, no portfolio strategy beats buy and hold in the long term (or if it does it's by a very small amount that's less than transaction cost). So you're "supposed" to just find broad market funds with minimum expensenses. Listening to that advice has been a huge mistake.
My god I am so fucking sick of your retarded broken ass software. Fucking bill pay web site can't handle commas or dollar signs in the number entry boxes. WTF I just copy-pasted from the amount you are showing me is due into the payment box and you fucking freak out and can't handle it !? In fact, all web based UI can just bite me. Ten second stall outs to show me the next GUI box are not okay.
So I went test driving with Ryan yesterday. Here's what we saw :
Nissan GT-R : it looks way bigger in person than it does in images, it's really a hulking beast. The cockpit is very tight, you really feel like you're wedged in, which I guess is good if you're gonna push it. Unfortunately this salesman was a complete dickwad retard and wouldn't let me drive it. WTF I'm not gonna buy it if I can't drive it. If I had felt how it moves and loved it I might have considered it, but it is rather impractical. The douche kept telling me about how rare it was and the premium on the price. Look dickwad, there are *tons* of GTR's for sale on Ebay right now, in fact it's easier to find than an Audi RS4 or a new M3 or a BMW 135.
Infiniti G37 : this is the "luxury" version of the Nissan 370Z, it's the same engine and chassis, just different body and nicer interior. I use quotes on luxury because there is not much luxury to be had here. Everything feels like a cheap Japanese car; actually it's worse than that, the interior is *way* worse than the interior of my Prelude, because the Infiniti is all cheap plastic, but it tries to look fancy, so it's got this ridiculous plastic analog clock in the dash (LOL?) and these shiny fake-metal plastic pieces everywhere. The whole thing kind of reminds me of like a strip mall with roman columns in front of it. Anyway, the interior isn't that important to me, how does it drive? Meh. They only make the 4WD in auto so I tried the RWD manual. The shifter is okay (though we smoked the clutch up pretty good). It's definitely got speed, but you don't really feel it. The power delivery is weak at low revs, and the torque just isn't there when you want it. There is a huge excess of Infinitis right now so they can be had very cheap (perhaps below invoice).
Audi S5 : this thing really looks beautiful in person. The styling is sporty but subtle. It actually really reminds me of the BMW styling, but just smoother and better. The interior is also (mostly) nicer than BMW, nicer feeling leather and better arm rests and just nicer trim all around. The only exception is the center console control stuff which is oddly bangled and blingy looking with all kinds of shiny metal and studded pieces that look like they belong on a rapper's necklace. Unfortunately the drive quality just sucks. It feels like a boat. It feels really huge and heavy, it's got a big powerful engine but it doesn't really feel fast. The steering was way too loose, by far the loosest of any car we drove that day. Throttle response was laggy. Visibility is very poor - the rear view mirror is tiny and the blind spots on the side are big. I really hate this trend in car styling where they make the body taper to the front so that the butt is really big, but they also make the roof taper down in the back, so that the rear window is just microscopic (the 370Z has the same problem, in fact a lot of these cars do). In the Audi you get a rear view camera to help you back up despite the bad window and the huge beastliness of this road-boat. Yuck. The sales guy told me there's a selective sport option that tightens up the feel but I don't believe him, and it would still be huge and heavy. It's a fucking V8 with 350 horsepower and it just felt slow. Plus it's a fucking Audi which is just one of the worst made cars around right now.
BMW : the dealer guy was nice and we drove a ton of cars and they were all quite pleasing. In all of them the interior is pretty nice and solid feeling. The steering controls are direct and responsive. Personally I'm not a big fan of the iDrive computer stuff, however it is better than the ghetto-iDrive-knockoffs that were in the Audi and Infiniti. It seems hard to find a car now without all kinds of unnecessary gadgetry, and if you are going to have them the BMW gadgets seem to be the best. The non-iDrive interfaces to the stereo and such are awfully minimal, like it just looks like it's only partially finished, with buttons not even labelled and such.
335xi : this is the 4WD twin-turbo 3L inline-6 that does 300 hp and 300 lb-ft of torque. This one was a little disappointing. Steering felt good, but throttle response was very laggy. I would jam on the gas and it would take a good "one Mississippi" before the power kicked in. Not sure what the deal was, maybe the AWD was slowing it down? The RWD 135 we tried later had the same engine and didn't seem to suffer from the same sluggishness. Also it was an automatic, even though I put it in sport mode it was hard to coax aggressive gearing out of it. It does feel a little on the big & heavy side to me, but suspension and steering were still very crisp. I'd like to try a manual but they didn't have any.
M3 : for the fuck of it I tried the new M3 with 4.0L V8 414 HP ; this one had the dual-clutch automatic / paddle shifter setup. The transmission was very quick and smooth. I'm sure I could drive much faster with this semi-automatic kind of setup than with a true manual, but I did miss the feeling of the manual a bit. I dunno if I'd get used to the semi-automatic over time. I think the steering wheel paddle shifts are a bit of a gimmick and not very practical. It's nearly impossible to keep your hands on them through a hard turn, which means you get stuck in a bad gear. Maybe on the track they would be useful. The engine sound in this thing is glorious, it's deep and rumbly and loud, it sounds like the beast it is, and it's got tons of power. It definitely does feel big and heavy, you have to muscle it around. The "M" button is indeed exciting. There's just something pleasing about flicking a switch to turn on the better mode (like the computer "turbo" buttons of old). It is kind of just a gimmick for me though since I think I would just drive it in sporty mode all the time. (with the "M" turned off it feels like a big heavy luxury sedan). While it was good, I didn't really love it. If you want the feel of an American muscle car but in a German luxury sedan, and some tight road feel - this is for you. But that's not really what I want. I want the feel of a go-kart, but with low end grunt, and some decent build quality and comfort. I do like the controls to turn traction control on and off and all that stuff.
135i : this is the same engine as the 335xi, but in a slightly smaller, lighter RWD package. Whoa, this was a revelation and probably our favorite car of the day. The rear seats are pretty tiny, there's no AWD. Again we were stuck with an automatic, but the semi-auto sport shift control thingy was pretty good. The front cabin is plenty spacious, you can sit up straight and spread out. It's got twin turbos like the 335, but you can't feel them kick in at all, there's no real power gap, it feels very smooth and always available.
While the 135 is a bit lighter, it's by no means light. It weighs 3,373 lbs (vs 3,571 lbs for the 335i and 3,759 lbs for the 335xi). But it felt really nimble and stiff and peppy. It's got a slightly shorter wheelbase than the 3 series, which I think makes it feel tighter. On the down side, it is a new line which means it is somewhat rare and it doesn't look like prices are too great (I'm seeing it go for MSRP - close to $40k; I'd like to see it go for Invoice - close to $32k). Also because it's a new line there seem to be some 1 series reliability problems. Hrrumph.
The steering feel and response on all the BMWs was excellent. The double clutch automatic was superb, super quick and smooth, and the manual up-down control was okay.
Overall there wasn't anything I fell in love with. It was fun - my god, all these cars have tons of power, and many of them have great screaming engine sounds, but they just feel heavy and laggy. The 135 was the most pleasing and I think I could be happy buying that right now, but it's not totally ideal.
At the end of the day we got in Ryan's WRX and it reminded me what a nice car that is. Very light snappy feel, good control. If only it had a little more low end grunt and slightly more comfortable interior it would be pretty perfect. Eh, I also feel like you sit a bit too high and upright in the WRX, I prefer to be slung down in a cockpit a bit more. Fuck maybe I should just get one anyway.
As usual I was just shocked at how shitty the salesmen were. The BMW guy was nice, but only in a relative sense in that he didn't actively deter us from buying his cars. At every dealership we'd ask basic questions about the cars - what's the engine in this one, what transmissions are available, what's the difference between the sport package and the base, etc. and time after time we'd get "I don't know" or even worse - just completely wrong answers. We asked the Infiniti guy how much the G37 weighs. After a bit of hmm, err, he wanders around to read the labels on the car and announces 2400 pounds. I'm like, "umm, no, that's not right", so he umms and errs around a while and announces "4600 pounds". Umm, no, I don't think so. (the real answer is 3,590 lb). I literally had to prompt the salesmen to give me brochures and business cards. You owe me 10% of your commission.
I also did not see any of the desperation that I hoped for. Despite constantly reading about how bad things are in the news, I have yet to see it. I found these nice graphs on the auto industry downturn that show just plummeting sales this year, and there are tons of photos around the net like this or this of huge lots of unsold cars. But I have yet to see big deals or much eagerness from dealers. Wouldn't you rather sell the cars for a tiny profit than just have them sit? It makes no sense to me. I have money and want to spend it, give me a bargain!
BTW fucking foot-pound vs. newton-meter for torque is a disaster. Part of the problem is the conversion is only a 1.36 multiplier so if someone says "300 torque" you can't just guess the units. That's probably 300 Nm = 221 ft-lb. Of course there's also the horsepower fuckup; there's an "english" or "mechanical" horsepower and a metric horsepower which are off by a factor of 1.01, so at least that's not a huge difference but you have no idea what anyone is talking about. And then of course the standard horsepower that's quoted is "brake" horsepower (bhp) which is really sort of dumb, because it just measures the engine in isolation; what's really more useful is the effective or "wheel" horsepower, but car makers never tell you that.
Some other things I might have a look at : previous gen (E46) M3's , they were a bit smaller and lighter, though I find the styling very generic and unexciting. Audi RS4 is very fast with good AWD, but it is an Audi, and very expensive, and awfully heavy and a huge gas guzzler. Lexus IS - the specs are good, but Lexus tends to make cars feel really mushy and boring, plus the manual is only in the 250. The advantage is that Lexus actually makes cars that don't fall apart.
As usual the retards that are finally up in arms about our financial disaster have started getting the blame all wrong.
The Daily Show segment about short sellers really pissed me off. (the fact that they mocked the random short seller guy and actually seriously listened to the crazy Overstock.com loony-toon guy is ridiculous). My god, short sellers don't bring down the companies. The companies are failing. Short sellers *might* occasionally help prick the bubble of inflated stocks that are going up purely based on momentum and collective belief in a fantasy.
There actually was a country that legislated that stocks could only go up. You were only allowed to sell stock for equal or greater value. (someone help me find a reference to that). Obviously that's retarded. Keeping bad companies afloat should not be our goal. Taking all the fluidity out of markets is not the goal either.
One bad aspect of the whole short sell / trading thing is how twitchy and over-reactive people have gotten. Not just day traders but professional brokers. Some good news comes in and stocks shoot up way past their real value. Bad news (like a bunch of shorts) comes in and stocks shoot way down. That's obviously bad.
If you're going to get mad about some practices in trading, there are plenty of things to be up in arms about. An obvious one is the intentional spreading of information (false or not) to affect stock prices, which has become quite standard and is borderline illegal and definitely unethical. The collusion at big banks between the investment bank part that offers a certain stock and the brokers and analysts that recommend that same stock is another.
Another very common practice to get mad about is the way the hedge funds stalk the major indexes. Quite a few of the large hedge funds now basically act as friction on the market. Any time the market wants to move anywhere, they take a piece. They act as an energy drag, sucking off a bit for themselves. This works basically because all the large mutual funds track certain indexes. A huge amount of the total stock ownership is through these large mutual funds. The indexes and the funds take some time to react to things, so when something happens, they have to announce it, and then it takes them a while to get things done. The mutual funds often have the problem that they have so much money to move that it takes them a long time to buy or sell the stock they're after.
The simplest and most obvious case occurs with the S&P 500 (or the DJIA). Whenever a stock is added or removed from the list of companies in the index, all the many huge funds that track that index must sell one stock and buy another. These huge quick computer-automated hedge funds jump in and buy up the stock that the funds need to get into, then sell it to them for a higher price. This is basically like if you walk into the grocery store and announce that you're buying apples, and some asshole runs over and buys all the apples and then offers to sell them to you for double the price.
This is just literally stealing money from long term investors. I'm sure there are lots of other practices like this that I don't understand, but plain old short selling is not one of them.
The other dumb fixation of the media was the whole corporate jet issue, and more generally just slight excesses of spending, like million dollar parties and whatnot. It's ridiculous on so many levels. First of all, it's a fucking drop in the bucket. It's not even a huge waste of money because it saves them time and whatnot. Second, the idea that the rich should stop spending lavishly or be embarrassed about spending now is completely backward and will only make things worse if they tighten the purse strings. Third, the executives are getting paid way more than that in salary and bonuses and options; in fact there have been a few cases where some executive redecorated his office with a million dollar bidet and the media got all excited, so he just paid for it with his own salary. What? How is that better?
The recent AIG bonuses bring up a related point - yes, it's a bit ridiculous that they're paying out huge bonuses after receiving hundreds of billions of bailout money - but it's our own fucking fault. We just gave away a trillion dollars to all these fucking crooks with absolutely *ZERO* strings attached. No regulation, no requirement about how it be used. Of course they're just going to pocket it, why would they pump it into their failing business? That's what they're good at - giving themselves profits, that's their job to some extent. It's the fucking government's fault for giving out that money without regulation. Now congress and the media get in a big tizzy and call them in and say "hey, waaa you're not spending it how we wanted" and the executives are just like "meh, fuck you, you gave us the cash, I'm buying a new yacht".
I think the focus on the hedge funds like Bear Stearns or investment banks like Merrill is a bit out of place too. We should have just ignored them. It's perfectly fine for a hedge fund to take huge risks and sometimes lose on those bets - but we have to just let them fail! There is basically zero public interest in those companies. They invest the money of rich people and big institutions - they're not regulated and have got to just be allowed to fail. In the future we don't need tighter regulation of funds - we need to pledge to never bail them out. That might mean limiting how big they can be and also perhaps limiting how much regulated institutions can invest in them. Basically we never want them to be "too big to fail".
The other myth that's being spread by the popular media is that all these poor homeowners are just victims in the crisis. Nonsense, in fact they're a large part of the problem. Now certainly a few people were victimized, given really bad mortgage deals that they didn't understand, and they just wanted a place to live and weren't speculating. But tons of people knew exactly what was going on. They were intentionally buying way out of their means because they wanted to get rich on real estate. They intentionally got neg-am loans because it allowed them to leverage up. They were speculating, they were watching TV shows about flipping properties. Most people knew we were in a bubble and kept buying anyway. In theory I don't blame them, they saw a profit opportunity and took it, but that doesn't mean we should now feel sorry for them or protect them or bail them out. However the reality is that US law has lots of favorable protections for mortgages, and speculating on investments that are protected by the government is definitely unethical and really should be illegal.
Basically my view is that there are two reasonable extremes : (1) completely free market with zero protections and little regulation, or (2) safe protected market with lots of regulation. Either one is okay, but the thing that has gotten us in so much shit is our semi-protected semi-regulated market in which people can speculate and then have their risk covered by the government. That is fucked up. Also the illusion that things are regulated when they really aren't, and the ability of financial institutions to cross the dividing line between the safe market and the risky one (in US law, a "bank" with FDIC insurance is supposed to be very safe and have plenty of capital and not take too much risk - similarly a mortgage is supposed to be a very safe type of loan that can be covered by the value of the property).
The big problem I see in general is that companies' and individuals' profits weren't tied to the deals they were making. Mortgage brokers for example were only motivated to do as many deals as possible, because they take a commission on each deal, and then just resell the actual mortgage, so they get none of the risk from the actual property. This gives them zero motivation to actually do a good deal or even turn down borrowers that are bad risks.
It seems like we need some kind of law to make profits more tied to actions. What we don't need is a bunch of micro-regulations about exactly what constitutes a safe mortgage or a good borrower or whatever. Unfortunately that's the way the US government tends to solve problems. It's much better to let the free market work and decide for itself who's a good borrower - the problem is just the mismatch of reward from actions.
Mortgages in particular seem pretty easy to fix. If mortgages could only be issued by institutions that backed the mortgage themselves, it would all be fixed right there. Now they don't want to offer bad loans because they are backing the loan and they have to eat the default. Brokers no longer get paid just on volume, but rather from the actual return of the mortgages they offer. No more trading mortgages, no more government Mae subsidies backing the industry. Seems very simple and obvious and I don't really see a downside.
Another example is the credit rating companies. They have a complete conflict of interest and no motivation to rate things right. In fact their motivation is just to over-rate everything because that makes people happy and makes them issue more securities, so you get more business. The easy way to fix this is just to eliminate all of these made up abstract nominal credit ratings. Boom. Instead, the credit rating companies offer insurance on the securities. The value doesn't necessarily come from anyone *buying* that insurance, but simply seeing the price of it gives you the effective rating. That is, the price of the insurance on a given asset *is* the rating of how safe it is. If the rating is wrong, then the insurer can lose money because of the mistake. Thus they are motivated to give it the right rating. And it's in absolute directly measurable units - dollars.
Instead we'll probably have some fucking huge mess of laws about how the rating companies have to work, what exactly constitutes each of the rating scales, blah blah blah. That's terrible. Huge mess of regulation on top of a basically corrupt system is the wrong way to do things. Instead dig up the system and make it non-corrupt. Use the free market, but force the free market to work correctly - that is, make pricing honest and make the people taking the profit take the risk.
I like the idea of forcing people to price things as a way of enforcing honesty. Like if a trader recommends a certain stock - boom you have to buy it. If you recommend shit, you will lose money. Thus you are motivated to only recommend things that are actually good. You could make a law like anyone publicly recommending a stock has to buy it and hold it for at least a year.
Windows "File & Printer Sharing" Networking is really annoying me. It's one of the main culprits of long mystery stalls during boot. It also gives you the awful horrible stalls when you open Explorer for the first time.
What I would like is to be able to boot with Windows Networking completely disabled. Then when I choose at some later point I'd like to be able to turn it on. Can I do this?
The other thing that kills me is if I slip while typing in CMD and accidentally dir to one of my networked mapped drives, I just get a huge stall.
My Windows XP boot takes about 1 minute. I've been using "BootVis" and "BootLog XP" to check it out. BootVis is pretty cute, it makes nice graphs, but the actual important part of boot it just labels as "services" with no breakdown about *which* services. BootLog XP does a bit of a better job because it shows each DLL load and times each one, so you can make some educated guess about which services are taking all that time.
There's tons of little shit, okay, whatever. The AntiVir takes a lot of time, maybe 15 seconds total. Oh well, that's life.
csrss.exe is the next biggest time eater, it takes about 10 seconds. Apparently most of that is because of loading the registry. I checked "pagedfrg" and my registry is not fragged at all.
The other big thing is svchost starting up the networking services which takes around 10 seconds.
I made the classic error of actually trying to use the APIs you're supposed to use instead of just writing my own code from scratch.
First of all, the stdlib "tmpnam" is just horrifically broken in the MSVC CRT. I don't understand WTF they're thinking, but it's just awful in various ways :
1. It actually puts the files in the ROOT of your damn drive ! WTF.
2. It doesn't work in Vista at all (you can't fopen the name they give you back). Presumably because it's putting the file in the root and the app doesn't have write access to the root (?).
3. The names are awful, like "s38b" , usually with no extension at all. This makes it very hard to delete them safely when you get like 100,000 of them piled up in your fucking root.
Okay, okay, so we're on Windows we can use the Win32 functions. Well, on the plus side, GetTempPath seems to actually work, it gives you the dir specified by the "TMP" environment var (BTW if it can't find a temp setting, it falls back to the Windows directory which is kind of awful, but whatever, it at least finds the temp dir normally).
On the minus side, GetTempFileName is retardedly awfully bad.
If you just glance at the docs it seems like it's pretty reasonable. You stick 0 in for uUnique and rock on.
But then one day your app grinds to a complete halt because of GetTempFileName and you go "WTF?". Well guess what, GetTempFileName only actually makes up to 64k temp file names. And it actually checks each name for existence and then tries again if it's taken. This means that if you even get over 10k temp files in your temp dir, it can make names and check existence over and over before it finds an open slot. If you get more than 50k temp files, your app is basically dead.
Now obviously you don't want to have a ton of temp files hanging around, but Windows never cleans up your temp dirs so if you have some bugs at some point or even just other random broken apps, you can easily crud up your temp dir. And the prefix is only 3 characters so you can often run into other people's prefixes!
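To get a feel for why a crowded temp dir hurts so much, here's a toy model of the search. It assumes a 16-bit name space that is probed upward one name at a time from some starting number, wrapping around, which is my reading of the documented uUnique == 0 behavior and not the actual Windows code; CountProbes is a made-up name.

```cpp
#include <vector>

// Toy model: 'used' marks which of the numbered names already exist.
// Starting at 'start', step forward (with wraparound) until a free one is found.
// Returns the number of existence checks performed, or -1 if every name is taken.
int CountProbes(const std::vector<bool> & used, unsigned start)
{
    const unsigned n = (unsigned)used.size();
    for (unsigned i = 0; i < n; ++i)
    {
        if (!used[(start + i) % n])
            return (int)i + 1;
    }
    return -1;
}
```

In the real thing every one of those probes is a trip to the file system, so landing at the start of a run of 50,000 used names means 50,001 existence checks for a single temp file.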
One improvement would be if they stuck the app's name onto the temp path and made a subdir, so that you at least got your own dir to play in so that one broken app couldn't mess up the whole system.
On the plus side they actually use the extension ".tmp" so you can go delete everything.
Anyhoo, this is all silly because we have this thing called "long file names" and we don't need to be using 4 characters of hex (!? WTF) which makes it so easy to get name collisions.
So I wrote my own; here's my current version of MakeTempFileName :
// stdc tmpnam seems to put files in the fucking root !?
// note : CombinePaths is not told intoSize, so 'into' should have room for MAX_PATH chars
void MakeTempFileName(char * into,int intoSize)
{
    // use better routines on windows
    do
    {
        char tempPath[MAX_PATH];
        GetTempPath(sizeof(tempPath),tempPath);

        static uint32 s_seqNum = 1;
        uint32 seqNum = s_seqNum;
        ++s_seqNum; // not thread safe, whatever

        // use tsc or something to get a real random int :
        Timer::tsc_type tsc = Timer::rdtsc();
        uint32 tscLow = (uint32)tsc;

        char tempFile[MAX_PATH];
        // %u rather than %d because seqNum is unsigned
        sprintf(tempFile,"cb_%u_%08X.tmp",seqNum,tscLow);

        CombinePaths(tempPath,tempFile,into);
        into[intoSize-1] = 0;

        // retry while this name exists :
        // (would be very bad luck)
    } while( FileExists(into) );
}
yes, yes, this calls lots of cblib stuff.
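If you don't have cblib around, roughly the same idea can be sketched with just the standard library. This is an illustrative variation, not the cblib code: the helper names are made up, the temp dir lookup via the TMPDIR/TMP/TEMP environment variables with a "/tmp" fallback is an assumption, and the "does it exist" check is a plain fopen, so a file you can't open for reading counts as not existing.

```cpp
#include <cstdio>
#include <cstdlib>
#include <random>
#include <string>

// Hypothetical helper : treat "can open for reading" as "exists"
static bool FileExistsPortable(const std::string & path)
{
    if (FILE * f = std::fopen(path.c_str(), "rb"))
    {
        std::fclose(f);
        return true;
    }
    return false;
}

// Hypothetical helper : first non-empty of TMPDIR / TMP / TEMP, else "/tmp" (assumption)
static std::string TempDirPortable()
{
    const char * vars[] = { "TMPDIR", "TMP", "TEMP" };
    for (const char * v : vars)
    {
        const char * val = std::getenv(v);
        if (val && *val) return val;
    }
    return "/tmp";
}

// Sequence number + 32 random bits in the name, same shape as "cb_%u_%08X.tmp"
std::string MakeTempFileNamePortable(const std::string & prefix = "cb")
{
    static std::mt19937 rng(std::random_device{}()); // not thread safe either
    static unsigned seq = 1;

    std::string dir = TempDirPortable();
    if (dir.back() != '/' && dir.back() != '\\') dir += '/';

    for (;;)
    {
        char name[64];
        std::snprintf(name, sizeof(name), "_%u_%08X.tmp", seq++, (unsigned)rng());
        const std::string full = dir + prefix + name;
        if (!FileExistsPortable(full)) return full; // retry only on a collision
    }
}
```

Like the original, this only hands back a name; there's still a window between the existence check and whoever actually creates the file.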
My TIPS are down about 10% in the last year. All bonds fell a lot during the crash. I'm not sure exactly what the reason is, I guess corporate bonds lost value because there was a fear of defaults as people realized the ratings weren't what they were supposed to be. Also the yields have plummeted which has made them less desirable. I don't really get it though. I mean I guess even though the bonds are fixed income, the price for them is set by the market, so is prone to fluctuations of supply & demand.
Savings accounts are no good either. Here's the delightful ING reports :
Dec 30, 2008 Interest Rate Change to 2.472% (2.50% APY)
Jan 20, 2009 Interest Rate Change to 2.374% (2.40% APY)
Feb 3, 2009 Interest Rate Change to 2.178% (2.20% APY)
Feb 18, 2009 Interest Rate Change to 1.835% (1.85% APY)
Mar 3, 2009 Interest Rate Change to 1.638% (1.65% APY)

Steadily plummeting to zero. Though really that's just an indicator of falling inflation. The real inflation rate is maybe something like the ING APY plus 2 or times 2 or something like that.
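In case the two numbers in each ING line look mysterious : the first is the nominal rate and the APY is what you get after compounding. A quick sketch, assuming monthly compounding (my assumption, ING doesn't say so in those notices, but it reproduces the quoted pairs); ApyFromNominal is just a name I made up.

```cpp
#include <cmath>

// APY = (1 + r/n)^n - 1, for nominal annual rate r compounded n times a year
double ApyFromNominal(double nominalRate, int periodsPerYear)
{
    return std::pow(1.0 + nominalRate / periodsPerYear, periodsPerYear) - 1.0;
}
```

With 12 periods, 2.472% comes out at about 2.50% and 1.638% at about 1.65%, matching the notices.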
I think almost anything you do with your money is a loss. The reason the bank is giving you X% is because they think they can do better and make money off you. There's no free money in the world, the bank is basically investing for you and taking a big commission. Bank interest rates are almost always below inflation (the only exception is during brief periods when things haven't adjusted yet).
Anyway, since there's nowhere to save I figure I'll just invest in Hookers and Blow.
The weather really hasn't been all that bad this winter. In fact since January 1st it's been sunny quite often, and it's absolutely gorgeous up here when it's clear, what with all the green and mountain views around.
Just recently another patch of the interminable drear has set in :
Ugh.
We've had a few bouts of hail and sleet and freezing rain and all that gunk. It's so much better if it just really snows, in fact I rather enjoy the snow if it would just dump and then get sunny again.
I like how the newspaper here does stuff like this :
One thing I've been shocked by as I read about cars is just how fast they all are now. In the last 10 years the average horsepower has shot from 170 to 230. A 300 HP car used to be a rare performance beast, but they are now extremely common in the luxury/sport segment.
You can see it very nicely in this graph of Horsepower vs. MPG over time . As much as I appreciate this, it's pretty retarded and obviously way out of step with the times. The auto industry seems really really slow to react. People have wanted more economy for 5 years now, but they have just kept churning up horsepower. Now that gas prices are falling back down, the industry is just about to start putting out more efficient cars.
Anyway, what this means is there are lots of ordinary cheap cars that are pretty damn good. For example, the Mazda 6 which is one of the most reliable cars you can buy, costs around $25k and has a 3.7L V6, 24 valves, 272 hp @ 6250 rpm. Not bad !
Another new cheap Asian sporty car coming out is the Hyundai Genesis. I know a lot of people have bad associations with "Hyundai", but if you take the sensible position that "build quality" means it won't fall apart and break down when you drive it, then a Hyundai is actually a better built car than any of the supposedly well engineered German cars.
The Genesis has 306 hp and 266 lb-ft of torque from a 3.8-liter V6. It's RWD and is basically a direct competitor to the Infiniti G37 Coupe. But it costs $30k, vs. $40k for the Infiniti (and around $50k for a comparable BMW). And actually the Genesis has much more low-end torque than either the G37 or a BMW 3, both of which need high RPMs to make power. (IMO a flat torque curve is much much better).
I also really like the styling of the Genesis. While the Germans seem to be racing each other towards rice-boy baroque over-decoration with unnecessary bangles and slashes, the Genesis has a sweet smooth simple styling, that's maybe a tad boring, but at this point the only choice in car styling is to pick the one that is least bad.
Check out the direct comparison of Hyundai Genesis vs. Infiniti G37 .
Sadly the Genesis is only RWD, and while the G37 comes in AWD, only the automatic is available, and the manual is only RWD. It reminds me of the dumb Lexus models that only offer their better engine with automatics. WTF. I think a manual is crucial. Though a no-stall first gear would be nice for rush hour traffic.
It would be really useful to be able to see a history of eBay final sale prices. You can search for "Nissan GT-R" on eBay, but it's hard to tell from the current prices what the actual sale prices are. I'd like to see a chart of sale prices over time like NexTag does. I guess they intentionally hide that info, but it would be pretty easy to scrape if you had a bunch of spiders and a fat pipe.
Also see Consumer Reports reliability summary and warranty reports by manufacturer.
BTW just looking at the ranking order of warranty reports is a bit deceptive. There's really a huge step :
Group A : < 10% very good
# 1. Mazda - 8.04%
# 2. Honda - 8.90%
(! paradigm shift here !)
Group B : not bad
# 3. Toyota (*) - 15.78%
# 4. Mitsubishi - 17.04%
# 5. Kia - 17.39%
# 6. Subaru - 18.46%
# 7. Nissan - 18.86%
# 8. Lexus - 20.05%
Group C : not good , around 25%
# 9. Mini - 21.90%
# 10. Citroen - 25.98%
# 11. Daewoo - 26.30%
# 12. Hyundai - 26.36%
# 13. Peugeot - 26.59%
# 14. Ford - 26.76%
# 15. Suzuki - 27.20%
# 16. Porsche - 27.48%
# 17. Fiat - 28.49%
# 18. BMW - 28.64%
Group D : very bad - 30% +
# 19. Vauxhall - 28.77%
# 20. Mercedes-Benz - 29.90%
# 21. Rover - 30.12%
# 22. Volvo - 31.28%
# 23. Volkswagen - 31.44%
# 24. Jaguar - 32.05%
# 25. Skoda - 32.12%
# 26. Chrysler - 34.90%
# 27. Audi - 36.74%
# 28. Seat - 36.87%
# 29. Renault - 36.87%
# 30. Alfa Romeo - 39.13%
# 31. Saab - 41.59%
# 32. Land Rover - 44.21%
# 33. Jeep - 46.36%

(* on Toyota because apparently Toyotas made in Japan are very good and would be in Group A, but Toyotas made in America are shit and bring down their average).
I'm disappointed with how loud my HTPC is. It's got the quietest of Scythe case fans, and the HD is perfectly silent, but I still hear the drone of the fans and it bugs me.
So today I thought I would try the simple solution - just stick it in my home theater cabinet and shut the door.
First problem - it's too big. I have an Antec Fusion case so it looks like a home theater component, but it's just a bit bigger than any standard amplifier, and my cabinet is standard size. Annoying, but that's not really a big deal because -
It's going to get way too hot in there. I figured I would have to cut air holes in the back of the cabinet anyway, so I'll just cut a bigger hole so that the oversize case fits. I wound up cutting just a few holes, since I had to use a steak knife to punch the entry holes and then my jig saw to cut them. (I wound up breaking the steak knife blade, damn shitty stamped blades).
Now it's been running in there a few hours, and it is indeed much quieter, but it's also getting boiling hot in there even with the big air holes. Damn. What I need is a large solid box with two open ends, so that the sound is muffled but it still gets plenty of air.
My next thing to try is replacing the PSU with a lower-power quiet one. They even make fanless PSU's now though that's a bit scary because it just means they pump more heat into the case.
My other option is to throw this damn thing out the window, burn down my apartment, ride my bicycle around the world and never touch another damn piece of electronics again.
I've created this fucking nightmare for myself where I have like 4 slightly different versions of the same code. I've got my home Galaxy3, my home cblib, my work Galaxy3, my work cblib, the old Exoddus code, and the Oodle Core, and they're all quite similar but a bit different.
Once a week or so I've been trying to merge the bug fixes I do in one bit to the other bits, and it's been just awful. Today on this miserable Saturday I've got a miserable big bit of merging to do. So I went and got the free trial of Araxis Merge.
On the plus side, the GUI is slightly better than P4 merge. On the minus side, the actual merge is completely miserable.
WTF. It seems completely unable to find identical blocks that are just moved.
Like if in one file I have
AAAA
BBBB
CCCC
And in the next file I have
CCCC
BBBB
AAAA
it totally freaks and just shows that as a big diff. It should show me that stuff is just swapped in location.
I usually wind up doing the merge just by doing windiff between the dirs and then manually applying the edits. Not fun.
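To make the "moved, not changed" idea concrete, here is a rough sketch of the classification I'd want a merge tool to do. It is not how Araxis or P4 merge work internally; ClassifyLines and LineStatus are names made up for this example, and it compares by plain line index, where a real tool would work on whole blocks relative to a proper diff alignment.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// How a line of the new file relates to the old file.
enum class LineStatus { Same, Moved, Added };

// Same  : identical text at the same line index in both files.
// Moved : identical text exists in the old file, but at another index.
// Added : the text does not appear (or is used up) in the old file.
// Duplicate lines are matched greedily, first come first served.
std::vector<LineStatus> ClassifyLines(const std::vector<std::string>& oldFile,
                                      const std::vector<std::string>& newFile)
{
    std::vector<LineStatus> status(newFile.size(), LineStatus::Added);

    // Count how many copies of each line the old file has to offer.
    std::map<std::string, int> available;
    for (const std::string& line : oldFile)
        ++available[line];

    // Pass 1: lines that did not move at all claim their copy first.
    for (std::size_t i = 0; i < newFile.size() && i < oldFile.size(); ++i) {
        if (newFile[i] == oldFile[i]) {
            status[i] = LineStatus::Same;
            --available[newFile[i]];
        }
    }

    // Pass 2: anything else that still has a copy left in the old file moved.
    for (std::size_t i = 0; i < newFile.size(); ++i) {
        if (status[i] == LineStatus::Same) continue;
        auto it = available.find(newFile[i]);
        if (it != available.end() && it->second > 0) {
            --it->second;
            status[i] = LineStatus::Moved;
        }
    }
    return status;
}
```

On the AAAA/BBBB/CCCC example this reports the first and last lines as Moved and the middle one as Same, instead of one big diff.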
The other thing I'd like to see is detection of synonyms. A common thing I have is "gAssert" in Galaxy is "ASSERT" in cblib and "RR_ASSERT" in Oodle. You should be able to detect that. Even if it was a manual config file of synonyms associated with directories that would be okay.
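A minimal sketch of the manual synonym table, assuming you just run each line through a normalizer before handing it to the diff. NormalizeSynonyms is a made-up helper, not a feature of any merge tool I know of; it only rewrites whole identifiers, so a name that merely contains a synonym is left alone.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// True for characters that can appear inside a C identifier.
static bool IsIdentChar(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Replace every whole identifier found in `synonyms` with its canonical name.
// Everything else (punctuation, whitespace, unknown names) is copied through.
std::string NormalizeSynonyms(const std::string& line,
                              const std::map<std::string, std::string>& synonyms)
{
    std::string out;
    std::size_t i = 0;
    while (i < line.size()) {
        if (IsIdentChar(line[i])) {
            // Grab the full identifier token starting here.
            std::size_t start = i;
            while (i < line.size() && IsIdentChar(line[i])) ++i;
            std::string token = line.substr(start, i - start);
            auto it = synonyms.find(token);
            out += (it != synonyms.end()) ? it->second : token;
        } else {
            out += line[i++];
        }
    }
    return out;
}
```

With a table mapping "gAssert" and "RR_ASSERT" to "ASSERT", the Galaxy, cblib and Oodle versions of an assert line all normalize to the same text and stop showing up as diffs.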
I wrote briefly at molly about the way I do "hot vars" or "tweak vars". I don't really like them. I had a bunch in my GDC app for tweaking, but it means I have to edit the code to tweak things and I don't like that for various reasons. It's way better to have a real text pref file. There are just so many wins. I can source control it separately. I can copy it off and save good snapshots of prefs I like for different purposes, like a development prefs vs. final run prefs. I can copy it and change it to make new instances. And I can edit it and have my final app load the changes without a recompile (hot var can do this too if you keep the C file around as "data").
Fortunately I have a prefs system in cblib, so I switched to that. But it made me realize that I'd really like to have an automatic pref system. Basically I want to write something like :
struct MyPref
{
int i = 7;
float x = 1.3;
String str = "hello world";
ColorDW color(200,50,177);
Vec3 v(3.4,0,1.7);
};
and have it automatically get IO to the pref file and construction with those values as defaults.
Now, that all is actually pretty easy if I just make some custom syntax and run a source code preprocess over the file before compiling. I could use a syntax like :
struct MyPref
{
int i; //$ = 7;
float x; //$ = 1.3;
String str; //$ = "hello world";
ColorDW color; //$ (200,50,177);
Vec3 v; //$ (3.4,0,1.7);
};
where anything with a //$ automatically becomes a "pref var" that gets IO'd and tweakability and so on.
The annoyance and something I haven't figured out is just how to deal with generated code in a build setting. Should the code generator stick the generated code back into the original file? Should it make another separate C file on the side and put the generated code in there? Maybe all the generated code from all the C files in the whole project should go together in one big spot?
I dunno but it seems like a mess. Maybe the easiest thing to do would be to put the autogenerated code in the same file, and run the generator as a pre-build step.
It would also be annoying to have to put the code generator in as a pre-build step on every file one by one. So annoying as to be unacceptable actually. I would want it to automatically run on all my files any time I build, so that if I just go and put the autogen markup in any file it does the stuff and I don't have to open msdev options dialogs.
I know people out there are doing things like this, so I'm curious how you deal with the mod time issues and builds and source control.
BTW the autogenerated code will look something like this :
void MyPref::AutoGen_SetDefault()
{
i = int(7);
x = float(1.3);
str = String("hello world");
color = ColorDW(200,50,177);
v = Vec3(3.4,0,1.7);
}
template <class T>
void MyPref::AutoGen_Reflection()
{
REFLECT(i);
REFLECT(x);
REFLECT(str);
REFLECT(color);
REFLECT(v);
}
pretty easy to generate just by some text movement. The big win of the shortened syntax is that you only have to write a variable once instead of 3 times.
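Here's a rough sketch of the "text movement" for a single marked-up member line. PrefVar, ParsePrefVar, EmitDefault and EmitReflect are all names invented for this example; a real preprocessor would also have to find the enclosing struct name and cope with comments, templates and multi-line declarations, none of which this attempts.

```cpp
#include <cstddef>
#include <string>

// One parsed "Type name; //$ init" line.
struct PrefVar { std::string type, name, init; };

// Strip leading and trailing spaces and tabs.
static std::string Trim(const std::string& s)
{
    std::size_t b = s.find_first_not_of(" \t");
    if (b == std::string::npos) return "";
    std::size_t e = s.find_last_not_of(" \t");
    return s.substr(b, e - b + 1);
}

// Parse lines like "int i; //$ = 7;" or "Vec3 v; //$ (3.4,0,1.7);".
// Returns false for lines without the //$ marker or that don't fit the shape.
bool ParsePrefVar(const std::string& line, PrefVar& out)
{
    const std::string marker = "//$";
    std::size_t m = line.find(marker);
    if (m == std::string::npos) return false;

    // Declaration part : "Type name;"
    std::string decl = Trim(line.substr(0, m));
    if (decl.empty() || decl.back() != ';') return false;
    decl.pop_back();
    decl = Trim(decl);
    std::size_t sp = decl.find_last_of(" \t");
    if (sp == std::string::npos) return false;

    // Initializer part : "= value;" or "(args);"
    std::string init = Trim(line.substr(m + marker.size()));
    if (!init.empty() && init.back() == ';') init.pop_back();
    init = Trim(init);
    if (init.empty()) return false;
    if (init[0] == '=')
        init = "(" + Trim(init.substr(1)) + ")";   // "= 7" becomes "(7)"
    else if (init[0] != '(')
        return false;

    out.type = Trim(decl.substr(0, sp));
    out.name = decl.substr(sp + 1);
    out.init = init;
    return true;
}

// The line that goes in AutoGen_SetDefault, eg. "i = int(7);"
std::string EmitDefault(const PrefVar& v) { return v.name + " = " + v.type + v.init + ";"; }

// The line that goes in AutoGen_Reflection, eg. "REFLECT(i);"
std::string EmitReflect(const PrefVar& v) { return "REFLECT(" + v.name + ");"; }
```

Run that over every line of the struct and you get the bodies of the two functions above; lines without a //$ are skipped, so ordinary members are untouched.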
So I tracked down the paged pool leak. It appears to be a bug in my ATI driver (or perhaps a bug in the DX interface that shows up as a leak in the driver). It's caused by locking POOL_MANAGED textures that are currently in the GPU push buffer. When you do that it causes the hardware texture to get aliased to avoid contention, and it appears something in there is leaking. I don't think that whole textures are leaking because the leak is pretty slow, it must just be some kind of texture book-keeping object (eg. maybe it actually is the "texture" struct, just not the actual surface bits).
Anyhoo, nobody cares about bugs that I see via DirectX 8, but what is interesting is the cool Poolmon utility.
Poolmon lets you see the kernel allocations and thus track down leaks.
You need to turn on some registry settings.
Run Poolmon in a command line with lots of vertical lines.
Use the keys "d" and "p" to get the view you want
Then track away.
... blows so bad. Can't we just all be Xenon-exclusive? Please?
In a bit more detail - I've been tracking down a graphics driver bug the last few days. And this is like the easy case - it's on Windows XP and it's happening on my dev machine. I remember the old days of hell at Eclipse/Wild Tangent where we'd get a report that something in the graphics screws up, but only on Windows version XX and only with graphics card YY and only with driver version ZZ and only when you run Media Player at the same time, and we can't repro it any other way.
At WT we had a huge room full of like 50 computers with all kinds of different hardware and OS setups. Any time we did a release it had to be tested on all those boxes. (this was back in the day when you had 3d-only cards like the Voodoo and NVidia had just come out with the Riva 128, and you had to support all kinds of weird ass cards, in comparison the differences between even the most different of cards these days are very minor). I'm actually amazed how many PC game development studios don't seem to have this kind of test room.
We also had 2 full-time employees basically running testing all the time. I'm amazed at how many game studios have zero internal testing.
The Windows Kernel Paged Pool shit I'm dealing with is reminding me how much I hate fixed size memory pools.
I have fucking 3 Gigs of RAM in this machine and I'm using less than 1 G ! How can you be running out of memory !!!! Oh, because you have a fucking stupid fixed size page thing. Now, okay, maybe *MAYBE* the OS kernel is one case in which fixed size pools is a good thing.
It's common wisdom in game development that dynamic allocations in games are bad and that "mature" people use fixed size pools because it's more stable and safe from fragmentation and robust and so on. Hogwash!
It should be obvious why you would want variable memory allocation. Memory is one of our primary limiting factors these days, and it should be allocated to whatever needs it right now. When you have fixed pools it means that you are preventing the most important thing from getting memory in some case.
For example, your artists want to make a super high poly detailed background portion of the game with no NPC's. Oh, no, sorry, you can't do that, we're reserving that memory for 32 NPC's all the time even though you have none here. In another part of the game, the artists want to have super simple everything and then 64 NPC's. Oh no, sorry, you only get 32 even though you could run more because we're reserving space for lots of other junk that isn't in this part of the game.
Now, I'm not saying that budgets for artists is a bad thing. Obviously artists need clear guidelines about what will run fast enough and fit in memory. But having global fixed limits is a weak cop out way to do that.
Furthermore, recycling pools and maximum counts for spawned items is a perfectly reasonable thing to do. But I don't think of that as a way of dividing up the available memory - it's a way of preventing buggy art from screwing up the system, or just lazy artists from making mistakes. For example, having a maximum particle count doesn't mean you should go ahead and preallocate all those particles, cuz you might want to use that memory for something else in other cases (and of course the hard-fixed-size pool thing can
In general I'm not talking here about *dynamic* variation. I'm talking about *static* variation. Like anything that can be spawned or triggered from scripts or whatever, stuff that can be created by the player - that stuff should be premade. Anything that *could* exist at a given spot *should* exist. That way you know that no player action can crash you. Note that this is really just a way to avoid testing all the combinatorics of different play possibilities.
By static variation I mean, in room 1 you might have resource allocation like {16 NPC's, 100 MB of textures} , in room 2 you might have {8 NPC's, 150 MB of textures}.
Fixed sized budgets is like if you partitioned your hard disk in half for programs and data. People used to do things like that, but we all now realize it's dumb, it's better just to have one big disk and that way you can change how you are using things as need arises.
Now, people sometimes worry about fragmentation. That may or may not be an issue for you. On Stranger on XBox it basically wasn't an issue because we had 64M of physical memory and 2G of virtual address space, so you have tons of slack. Again now with 64 bit pointers you have the same kind of safety and don't have to worry. Sadly, 32-bit Windows right now is actually in a really bad spot where the amount of physical memory roughly matches the address space, and we actually want to use most of that. That is fragmentation danger land.
However, doing variable size budgets doesn't necessarily increase your fragmentation at all. The only thing that would give you fragmentation is if you are dynamically allocating and freeing things of different sizes. Now of course you shouldn't do that !
One option is just to tear things all the way down and build them all the way back up for each level. That way you allocate {A,B,C,D} in order, then you free it all so you get back to empty {} , then next level you allocate {C,B,B,A,E} and there's no fragmentation worry. (if you tried to do a minimal transition between those sets by doing like -D +B+E then you could have problems).
Another option is relocatable memory. We did this for Stranger for the "Contiguous" physical memory. Even though virtual address fragmentation wasn't an issue, physical memory (for graphics) fragmentation was. But all our big resources were relocatable anyway because they were designed for paging. So when the contiguous memory got fragmented we just slid down the blocks to defrag it, just like you defrag a disk. Our resources were well designed for fast paging, so this was very fast - only a few thousand clocks to defrag the memory, and it only had to be done at paging area transitions.
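For illustration, here's a toy version of the "slide the blocks down" defrag. This is not the Stranger code; RelocatableHeap and its methods are invented for this sketch, and it assumes clients hold handles rather than raw pointers, which is what makes moving the blocks safe.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Bookkeeping for one allocation inside the heap.
struct Block { std::size_t offset; std::size_t size; bool live; };

class RelocatableHeap {
public:
    explicit RelocatableHeap(std::size_t bytes) : mem_(bytes), top_(0) {}

    // Bump-allocate at the top of the heap. Returns a handle, or -1 if the
    // space above the highest block is too small (even if holes exist below).
    int Alloc(std::size_t size) {
        if (size > mem_.size() - top_) return -1;
        blocks_.push_back({top_, size, true});
        top_ += size;
        return static_cast<int>(blocks_.size()) - 1;
    }

    // Freeing only marks the block dead; the hole is reclaimed by Defrag.
    void Free(int handle) { blocks_[handle].live = false; }

    // Handles stay valid across Defrag; pointers from before it do not.
    unsigned char* Ptr(int handle) { return mem_.data() + blocks_[handle].offset; }

    std::size_t FreeBytesAtTop() const { return mem_.size() - top_; }

    // Slide every live block down over the holes left by dead ones.
    // Live blocks are always stored in ascending offset order, because new
    // blocks only ever go on top and sliding preserves relative order.
    void Defrag() {
        std::size_t dst = 0;
        for (Block& b : blocks_) {
            if (!b.live) continue;
            if (b.offset != dst) {
                std::memmove(mem_.data() + dst, mem_.data() + b.offset, b.size);
                b.offset = dst;
            }
            dst += b.size;
        }
        top_ = dst;
    }

private:
    std::vector<unsigned char> mem_;
    std::vector<Block> blocks_;   // dead entries are kept so handles stay stable
    std::size_t top_;
};
```

The sketch never recycles handles and copies with a plain memmove; the point is only that once nothing outside the heap holds a raw address, defragging is a linear pass you can do at a paging area transition.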
Note that "relocatable resources" is kind of a cool handy thing to have in any case. It lets you load them "flat" into memory and then just rebase the whole thing and boom it's ready to use.
Personally after being a console dev and now seeing the shit I'm seeing with Oodle, I would be terrified of releasing a game on a PC. Even if you are very good about your memory use, your textures and VB's and so on create driver resources and you have no idea how big those are, and it will vary from system to system. The AGP aperture and the Video-RAM shadow eat out huge pieces of your virtual address space. The kernel has an unknown amount of available mem, and of course who knows what other apps are running (if it's a typical consumer machine it probably has antivirus and the whole MS bloatware installed and running all the time).
I don't see how you can use even 512 MB on a PC and get reliable execution. I guess the only robust solution is to be super scalable and not count on anything. Assume that mallocs or any system call can fail at any time, and downgrade your functionality to cope.
Now, certainly doing prereserved buckets and zero allocations and all that does have its merits, mainly in convenience. It's very easy as a developer to verify that what you're doing fits the "rules" - you just look at your allocation count and if it's not zero that's a bug.
It's just very frustrating when someone is telling you "out of memory" when you're sitting there staring at 1 GB of free memory. WTF, I have memory for you right here, please use it.
The other important thing is that efficiency is irrelevant if you can't target it where you need it. Having a really efficient banana picking operation doesn't do you a lick of good when everyone wants apples. A lot of game coders miss the importance of this point. It's better to run at 90% efficiency or so, but be flexible enough to target your power at exactly what's needed at any moment.
Like with a fixed system maybe you can handle 100 MB of textures and 50 MB of geometry very efficiently. That's awesome if that's what the scene really needs. But usually that is not exactly the ideal use of memory for a given spot. Maybe some spot really only needs 10 MB of geometry data. You still only provide 100 MB of textures. I'm variable and can now provide 139 MB of textures. (I lose 1 MB due to overhead from being variable).
Many game devs see that and think "ZOMG you have 1 MB of overhead that's unacceptable". In reality, you have 39 MB less of the actual resource your artists want in that spot.
ERROR_NO_SYSTEM_RESOURCES is the fucking devil.
So far as I can tell, the only people in the history of the universe who have actually pushed the Windows IO system really hard are me and SQL Server. When I go searching around the web for my problems they are always in relation to SQL Server.
I don't have a great understanding of this problem yet. Hopefully someone will chime in with a better link. This is what I have found so far :
You receive error 1450 ERROR_NO_SYSTEM_RESOURCES when you try to create a very large file in Windows XP
SystemPages Core Services
Sysinternals Forums - not enough resources problem - Page 1
Overlapped WriteFile fails with code 1450 [Archive] - CodeGuru Forums
Novell Eclipse FTK file io
How to use the userva switch with the 3GB switch to tune the User-mode space to a value between 2 GB and 3 GB
GDI Usage - Bear
Error Message ERROR_NO_SYSTEM_RESOURCES (1450)
Download details Detection, Analysis, and Corrective Actions for Low Page Table Entry Issues
Counter of the Week Symptoms Lack of Free System Page Table Entries (PTEs) and Error Message ERROR_NO_SYSTEM_RESOURCES (1450
Comparison of 32-bit and 64-bit memory architecture for 64-bit editions of Windows XP and Windows Server 2003
Basically the problem looks like this :
Windows Kernel has a bunch of internal fixed-size buffers. It has fixed-size (or small max-size) buffers for Handles, for the "Paged Pool" and "Non-Paged Pool", oh and for PTEs (page table entries). You can cause these resources to run out at any time and then you start getting weird errors. The exact limit is unknowable, because they are affected by what other processes are running, and also by registry settings and boot.ini settings.
I could make the error go away by playing with those settings to give myself more of a given resource, but of course you can't expect consumers to do that, so you have to work flawlessly in a variety of circumstances.
In terms of File IO, this can hit you in a whole variety of crazy ways :
1. There's a limit on the number of file handles. When you try to open a file you can get an out-of-resources error.
2. There's a limit on the number of Async ops pending, because the Kernel needs to allocate some internal resources and can fail.
3. There's a limit on how many pages of disk cache you can get. Because windows secretly runs everything you do through the cache (note that this is even true to some extent if you use FILE_FLAG_NO_BUFFERING - there are a lot of subtleties to when you actually get to do direct IO which I have written about before), any IO op can fail because windows couldn't allocate a page to the disk cache (even though you already have memory allocated in user space for the buffer).
4. Even ignoring the disk cache issue, windows has to mirror your memory buffer for the IO into kernel address space. I guess this is because the disk drivers talk to kernel memory so your user virtual address has to be moved to kernel for the disk to fill it. This can fail if the kernel can't find a block of kernel address space.
5. When you are sure that you are doing none of the above, you can still run into some other mysterious shit about the kernel failing to allocate internal pages for its own book-keeping of IOs. This is error 1450 (0x5AA) , ERROR_NO_SYSTEM_RESOURCES.
The errors you may see are :
ERROR_NOT_ENOUGH_MEMORY = too many async IOs pending
Solution : wait until some finish and try again
ERROR_NOT_ENOUGH_QUOTA = single IO call too large
Solution : break large IOs into many smaller ones (but then beware the above)
ERROR_NO_SYSTEM_RESOURCES = failure to alloc pages in the kernel address space for the IO
Solution : ???
So I have made sure I don't have too many handles open. I have made sure I don't have too many IO ops pending. I have made sure my IO ops are not too big. I have done all that, and I still randomly get ERROR_NO_SYSTEM_RESOURCES depending on what else is happening on my machine. I sort of have a solution, which seems to be the standard hack solution around the net - just sleep for a few millis and try the IO again. Eventually it magically clears up and works.
BTW while searching for this problem I found this code snippet : Novell Eclipse FTK file io. It's quite good. It's got a lot of the little IO magic that I've only recently learned, such as using "SetFileValidData" when extending files for async writes, and it also has a retry loop for ERROR_NO_SYSTEM_RESOURCES.
Further investigation reveals that this problem is not caused by me at all - the kernel is just out of paged pool. If I do a very small IO (64k or less) or if I do non-overlapped IO, or if I just wait and retry later, I can get the IO to go through. Oh, and if you use no buffering, that also succeeds.
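The "small IOs plus sleep and retry" hack looks roughly like this. It's a sketch, not the Oodle code and not the Novell snippet: the actual ReadFile/WriteFile call is hidden behind a hypothetical IoOp callback so the loop builds anywhere, and the error values are the Win32 codes copied in as plain constants. Which errors count as "transient" is my guess from the table above.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>

// Win32 error codes, copied here so the sketch does not need windows.h.
const unsigned long kErrorSuccess           = 0;     // ERROR_SUCCESS
const unsigned long kErrorNotEnoughMemory   = 8;     // ERROR_NOT_ENOUGH_MEMORY
const unsigned long kErrorInvalidParameter  = 87;    // ERROR_INVALID_PARAMETER
const unsigned long kErrorNoSystemResources = 1450;  // ERROR_NO_SYSTEM_RESOURCES

// Hypothetical callback : do one IO of `size` bytes at `offset`,
// return 0 on success or a Win32-style error code on failure.
using IoOp = std::function<unsigned long(std::size_t offset, std::size_t size)>;

// Issue [0,totalSize) as chunks of at most chunkSize bytes. When a chunk fails
// with a "kernel is out of something" error, sleep and retry that same chunk,
// up to maxRetries times. Any other error is returned to the caller as-is.
unsigned long ChunkedIoWithRetry(const IoOp& op, std::size_t totalSize,
                                 std::size_t chunkSize, int maxRetries,
                                 std::chrono::milliseconds backoff)
{
    if (chunkSize == 0) return kErrorInvalidParameter;

    std::size_t offset = 0;
    while (offset < totalSize) {
        const std::size_t size = std::min(chunkSize, totalSize - offset);
        int retries = 0;
        for (;;) {
            const unsigned long err = op(offset, size);
            if (err == kErrorSuccess) break;
            const bool transient = (err == kErrorNoSystemResources ||
                                    err == kErrorNotEnoughMemory);
            if (!transient || retries >= maxRetries) return err;
            ++retries;
            std::this_thread::sleep_for(backoff);   // wait for the kernel to recover
        }
        offset += size;
    }
    return kErrorSuccess;
}
```

With a 64k chunk size this stays under the size where small IOs went through for me, and the retry count keeps it from spinning forever if the machine never recovers.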
There's a new trend in fancy restaurants of putting your food in a pool of water rather than a proper sauce. This is fucking retarded and should stop immediately.
If I have a process under debug in MSVC , and I want to switch to debugging with WinDbg - how do I do that? WinDbg refuses to attach because it's already being debugged. If I "detach" with MSVC, the process immediately runs free (ceases to be stopped) which makes me lose the moment I was trying to debug. Urg. WTF ?
I need this because I want to write out a block of memory from my process to disk, which it seems MSVC won't do, but WinDbg will with .writemem.
Another question is : how do I get a *full* crash dump of a process? (with all its memory). If you use "write crash dump" from MSVC, it only writes a "minidump" which is just registers, call stack, that kind of stuff.
WinDbg can write a full crash dump very easily (with .dump) , but I can't figure out how to change debuggers.
I made a video that I'm trying to upload to Youtube, and I CANNOT for the life of me get it to upload decently. Fucking Youtube seems to always want to reprocess it, and by "reprocess" I mean "turn it into pixelated shit".
There seems to be absolutely ZERO official information from Youtube about formats to upload videos. There's this hilariously unhelpful page and that's it. There are lots of guides around from amateurs, but they have no idea WTF they are talking about and often give totally bogus advice.
URG!!
I posted a pretty snide comment on John Ratcliff's Code Suppository which I feel a bit guilty about, so I thought I would defend and clarify my position a bit.
First of all, to the specific question of bounding volumes : there is decent code for them in Eberly (though it's really not great and the sphere and OBB code is very far from optimal). There is very good code for all the common primitives in Galaxy3 - though you do have to buy into the whole system.
BTW this is sort of a side note but I really like John's Code Suppository in general, or Sean's "one file" way of distributing code. I'm glad they do it, because I can't and don't want to. I love it when other people make their code available to me like that, but I can't write code like that, and I find it impossible to maintain code like that in my own libraries.
As for the exact flaws and prior art in John's post, I sent him this :
1. Exact sphere fitting is well known. See for example this very good source code : miniball
2. Exact fitting of the best-rectangle via rotating calipers is well known. See for example any computational geometry textbook.
3. To fit an OBB the first step should always be computing the convex hull via something like QHull : qhull
Once you have the convex hull, brute force is in fact a good solution. (being O(N^2) in the number of hull faces is much better than being O(N^2) in the number of original faces).
And there's more at : RAPID OBB-Tree or Computational Geometry Code or O'Rourke's page
(those are just the code references, you can find more papers on citeseer)
But the real answer is : pick up ANY academic text on computational geometry and there will be very good algorithms for all this stuff. The correct algorithms are also fast! And the error in the approximate algorithms is not small, the "put sphere center at the average of all points" heuristic can be very very bad, as can the OBB fit heuristics. (BTW the good computational geometry code in Galaxy3 is basically from me reading through O'Rourke's Computational Geometry textbook and implementing algorithms I thought were cool and useful - at his website, tons of good code is available for free ; the good sphere code in galaxy is just miniball).
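A quick way to see how bad the average-center heuristic gets is to feed it unevenly sampled points. This sketch uses its own little Vec3 (not the Galaxy3 one), and Average and EnclosingRadius are helper names made up for the example; it just measures the radius you need for a given center, it is not a sphere fitter.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Local stand-in vector type for this example only.
struct Vec3 { double x, y, z; };

// Euclidean distance between two points.
double Distance(const Vec3& a, const Vec3& b)
{
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// The heuristic under test : center = average of all the points.
// Assumes pts is not empty.
Vec3 Average(const std::vector<Vec3>& pts)
{
    Vec3 sum = {0.0, 0.0, 0.0};
    for (const Vec3& p : pts) { sum.x += p.x; sum.y += p.y; sum.z += p.z; }
    const double n = static_cast<double>(pts.size());
    return Vec3{sum.x / n, sum.y / n, sum.z / n};
}

// Radius a sphere centered at `center` needs in order to contain every point.
double EnclosingRadius(const std::vector<Vec3>& pts, const Vec3& center)
{
    double r = 0.0;
    for (const Vec3& p : pts) r = std::max(r, Distance(p, center));
    return r;
}
```

Take nine copies of (0,0,0) and one point at (10,0,0). The average is (1,0,0), which needs radius 9; the midpoint (5,0,0) needs radius 5, which is the best possible here since the two extremes are 10 apart. That's a sphere with about 5.8 times the volume, and piling more points on one end only drags the average further over.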
For example, both Eberly and John use the PCA (covariance) axes to fit the OBB. That sort of works okay quite often, but occasionally it is unboundedly bad. (BTW all comments on Eberly's stuff are based on several year old code that was free; I haven't touched his stuff since he published that book and started trying to make money off it, so he may have fixed some of his stuff).
BTW also all primitive-fitting code should be able to find either the minimum-volume primitive or the minimum-surface-area primitive. In most cases minimum-surface-area is actually what you want, because that is what gives you the minimum chance of random ray intersection. It also tends to make nicer looking volumes in practice IMO (it forces them to be more "compact" - surface tension turns spread out things into spheres).
BTW #2 rotating calipers is one of those super simple and perfect algorithms that I believe everyone should know. It's one of those "ah ha" algorithms that makes you realize doing anything else is retarded, because you cannot beat it for speed and it gets the answer exactly right. ( there's a whole page on it )
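To be clear about what this is : the sketch below is NOT rotating calipers. It's the dumb version that leans on the same fact (the minimum-area rectangle has a side collinear with some convex hull edge) and just tries every hull edge, which is O(h^2) in the hull size where calipers does the same search in O(h). It gets the same exact answer though, so it makes a handy reference to test a calipers implementation against. Point2, ConvexHull and MinAreaRect are names for this example, not Galaxy3 code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Point2 { double x, y; };

// Z component of (a - o) x (b - o); positive when o,a,b turn counter-clockwise.
static double Cross(const Point2& o, const Point2& a, const Point2& b)
{
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Andrew's monotone chain. Returns the hull counter-clockwise, with collinear
// and duplicate points dropped. Fewer than 3 points come back sorted as-is.
std::vector<Point2> ConvexHull(std::vector<Point2> pts)
{
    std::sort(pts.begin(), pts.end(), [](const Point2& a, const Point2& b) {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    });
    if (pts.size() < 3) return pts;
    std::vector<Point2> hull(2 * pts.size());
    std::size_t k = 0;
    for (std::size_t i = 0; i < pts.size(); ++i) {                  // lower hull
        while (k >= 2 && Cross(hull[k - 2], hull[k - 1], pts[i]) <= 0) --k;
        hull[k++] = pts[i];
    }
    for (std::size_t i = pts.size() - 1, t = k + 1; i > 0; --i) {   // upper hull
        while (k >= t && Cross(hull[k - 2], hull[k - 1], pts[i - 1]) <= 0) --k;
        hull[k++] = pts[i - 1];
    }
    hull.resize(k - 1);   // the last point repeats the first
    return hull;
}

// Area of the smallest enclosing rectangle of any orientation.
double MinAreaRect(const std::vector<Point2>& points)
{
    const std::vector<Point2> hull = ConvexHull(points);
    if (hull.size() < 3) return 0.0;   // a point or a segment has no area

    double best = std::numeric_limits<double>::max();
    for (std::size_t i = 0; i < hull.size(); ++i) {
        const Point2& a = hull[i];
        const Point2& b = hull[(i + 1) % hull.size()];
        double ex = b.x - a.x, ey = b.y - a.y;
        const double len = std::sqrt(ex * ex + ey * ey);
        if (len == 0.0) continue;
        ex /= len; ey /= len;          // unit vector along the edge

        // Extents of the hull along the edge (u) and along its normal (v).
        double minU = 0.0, maxU = 0.0, minV = 0.0, maxV = 0.0;
        for (const Point2& p : hull) {
            const double u = (p.x - a.x) * ex + (p.y - a.y) * ey;
            const double v = (p.y - a.y) * ex - (p.x - a.x) * ey;
            minU = std::min(minU, u); maxU = std::max(maxU, u);
            minV = std::min(minV, v); maxV = std::max(maxV, v);
        }
        best = std::min(best, (maxU - minU) * (maxV - minV));
    }
    return best;
}
```

For a unit diamond with corners at (1,0), (0,1), (-1,0), (0,-1) the axis-aligned box has area 4, while this finds the rotated square of area 2.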
I want to say again that I love what John is doing with Code Suppository, but I keep seeing bad bounding volume code recycled again and again in the game development community. Lots of people use fundamentally wrong image processing or audio processing code without reading any of the DSP literature. People do hacky heuristic UV unwraps without reading any of the papers. People write their own hashes and data structures or sorting algorithms without understanding they're O(n) or actually comparing to reference implementations.
This is bull! And there's no excuse for it any more. Game developers have long secretly treasured the hack and the heuristic. Everyone loves coming up with their own little simple trick. No more. We have long used "speed/optimization" or "time pressure" as excuses to do things wrong. That's BS now that we are many thousands of people working many years on multi-million dollar projects. And especially for code we are sharing on the internet.
Part of the reason why this stuff is even hard to find on Google these days is because so many people keep posting bad versions! If you search for "bounding box fit" on Google half of what you find will be amateur game developer newsgroups and such with people posting ass-tastic versions. It was almost better 10 years ago when the only thing you got was university research web sites.
Part of the problem is that too many game developers only talk to other game developers, and lose sight of how strong all the research out there in the world is. Yes, yes, we are hot shit. But for any algorithm you can think of, there are 1000 Chinese PhD's working on it right now. You can't beat them. The best you can do is go to CiteSeer and read their papers and steal their work. If you think you can beat them, then you better be damn sure and prove it by actually testing against their work.
I'm calling everyone out. There's plenty of un-tested, un-benchmarked, un-researched code out there on the net. We don't need more. We need less! I don't want to see another code snippet that somebody tried and just visually looked at and thought it looked "pretty good". I want exact error bounds. I want complexity analysis. I want profiles against the other major examples in the field. I want to see references to prior art. I want to see rigor.
The next time somebody gives you their hash_map implementation, you ask them - how is it better than stlport hash_map ? how does it compare to google dense or sparse hash map? What are the memory use and execution time limits? Show me your reference list (does it include RDE, khash, and Paul Hsieh?). If they stare at you blankly - kick them in the nuts.
I should also note, and I think everyone knows - that I welcome and encourage similar criticism of my own work. I mean often in this space I am just ranting about nonsense that I obviously have no idea WTF I'm talking about. But when I post code - the whole point is for you to tell me ways in which it is wrong so that I can fix it. That's a huge part of why I post code on the internet, I want feedback and fixes and even just "this sucks, there's much better code for that here".
And finally, I'll end with a bit of a challenge :
I don't actually have perfect OBB code, and to my knowledge it doesn't exist. The code in Galaxy3 is much better than the common covariance method. The main improvement there is that I use rotating calipers to find the truly optimal 2d rectangle if one of the axes is fixed. However, I don't do the correct full search of axes. I only check all the normals of the faces of the convex hull. That is not the full set - you should also consider boxes which are only supported by edges and vertices (this is sort of like doing separating axis tests and only using face normals and not doing edge-edge axes). O'Rourke has described an exact O(N^3) solution, but it's messy and to my knowledge no simple implementation exists.
So I have a few challenges that I'll leave for the reader (or myself in the future) :
1. What is the maximum error (vs the exact solution) of the Galaxy3 method of only considering OBB's which are supported by at least one face of the convex hull?
2. Is there a good implementation of the O(N^3) exact OBB algorithm?
3. Is there an O(N^2) or O(NlogN) OBB algorithm which comes within a (useful) finite bounded error of the exact solution?
High Stakes Poker Season 5 is so good so far, it has me dying with anticipation for the next episode. It's quite a strange feeling after getting used to downloading whole seasons all the time.
Barry Greenstein talks about the big hand on HSP S5-E2 . Barry makes a big mistake in his thought process. I'll try to boil it down without too much of a spoiler.
Basically Barry reasons "I either have the best hand, or if I'm behind I could spike a card to get ahead". Lots of people use this reasoning, and it's wrong.
The basic flawed reasoning goes like this : I think I'm behind, but I'm not sure. Maybe I'm ahead 10% of the time, so I have some value from that. I have a thin draw, maybe I have a 10% chance of improving to beat what I think his hand probably is. So I add those up and I can call up to 20% of the pot, right?
No. Wrong. The big problem is there are more streets of betting. If you are drawing to a thin draw, you really want your opponent to have a big hand that will pay you off if you hit. On the other hand if you are hoping your hand is good, you need him to have a weak hand. Those options conflict and don't add up.
I've been playing with the new Vista Dell lappy for almost a month now. I'm still delighted by the hardware build. It's light and solid, the keyboard is delightfully clicky, the LED backlit matte screen is fantastically bright (though it has developed a scratch - it should have a guard between the screen and the keys for when you close it). The touchpad is only passable, that could definitely be improved.
About Vista : it's been a mild pain. I'll divide the problems into four categories :
1. MS bloatware. People talk about Vista being a "pig" for performance, taking lots of RAM and taking a long time to boot. This is really not anything fundamental to Vista, and if you bloat up your XP it will be just about the same. The big difference is that Vista comes with tons of awful bloatware enabled by default, from SuperFetch to Defender to the Search Indexer. On the plus side it's not that hard to turn all this off.
2. UAC. UAC is mostly not a big deal (and of course UAC was in XP too - it's just nobody knew it because it was off by default; if you like this is really just another of #1 - turning on more bloat by default). In fact I sort of like UAC in theory. Unfortunately there are some stupid problems with it. The biggest one is simply that there's no simple way to say "I trust these programs", or "let non-admin users run this program always". Annoyingly MS provides a check box for "ask about this file" but turning off that check box does absolutely nothing. There are solutions, the best of which IMO is using "streams" ; see more : here or here
So, you've turned off all your Vista bloatware and UAC. At this point Vista is almost exactly the same as a nice trimmed down XP, so you shouldn't have anything to complain about. But there are remaining problems, and these are the bad ones :
3. Drivers (and other non-MS software). This has been the biggest problem by far. Here we are well into Vista launch and it's still hard to find (working) Vista drivers for many things. It must have been an absolute nightmare near launch. For the life of me I don't understand why they thought they had to invalidate the old driver model. It seems like they could have preserved it and just required new drivers to be the new way. Anyhoo, our digital cameras don't have drivers for Vista, the damn Intel graphics driver doesn't work right in Vista, the sound driver doesn't work right, etc. Every time you go download some tool you have to pray they have a Vista version or the XP version will work.
Compatibility with old versions of Windows is the most valuable thing that Microsoft owns. It's absolutely awful of them to break that. There is no possible feature that could ever be more important than making all old Windows software (and drivers) work.
4. Aero ("desktop composition"). The fancy graphics modes are *really* slow. I literally see "Desktop Window Manager" taking 10% of CPU quite often. It also gobbles video memory. Now, you can turn almost all of this off - but it's a pretty all-or-nothing thing and you lose all the new features. That would be okay if the Intel graphics driver worked right - it doesn't, it gets random glitches unless you have Aero enabled. So at the moment we're stuck with this perf annoyance.
If you are lucky you can find all the drivers and software you need for Vista, and it's a perfectly fine OS. The bloatware and UAC are not problems that can't be dealt with. On the plus side, lots of Vista features are much improved over XP. There are great new APIs, and the Resource Monitor is very cool.
I don't really see how "Windows 7" will fix any of the real problems here. Presumably drivers will be better just because people have had more time to transition over to the new model. If you just took a Vista install and turned off all the crapware and UAC and put working drivers on it, everyone would be quite happy (like that retarded MS commercial).
(BTW this is for Vista-32 ; I'm sure Vista-64 just greatly magnifies the problems of random drivers and 3rd party software not being available).
Arseny Kapoulkine has a nice new blog post about replacing CRT malloc .
I looked into it a while ago and gathered these links on the subject :
What your mother never told you about graphics development Fighting against CRT heap and winning
TRET The Reverse Engineering Toolkit
The Bag of Holding
somecode Memtracer
Practical Efficient Memory Management EntBlog
New Fun Blog - Scott Bilas Figuring out how to override malloc and new
MemTracer strikes back .mischief.mayhem.soap.
Lightning Engine » Blog Archive » Tracking Memory Allocations
Gamasutra - Monitoring Your PC's Memory Usage For Game Development
Fast File Loading EntBlog
Detours - Microsoft Research
CodeProject Visual Leak Detector - Enhanced Memory Leak Detection for Visual C++. Free source code and programming help
A Cross-Platform Memory Leak Detector
then I decided it was a pain in the ass and I wasn't going to do it. This is one place where C++ got it so much better - you can just override new/delete and it all works. I never use malloc/free anyway, so WTF the few people who do use them can go to the CRT heap, what do I care?
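For reference, the "just override new/delete" route looks roughly like this. It's a minimal sketch, not what cblib does: the counter g_liveAllocs and the checkOverrideIsActive helper are made-up names for illustration, and a real replacement would also call the new_handler and cover the aligned overloads. The array forms and the sized delete fall through to these two by default.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter, only here to show the replacement is being hit.
static std::atomic<std::size_t> g_liveAllocs{0};

// Replacing the global operator new. The default operator new[] calls this
// one, so array allocations are routed here as well.
void* operator new(std::size_t size)
{
    if (size == 0) size = 1;            // new must return a unique pointer
    void* p = std::malloc(size);        // swap in your own allocator here
    if (!p) throw std::bad_alloc();     // simplified: no new_handler loop
    g_liveAllocs.fetch_add(1, std::memory_order_relaxed);
    return p;
}

// Replacing the global operator delete. The default operator delete[] and the
// default sized delete both forward to this one.
void operator delete(void* p) noexcept
{
    if (!p) return;
    g_liveAllocs.fetch_sub(1, std::memory_order_relaxed);
    std::free(p);
}

// Small sanity check: a direct call must bump the counter up and back down.
void checkOverrideIsActive()
{
    const std::size_t before = g_liveAllocs.load();
    void* p = ::operator new(32);
    assert(g_liveAllocs.load() == before + 1);
    ::operator delete(p);
    assert(g_liveAllocs.load() == before);
}
```

Defining these once in any translation unit is enough; nothing has to be hooked or patched at the CRT level, which is the whole appeal.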
The new cblib has a simple lock-free paged small block allocator. It's very primitive and not as fast or efficient as it could be (it's very fast for single threaded but has very poor cache line sharing behavior), but on the plus side it is nice and simple and self-contained unlike tcmalloc or hoard or something which are like a huge mess of code.
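To give a flavor of what a lock-free small block allocator involves, here is a rough sketch of one common approach: a single page of fixed-size blocks with a free list that is popped and pushed with compare-and-swap. This is NOT the cblib code; the class name SmallBlockPool, the block size and count, and the index-plus-tag packing (used to dodge the ABA problem) are all my assumptions for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// One "page" of fixed-size blocks with a lock-free LIFO free list.
// The list head packs a block index (low 32 bits) with a tag (high 32 bits)
// that changes on every push/pop, so a stale CAS fails instead of hitting ABA.
class SmallBlockPool
{
public:
    static constexpr std::uint32_t kBlockSize  = 64;
    static constexpr std::uint32_t kBlockCount = 256;
    static constexpr std::uint32_t kNil        = 0xFFFFFFFFu;

    SmallBlockPool()
    {
        // Chain every block into the free list: 0 -> 1 -> ... -> kNil.
        for (std::uint32_t i = 0; i < kBlockCount; ++i)
            next_[i].store(i + 1 < kBlockCount ? i + 1 : kNil, std::memory_order_relaxed);
        head_.store(pack(0, 0), std::memory_order_release);
    }

    // Pop a block; returns nullptr when the page is exhausted.
    void* alloc()
    {
        std::uint64_t old = head_.load(std::memory_order_acquire);
        for (;;)
        {
            const std::uint32_t idx = indexOf(old);
            if (idx == kNil) return nullptr;
            const std::uint32_t nxt = next_[idx].load(std::memory_order_relaxed);
            const std::uint64_t desired = pack(nxt, tagOf(old) + 1);
            if (head_.compare_exchange_weak(old, desired,
                    std::memory_order_acq_rel, std::memory_order_acquire))
                return storage_ + std::size_t(idx) * kBlockSize;
        }
    }

    // Push a block back. p must have come from alloc() on this pool.
    void release(void* p)
    {
        unsigned char* bytes = static_cast<unsigned char*>(p);
        assert(bytes >= storage_ && bytes < storage_ + sizeof(storage_));
        const std::uint32_t idx = static_cast<std::uint32_t>((bytes - storage_) / kBlockSize);
        std::uint64_t old = head_.load(std::memory_order_relaxed);
        for (;;)
        {
            next_[idx].store(indexOf(old), std::memory_order_relaxed);
            const std::uint64_t desired = pack(idx, tagOf(old) + 1);
            if (head_.compare_exchange_weak(old, desired,
                    std::memory_order_release, std::memory_order_relaxed))
                return;
        }
    }

private:
    static std::uint64_t pack(std::uint32_t index, std::uint32_t tag)
    {
        return (std::uint64_t(tag) << 32) | index;
    }
    static std::uint32_t indexOf(std::uint64_t h) { return std::uint32_t(h & 0xFFFFFFFFu); }
    static std::uint32_t tagOf(std::uint64_t h)   { return std::uint32_t(h >> 32); }

    alignas(16) unsigned char  storage_[kBlockSize * kBlockCount];
    std::atomic<std::uint32_t> next_[kBlockCount] = {};  // links kept outside the blocks
    std::atomic<std::uint64_t> head_{0};
};
```

The links live in a side array rather than inside the free blocks so that a thread reading a stale link never races with user writes to a block that was just handed out. A real allocator would add more pages and size classes on top; this sketch does nothing about cache line sharing between threads.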
BTW related to this all and the threading stuff is : LeapHeap
Hmm these were in that link bag and don't belong, but they are quite fun -
HAKMEM -- CONTENTS -- DRAFT, NOT YET PROOFED
Hacker's Delight
Draw calls and state changes are *so* inefficient, that even a few hundred calls to draw debug overlay stuff cripples my performance. I'm working on my demo for GDC and I'm like drawing an earth model and paging 1.5 GB of textures, and it's all silky smooth perfect 100 fps. Then I add just a tiny bit of lazily written GUI drawing that makes a separate draw call for each character of a text string, and I'm at 40 fps. Urg. WTF.
Oh, and of course getting my damn text drawing to work takes like half a day because I write all the code, and of course nothing shows up. Of course there's some damn renderstate set that's messing it up.
If I was actually going to be writing a lot of D3D stuff I would write an entire wrapper of the whole damn thing that hid all the ugliness, and did things like let you just push individual triangles and it would gather them into batches for you. But of course as soon as I do that they'll rev the API again and it will change enough that all my work is for nought.
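The batching part of that wrapper is simple enough to sketch. This is a hypothetical TriangleBatcher, not real D3D code: the Vertex layout, the integer stateId standing in for "whatever render state this triangle needs", and the submit callback are all assumptions. In an actual D3D9 wrapper the callback would presumably fill a dynamic vertex buffer and issue one DrawPrimitive per batch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical vertex layout for debug drawing.
struct Vertex { float x, y, z; unsigned color; };

// Collects individual triangles and hands them to the renderer in batches.
// A batch is flushed when the state changes or the buffer fills up.
class TriangleBatcher
{
public:
    // Called once per batch with the state id and the gathered vertices.
    using SubmitFn = std::function<void(int stateId, const Vertex* verts, std::size_t count)>;

    TriangleBatcher(std::size_t maxVerts, SubmitFn submit)
        : maxVerts_(maxVerts - maxVerts % 3), submit_(std::move(submit))
    {
        assert(maxVerts_ >= 3);   // need room for at least one triangle
        verts_.reserve(maxVerts_);
    }

    void pushTriangle(int stateId, const Vertex& a, const Vertex& b, const Vertex& c)
    {
        // A state change or a full buffer ends the current batch.
        if (stateId != currentState_ || verts_.size() + 3 > maxVerts_)
            flush();
        currentState_ = stateId;
        verts_.push_back(a);
        verts_.push_back(b);
        verts_.push_back(c);
    }

    // Submit whatever is pending; call this at the end of the frame.
    void flush()
    {
        if (verts_.empty()) return;
        submit_(currentState_, verts_.data(), verts_.size());
        verts_.clear();
    }

private:
    std::size_t         maxVerts_;
    SubmitFn            submit_;
    std::vector<Vertex> verts_;
    int                 currentState_ = -1;
};
```

With something like this, the per-character text drawing turns into two triangles pushed per glyph and a single submit per font texture, instead of one draw call per character.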
I keep hearing on the news that with the economy crashing there are tons of bargains as all the richy dickward stockbroker types are having to sell off their Mercedes and whatnot. Well, I haven't seen them.
There are tons of houses for sale in my neighborhood in Seattle, but the prices are still ridiculously high ($1 M or more for anything I would consider decent - decent in my definition means a reasonable amount of land so that your house is more than 10 feet from the one next to you.) Actually a lot of the "nice old houses" on Cap Hill are actually pretty shitty. I mean they're from 1920 or something so they are made with good craftsmanship and nice materials, but they are really like the "McMansions" of the past. They're huge - 4000 square feet or more - on tiny lots, right next to each other, and many of them are just big squares, with no architectural elegance or proportion to their setting. I enjoy walking around the neighborhood, I find it quite lovely, but it's easy to get caught up in it and forget how undesirable those places really are.
I took a walk around Kirkland a few days ago and saw a 1 acre empty lot for sale about half a mile away from RAD. They want $1.4 M which is an awful lot for something you have to build on, but that's a lot more reasonable than the fucking new condos they're building in Kirkland and trying to sell for $1 M (!!). Good luck with that.
The other day I was thinking about the fucking developers who buy up lots all over Seattle and tear down nice old houses and put up condos or townhomes or something, and then they turn around and sell the condos for $500k or townhomes for $1M , when the fucking original house and lot went for $1M. The developers are making a huge profit, but I can't really blame them. It's the fucking dick-wad retarded consumers who are willing to pay so much for fucking CONDOS. What is wrong with you people.
People are willing to pay huge amounts of money for retarded shitty junk that doesn't cost much to make. The smart efficient capitalists are all too willing to accommodate them. Everything we buy now is fucking shitty quality bad material made in china junk. Even the expensive "haute" stuff is like that - it's just slightly better done.
(the only thing I have seen a bargain on is Mercedes CLK's ; there are 2006 CLK's for $20k. That car retailed for around $45k. Everything else seems to be holding value just fine. Used WRX's for example cost almost as much as a new one; I have a general loathing for Mercedes, but the CLK 550 is not bad and the used ones are really cheap right now).
I guess I need to wait out the crash a little bit more until people actually become desperate.
Holy crap it's fucking gorgeous out here. It's sunny and very clear, the mountains in the distance are gleaming white.
I've been looking at cars. God damn I hate all the new flanges and hard edges and pointless stripes and cuts that are in ALL the modern cars, even the fucking new BMWs and Audis which you used to be able to at least rely on to keep the styling simple and boring. It's like all the car designers looked at the fucking HYUNDAI TIBURON and thought "wow that's brilliant, we need to make our luxury sedans look more like a fucking riced-up japanese street racer".
Almost every damn car review by an American author talks about the fucking CUP HOLDERS. WTF, when being able to fit your fucking big gulp in your car is one of your top priorities, something has gone seriously wrong with your life.
Anyway I think it's pretty much down to the BMW 335xi or the Infiniti G37x. (I suppose a regular M3 should be considered too). Time to go drive them.
The fucking traffic up here pisses me off so much. Commuting in the rain in stop and go traffic at all hours is just unbearably depressing. Yesterday I slogged through it and just didn't even have enough will to live to bother ranting about it. Today it was fucking retarded but at least I still have my vigor. There have been a few days that I've hit traffic on the way to work and wanted to just turn it around and go home. Unfortunately I wasn't in a good spot to be able to do that. I have just bypassed my exit a few times and gone another way when I see a merge is all jammed up. I know it adds a lot of time to my trip, but I'd rather spend more time driving in peace than less time in gridlock.
I swear 90% of the traffic up here is just because of fucking busybodies and brake-slammers and left-lane cloggers. For those of you who are all holier-than-thou and have no fucking clue about traffic jams - traffic jams largely happen as a random "crystal seeding" type phenomenon. A large amount of traffic can flow through a small hole just fine as long as everybody just FUCKING GOES. When some jackass gets freaked out and slams on his brakes for no damn reason, then some other jackass brakes, then that makes two more brake, BOOM crystal formation total jam-up. Furthermore, the slow-restarters make it worse. No, you don't need to drag race away from a jam-up, but you also don't need to keep going at damn traffic jam speeds when you get past the obstruction. Clear the pipe out and it will flow.
By far the most regularly tilting thing here is the fucking inability of half the population to merge. It's ridiculous. I have never in my many travels witnessed a collective population that is so uniformly unable to execute a basic task of daily life. It's like it's become a societal norm around here. There are parents who can't merge that teach their kids to not merge, and they grow up thinking that it's perfectly fine to come to a COMPLETE STOP when getting on a freeway. The onramp from Montlake to the 520 is literally a total traffic jam every single day of the year just because people cannot fucking merge. You get up to fucking speed and you slip in smoothly.
BTW the merge retardedness is not entirely the fault of the people coming into the flow - the people who are in the flow seem equally retarded. Hey you fucking see somebody is coming in, if you just ignore them they're not going to go away. You're not a fucking ostrich. You can either speed up or slow down to let them slip in front or behind you, or you could even like fucking change lanes into the left lane to give them a whole free spot. ZOMG change lanes wow that's way too advanced.
Due to the constant merge crisis I have taken up a few reactive strategies. One is to just hang way back if I'm on an onramp and someone is in front of me. When I first moved up here I had a few crises where I was rolling along the onramp from Lake Washington to the 520, la di da, not a care in my head, when all of a sudden the fucking car in front of me that I expect to merge instead slams on his brakes and comes to a complete stop, so I have to stop behind him. Now I'm like two inches from traffic going 60 and I have to merge. Awesome. Now I always hang back like 100 yards until I see that the fuck-tard in front of me has actually merged, and then I go ahead and gun it and get on.
The other trick is when merging from one freeway to the other I've taken to jumping the merge really early across the white lines. The thing is, the fucking traffic is really light, there are no cars, but one fuck-tard will be camped out blocking the merge and all the people merging are all in a tizzy slamming on the brakes. In the mean time the left lanes are wide open. Boom I jump it.
Yesterday in the fucking miserable rain and traffic I cheated and used the carpool lane for the first time. I haven't done that yet, and I feel like it's morally wrong somehow, but fuck I couldn't take it any more. I see people cheat *all the time*. Almost every day I see a few. In fact I would say well over 50% of the people in the carpool lane are solos. They're usually the dickwad fuck-tards who are riding my bumper and swerving around too much and they decide to jump out into the carpool lane to pass everyone. It really pisses me off when I'm stuck in traffic, but maybe I should just fucking cheat all the time.
Really the carpool lane should just be a "premium lane" that you pay $500 a year to get access to. They give you a toll transponder thing and put a camera on it so they can just automatically charge violators. The funds go towards road improvements. Everybody wins - the dumb fucking broke suckers who can't afford the premium will benefit from the road improvements, and we masters of the universe get to go faster. Win win imo.
Oodle currently has two modes - "Incremental Build" and "Final". In Final mode it's assumed that you have made all the packs the way you want, and all the content is valid. This is presumably how you will actually ship your game. In Final mode, I don't check bundles against the original files or any of that stuff, it's much simpler and faster.
The problem of course is that Final is a big pain during development. Incremental lets you change any individual file at any time. It lets you delete any of the packed bundles. It automatically tries to use the best packed bundle - but