A week ago, I discovered a mistake in my program. In looking through the results I noticed that a particular value, which I knew should be a sum of a bunch of fractions, was always coming out as a whole number. I investigated and found the obvious error I had made. It had taken me a lot of time to write the code that would generate those very significant fractional values, and I was accidentally throwing it all away, rounding each of them up to one and adding those up instead. These fractions are an important part of the simulation and drive a lot of the results; incorrect values here meant my overall results were invalid.
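To make the mistake concrete, here is a minimal sketch in Python (not the Matlab code from the actual simulation). The list of fractions and both function names are made up for illustration; the only point is how rounding each term up before summing always produces a whole number.

```python
import math

# Hypothetical fractional contributions for one simulated case
# (illustrative values only, not real simulation output).
fractions = [0.25, 0.5, 0.125, 0.75]

def buggy_total(values):
    """Round every fraction up to a whole number, then sum (the mistake)."""
    return sum(math.ceil(v) for v in values)

def correct_total(values):
    """Sum the fractions as they are."""
    return sum(values)

print(buggy_total(fractions))    # whole number: one per term
print(correct_total(fractions))  # keeps the fractional information
```

With these made-up values the buggy version just counts the terms, which is exactly the kind of suspiciously round result that gave the bug away.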
My bad results were saved (because hard drive space is cheap) and the simulations were restarted. Oh, and another sixteen cases were added, bringing the total up to forty-eight. After letting my main computer at school work at it for a week or so, I surmised it was going to take a long, long time to get through all the simulations. We had just recently cleaned the lab, so I took an unused machine, spent a day getting it set up, and put it to work. It is not nearly as powerful as my main computer but every little bit helps.
Monday afternoon, as I was watching results slowly trickle in, I began racking my brain for where I could get more computing power. There was another souped-up computer there in the lab just like mine that probably wasn't getting used much but it was assigned to another student. Maybe she'd let me get on and run my simulations in the background? There were plenty of other older computers around but, besides not having the desk space and network connections, they looked much less capable than the one I commandeered. Where could I get the number-crunching machines I needed?
My home computer. It is new, relatively high-powered, and I had most of the software already installed. It took me a few hours to get it up and running and right before bed the other night, I set it to work.
In the morning I looked at the results and there are no two ways about it: this machine is a monster. I've compiled results from all the runs that have completed so far and I'll let the graphs do the talking:
Each bar is one case that was simulated.
There are two significant differences between my school computer and home computer: my home computer has processors running at almost twice the speed and it also has a solid-state hard drive. My simulations write out to disk a lot, constantly logging results and reading logged values back in to make calculations. I wrote the simulations this way so that I could stop and start them at will, giving them the ability to always pick up right where they left off. (The more fundamental motivation was Matlab, the simulation environment I'm using, silently crashing on my school computer. I have no idea why it does this and wish it would stop. Having the crash recovery has been essential in getting simulations done and it has given me a lot of flexibility in scheduling the simulations.)
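For anyone curious what "pick up right where they left off" looks like, here is a rough sketch of the idea in Python rather than Matlab. Everything in it is an assumption for illustration: the function name `run_simulation`, the JSON checkpoint file, the `stop_after` parameter (which stands in for a crash or a manual stop), and the placeholder arithmetic are not taken from my real simulation code.

```python
import json
import os

def run_simulation(checkpoint_path, total_steps, stop_after=None):
    """Run a toy simulation that logs its state to disk after every step.

    If a checkpoint file already exists, the run resumes from it.
    stop_after (hypothetical) ends this run early to mimic a crash.
    """
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)          # resume from the logged state
    else:
        state = {"step": 0, "total": 0.0}  # fresh start

    steps_this_run = 0
    while state["step"] < total_steps:
        if stop_after is not None and steps_this_run >= stop_after:
            break                          # simulated interruption
        # Placeholder for the real per-step calculation.
        state["total"] += 1.0 / (state["step"] + 1)
        state["step"] += 1
        steps_this_run += 1

        # Write to a temporary file first, then swap it in, so an
        # interruption mid-write cannot leave a half-written checkpoint.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)

    return state
```

Calling `run_simulation("case01.json", 1000)` again after an interruption continues from the last logged step. The cost of this design is a disk write on every step, which is why drive speed matters so much for this kind of workload.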
A solid-state drive is MUCH faster for disk writing and reading, giving my home computer a big advantage. My school computer only has a conventional spinning-disk hard drive and, to make matters worse, it runs eight of the simulations in parallel, each constantly trying to access the disk. I've done testing during development that showed the per-simulation time increases as the number of simulations being run in parallel increases. It's still faster overall to run as many as I can in parallel but they definitely get in each other's way. My computer at home is only running four simulations in parallel but I bet the solid-state drive presents virtually no roadblock to any of them.
The graphs above aren't a purely apples-to-apples comparison but it is clear that for even moderately disk-heavy tasks like my simulations, the solid-state drive goes a long way. Even though my home computer got in on the action a week late, I'm betting it will end up completing half of the simulations itself. Why didn't I think of this last month?