Experimenting

There are times when you have to take at face value what you are told.

There are 1.31 billion people living in China. This is according to several sources (which all probably go back to the same official document from the Chinese government). I’m willing to believe that number. I’m certainly not going to go to China and start counting heads. For one, I don’t have the time; for another, I might look awfully weird doing so. It’s also accurate enough for any discussion I might have about China. But if I were going to knit caps for every person in China, I might want a more accurate number.

That said, sometimes one shouldn’t take facts at face value. A case in point is given below. Let me start by saying that the person who gave me this fact wasn’t wrong. At least they’re no more wrong than the person who tells me that the acceleration due to gravity is 9.8 m/s². No, they are at worst inaccurate, and more likely imprecise. The acceleration due to gravity here on Earth IS roughly 9.8 m/s², but it varies depending on where on the surface I am. And if I’m on the Moon it’s a completely different value.

Sometimes it is in fact possible to actually test a claim, and it is often worth it. I work with SQL Server, and this is very true there. If a DBA tells you with absolute certainty that a specific setting should be set, or a query must be written a specific way, or an index rebuilt automatically at certain times, ask why. The worst answer they can give is, “I read it someplace.” (Please note, this is a bit different from saying, “Generally it’s best practice to do X.” Now we’re back to saying 9.8 m/s², which is good enough for most things, but may not be good enough if, say, you want to precisely calibrate a piece of laboratory equipment.)

The best answer is “because I tested it and found that it works best”.

So, last night I had the pleasure of listening to Thomas Grohser speak on the SQL Server I/O engine at a local SQL Server User Group meeting. As always, it was a great talk. At one point he was talking about backups and various ways to optimize them. He made a comment about setting MAXTRANSFERSIZE to 4 MB (the largest value SQL Server allows) being ideal. Now, I’m sure he’d be the first to add the caveat, “it depends.” He also mentioned how much compression can help.

But I was curious and wanted to test it. Fortunately, I had access to a database that was approximately 15 GB in size. This seemed like the perfect size with which to test things.

I started with:

backup database TESTDB to disk='Z:\backups\TESTDB_4MB.BAK' with maxtransfersize=4194304

This took approximately 470 seconds and had a transfer rate of 31.151 MB/sec.

backup database TESTDB to disk='Z:\backups\TESTDB_4MB_COMP.BAK' with maxtransfersize=4194304, compression

This took approximately 237 seconds with a transfer rate of 61.681 MB/sec.

This is almost twice as fast.  While we’re chewing up a few more CPU cycles, we’re writing a lot less data.  So this makes a lot of sense. And of course now I can fit more backups on my disk. So compression is a nice win.

But what about the maxtransfersize?

backup database TESTDB to disk='Z:\backups\TESTDB.BAK'

This took approximately 515 seconds and a transfer rate of 28.410 MB/sec. So far, it looks like changing the maxtransfersize does help a bit (about 8%) over the default.

backup database TESTDB to disk='Z:\backups\TESTDB_comp.BAK' with compression

This took approximately 184 seconds with a transfer rate of 79.651 MB/sec.  This is the fastest of the 4 tests and by a noticeable amount.
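Laying the four runs side by side makes the comparison concrete. Here’s a quick sketch (the numbers are simply the ones reported above, and the labels are mine) that computes each run’s speedup relative to the default backup:

```python
# (label, elapsed seconds, reported MB/sec) from the four tests above
runs = [
    ("default",             515, 28.410),
    ("maxtransfersize 4MB", 470, 31.151),
    ("4MB + compression",   237, 61.681),
    ("compression only",    184, 79.651),
]

baseline_secs = runs[0][1]  # the plain default backup

for label, secs, rate in runs:
    speedup = baseline_secs / secs  # how many times faster than the default
    print(f"{label:22s} {secs:4d}s  {rate:7.3f} MB/sec  {speedup:.2f}x vs default")
```

By this measure, compression alone comes out roughly 2.8x faster than the default, while the 4 MB transfer size alone gives only about a 1.1x improvement.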

Why? I honestly don’t know. If I were really trying to optimize my backups, most likely I’d run each of these tests 5–10 more times and take an average. This result may be an outlier. Or perhaps the 4 MB test with compression ran slower than normal. Or there may be something about the disk setup in this particular case that makes this the fastest method.
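That averaging step is easy to sketch too. Something like this (the timings here are made-up illustrative numbers, not real measurements) would average repeated runs and flag any run that looks like an outlier:

```python
from statistics import mean, stdev

def summarize(times):
    """Average repeated timings (seconds) and flag likely outliers,
    here defined as more than two standard deviations from the mean."""
    avg = mean(times)
    sd = stdev(times)
    outliers = [t for t in times if abs(t - avg) > 2 * sd]
    return avg, sd, outliers

# Hypothetical timings for six runs of the same backup command
times = [185, 182, 186, 184, 240, 183]
avg, sd, outliers = summarize(times)
print(f"avg={avg:.1f}s sd={sd:.1f}s outliers={outliers}")
```

With those numbers, the 240-second run gets flagged, which is exactly the sort of thing that would make me rerun the test before drawing conclusions.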

The point is, this is something that is easy to set up and test. The entire round of testing took me about 30 minutes and was done while I was watching TV last night.

So before you simply act on something you read on some blog someplace about “you should do X to SQL Server,” take the time to test it. Perhaps it’s a great solution in your case. Perhaps it’s not. Perhaps you’ll end up finding an even better solution.


A Bright Idea and State of the Art

I think everyone likes to talk about “their first program”.  I suspect though it’ll become a less common topic with future generations, just like most kids don’t recall the first book they ever read.

My first program converted Fahrenheit temperatures to Celsius.

I was probably 11 when I helped write it.  It was stored on paper tape and ran on the local high school’s minicomputer (probably a PDP-9 but I honestly have no idea).

It wasn’t a long program, and it was probably in FORTRAN. Again, that was so long ago I can’t recall the details. And it wasn’t a very impressive program. Heck, these days you can do it in a Windows CMD script as one line (well, two for clarity: SET F=212 and then SET /A (%F%-32)*5/9; the multiply goes before the divide because SET /A only does integer math, and dividing first would throw away the remainder).
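For the record, the same conversion in Python (the function name is mine) sidesteps the truncation issue entirely, since / does floating-point division:

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius: C = (F - 32) * 5/9."""
    return (f - 32) * 5 / 9

print(fahrenheit_to_celsius(212))  # 100.0
print(fahrenheit_to_celsius(32))   # 0.0
```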

I wrote more complex programs in high school (by then I had moved up to Turbo Pascal) and made my first money programming in FORTRAN while in college.

Things had improved from paper tape to floppy drives to hard drives. Writing and debugging programs for the most part became faster. But generally, anything more complex than basic input and output through the screen and keyboard was still tough to do and time-consuming.

About two weeks ago I had an idea for a project. I was on the road at the time and didn’t get a chance to sit down at my desktop until last week. In less than 24 hours I had prototyped the idea and tested it. The program involved a website, a database, doing some lookups, writing to the database, and a bit more. Even just 10–15 years ago it could easily have taken me 4 or 5 times as long to do something like that.

On Thanksgiving Day, my son wrote a program in a language called Scratch (http://scratch.mit.edu/) that would take input, make it circle around, then settle on the screen. The more times you entered text, the bigger the resulting “wordle” spiral would grow. He wrote it that morning before our relatives showed up. It took him maybe an hour or two, including debugging and overcoming some initial limitations.

He’s been writing programs in Scratch (and other languages) for years now.  I doubt he remembers his first program since writing programs now has become about as easy as using a computer.  I’m sure he doesn’t remember his first time using a computer like I do.  He writes programs at age 11 that in many ways are more complex than anything I wrote in my teens.

The state of the art has certainly changed and it’s made the world a better place all-around.  Languages and frameworks make developing faster and easier than ever before.

Though at times I’ll admit I miss the days of FORTRAN.

Simplicity

Over the years I’ve been involved in a number of web-based companies. All had great ideas for their business model. One of them had a great idea for classified ads. It had the latest in taxonomic matching and advanced search capabilities. If you were looking for a Mustang, it could direct you to ads for cars or horses depending on context and other factors. Its search capabilities were ahead of their time. It had pretty much every bell and whistle the newspapers asked for and that the design folks could think of.

Then Craigslist came along. Craigslist was free (at least compared to newspaper classified ad sites, where the newspapers typically charged). It had no taxonomic matching. Its search capabilities were, and still are, bare-bones. In fact, it very much relies on the user to narrow down and refine searches.

But it succeeded where the other product failed, for what I believe is one very simple reason: it was blazingly fast. It didn’t matter if it returned bad results the first time. It was so fast the user didn’t mind typing in new search parameters and narrowing down their search. It was faster than any of the “advanced” newspaper classified engines I saw. Sure, they might try to do a better job of returning results, but the honest truth was, in most cases people would end up doing multiple searches anyway, trying to narrow things down. And in the time it took to do 2–3 searches with a typical newspaper site, Craigslist allowed the user to do 10–15 searches. Time was money, and people wanted to do things quickly.

Over the years, with numerous sites, I’ve seen the design get in the way of the end user. The truth is, 80% of the time people will use only 20% of the features, but they want that 20% to be as fast as possible.

So, keep it simple and keep it very fast.

One of these days though I’ll relate the story of the 3,000 mile Steinway search.