StorageMojo




Robin Harris    


Anatomy of an outage

May 14th, 2008 by Robin Harris in Off-Topic, Security & Public Policy

Getting rid of the hacked files and spam links wasn’t the end of it
Dreamhost notified me that the load on my server was excessive and they’d disabled StorageMojo.

Yikes! Had I been hacked again? DDOS attack? What?

Building the correct mental model
In short order I brought up my SFTP client, my tracking site, the Dreamhost webpanel and my son on chat. He had me toss a new index.html file into the site folder to let people know that the problem was getting addressed.

On to problem solving
It took a while to figure it out because I’d never seen it before.

The load was coming from Google referrals for charming search terms that I’m going to misspell on purpose in hopes of not attracting similar traffic:

  • download sh*mail
  • downlode free 1ndian s3x movies
  • pharmasuitical affiliate prom0
  • 0rgish/behe*ding
  • h1nd1 p0rn m0v1es

*Lots* of pee-oh-rn requests for many different ethnic types. Some things are universal - at least among guys.

There were no hacked files still on StorageMojo - I’d gotten them all last week and they were still gone. But the tracking site was referring to them, so for a while I thought they were there but that for some reason I couldn’t see them.

But then my son checked what happened when someone tried to go to the spam links. The site was delivering a “system error” message - not the static 404 page I’d expect - so the site wasn’t delivering the spam content and it really was gone. Presumably processing for the “system error” page created much of the extra overhead Dreamhost was seeing.

For a while StorageMojo was getting thousands of hits an hour from these Google referrals. At some point Google must have crawled the site again, saw the content was no longer there, and stopped referring people.

Not a moment too soon!

So what was this all about?
My son hypothesized:

This looks like a two-step scheme…step one is that they hacked your site and got all those bad SEO files uploaded. Step two is to send lots of fake Google traffic through your site to increase PageRank.

Then I went one step further and checked out one of the spam pages that Google had cached. In big bright colors it told me that my XP system was infected with viruses and I should download their *free* virus scanner.

Whoa, scary. Except I’m on a Mac.

Botnet recruitment? I don’t know.

The StorageMojo take
I’ve made a number of changes to tighten up StorageMojo. As I was researching this I found that there are many security “folk remedies” out there, but very little on what the high priority issues are.

Keeping software up to date seems to be the critical success factor - and sad to say, I’d been lax. In addition to keeping current I’m now checking my site files more often among other changes.

Hopefully these requests will tail off as Google stops referring people. And StorageMojo can go back to being a quiet little site.

Thank you for your patience.

Comments welcome, of course.

Seagate’s head-settling time?

May 13th, 2008 by Robin Harris in Off-Topic

First it was the bogus “national security” argument against a Chinese buy-out of Seagate. Now it’s suing STEC over solid state drives (SSD). Has William Watkins, Seagate’s CEO, jumped the shark?

STEC said, per legal SOP, that the suit was “without merit.” After looking at the patents I agree.

Look at the patents
There are 4 patents named in the Seagate suit.

The links will download pdf versions if your insomnia is acting up.

6336174
The first patent covers an invention called a hardware-assisted memory module (HAMM) that can, when there is a system glitch, isolate

. . . itself from the host computer system before copying digital information from volatile memory to nonvolatile memory.

This reminds me of a common RAM-based SSD strategy - used 20 years ago in DEC SSDs - to copy data held in RAM to a disk drive when power failed.

6404647
The 2nd patent is for a solid-state mass memory storage device. This device

. . . comprises a printed circuit assembly and a plurality of non-volatile, high density storage devices mounted to the printed circuit assembly and electrically connected thereto.

A picture is worth 1,000 words:

More than a passing resemblance to a compact flash card - a product I bought in 1993.

6849480
“Surface mount IC stacking method and device.” This patent covers a technique that solves 3 problems:

  • Routing signals through stacks of similar chips
  • Stacked devices with identical dies that are made into distinct chips - flashed? - later
  • Long, high-capacitance interconnects between stacked devices

Seems like folks have been stacking chips for a while. Is there anything new here?

7042664
“Method and system for host programmable data storage device self-testing.”

. . . providing programmable self-testing of a data storage device comprises selecting one or more host programmable tests stored in memory in the data storage device by setting data in a first log in memory of the data storage device.

This invention’s goal is to enable disk drive testing without removing the drive from the host. It embodies the concept of the host providing test parameters for an attached device - which the patent imagines to be a disk.

Size matters
Part of Seagate CEO William Watkins’ pique with STEC is fueled by a suit from 3.5″ drive inventor Rodime that Seagate paid $45 million to settle. Rodime patented the 3.5″ form factor for disk drives - and got the courts to enforce it.

Watkins knows that patenting disk drive form factors is silly - they have to be standard sizes - but if the USPTO grants them and the courts enforce them, why not?

The StorageMojo take
The IC stacking patent may have some merit - I’m no judge of chip packaging. But the other patents, especially for compact flash, seem dubious at best.

The good news is that the Supreme Court has forced the patent courts to change course. In KSR v Teleflex the Supremes ruled that non-obviousness is a legal question, not a factual one. That bit of hair-splitting means that lower-court rulings can be appealed. Under the old rule once the trial court made a “finding of fact” it could not be re-examined on appeal.

Rodime would have lost under the new rule. While it will take time for KSR to play out, in the short term it almost certainly reduces the value of existing patents - like Seagate’s ludicrous flash drive patent.

While some have portrayed this as Seagate trying to stymie a competitor, it’s more likely that Seagate believes STEC has some useful technology. The promise of a costly legal battle might persuade a smaller company than STEC to settle with a quick cross-licensing deal.

That would help Seagate catch up in the high-end flash SSD market. I hope STEC resists that temptation and continues to focus on the knotty issue of fast random write flash performance.

Comments welcome, of course.

NAND - an engineer’s perspective, pt zwei

May 12th, 2008 by Robin Harris in Architecture, SSD/Flash Disk

Herewith continues NAND - an engineer’s perspective.

And you thought marketing guys were wordy! The quoted bits are from the earlier StorageMojo post Notebook flash SSD market: fantasy or mirage?. Part one is here.

Begin part zwei

. . . tested application performance hardly changes either . . . .

Actually, this makes sense.  If you are accessing 4k of data, then HDD and SSD are both fast enough and you don’t care.  If you are accessing a 1MB file, then that is 256 x 4k sector accesses, and the sectors will be laid out one after the other, which is where HDDs perform well.  SSDs will shine when you need to do 256 x 4k sector accesses, and the sectors you are accessing are scattered across the disk, but as far as I know this access pattern is not common except on servers.
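A rough way to see the point is to model the two access patterns. This is only a sketch: the seek and transfer times below are illustrative assumptions, not measurements of any real drive, and the function names are made up for the example.

```python
def hdd_time_ms(n_accesses, sequential, seek_ms=12.0, transfer_ms_per_4k=0.05):
    """Rough time for n 4 KB reads on a disk (all figures are assumptions)."""
    # A sequential run pays for one seek; scattered reads pay for one seek each.
    seeks = 1 if sequential else n_accesses
    return seeks * seek_ms + n_accesses * transfer_ms_per_4k


def ssd_time_ms(n_accesses, access_ms=0.1, transfer_ms_per_4k=0.1):
    """Flash pays a small fixed cost per access, whatever the layout."""
    return n_accesses * (access_ms + transfer_ms_per_4k)


n = (1024 * 1024) // (4 * 1024)  # a 1 MB file is 256 x 4 KB accesses

hdd_sequential = hdd_time_ms(n, sequential=True)
hdd_scattered = hdd_time_ms(n, sequential=False)
ssd_any_layout = ssd_time_ms(n)

print(f"HDD sequential: {hdd_sequential:.1f} ms")
print(f"HDD scattered:  {hdd_scattered:.1f} ms")
print(f"SSD either way: {ssd_any_layout:.1f} ms")
```

With these assumed numbers the disk holds its own on the sequential file and falls far behind only when the 256 accesses are scattered, which is the server-style pattern described above.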

And what about the 4-bit MLC that Toshiba is counting on to drive costs down?

I’m a NAND flash fan, but this is scary stuff for me.  To store 1 bit in a bit cell, you need to distinguish between two voltage levels.  To store 2 bits, you need to distinguish 4 levels.  For 3 bits, 8 levels.  For 4 bits, 16 levels.  I think at the 4 bit/16 level point, we’re down to where 10-20 individual electrons can make the difference in the bits read out.
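The level counts above are just powers of two. A minimal sketch (the function name is hypothetical):

```python
def voltage_levels(bits_per_cell):
    """Distinct voltage levels a cell must resolve to hold this many bits."""
    return 2 ** bits_per_cell


levels = [voltage_levels(bits) for bits in (1, 2, 3, 4)]
for bits, count in zip((1, 2, 3, 4), levels):
    print(f"{bits} bit(s) per cell -> {count} levels")
```

Each extra bit doubles the number of levels squeezed into the same voltage window, which is why the margin between adjacent levels shrinks so quickly.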

This will be less durable than current SLC. How do you explain that to consumers?

The answer is easy, but doing it is hard.  You have to make it so that the issues are completely invisible to consumers.

Note that this has been done successfully with flash for years.  Most of the memory cards (SD, MMC, etc) that people have been buying for years use MLC flash.

Flash has read errors - that’s why vendors implement error detection.

NAND chips are generally organized in write pages, with a spare area for each page - typically 2kB page, with 64B of spare area.  The spare area is used to store ECC parity data, and meta data (more about this shortly).

HDDs have read errors as well; they also write their data to the platter using ECC, and other algorithms that make it easier to recover the bit clock and align the heads when reading the data back.

But flash has a problem disks don’t: flash drives move your data around a lot more often than disks do. Every time a flash drive writes a page, it has to erase the entire block that page is in.

Not quite right.  Generally, a page can only be written once, and has to be erased before it can be written again.  And unfortunately, erases can only be done on an erase block, which is usually 64 write pages.  If you have to erase a page, then you might have to move 63 other pages to free up the erase block - yuck!  It happens sometimes, but the FTL (flash translation layer) software that manages all of this is usually optimized to avoid this situation as much as possible.

The normal scenario is that you write a page, and the FTL just puts the new data in a new page somewhere, and marks the old page as obsolete.  Once the FTL runs low on space, it needs to do garbage collection, but if you put a little extra NAND in your system so that even a full filesystem has some empty pages, you can make that pretty rare.
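The write-elsewhere-and-mark-obsolete behavior can be sketched as a toy model. This is not any vendor's FTL: the class, its method names and the pick-the-dirtiest-block garbage collection policy are assumptions made for illustration. Only the 64-page erase block comes from the description above.

```python
PAGES_PER_BLOCK = 64  # erase block size quoted above


class ToyFTL:
    """Toy flash translation layer: out-of-place writes plus simple GC."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        n_pages = n_blocks * PAGES_PER_BLOCK
        self.state = ["free"] * n_pages  # each physical page: free, valid or obsolete
        self.owner = [None] * n_pages    # logical page held by each physical page
        self.map = {}                    # logical page -> physical page
        self.erases = 0

    def write(self, logical):
        old = self.map.pop(logical, None)
        if old is not None:
            self.state[old] = "obsolete"  # never overwrite in place
            self.owner[old] = None
        if "free" not in self.state:
            self._garbage_collect()
        self._program(self.state.index("free"), logical)

    def _program(self, phys, logical):
        self.state[phys] = "valid"
        self.owner[phys] = logical
        self.map[logical] = phys

    def _pages(self, block):
        start = block * PAGES_PER_BLOCK
        return range(start, start + PAGES_PER_BLOCK)

    def _garbage_collect(self):
        def obsolete(block):
            return sum(self.state[p] == "obsolete" for p in self._pages(block))

        # Assumed policy: reclaim the block holding the most obsolete pages.
        victim = max(range(self.n_blocks), key=obsolete)
        if obsolete(victim) == 0:
            raise RuntimeError("drive full: nothing to reclaim")
        survivors = [self.owner[p] for p in self._pages(victim)
                     if self.state[p] == "valid"]
        for p in self._pages(victim):  # erase works on the whole block
            self.state[p] = "free"
            self.owner[p] = None
        self.erases += 1
        for p, logical in zip(self._pages(victim), survivors):
            self._program(p, logical)  # put the still-valid pages back


ftl = ToyFTL(n_blocks=2)      # 128 physical pages
for logical in range(100):    # 100 pages of user data, 28 pages spare
    ftl.write(logical)
for _ in range(28):           # rewrites of page 0 land in the spare pages
    ftl.write(0)
erases_before_gc = ftl.erases
ftl.write(0)                  # no free page left, so garbage collection runs
erases_after_gc = ftl.erases
print(erases_before_gc, erases_after_gc)
```

The spare pages absorb the first 28 rewrites with no erase at all. Only when they run out does the toy FTL erase a block and copy the surviving pages, which is why a little extra NAND makes garbage collection rare.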

No hard numbers from the vendors - depends on how good their signal processing algorithms are - but it could easily be 5,000 writes - down from 10,000 today.

Actually, some of the NAND vendors are already at 5k erase/write cycles today.  This, and slow write speeds are definitely the weak links for MLC NAND.

I believe that it is possible to do a good enough job with caches in the computer DRAM, and in the FTL to make a system built from 5k endurance work for a very long time.

Note that the 5k number is a statistical thing - this is the number of cycles at which about x% of the blocks will have failed (I think x% = 50%, but I didn’t look it up).  This means that some blocks might fail when the part is new, and some might last a lot longer.  If the software is done right, then the amount of available storage space will gradually shrink as blocks fail, and the entire drive won’t suddenly fail.
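To put the 5k figure in perspective, here is a back-of-envelope lifetime estimate. It is a sketch under generous assumptions: perfect wear leveling, a hypothetical 10 GB of writes per day, and a write amplification factor that is simply guessed. None of these numbers come from a vendor.

```python
def lifetime_years(capacity_gb, erase_cycles, writes_gb_per_day,
                   write_amplification=1.0):
    """Years until every block has used up its rated erase/write cycles.

    Assumes wear is spread perfectly evenly across the drive.
    """
    total_writable_gb = capacity_gb * erase_cycles / write_amplification
    return total_writable_gb / writes_gb_per_day / 365


ideal = lifetime_years(64, 5000, writes_gb_per_day=10)
pessimistic = lifetime_years(64, 5000, writes_gb_per_day=10,
                             write_amplification=4.0)
print(f"ideal: {ideal:.1f} years, with 4x write amplification: {pessimistic:.1f} years")
```

Even the pessimistic case is measured in decades for this assumed workload, which supports the claim that good caching and FTL design matter more than the raw cycle count.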

The map that keeps track of where your data is rapidly gets very complex - and itself is regularly read and rewritten. How well protected is this critical data structure? If it isn’t bulletproof you can kiss your data good bye.

All true.  But you can also write metadata information in the spare area, to allow you to rebuild the FTL map if something goes horribly wrong.
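One way such a rebuild could work is sketched below. The assumption is that each page's spare area records the logical page it holds plus a sequence number that grows with every write. That layout is made up for the example; real FTLs keep their own formats.

```python
def rebuild_map(spare_areas):
    """Rebuild logical -> physical mapping by scanning every page's spare area.

    spare_areas[phys] is None for an erased page, otherwise a
    (logical_page, sequence_number) tuple. The newest copy of each
    logical page (highest sequence number) wins.
    """
    newest = {}
    for phys, spare in enumerate(spare_areas):
        if spare is None:
            continue
        logical, seq = spare
        if logical not in newest or seq > newest[logical][1]:
            newest[logical] = (phys, seq)
    return {logical: phys for logical, (phys, _seq) in newest.items()}


# Physical page 0 holds an old copy of logical page 0; page 2 holds the new one.
scanned = [(0, 1), (1, 2), (0, 3), None]
recovered = rebuild_map(scanned)
print(recovered)
```

The scan is slow because it touches every page, so it is a recovery path rather than something done on every power-up.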

Also, HDDs have the same problem with their FAT tables, or the modern equivalent.  This is normally stored on the disk, and in the computer’s RAM, with the disk copy being a little out of date.  Lose power at the wrong moment, and bad things can happen.

The StorageMojo take
Many thanks to the anonymous contributor. Net/net this points again to the suitability of flash drives for servers - and not so much for notebooks - the original subject.

The larger issue is the lack of transparency on the part of NAND SSD vendors. Until their architectures can be independently reviewed, we all have to rely upon marketing assurances - not! - and the useful but skimpy testing provided by sites like Anandtech.

The server-side SSD market can work with those limits. After all, the vendor of the complete system has to stand behind it.

But that is a tiny fraction of the total available market. The big win is on the consumer side: 100+ million units - if the product delivers.

Samsung, Toshiba: your current strategy is doomed. You need to engage at the consumer’s level instead of relying on the usual marketing hype. Your product is too costly, now and 3 years from now, to succeed without delivering real benefits.

You aren’t there yet.

Comments welcome, of course.

StorageMojo in Chicago

May 10th, 2008 by Robin Harris in Off-Topic

I’m spending a couple of days R&R in Chicago. Caught Shemekia Copeland at Buddy Guy’s last night. Cruised the Chicago river this morning. Hope to hit another couple of blues clubs tonight.

Then back to the mountains of northern Arizona.

Moderation has been a bit spotty - but all will be back to normal Monday morning.

NAND - an engineer’s perspective

May 10th, 2008 by Robin Harris in Off-Topic

The post on notebook flash drives [see Notebook flash SSD market: fantasy or mirage?] generated many comments.

Part of what makes it hard to discuss flash is the dearth of information about how it works. My investigation of flash issues has been helped along by hints and tips from insiders and the occasional paper that sheds light on FTL design issues [see Flash chance, based on a paper from Microsoft Research].

Thus I was pleased to get a 2500 word email from a polite and knowledgeable SSD engineer cum marketing guy commenting at length. I asked him if I could publish his comments and he said yes - if I preserved his anonymity and removed the names of the companies he’s worked for.

Seemed reasonable. Since it’s long I’m breaking it up into 2 parts.

In the editing I’ve removed some info, abridged some comments, added the bold face headers and broken some long paragraphs into 2 or 3 shorter ones for online readability. At all times I’ve sought to preserve the author’s meaning.

Begin SSD guest comment
First up, great post. I agree with most of what you said. I haven’t used an SSD drive myself, but the reviews I’ve seen make me wonder if I ever will - way too expensive, for way too little benefit.

The lay of the land
Quick background comment on flash memory. There are two main kinds of flash memory: NOR & NAND. NOR is similar to SDRAM, NAND similar to HDD. NOR can be accessed randomly, is faster (at least for reads) than NAND, but the chips are smaller and cost a lot more per GB.

NAND can only be accessed in blocks like a HDD, the chips are larger, and the cost per GB is less than NOR. NOR is commonly used for firmware (e.g. the BIOS in your PC), NAND is commonly used for bulk storage. In the discussions about SSDs, we’re always talking about NAND, so I’m going to say “NAND” rather than “flash” in the rest of this email.

NAND flash has ~10x worse $/GB than HDD, but ~10x better $/IOPS.

Your tour guide
I’ve been in the semiconductor business for ~20 years, first as an engineer, then gradually transitioning into management & marketing. In my last job I developed relationships with all the NAND market players. When I first started looking at NAND chips, 4MB chips were still around, now we’re working with 4GB chips - wow!

The future
I think that the SSD drive makers can do a MUCH better job than they’ve done so far, and that the raw technology is capable of doing much better. I think eventually the SSD products will get better, and we’ll see SSD drives (or their successors) used almost everywhere.

1st, the numbers
A state of the art MLC NAND chip today is 4GB, so a 64GB SSD drive has at least 16 NAND die inside. The peak write speed should be ~5MB/sec/die, so the SSD should be capable of ~80MB/sec sequential write. Peak read speeds should be ~30MB/sec/die, so the SSD should be bottlenecked by the SATA interface.

These are MLC numbers. SLC performance will be even higher, about 8x better for write speeds for the datasheets I compared. True, these are best case raw performance numbers, and in the real world there are complications that will keep you well away from these numbers, but it should be possible to do waaayyyy better than we’re seeing now.
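The arithmetic behind those numbers, as a short sketch. The per-die figures are the ones quoted above; the 300 MB/sec SATA ceiling is an assumption (the nominal rate of a 3 Gbit/sec link), not something from the email.

```python
chip_gb = 4             # state of the art MLC chip
drive_gb = 64
write_mb_s_per_die = 5  # peak MLC write speed per die
read_mb_s_per_die = 30  # peak MLC read speed per die
sata_mb_s = 300         # assumption: nominal 3 Gbit/sec SATA link

dies = drive_gb // chip_gb
peak_write_mb_s = dies * write_mb_s_per_die
peak_read_mb_s = dies * read_mb_s_per_die
slc_peak_write_mb_s = peak_write_mb_s * 8  # "about 8x better" for SLC writes

print(f"{dies} dies: {peak_write_mb_s} MB/sec write, {peak_read_mb_s} MB/sec read")
print("reads limited by SATA:", peak_read_mb_s > sata_mb_s)
```

These are best-case figures with every die working in parallel, which is the same caveat the email makes.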

Responding to StorageMojo
[He goes on to quote and respond to some points from the StorageMojo post. I've put those in quotes.]

Flash has a place in one notebook niche: below the $40-$50 minimum cost of a disk. As we’re already seeing with the Asus Eee, replacing $50 of disk with $10 of flash makes a big price difference.

I agree 100% with this - if I can build a system using either $10 of NAND, or $50 of HDD, and the $10 of NAND is enough storage, then NAND wins. It doesn’t matter that the HDD has better $/GB, or that it will have loads of spare GB - it costs $40 more, and it’s out.

$10 of NAND storage will buy a rapidly increasing amount of storage, so the cut-over point where NAND wins based on entry cost alone is rising rapidly. I think that the $/GB number is halving every 12-18 months, so in 2-3 years we’ll get 4x more NAND for the same cost.
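The 4x figure follows directly from the halving rate. A minimal sketch (the function name is made up for the example):

```python
def nand_multiple(years, halving_period_years):
    """How much more NAND the same money buys after the given time."""
    return 2 ** (years / halving_period_years)


fast_case = nand_multiple(years=2, halving_period_years=1.0)  # halving every 12 months
slow_case = nand_multiple(years=3, halving_period_years=1.5)  # halving every 18 months
print(fast_case, slow_case)
```

Both ends of the range work out to two halvings, hence 4x more NAND for the same $10 in 2-3 years.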

FABulous

Given the multi-billion dollar cost of semiconductor fabs, getting the notebook SSD market wrong would make Toshiba’s $250 million HD-DVD loss look cheap.

Actually, while the $$$ at stake are probably pretty large (inventories, controller chip design efforts, etc), they are not as large as a fab. A modern day, state of the art wafer fab costs several billion dollars, but that investment won’t be completely at the mercy of SSDs succeeding, for two main reasons.

One, these fabs are built to make both SDRAM & NAND. Both markets are very sensitive to the balance of supply/demand, and therefore both markets exhibit wild price swings. By building the fabs to support both types of (very high volume) products, they can switch from one to the other based on the supply/demand balance in both markets.

Two, there are other huge markets for NAND, primarily memory cards (SD, MMC, xD, memory stick, CompactFlash, and variants), & consumer electronics devices (phones, especially SmartPhones, GPSs). One of the biggest customers on the planet for NAND is Apple (iPods, iPhones).

It is true that Toshiba is playing a billion dollar poker game with (mainly) Samsung as to fab capacity (if there is overcapacity, both companies suffer, but if one under-invests and the other over-invests in capacity, then the over-investor wins), but SSDs succeeding or not will happen slowly enough that the capacity differences can be absorbed by speeding up or slowing down the bringing on of new fab capacity.

. . . today’s spot market MLC $2500/TB . . .

That spot market price is about right. This implies that the 64GB SSD in the Macbook Air should be about a $300 upgrade, not a $1,300 upgrade. True, you do get a slightly faster CPU in the deal, but I think that we’re looking at way high early adopter prices right now.
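For reference, the raw NAND cost implied by that spot price. This sketch assumes 1 TB = 1000 GB and counts the chips only; controller, assembly and margin are what would take it from there to the ~$300 estimate.

```python
spot_usd_per_tb = 2500
drive_gb = 64
upgrade_price_usd = 1300  # quoted MacBook Air SSD upgrade price

raw_nand_usd = spot_usd_per_tb * drive_gb / 1000
markup = upgrade_price_usd / raw_nand_usd

print(f"raw NAND: ${raw_nand_usd:.0f}, upgrade price is {markup:.1f}x that")
```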

And if the market doesn’t appear, a billion dollar write off.

I’m guessing that they are betting $10M to $20M on a project to design and build an SSD controller chip. They can’t afford not to have the controller, in case the SSD market results in a significant proportion of their volume, and they can’t assume that they will be able to buy the controller from an outside company (or even more risky, a competitor).

Power: no SSD notebook has gained more than 10 minutes battery life over disks. Since flash is already power-efficient that won’t change. Disks have multiple opportunities to improve power use - and with over $1 billion a year in R&D behind them - they will.

The primary users of power in a notebook are (in order)

  • The display back light
  • The CPU
  • Everything else

The HDD is lumped in with everything else. Flash should have a significantly better power consumption than HDD, but since both are operating in the power shadow of the display & CPU, it doesn’t make a lot of difference.

Despite what a commenter said, spinning the HDD platter doesn’t take a lot of energy. Spinning them up to speed from idle does take a lot of energy, but only for a few seconds. Keeping them spinning once they are started only consumes enough energy to overcome the bearing friction, and that friction is pretty low. Most of the power spent in accessing a HDD is in moving the read/write heads, and in the read channel electronics.

One other thing you didn’t mention is that after ~30 years of development, Windows (Linux, OS-X) is pretty well optimized to the characteristics of HDDs. Have you ever heard of the Windows XP Prefetcher? Wow!

Now, if we can do something about the power consumption of the display back light and CPU, then SSD vs. HDD might make a difference, but by then we’re talking about cell phone like battery life so it probably won’t matter.

End of part 1
Next up: flash financials; 4 level flash durability; data protection and more in the conclusion to NAND - an engineer’s perspective

Comments welcome, of course. Did you notice that he actually disagreed with much of what I said? But he was nice about it.

StorageMojo: hacked!

May 6th, 2008 by Robin Harris in Off-Topic

Always learning
This week’s learning: a hacked web site. There’s been a lot of that going around. Writing has taken a back seat to fixing the problem.

It took a while to grok how deeply StorageMojo had been hacked.

First I got a note from my hosting company - something about a daemon - and I told them to take it down. Which they did.

Thought I was done.

But I wasn’t
Then Gary at Nexsan noted that StorageMojo was alarming his browser. Went into the StorageMojo files on WordPress and discovered some iframes that I hadn’t put there.

Pulled them out. Upgraded to the latest version of WordPress.

Thought I was done.

Wrong again
Fired up the SFTP client and took a look at my web site files. Saw a bunch with names I didn’t recall, like Emma, Alexander and Jordan. Inside, links to hundreds of sites I’d never heard of either.

Got rid of them.

Checked a couple of other sites I host on the account. One had been completely cleaned out by the spamsters - the site was gone - replaced with more collections of links.

Edited the junk out of those sites. Hoped I was done, but decided to go through every single file and folder on all three sites.

Found the malicious code. Very professional. Replicated in several places. Language = ru, whatever that means.

Corrective action
New passwords, of course. Noticed that the Dreamhost web management system doesn’t make that easy to do - password management is spread across several different tools - which guarantees that people won’t change them very often.

Read up on security. A couple of good sites are Blog Security and Stop Badware. Google also has a helpful checklist.

Did some other housecleaning and site hardening.

The StorageMojo take
I now know I will never be done. The rest of you with blogs should learn by my misadventure.

The biggest surprise is that there are many things that can be done to make sites harder, but they are not the defaults. You have to do some research and sometimes some configuration.

That is wrong. Other than general exhortations to update software, the hosting companies do almost nothing to make it easy to manage security. Not many consumers are going to dig into log files every couple of days.

I’m more technical than the average blog writer and some of this stuff is a PITA. The Internet Operating System needs some security patches.

Comments welcome, of course. AFAIK nothing bad got sent to readers of StorageMojo.

NAB Shorts: MatrixStore

May 2nd, 2008 by Robin Harris in Off-Topic

Spent some time with Nick Pearce, a co-founder of Object Matrix, a UK-based software startup supporting commodity-based archiving.

Their MatrixStore product clusters off-the-shelf servers and storage to create a secure disk based archive. MatrixStore runs out of the box on Mac OS and will work with most Linux supported tin.

Commodity hardware and software
Archived data should not be tied to a specific storage platform. Proprietary formats or filesystems are an accident waiting to happen.

MatrixStore keeps the data on industry standard filesystems in the same format as on the client disk. The data will be retrievable even if the company has disappeared.

Platform lifecycle
Older gear can play in the same config as newer stuff. Roll old hardware out of production into the archive, and double its useful life. Upgrade in place, a critical consideration for archives.

Application-centric storage
MatrixStore is integrated with the recently released Final Cut Server from Apple. They provide life-cycle management of assets and metadata from ingest through archive.

The MatrixStore software stores the added FCS metadata using metadata operations supported by XFS on Linux. When ZFS is supported on Mac OS they plan to use its native metadata support as well.

MatrixStore also automates some tasks that usually require manual configuration, adding capacity, data redundancy, data authenticity and the like. Like Final Cut Server it’s designed for people who aren’t storage admins.

Cool pricing
They give away the first 15TB of software licenses for free. After the first 15TB it’s $1000 per TB of protected content. There’s a pricing widget to help with configurations on their website.
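As a sketch, the pricing as described works out like this. It assumes only the capacity beyond the free 15TB is charged, which is one reading of the terms above; the function name is hypothetical and this is not Object Matrix's own calculator.

```python
def license_cost_usd(protected_tb, free_tb=15, usd_per_tb=1000):
    """License cost, assuming only TB beyond the free tier are billed."""
    return max(0, protected_tb - free_tb) * usd_per_tb


small_archive = license_cost_usd(10)
large_archive = license_cost_usd(40)
print(small_archive, large_archive)
```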

The StorageMojo take
Digital archiving is a critical issue for content creators. Nick - who had worked at EMC - made choices that will become de rigueur for deep archiving as people come to understand the issues:

  • Content in its original format
  • Commodity hardware
  • Upgrade in place
  • Pay as you go
  • Automate the small stuff

MatrixStore’s focus on Final Cut Server and their pricing model are both positives. Final Cut Studio has taken out a huge swath in the NLE market - over 1 million licenses sold - so the FC Server business should be a healthy one.

Their pricing transparency and unlimited-time 15 TB trial should also work well. All in all, an up-to-the-minute approach to the market. You might almost think they’re American.

Comments welcome, of course.

The value of guaranteed uptime

May 1st, 2008 by Robin Harris in Architecture, Enterprise, Future Tech

What, if any, is the value of multi-year storage uptime?

Xiotech and Atrato promise 5 and 3 year uninterrupted service on their new arrays. Now it is time to ask, as some commenters have, so what?

After all, enterprise data centers are already well-equipped to deal with disk failures. RAID keeps the data available. 7×24 service replaces the failed drive with a new hot spare. Experienced storage admins paper over the cracks.

It isn’t like you’re going to fire all your storage admins just because arrays stop breaking.

Opex vs capex
The direct cost saving - no maintenance contract for x years - may or may not be reflected in the purchase price. From a buyer’s perspective there are 2 costs: the capital expense - capex - and the operating expense - opex. Opex is fully tax deductible in the year incurred, so it is easier to get.

Atrato and Xiotech need to think creatively about maintenance pricing.

Breaking into the glass house
Breaking into data centers with the promise of cost savings isn’t easy. The provable cost savings have to be 50% or better to get conservative data centers to change vendors. And it helps if there is a recession or the business is tanking. Motivation.

A case can be made that after adding up a standard array’s maintenance costs, random disruption costs and additional management it will be cheaper to go with the new product. The CFO will demand it.

But if you want to change the market, you have to change the way the market thinks.

Re-thinking the issue
Straight cost-displacement arguments aren’t going to have the legs both companies would like. They need a different model.

Enterprise IT is a manufacturing plant - not an engineering testbed. It confuses the engineers because it seems like a techie haven - but it isn’t.

It is all about shipping product, each and every day. Like a real factory.

SPC
Everyone accepts that statistical process control has changed the face of manufacturing. A core idea behind SPC, reducing variability improves quality, is directly applicable to IT factories.

What Atrato and Xiotech do, ideally, is reduce IT ops variability. There is always a known level of performance. Availability is 100%.

Thus most of the usual dependencies are no longer dependencies. I/O slowdowns and timeouts should disappear. Drive rebuilds won’t impact performance. Admins won’t pull the wrong drive - which happens about 2% of the time - and bring down the array. And so on.

The StorageMojo take
Enterprises over-configure because they never know what is going to hit them - but they do know it will be at the worst possible time. Ideally they want to be ready to handle the biggest shopping day of the year - even after an array failure.

Workload variability isn’t going away. But wouldn’t it be nice if equipment performance and availability variability did?

That’s what Atrato and Xiotech are selling. I wish them luck communicating a value prop that strikes at the heart of what every other array vendor is selling.

Comments welcome, of course.

HGST getting ready to rumble

April 29th, 2008 by Robin Harris in Disk

I got quoted in Byte & Switch today about Hitachi Global Storage Technologies and the new CIO. HGST has been a money pit for Hitachi since they bought the IBM disk operation.

The question is: are they ready to do something about it? The answer is yes.

An informant assures me that HGST has named Raj Das - late of SGI - the new SVP of Marketing.

How many psychiatrists does it take to change a lightbulb?
Raj and I worked together at Sun, where he was one of the few results-oriented, damn-the-torpedoes marketing guys. He’s high energy and creative.

Turning around Hitachi marketing is going to take everything he’s got. Disk companies are not only engineering dominated - the engineers are even more anti-marketing than most. Add in the culture clash of two proud companies and, well, it isn’t good.

The engineers need to understand one thing. Until the Hitachi GST brand means something positive to consumers - at Fry’s and at datacenters around the world - the company won’t be able to justify an extra nickel of margin. Without that, profitability will remain a mirage.

One, but the lightbulb has to really want to change
I know Raj and I know what he can do. Will the guys across the pond let him do it?

The StorageMojo take
Disk vendors mostly compete on price. HGST has an opportunity to change this by re-thinking the disk value proposition - and the communication of it. The industry is at several inflection points.

Here’s hoping HGST can seize at least one of them. More competition will be good for all of us.

Comments welcome, of course. You can see Raj on the SGI video from a month ago below.

Notebook flash SSD market: fantasy or mirage?

April 27th, 2008 by Robin Harris in Architecture, SSD/Flash Disk

Fresh off the HD-DVD fiasco, Toshiba execs are stepping up to pursue another expensive flop: notebook SSDs. Memo to Toshiba: people won’t pay huge SSD premiums for nothing. And almost nothing is what flash SSDs provide today - and for the foreseeable future.

Please sir, may I have another!
Given the multi-billion dollar cost of semiconductor fabs, getting the notebook SSD market wrong would make Toshiba’s $250 million HD-DVD loss look cheap. The president of Toshiba semi, Shozo Saito, recently opined that flash drives will be in 25% of notebooks by the beginning of 2011.

He is so-o-o wrong.

Hand me the back of the envelope, please
Guessing 200M notebook sales in 2011, 50 million flash drives of, say 250 GB, for total sales of 12.5 million TB of flash. Assuming a cost reduction curve of 50% annually from today’s spot market MLC $2500/TB to ~$320/TB in 2011 . . . hmm-m . . . $4 billion in chip sales.

Give or take. Yummy!

If Toshiba projects winning 20% of the market, $800 million in sales would justify over $1 billion in flash factory capacity. And if the market doesn’t appear, a billion dollar write off.
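The envelope math above is easy to re-run with different guesses. Here is a minimal sketch in Python; every input is a guess from this post, and the constant and function names are mine, not anything from Toshiba.

```python
# Back-of-the-envelope check of the 2011 notebook flash estimate.
# Every input is a guess taken from the post; the names are illustrative.

NOTEBOOKS_2011 = 200_000_000   # guessed notebook unit sales in 2011
FLASH_ATTACH_RATE = 0.25       # Saito's claim: flash in 25% of notebooks
DRIVE_GB = 250                 # assumed capacity per flash drive
PRICE_PER_TB_2008 = 2500.0     # today's spot-market MLC price, $/TB
ANNUAL_DECLINE = 0.50          # assumed yearly cost reduction
YEARS = 3                      # 2008 -> 2011
TOSHIBA_SHARE = 0.20           # hypothetical Toshiba share of chip sales


def estimate_market():
    """Return the intermediate and final figures of the estimate."""
    drives = NOTEBOOKS_2011 * FLASH_ATTACH_RATE
    total_tb = drives * DRIVE_GB / 1000          # decimal TB
    price_2011 = PRICE_PER_TB_2008 * (1 - ANNUAL_DECLINE) ** YEARS
    chip_sales = total_tb * price_2011
    return {
        "drives": drives,
        "total_tb": total_tb,
        "price_per_tb_2011": price_2011,
        "chip_sales": chip_sales,
        "toshiba_sales": chip_sales * TOSHIBA_SHARE,
    }


if __name__ == "__main__":
    for name, value in estimate_market().items():
        print(f"{name}: {value:,.1f}")
```

Strict halving for three years gives $312.50/TB, about $3.9 billion in chip sales and about $780 million for a 20% share. The post rounds those to ~$320/TB, $4 billion and $800 million.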

Same power, same performance and way more costly - I’m sold!
If flash drives delivered what proponents claim there would be no problem. But they don’t and they won’t.

Power: no SSD notebook has gained more than 10 minutes battery life over disks. Since flash is already power-efficient that won’t change. Disks have multiple opportunities to improve power use - and with over $1 billion a year in R&D behind them - they will.

Performance: tested application performance hardly changes either - even with a $3,800 flash drive. Notebook I/O doesn’t favor flash drives - and the engineering contortions needed to fix flash aren’t cheap.

The one big win for flash performance: boot and app load times. It makes the system feel a lot snappier - if you often reboot. Sleep mode makes that much less important.

Reliability/durability: flash vendors tout 2 million hour MTBFs and superior shock & vibe specs. Yet Dell reports that their SSD infant failure rates are about the same as disks. And the return rates are higher.

So where, exactly, is the flash advantage? Plus, it is only conjecture that flash drives will prove to be more reliable in actual notebook use. Only time will tell.

And what about the 4-bit MLC that Toshiba is counting on to drive costs down at 40-50% per year? This will be less durable than current SLC. No hard numbers from the vendors - depends on how good their signal processing algorithms are - but it could easily be 5,000 writes - down from 10,000 today.

How do you explain that to consumers?

Data integrity: the unasked question
Of all the questions about flash drives, this is the biggest. I have yet to see an SSD read error spec.

Flash has read errors - that’s why vendors implement error detection.

But flash has a problem disks don’t: flash drives move your data around a lot more often than disks do. Every time a flash drive writes a page, it has to erase the entire block that page is in.

So what happens to the data in the block? It gets read - almost always correctly - and rewritten along with the new page. The new location must be tracked by the drive.

The map that keeps track of where your data is rapidly gets very complex - and itself is regularly read and rewritten. How well protected is this critical data structure? If it isn’t bulletproof you can kiss your data good bye.
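To make the rewrite-and-remap cycle concrete, here is a toy model in Python. It is a sketch of the behavior as described above, not any vendor’s design: the class name, the 4-page block size and the counters are all made up for illustration, and real drives layer wear leveling and garbage collection on top.

```python
PAGES_PER_BLOCK = 4  # illustrative; real blocks hold many more pages


class ToyFlash:
    """Toy model: each page write relocates the whole block and updates a map."""

    def __init__(self, num_blocks):
        # Physical blocks; None marks an erased page.
        self.blocks = [[None] * PAGES_PER_BLOCK for _ in range(num_blocks)]
        # Logical block number -> physical block number (the "map").
        self.block_map = {}
        self.free_blocks = list(range(num_blocks))
        self.pages_copied = 0   # pages moved only because a neighbor changed
        self.map_updates = 0    # times the map itself had to be rewritten

    def write_page(self, logical_block, page_index, data):
        old = self.block_map.get(logical_block)
        # Read the existing block contents (or start from an erased block).
        if old is not None:
            contents = list(self.blocks[old])
        else:
            contents = [None] * PAGES_PER_BLOCK
        # Count the untouched pages that get rewritten along with the new one.
        self.pages_copied += sum(
            1 for i, page in enumerate(contents)
            if page is not None and i != page_index
        )
        contents[page_index] = data
        # Rewrite everything into a free block, then record the new location.
        new = self.free_blocks.pop(0)
        self.blocks[new] = contents
        self.block_map[logical_block] = new
        self.map_updates += 1
        # Erase the old block and return it to the free pool.
        if old is not None:
            self.blocks[old] = [None] * PAGES_PER_BLOCK
            self.free_blocks.append(old)

    def read_page(self, logical_block, page_index):
        return self.blocks[self.block_map[logical_block]][page_index]


flash = ToyFlash(num_blocks=4)
for i in range(PAGES_PER_BLOCK):
    flash.write_page(0, i, f"page-{i}")
```

In this toy, four page writes to one logical block cause six extra page copies and four map updates, and the data ends up in a different physical block each time. Every one of those copies and map updates is a chance for an error the user never asked for.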

If flash translation layers (FTLs) are like every other storage product, catastrophic failure modes are hiding in the statistical weeds. Enterprise IT is rightly suspicious of storage that “auto-magically” moves data around. Consumers have no idea. SSD vendors had better have their act together or the class action suits could be as big a problem as the empty fabs.

The StorageMojo take
The further I wade into flash issues, the worse it gets. My sense is that the flash industry is close to creating a multi-billion dollar fiasco. Why?

  • Over-promising on performance, reliability, battery life and data integrity. Take a systems level perspective, folks. Consumers do.
  • Over-broad positioning of flash drives as a general replacement for notebook hard drives - when pricing clearly says they aren’t.
  • Relying on system OEMs like Dell to market SSDs to consumers is a freeway to failure. They don’t have the bandwidth. The flash vendors need to market flash SSDs directly to consumers. Not sell them - market them.

The flash guys are caught in a vise: big expensive fabs that need to run all year; and seasonal demand that whipsaws their pricing all year.

Notebook flash drives can help even out demand - but only if consumers accept them for the right reasons. Otherwise Toshiba’s new fabs will build chips for a non-existent market.

Update: Flash has a place in one notebook niche: below the $40-$50 minimum cost of a disk. As we’re already seeing with the Asus Eee, replacing $50 of disk with $10 of flash makes a big price difference. But those units won’t solve the seasonality problem and may even make it worse. End update.

Comments welcome, of course.

NAB shorts: Omneon Video Networks

April 24th, 2008 by Robin Harris in Architecture, Clusters, Video

A video networking company in StorageMojo?
Omneon isn’t new to StorageMojo. Their price list has been on the price list page since January 2007.

Their booth was about 50 yards from Isilon’s and EMC’s and it was a madhouse each time I walked by. Partly that was because they were holding all their meetings there, but it also seemed like there was lots of traffic.

Building storage into an app
Founded in 1998, Omneon started offering storage in response to customer demand. They decided on a commodity-based cluster and built their own storage software, MediaGrid.

Their architecture hews to the post-array Google-style storage model:

  • No RAID - slices are replicated one or more times based on policy or demand
  • Single global namespace
  • Out-of-band meta-data servers manage content servers

They can replicate the data from a failed 1 TB drive in less than an hour. Just add 4- or 24-drive content servers to scale capacity. Update: My original wording, “They can rebuild a failed 1 TB drive in less than an hour,” was incorrect. Thanks to Bill Todd for elucidating Omneon’s mechanism. End update.

But that’s not all!
Omneon’s content servers do more than serve content. They put their unused CPU power to work doing jobs like transcoding - translating content from one format like HD to iPhone-suitable QuickTime.

Given the growth in multi-core processors, that will become a more important part of their market appeal over time. Since they process files, not blocks, they have many more opportunities to add value than a modular array.

The StorageMojo take
Omneon made a lot of smart choices with their MediaGrid architecture. It shows how a company with a few bright engineers can build a basic storage utility to take advantage of low commodity costs.

Where they win is their integration with the application and the workflow. They’ve created a video utility that integrates ingest, post, media management and playout with the smart and scalable storage needed to make it all work.

Application specific storage writ large. They’ve taken the same storage the rest of us use and wrapped broadcast interfaces around it that broadcasters already know.

Comments welcome, of course.

NAB shorts: Isilon

April 23rd, 2008 by Robin Harris in Clusters, Video

Isilon at NAB
Stopped by the Isilon booth a couple of times. Traffic seemed steady. Isilon held their meetings away from the booth, so it didn’t have the level of activity of, say, Omneon’s booth.

NAB is their biggest show of the year and the market where they have the biggest footprint. Their booth was the same size as EMC’s nearby booth, which was much quieter despite a Hulk display that looked like an embarrassed afterthought.

Personnel changes at Isilon
Isilon CEO Sujal Patel was there. We discussed my theory that Peter van Oppen had joined the board as a prelude to becoming CEO. Sujal assured me I’d gotten it wrong - that he was in it for the long haul, with Peter as a senior and trusted advisor.

Looking at him I believed. Sujal has developed the gravitas of a leader. Watching his company almost die - and his net worth drop from $75 million to $12 million - seems to have concentrated his mind.

He’s also hired a CTO. Looks like Sujal has moved on for good.

The StorageMojo take
Anyone waiting for Isilon to lie down and die has a long wait. While they may have alienated Wall Street - for good reason - customers seem to like what they have. They’re coming through the storm.

Comments welcome, of course. Isilon was also doing the “shown but not announced” thing with some products due later this year. Sujal asked that I not write about them and I said I wouldn’t. But the engineers have been busy.

SNW & NAB: IOPS vs bandwidth

April 23rd, 2008 by Robin Harris in Future Tech

NAB frame by frame
SNW and NAB did not overlap this year, so I spent 3 days at each. The 2 events are very different: storage is the topic at one and merely central to what everyone is doing at the other. I enjoy both.

Rather than tackle NAB in one piece I’m writing a series of short takes on a number of companies.

SNW is the past. NAB is the future.
Storage is in the midst of a massive transition from an IOPS focus to a bandwidth focus. Like computing’s shift from batch to interactive in the ’60s and ’70s this transition is about bringing the technology closer to how people live. Not consumerizing, humanizing.

Life is a sequential access workload. Our eyes, ears and our pattern-hungry brains crave bandwidth. New display technologies push patterns at us at rates that used to require roller-coasters.

Batch isn’t going away - Google probably runs more batch jobs than most F100 firms - and neither is transaction processing. But the investment goes to the growth areas and bandwidth intensive storage is a growth area.

HD 3D: the Next Big Thing?
3D is getting good and will be the next step in home theater. Whether it is good enough to break through in theaters is another issue. But the net is that high quality 3D doubles data rates.

NAB gets this. They also get that to be useful, storage has to be integrated with the application, whether that app is production, editing, distribution or presentation. A new wide world is opening up to people who know storage and can learn an application. Much easier than the reverse, to be sure.

And, of course, they have a very reliable market to pay for all the innovation: entertainment. Cool.

Comments welcome, of course. First up: Isilon.

Holographic storage debuts next month

April 20th, 2008 by Robin Harris in Disk, Enterprise, Future Tech

After 8 years of hard slogging the folks at InPhase are ready to ship the world’s first holographic storage system.

As StorageMojo noted 2 years ago:

InPhase is claiming they will ship drives with removable holographic disks with 300GB capacity and 20Mbps transfer rate later this year.

I love holographic technology and wish InPhase the best, but I don’t believe they have a viable business with their technology - yet. The problem: 3.5″ disk drives will reach 750GB by the end of this year with much faster transfer rates. InPhase’s 20 Mbps is only 2.5 million bytes per second or only 9GB per hour. It will take over 30 hours just to fill one disk! I predict that hard drives will still be more convenient and fairly cost-competitive than this promising new technology.

But keep at it guys. Lightning will strike if your investors are patient enough.

So what’s different now? They’re saying they will ship next month instead of “later.” The transfer rate is 20 MB/sec. And the media archive life is 50 years - higher density and longer life than tape.
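The difference between 20 Mbps and 20 MB/sec is a factor of 8, and it changes the fill-time picture a lot. A quick sketch in Python, using decimal units as the quoted arithmetic does; the function names are mine.

```python
# Time to fill one 300 GB holographic disk at the two quoted transfer rates.
# Uses decimal units (1 GB = 1000 MB), as the quoted arithmetic does.

DISK_GB = 300


def mbps_to_mb_per_sec(megabits_per_sec):
    """Convert megabits per second to megabytes per second."""
    return megabits_per_sec / 8


def hours_to_fill(capacity_gb, mb_per_sec):
    """Hours needed to write capacity_gb at a sustained mb_per_sec."""
    gb_per_hour = mb_per_sec * 3600 / 1000
    return capacity_gb / gb_per_hour


old_rate = mbps_to_mb_per_sec(20)   # the 20 Mbps figure from 2 years ago
new_rate = 20.0                     # the 20 MB/sec figure quoted now

old_hours = hours_to_fill(DISK_GB, old_rate)
new_hours = hours_to_fill(DISK_GB, new_rate)

print(f"20 Mbps:   {old_hours:.1f} hours to fill {DISK_GB} GB")
print(f"20 MB/sec: {new_hours:.1f} hours to fill {DISK_GB} GB")
```

At 20 Mbps a disk takes about 33 hours to fill. At 20 MB/sec it takes a bit over 4 hours, assuming the drive sustains that rate for the whole write.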

Limited availability until fall
I saw a unit - not sure it was functional - at NAB last week. Marketing VP Liz Murphy gave me the pitch, about 110 seconds of which you can watch here:


The yellow plastic on the drive is for display purposes. Note the nifty see-through media.

Target market
As befits a small company with an $18,000 holographic drive whose media costs $180 a copy in quantity 1, InPhase has a sharp focus on people who need a 50-year archive life. Like film studios, whose film-based archives are bulky and subject to the vagaries of physical chemistry.

The media price is reasonable - compared to Blu-ray. NewEgg has TDK 25 GB Blu-ray media for $17. 12x that - to get 300 GB - is $204. Plus the clutter. The burners are cheaper though.
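For anyone who wants to plug in other prices, here is the media comparison as a small Python sketch. It covers media only and ignores the drive and burner cost; the function and constant names are mine.

```python
import math

# Media prices and capacities quoted in the post.
HOLO_PRICE, HOLO_GB = 180.0, 300      # quantity-1 holographic disk
BLURAY_PRICE, BLURAY_GB = 17.0, 25    # TDK 25 GB Blu-ray disc


def cost_to_store(target_gb, media_gb, media_price):
    """Return (number of discs, total media cost) to hold target_gb."""
    discs = math.ceil(target_gb / media_gb)
    return discs, discs * media_price


holo_discs, holo_cost = cost_to_store(300, HOLO_GB, HOLO_PRICE)
bluray_discs, bluray_cost = cost_to_store(300, BLURAY_GB, BLURAY_PRICE)

print(f"Holographic: {holo_discs} disc,  ${holo_cost:.0f}, "
      f"${holo_cost / 300:.2f}/GB")
print(f"Blu-ray:     {bluray_discs} discs, ${bluray_cost:.0f}, "
      f"${bluray_cost / 300:.2f}/GB")
```

That works out to $0.60/GB for the holographic disk against $0.68/GB for a stack of 12 Blu-ray discs.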

Why did it take 8 years?
InPhase had to literally invent almost every piece of the system.

  • The optical media.
  • The manufacturing process for fabricating thick, optically-flat and high-dynamic range media.
  • The mathematics and circuitry needed to use digital camera CMOS chips for high-speed and high-accuracy image reconstruction.
  • A new method - polytopic multiplexing - for a 10x density increase.
  • Holographic mastering techniques for commercial reproduction.

For example, in order to use commercial, i.e. affordable, CMOS optical sensors to read the holograms, InPhase engineers had to do a deep dive (pdf) into optical information theory:

For holographic data storage it is advantageous to limit the spatial bandwidth of the object beam to only slightly higher than the Nyquist frequency of the data pattern. Typically an aperture in a Fourier plane is used to band limit the data beam (thereby also minimizing the size of the holograms in a Fourier-transform geometry). The data pattern may contain at most 1 cycle/2 data image pixels, so that the Nyquist frequency of the optical field of the object beam is minimally 1 sample/pixel. However, since the spectrum of the irradiance pattern is the auto-correlation of the spectrum of the optical field, the Nyquist frequency of the detectable signal is actually 2 linear samples/pixel minimum. Thus any method relying on less than 4 detector elements/data image pixel is operating in a sub-Nyquist regime where the Nyquist rate is defined with respect to the actual irradiance pattern impinging on the detector.

As Liz noted, you can’t hire experienced holographic storage engineers. InPhase has trained every one of them.

The StorageMojo take
Kudos to InPhase for a magnificent achievement. This is comparable to IBM’s original RAMAC disk effort back in 1957. They all deserve to get rich.

15 years ago a 3x CD reader cost a few hundred dollars. Perhaps in 15 years holographic burners will be $50 and the media less than $1.

Learn more about the technology at the InPhase Technologies web site.

Comments welcome, of course. See a more accessible version of this article on my ZDnet blog, Storage Bits.

StorageMojo off to NAB

April 14th, 2008 by Robin Harris in Future Tech

NAB comes closer to the future of storage than any other show I’ve seen. Both in the storage demand generated by digitizing existing content and in the bulk storage supply needed to house it, NAB points to the future of massive digital storage.

If you or your company are there, send me an email with your booth number and I’ll try to stop by.

The StorageMojo take
Posting will be a little light again this week. Lots of great stuff lined up for next week though.

Comments welcome, of course.


