Wednesday, January 19, 2011

File share system architecture: asking for advice

Hello everyone,

I am using the Windows platform to set up a web-based file sharing system. In more detail, individual users can upload and share documents through a web interface (the content may be large, such as video files), a scenario similar to Google Docs.

My current issue is how to make the storage scalable. I have 4-5 front-end web servers (forming a web server farm) and I want to know how to set up a storage system to hold the uploaded files.

I want the storage to grow automatically, i.e. each web server should appear to have unlimited disk space (no need to handle disk-full conditions). Another issue is that I do not know how to store files efficiently and reliably (e.g. if each web server has its own separate storage and abc.wmv is stored on web server A's storage, then if web server A goes down, no one can access abc.wmv). A further problem I can think of: if I increase the number of front-end web servers, how do I decide which uploaded files the new web server should store (should I migrate some files from the other web servers to the new one)?
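
To make the last concern concrete, here is a minimal sketch of what happens if each file is assigned to a web server's local storage by hashing its name modulo the number of servers. This placement rule, the helper names `server_for` and `fraction_moved`, and the sample file names are all assumptions made for illustration; they are not taken from any product mentioned in this thread.

```python
import hashlib


def server_for(filename: str, server_count: int) -> int:
    """Pick a server index from a stable hash of the file name (hypothetical rule)."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return int(digest, 16) % server_count


def fraction_moved(filenames, old_count: int, new_count: int) -> float:
    """Fraction of files whose assigned server changes when the farm is resized."""
    moved = sum(
        1
        for name in filenames
        if server_for(name, old_count) != server_for(name, new_count)
    )
    return moved / len(filenames)


if __name__ == "__main__":
    # Made-up file names standing in for uploaded content.
    files = [f"video_{i}.wmv" for i in range(10_000)]
    share = fraction_moved(files, old_count=4, new_count=5)
    print(f"{share:.0%} of files would have to migrate when going from 4 to 5 servers")
```

With this naive rule, growing the farm from 4 to 5 servers reassigns the large majority of files, which is one reason per-server storage becomes painful and shared storage (or a placement scheme designed for resizing) looks attractive.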

I am considering using a SAN, but I am not sure whether a SAN would resolve all of these issues. I would like to learn some best practices for handling this.

thanks in advance, George

  • What you want is MogileFS: http://www.danga.com/mogilefs/ We have many, many terabytes (a petabyte yet? haven't checked) of data in MogileFS and it keeps scaling up pretty well.

    George2 : Thanks! Actually I am more interested in SAN/NAS (which are the more popular industry solutions). Two more comments: 1. There may be 10k concurrent connections and about 100T of storage. With a SAN, if I find that 100T is not enough and want to add another 10T, could I add that storage transparently to my application and without stopping it? 2. With a SAN, would there be no separate storage for each front-end web server, with all web servers sharing the same storage?
  • I'd consider a NetApp box; they're not the cheapest, but they're pretty flexible and can offer you thin-provisioned NFS shares, which seem to fit your requirement and can be scaled pretty well (about 1.2PB iirc).

    Alternatively, you could look at HP's "massive scale-out" technology; not all of it is on their site, but if you speak to their storage sales people, they have stuff that'll build out to exabytes.

    From Chopper3
