3

I came across sharding in MongoDB but did not understand what exactly it is. What I understood is that it's better to have several small servers than one huge one for data storage, and that when the data exceeds the capacity of one server you can use another machine to store it. Can anyone explain sharding in more detail?

Glorfindel
3,167 6 gold badges 28 silver badges 34 bronze badges
asked Mar 9, 2012 at 10:23
4
  • Do you know what a database shard is in general? I'm not sure if you are asking about the architecture in general or specifically for MongoDB. Commented Mar 9, 2012 at 10:27
  • Specific to MongoDB. Commented Mar 9, 2012 at 11:14
  • mongodb.org/display/DOCS/Sharding+Introduction Commented Mar 9, 2012 at 16:36
  • 2
    Was it difficult to use google in order to obtain data on what a shard is? Commented Mar 9, 2012 at 16:49

1 Answer

13

In a database, sharding means splitting a set of data across multiple servers; each piece is a shard. For example, if you had users, you could put those whose names start with A-E on one server, F-K on a second, and so on. That way the load of handling operations on those users is spread across several servers. You would do this because you may have too many users to fit on one machine. With sharding you can scale the system as far as you need just by adding more shards. If everything were on one server, your scale would be limited by how big a server you can buy.

Of course, you don't really want to split by letters of the alphabet; you would want a key that, on average, gives each shard an approximately equal share of the work. But that is a detail (if an important one).
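To make the difference concrete, here is a minimal sketch in Python comparing the alphabet split with a hash of the key. The shard names, the letter boundaries after K, and the sample names are all made up for illustration, and this is not how any particular database routes data; it only shows why common initials can overload one shard while a hash tends to spread keys more evenly.

```python
import hashlib
from collections import Counter

# Hypothetical shard names, for illustration only.
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_by_first_letter(name: str) -> str:
    """Alphabet routing: A-E, F-K, L-R, S-Z (boundaries after K are assumed)."""
    letter = name[0].upper()
    if letter <= "E":
        return SHARDS[0]
    if letter <= "K":
        return SHARDS[1]
    if letter <= "R":
        return SHARDS[2]
    return SHARDS[3]


def shard_by_hash(name: str) -> str:
    """Hash routing: a stable digest of the key, modulo the shard count."""
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


def distribution(names, route):
    """Count how many names each shard would receive under a routing function."""
    return Counter(route(n) for n in names)


# Made-up sample: many surnames start with S, so the last shard gets most of them.
names = ["Smith", "Sanders", "Stone", "Silva", "Adams", "Young"]
print(distribution(names, shard_by_first_letter))
print(distribution(names, shard_by_hash))
```

With the alphabet split, five of the six sample names land on the last shard. The hash version depends only on the digest, so where each name lands is stable but not tied to its first letter.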

answered Mar 9, 2012 at 10:41
5
  • Thanks a lot for your answer. I just wanted to ask: does each shard have its own dedicated machine? Commented Mar 9, 2012 at 11:13
  • Normally you would put each shard on its own server, yes. Commented Mar 9, 2012 at 11:16
  • +1 Great lay answer. It should be noted that an alternative to sharding is clustering, where you have multiple database instances across servers all accessing a single data store. Oracle RAC is a good example. This can be a good setup if you have a manageable amount of data with an enormous number of users. Commented Mar 9, 2012 at 12:30
  • Important to know that MongoDB uses range-based sharding, so the choice of shard key is important; it determines how the data is distributed across shards. Commented Mar 9, 2012 at 20:59
  • RAC is often not a good alternative to sharding, because RAC requires shared storage. Real sharding does not (apart from some kind of location or naming service). Commented Mar 10, 2012 at 0:44
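
The comment about range-based sharding and the choice of shard key can be sketched with a toy model. The split points and shard names below are assumptions for illustration, and this is a simplified picture of range partitioning in general, not MongoDB's actual implementation. It shows why a key that only ever increases (a counter or timestamp, say) sends every new insert to the shard holding the top range.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical split points on the shard key; each key range maps to one shard.
SPLIT_POINTS = [1000, 2000, 3000]
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def route(key: int) -> str:
    """Return the shard whose key range contains `key`."""
    return SHARDS[bisect_right(SPLIT_POINTS, key)]


# Keys spread over the whole key space reach every shard.
spread = Counter(route(k) for k in range(0, 4000, 100))

# Ever-increasing keys all fall into the last range, so one shard takes all writes.
increasing = Counter(route(k) for k in range(3000, 3040))

print(spread)
print(increasing)
```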
