Estimating work is hard. It's harder to estimate accurately the farther you look ahead. It's also very hard to estimate when you're dealing with big chunks of coarse-grained work, or work which is hard to compare to stuff you did in the past (using new technologies, research ideas, etc.). This is why we use rough story point estimation when estimating stories in release planning, and leave estimation of tasks in hours to sprint planning.
The effort for getting an accurate estimate (as accurate as an estimate can be) is very high, but the relation between effort and accuracy is not linear. Actually you get a big part of the accuracy early on in the effort scale (and then you need to exert more and more effort for diminishing gains in accuracy). I like this graph from Mike Cohn's Agile Estimating and Planning:
In order to prevent sprint planning meetings from taking ages, we timebox the discussion on each story (10 minutes). In this time the story is explained, questions are asked and answered, and tasks are created with hour estimates. This timebox may seem tight, but it really emphasizes that the accuracy of the estimate is not so important. The important goal of the sprint planning meeting is to understand the stories a little better (to go up much higher on the accuracy axis with minimal progress on the effort axis).
It is also important to understand that the initial "plan" and estimates are just a starting point and should be constantly updated throughout the sprint. The farther into development of a story you are, the higher the accuracy of the estimates will get until they converge with the actual time when the story is finished.
So, to reiterate - the point of planning is to get a better idea of what the story is, not to predict when it will be done. The tasks are reminders (but may be updated or completely changed when better information is available) and the hour estimates are only a starting point.
When the stories are small enough (1-3 days of work) and are similar to work previously done by the team, the initial estimate and task breakdown may be very accurate. With bigger or less clear stories that is less likely. Also, when looking at the whole sprint it becomes harder to predict, since there are sometimes unexpected shifts in priorities (and external distractions), changes in story requirements once work on them actually starts, and some stuff that just isn't taken into account (dependencies between tasks, teams or team members, who actually gets to do the task, and whether she might need some training from the guru in that field).
You usually stop sprint planning when you've filled up the entire team's capacity according to the sum of estimates of all the tasks. But does that mean that this is what you're going to deliver in this sprint?
I think not. I ask team members to estimate the net work in ideal hours. So in the unlikely event that everything - every single thing - goes smoothly during the entire sprint, then yes, this will be the output of the sprint. But if even one small thing goes wrong, some stories will not be delivered.
This is where sprint commitment comes in. You need to decide on some "optimism-factor" and split the planned stories into stories the team is committed to (there's a very good chance they will be delivered unless we have a complete catastrophe) and stretch stories.
We don't stop working when we finish all the committed stories. We planned the stretch stories and intend to deliver them too. We're just saying don't count on it. I personally believe it is better to deliver one less story that we didn't commit to, than to make a bigger commitment and not deliver on it. The point is to generate trust by constantly delivering what you promise. And if you're constantly taking chances and breaking your promises, who cares that you committed to deliver more at the beginning of the sprint? The buffer will simply move to the consumer, the PO: "ahh, they never actually deliver what they promise, so I won't tell the customers about this feature yet". And if everything does actually go well, you're still going to deliver those stretch stories (and give the Product Owner a happy surprise).
A factor of 70% is a good place to start, and you can fine-tune from there. That 70% is taken after the team's capacity has already been calculated as about 75%-80% of the actual working hours, in order to account for meetings, breaks and other "necessary evils". If you deliver less than you committed to, increase the buffer. And if you deliver more, you can carefully decrease the buffer.
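To make the arithmetic concrete, here is a minimal sketch of how the split could be calculated. The names (Story, plan_sprint, focus_factor, commit_factor) and the sample numbers are made up for illustration, and it assumes the stories are already in priority order with task estimates summed per story. It's a rough illustration, not a tool we actually use.

```python
from dataclasses import dataclass


@dataclass
class Story:
    name: str
    ideal_hours: float  # sum of the hour estimates of the story's tasks


def plan_sprint(stories, working_hours, focus_factor=0.75, commit_factor=0.70):
    """Split prioritized stories into committed and stretch lists.

    focus_factor and commit_factor are hypothetical parameter names for the
    75%-80% capacity ratio and the 70% commitment factor described above.
    """
    capacity = working_hours * focus_factor    # hours left after meetings, breaks, etc.
    commit_budget = capacity * commit_factor   # hours the team actually promises
    committed, stretch = [], []
    planned = 0.0
    for story in stories:                      # assumed to be in priority order
        if planned + story.ideal_hours > capacity:
            break                              # capacity is full, stop planning
        planned += story.ideal_hours
        if planned <= commit_budget:
            committed.append(story)
        else:
            stretch.append(story)
    return committed, stretch


# Example: 5 people x 10 days x 8 hours = 400 working hours.
backlog = [Story("A", 80), Story("B", 70), Story("C", 50),
           Story("D", 50), Story("E", 40), Story("F", 30)]
committed, stretch = plan_sprint(backlog, working_hours=400)
print([s.name for s in committed])  # ['A', 'B', 'C']
print([s.name for s in stretch])    # ['D', 'E']
```

With these numbers the capacity is 300 hours and the commitment budget is 210 hours, so A, B and C (200 hours) are committed, D and E are stretch, and F doesn't fit in the sprint at all. Tuning the buffer from sprint to sprint just means nudging commit_factor down or up.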
And if you're feeling really good about yourself - drop hour estimates completely and take work for the sprint based only on story points and previous (average) velocity.
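For the velocity-only approach, the selection could look something like the sketch below. The function name plan_by_velocity, the story names and the velocity numbers are all invented for the example, and it assumes a simple average of the last few sprints, with stories taken strictly in priority order.

```python
from statistics import mean


def plan_by_velocity(backlog, past_velocities):
    """Take stories in priority order until the average velocity is used up.

    backlog is a list of (story_name, story_points) pairs; both this shape
    and the function name are assumptions made for this example.
    """
    budget = mean(past_velocities)   # average story points delivered per sprint
    selected, total = [], 0
    for name, points in backlog:
        if total + points > budget:
            break                    # the next story would exceed the average velocity
        selected.append(name)
        total += points
    return selected, total


# Example: the last three sprints delivered 21, 18 and 24 points.
backlog = [("login", 8), ("search", 5), ("export", 5), ("reports", 8)]
print(plan_by_velocity(backlog, [21, 18, 24]))
# (['login', 'search', 'export'], 18)
```

Here the average velocity is 21 points, so the first three stories (18 points) go into the sprint and "reports" waits for the next one.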