
This isn't quite true: data is streamed from the client through a pipeline made up of all of the replicas as it's written. It's true that you'll lose data if the client crashes in the middle of a block, _unless_ you call sync(), which ensures the data written so far has been fully replicated to all of the nodes.
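To make the pipelining behavior concrete, here's a toy sketch (plain Python, not the real HDFS API; the class and method names are made up for illustration). The client streams small packets through a chain of replicas as it writes, rather than buffering a whole block, and sync() checks that every replica has acknowledged everything sent so far:

```python
# Toy model of a replication pipeline: client -> r1 -> r2 -> r3.
# Packets flow downstream as they are written; sync() verifies that
# all replicas hold all packets sent so far. Illustrative only --
# the real HDFS write path is asynchronous and ack-based.

class Replica:
    def __init__(self, downstream=None):
        self.stored = []            # packets held by this replica
        self.downstream = downstream

    def receive(self, packet):
        self.stored.append(packet)
        if self.downstream:         # forward along the pipeline
            self.downstream.receive(packet)

class ClientStream:
    def __init__(self, pipeline_head, replicas):
        self.head = pipeline_head
        self.replicas = replicas
        self.sent = 0

    def write(self, packet):
        # Each packet streams out immediately; the client does not
        # wait until a full block has been generated.
        self.head.receive(packet)
        self.sent += 1

    def sync(self):
        # True once every replica has every packet sent so far.
        return all(len(r.stored) == self.sent for r in self.replicas)

# Build a three-replica pipeline and stream three packets through it.
r3 = Replica()
r2 = Replica(r3)
r1 = Replica(r2)
stream = ClientStream(r1, [r1, r2, r3])
for chunk in (b"a", b"b", b"c"):
    stream.write(chunk)
```

In this toy model a crash mid-block simply means some packets were never written; sync() is the point at which the client learns that everything sent so far is on all replicas.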


Hadoop only writes a block from the client to a DataNode once a whole block is available. This minimizes the number of open connections on the DataNodes (it can take a long time for the client to generate 64 MB of data, while distributing the block across the replicas happens relatively quickly).

For more information about this, see: http://hadoop.apache.org/common/docs/current/hdfs_design.htm... and http://hadoop.apache.org/common/docs/current/hdfs_design.htm...



