Hadoop: namespace image and edit log
In the book "Hadoop: The Definitive Guide", under the topic on namenodes and datanodes, it is mentioned that:
"The namenode manages the filesystem namespace. It maintains the filesystem tree and the metadata for all the files and directories in the tree. This information is stored persistently on the local disk in the form of two files: the namespace image and the edit log."
"The secondary namenode, despite its name, does not act as a namenode. Its main role is to periodically merge the namespace image with the edit log to prevent the edit log from becoming too large."
I am confused about these two files, the namespace image and the edit log.
As I understand it, the namespace image stores the metadata.
So, my questions are:
- What is the edit log, and what is its role?
- Can you explain the statement "Its main role is to periodically merge the namespace image with the edit log to prevent the edit log from becoming too large."?
Can anyone please explain the edit log to me, and the role of this log file?
Initially, when the namenode first starts, the fsimage file is empty. Whenever the namenode receives a create/update/delete request, the request is first recorded in the edits file for durability. Only once it has been persisted in the edits file is the in-memory update made, because all read requests are served from the in-memory snapshot of the metadata.
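The "log first, then update memory" ordering described above can be sketched in a few lines. This is only a toy model under stated assumptions, not Hadoop's actual implementation: the class name `TinyNameNode`, the JSON line format, and the `create`/`delete`/`stat` methods are all hypothetical and chosen for illustration.

```python
import json
import os
import tempfile


class TinyNameNode:
    """Toy model (hypothetical): log each mutation to an edits file, then update memory."""

    def __init__(self, edits_path):
        self.edits_path = edits_path
        self.namespace = {}  # path -> metadata; the in-memory view that serves reads

    def _log(self, op):
        # Append-only write, flushed and fsynced before memory is touched.
        with open(self.edits_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(op) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def create(self, path, replication=3):
        self._log({"op": "create", "path": path, "replication": replication})
        self.namespace[path] = {"replication": replication}

    def delete(self, path):
        self._log({"op": "delete", "path": path})
        self.namespace.pop(path, None)

    def stat(self, path):
        # Reads never go to disk; they are answered from memory.
        return self.namespace.get(path)


def demo():
    with tempfile.TemporaryDirectory() as d:
        nn = TinyNameNode(os.path.join(d, "edits"))
        nn.create("/a")
        nn.create("/b", replication=2)
        nn.delete("/a")
        with open(nn.edits_path, encoding="utf-8") as f:
            logged = len(f.readlines())
        return nn.stat("/a"), nn.stat("/b"), logged
```

Calling `demo()` performs three mutations, so the edits file ends up with three lines while the in-memory view only holds `/b`.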
"Its main role is to periodically merge the namespace image with the edit log to prevent the edit log from becoming too large."
So, as you can see, the edits file keeps growing and at some point would grow out of bounds. If the namenode is restarted, or for some reason goes down and is brought back up, it has no in-memory representation of the metadata. It therefore has to read the edits file and rebuild the snapshot in memory, which might take a while depending on the size of the edits file.
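To see why restart time grows with the size of the edits file, here is a minimal sketch of the rebuild step. It assumes a hypothetical in-memory representation (a dict for the image, a list of dicts for the edits) rather than Hadoop's real on-disk formats; the point is only that every logged operation has to be applied in order.

```python
def replay(fsimage, edits):
    """Rebuild the namespace: start from the image, then apply every edit in order.

    fsimage: dict mapping path -> metadata (hypothetical representation)
    edits:   list of {"op": ..., "path": ..., "meta": ...} records (hypothetical)
    """
    namespace = dict(fsimage)  # copy so the stored image is left untouched
    for op in edits:
        if op["op"] == "create":
            namespace[op["path"]] = op["meta"]
        elif op["op"] == "delete":
            namespace.pop(op["path"], None)
        else:
            raise ValueError(f"unknown op: {op['op']!r}")
    return namespace
```

The loop runs once per logged operation, so the work done at startup is proportional to the number of edits accumulated since the image was last written.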
Because the edits file is a WAL (write-ahead log), events are written one after another (append only). There are no in-place updates to the file, which avoids random disk seeks.
To prevent this overhead (or to keep the edits file manageable), the SecondaryNameNode (SNN) was introduced. The sole purpose of the SNN is to make sure the edits file does not grow out of bounds. So, by default, the SNN triggers a process called checkpointing whenever the edits file reaches 64 MB or every hour, whichever comes first.
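The trigger rule above amounts to a simple "size or time" check. The sketch below hard-codes the two defaults mentioned in this answer; the constant and function names are made up for illustration, and the real thresholds are configurable in Hadoop under property names that differ between versions.

```python
# Defaults as stated in the answer; names here are hypothetical.
CHECKPOINT_SIZE_BYTES = 64 * 1024 * 1024  # 64 MB
CHECKPOINT_PERIOD_SECONDS = 60 * 60       # 1 hour


def should_checkpoint(edits_size_bytes, seconds_since_last_checkpoint):
    """Return True when either threshold has been reached (whichever comes first)."""
    return (
        edits_size_bytes >= CHECKPOINT_SIZE_BYTES
        or seconds_since_last_checkpoint >= CHECKPOINT_PERIOD_SECONDS
    )
```

For example, a 10 MB edits file that is 2 hours old triggers a checkpoint because of the time limit, while a 70 MB file that is 5 minutes old triggers one because of the size limit.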
The checkpointing process itself is simple. The SNN tells the NN to roll its current edits log and create a new edits file called edits.new. The SNN then copies over the fsimage and edits files from the NN and starts applying the events in the edits file to the existing fsimage file (the one brought over from the NN). Once that is complete, the new fsimage file is sent back to the NN, and the NN replaces its existing fsimage with the new one sent by the SNN and renames edits.new to edits. The NN now has a current version of the fsimage that has all the events from the old edits file applied.
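The checkpoint cycle described above can be sketched with plain in-memory structures. This is a simplified model, not Hadoop code: the `NameNode` class, its method names, and the tuple format for edits are hypothetical, and the file copies between NN and SNN are replaced by ordinary Python copies.

```python
class NameNode:
    """Toy NN (hypothetical): holds the last image plus the edits logged since."""

    def __init__(self):
        self.fsimage = {}      # namespace as of the last checkpoint
        self.edits = []        # operations logged since that checkpoint
        self.edits_new = None  # stands in for edits.new while a checkpoint runs

    def log(self, op):
        # During a checkpoint, new operations go to edits.new instead of edits.
        target = self.edits_new if self.edits_new is not None else self.edits
        target.append(op)

    def roll_edits(self):
        # Start edits.new and hand copies of fsimage and edits to the SNN.
        self.edits_new = []
        return dict(self.fsimage), list(self.edits)

    def install_checkpoint(self, new_fsimage):
        # Replace the old image, then "rename" edits.new to edits.
        self.fsimage = new_fsimage
        self.edits = self.edits_new
        self.edits_new = None


def merge(fsimage, edits):
    """SNN side: apply every edit to the copied image and return the result."""
    merged = dict(fsimage)
    for kind, path in edits:
        if kind == "create":
            merged[path] = {}
        elif kind == "delete":
            merged.pop(path, None)
    return merged


def demo():
    nn = NameNode()
    nn.log(("create", "/a"))
    nn.log(("create", "/b"))
    image, edits = nn.roll_edits()   # SNN asks the NN to roll its edits
    nn.log(("delete", "/a"))         # arrives mid-checkpoint, lands in edits.new
    nn.install_checkpoint(merge(image, edits))
    return nn
```

After `demo()`, the image contains both `/a` and `/b`, and the only entry left in the edits list is the delete that arrived while the checkpoint was in progress. That is the short tail a restarted namenode would still have to replay.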
So, if the namenode is restarted after a checkpoint has completed, it only has to load the fsimage into memory and apply the recent updates from the edits log (the ones that accumulated after the checkpoint completed) to get an up-to-date view of the namespace, which is much more efficient.