BLCR checkpointfile corruption

From: Hans Westgaard Ry (hry_at_platform_dot_com)
Date: Mon Jul 27 2009 - 19:17:38 PDT

  • Next message: Gang Chen: "copy-on-write"
    We are using BLCR together with our MPI implementation (Platform MPI).
    
    We allow the programs to checkpoint and then continue running, so a single
    run produces several versions of the checkpoint files.
    
    My problem is that if I restart from the latest of these checkpoints, all
    the previous checkpoint files become corrupted and give an Input/output
    error when used for restarting.
    
    Is this a known problem?
    
    I suspect it has to do with my not closing the checkpoint file after
    returning from the checkpoint, but I have not been able to find a good way
    of doing that: it looks like a close just after returning also corrupts
    the checkpoint file.
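
    For illustration only, the pattern being aimed at (one file per
    checkpoint, fully written and closed before the run continues) can be
    sketched with plain file I/O. This is not the BLCR API and does not
    model the restart behaviour described above; the helper names
    write_checkpoint and demo and the file names context.N are hypothetical.

```python
import os
import tempfile


def write_checkpoint(directory, generation, payload):
    """Write one checkpoint generation to its own file and close it.

    Hypothetical helper: plain file I/O stands in for the real
    checkpoint request, which this sketch does not model.
    """
    path = os.path.join(directory, f"context.{generation}")
    # O_EXCL refuses to reuse an existing file, so an older
    # generation can never be overwritten by a later one.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        view = memoryview(payload)
        while view:
            # os.write may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        # Flush to stable storage before the descriptor is released.
        os.fsync(fd)
    finally:
        # Always close, even if a write fails part-way through.
        os.close(fd)
    return path


def demo():
    """Write three generations, then read every one of them back."""
    with tempfile.TemporaryDirectory() as directory:
        paths = [
            write_checkpoint(directory, n, f"state {n}".encode())
            for n in range(3)
        ]
        contents = []
        for path in paths:
            with open(path, "rb") as handle:
                contents.append(handle.read())
        return contents
```

    Opening with O_EXCL and closing in a finally block means no descriptor
    for an earlier generation stays open while a later one is written,
    which is the property in question here.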
    
     
    
    Regards,
    Hans Westgaard Ry
    Senior Software Developer
    Platform Computing
    
